Sample records for relevant statistical methods

  1. Understanding common statistical methods, Part I: descriptive methods, probability, and continuous data.

    PubMed

    Skinner, Carl G; Patel, Manish M; Thomas, Jerry D; Miller, Michael A

    2011-01-01

    Statistical methods are pervasive in medical research and general medical literature. Understanding general statistical concepts will enhance our ability to critically appraise the current literature and ultimately improve the delivery of patient care. This article intends to provide an overview of the common statistical methods relevant to medicine.

  2. Clinical relevance vs. statistical significance: Using neck outcomes in patients with temporomandibular disorders as an example.

    PubMed

    Armijo-Olivo, Susan; Warren, Sharon; Fuentes, Jorge; Magee, David J

    2011-12-01

    Statistical significance has been used extensively to evaluate the results of research studies. Nevertheless, it offers only limited information to clinicians. The assessment of clinical relevance can facilitate the translation of research results into clinical practice. The objective of this study was to explore different methods of evaluating the clinical relevance of results, using as an example a cross-sectional study comparing different neck outcomes between subjects with temporomandibular disorders and healthy controls. Subjects were compared for head and cervical posture, maximal cervical muscle strength, endurance of the cervical flexor and extensor muscles, and electromyographic activity of the cervical flexor muscles during the CranioCervical Flexion Test (CCFT). The evaluation of the clinical relevance of the results was performed based on the effect size (ES), minimal important difference (MID), and clinical judgement. The results of this study show that it is possible to have statistical significance without clinical relevance, to have both statistical significance and clinical relevance, to have clinical relevance without statistical significance, or to have neither. The evaluation of clinical relevance in clinical research is crucial to simplify the transfer of knowledge from research into practice. Clinical researchers should present the clinical relevance of their results. Copyright © 2011 Elsevier Ltd. All rights reserved.
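
    As a rough illustration of the effect-size half of this approach, the sketch below computes Cohen's d alongside a minimal-important-difference check. All numbers, the MID value, and the variable names are hypothetical, not taken from the study:

```python
import math

def cohens_d(group1, group2):
    """Cohen's d effect size using the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    v1 = sum((x - m1) ** 2 for x in group1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in group2) / (n2 - 1)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

# Hypothetical neck-strength scores (arbitrary units)
controls = [45.2, 44.1, 46.8, 43.9, 45.5, 44.7]
patients = [42.0, 38.5, 40.1, 37.2, 41.3, 39.0]

d = cohens_d(controls, patients)
mean_diff = sum(controls) / len(controls) - sum(patients) / len(patients)
mid = 2.0  # hypothetical minimal important difference, in the same units
clinically_relevant = abs(mean_diff) >= mid
```

    A result can then be screened on both axes at once: a large d whose mean difference falls below the MID would be statistically striking but clinically unimportant, mirroring the four combinations the abstract describes.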

  3. Evaluation of a Partial Genome Screening of Two Asthma Susceptibility Regions Using Bayesian Network Based Bayesian Multilevel Analysis of Relevance

    PubMed Central

    Antal, Péter; Kiszel, Petra Sz.; Gézsi, András; Hadadi, Éva; Virág, Viktor; Hajós, Gergely; Millinghoffer, András; Nagy, Adrienne; Kiss, András; Semsei, Ágnes F.; Temesi, Gergely; Melegh, Béla; Kisfali, Péter; Széll, Márta; Bikov, András; Gálffy, Gabriella; Tamási, Lilla; Falus, András; Szalai, Csaba

    2012-01-01

    Genetic studies indicate a high number of potential factors related to asthma. Based on earlier linkage analyses, we selected the 11q13 and 14q22 asthma susceptibility regions, for which we designed a partial genome screening study using 145 SNPs in 1201 individuals (436 asthmatic children and 765 controls). The results were evaluated with traditional frequentist methods, and we also applied a new statistical method, called Bayesian network based Bayesian multilevel analysis of relevance (BN-BMLA). This method uses a Bayesian network representation to provide a detailed characterization of the relevance of factors, such as joint significance, the type of dependency, and multi-target aspects. We estimated posteriors for these relations within the Bayesian statistical framework, in order to assess whether a variable is directly relevant or whether its association is only mediated. With frequentist methods, one SNP (rs3751464 in the FRMD6 gene) provided evidence for an association with asthma (OR = 1.43 (1.2–1.8); p = 3×10⁻⁴). The possible role of the FRMD6 gene in asthma was also confirmed in an animal model and in human asthmatics. In the BN-BMLA analysis, altogether 5 SNPs in 4 genes were found relevant in connection with the asthma phenotype: PRPF19 on chromosome 11, and FRMD6, PTGER2 and PTGDR on chromosome 14. In a subsequent step, a partial dataset containing rhinitis and further clinical parameters was used, which allowed the analysis of the relevance of SNPs for asthma and multiple targets. These analyses suggested that SNPs in the AHNAK and MS4A2 genes were indirectly associated with asthma. This paper indicates that BN-BMLA explores the relevant factors more comprehensively than traditional statistical methods and extends the scope of strong-relevance-based methods to include partial relevance, global characterization of relevance and multi-target relevance. PMID:22432035
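
    The frequentist association reported for rs3751464 (OR = 1.43, 95% CI roughly 1.2–1.8) has the form of a standard 2×2-table odds ratio with a Wald confidence interval. The genotype counts below are invented to yield a similar ratio; they are not the study's data:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and 95% Wald confidence interval from a 2x2 table:
    a/b = carriers/non-carriers among cases, c/d = the same among controls."""
    oratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(oratio) - z * se_log)
    hi = math.exp(math.log(oratio) + z * se_log)
    return oratio, lo, hi

# Invented allele-carrier counts chosen to give an odds ratio near 1.43
oratio, lo, hi = odds_ratio_ci(a=180, b=256, c=300, d=610)
```

    Because the lower confidence bound stays above 1.0, such a table would count as evidence of association in the frequentist screen that BN-BMLA is compared against.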

  4. Counting Better? An Examination of the Impact of Quantitative Method Teaching on Statistical Anxiety and Confidence

    ERIC Educational Resources Information Center

    Chamberlain, John Martyn; Hillier, John; Signoretta, Paola

    2015-01-01

    This article reports the results of research concerned with students' statistical anxiety and confidence to both complete and learn to complete statistical tasks. Data were collected at the beginning and end of a quantitative method statistics module. Students recognised the value of numeracy skills but felt they were not necessarily relevant for…

  5. Statistical interpretation of machine learning-based feature importance scores for biomarker discovery.

    PubMed

    Huynh-Thu, Vân Anh; Saeys, Yvan; Wehenkel, Louis; Geurts, Pierre

    2012-07-01

    Univariate statistical tests are widely used for biomarker discovery in bioinformatics. These procedures are simple and fast, and their output is easily interpretable by biologists, but they can only identify variables that provide a significant amount of information in isolation from the other variables. As biological processes are expected to involve complex interactions between variables, univariate methods thus potentially miss some informative biomarkers. Variable relevance scores provided by machine learning techniques, by contrast, are potentially able to highlight multivariate interacting effects, but unlike the p-values returned by univariate tests, these relevance scores are usually not statistically interpretable. This lack of interpretability hampers the determination of a relevance threshold for extracting a feature subset from the rankings and also prevents the wide adoption of these methods by practitioners. We evaluated several existing and novel procedures that extract relevant features from rankings derived from machine learning approaches. These procedures replace the relevance scores with measures that can be interpreted in a statistical way, such as p-values, false discovery rates, or family-wise error rates, for which it is easier to determine a significance level. Experiments were performed on several artificial problems as well as on real microarray datasets. Although the methods differ in terms of computing times and the tradeoff they achieve between false positives and false negatives, some of them greatly help in the extraction of truly relevant biomarkers and should thus be of great practical interest for biologists and physicians. As a side conclusion, our experiments also clearly highlight that using model performance as a criterion for feature selection is often counter-productive. Python source code of all tested methods, as well as the MATLAB scripts used for data simulation, can be found in the Supplementary Material.
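
    One generic way to make a relevance score statistically interpretable, in the spirit of the procedures evaluated here, is an empirical permutation p-value: recompute the score under shuffled labels and ask how often chance does as well. This is a sketch with a toy importance measure, not the authors' code:

```python
import random

def permutation_pvalue(score_fn, feature, labels, n_perm=999, seed=0):
    """Empirical p-value for a relevance score: the fraction of label
    permutations scoring at least as high as the observed score."""
    rng = random.Random(seed)
    observed = score_fn(feature, labels)
    hits = sum(
        score_fn(feature, rng.sample(labels, len(labels))) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)  # add-one correction keeps p > 0

def mean_diff(values, labels):
    """Toy relevance score: absolute difference of the two class means."""
    g0 = [v for v, c in zip(values, labels) if c == 0]
    g1 = [v for v, c in zip(values, labels) if c == 1]
    return abs(sum(g0) / len(g0) - sum(g1) / len(g1))

labels = [0] * 10 + [1] * 10
informative = [0.1 * i for i in range(10)] + [5 + 0.1 * i for i in range(10)]
noise = [((i * 7) % 10) * 0.1 for i in range(20)]

p_informative = permutation_pvalue(mean_diff, informative, labels)
p_noise = permutation_pvalue(mean_diff, noise, labels)
```

    The same wrapper applies unchanged to a multivariate relevance score (e.g. a tree-ensemble importance), which is what turns an uninterpretable ranking into thresholdable p-values.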

  6. Geomatic Methods for the Analysis of Data in the Earth Sciences: Lecture Notes in Earth Sciences, Vol. 95

    NASA Astrophysics Data System (ADS)

    Pavlis, Nikolaos K.

    Geomatics is a trendy term that has been used in recent years to describe academic departments that teach and research theories, methods, algorithms, and practices used in processing and analyzing data related to the Earth and other planets. Naming trends aside, geomatics could be considered the mathematical and statistical “toolbox” that allows Earth scientists to extract information about physically relevant parameters from the available data and accompany such information with some measure of its reliability. This book is an attempt to present the mathematical-statistical methods used in data analysis within various disciplines—geodesy, geophysics, photogrammetry and remote sensing—from the unifying perspective that the inverse-problem formalism permits. At the same time, it allows us to stress the relevance of statistical methods in achieving an optimal solution.

  7. Accuracy of Orthognathic Surgical Outcomes Using 2- and 3-Dimensional Landmarks-The Case for Apples and Oranges?

    PubMed

    Borba, Alexandre Meireles; José da Silva, Everton; Fernandes da Silva, André Luis; Han, Michael D; da Graça Naclério-Homem, Maria; Miloro, Michael

    2018-01-12

    To verify predicted versus obtained surgical movements in 2-dimensional (2D) and 3-dimensional (3D) measurements and compare the equivalence between these methods. A retrospective observational study of bimaxillary orthognathic surgeries was performed. Postoperative cone-beam computed tomographic (CBCT) scans were superimposed on preoperative scans, and a lateral cephalometric radiograph was generated from each CBCT scan. After identification of the sella, nasion, and upper central incisor tip landmarks on 2D and 3D images, actual and planned movements were compared by cephalometric measurements. A one-sample t test was used to statistically evaluate results, with expected mean discrepancy values ranging from 0 to 2 mm. Equivalence of 2D and 3D values was compared using a paired t test. The final sample of 46 cases showed by 2D cephalometry that differences between actual and planned movements in the horizontal axis were statistically relevant for expected means of 0, 0.5, and 2 mm, without relevance for expected means of 1 and 1.5 mm; vertical movements were statistically relevant for expected means of 0 and 0.5 mm, without relevance for expected means of 1, 1.5, and 2 mm. For 3D cephalometry in the horizontal axis, there were statistically relevant differences for expected means of 0, 1.5, and 2 mm, without relevance for expected means of 0.5 and 1 mm; vertical movements showed statistically relevant differences for expected means of 0, 0.5, 1.5, and 2 mm, without relevance for the expected mean of 1 mm. Comparison of 2D and 3D values displayed statistical differences for the horizontal and vertical axes. Comparison of 2D and 3D surgical outcome assessments should be performed with caution because there seems to be a difference in acceptable levels of accuracy between these 2 methods of evaluation. Moreover, 3D accuracy studies should no longer rely on a 2-mm level of discrepancy but on a 1-mm level. Copyright © 2018 American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.
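
    The one-sample t test used above, testing mean discrepancies against a series of expected values, can be sketched as follows; the discrepancy data and the 10-case sample are hypothetical, not the study's 46 cases:

```python
import math
import statistics

def one_sample_t(data, mu0):
    """t statistic for H0: the population mean equals mu0."""
    n = len(data)
    return (statistics.fmean(data) - mu0) / (statistics.stdev(data) / math.sqrt(n))

# Hypothetical planned-vs-actual landmark discrepancies (mm) for 10 cases
discrepancies = [0.8, 1.2, 0.5, 1.5, 0.9, 1.1, 0.7, 1.3, 1.0, 0.6]

t_vs_0 = one_sample_t(discrepancies, 0.0)  # against "no discrepancy"
t_vs_1 = one_sample_t(discrepancies, 1.0)  # against a 1 mm tolerance
t_crit = 2.262  # two-sided 5% critical value for df = 9
```

    With these invented numbers the discrepancy is clearly nonzero (|t| exceeds the critical value against 0 mm) yet compatible with a 1 mm expected mean, which is the kind of distinction the abstract's per-threshold results draw.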

  8. A robust bayesian estimate of the concordance correlation coefficient.

    PubMed

    Feng, Dai; Baumgartner, Richard; Svetnik, Vladimir

    2015-01-01

    A need for assessment of agreement arises in many situations including statistical biomarker qualification or assay or method validation. Concordance correlation coefficient (CCC) is one of the most popular scaled indices reported in evaluation of agreement. Robust methods for CCC estimation currently present an important statistical challenge. Here, we propose a novel Bayesian method of robust estimation of CCC based on multivariate Student's t-distribution and compare it with its alternatives. Furthermore, we extend the method to practically relevant settings, enabling incorporation of confounding covariates and replications. The superiority of the new approach is demonstrated using simulation as well as real datasets from biomarker application in electroencephalography (EEG). This biomarker is relevant in neuroscience for development of treatments for insomnia.
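
    For reference, the classical (non-robust, non-Bayesian) sample CCC of Lin (1989), which the proposed method generalizes, can be computed directly; the paired readings below are invented:

```python
import statistics

def concordance_cc(x, y):
    """Sample concordance correlation coefficient (Lin, 1989): measures
    agreement of paired readings against the 45-degree identity line."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sx2 = sum((v - mx) ** 2 for v in x) / n
    sy2 = sum((v - my) ** 2 for v in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sx2 + sy2 + (mx - my) ** 2)

# Hypothetical paired measurements from two assays in close agreement
method_a = [10.1, 12.3, 9.8, 14.2, 11.0, 13.5]
method_b = [10.0, 12.5, 9.6, 14.0, 11.2, 13.3]
ccc = concordance_cc(method_a, method_b)
```

    Unlike the Pearson correlation, the denominator penalizes both location and scale shifts between the two methods, which is why the CCC is the standard scaled index of agreement that the robust Bayesian estimator targets.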

  9. Statistical Literacy in the Data Science Workplace

    ERIC Educational Resources Information Center

    Grant, Robert

    2017-01-01

    Statistical literacy, the ability to understand and make use of statistical information including methods, has particular relevance in the age of data science, when complex analyses are undertaken by teams from diverse backgrounds. Not only is it essential to communicate to the consumers of information but also within the team. Writing from the…

  10. A guide to statistical analysis in microbial ecology: a community-focused, living review of multivariate data analyses.

    PubMed

    Buttigieg, Pier Luigi; Ramette, Alban

    2014-12-01

    The application of multivariate statistical analyses has become a consistent feature in microbial ecology. However, many microbial ecologists are still in the process of developing a deep understanding of these methods and appreciating their limitations. As a consequence, staying abreast of progress and debate in this arena poses an additional challenge to many microbial ecologists. To address these issues, we present the GUide to STatistical Analysis in Microbial Ecology (GUSTA ME): a dynamic, web-based resource providing accessible descriptions of numerous multivariate techniques relevant to microbial ecologists. A combination of interactive elements allows users to discover and navigate between methods relevant to their needs and examine how they have been used by others in the field. We have designed GUSTA ME to become a community-led and -curated service, which we hope will provide a common reference and forum to discuss and disseminate analytical techniques relevant to the microbial ecology community. © 2014 The Authors. FEMS Microbiology Ecology published by John Wiley & Sons Ltd on behalf of Federation of European Microbiological Societies.

  11. Some Tests of Randomness with Applications

    DTIC Science & Technology

    1981-02-01

    freedom. For further details, the reader is referred to Gnanadesikan (1977, p. 169), wherein other relevant tests are also given. Graphical tests, as...sample from a gamma distribution. J. Am. Statist. Assoc. 71, 480-7. Gnanadesikan, R. (1977). Methods for Statistical Data Analysis of Multivariate

  12. Comparative evaluation of topographical data of dental implant surfaces applying optical interferometry and scanning electron microscopy.

    PubMed

    Kournetas, N; Spintzyk, S; Schweizer, E; Sawada, T; Said, F; Schmid, P; Geis-Gerstorfer, J; Eliades, G; Rupp, F

    2017-08-01

    Comparability of topographical data of implant surfaces in the literature is low, and their clinical relevance is often equivocal. The aim of this study was to investigate the ability of scanning electron microscopy and optical interferometry to assess statistically similar 3-dimensional roughness parameter results and to evaluate these data based on predefined criteria regarded as relevant for a favorable biological response. Four different commercial dental screw-type implants (NanoTite Certain Prevail, TiUnite Brånemark Mk III, XiVE S Plus and SLA Standard Plus) were analyzed by stereo scanning electron microscopy and white light interferometry. Surface height, spatial and hybrid roughness parameters (Sa, Sz, Ssk, Sku, Sal, Str, Sdr) were assessed from raw and filtered data (Gaussian 50 μm and 5 μm cut-off filters), respectively. Data were statistically compared by one-way ANOVA and the Tukey-Kramer post-hoc test. For a clinically relevant interpretation, a categorizing evaluation approach was used based on predefined threshold criteria for each roughness parameter. The two methods exhibited predominantly statistical differences. Depending on roughness parameters and filter settings, both methods showed variations in rankings of the implant surfaces and differed in their ability to discriminate the different topographies. Overall, the analyses revealed scale-dependent roughness data. Compared to the purely statistical approach, the categorizing evaluation resulted in many more similarities between the two methods. This study suggests reconsidering current approaches to the topographical evaluation of implant surfaces and further seeking proper experimental settings. Furthermore, the specific role of different roughness parameters for the bioresponse has to be studied in detail in order to better define clinically relevant, scale-dependent and parameter-specific thresholds and ranges. Copyright © 2017 The Academy of Dental Materials. Published by Elsevier Ltd. All rights reserved.
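
    The one-way ANOVA step of such a comparison can be sketched as follows; the Sa readings and the three-surface design are hypothetical, and a real analysis would follow a significant F with a Tukey-Kramer post-hoc comparison as the study does:

```python
import statistics

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA across k groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = statistics.fmean(v for g in groups for v in g)
    ss_between = sum(len(g) * (statistics.fmean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - statistics.fmean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical Sa roughness readings (um) for three implant surfaces
sa = [
    [1.10, 1.15, 1.08, 1.12],
    [1.45, 1.50, 1.42, 1.48],
    [0.70, 0.72, 0.68, 0.71],
]
f_stat = one_way_anova_f(sa)
```

    A large F indicates that between-surface variation dominates within-surface variation; the categorizing evaluation in the abstract then asks the separate question of whether those differences cross biologically meaningful thresholds.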

  13. A Response to White and Gorard: Against Inferential Statistics: How and Why Current Statistics Teaching Gets It Wrong

    ERIC Educational Resources Information Center

    Nicholson, James; Ridgway, Jim

    2017-01-01

    White and Gorard make important and relevant criticisms of some of the methods commonly used in social science research, but go further by criticising the logical basis for inferential statistical tests. This paper comments briefly on matters where we broadly agree with them and more fully on matters where we disagree. We agree that too little…

  14. Biometrics in the Medical School Curriculum: Making the Necessary Relevant.

    ERIC Educational Resources Information Center

    Murphy, James R.

    1980-01-01

    Because a student is more likely to learn and retain course content perceived as relevant, an attempt was made to change medical students' perceptions of a biometrics course by introducing statistical methods as a means of solving problems in the interpretation of clinical lab data. Retrospective analysis of student course evaluations indicates a…

  15. Bayesian Statistics in Educational Research: A Look at the Current State of Affairs

    ERIC Educational Resources Information Center

    König, Christoph; van de Schoot, Rens

    2018-01-01

    The ability of a scientific discipline to build cumulative knowledge depends on its predominant method of data analysis. A steady accumulation of knowledge requires approaches which allow researchers to consider results from comparable prior research. Bayesian statistics is especially relevant for establishing a cumulative scientific discipline,…

  16. Teaching Biology through Statistics: Application of Statistical Methods in Genetics and Zoology Courses

    ERIC Educational Resources Information Center

    Colon-Berlingeri, Migdalisel; Burrowes, Patricia A.

    2011-01-01

    Incorporation of mathematics into biology curricula is critical to underscore for undergraduate students the relevance of mathematics to most fields of biology and the usefulness of developing quantitative process skills demanded in modern biology. At our institution, we have made significant changes to better integrate mathematics into the…

  17. Technical Note: Higher-order statistical moments and a procedure that detects potentially anomalous years as two alternative methods describing alterations in continuous environmental data

    Treesearch

    I. Arismendi; S. L. Johnson; J. B. Dunham

    2015-01-01

    Statistics of central tendency and dispersion may not capture relevant or desired characteristics of the distribution of continuous phenomena and, thus, they may not adequately describe temporal patterns of change. Here, we present two methodological approaches that can help to identify temporal changes in environmental regimes. First, we use higher-order statistical...
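
    Higher-order moments of the kind referred to here are straightforward to compute; the stream-temperature series below is invented to show a right-skewed, heavy-tailed regime that mean and variance alone would not flag:

```python
import math
import statistics

def skewness(data):
    """Sample skewness: the third standardized moment (biased estimator)."""
    m = statistics.fmean(data)
    n = len(data)
    s = math.sqrt(sum((x - m) ** 2 for x in data) / n)
    return sum((x - m) ** 3 for x in data) / (n * s ** 3)

def excess_kurtosis(data):
    """Sample excess kurtosis: the fourth standardized moment minus 3."""
    m = statistics.fmean(data)
    n = len(data)
    s2 = sum((x - m) ** 2 for x in data) / n
    return sum((x - m) ** 4 for x in data) / (n * s2 ** 2) - 3

# Hypothetical daily stream temperatures (deg C) with two warm anomalies
temps = [8.0, 8.2, 8.1, 8.3, 8.0, 8.2, 8.1, 12.5, 13.0, 8.2]
```

    Tracking these moments over successive years is one simple way to detect the changes in environmental regimes that central-tendency statistics miss.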

  18. Hybrid statistics-simulations based method for atom-counting from ADF STEM images.

    PubMed

    De Wael, Annelies; De Backer, Annick; Jones, Lewys; Nellist, Peter D; Van Aert, Sandra

    2017-06-01

    A hybrid statistics-simulations based method for atom-counting from annular dark field scanning transmission electron microscopy (ADF STEM) images of monotype crystalline nanostructures is presented. Different atom-counting methods already exist for model-like systems. However, the increasing relevance of radiation damage in the study of nanostructures demands a method that allows atom-counting from low dose images with a low signal-to-noise ratio. Therefore, the hybrid method directly includes prior knowledge from image simulations into the existing statistics-based method for atom-counting, and accounts in this manner for possible discrepancies between actual and simulated experimental conditions. It is shown by means of simulations and experiments that this hybrid method outperforms the statistics-based method, especially for low electron doses and small nanoparticles. The analysis of a simulated low dose image of a small nanoparticle suggests that this method allows for far more reliable quantitative analysis of beam-sensitive materials. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Data and methods for studying commercial motor vehicle driver fatigue, highway safety and long-term driver health.

    PubMed

    Stern, Hal S; Blower, Daniel; Cohen, Michael L; Czeisler, Charles A; Dinges, David F; Greenhouse, Joel B; Guo, Feng; Hanowski, Richard J; Hartenbaum, Natalie P; Krueger, Gerald P; Mallis, Melissa M; Pain, Richard F; Rizzo, Matthew; Sinha, Esha; Small, Dylan S; Stuart, Elizabeth A; Wegman, David H

    2018-03-09

    This article summarizes the recommendations on data and methodology issues for studying commercial motor vehicle driver fatigue of a National Academies of Sciences, Engineering, and Medicine study. A framework is provided that identifies the various factors affecting driver fatigue and relating driver fatigue to crash risk and long-term driver health. The relevant factors include characteristics of the driver, vehicle, carrier and environment. Limitations of existing data are considered and potential sources of additional data described. Statistical methods that can be used to improve understanding of the relevant relationships from observational data are also described. The recommendations for enhanced data collection and the use of modern statistical methods for causal inference have the potential to enhance our understanding of the relationship of fatigue to highway safety and to long-term driver health. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  20. Quality Control of the Print with the Application of Statistical Methods

    NASA Astrophysics Data System (ADS)

    Simonenko, K. V.; Bulatova, G. S.; Antropova, L. B.; Varepo, L. G.

    2018-04-01

    The basis for standardizing the process of offset printing is the control of print quality indicators. There are various approaches to this problem, among which statistical methods are the most important. Their practical implementation for managing the quality of the printing process is highly relevant and is reflected in this paper. The possibility of using a control chart to identify the reasons for deviations of the optical density for a triad of inks in offset printing is shown.
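
    A minimal sketch of the control-chart idea: compute Shewhart-style mean ± 3σ limits from in-control readings and flag points outside them. The optical-density values are hypothetical:

```python
import statistics

def control_limits(samples, k=3.0):
    """Shewhart individuals-chart limits: mean +/- k standard deviations."""
    mean = statistics.fmean(samples)
    sd = statistics.stdev(samples)
    return mean - k * sd, mean, mean + k * sd

# Hypothetical optical-density readings for one ink during a press run
density = [1.42, 1.44, 1.43, 1.45, 1.41, 1.44, 1.43, 1.42, 1.44, 1.43]
lcl, center, ucl = control_limits(density)

# A later reading of 1.60 falls outside the limits and signals a special cause
out_of_control = [d for d in density + [1.60] if not lcl <= d <= ucl]
```

    Points outside the limits prompt a search for an assignable cause (ink feed, dampening, plate wear) rather than adjustment for common-cause noise.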

  1. Evaluation of Cepstrum Algorithm with Impact Seeded Fault Data of Helicopter Oil Cooler Fan Bearings and Machine Fault Simulator Data

    DTIC Science & Technology

    2013-02-01

    of a bearing must be put into practice. There are many potential methods, the most traditional being the use of statistical time-domain features...accelerate degradation to test multiple bearings to gain statistical relevance and extrapolate results to scale for field conditions. Temperature...as time statistics, frequency estimation to improve the fault frequency detection. For future investigations, one can further explore the

  2. Cluster detection methods applied to the Upper Cape Cod cancer data.

    PubMed

    Ozonoff, Al; Webster, Thomas; Vieira, Veronica; Weinberg, Janice; Ozonoff, David; Aschengrau, Ann

    2005-09-15

    A variety of statistical methods have been suggested to assess the degree and/or the location of spatial clustering of disease cases. However, there is relatively little in the literature devoted to comparison and critique of different methods. Most of the available comparative studies rely on simulated data rather than real data sets. We have chosen three methods currently used for examining spatial disease patterns: the M-statistic of Bonetti and Pagano; the Generalized Additive Model (GAM) method as applied by Webster; and Kulldorff's spatial scan statistic. We apply these statistics to analyze breast cancer data from the Upper Cape Cancer Incidence Study using three different latency assumptions. The three different latency assumptions produced three different spatial patterns of cases and controls. For 20-year latency, all three methods generally concur. However, for the 15-year latency and no-latency assumptions, the methods produce different results when testing for global clustering. The comparative analysis of real data sets by different statistical methods provides insight into directions for further research. We suggest a research program designed around examining real data sets to guide focused investigation of relevant features using simulated data, for the purpose of understanding how to interpret statistical methods applied to epidemiological data with a spatial component.

  3. Automated detection of diagnostically relevant regions in H&E stained digital pathology slides

    NASA Astrophysics Data System (ADS)

    Bahlmann, Claus; Patel, Amar; Johnson, Jeffrey; Ni, Jie; Chekkoury, Andrei; Khurd, Parmeshwar; Kamen, Ali; Grady, Leo; Krupinski, Elizabeth; Graham, Anna; Weinstein, Ronald

    2012-03-01

    We present a computationally efficient method for analyzing H&E stained digital pathology slides with the objective of discriminating diagnostically relevant vs. irrelevant regions. Such technology is useful for several applications: (1) It can speed up computer aided diagnosis (CAD) for histopathology based cancer detection and grading by an order of magnitude through triage-like preprocessing and pruning. (2) It can improve the response time of an interactive digital pathology workstation (which typically deals with digital pathology slides of several gigabytes), e.g., through controlling adaptive compression or prioritization algorithms. (3) It can support the detection and grading workflow for expert pathologists in a semi-automated diagnosis, thereby increasing throughput and accuracy. At the core of the presented method is the statistical characterization of tissue components that are indicative of the pathologist's decision about malignancy vs. benignity, such as nuclei, tubules, cytoplasm, etc. In order to allow for effective yet computationally efficient processing, we propose visual descriptors that capture the distribution of color intensities observed for nuclei and cytoplasm. Discrimination between the statistics of relevant vs. irrelevant regions is learned from annotated data, and inference is performed via linear classification. We validate the proposed method both qualitatively and quantitatively. Experiments show a cross-validation error rate of 1.4%. We further show that the proposed method can prune ~90% of the area of pathological slides while maintaining 100% of all relevant information, which allows for a speedup of a factor of 10 for CAD systems.

  4. Selecting relevant 3D image features of margin sharpness and texture for lung nodule retrieval.

    PubMed

    Ferreira, José Raniery; de Azevedo-Marques, Paulo Mazzoncini; Oliveira, Marcelo Costa

    2017-03-01

    Lung cancer is the leading cause of cancer-related deaths in the world. Its diagnosis is a challenging task for specialists due to several aspects of the classification of lung nodules. Therefore, it is important to integrate content-based image retrieval methods into the lung nodule classification process, since they are capable of retrieving similar cases from databases of previously diagnosed cases. However, this mechanism depends on extracting relevant image features in order to obtain high efficiency. The goal of this paper is to perform the selection of 3D image features of margin sharpness and texture that can be relevant in the retrieval of similar cancerous and benign lung nodules. A total of 48 3D image attributes were extracted from the nodule volume. Border sharpness features were extracted from perpendicular lines drawn over the lesion boundary. Second-order texture features were extracted from a co-occurrence matrix. Relevant features were selected by a correlation-based method and a statistical significance analysis. Retrieval performance was assessed according to the nodule's potential malignancy on the 10 most similar cases and by the parameters of precision and recall. Statistically significant features reduced retrieval performance. The correlation-based method selected 2 margin sharpness attributes and 6 texture attributes and obtained higher precision compared to all 48 extracted features in similar nodule retrieval. Feature space dimensionality reduction of 83% yielded higher retrieval performance and proved to be a computationally low-cost method of retrieving similar nodules for the diagnosis of lung cancer.
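
    A correlation-based selection step of the general kind described can be sketched as follows; the feature names, values, and the 0.5 threshold are hypothetical, not the paper's 48 attributes:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
    return num / den

def select_relevant(features, target, threshold=0.5):
    """Keep features whose absolute correlation with the target
    (e.g. a malignancy rating) reaches the threshold."""
    return [name for name, values in features.items()
            if abs(pearson(values, target)) >= threshold]

# Hypothetical nodule descriptors vs. an invented malignancy score
malignancy = [1, 2, 2, 3, 4, 4, 5, 5]
features = {
    "margin_sharpness": [5.0, 4.5, 4.2, 3.5, 2.8, 2.5, 1.9, 1.5],
    "texture_contrast": [0.2, 0.3, 0.35, 0.5, 0.6, 0.7, 0.8, 0.85],
    "noise": [0.4, 0.1, 0.9, 0.2, 0.8, 0.3, 0.5, 0.6],
}
selected = select_relevant(features, malignancy)
```

    Filtering by correlation with the retrieval target, rather than by univariate significance alone, reflects the abstract's finding that the correlation-based subset retrieved similar nodules more precisely.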

  5. Innovations in curriculum design: A multi-disciplinary approach to teaching statistics to undergraduate medical students

    PubMed Central

    Freeman, Jenny V; Collier, Steve; Staniforth, David; Smith, Kevin J

    2008-01-01

    Background: Statistics is relevant to students and practitioners in medicine and the health sciences and is increasingly taught as part of the medical curriculum. However, it is common for students to dislike and under-perform in statistics. We sought to address these issues by redesigning the way that statistics is taught. Methods: The project brought together a statistician, a clinician and educational experts to re-conceptualize the syllabus, and focused on developing different methods of delivery. New teaching materials, including videos, animations and contextualized workbooks, were designed and produced, placing greater emphasis on applying statistics and interpreting data. Results: Two cohorts of students were evaluated, one with old-style and one with new-style teaching. Both were similar with respect to age, gender and previous level of statistics. Students who were taught using the new approach could better define the key concepts of p-value and confidence interval (p < 0.001 for both). They were more likely to regard statistics as integral to medical practice (p = 0.03), and to expect to use it in their medical career (p = 0.003). There was no significant difference in the numbers who thought that statistics was essential to understand the literature (p = 0.28) or who felt comfortable with the basics of statistics (p = 0.06). More than half the students in both cohorts felt that they were comfortable with the basics of medical statistics. Conclusion: Using a variety of media, and placing emphasis on interpretation, can help make the teaching, learning and understanding of statistics more people-centred and relevant, resulting in better outcomes for students. PMID:18452599

  6. No-reference image quality assessment based on natural scene statistics and gradient magnitude similarity

    NASA Astrophysics Data System (ADS)

    Jia, Huizhen; Sun, Quansen; Ji, Zexuan; Wang, Tonghan; Chen, Qiang

    2014-11-01

    The goal of no-reference/blind image quality assessment (NR-IQA) is to devise a perceptual model that can accurately predict the quality of a distorted image in agreement with human opinion, and feature extraction is an important issue in such models. However, the features used in the state-of-the-art "general purpose" NR-IQA algorithms are usually either natural scene statistics (NSS) based or perceptually relevant, and the performance of these models is therefore limited. To further improve the performance of NR-IQA, we propose a general-purpose NR-IQA algorithm which combines NSS-based features with perceptually relevant features. The new method extracts features in both the spatial and gradient domains. In the spatial domain, we extract the point-wise statistics of single pixel values, which are characterized by a generalized Gaussian distribution model to form the underlying features. In the gradient domain, statistical features based on neighboring gradient magnitude similarity are extracted. Then a mapping is learned to predict quality scores using support vector regression. The experimental results on benchmark image databases demonstrate that the proposed algorithm correlates highly with human judgments of quality and leads to significant performance improvements over state-of-the-art methods.
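
    Gradient magnitude similarity of the kind used in such features is typically a pixel-wise ratio that equals 1 for identical gradients and shrinks as they diverge. The sketch below assumes that form; the stabilizing constant c is taken from the GMSD literature and the example values are arbitrary:

```python
def gradient_magnitude_similarity(g1, g2, c=0.0026):
    """Pixel-wise gradient magnitude similarity: (2*g1*g2 + c) / (g1^2 + g2^2 + c).
    Returns 1.0 for identical gradient magnitudes, smaller when they differ;
    c keeps the ratio stable when both gradients are near zero."""
    return (2 * g1 * g2 + c) / (g1 ** 2 + g2 ** 2 + c)

# Identical gradients give perfect similarity; differing gradients score lower
same = gradient_magnitude_similarity(0.5, 0.5)
diff = gradient_magnitude_similarity(0.5, 0.1)
```

    Statistics of this similarity map over neighboring pixels (rather than against a reference image, which NR-IQA lacks) are what feed the learned quality mapping.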

  7. Survey of the Methods and Reporting Practices in Published Meta-analyses of Test Performance: 1987 to 2009

    ERIC Educational Resources Information Center

    Dahabreh, Issa J.; Chung, Mei; Kitsios, Georgios D.; Terasawa, Teruhiko; Raman, Gowri; Tatsioni, Athina; Tobar, Annette; Lau, Joseph; Trikalinos, Thomas A.; Schmid, Christopher H.

    2013-01-01

    We performed a survey of meta-analyses of test performance to describe the evolution in their methods and reporting. Studies were identified through MEDLINE (1966-2009), reference lists, and relevant reviews. We extracted information on clinical topics, literature review methods, quality assessment, and statistical analyses. We reviewed 760…

  8. Group Influences on Young Adult Warfighters’ Risk Taking

    DTIC Science & Technology

    2016-12-01

    Statistical Analysis: Latent linear growth models were fitted using the maximum likelihood estimation method in Mplus (version 7.0; Muthen & Muthen)...condition had a higher net score than those in the alone condition (b = 20.53, SE = 6.29, p < .001). Results of the relevant statistical analyses are...model estimates and fit statistics (BIC and chi-square with degrees of freedom) are tabulated for three fitted models.

  9. A multi-analyte serum test for the detection of non-small cell lung cancer

    PubMed Central

    Farlow, E C; Vercillo, M S; Coon, J S; Basu, S; Kim, A W; Faber, L P; Warren, W H; Bonomi, P; Liptay, M J; Borgia, J A

    2010-01-01

    Background: In this study, we appraised a wide assortment of biomarkers previously shown to have diagnostic or prognostic value for non-small cell lung cancer (NSCLC) with the intent of establishing a multi-analyte serum test capable of identifying patients with lung cancer. Methods: Circulating levels of 47 biomarkers were evaluated against patient cohorts consisting of 90 NSCLC and 43 non-cancer controls using commercial immunoassays. Multivariate statistical methods were used on all biomarkers achieving statistical relevance to define an optimised panel of diagnostic biomarkers for NSCLC. The resulting biomarkers were fashioned into a classification algorithm and validated against serum from a second patient cohort. Results: A total of 14 analytes achieved statistical relevance upon evaluation. Multivariate statistical methods then identified a panel of six biomarkers (tumour necrosis factor-α, CYFRA 21-1, interleukin-1ra, matrix metalloproteinase-2, monocyte chemotactic protein-1 and sE-selectin) as being the most efficacious for diagnosing early stage NSCLC. When tested against a second patient cohort, the panel successfully classified 75 of 88 patients. Conclusions: Here, we report the development of a serum algorithm with high specificity for classifying patients with NSCLC against cohorts of various ‘high-risk' individuals. A high rate of false positives was observed within the cohort in which patients had non-neoplastic lung nodules, possibly as a consequence of the inflammatory nature of these conditions. PMID:20859284

  10. Some new mathematical methods for variational objective analysis

    NASA Technical Reports Server (NTRS)

    Wahba, Grace; Johnson, Donald R.

    1994-01-01

    Numerous results were obtained relevant to remote sensing, variational objective analysis, and data assimilation. A list of publications relevant in whole or in part is attached. The principal investigator gave many invited lectures, disseminating the results to the meteorological community as well as the statistical community. A list of invited lectures at meetings is attached, as well as a list of departmental colloquia at various universities and institutes.

  11. Graphical Methods: A Review of Current Methods and Computer Hardware and Software. Technical Report No. 27.

    ERIC Educational Resources Information Center

    Bessey, Barbara L.; And Others

    Graphical methods for displaying data, as well as available computer software and hardware, are reviewed. The authors have emphasized the types of graphs which are most relevant to the needs of the National Center for Education Statistics (NCES) and its readers. The following types of graphs are described: tabulations, stem-and-leaf displays,…

  12. Profiling under UNIX by patching

    NASA Technical Reports Server (NTRS)

    Bishop, Matt

    1986-01-01

    Profiling under UNIX is done by inserting counters into programs either before or during the compilation or assembly phases. A fourth type of profiling involves monitoring the execution of a program, and gathering relevant statistics during the run. This method and an implementation of this method are examined, and its advantages and disadvantages are discussed.

  13. The transfer of analytical procedures.

    PubMed

    Ermer, J; Limberger, M; Lis, K; Wätzig, H

    2013-11-01

    Analytical method transfers are certainly among the most discussed topics in the GMP regulated sector. However, they are surprisingly little regulated in detail. General information is provided by USP, WHO, and ISPE in particular. Most recently, the EU emphasized the importance of analytical transfer by including it in their draft of the revised GMP Guideline. In this article, an overview and comparison of these guidelines is provided. The key to success for method transfers is excellent communication between the sending and receiving units. In order to facilitate this communication, procedures, flow charts, and checklists for responsibilities, success factors, transfer categories, the transfer plan and report, strategies in case of failed transfers, and tables with acceptance limits are provided here, together with a comprehensive glossary. Potential pitfalls are described such that they can be avoided. In order to assure an efficient and sustainable transfer of analytical procedures, a practically relevant and scientifically sound evaluation with corresponding acceptance criteria is crucial. Various strategies and statistical tools such as significance tests, absolute acceptance criteria, and equivalence tests are thoroughly described and compared in detail with examples. Significance tests should be avoided: the success criterion is not statistical significance, but rather analytical relevance. Depending on a risk assessment of the analytical procedure in question, statistical equivalence tests are recommended, because they include both a practically relevant acceptance limit and a direct control of the statistical risks. However, for lower risk procedures, a simple comparison of the transfer performance parameters to absolute limits is also regarded as sufficient. Copyright © 2013 Elsevier B.V. All rights reserved.
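
    The abstract's preference for equivalence tests over significance tests can be illustrated with the confidence-interval inclusion approach: the transfer passes when the whole confidence interval of the mean difference lies inside a practically relevant acceptance limit. The data, the 2.0-unit limit, and the normal approximation (standing in for a t-quantile) are assumptions for illustration, not from the article.

```python
import math
from statistics import NormalDist, mean, stdev

def equivalent(sending, receiving, limit):
    """Equivalence check via confidence-interval inclusion: the 90% CI of
    the mean difference must lie entirely within +/- limit (equivalent to
    two one-sided tests at alpha = 0.05). Normal approximation used here
    for illustration in place of the t-distribution."""
    d = mean(receiving) - mean(sending)
    se = math.sqrt(stdev(sending) ** 2 / len(sending)
                   + stdev(receiving) ** 2 / len(receiving))
    z = NormalDist().inv_cdf(0.95)
    return -limit < d - z * se and d + z * se < limit
```

    Unlike a significance test, this criterion cannot be passed merely by having noisy data, nor failed merely by having a large sample: what matters is whether the difference is small relative to the acceptance limit.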

  14. Computer-aided auditing of prescription drug claims.

    PubMed

    Iyengar, Vijay S; Hermiz, Keith B; Natarajan, Ramesh

    2014-09-01

    We describe a methodology for identifying and ranking candidate audit targets from a database of prescription drug claims. The relevant audit targets may include various entities such as prescribers, patients and pharmacies, who exhibit certain statistical behavior indicative of potential fraud and abuse over the prescription claims during a specified period of interest. Our overall approach is consistent with related work in statistical methods for detection of fraud and abuse, but has a relative emphasis on three specific aspects: first, based on the assessment of domain experts, certain focus areas are selected and data elements pertinent to the audit analysis in each focus area are identified; second, specialized statistical models are developed to characterize the normalized baseline behavior in each focus area; and third, statistical hypothesis testing is used to identify entities that diverge significantly from their expected behavior according to the relevant baseline model. The application of this overall methodology to a prescription claims database from a large health plan is considered in detail.
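
    The third step described above, flagging entities that diverge significantly from a baseline model, might look like this minimal sketch. A pooled binomial baseline and a one-sided z-test are assumptions for illustration; the paper's focus-area models are more specialized.

```python
import math
from statistics import NormalDist

def flag_entities(counts, totals, alpha=0.001):
    """Flag entities (e.g. prescribers) whose rate of a monitored event
    diverges upward from the pooled baseline rate.
    counts[i]: monitored claims for entity i; totals[i]: all its claims."""
    p0 = sum(counts) / sum(totals)          # baseline rate over all entities
    flagged = []
    for i, (k, n) in enumerate(zip(counts, totals)):
        se = math.sqrt(p0 * (1 - p0) / n)
        if 1.0 - NormalDist().cdf((k / n - p0) / se) < alpha:
            flagged.append(i)               # significantly above baseline
    return flagged
```

    Entities whose rates sit near the baseline produce large p-values and are left alone; only strong upward divergence survives the stringent alpha, which keeps the candidate audit list short.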

  15. Brain fingerprinting classification concealed information test detects US Navy military medical information with P300

    PubMed Central

    Farwell, Lawrence A.; Richardson, Drew C.; Richardson, Graham M.; Furedy, John J.

    2014-01-01

    A classification concealed information test (CIT) used the “brain fingerprinting” method of applying P300 event-related potential (ERP) in detecting information that is (1) acquired in real life and (2) unique to US Navy experts in military medicine. Military medicine experts and non-experts were asked to push buttons in response to three types of text stimuli. Targets contain known information relevant to military medicine, are identified to subjects as relevant, and require pushing one button. Subjects are told to push another button to all other stimuli. Probes contain concealed information relevant to military medicine, and are not identified to subjects. Irrelevants contain equally plausible, but incorrect/irrelevant information. Error rate was 0%. Median and mean statistical confidences for individual determinations were 99.9% with no indeterminates (results lacking sufficiently high statistical confidence to be classified). We compared error rate and statistical confidence for determinations of both information present and information absent produced by classification CIT (Is a probe ERP more similar to a target or to an irrelevant ERP?) vs. comparison CIT (Does a probe produce a larger ERP than an irrelevant?) using P300 plus the late negative component (LNP; together, P300-MERMER). Comparison CIT produced a significantly higher error rate (20%) and lower statistical confidences: mean 67%; information-absent mean was 28.9%, less than chance (50%). We compared analysis using P300 alone with the P300 + LNP. P300 alone produced the same 0% error rate but significantly lower statistical confidences. 
These findings add to the evidence that the brain fingerprinting methods as described here provide sufficient conditions to produce less than 1% error rate and greater than 95% median statistical confidence in a CIT on information obtained in the course of real life that is characteristic of individuals with specific training, expertise, or organizational affiliation. PMID:25565941
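
    The classification CIT question quoted above ("Is a probe ERP more similar to a target or to an irrelevant ERP?") can be sketched with Pearson correlation as the similarity measure. This is only schematic: the published brain fingerprinting method additionally uses bootstrapping to attach a statistical confidence to each determination.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length waveforms."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x)
           * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def classify_probe(probe, target_avg, irrelevant_avg):
    """Classification CIT, schematically: 'information present' when the
    probe ERP correlates more with the target average than with the
    irrelevant average."""
    return ("information present"
            if pearson(probe, target_avg) > pearson(probe, irrelevant_avg)
            else "information absent")
```
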

  16. A methodological approach to identify external factors for indicator-based risk adjustment illustrated by a cataract surgery register

    PubMed Central

    2014-01-01

    Background Risk adjustment is crucial for comparison of outcome in medical care. Knowledge of the external factors that impact measured outcome but that cannot be influenced by the physician is a prerequisite for this adjustment. To date, a universal and reproducible method for identification of the relevant external factors has not been published. The selection of external factors in current quality assurance programmes is mainly based on expert opinion. We propose and demonstrate a methodology for identification of external factors requiring risk adjustment of outcome indicators and we apply it to a cataract surgery register. Methods Defined test criteria to determine the relevance for risk adjustment are “clinical relevance” and “statistical significance”. Clinical relevance of the association is presumed when observed success rates of the indicator in the presence and absence of the external factor differ by more than a pre-specified margin of 10%. Statistical significance of the association between the external factor and outcome indicators is assessed by univariate stratification and multivariate logistic regression adjustment. The cataract surgery register was set up as part of a German multi-centre register trial for out-patient cataract surgery in three high-volume surgical sites. A total of 14,924 patient follow-ups have been documented since 2005. Eight external factors potentially relevant for risk adjustment were related to the outcome indicators “refractive accuracy” and “visual rehabilitation” 2–5 weeks after surgery. Results The clinical relevance criterion confirmed 2 (“refractive accuracy”) and 5 (“visual rehabilitation”) external factors. The significance criterion was verified in two ways. Univariate and multivariate analyses revealed almost identical external factors: 4 were related to “refractive accuracy” and 7 (6) to “visual rehabilitation”.
Two (“refractive accuracy”) and 5 (“visual rehabilitation”) factors conformed to both criteria and were therefore relevant for risk adjustment. Conclusion In a practical application, the proposed method to identify relevant external factors for risk adjustment for comparison of outcome in healthcare proved to be feasible and comprehensive. The method can also be adapted to other quality assurance programmes. However, the cut-off score for clinical relevance needs to be individually assessed when applying the proposed method to other indications or indicators. PMID:24965949
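
    Applying the two test criteria, the 10% clinical-relevance threshold and a significance test, can be sketched as below. A two-proportion z-test stands in for the paper's univariate stratification and multivariate logistic regression, and the counts are invented, so this is a schematic sketch only.

```python
import math
from statistics import NormalDist

def factor_relevant(succ_with, n_with, succ_without, n_without,
                    margin=0.10, alpha=0.05):
    """An external factor is retained for risk adjustment only when BOTH
    criteria hold: clinical relevance (success rates in the presence and
    absence of the factor differ by more than `margin`) and statistical
    significance (here a two-proportion z-test, p < alpha)."""
    p1, p2 = succ_with / n_with, succ_without / n_without
    clinically_relevant = abs(p1 - p2) > margin
    p = (succ_with + succ_without) / (n_with + n_without)
    se = math.sqrt(p * (1 - p) * (1 / n_with + 1 / n_without))
    significant = 2 * (1 - NormalDist().cdf(abs((p1 - p2) / se))) < alpha
    return clinically_relevant and significant
```

    Requiring both criteria mirrors the paper's finding that significance alone is not enough: a tiny but significant difference fails the margin, and a large but noisy difference fails the test.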

  17. Pitfalls in the statistical examination and interpretation of the correspondence between physician and patient satisfaction ratings and their relevance for shared decision making research

    PubMed Central

    2011-01-01

    Background The correspondence of satisfaction ratings between physicians and patients can be assessed on different dimensions. One may examine whether they differ between the two groups or focus on measures of association or agreement. The aim of our study was to evaluate methodological difficulties in calculating the correspondence between patient and physician satisfaction ratings and to show the relevance for shared decision making research. Methods We utilised a structured tool for cardiovascular prevention (arriba™) in a pragmatic cluster-randomised controlled trial. Correspondence between patient and physician satisfaction ratings after individual primary care consultations was assessed using the Patient Participation Scale (PPS). We used the Wilcoxon signed-rank test, the marginal homogeneity test, Kendall's tau-b, weighted kappa, percentage of agreement, and the Bland-Altman method to measure differences, associations, and agreement between physicians and patients. Results Statistical measures signal large differences between patient and physician satisfaction ratings with more favourable ratings provided by patients and a low correspondence regardless of group allocation. Closer examination of the raw data revealed a high ceiling effect of satisfaction ratings and only slight disagreement regarding the distributions of differences between physicians' and patients' ratings. Conclusions Traditional statistical measures of association and agreement are not able to capture a clinically relevant appreciation of the physician-patient relationship by both parties in skewed satisfaction ratings. Only the Bland-Altman method for assessing agreement augmented by bar charts of differences was able to indicate this. Trial registration ISRCTN: ISRCT71348772 PMID:21592337
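
    The Bland-Altman quantities this abstract relies on, the mean difference between paired ratings and its 95% limits of agreement, are straightforward to compute. The sample ratings below are invented for illustration.

```python
from statistics import mean, stdev

def bland_altman_limits(a, b):
    """Mean difference and 95% limits of agreement for paired ratings
    (e.g. patient vs. physician satisfaction on the same consultation)."""
    diffs = [x - y for x, y in zip(a, b)]
    d, s = mean(diffs), stdev(diffs)
    return d, (d - 1.96 * s, d + 1.96 * s)
```

    With heavily skewed, ceiling-bound ratings the interesting signal is exactly this difference distribution, which correlation or kappa statistics can miss.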

  18. Statistical Methods Applied to Gamma-ray Spectroscopy Algorithms in Nuclear Security Missions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fagan, Deborah K.; Robinson, Sean M.; Runkle, Robert C.

    2012-10-01

    In a wide range of nuclear security missions, gamma-ray spectroscopy is a critical research and development priority. One particularly relevant challenge is the interdiction of special nuclear material, for which gamma-ray spectroscopy supports the goals of detecting and identifying gamma-ray sources. This manuscript examines the existing set of spectroscopy methods, attempts to categorize them by the statistical methods on which they rely, and identifies methods that have yet to be considered. Our examination shows that current methods effectively estimate the effect of counting uncertainty but in many cases do not address larger sources of decision uncertainty—ones that are significantly more complex. We thus explore the premise that significantly improving algorithm performance requires greater coupling between the problem physics that drives data acquisition and statistical methods that analyze such data. Untapped statistical methods, such as Bayesian model averaging and hierarchical and empirical Bayes methods, have the potential to reduce decision uncertainty by more rigorously and comprehensively incorporating all sources of uncertainty. We expect that application of such methods will demonstrate progress in meeting the needs of nuclear security missions by improving on the existing numerical infrastructure for which these analyses have not been conducted.
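
    One of the untapped methods named above, Bayesian model averaging, starts from approximate posterior model probabilities; a common BIC-based approximation is sketched below. The BIC values in the test are invented, and this is only the generic weighting step, not anything specific to the spectroscopy algorithms surveyed.

```python
import math

def bic_model_weights(bics):
    """Approximate posterior model probabilities from BIC scores, the usual
    starting point for Bayesian model averaging: weight_i is proportional
    to exp(-(BIC_i - min BIC) / 2), normalized to sum to one."""
    m = min(bics)
    w = [math.exp(-(b - m) / 2.0) for b in bics]
    total = sum(w)
    return [x / total for x in w]
```

    Averaging predictions with these weights, instead of committing to the single best-scoring model, is one way to carry model-selection uncertainty into the final decision.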

  19. Relevant Scatterers Characterization in SAR Images

    NASA Astrophysics Data System (ADS)

    Chaabouni, Houda; Datcu, Mihai

    2006-11-01

    Recognizing scenes in single-look meter-resolution Synthetic Aperture Radar (SAR) images requires the capability to identify relevant signal signatures under conditions of variable image acquisition geometry and arbitrary object poses and configurations. Among the methods to detect relevant scatterers in SAR images, we can mention internal coherence. Splitting the SAR spectrum in azimuth generates a series of images which preserve high coherence only for particular object scattering. The detection of relevant scatterers can be done by correlation study or Independent Component Analysis (ICA) methods. The present article deals with the state of the art for SAR internal correlation analysis and proposes further extensions using elements of inference based on information theory applied to complex valued signals. The set of azimuth-look images is analyzed using mutual information measures, and an equivalent channel capacity is derived. The localization of the "target" requires analysis in a small image window, thus resulting in imprecise estimation of the second-order statistics of the signal. For better precision, a Hausdorff measure is introduced. The method is applied to detect and characterize relevant objects in urban areas.
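
    The mutual information measure between two azimuth-look images can be sketched on discretized (quantized) pixel values as follows. The quantization into a small symbol alphabet is an assumption for illustration; the paper works with complex-valued signals.

```python
from collections import Counter
from math import log2

def mutual_information(x, y):
    """Mutual information (in bits) between two discretized signals,
    estimated from their empirical joint and marginal distributions."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
               for (a, b), c in pxy.items())
```

    Pixels belonging to a stable scatterer keep a consistent relationship across looks and yield high mutual information, while clutter decorrelates toward zero.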

  20. The Content of Statistical Requirements for Authors in Biomedical Research Journals

    PubMed Central

    Liu, Tian-Yi; Cai, Si-Yu; Nie, Xiao-Lu; Lyu, Ya-Qi; Peng, Xiao-Xia; Feng, Guo-Shuang

    2016-01-01

    Background: Robust statistical design, sound statistical analysis, and standardized presentation are important to enhance the quality and transparency of biomedical research. This systematic review was conducted to summarize the statistical reporting requirements introduced by biomedical research journals with an impact factor of 10 or above, so that researchers can give statistical issues serious consideration not only at the stage of data analysis but also at the stage of methodological design. Methods: Detailed statistical instructions for authors were downloaded from the homepage of each of the included journals or obtained directly from the editors via email. We then described the types and numbers of statistical guidelines introduced by different press groups. Items of statistical reporting guidelines, as well as particular requirements, were summarized by frequency and grouped into design, method of analysis, and presentation, respectively. Finally, updated statistical guidelines and particular requirements for improvement were summed up. Results: In total, 21 of 23 press groups introduced at least one statistical guideline. More than half of the press groups have gradually updated their statistical instructions for authors in response to newly issued statistical reporting guidelines. In addition, 16 press groups, covering 44 journals, address particular statistical requirements. Most of the particular requirements focused on the performance of statistical analysis and transparency in statistical reporting, including "address issues relevant to research design, including participant flow diagram, eligibility criteria, and sample size estimation," and "statistical methods and the reasons." Conclusions: Statistical requirements for authors are becoming increasingly comprehensive.
    Statistical requirements for authors remind researchers that they should give sufficient consideration not only to statistical methods during research design, but also to standardized statistical reporting, which would be beneficial in providing stronger evidence and making critical appraisal of evidence more accessible. PMID:27748343

  1. Detecting clinically relevant new information in clinical notes across specialties and settings.

    PubMed

    Zhang, Rui; Pakhomov, Serguei V S; Arsoniadis, Elliot G; Lee, Janet T; Wang, Yan; Melton, Genevieve B

    2017-07-05

    Automated methods for identifying clinically relevant new versus redundant information in electronic health record (EHR) clinical notes are useful for clinicians and researchers involved in patient care and clinical research, respectively. We evaluated methods to automatically identify clinically relevant new information in clinical notes, and compared the quantity of redundant information across specialties and clinical settings. Statistical language models augmented with semantic similarity measures were evaluated as a means to detect and quantify clinically relevant new and redundant information over longitudinal clinical notes for a given patient. A corpus of 591 progress notes over 40 inpatient admissions was annotated for new information longitudinally by physicians to generate a reference standard. Note redundancy between various specialties was evaluated on 71,021 outpatient notes and 64,695 inpatient notes from 500 solid organ transplant patients (April 2015 through August 2015). Our best method achieved performance of 0.87 recall, 0.62 precision, and 0.72 F-measure. Addition of semantic similarity metrics improved recall compared to baseline but otherwise resulted in similar performance. While outpatient and inpatient notes had relatively similar levels of high redundancy (61% and 68%, respectively), redundancy differed by author specialty, with mean redundancy of 75%, 66%, 57%, and 55% observed in pediatric, internal medicine, psychiatry and surgical notes, respectively. Automated techniques with statistical language models for detecting redundant versus clinically relevant new information in clinical notes do not improve with the addition of semantic similarity measures. While levels of redundancy seem relatively similar in the inpatient and ambulatory settings at Fairview Health Services, clinical note redundancy appears to vary significantly across medical specialties.

  2. A benchmark for statistical microarray data analysis that preserves actual biological and technical variance.

    PubMed

    De Hertogh, Benoît; De Meulder, Bertrand; Berger, Fabrice; Pierre, Michael; Bareke, Eric; Gaigneaux, Anthoula; Depiereux, Eric

    2010-01-11

    Recent reanalysis of spike-in datasets underscored the need for new and more accurate benchmark datasets for statistical microarray analysis. We present here a fresh method using biologically-relevant data to evaluate the performance of statistical methods. Our novel method ranks the probesets from a dataset composed of publicly-available biological microarray data and extracts subset matrices with precise information/noise ratios. Our method can be used to determine the capability of different methods to better estimate variance for a given number of replicates. The mean-variance and mean-fold change relationships of the matrices revealed a closer approximation of biological reality. Performance analysis refined the results from benchmarks published previously. We show that the Shrinkage t test (close to Limma) was the best of the methods tested, except when two replicates were examined, where the Regularized t test and the Window t test performed slightly better. The R scripts used for the analysis are available at http://urbm-cluster.urbm.fundp.ac.be/~bdemeulder/.
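
    The Regularized t test mentioned above can be sketched as follows: a small constant added to the standard error keeps probesets with near-zero variance but tiny fold changes from dominating the ranking. The shrinkage constant s0 and the toy data are assumptions for illustration; in practice s0 is estimated from the data.

```python
import math
from statistics import mean, stdev

def regularized_t(x, y, s0=0.1):
    """Regularized t-statistic for a two-group comparison of one probeset:
    the ordinary t-statistic with a stabilizing constant s0 added to the
    standard error (s0 = 0 recovers the ordinary Welch-style statistic)."""
    se = math.sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return (mean(x) - mean(y)) / (se + s0)
```

    With only two replicates per group, a probeset with a minuscule difference but even tinier variance gets a huge ordinary t; the regularized version shrinks it, which is why such statistics perform better at low replication.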

  3. Origin of the spike-timing-dependent plasticity rule

    NASA Astrophysics Data System (ADS)

    Cho, Myoung Won; Choi, M. Y.

    2016-08-01

    A biological synapse changes its efficacy depending on the difference between pre- and post-synaptic spike timings. Formulating spike-timing-dependent interactions in terms of the path integral, we establish a neural-network model, which makes it possible to predict relevant quantities rigorously by means of standard methods in statistical mechanics and field theory. In particular, the biological synaptic plasticity rule is shown to emerge as the optimal form for minimizing the free energy. It is further revealed that maximization of the entropy of neural activities gives rise to the competitive behavior of biological learning. This demonstrates that statistical mechanics helps to understand rigorously key characteristic behaviors of a neural network, thus providing the possibility of physics serving as a useful and relevant framework for probing life.

  4. Electrical Conductivity of Charged Particle Systems and Zubarev's Nonequilibrium Statistical Operator Method

    NASA Astrophysics Data System (ADS)

    Röpke, G.

    2018-01-01

    One of the fundamental problems in physics that are not yet rigorously solved is the statistical mechanics of nonequilibrium processes. An important contribution to describing irreversible behavior starting from reversible Hamiltonian dynamics was given by D. N. Zubarev, who invented the method of the nonequilibrium statistical operator. We discuss this approach, in particular, the extended von Neumann equation, and as an example consider the electrical conductivity of a system of charged particles. We consider the selection of the set of relevant observables. We show the relation between kinetic theory and linear response theory. Using thermodynamic Green's functions, we present a systematic treatment of correlation functions, but the convergence needs investigation. We compare different expressions for the conductivity and list open questions.

  5. 3 CFR - Enhanced Collection of Relevant Data and Statistics Relating to Women

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    Enhanced Collection of Relevant Data and Statistics Relating to Women. Presidential Documents, Memorandum of March 4, 2011, for the Heads of Executive Departments and Agencies. I am proud to work...

  6. Postgraduate Taught Portfolio Review--The Cluster Approach, Non-Subject-Based Grouping of Courses and Relevant Performance Indicators

    ERIC Educational Resources Information Center

    Konstantinidis-Pereira, Alicja

    2018-01-01

    This paper summarises a new method of grouping postgraduate taught (PGT) courses introduced at Oxford Brookes University as a part of a Portfolio Review. Instead of classifying courses by subject, the new cluster approach uses statistical methods to group the courses based on factors including flexibility of study options, level of specialisation,…

  7. Deriving Signatures of In Vivo Toxicity Using Both Efficacy and Potency Information from In Vitro Assays: Evaluating Model Performance as a Function of Increasing Variability in Experimental Data

    EPA Science Inventory

    The US EPA ToxCast program aims to develop methods for mechanistically-based chemical prioritization using a suite of high throughput, in vitro assays that probe relevant biological pathways, and coupling them with statistical and machine learning methods that produce predictive ...

  8. Implementation of the common phrase index method on the phrase query for information retrieval

    NASA Astrophysics Data System (ADS)

    Fatmawati, Triyah; Zaman, Badrus; Werdiningsih, Indah

    2017-08-01

    With the development of technology, finding information in news text has become easy, because news text is distributed not only in print media, such as newspapers, but also in electronic media that can be accessed using a search engine. In the process of finding relevant documents with a search engine, a phrase is often used as a query. The number of words that make up the phrase query and their positions clearly affect the relevance of the documents produced, and consequently the accuracy of the information obtained. Based on this problem, the purpose of this research was to analyze the implementation of the common phrase index method for information retrieval. The research was conducted on English news text and implemented in a prototype to determine the relevance level of the documents produced. The system is built with the stages of pre-processing, indexing, term weighting calculation, and cosine similarity calculation. The system then displays the document search results in sequence, based on cosine similarity. Furthermore, system testing was conducted using 100 documents and 20 queries, and the results were used for the evaluation stage: first, determining the relevant documents using the kappa statistic; second, determining the system success rate using precision, recall, and F-measure. In this research, the kappa statistic was 0.71, so the relevance judgements were suitable for system evaluation. The calculation of precision, recall, and F-measure produced a precision of 0.37, a recall of 0.50, and an F-measure of 0.43. From these results it can be said that the success rate of the system in producing relevant documents is low.
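
    The evaluation figures quoted in this abstract follow from the standard definitions; for instance, F = 2PR/(P+R) = 2(0.37)(0.50)/0.87, which is approximately 0.43, matching the reported value. A minimal sketch:

```python
def precision_recall_f(retrieved, relevant):
    """Precision, recall, and F-measure for one query, given the set of
    documents the system retrieved and the set judged relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)
    p = tp / len(retrieved) if retrieved else 0.0
    r = tp / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```
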

  9. Data Science in the Research Domain Criteria Era: Relevance of Machine Learning to the Study of Stress Pathology, Recovery, and Resilience

    PubMed Central

    Galatzer-Levy, Isaac R.; Ruggles, Kelly; Chen, Zhe

    2017-01-01

    Diverse environmental and biological systems interact to influence individual differences in response to environmental stress. Understanding the nature of these complex relationships can enhance the development of methods to: (1) identify risk, (2) classify individuals as healthy or ill, (3) understand mechanisms of change, and (4) develop effective treatments. The Research Domain Criteria (RDoC) initiative provides a theoretical framework to understand health and illness as the product of multiple inter-related systems, but does not provide a framework to characterize or statistically evaluate such complex relationships. Characterizing and statistically evaluating models that integrate multiple levels (e.g. synapses, genes, environmental factors) as they relate to outcomes that are free from prior diagnostic benchmarks represents a challenge requiring new computational tools capable of capturing complex relationships and identifying clinically relevant populations. In the current review, we summarize machine learning methods that can achieve these goals. PMID:29527592

  10. To What Extent Is Mathematical Ability Predictive of Performance in a Methodology and Statistics Course? Can an Action Research Approach Be Used to Understand the Relevance of Mathematical Ability in Psychology Undergraduates?

    ERIC Educational Resources Information Center

    Bourne, Victoria J.

    2014-01-01

    Research methods and statistical analysis is typically the least liked and most anxiety provoking aspect of a psychology undergraduate degree, in large part due to the mathematical component of the content. In this first cycle of a piece of action research, students' mathematical ability is examined in relation to their performance across…

  11. Models of dyadic social interaction.

    PubMed Central

    Griffin, Dale; Gonzalez, Richard

    2003-01-01

    We discuss the logic of research designs for dyadic interaction and present statistical models with parameters that are tied to psychologically relevant constructs. Building on Karl Pearson's classic nineteenth-century statistical analysis of within-organism similarity, we describe several approaches to indexing dyadic interdependence and provide graphical methods for visualizing dyadic data. We also describe several statistical and conceptual solutions to the 'levels of analysis' problem in analysing dyadic data. These analytic strategies allow the researcher to examine and measure psychological questions of interdependence and social influence. We provide illustrative data from casually interacting and romantic dyads. PMID:12689382
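
    One classical index of dyadic interdependence in the Pearson tradition is the pairwise intraclass correlation for exchangeable dyad members, sketched below. Treating both members as interchangeable is an assumption that fits some dyads (e.g. same-role strangers) and not others; the paper discusses several alternative models.

```python
from statistics import mean

def pairwise_intraclass_r(dyads):
    """Pairwise intraclass correlation for exchangeable dyad members:
    each dyad (a, b) enters the computation twice, as (a, b) and (b, a),
    so the index does not depend on which member is listed first."""
    xs = [v for a, b in dyads for v in (a, b)]
    ys = [v for a, b in dyads for v in (b, a)]
    m = mean(xs)  # the double-entered columns share the same mean
    num = sum((x - m) * (y - m) for x, y in zip(xs, ys))
    den = sum((x - m) ** 2 for x in xs)
    return num / den
```
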

  12. Contribution of Apollo lunar photography to the establishment of selenodetic control

    NASA Technical Reports Server (NTRS)

    Dermanis, A.

    1975-01-01

    Among the various types of available data relevant to the establishment of geometric control on the moon, the only one covering significant portions of the lunar surface (20%) with sufficient information content is lunar photography taken in proximity to the moon from lunar orbiters. The idea of free geodetic networks is introduced as a tool for the statistical comparison of the geometric aspects of the various data used. Methods were developed for updating the statistics of observations and the a priori parameter estimates to obtain statistically consistent solutions by means of the optimum relative weighting concept.

  13. Pathway-GPS and SIGORA: identifying relevant pathways based on the over-representation of their gene-pair signatures

    PubMed Central

    Foroushani, Amir B.K.; Brinkman, Fiona S.L.

    2013-01-01

    Motivation. Predominant pathway analysis approaches treat pathways as collections of individual genes and consider all pathway members as equally informative. As a result, spurious and misleading pathways are at times inappropriately identified as statistically significant, solely due to components that they share with more relevant pathways. Results. We introduce the concept of Pathway Gene-Pair Signatures (Pathway-GPS) as pairs of genes that, as a combination, are specific to a single pathway. We devised and implemented a novel approach to pathway analysis, Signature Over-representation Analysis (SIGORA), which focuses on the statistically significant enrichment of Pathway-GPS in a user-specified gene list of interest. In a comparative evaluation of several published datasets, SIGORA outperformed traditional methods by delivering biologically more plausible and relevant results. Availability. An efficient implementation of SIGORA, as an R package with precompiled GPS data for several human and mouse pathway repositories, is available for download from http://sigora.googlecode.com/svn/. PMID:24432194
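The abstract does not give SIGORA's internals, but the core of any over-representation analysis is a hypergeometric tail probability: how surprising is it to see k signature gene pairs among the n pairs covered by the user's list of interest? A minimal stdlib sketch, with all counts hypothetical:

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(N, K, n): drawing n items
    without replacement from N, of which K are 'successes'."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

# Hypothetical numbers: 5000 gene pairs in total, 40 of them form one
# pathway's signature, the user's gene list covers 100 pairs, and 8 of
# those are signature pairs -- is the pathway over-represented?
p = hypergeom_sf(8, 5000, 40, 100)   # far below any usual alpha
```

SIGORA itself applies this idea to pathway gene-pair signatures rather than individual genes, with additional weighting; the R package holds the actual implementation.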

  14. The Content of Statistical Requirements for Authors in Biomedical Research Journals.

    PubMed

    Liu, Tian-Yi; Cai, Si-Yu; Nie, Xiao-Lu; Lyu, Ya-Qi; Peng, Xiao-Xia; Feng, Guo-Shuang

    2016-10-20

    Robust statistical design, sound statistical analysis, and standardized presentation are important to enhance the quality and transparency of biomedical research. This systematic review was conducted to summarize the statistical reporting requirements introduced by biomedical research journals with an impact factor of 10 or above, so that researchers give serious consideration to statistical issues not only at the stage of data analysis but also at the stage of methodological design. Detailed statistical instructions for authors were downloaded from the homepage of each of the included journals or obtained directly from the editors via email. We then described the types and numbers of statistical guidelines introduced by different press groups. Items of statistical reporting guidelines, as well as particular requirements, were summarized by frequency and grouped into design, method of analysis, and presentation, respectively. Finally, updated statistical guidelines and particular requirements for improvement were summed up. In total, 21 of 23 press groups introduced at least one statistical guideline. More than half of the press groups gradually update their statistical instructions for authors as new statistical reporting guidelines are issued. In addition, 16 press groups, covering 44 journals, address particular statistical requirements. Most of the particular requirements focused on the performance of statistical analysis and transparency in statistical reporting, including "address issues relevant to research design, including participant flow diagram, eligibility criteria, and sample size estimation" and "statistical methods and the reasons." Statistical requirements for authors are becoming increasingly refined. They remind researchers to give sufficient consideration not only to statistical methods during research design but also to standardized statistical reporting, which would be beneficial in providing stronger evidence and making critical appraisal of evidence more accessible.

  15. Sparse approximation of currents for statistics on curves and surfaces.

    PubMed

    Durrleman, Stanley; Pennec, Xavier; Trouvé, Alain; Ayache, Nicholas

    2008-01-01

    Computing, processing, and visualizing statistics on shapes such as curves or surfaces is a real challenge, with applications ranging from medical image analysis to computational geometry. Modelling such geometrical primitives with currents avoids both feature-based approaches and point-correspondence methods. This framework has proved powerful for registering brain surfaces and for measuring geometrical invariants. However, while state-of-the-art methods perform pairwise registrations efficiently, new numerical schemes are required to process groupwise statistics, because complexity increases as the database grows. Statistics such as the mean and principal modes of a set of shapes often have a heavy and highly redundant representation. We therefore propose to find an adapted basis on which the mean and principal modes have a sparse decomposition. Besides the computational improvement, this sparse representation offers a way to visualize and interpret statistics on currents. Experiments show the relevance of the approach on 34 sets of 70 sulcal lines and on 50 sets of 10 meshes of deep brain structures.

  16. Diffusion-Based Density-Equalizing Maps: an Interdisciplinary Approach to Visualizing Homicide Rates and Other Georeferenced Statistical Data

    NASA Astrophysics Data System (ADS)

    Mazzitello, Karina I.; Candia, Julián

    2012-12-01

    In every country, public and private agencies allocate extensive funding to collect large-scale statistical data, which in turn are studied and analyzed in order to determine local, regional, national, and international policies regarding all aspects relevant to the welfare of society. One important aspect of that process is the visualization of statistical data with embedded geographical information, which most often relies on archaic methods such as maps colored according to graded scales. In this work, we apply nonstandard visualization techniques based on physical principles. We illustrate the method with recent statistics on homicide rates in Brazil and their correlation to other publicly available data. This physics-based approach provides a novel tool that can be used by interdisciplinary teams investigating statistics and model projections in a variety of fields such as economics and gross domestic product research, public health and epidemiology, sociodemographics, political science, business and marketing, and many others.

  17. Text mining by Tsallis entropy

    NASA Astrophysics Data System (ADS)

    Jamaati, Maryam; Mehri, Ali

    2018-01-01

    Long-range correlations between the elements of natural languages enable them to convey very complex information. The complex structure of human language, as a manifestation of natural languages, motivates us to apply nonextensive statistical mechanics to text mining. Tsallis entropy appropriately ranks terms' relevance to the document subject, taking advantage of their spatial correlation length. We apply this statistical concept as a new, powerful word-ranking metric for extracting the keywords of a single document. We carry out an experimental evaluation, which shows the capability of the presented method for keyword extraction. We find that Tsallis entropy has reliable word-ranking performance, at the same level as the best previous ranking methods.
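As a rough illustration of the entropy in question (not the paper's full spatial-correlation ranking), the Tsallis entropy of a word's distribution over text segments follows directly from its definition; the choice q = 2 and the toy distributions below are assumptions:

```python
from math import log

def tsallis_entropy(probs, q=2.0):
    """Tsallis entropy S_q = (1 - sum_i p_i^q) / (q - 1) for q != 1.
    In the limit q -> 1 it reduces to the Shannon entropy."""
    if abs(q - 1.0) < 1e-12:
        return -sum(p * log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)

# A word spread evenly across 4 text segments (uninformative) versus one
# concentrated in a single segment (a keyword candidate):
uniform = tsallis_entropy([0.25] * 4)               # 0.75 for q = 2
peaked = tsallis_entropy([0.85, 0.05, 0.05, 0.05])  # lower entropy
```

Ranking words by such an entropy of their positional distribution favors terms that cluster in specific parts of a document, which is the intuition behind the keyword metric.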

  18. Global Topology Optimisation

    DTIC Science & Technology

    2016-10-31

    statistical physics. Sec. IV includes several examples of the application of the stochastic method, including matching of a shape to a fixed design, and...an important part of any future application of this method. Second, re-initialization of the level set can lead to small but significant movements of...of engineering design problems [6, 17]. However, many of the relevant applications involve non-convex optimisation problems with multiple locally

  19. Comparison of the perceived relevance of oral biology reported by students and interns of a Pakistani dental college.

    PubMed

    Farooq, I; Ali, S

    2014-11-01

    The purpose of this study was to analyse and compare the perceived relevance of oral biology with dentistry as reported by dental students and interns and to investigate the most popular teaching approach and learning resource. A questionnaire aiming to ask about the relevance of oral biology to dentistry, most popular teaching method and learning resource was utilised in this study. Study groups encompassed second-year dental students who had completed their course and dental interns. The data were obtained and analysed statistically. The overall response rate for both groups was 60%. Both groups reported high relevance of oral biology to dentistry. Perception of dental interns regarding the relevance of oral biology to dentistry was higher than that of students. Both groups identified student presentations as the most important teaching method. Amongst the most important learning resources, textbooks were considered most imperative by interns, whereas lecture handouts received the highest importance score by students. Dental students and interns considered oral biology to be relevant to dentistry, although greater relevance was reported by interns. Year-wise advancement in dental education and training improves the perception of the students about the relevance of oral biology to dentistry. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  20. Statistical Relational Learning to Predict Primary Myocardial Infarction from Electronic Health Records

    PubMed Central

    Weiss, Jeremy C; Page, David; Peissig, Peggy L; Natarajan, Sriraam; McCarty, Catherine

    2013-01-01

    Electronic health records (EHRs) are an emerging relational domain with large potential to improve clinical outcomes. We apply two statistical relational learning (SRL) algorithms to the task of predicting primary myocardial infarction. We show that one SRL algorithm, relational functional gradient boosting, outperforms propositional learners particularly in the medically-relevant high recall region. We observe that both SRL algorithms predict outcomes better than their propositional analogs and suggest how our methods can augment current epidemiological practices. PMID:25360347

  1. A note on generalized Genome Scan Meta-Analysis statistics

    PubMed Central

    Koziol, James A; Feng, Anne C

    2005-01-01

    Background Wise et al. introduced a rank-based statistical technique for meta-analysis of genome scans, the Genome Scan Meta-Analysis (GSMA) method. Levinson et al. recently described two generalizations of the GSMA statistic: (i) a weighted version of the GSMA statistic, so that different studies can be ascribed different weights in the analysis; and (ii) an order statistic approach, reflecting the fact that a GSMA statistic can be computed for each chromosomal region or bin width across the various genome scan studies. Results We provide an Edgeworth approximation to the null distribution of the weighted GSMA statistic, examine the limiting distribution of the GSMA statistics under the order statistic formulation, and quantify the relevance of the pairwise correlations of the GSMA statistics across different bins to this limiting distribution. We also remark on aggregate criteria and multiple testing for determining the significance of GSMA results. Conclusion Theoretical considerations detailed herein can lead to clarification and simplification of testing criteria for generalizations of the GSMA statistic. PMID:15717930
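The weighted GSMA statistic described above can be sketched in a few lines: each study ranks its genomic bins by linkage evidence, and a bin's statistic is the (weighted) sum of its ranks across studies. Tie handling and normalization are omitted, and the scores below are hypothetical:

```python
def within_study_ranks(scores):
    """Rank bins within one study: 1 = weakest evidence ... m = strongest."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def weighted_gsma(rank_matrix, weights):
    """Weighted GSMA statistic per bin: weighted sum across studies of the
    bin's within-study rank. Equal weights recover the original GSMA."""
    n_bins = len(rank_matrix[0])
    return [sum(w * study[b] for w, study in zip(weights, rank_matrix))
            for b in range(n_bins)]

# Hypothetical linkage scores for 4 bins observed in 3 studies:
scores = [[1.2, 3.4, 0.5, 2.0],
          [0.9, 2.8, 1.1, 3.0],
          [2.5, 3.1, 0.7, 1.8]]
ranks = [within_study_ranks(s) for s in scores]
gsma = weighted_gsma(ranks, weights=[1.0, 1.0, 1.0])  # bin 1 ranks highest
```

Significance is then judged against the null distribution of these summed ranks, which is what the Edgeworth approximation in the paper targets for the weighted case.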

  2. Multivariate analysis: A statistical approach for computations

    NASA Astrophysics Data System (ADS)

    Michu, Sachin; Kaushik, Vandana

    2014-10-01

    Multivariate analysis is a statistical approach commonly used in automotive diagnosis, education, and cluster evaluation in finance, and more recently in the health-related professions. The objective of this paper is to provide a detailed exploratory discussion of factor analysis (FA) in image retrieval and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database, based on their content. The problem is made more difficult by the high dimension of the variable space in which the images are represented. Multivariate correlation analysis proposes an anomaly detection and analysis method based on the correlation coefficient matrix. Anomalous behaviors in the network include various attacks on the network, such as DDoS attacks and network scanning.

  3. Active browsing using similarity pyramids

    NASA Astrophysics Data System (ADS)

    Chen, Jau-Yuen; Bouman, Charles A.; Dalton, John C.

    1998-12-01

    In this paper, we describe a new approach to managing large image databases, which we call active browsing. Active browsing integrates relevance feedback into the browsing environment, so that users can modify the database's organization to suit the desired task. Our method is based on a similarity pyramid data structure, which hierarchically organizes the database, so that it can be efficiently browsed. At coarse levels, the similarity pyramid allows users to view the database as large clusters of similar images. Alternatively, users can 'zoom into' finer levels to view individual images. We discuss relevance feedback for the browsing process, and argue that it is fundamentally different from relevance feedback for more traditional search-by-query tasks. We propose two fundamental operations for active browsing: pruning and reorganization. Both of these operations depend on a user-defined relevance set, which represents the image or set of images desired by the user. We present statistical methods for accurately pruning the database, and we propose a new 'worm hole' distance metric for reorganizing the database, so that members of the relevance set are grouped together.

  4. Economic Psychology: Its Connections with Research-Oriented Courses

    ERIC Educational Resources Information Center

    Christopher, Andrew N.; Marek, Pam; Benigno, Joann

    2003-01-01

    To enhance student interest in research methods, tests and measurement, and statistics classes, we describe how teachers may use resources from economic psychology to illustrate key concepts in these courses. Because of their applied nature and relevance to student experiences, topics covered by these resources may capture student attention and…

  5. Analysis and Interpretation of Findings Using Multiple Regression Techniques

    ERIC Educational Resources Information Center

    Hoyt, William T.; Leierer, Stephen; Millington, Michael J.

    2006-01-01

    Multiple regression and correlation (MRC) methods form a flexible family of statistical techniques that can address a wide variety of different types of research questions of interest to rehabilitation professionals. In this article, we review basic concepts and terms, with an emphasis on interpretation of findings relevant to research questions…

  6. Fear and loathing: undergraduate nursing students' experiences of a mandatory course in applied statistics.

    PubMed

    Hagen, Brad; Awosoga, Oluwagbohunmi A; Kellett, Peter; Damgaard, Marie

    2013-04-23

    This article describes the results of a qualitative research study evaluating nursing students' experiences of a mandatory course in applied statistics, and the perceived effectiveness of teaching methods implemented during the course. Fifteen nursing students in the third year of a four-year baccalaureate program in nursing participated in focus groups before and after taking the mandatory course in statistics. The interviews were transcribed and analyzed using content analysis to reveal four major themes: (i) "one of those courses you throw out?," (ii) "numbers and terrifying equations," (iii) "first aid for statistics casualties," and (iv) "re-thinking curriculum." Overall, the data revealed that although nursing students initially enter statistics courses with considerable skepticism, fear, and anxiety, there are a number of concrete actions statistics instructors can take to reduce student fear and increase the perceived relevance of courses in statistics.

  7. Statistical science: a grammar for research.

    PubMed

    Cox, David R

    2017-06-01

    I greatly appreciate the invitation to give this lecture, with its century-long history. The title is a warning that the lecture is rather discursive and not highly focused and technical. The theme is simple: statistical thinking provides a unifying set of general ideas and specific methods, relevant whenever appreciable natural variation is present. To be most fruitful, these ideas should merge seamlessly with subject-matter considerations. By contrast, there is sometimes a temptation to regard formal statistical analysis as a ritual to be added after the serious work has been done, a ritual to satisfy convention, referees, and regulatory agencies. I want implicitly to refute that idea.

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wurtz, R.; Kaplan, A.

    Pulse shape discrimination (PSD) is a variety of statistical classifier. Fully-realized statistical classifiers rely on a comprehensive set of tools for designing, building, and implementing them. PSD advances rely on improvements to the implemented algorithm and can draw on conventional statistical classifier or machine learning methods. This paper provides the reader with a glossary of classifier-building elements and their functions in a fully-designed and operational classifier framework, which can be used to discover opportunities for improving PSD classifier projects. This paper recommends reporting the PSD classifier's receiver operating characteristic (ROC) curve and its behavior at a gamma rejection rate (GRR) relevant for realistic applications.
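An empirical ROC curve of the kind the paper recommends reporting can be traced by sweeping a threshold over the classifier's scores. This is a generic sketch, not the paper's code: "signal" stands for neutron-like events, "background" for gammas, and the toy scores are hypothetical; the gamma rejection rate is simply 1 minus the false-positive rate:

```python
def roc_points(signal_scores, background_scores):
    """Empirical ROC: for each candidate threshold, the fraction of signal
    events accepted (TPR) vs. background events accepted (FPR)."""
    thresholds = sorted(set(signal_scores) | set(background_scores))
    pts = []
    for t in thresholds:
        tpr = sum(s >= t for s in signal_scores) / len(signal_scores)
        fpr = sum(b >= t for b in background_scores) / len(background_scores)
        pts.append((fpr, tpr))
    return pts

# Toy PSD scores: neutrons tend to score high, gammas low.
pts = roc_points([0.8, 0.9, 0.7, 0.6], [0.2, 0.3, 0.4, 0.8])
```

Reading off the point whose FPR corresponds to the application's required GRR (e.g. FPR = 1e-3 for GRR = 99.9%) gives the operating neutron acceptance.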

  9. Methods to Approach Velocity Data Reduction and Their Effects on Conformation Statistics in Viscoelastic Turbulent Channel Flows

    NASA Astrophysics Data System (ADS)

    Samanta, Gaurab; Beris, Antony; Handler, Robert; Housiadas, Kostas

    2009-03-01

    Karhunen-Loeve (KL) analysis of DNS data of viscoelastic turbulent channel flows helps us reveal more information on the time-dependent dynamics of viscoelastic modification of turbulence [Samanta et al., J. Turbulence (in press), 2008]. A selected set of KL modes can be used for data reduction modeling of these flows. However, it is pertinent that verification be done against established DNS results. For this purpose, we compared velocity and conformation statistics and probability density functions (PDFs) of relevant quantities obtained from DNS with those from fields reconstructed using selected KL modes and time-dependent coefficients. While the velocity statistics show good agreement between DNS and KL reconstructions even with just hundreds of KL modes, tens of thousands of KL modes are required to adequately capture the trace of the polymer conformation resulting from DNS. New modifications to the KL method have therefore been attempted to account for the differences in conformation statistics. The applicability and impact of these new modified KL methods will be discussed from the perspective of data reduction modeling.
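A minimal sketch of the kind of KL (proper orthogonal) data reduction described, via an SVD of a snapshot matrix; the matrix here is synthetic, and its sizes, noise level, and retained rank are assumptions rather than the DNS dimensions from the study:

```python
import numpy as np

# Synthetic snapshot matrix: each column is one "velocity snapshot"
# built from 3 dominant structures plus small noise.
rng = np.random.default_rng(0)
modes_true = rng.standard_normal((200, 3))
coeffs = rng.standard_normal((3, 50))
X = modes_true @ coeffs + 0.01 * rng.standard_normal((200, 50))

# KL (POD) modes via SVD; keep the k most energetic modes.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
k = 3
X_k = (U[:, :k] * S[:k]) @ Vt[:k, :]   # rank-k reconstruction

# Relative reconstruction error: small when k captures the dynamics.
rel_err = np.linalg.norm(X - X_k) / np.linalg.norm(X)
```

Statistics (means, PDFs) computed from `X_k` can then be compared against those from the full data, mirroring the DNS-versus-reconstruction comparison in the abstract.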

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nose, Y.

    Methods were developed for generating an integrated, statistical model of the anatomical structures within the human thorax relevant to radioisotope powered artificial heart implantation. These methods involve measurement and analysis of anatomy in four areas: chest wall, pericardium, vascular connections, and great vessels. A model for the prediction of thorax outline from radiograms was finalized. These models were combined with 100 radiograms to arrive at a size distribution representing the adult male and female populations. (CH)

  11. Establishing Statistical Equivalence of Data from Different Sampling Approaches for Assessment of Bacterial Phenotypic Antimicrobial Resistance

    PubMed Central

    2018-01-01

    ABSTRACT To assess phenotypic bacterial antimicrobial resistance (AMR) in different strata (e.g., host populations, environmental areas, manure, or sewage effluents) for epidemiological purposes, isolates of target bacteria can be obtained from a stratum using various sample types. Also, different sample processing methods can be applied. The MIC of each target antimicrobial drug for each isolate is measured. Statistical equivalence testing of the MIC data for the isolates allows evaluation of whether different sample types or sample processing methods yield equivalent estimates of the bacterial antimicrobial susceptibility in the stratum. We demonstrate this approach on the antimicrobial susceptibility estimates for (i) nontyphoidal Salmonella spp. from ground or trimmed meat versus cecal content samples of cattle in processing plants in 2013-2014 and (ii) nontyphoidal Salmonella spp. from urine, fecal, and blood human samples in 2015 (U.S. National Antimicrobial Resistance Monitoring System data). We found that the sample types for cattle yielded nonequivalent susceptibility estimates for several antimicrobial drug classes and thus may gauge distinct subpopulations of salmonellae. The quinolone and fluoroquinolone susceptibility estimates for nontyphoidal salmonellae from human blood are nonequivalent to those from urine or feces, conjecturally due to the fluoroquinolone (ciprofloxacin) use to treat infections caused by nontyphoidal salmonellae. We also demonstrate statistical equivalence testing for comparing sample processing methods for fecal samples (culturing one versus multiple aliquots per sample) to assess AMR in fecal Escherichia coli. These methods yield equivalent results, except for tetracyclines. Importantly, statistical equivalence testing provides the MIC difference at which the data from two sample types or sample processing methods differ statistically. 
Data users (e.g., microbiologists and epidemiologists) may then interpret practical relevance of the difference. IMPORTANCE Bacterial antimicrobial resistance (AMR) needs to be assessed in different populations or strata for the purposes of surveillance and determination of the efficacy of interventions to halt AMR dissemination. To assess phenotypic antimicrobial susceptibility, isolates of target bacteria can be obtained from a stratum using different sample types or employing different sample processing methods in the laboratory. The MIC of each target antimicrobial drug for each of the isolates is measured, yielding the MIC distribution across the isolates from each sample type or sample processing method. We describe statistical equivalence testing for the MIC data for evaluating whether two sample types or sample processing methods yield equivalent estimates of the bacterial phenotypic antimicrobial susceptibility in the stratum. This includes estimating the MIC difference at which the data from the two approaches differ statistically. Data users (e.g., microbiologists, epidemiologists, and public health professionals) can then interpret whether that present difference is practically relevant. PMID:29475868
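The record does not spell out the equivalence procedure, but a minimal two-one-sided-tests (TOST) sketch conveys the idea: on the log2(MIC) scale, test whether the mean difference between two sample types lies within a margin of one doubling dilution. The large-sample z approximation and the readings below are assumptions, not the study's method or data:

```python
from statistics import NormalDist, mean, stdev

def tost_equivalence(x, y, margin):
    """Two one-sided tests (TOST) for equivalence of means, large-sample
    z approximation. Returns the larger of the two one-sided p-values;
    equivalence within +/- margin is declared when it falls below alpha."""
    nx, ny = len(x), len(y)
    se = (stdev(x) ** 2 / nx + stdev(y) ** 2 / ny) ** 0.5
    d = mean(x) - mean(y)
    p_lower = 1 - NormalDist().cdf((d + margin) / se)  # H0: d <= -margin
    p_upper = NormalDist().cdf((d - margin) / se)      # H0: d >= +margin
    return max(p_lower, p_upper)

# Hypothetical log2(MIC) readings from two sample types; margin of one
# doubling dilution (1.0 on the log2 scale):
a = [2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1, 1.9]
b = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1]
p = tost_equivalence(a, b, margin=1.0)  # small p => equivalent within margin
```

Real MIC analyses use censored, discrete dilution data and t-based or specialized methods; this sketch only illustrates the difference-versus-equivalence logic the abstract emphasizes.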

  12. Establishing Statistical Equivalence of Data from Different Sampling Approaches for Assessment of Bacterial Phenotypic Antimicrobial Resistance.

    PubMed

    Shakeri, Heman; Volkova, Victoriya; Wen, Xuesong; Deters, Andrea; Cull, Charley; Drouillard, James; Müller, Christian; Moradijamei, Behnaz; Jaberi-Douraki, Majid

    2018-05-01

    To assess phenotypic bacterial antimicrobial resistance (AMR) in different strata (e.g., host populations, environmental areas, manure, or sewage effluents) for epidemiological purposes, isolates of target bacteria can be obtained from a stratum using various sample types. Also, different sample processing methods can be applied. The MIC of each target antimicrobial drug for each isolate is measured. Statistical equivalence testing of the MIC data for the isolates allows evaluation of whether different sample types or sample processing methods yield equivalent estimates of the bacterial antimicrobial susceptibility in the stratum. We demonstrate this approach on the antimicrobial susceptibility estimates for (i) nontyphoidal Salmonella spp. from ground or trimmed meat versus cecal content samples of cattle in processing plants in 2013-2014 and (ii) nontyphoidal Salmonella spp. from urine, fecal, and blood human samples in 2015 (U.S. National Antimicrobial Resistance Monitoring System data). We found that the sample types for cattle yielded nonequivalent susceptibility estimates for several antimicrobial drug classes and thus may gauge distinct subpopulations of salmonellae. The quinolone and fluoroquinolone susceptibility estimates for nontyphoidal salmonellae from human blood are nonequivalent to those from urine or feces, conjecturally due to the fluoroquinolone (ciprofloxacin) use to treat infections caused by nontyphoidal salmonellae. We also demonstrate statistical equivalence testing for comparing sample processing methods for fecal samples (culturing one versus multiple aliquots per sample) to assess AMR in fecal Escherichia coli. These methods yield equivalent results, except for tetracyclines. Importantly, statistical equivalence testing provides the MIC difference at which the data from two sample types or sample processing methods differ statistically. 
Data users (e.g., microbiologists and epidemiologists) may then interpret practical relevance of the difference. IMPORTANCE Bacterial antimicrobial resistance (AMR) needs to be assessed in different populations or strata for the purposes of surveillance and determination of the efficacy of interventions to halt AMR dissemination. To assess phenotypic antimicrobial susceptibility, isolates of target bacteria can be obtained from a stratum using different sample types or employing different sample processing methods in the laboratory. The MIC of each target antimicrobial drug for each of the isolates is measured, yielding the MIC distribution across the isolates from each sample type or sample processing method. We describe statistical equivalence testing for the MIC data for evaluating whether two sample types or sample processing methods yield equivalent estimates of the bacterial phenotypic antimicrobial susceptibility in the stratum. This includes estimating the MIC difference at which the data from the two approaches differ statistically. Data users (e.g., microbiologists, epidemiologists, and public health professionals) can then interpret whether that present difference is practically relevant. Copyright © 2018 Shakeri et al.

  13. Assessment of NDE Reliability Data

    NASA Technical Reports Server (NTRS)

    Yee, B. G. W.; Chang, F. H.; Couchman, J. C.; Lemon, G. H.; Packman, P. F.

    1976-01-01

    Twenty sets of relevant Nondestructive Evaluation (NDE) reliability data have been identified, collected, compiled, and categorized. A criterion for the selection of data for statistical analysis considerations has been formulated. A model to grade the quality and validity of the data sets has been developed. Data input formats, which record the pertinent parameters of the defect/specimen and inspection procedures, have been formulated for each NDE method. A comprehensive computer program has been written to calculate the probability of flaw detection at several confidence levels by the binomial distribution. This program also selects the desired data sets for pooling and tests the statistical pooling criteria before calculating the composite detection reliability. Probability of detection curves at 95 and 50 percent confidence levels have been plotted for individual sets of relevant data as well as for several sets of merged data with common sets of NDE parameters.
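The binomial probability-of-detection calculation mentioned above can be sketched as a Clopper-Pearson-style lower confidence bound found by bisection (stdlib only). The "29 of 29" case is the classic NDE demonstration for claiming 90% POD at 95% confidence; the implementation details here are a sketch, not the report's program:

```python
from math import comb

def binom_sf_ge(s, n, p):
    """P(X >= s) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(s, n + 1))

def pod_lower_bound(s, n, conf=0.95, tol=1e-8):
    """One-sided lower confidence bound on the probability of detection,
    given s detections in n inspections (Clopper-Pearson, via bisection)."""
    if s == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_sf_ge(s, n, mid) < 1 - conf:
            lo = mid   # mid too small: >= s detections would be too unlikely
        else:
            hi = mid
    return lo

# All 29 of 29 seeded flaws detected supports POD >= 0.90 at 95% confidence.
pod90 = pod_lower_bound(29, 29, conf=0.95)
```

Evaluating the bound over flaw-size groups yields the POD-versus-flaw-size curves at the 95 and 50 percent confidence levels described in the abstract.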

  14. Publications in anesthesia journals: quality and clinical relevance.

    PubMed

    Lauritsen, Jakob; Moller, Ann M

    2004-11-01

    Clinicians performing evidence-based anesthesia rely on anesthesia journals for clinically relevant information. The objective of this study was to analyze the proportion of clinically relevant articles in five high-impact anesthesia journals. We evaluated all articles published in Anesthesiology, Anesthesia & Analgesia, British Journal of Anaesthesia, Anaesthesia, and Acta Anaesthesiologica Scandinavica from January to June 2000. Articles were assessed and classified according to type, outcome, and design; 1379 articles consisting of 5468 pages were evaluated and categorized. The most common types of article were animal and laboratory research (31.2%) and randomized clinical trials (20.4%). A clinically relevant article was defined as one that used a statistically valid method and had a clinically relevant end-point. Altogether, 18.6% of the pages had clinically relevant trials as their subject matter. We compared the Journal Impact Factor (a measure of the number of citations per article in a journal) with the proportion of clinically relevant pages and found that they were inversely proportional to each other.

  15. 76 FR 12823 - Enhanced Collection of Relevant Data and Statistics Relating to Women

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-03-09

    ... greater understanding of policies and programs. Preparation of this report revealed the vast data... Collection of Relevant Data and Statistics Relating to Women Memorandum for the Heads of Executive... accompanying website collection of relevant data, will assist Government officials in crafting policies in...

  16. Analysis of respiratory events in obstructive sleep apnea syndrome: Inter-relations and association to simple nocturnal features.

    PubMed

    Ghandeharioun, H; Rezaeitalab, F; Lotfi, R

    2016-01-01

    This study carefully evaluates the associations of different respiration-related events with each other and with simple nocturnal features in obstructive sleep apnea-hypopnea syndrome (OSAS). The events include apneas, hypopneas, respiratory event-related arousals (RERAs) and snores. We conducted a statistical study of 158 adults who underwent polysomnography between July 2012 and May 2014. To assess relevance, the non-linear method of mutual information was applied alongside linear statistical strategies, such as analysis of variance and bootstrapping of a correlation coefficient standard error, to illuminate ambiguous results of the linear techniques. Based on normalized mutual information weights (NMIW), indices of apnea are 1.3 times more relevant to AHI values than those of hypopnea. NMIW for the number of blood oxygen desaturations below 95% is considerable (0.531). The next most relevant feature is the respiratory arousal index, with an NMIW of 0.501. Snore indices (0.314) and BMI (0.203) take the next places. Based on NMIW values, snoring events are nearly one-third (29.9%) more dependent on hypopneas than on RERAs. 1. The more severe the OSAS is, the more frequently the apneic events happen. 2. An association of snoring with hypopnea/RERA was revealed, which is routinely ignored in regression-based OSAS modeling. 3. The statistical dependencies of oximetry features can potentially lead to home-based screening of OSAS. 4. The poor ESS-AHI relevance in the database under study indicates its inadequacy for OSA diagnosis compared to oximetry. 5. Based on the poor RERA-snore/ESS relevance, a detailed history of the symptoms plus polysomnography is suggested for accurate diagnosis of RERAs. Copyright © 2015 Sociedade Portuguesa de Pneumologia. Published by Elsevier España, S.L.U. All rights reserved.
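The abstract does not define NMIW exactly; one common construction, sketched here with hypothetical discrete event data, is the mutual information normalized by the larger of the two marginal entropies:

```python
from math import log
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in nats, estimated from paired discrete samples."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def entropy(xs):
    n = len(xs)
    return -sum((c / n) * log(c / n) for c in Counter(xs).values())

def nmi(xs, ys):
    """Normalized mutual information weight in [0, 1] (one common choice:
    divide by the larger marginal entropy)."""
    h = max(entropy(xs), entropy(ys))
    return mutual_information(xs, ys) / h if h > 0 else 0.0

# Toy discretized event counts per night segment (hypothetical):
w_self = nmi([0, 0, 1, 1, 2, 2], [0, 0, 1, 1, 2, 2])  # fully dependent
w_ind = nmi([0, 0, 1, 1], [0, 1, 0, 1])               # independent
```

Unlike a correlation coefficient, this weight captures non-linear dependence, which is why the study uses it alongside ANOVA and bootstrapping.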

  17. A statistical assessment of differences and equivalences between genetically modified and reference plant varieties

    PubMed Central

    2011-01-01

    Background Safety assessment of genetically modified organisms is currently often performed by comparative evaluation. However, natural variation of plant characteristics between commercial varieties is usually not considered explicitly in the statistical computations underlying the assessment. Results Statistical methods are described for the assessment of the difference between a genetically modified (GM) plant variety and a conventional non-GM counterpart, and for the assessment of the equivalence between the GM variety and a group of reference plant varieties which have a history of safe use. It is proposed to present the results of both difference and equivalence testing for all relevant plant characteristics simultaneously in one or a few graphs, as an aid for further interpretation in safety assessment. A procedure is suggested to derive equivalence limits from the observed results for the reference plant varieties using a specific implementation of the linear mixed model. Three different equivalence tests are defined to classify any result in one of four equivalence classes. The performance of the proposed methods is investigated by a simulation study, and the methods are illustrated on compositional data from a field study on maize grain. Conclusions A clear distinction of practical relevance is shown between difference and equivalence testing. The proposed tests are shown to have appropriate performance characteristics by simulation, and the proposed simultaneous graphical representation of results was found to be helpful for the interpretation of results from a practical field trial data set. PMID:21324199
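    The difference-versus-equivalence distinction drawn in this record can be illustrated with a minimal two one-sided tests (TOST) sketch. The numbers, equivalence limits, and normal approximation below are illustrative assumptions, not the paper's linear-mixed-model procedure for deriving limits from reference varieties.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tost_equivalence(diff, se, lower, upper, alpha=0.05):
    """Two one-sided tests (TOST): declare equivalence only if the mean
    difference is significantly above `lower` AND significantly below `upper`."""
    p_lower = 1.0 - norm_cdf((diff - lower) / se)  # H0: diff <= lower
    p_upper = norm_cdf((diff - upper) / se)        # H0: diff >= upper
    return max(p_lower, p_upper) < alpha

# Hypothetical GM-vs-reference composition endpoint with limits (-0.2, 0.2):
# a small difference with a tight standard error demonstrates equivalence...
print(tost_equivalence(diff=0.02, se=0.05, lower=-0.2, upper=0.2))  # True
# ...while the same difference with a noisy estimate does not.
print(tost_equivalence(diff=0.02, se=0.15, lower=-0.2, upper=0.2))  # False
```

    The second call shows why "not significantly different" is not the same as "equivalent": a noisy study can fail both the difference test and the equivalence test.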

  18. Big-Data RHEED analysis for understanding epitaxial film growth processes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vasudevan, Rama K; Tselev, Alexander; Baddorf, Arthur P

    Reflection high energy electron diffraction (RHEED) has by now become a standard tool for in-situ monitoring of film growth by pulsed laser deposition and molecular beam epitaxy. Yet despite this widespread adoption and the wealth of information in RHEED images, most applications are limited to observing intensity oscillations of the specular spot, and much additional information on growth is discarded. With ease of data acquisition and increased computation speeds, statistical methods to rapidly mine the dataset are now feasible. Here, we develop such an approach to the analysis of fundamental growth processes through multivariate statistical analysis of RHEED image sequences. This approach is illustrated for growth of LaxCa1-xMnO3 films grown on etched (001) SrTiO3 substrates, but is universal. The multivariate methods, including principal component analysis and k-means clustering, provide insight into the relevant behaviors and the timing and nature of a disordered-to-ordered growth change, and highlight statistically significant patterns. Fourier analysis yields the harmonic components of the signal and allows separation of the relevant components and baselines, isolating the asymmetric nature of the step density function and the transmission spots from the imperfect layer-by-layer (LBL) growth. These studies show the promise of big-data approaches to obtaining more insight into film properties during and after epitaxial film growth. Furthermore, these studies open the pathway to using forward prediction methods to potentially allow significantly more control over the growth process and hence final film quality.
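    A toy version of the k-means step in such an analysis, applied to hypothetical per-frame RHEED summary features (mean specular intensity, oscillation amplitude). This is a sketch of the clustering idea, not the authors' pipeline, and assumes k >= 2.

```python
def kmeans(points, k, iters=20):
    """Plain k-means on tuples; deterministic init (points spread across the
    input) so the toy run below is reproducible. Assumes k >= 2."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    centers = [points[round(i * (len(points) - 1) / (k - 1))] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda j: d2(p, centers[j]))].append(p)
        centers = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
                   for j, cl in enumerate(clusters)]
    return [min(range(k), key=lambda j: d2(p, centers[j])) for p in points]

# Hypothetical per-frame features: (mean specular intensity, oscillation amplitude).
# The first four frames mimic oscillatory layer-by-layer growth, the last four a
# flatter disordered regime; k-means should separate the two.
frames = [(0.82, 0.31), (0.79, 0.28), (0.81, 0.33), (0.78, 0.30),
          (0.41, 0.02), (0.39, 0.03), (0.43, 0.01), (0.40, 0.02)]
labels = kmeans(frames, 2)
```

    On real RHEED data each "point" would be a whole image (or its principal-component scores), but the clustering logic is the same.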

  19. Psycho-Pedagogical Measuring Bases of Educational Competences of Students

    ERIC Educational Resources Information Center

    Kenzhegaliev, Kulush K.; Shayakhmetova, Aisulu A.; Zulkarnayeva, Zhamila A.; Iksatova, Balzhan K.; Shonova, Bakytgul A.

    2016-01-01

    The relevance of the research problem stems from the weak development of measurement and assessment of educational competences at an operational level (the level of actions), and from the insufficient application of psycho-pedagogical theories and methods of mathematical statistics. The aim of the work is to develop, through teaching experiments, the…

  20. Collaborative Writing in a Statistics and Research Methods Course.

    ERIC Educational Resources Information Center

    Dunn, Dana S.

    1996-01-01

    Describes a collaborative writing project in which students must identify key variables, search and read relevant literature, and reason through a research idea by working closely with a partner. The end result is a polished laboratory report in the APA style. The class includes a peer review workshop prior to final editing. (MJP)

  1. Testing for Additivity in Chemical Mixtures Using a Fixed-Ratio Ray Design and Statistical Equivalence Testing Methods

    EPA Science Inventory

    Fixed-ratio ray designs have been used for detecting and characterizing interactions of large numbers of chemicals in combination. Single chemical dose-response data are used to predict an “additivity curve” along an environmentally relevant ray. A “mixture curve” is estimated fr...

  2. Rapid label-free identification of Klebsiella pneumoniae antibiotic resistant strains by the drop-coating deposition surface-enhanced Raman scattering method

    NASA Astrophysics Data System (ADS)

    Cheong, Youjin; Kim, Young Jin; Kang, Heeyoon; Choi, Samjin; Lee, Hee Joo

    2017-08-01

    Although many methodologies have been developed to identify unknown bacteria, bacterial identification in clinical microbiology remains a complex and time-consuming procedure. To address this problem, we developed a label-free method for rapidly identifying clinically relevant multilocus sequencing typing-verified quinolone-resistant Klebsiella pneumoniae strains. We also applied the method to identify three strains from colony samples, ATCC70063 (control), ST11 and ST15; these are the prevalent quinolone-resistant K. pneumoniae strains in East Asia. The colonies were identified using a drop-coating deposition surface-enhanced Raman scattering (DCD-SERS) procedure coupled with a multivariate statistical method. Our workflow exhibited an enhancement factor of 11.3 × 10^6 to Raman intensities, high reproducibility (relative standard deviation of 7.4%), and a sensitive limit of detection (100 pM rhodamine 6G), with a correlation coefficient of 0.98. All quinolone-resistant K. pneumoniae strains showed similar spectral Raman shifts (high correlations) regardless of bacterial type, as well as different Raman vibrational modes compared to Escherichia coli strains. Our proposed DCD-SERS procedure coupled with the multivariate statistics-based identification method achieved excellent performance in discriminating similar microbes from one another and also in subtyping of K. pneumoniae strains. Therefore, our label-free DCD-SERS procedure coupled with the computational decision supporting method is a potentially useful method for the rapid identification of clinically relevant K. pneumoniae strains.

  3. External cooling methods for treatment of fever in adults: a systematic review.

    PubMed

    Chan, E Y; Chen, W T; Assam, P N

    It is unclear if the use of external cooling to treat fever contributes to better patient outcomes. Despite this, it is common practice to treat febrile patients using external cooling methods alone or in combination with pharmacological antipyretics. The objective of this systematic review was to evaluate the effectiveness and complications of external cooling methods in febrile adults in acute care settings. We included adults admitted to acute care settings who developed elevated body temperature. We considered any external cooling method compared to no cooling, and we considered randomised controlled trials (RCTs), quasi-randomised trials and controlled trials with concurrent control groups. We searched relevant published or unpublished studies up to October 2009 regardless of language, covering major databases, reference lists and bibliographies of all relevant articles, and contacted experts in the field for additional studies. Two reviewers independently screened titles and abstracts and retrieved all potentially relevant studies. Two reviewers independently assessed the methodological quality of the included studies. Where appropriate, the results of studies were quantitatively summarised. Relative risks or weighted mean differences and their 95% confidence intervals were calculated using the random-effects model in Review Manager 5. For each pooled comparison, heterogeneity was assessed using the chi-squared test at the 5% level of statistical significance, with the I² statistic used to assess the impact of statistical heterogeneity on study results. Where a statistical summary was not appropriate or possible, the findings were summarised in narrative form. We found six RCTs that compared the effectiveness and complications of external cooling methods against no external cooling. There was wide variation in the outcome measures between the included trials.
We performed meta-analyses on data from two RCTs totalling 356 patients testing external cooling combined with antipyretics versus antipyretics alone, for the resolution of fever. The results did not show a statistically significant reduction in fever (relative risk 1.12, 95% CI 0.95 to 1.31; P=0.35; I²=0%). The evidence from four trials suggested that there was no difference in the mean drop in body temperature after treatment initiation between the external cooling and no cooling groups. The results of most other outcomes also did not demonstrate a statistically significant difference. However, a summary of five trials totalling 371 patients found that the external cooling group was more likely to shiver than the no cooling group (relative risk 6.37, 95% CI 2.01 to 20.11; P=0.61; I²=0%). Overall, this review suggested that external cooling methods (whether used alone or in combination with pharmacologic methods) were not effective in treating fever among adults admitted to acute care settings, yet they were associated with higher incidences of shivering. These results should be interpreted in light of the methodological limitations of the available trials. Given the currently available evidence, the routine use of external cooling methods to treat fever in adults may not be warranted until further evidence is available. They could be considered for patients whose conditions cannot tolerate even a slight increase in temperature or who request them. Whenever they are used, shivering should be prevented. Well-designed, adequately powered, randomised trials comparing external cooling methods against no cooling are needed.
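    The random-effects pooling and I² heterogeneity assessment used in reviews like this can be sketched with the DerSimonian-Laird method. The two study inputs below are hypothetical, not the review's data, and the 1.96 critical value assumes a normal approximation.

```python
import math

def pool_random_effects(log_effects, variances):
    """DerSimonian-Laird random-effects pooling of study effect sizes
    (e.g. log relative risks) given their within-study variances.
    Returns (pooled RR, 95% CI, I^2 in percent)."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))  # Cochran's Q
    df = len(log_effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                          # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0  # heterogeneity, %
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    ci = (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se))
    return math.exp(pooled), ci, i2

# Two hypothetical trials of cooling + antipyretics vs antipyretics alone
rr, ci, i2 = pool_random_effects([math.log(1.10), math.log(1.15)], [0.006, 0.009])
```

    A confidence interval that straddles 1.0, as in this toy run, is exactly the "no statistically significant reduction" pattern the review reports.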

  4. Conducting Simulation Studies in the R Programming Environment.

    PubMed

    Hallgren, Kevin A

    2013-10-12

    Simulation studies allow researchers to answer specific questions about data analysis, statistical power, and best-practices for obtaining accurate results in empirical research. Despite the benefits that simulation research can provide, many researchers are unfamiliar with available tools for conducting their own simulation studies. The use of simulation studies need not be restricted to researchers with advanced skills in statistics and computer programming, and such methods can be implemented by researchers with a variety of abilities and interests. The present paper provides an introduction to methods used for running simulation studies using the R statistical programming environment and is written for individuals with minimal experience running simulation studies or using R. The paper describes the rationale and benefits of using simulations and introduces R functions relevant for many simulation studies. Three examples illustrate different applications for simulation studies, including (a) the use of simulations to answer a novel question about statistical analysis, (b) the use of simulations to estimate statistical power, and (c) the use of simulations to obtain confidence intervals of parameter estimates through bootstrapping. Results and fully annotated syntax from these examples are provided.
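    A minimal Python analogue of the power-simulation use case described above (the paper itself works in R; the sample size, effect size, and normal approximation to the t test here are illustrative assumptions):

```python
import math
import random
import statistics

def simulate_power(n_per_group, effect, sims=2000, seed=1):
    """Monte Carlo power of a two-sample test for a mean shift `effect`
    (normal data, SD = 1), using a normal approximation to the t test."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(sims):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(effect, 1.0) for _ in range(n_per_group)]
        se = math.sqrt(statistics.variance(a) / n_per_group +
                       statistics.variance(b) / n_per_group)
        z = (statistics.mean(b) - statistics.mean(a)) / se
        rejections += abs(z) > 1.96          # two-sided test at alpha = 0.05
    return rejections / sims

power = simulate_power(50, 0.5)      # roughly 0.70 for d = 0.5, n = 50 per group
alpha_hat = simulate_power(50, 0.0)  # under the null, should hover near 0.05
```

    The same loop structure covers the paper's other examples: replace the rejection count with a bootstrap statistic to obtain confidence intervals instead of power.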

  5. Identifying Epigenetic Biomarkers using Maximal Relevance and Minimal Redundancy Based Feature Selection for Multi-Omics Data.

    PubMed

    Mallik, Saurav; Bhadra, Tapas; Maulik, Ujjwal

    2017-01-01

    Epigenetic biomarker discovery is an important task in bioinformatics. In this article, we develop a new framework for identifying statistically significant epigenetic biomarkers using maximal-relevance and minimal-redundancy criterion based feature (gene) selection for multi-omics datasets. Firstly, we determine the genes that have both expression and methylation values and follow a normal distribution. Similarly, we identify the genes which have both expression and methylation values but do not follow a normal distribution. For each case, we utilize a gene-selection method that provides maximally relevant, but variable-weighted minimally redundant, genes as top-ranked genes. For statistical validation, we apply a t-test on both the expression and methylation data consisting of only the normally distributed top-ranked genes to determine how many of them are both differentially expressed and methylated. Similarly, we utilize the Limma package to perform a non-parametric Empirical Bayes test on both expression and methylation data comprising only the non-normally distributed top-ranked genes to identify how many of them are both differentially expressed and methylated. We finally report the top-ranking significant gene-markers with biological validation. Moreover, our framework improves the positive predictive rate and reduces the false positive rate in marker identification. In addition, we provide a comparative analysis of our gene-selection method as well as other methods based on classification performances obtained using several well-known classifiers.
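    The maximal-relevance/minimal-redundancy idea can be sketched as a greedy ranking. Pearson correlation stands in for the relevance and redundancy measures, and the features f1-f3 are hypothetical toy "genes"; the authors' actual criterion and variable weighting are not reproduced here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def mrmr(features, target, k):
    """Greedy mRMR: each step picks the feature maximizing
    |corr(feature, target)| minus its mean |corr| with features already picked."""
    picked = []
    while len(picked) < k:
        def score(name):
            rel = abs(pearson(features[name], target))
            if not picked:
                return rel
            red = sum(abs(pearson(features[name], features[p])) for p in picked) / len(picked)
            return rel - red
        picked.append(max((f for f in features if f not in picked), key=score))
    return picked

# Toy "genes": f1 drives the target, f3 is a noisy copy of f1 (redundant),
# and f2 adds independent information.
feats = {
    "f1": [0, 1, 0, 1, 0, 1, 0, 1],
    "f3": [0, 1, 0, 1, 0, 1, 1, 0],
    "f2": [0, 0, 1, 1, 0, 0, 1, 1],
}
target = [0, 2, 1, 3, 0, 2, 1, 3]   # = 2*f1 + f2
print(mrmr(feats, target, 2))        # ['f1', 'f2'] -- f3 is skipped as redundant
```

    The redundancy penalty is what distinguishes mRMR from a plain relevance ranking, which would have placed the redundant copy f3 second.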

  6. A combined pre-clinical meta-analysis and randomized confirmatory trial approach to improve data validity for therapeutic target validation.

    PubMed

    Kleikers, Pamela W M; Hooijmans, Carlijn; Göb, Eva; Langhauser, Friederike; Rewell, Sarah S J; Radermacher, Kim; Ritskes-Hoitinga, Merel; Howells, David W; Kleinschnitz, Christoph; Schmidt, Harald H H W

    2015-08-27

    Biomedical research suffers from a dramatically poor translational success. For example, in ischemic stroke, a condition with a high medical need, over a thousand experimental drug targets were unsuccessful. Here, we adopt methods from clinical research for a late-stage pre-clinical meta-analysis (MA) and randomized confirmatory trial (pRCT) approach. A profound body of literature suggests NOX2 to be a major therapeutic target in stroke. Systematic review and MA of all available NOX2(-/y) studies revealed a positive publication bias and lack of statistical power to detect a relevant reduction in infarct size. A fully powered multi-center pRCT rejects NOX2 as a target to improve neurofunctional outcomes or achieve a translationally relevant infarct size reduction. Thus stringent statistical thresholds, reporting negative data and a MA-pRCT approach can ensure biomedical data validity and overcome risks of bias.

  7. Statistics teaching in medical school: opinions of practising doctors.

    PubMed

    Miles, Susan; Price, Gill M; Swift, Louise; Shepstone, Lee; Leinster, Sam J

    2010-11-04

    The General Medical Council expects UK medical graduates to gain some statistical knowledge during their undergraduate education but provides no specific guidance as to amount, content or teaching method. Published work on statistics teaching for medical undergraduates has been dominated by medical statisticians, with little input from the doctors who will actually be using this knowledge and these skills after graduation. Furthermore, doctors' statistical training needs may have changed due to advances in information technology and the increasing importance of evidence-based medicine. Thus there exists a need to investigate the views of practising medical doctors as to the statistical training required for undergraduate medical students, based on their own use of these skills in daily practice. A questionnaire was designed to investigate doctors' views about undergraduate training in statistics and the need for these skills in daily practice, with a view to informing future teaching. The questionnaire was emailed to all clinicians with a link to the University of East Anglia Medical School. Open-ended questions were included to elicit doctors' opinions about both their own undergraduate training in statistics and recommendations for the training of current medical students. Content analysis was performed by two of the authors to systematically categorize and describe all the responses provided by participants. 130 doctors responded, including both hospital consultants and general practitioners. The findings indicated that most had not recognised the value of their undergraduate teaching in statistics and probability at the time, but had subsequently found the skills relevant to their career. Suggestions for improving undergraduate teaching in these areas included referring to actual research and ensuring relevance to, and integration with, clinical practice.
Grounding the teaching of statistics in the context of real research studies and including examples of typical clinical work may better prepare medical students for their subsequent career.

  8. DISTMIX: direct imputation of summary statistics for unmeasured SNPs from mixed ethnicity cohorts.

    PubMed

    Lee, Donghyung; Bigdeli, T Bernard; Williamson, Vernell S; Vladimirov, Vladimir I; Riley, Brien P; Fanous, Ayman H; Bacanu, Silviu-Alin

    2015-10-01

    To increase the signal resolution for large-scale meta-analyses of genome-wide association studies, genotypes at unmeasured single nucleotide polymorphisms (SNPs) are commonly imputed using large multi-ethnic reference panels. However, the ever-increasing size and ethnic diversity of both reference panels and cohorts makes genotype imputation computationally challenging for moderately sized computer clusters. Moreover, genotype imputation requires subject-level genetic data, which, unlike the summary statistics provided by virtually all studies, is not publicly available. While there are much less demanding methods which avoid the genotype imputation step by directly imputing SNP statistics, e.g. Directly Imputing summary STatistics (DIST) proposed by our group, their implicit assumptions make them applicable only to ethnically homogeneous cohorts. To decrease computational and access requirements for the analysis of cosmopolitan cohorts, we propose DISTMIX, which extends DIST capabilities to the analysis of mixed ethnicity cohorts. The method uses a relevant reference panel to directly impute unmeasured SNP statistics based only on statistics at measured SNPs and estimated/user-specified ethnic proportions. Simulations show that the proposed method adequately controls the Type I error rates. The 1000 Genomes panel imputation of summary statistics from the ethnically diverse Psychiatric Genetic Consortium Schizophrenia Phase 2 suggests that, when compared to genotype imputation methods, DISTMIX offers comparable imputation accuracy for only a fraction of computational resources. DISTMIX software, its reference population data, and usage examples are publicly available at http://code.google.com/p/distmix. dlee4@vcu.edu Supplementary Data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  9. Identifying natural flow regimes using fish communities

    NASA Astrophysics Data System (ADS)

    Chang, Fi-John; Tsai, Wen-Ping; Wu, Tzu-Ching; Chen, Hung-kwai; Herricks, Edwin E.

    2011-10-01

    Modern water resources management has adopted natural flow regimes as reasonable targets for river restoration and conservation. The characterization of a natural flow regime begins with the development of hydrologic statistics from flow records. However, little guidance exists for defining the period of record needed for regime determination. In Taiwan, the Taiwan Eco-hydrological Indicator System (TEIS), a group of hydrologic statistics selected for fisheries relevance, is being used to evaluate ecological flows. The TEIS consists of a group of hydrologic statistics selected to characterize the relationships between flow and the life history of indigenous species. Using the TEIS and biosurvey data for Taiwan, this paper identifies the length of hydrologic record sufficient for natural flow regime characterization. To define the ecological hydrology of fish communities, this study connected hydrologic statistics to fish communities by using methods to define antecedent conditions that influence existing community composition. A moving average method was applied to TEIS statistics to reflect the effects of antecedent flow condition and a point-biserial correlation method was used to relate fisheries collections with TEIS statistics. The resulting fish species-TEIS (FISH-TEIS) hydrologic statistics matrix takes full advantage of historical flows and fisheries data. The analysis indicates that, in the watersheds analyzed, averaging TEIS statistics for the present year and 3 years prior to the sampling date, termed MA(4), is sufficient to develop a natural flow regime. This result suggests that flow regimes based on hydrologic statistics for the period of record can be replaced by regimes developed for sampled fish communities.
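    The two statistical building blocks named above, an MA(4)-style moving average of antecedent conditions and a point-biserial correlation against presence/absence records, can be sketched as follows. The flow values and species records are hypothetical, not TEIS data.

```python
import math
import statistics

def moving_average(series, window):
    """MA(window): mean of the current value and the (window - 1) prior values."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]

def point_biserial(presence, values):
    """Point-biserial correlation between a 0/1 variable (species presence)
    and a continuous one (a flow statistic)."""
    ones = [v for b, v in zip(presence, values) if b == 1]
    zeros = [v for b, v in zip(presence, values) if b == 0]
    p = len(ones) / len(values)
    sd = statistics.pstdev(values)
    return (statistics.mean(ones) - statistics.mean(zeros)) / sd * math.sqrt(p * (1 - p))

annual_stat = [10, 12, 9, 11, 20, 22, 19, 21]   # hypothetical TEIS-style statistic
smoothed = moving_average(annual_stat, 4)        # MA(4): current year + 3 prior years
presence = [0, 0, 0, 0, 1, 1, 1, 1]              # hypothetical species records
r_pb = point_biserial(presence, annual_stat)     # strongly positive for this toy data
```

    In the study, the correlation would be computed between each TEIS statistic (after MA smoothing) and each species' presence across sampling sites.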

  10. Bayes in biological anthropology.

    PubMed

    Konigsberg, Lyle W; Frankenberg, Susan R

    2013-12-01

    In this article, we both contend and illustrate that biological anthropologists, particularly in the Americas, often think like Bayesians but act like frequentists when it comes to analyzing a wide variety of data. In other words, while our research goals and perspectives are rooted in probabilistic thinking and rest on prior knowledge, we often proceed to use statistical hypothesis tests and confidence interval methods unrelated (or tenuously related) to the research questions of interest. We advocate for applying Bayesian analyses to a number of different bioanthropological questions, especially since many of the programming and computational challenges to doing so have been overcome in the past two decades. To facilitate such applications, this article explains Bayesian principles and concepts, and provides concrete examples of Bayesian computer simulations and statistics that address questions relevant to biological anthropology, focusing particularly on bioarchaeology and forensic anthropology. It also simultaneously reviews the use of Bayesian methods and inference within the discipline to date. This article is intended to act as a primer to Bayesian methods and inference in biological anthropology, explaining the relationships of various methods to likelihoods or probabilities and to classical statistical models. Our contention is not that traditional frequentist statistics should be rejected outright, but that there are many situations where biological anthropology is better served by taking a Bayesian approach. To this end it is hoped that the examples provided in this article will assist researchers in choosing from among the broad array of statistical methods currently available. Copyright © 2013 Wiley Periodicals, Inc.
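    As a concrete instance of the Bayesian reasoning advocated here, a conjugate Beta-Binomial posterior for a trait frequency: the counts are hypothetical, and the grid integration is a deliberately simple stand-in for the simulation tools the article discusses.

```python
import math

def beta_posterior_mean(successes, n, a=1.0, b=1.0):
    """Posterior mean of a proportion under a Beta(a, b) prior:
    the posterior is Beta(a + successes, b + failures)."""
    return (a + successes) / (a + b + n)

def beta_posterior_prob_gt(theta0, successes, n, a=1.0, b=1.0, grid=10000):
    """P(theta > theta0 | data) for the Beta(a + s, b + n - s) posterior,
    via simple grid integration (no external statistics library needed)."""
    a2, b2 = a + successes, b + n - successes
    lognorm = math.lgamma(a2) + math.lgamma(b2) - math.lgamma(a2 + b2)
    total = above = 0.0
    for i in range(1, grid):
        t = i / grid
        dens = math.exp((a2 - 1) * math.log(t) + (b2 - 1) * math.log(1 - t) - lognorm)
        total += dens
        if t > theta0:
            above += dens
    return above / total

# Hypothetical trait observed in 18 of 25 skeletons, flat Beta(1, 1) prior
post_mean = beta_posterior_mean(18, 25)             # (1 + 18) / (2 + 25)
p_above_half = beta_posterior_prob_gt(0.5, 18, 25)  # direct probability statement
```

    Note that `p_above_half` answers the research question directly ("how probable is it that the trait frequency exceeds one half?"), which a p-value does not.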

  11. ReliefSeq: A Gene-Wise Adaptive-K Nearest-Neighbor Feature Selection Tool for Finding Gene-Gene Interactions and Main Effects in mRNA-Seq Gene Expression Data

    PubMed Central

    McKinney, Brett A.; White, Bill C.; Grill, Diane E.; Li, Peter W.; Kennedy, Richard B.; Poland, Gregory A.; Oberg, Ann L.

    2013-01-01

    Relief-F is a nonparametric, nearest-neighbor machine learning method that has been successfully used to identify relevant variables that may interact in complex multivariate models to explain phenotypic variation. While several tools have been developed for assessing differential expression in sequence-based transcriptomics, the detection of statistical interactions between transcripts has received less attention in the area of RNA-seq analysis. We describe a new extension and assessment of Relief-F for feature selection in RNA-seq data. The ReliefSeq implementation adapts the number of nearest neighbors (k) for each gene to optimize the Relief-F test statistics (importance scores) for finding both main effects and interactions. We compare this gene-wise adaptive-k (gwak) Relief-F method with standard RNA-seq feature selection tools, such as DESeq and edgeR, and with the popular machine learning method Random Forests. We demonstrate performance on a panel of simulated data that have a range of distributional properties reflected in real mRNA-seq data including multiple transcripts with varying sizes of main effects and interaction effects. For simulated main effects, gwak-Relief-F feature selection performs comparably to standard tools DESeq and edgeR for ranking relevant transcripts. For gene-gene interactions, gwak-Relief-F outperforms all comparison methods at ranking relevant genes in all but the highest fold change/highest signal situations where it performs similarly. The gwak-Relief-F algorithm outperforms Random Forests for detecting relevant genes in all simulation experiments. In addition, Relief-F is comparable to the other methods based on computational time. We also apply ReliefSeq to an RNA-Seq study of smallpox vaccine to identify gene expression changes between vaccinia virus-stimulated and unstimulated samples. 
ReliefSeq is an attractive tool for inclusion in the suite of tools used for analysis of mRNA-Seq data; it has power to detect both main effects and interaction effects. Software Availability: http://insilico.utulsa.edu/ReliefSeq.php. PMID:24339943
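    The core Relief idea, comparing each sample's nearest hit with its nearest miss, can be sketched as follows. This is a simplified Relief with a single neighbor, not the gene-wise adaptive-k ReliefSeq implementation, and the expression values are toy data.

```python
def relief_scores(samples, labels):
    """Simplified Relief (one nearest hit and one nearest miss per sample):
    a feature scores highly if it differs across classes but not within them."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    n_feat = len(samples[0])
    w = [0.0] * n_feat
    for i, s in enumerate(samples):
        hits = [x for j, x in enumerate(samples) if j != i and labels[j] == labels[i]]
        misses = [x for j, x in enumerate(samples) if labels[j] != labels[i]]
        nearest_hit = min(hits, key=lambda x: d2(s, x))
        nearest_miss = min(misses, key=lambda x: d2(s, x))
        for f in range(n_feat):
            w[f] += abs(s[f] - nearest_miss[f]) - abs(s[f] - nearest_hit[f])
    return [x / len(samples) for x in w]

# Toy expression data: feature 0 separates the two groups, feature 1 is noise.
samples = [(0.00, 0.3), (0.10, 0.8), (0.05, 0.5),
           (1.00, 0.6), (0.90, 0.2), (0.95, 0.7)]
labels = [0, 0, 0, 1, 1, 1]
scores = relief_scores(samples, labels)   # scores[0] should dominate scores[1]
```

    Because the neighborhoods are multivariate, a feature can also score highly when it matters only in combination with others, which is what lets Relief-style methods detect interactions that per-gene tests miss.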

  12. The MDI Method as a Generalization of Logit, Probit and Hendry Analyses in Marketing.

    DTIC Science & Technology

    1980-04-01

    model involves nothing more than fitting a normal distribution function (Hanushek and Jackson (1977)). For a given value of x, the probit model...preference shifts within the soft drink category. --For applications of probit models relevant for marketing, see Hausman and Wise (1978) and Hanushek and...Marketing Research" JMR XIV, Feb. (1977). Hanushek, E.A., and J.E. Jackson, Statistical Methods for Social Scientists. Academic Press, New York (1977

  13. Establishing glucose- and ABA-regulated transcription networks in Arabidopsis by microarray analysis and promoter classification using a Relevance Vector Machine.

    PubMed

    Li, Yunhai; Lee, Kee Khoon; Walsh, Sean; Smith, Caroline; Hadingham, Sophie; Sorefan, Karim; Cawley, Gavin; Bevan, Michael W

    2006-03-01

    Establishing transcriptional regulatory networks by analysis of gene expression data and promoter sequences shows great promise. We developed a novel promoter classification method using a Relevance Vector Machine (RVM) and Bayesian statistical principles to identify discriminatory features in the promoter sequences of genes that can correctly classify transcriptional responses. The method was applied to microarray data obtained from Arabidopsis seedlings treated with glucose or abscisic acid (ABA). Of those genes showing >2.5-fold changes in expression level, approximately 70% were correctly predicted as being up- or down-regulated (under 10-fold cross-validation), based on the presence or absence of a small set of discriminative promoter motifs. Many of these motifs have known regulatory functions in sugar- and ABA-mediated gene expression. One promoter motif that was not known to be involved in glucose-responsive gene expression was identified as the strongest classifier of glucose-up-regulated gene expression. We show it confers glucose-responsive gene expression in conjunction with another promoter motif, thus validating the classification method. We were able to establish a detailed model of glucose and ABA transcriptional regulatory networks and their interactions, which will help us to understand the mechanisms linking metabolism with growth in Arabidopsis. This study shows that machine learning strategies coupled to Bayesian statistical methods hold significant promise for identifying functionally significant promoter sequences.

  14. Prediction of biomechanical parameters of the proximal femur using statistical appearance models and support vector regression.

    PubMed

    Fritscher, Karl; Schuler, Benedikt; Link, Thomas; Eckstein, Felix; Suhm, Norbert; Hänni, Markus; Hengg, Clemens; Schubert, Rainer

    2008-01-01

    Fractures of the proximal femur are one of the principal causes of mortality among elderly persons. Traditional approaches to determining femoral fracture risk rely on measuring bone mineral density (BMD). However, BMD alone is not sufficient to predict bone failure load for an individual patient, and additional parameters have to be determined for this purpose. In this work, an approach that uses statistical models of appearance to identify relevant regions and parameters for predicting biomechanical properties of the proximal femur is presented. By using Support Vector Regression, the proposed model-based approach is capable of predicting two different biomechanical parameters accurately and fully automatically in two different testing scenarios.

  15. Memory Effects and Nonequilibrium Correlations in the Dynamics of Open Quantum Systems

    NASA Astrophysics Data System (ADS)

    Morozov, V. G.

    2018-01-01

    We propose a systematic approach to the dynamics of open quantum systems in the framework of Zubarev's nonequilibrium statistical operator method. The approach is based on the relation between ensemble means of the Hubbard operators and the matrix elements of the reduced statistical operator of an open quantum system. This key relation allows deriving master equations for open systems following a scheme conceptually identical to the scheme used to derive kinetic equations for distribution functions. The advantage of the proposed formalism is that some relevant dynamical correlations between an open system and its environment can be taken into account. To illustrate the method, we derive a non-Markovian master equation containing the contribution of nonequilibrium correlations associated with energy conservation.

  16. Metrology Standards for Quantitative Imaging Biomarkers

    PubMed Central

    Obuchowski, Nancy A.; Kessler, Larry G.; Raunig, David L.; Gatsonis, Constantine; Huang, Erich P.; Kondratovich, Marina; McShane, Lisa M.; Reeves, Anthony P.; Barboriak, Daniel P.; Guimaraes, Alexander R.; Wahl, Richard L.

    2015-01-01

    Although investigators in the imaging community have been active in developing and evaluating quantitative imaging biomarkers (QIBs), the development and implementation of QIBs have been hampered by the inconsistent or incorrect use of terminology or methods for technical performance and statistical concepts. Technical performance is an assessment of how a test performs in reference objects or subjects under controlled conditions. In this article, some of the relevant statistical concepts are reviewed, methods that can be used for evaluating and comparing QIBs are described, and some of the technical performance issues related to imaging biomarkers are discussed. More consistent and correct use of terminology and study design principles will improve clinical research, advance regulatory science, and foster better care for patients who undergo imaging studies. © RSNA, 2015 PMID:26267831

  17. Assessment of NDE reliability data

    NASA Technical Reports Server (NTRS)

    Yee, B. G. W.; Couchman, J. C.; Chang, F. H.; Packman, D. F.

    1975-01-01

    Twenty sets of relevant nondestructive test (NDT) reliability data were identified, collected, compiled, and categorized. A criterion for the selection of data for statistical analysis considerations was formulated, and a model to grade the quality and validity of the data sets was developed. Data input formats, which record the pertinent parameters of the defect/specimen and inspection procedures, were formulated for each NDE method. A comprehensive computer program was written and debugged to calculate the probability of flaw detection at several confidence limits by the binomial distribution. This program also selects the desired data sets for pooling and tests the statistical pooling criteria before calculating the composite detection reliability. An example of the calculated reliability of crack detection in bolt holes by an automatic eddy current method is presented.
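The binomial calculation of flaw-detection probability at a confidence limit, as described above, can be sketched as a Clopper-Pearson-style lower confidence bound. This is a generic illustration with made-up function names, not the report's actual program:

```python
import math

def pod_lower_bound(detections, trials, confidence=0.95, tol=1e-9):
    """Lower confidence bound on probability of detection (POD).

    Finds the largest p such that observing `detections` or more hits in
    `trials` inspections is still plausible at the stated confidence, i.e.
    solves P(X >= detections | p) = 1 - confidence for a binomial X.
    """
    if detections == 0:
        return 0.0
    alpha = 1.0 - confidence

    def upper_tail(p):  # P(X >= detections) for X ~ Binomial(trials, p)
        return sum(math.comb(trials, x) * p**x * (1 - p)**(trials - x)
                   for x in range(detections, trials + 1))

    lo, hi = 0.0, 1.0
    while hi - lo > tol:        # bisection: upper_tail(p) increases with p
        mid = (lo + hi) / 2
        if upper_tail(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo
```

For example, 29 detections in 29 trials gives a 95% lower bound of about 0.90, the familiar "90/95" demonstration point in NDE reliability work.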

  18. SPATIO-TEMPORAL MODELING OF AGRICULTURAL YIELD DATA WITH AN APPLICATION TO PRICING CROP INSURANCE CONTRACTS

    PubMed Central

    Ozaki, Vitor A.; Ghosh, Sujit K.; Goodwin, Barry K.; Shirota, Ricardo

    2009-01-01

    This article presents a statistical model of agricultural yield data based on a set of hierarchical Bayesian models that allows joint modeling of temporal and spatial autocorrelation. This method captures a comprehensive range of the various uncertainties involved in predicting crop insurance premium rates as opposed to the more traditional ad hoc, two-stage methods that are typically based on independent estimation and prediction. A panel data set of county-average yield data was analyzed for 290 counties in the State of Paraná (Brazil) for the period of 1990 through 2002. Posterior predictive criteria are used to evaluate different model specifications. This article provides substantial improvements in the statistical and actuarial methods often applied to the calculation of insurance premium rates. These improvements are especially relevant to situations where data are limited. PMID:19890450

  19. Identifying and exploiting trait-relevant tissues with multiple functional annotations in genome-wide association studies

    PubMed Central

    Zhang, Shujun

    2018-01-01

    Genome-wide association studies (GWASs) have identified many disease-associated loci, the majority of which have unknown biological functions. Understanding the mechanism underlying trait associations requires identifying trait-relevant tissues and investigating associations in a trait-specific fashion. Here, we extend the widely used linear mixed model to incorporate multiple SNP functional annotations from omics studies with GWAS summary statistics to facilitate the identification of trait-relevant tissues, with which to further construct powerful association tests. Specifically, we rely on a generalized estimating equation based algorithm for parameter inference, a mixture modeling framework for trait-tissue relevance classification, and a weighted sequence kernel association test constructed based on the identified trait-relevant tissues for powerful association analysis. We refer to our analytic procedure as the Scalable Multiple Annotation integration for trait-Relevant Tissue identification and usage (SMART). With extensive simulations, we show how our method can make use of multiple complementary annotations to improve the accuracy for identifying trait-relevant tissues. In addition, our procedure allows us to make use of the inferred trait-relevant tissues, for the first time, to construct more powerful SNP set tests. We apply our method to an in-depth analysis of 43 traits from 28 GWASs using tissue-specific annotations in 105 tissues derived from ENCODE and Roadmap. Our results reveal new trait-tissue relevance, pinpoint important annotations that are informative of trait-tissue relationship, and illustrate how we can use the inferred trait-relevant tissues to construct more powerful association tests in the Wellcome Trust Case Control Consortium study. PMID:29377896

  20. Big-data reflection high energy electron diffraction analysis for understanding epitaxial film growth processes.

    PubMed

    Vasudevan, Rama K; Tselev, Alexander; Baddorf, Arthur P; Kalinin, Sergei V

    2014-10-28

    Reflection high energy electron diffraction (RHEED) has by now become a standard tool for in situ monitoring of film growth by pulsed laser deposition and molecular beam epitaxy. Yet despite the widespread adoption and wealth of information in RHEED images, most applications are limited to observing intensity oscillations of the specular spot, and much additional information on growth is discarded. With ease of data acquisition and increased computation speeds, statistical methods to rapidly mine the data set are now feasible. Here, we develop such an approach to the analysis of the fundamental growth processes through multivariate statistical analysis of a RHEED image sequence. This approach is illustrated for La(x)Ca(1-x)MnO(3) films grown on etched (001) SrTiO(3) substrates, but is universal. The multivariate methods including principal component analysis and k-means clustering provide insight into the relevant behaviors, the timing and nature of a disordered to ordered growth change, and highlight statistically significant patterns. Fourier analysis yields the harmonic components of the signal and allows separation of the relevant components and baselines, isolating the asymmetric nature of the step density function and the transmission spots from the imperfect layer-by-layer (LBL) growth. These studies show the promise of big data approaches to obtaining more insight into film properties during and after epitaxial film growth. Furthermore, these studies open the pathway to use forward prediction methods to potentially allow significantly more control over the growth process and hence final film quality.
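The multivariate pipeline the authors describe, principal component analysis of the image sequence followed by k-means clustering of the temporal components, can be sketched on a synthetic sequence. This is a generic illustration with made-up patterns and function names, not the authors' code:

```python
import numpy as np

def pca_scores(frames, n_components=2):
    """PCA of an image sequence; frames is a (T, H, W) array.

    Each frame is flattened to a pixel vector; the returned scores give the
    weight of each principal component at each time step.
    """
    X = frames.reshape(frames.shape[0], -1).astype(float)
    X = X - X.mean(axis=0)                      # center each pixel over time
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def kmeans_labels(X, k=2, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = X[:1].copy()
    for _ in range(k - 1):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1).min(axis=1)
        centers = np.vstack([centers, X[d2.argmax()]])
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Synthetic "growth-mode change": the frame pattern switches halfway through,
# mimicking a streaky-to-spotty RHEED transition.
rng = np.random.default_rng(1)
h, w = 8, 8
streaky = np.zeros((h, w)); streaky[::2] = 1.0
spotty = np.zeros((h, w)); spotty[3:5, 3:5] = 1.0
frames = np.stack(
    [streaky + 0.05 * rng.standard_normal((h, w)) for _ in range(10)]
    + [spotty + 0.05 * rng.standard_normal((h, w)) for _ in range(10)])
labels = kmeans_labels(pca_scores(frames))
```

On this toy sequence the clustering recovers the change point exactly: the first ten frames land in one cluster and the last ten in the other.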

  1. Identifying Minefields and Verifying Clearance: Adapting Statistical Methods for UXO Target Detection

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gilbert, Richard O.; O'Brien, Robert F.; Wilson, John E.

    2003-09-01

    It may not be feasible to completely survey large tracts of land suspected of containing minefields. It is desirable to develop a characterization protocol that will confidently identify minefields within these large land tracts if they exist. Naturally, surveying areas of greatest concern and most likely locations would be necessary but will not provide the needed confidence that an unknown minefield has not eluded detection. Once minefields are detected, methods are needed to bound the area that will require detailed mine detection surveys. The US Department of Defense Strategic Environmental Research and Development Program (SERDP) is sponsoring the development of statistical survey methods and tools for detecting potential UXO targets. These methods may be directly applicable to demining efforts. Statistical methods are employed to determine the optimal geophysical survey transect spacing to have confidence of detecting target areas of a critical size, shape, and anomaly density. Other methods under development determine the proportion of a land area that must be surveyed to confidently conclude that there are no UXO present. Adaptive sampling schemes are also being developed as an approach for bounding the target areas. These methods and tools will be presented and the status of relevant research in this area will be discussed.
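The transect-spacing idea can be illustrated with the simplest geometric case: parallel transects and a circular target area. If the target center falls uniformly at random between two adjacent transect lines, the chance that at least one transect crosses the target depends only on the diameter-to-spacing ratio. This sketch deliberately ignores the probability of detecting an anomaly given that a transect crosses the target, and the function names are illustrative:

```python
def traversal_probability(target_diameter, transect_spacing):
    """Probability that at least one parallel transect crosses a circular
    target whose center lies uniformly between two adjacent transects:
    diameter/spacing, capped at 1."""
    return min(1.0, target_diameter / transect_spacing)

def spacing_for_confidence(target_diameter, confidence):
    """Widest transect spacing that still crosses the target with the
    requested probability."""
    return target_diameter / confidence
```

For example, a 50 m target surveyed with 100 m transect spacing is crossed with probability 0.5, while guaranteeing a 90% crossing chance requires a spacing no wider than about 55.6 m.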

  2. On the statistical assessment of classifiers using DNA microarray data

    PubMed Central

    Ancona, N; Maglietta, R; Piepoli, A; D'Addabbo, A; Cotugno, R; Savino, M; Liuni, S; Carella, M; Pesole, G; Perri, F

    2006-01-01

    Background In this paper we present a method for the statistical assessment of cancer predictors which make use of gene expression profiles. The methodology is applied to a new data set of microarray gene expression data collected in Casa Sollievo della Sofferenza Hospital, Foggia – Italy. The data set is made up of normal (22) and tumor (25) specimens extracted from 25 patients affected by colon cancer. We propose to give answers to some questions which are relevant for the automatic diagnosis of cancer such as: Is the size of the available data set sufficient to build accurate classifiers? What is the statistical significance of the associated error rates? In what ways can accuracy be considered dependent on the adopted classification scheme? How many genes are correlated with the pathology and how many are sufficient for an accurate colon cancer classification? The method we propose answers these questions whilst avoiding the potential pitfalls hidden in the analysis and interpretation of microarray data. Results We estimate the generalization error, evaluated through the Leave-K-Out Cross Validation error, for three different classification schemes by varying the number of training examples and the number of the genes used. The statistical significance of the error rate is measured by using a permutation test. We provide a statistical analysis in terms of the frequencies of the genes involved in the classification. Using the whole set of genes, we found that the Weighted Voting Algorithm (WVA) classifier learns the distinction between normal and tumor specimens with 25 training examples, providing e = 21% (p = 0.045) as an error rate. This remains constant even when the number of examples increases. Moreover, Regularized Least Squares (RLS) and Support Vector Machines (SVM) classifiers can learn with only 15 training examples, with an error rate of e = 19% (p = 0.035) and e = 18% (p = 0.037) respectively.
Moreover, the error rate decreases as the training set size increases, reaching its best performances with 35 training examples. In this case, RLS and SVM have error rates of e = 14% (p = 0.027) and e = 11% (p = 0.019). Concerning the number of genes, we found about 6000 genes (p < 0.05) correlated with the pathology, resulting from the signal-to-noise statistic. Moreover, the performances of the RLS and SVM classifiers do not change when 74% of the genes are used; they degrade progressively, reaching e = 16% (p < 0.05) when only 2 genes are employed. The biological relevance of a set of genes determined by our statistical analysis and the major roles they play in colorectal tumorigenesis are discussed. Conclusions The method proposed provides statistically significant answers to precise questions relevant for the diagnosis and prognosis of cancer. We found that, with as few as 15 examples, it is possible to train statistically significant classifiers for colon cancer diagnosis. As for the definition of the number of genes sufficient for a reliable classification of colon cancer, our results suggest that it depends on the accuracy required. PMID:16919171
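The two statistical ingredients above, cross-validated error estimation and a label-permutation test for its significance, can be sketched in miniature. The abstract's classifiers (WVA, RLS, SVM) are replaced here by a toy nearest-centroid rule on one-dimensional data; everything else (leave-one-out error, permutation p-value) follows the same logic:

```python
import random
from statistics import mean

def loo_error(xs, ys, classify):
    """Leave-one-out cross-validation error rate."""
    mistakes = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if classify(train_x, train_y, xs[i]) != ys[i]:
            mistakes += 1
    return mistakes / len(xs)

def nearest_centroid(train_x, train_y, x):
    """Toy 1-D classifier: assign x to the class with the closest mean."""
    centroids = {label: mean(v for v, l in zip(train_x, train_y) if l == label)
                 for label in set(train_y)}
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def permutation_p_value(xs, ys, n_perm=200, seed=0):
    """P-value of the observed LOO error under random label permutations."""
    rng = random.Random(seed)
    observed = loo_error(xs, ys, nearest_centroid)
    at_least_as_good = sum(
        loo_error(xs, rng.sample(ys, len(ys)), nearest_centroid) <= observed
        for _ in range(n_perm))
    return observed, (at_least_as_good + 1) / (n_perm + 1)

# Two well-separated "expression levels": the real labeling is learned with
# zero LOO error, and permuted labelings almost never do as well.
xs = [0.1, 0.2, 0.3, 0.4, 1.1, 1.2, 1.3, 1.4]
ys = ["normal"] * 4 + ["tumor"] * 4
err, p = permutation_p_value(xs, ys)
```

The add-one smoothing in the p-value, (count + 1)/(n_perm + 1), keeps the reported significance conservative, a standard convention for permutation tests.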

  3. Statistical simulation of the magnetorotational dynamo.

    PubMed

    Squire, J; Bhattacharjee, A

    2015-02-27

    Turbulence and dynamo induced by the magnetorotational instability (MRI) are analyzed using quasilinear statistical simulation methods. It is found that homogeneous turbulence is unstable to a large-scale dynamo instability, which saturates to an inhomogeneous equilibrium with a strong dependence on the magnetic Prandtl number (Pm). Despite its enormously reduced nonlinearity, the dependence of the angular momentum transport on Pm in the quasilinear model is qualitatively similar to that of nonlinear MRI turbulence. This demonstrates the importance of the large-scale dynamo and suggests how dramatically simplified models may be used to gain insight into the astrophysically relevant regimes of very low or high Pm.

  4. Size and shape measurement in contemporary cephalometrics.

    PubMed

    McIntyre, Grant T; Mossey, Peter A

    2003-06-01

    The traditional method of analysing cephalograms--conventional cephalometric analysis (CCA)--involves the calculation of linear distance measurements, angular measurements, area measurements, and ratios. Because shape information cannot be determined from these 'size-based' measurements, an increasing number of studies employ geometric morphometric tools in the cephalometric analysis of craniofacial morphology. Most of the discussions surrounding the appropriateness of CCA, Procrustes superimposition, Euclidean distance matrix analysis (EDMA), thin-plate spline analysis (TPS), finite element morphometry (FEM), elliptical Fourier functions (EFF), and medial axis analysis (MAA) have centred upon mathematical and statistical arguments. Surprisingly, little information is available to assist the orthodontist in judging the clinical relevance of each technique. This article evaluates the advantages and limitations of the above methods currently used to analyse the craniofacial morphology on cephalograms and investigates their clinical relevance and possible applications.
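Of the geometric morphometric tools listed, Procrustes superimposition is the easiest to illustrate: landmark configurations are translated, scaled, and rotated onto each other so that the residual difference reflects shape alone. The sketch below is a generic ordinary-Procrustes fit (Kabsch/SVD rotation), not any particular cephalometric package:

```python
import numpy as np

def procrustes_distance(A, B):
    """Shape distance after optimally translating, scaling, and rotating B
    onto A. A and B are (n_landmarks, 2) arrays of corresponding landmarks."""
    A0 = A - A.mean(axis=0)                 # remove translation
    B0 = B - B.mean(axis=0)
    A0 = A0 / np.linalg.norm(A0)            # remove scale (unit centroid size)
    B0 = B0 / np.linalg.norm(B0)
    U, S, Vt = np.linalg.svd(B0.T @ A0)     # optimal rotation via SVD
    R = U @ Vt
    scale = S.sum()                         # optimal scaling of B0 @ R onto A0
    return float(np.linalg.norm(A0 - scale * (B0 @ R)))

# A rotated, scaled, shifted copy of a landmark configuration has
# (numerically) zero Procrustes distance; a genuinely different shape does not.
A = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 2.0], [0.0, 2.0], [0.5, 3.0]])
theta = np.deg2rad(30)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
B = 2.5 * A @ rot + np.array([10.0, -3.0])
C = A.copy()
C[4] = [2.0, 0.5]                           # move one landmark: shape changes
```

This is why Procrustes-based comparisons isolate shape: all the "size-based" differences that CCA measures (position, orientation, scale) are removed before the residual is computed.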

  5. Spike Triggered Covariance in Strongly Correlated Gaussian Stimuli

    PubMed Central

    Aljadeff, Johnatan; Segev, Ronen; Berry, Michael J.; Sharpee, Tatyana O.

    2013-01-01

    Many biological systems perform computations on inputs that have very large dimensionality. Determining the relevant input combinations for a particular computation is often key to understanding its function. A common way to find the relevant input dimensions is to examine the difference in variance between the input distribution and the distribution of inputs associated with certain outputs. In systems neuroscience, the corresponding method is known as spike-triggered covariance (STC). This method has been highly successful in characterizing relevant input dimensions for neurons in a variety of sensory systems. So far, most studies used the STC method with weakly correlated Gaussian inputs. However, it is also important to use this method with inputs that have long range correlations typical of the natural sensory environment. In such cases, the stimulus covariance matrix has one (or more) outstanding eigenvalues that cannot be easily equalized because of sampling variability. Such outstanding modes interfere with analyses of statistical significance of candidate input dimensions that modulate neuronal outputs. In many cases, these modes obscure the significant dimensions. We show that the sensitivity of the STC method in the regime of strongly correlated inputs can be improved by an order of magnitude or more. This can be done by evaluating the significance of dimensions in the subspace orthogonal to the outstanding mode(s). Analyzing the responses of retinal ganglion cells probed with Gaussian noise, we find that taking into account outstanding modes is crucial for recovering relevant input dimensions for these neurons. PMID:24039563
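The core of the STC method, comparing the covariance of spike-triggered stimuli against the covariance of all stimuli and reading relevant dimensions off the eigenvectors of the difference, can be sketched as follows. This is a generic illustration on white-noise stimuli with an invented model cell, not the paper's correlated-stimulus correction:

```python
import numpy as np

def stc_leading_dimension(stimuli, spikes):
    """Eigenvector of the spike-triggered covariance difference with the
    largest-magnitude eigenvalue.

    stimuli: (N, d) array, one stimulus vector per time bin.
    spikes:  (N,) array of spike counts per bin.
    """
    weights = spikes / spikes.sum()
    sta = weights @ stimuli                               # spike-triggered average
    centered = stimuli - sta
    c_spike = (weights[:, None] * centered).T @ centered  # spike-weighted covariance
    c_prior = np.cov(stimuli, rowvar=False, bias=True)    # raw stimulus covariance
    delta = c_spike - c_prior
    evals, evecs = np.linalg.eigh(delta)
    return evecs[:, np.argmax(np.abs(evals))]

# Model cell: it spikes whenever the squared projection onto a hidden filter
# exceeds a threshold, so variance along that filter is inflated among
# spike-triggering stimuli and STC should recover it.
rng = np.random.default_rng(0)
n, d = 20000, 5
stimuli = rng.standard_normal((n, d))
hidden = np.zeros(d)
hidden[0] = 1.0
spikes = ((stimuli @ hidden) ** 2 > 1.0).astype(float)
dim = stc_leading_dimension(stimuli, spikes)
```

With strongly correlated stimuli, the paper's point is precisely that this naive eigendecomposition fails: the outstanding modes of `c_prior` dominate `delta`, and significance must instead be evaluated in the subspace orthogonal to them.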

  6. High throughput single cell counting in droplet-based microfluidics.

    PubMed

    Lu, Heng; Caen, Ouriel; Vrignon, Jeremy; Zonta, Eleonora; El Harrak, Zakaria; Nizard, Philippe; Baret, Jean-Christophe; Taly, Valérie

    2017-05-02

    Droplet-based microfluidics is extensively and increasingly used for high-throughput single-cell studies. However, the accuracy of the cell counting method directly impacts the robustness of such studies. We describe here a simple and precise method to accurately count a large number of adherent and non-adherent human cells as well as bacteria. Our microfluidic hemocytometer provides statistically relevant data on large populations of cells at high throughput, which we use to characterize cell encapsulation and cell viability during incubation in droplets.
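Cell encapsulation in droplets is commonly modeled with Poisson statistics (a standard assumption in the field; the abstract does not spell out its model): if cells load independently at an average of λ cells per droplet, the fraction of droplets containing exactly k cells is e^(-λ) λ^k / k!. A minimal sketch:

```python
import math

def poisson_pmf(k, lam):
    """Expected fraction of droplets containing exactly k cells, assuming
    independent (Poisson) loading at an average of lam cells per droplet."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def empty_fraction(lam):
    return poisson_pmf(0, lam)

def single_cell_fraction(lam):
    return poisson_pmf(1, lam)
```

With λ = 0.1 (a typical dilute loading), about 90.5% of droplets are empty and 9.0% hold exactly one cell; raising λ increases throughput but also the multi-cell (doublet) fraction, which is why the single-cell fraction peaks at λ = 1.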

  7. Resilience Among Students at the Basic Enlisted Submarine School

    DTIC Science & Technology

    2016-12-01

    reported resilience. The Hayes’ Macro in the Statistical Package for the Social Sciences (SPSS) was used to uncover factors relevant to mediation analysis. Findings suggest that the encouragement of...

  8. Burns education for non-burn specialist clinicians in Western Australia.

    PubMed

    McWilliams, Tania; Hendricks, Joyce; Twigg, Di; Wood, Fiona

    2015-03-01

    Burn patients often receive their initial care by non-burn specialist clinicians, with increasingly collaborative burn models of care. The provision of relevant and accessible education for these clinicians is therefore vital for optimal patient care. A two phase design was used. A state-wide survey of multidisciplinary non-burn specialist clinicians throughout Western Australia identified learning needs related to paediatric burn care. A targeted education programme was developed and delivered live via videoconference. Pre-post-test analysis evaluated changes in knowledge as a result of attendance at each education session. Non-burn specialist clinicians identified numerous areas of burn care relevant to their practice. Statistically significant differences between perceived relevance of care and confidence in care provision were reported for aspects of acute burn care. Following attendance at the education sessions, statistically significant increases in knowledge were noted for most areas of acute burn care. Identification of learning needs facilitated the development of a targeted education programme for non-burn specialist clinicians. Increased non-burn specialist clinician knowledge following attendance at most education sessions supports the use of videoconferencing as an acceptable and effective method of delivering burns education in Western Australia. Copyright © 2014 Elsevier Ltd and ISBI. All rights reserved.

  9. Stochastic or statistic? Comparing flow duration curve models in ungauged basins and changing climates

    NASA Astrophysics Data System (ADS)

    Müller, M. F.; Thompson, S. E.

    2015-09-01

    The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by a strong wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are strongly favored over statistical models.
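Two quantities at the heart of this comparison have compact definitions: the empirical flow duration curve (discharge versus exceedance probability) and the Nash-Sutcliffe efficiency used to score predictions. A minimal sketch, assuming Weibull plotting positions for the FDC:

```python
def flow_duration_curve(flows):
    """Empirical FDC as (exceedance probability, discharge) pairs, using
    Weibull plotting positions p = rank / (n + 1)."""
    ordered = sorted(flows, reverse=True)
    n = len(ordered)
    return [((rank + 1) / (n + 1), q) for rank, q in enumerate(ordered)]

def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 means the model is
    no better than predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

The "above 0.80 in 75 % of catchments" statement above is a claim about this efficiency: the models' predicted FDCs leave less than a fifth of the observed variance unexplained in most basins.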

  10. Comparing statistical and process-based flow duration curve models in ungauged basins and changing rain regimes

    NASA Astrophysics Data System (ADS)

    Müller, M. F.; Thompson, S. E.

    2016-02-01

    The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods to predict FDCs. This study compares a stochastic (process-based) and statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75 % of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by frequent wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are favored over statistical models.

  11. A statistical method to estimate low-energy hadronic cross sections

    NASA Astrophysics Data System (ADS)

    Balassa, Gábor; Kovács, Péter; Wolf, György

    2018-02-01

    In this article we propose a model based on the Statistical Bootstrap approach to estimate the cross sections of different hadronic reactions up to a few GeV in c.m.s. energy. The method is based on the idea that, when two particles collide, a so-called fireball is formed, which after a short time period decays statistically into a specific final state. To calculate the probabilities we use a phase space description extended with quark combinatorial factors and the possibility of more than one fireball formation. In a few simple cases the probability of a specific final state can be calculated analytically, and in these cases we show that the model is able to reproduce the ratios of the considered cross sections. We also show that the model is able to describe proton-antiproton annihilation at rest. In the latter case we used a numerical method to calculate the more complicated final state probabilities. Additionally, we examined the formation of strange and charmed mesons as well, where we used existing data to fit the relevant model parameters.

  12. Bootstrapping under constraint for the assessment of group behavior in human contact networks

    NASA Astrophysics Data System (ADS)

    Tremblay, Nicolas; Barrat, Alain; Forest, Cary; Nornberg, Mark; Pinton, Jean-François; Borgnat, Pierre

    2013-11-01

    The increasing availability of time- and space-resolved data describing human activities and interactions gives insights into both static and dynamic properties of human behavior. In practice, nevertheless, real-world data sets can often be considered as only one realization of a particular event. This highlights a key issue in social network analysis: the statistical significance of estimated properties. In this context, we focus here on the assessment of quantitative features of a specific subset of nodes in empirical networks. We present a method of statistical resampling based on bootstrapping groups of nodes under constraints within the empirical network. The method enables us to define acceptance intervals for various null hypotheses concerning relevant properties of the subset of nodes under consideration, in order to characterize, by a statistical test, its behavior as “normal” or not. We apply this method to a high-resolution data set describing the face-to-face proximity of individuals during two colocated scientific conferences. As a case study, we show how to probe whether colocating the two conferences succeeded in bringing together the two corresponding groups of scientists.
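The resampling logic, build a null distribution by repeatedly drawing random groups of the same size from the network and bracket it with an acceptance interval, can be sketched for a scalar per-node property. The paper's constraint machinery (resampling only among nodes with comparable characteristics) is omitted here, and the names are illustrative:

```python
import random
from statistics import mean

def acceptance_interval(node_values, group_size, n_boot=2000,
                        alpha=0.05, seed=0):
    """Bootstrap acceptance interval for the mean property of a random group
    of `group_size` nodes, under the null hypothesis of no group effect."""
    rng = random.Random(seed)
    boot = sorted(mean(rng.sample(node_values, group_size))
                  for _ in range(n_boot))
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    return boot[lo_idx], boot[hi_idx]

def is_group_typical(node_values, group, **kwargs):
    """Statistical test: does the group's mean fall inside the interval?"""
    lo, hi = acceptance_interval(node_values, len(group), **kwargs)
    return lo <= mean(group) <= hi

values = list(range(100))     # toy per-node property (e.g. contact counts)
top_group = values[-10:]      # the 10 most active nodes: mean 94.5
mid_group = values[45:55]     # a group near the overall mean of 49.5
```

Here the highly active group falls outside the 95% acceptance interval and is flagged as atypical, while the middle group is indistinguishable from a random draw.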

  13. A New Approach for the Calculation of Total Corneal Astigmatism Considering the Magnitude and Orientation of Posterior Corneal Astigmatism and Thickness.

    PubMed

    Piñero, David P; Caballero, María T; Nicolás-Albujer, Juan M; de Fez, Dolores; Camps, Vicent J

    2018-06-01

    To evaluate a new method of calculation of total corneal astigmatism based on Gaussian optics and the power design of a spherocylindrical lens (C) in the healthy eye and to compare it with keratometric (K) and power vector (PV) methods. A total of 92 healthy eyes of 92 patients (age, 17-65 years) were enrolled. Corneal astigmatism was calculated in all cases using K, PV, and our new approach C that considers the contribution of corneal thickness. An evaluation of the interchangeability of our new approach with the other 2 methods was performed using Bland-Altman analysis. Statistically significant differences between methods were found in the magnitude of astigmatism (P < 0.001), with the highest values provided by K. These differences in the magnitude of astigmatism were clinically relevant when K and C were compared [limits of agreement (LoA), -0.40 to 0.62 D), but not for the comparison between PV and C (LoA, -0.03 to 0.01 D). Differences in the axis of astigmatism between methods did not reach statistical significance (P = 0.408). However, they were clinically relevant when comparing K and C (LoA, -5.48 to 15.68 degrees) but not for the comparison between PV and C (LoA, -1.68 to 1.42 degrees). The use of our new approach for the calculation of total corneal astigmatism provides astigmatic results comparable to the PV method, which suggests that the effect of pachymetry on total corneal astigmatism is minimal in healthy eyes.
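The Bland-Altman limits of agreement (LoA) used above have a compact definition: the mean difference between the two methods plus or minus 1.96 standard deviations of the differences. A minimal sketch with illustrative names:

```python
import math

def limits_of_agreement(method_a, method_b):
    """Bland-Altman analysis: (lower LoA, bias, upper LoA) of a - b."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias - 1.96 * sd, bias, bias + 1.96 * sd
```

If the limits are narrower than the clinically tolerable difference, the two measurements can be used interchangeably; that is the criterion applied above to the K, PV, and C astigmatism calculations.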

  14. A test of safety, violence prevention, and civility climate domain-specific relationships with relevant workplace hazards

    PubMed Central

    Spector, Paul E.

    2016-01-01

    Background Safety climate, violence prevention climate, and civility climate were independently developed and linked to domain-specific workplace hazards, although all three were designed to promote the physical and psychological safety of workers. Purpose To test domain specificity between conceptually related workplace climates and relevant workplace hazards. Methods Data were collected from 368 persons employed in various industries and descriptive statistics were calculated for all study variables. Correlational and relative weights analyses were used to test for domain specificity. Results The three climate domains were similarly predictive of most workplace hazards, regardless of domain specificity. Discussion This study suggests that the three climate domains share a common higher order construct that may predict relevant workplace hazards better than any of the scales alone. PMID:27110930

  15. Quantification of heterogeneity observed in medical images.

    PubMed

    Brooks, Frank J; Grigsby, Perry W

    2013-03-02

    There has been much recent interest in the quantification of visually evident heterogeneity within functional grayscale medical images, such as those obtained via magnetic resonance or positron emission tomography. In the case of images of cancerous tumors, variations in grayscale intensity imply variations in crucial tumor biology. Despite these considerable clinical implications, there is as yet no standardized method for measuring the heterogeneity observed via these imaging modalities. In this work, we motivate and derive a statistical measure of image heterogeneity. This statistic measures the distance-dependent average deviation from the smoothest intensity gradation feasible. We show how this statistic may be used to automatically rank images of in vivo human tumors in order of increasing heterogeneity. We test this method against the current practice of ranking images via expert visual inspection. We find that this statistic provides a means of heterogeneity quantification beyond that given by other statistics traditionally used for the same purpose. We demonstrate the effect of tumor shape upon our ranking method and find the method applicable to a wide variety of clinically relevant tumor images. We find that the automated heterogeneity rankings agree very closely with those performed visually by experts. These results indicate that our automated method may be used reliably to rank, in order of increasing heterogeneity, tumor images whether or not object shape is considered to contribute to that heterogeneity. Automated heterogeneity ranking yields objective results which are more consistent than visual rankings. Reducing variability in image interpretation will enable more researchers to better study potential clinical implications of observed tumor heterogeneity.

  16. Developing Topic-Specific Search Filters for PubMed with Click-Through Data

    PubMed Central

    Li, Jiao; Lu, Zhiyong

    2013-01-01

    Summary Objectives Search filters have been developed and demonstrated for better information access to the immense and ever-growing body of publications in the biomedical domain. However, to date the number of filters remains quite limited because the current filter development methods require significant human efforts in manual document review and filter term selection. In this regard, we aim to investigate automatic methods for generating search filters. Methods We present an automated method to develop topic-specific filters on the basis of users’ search logs in PubMed. Specifically, for a given topic, we first detect its relevant user queries and then include their corresponding clicked articles to serve as the topic-relevant document set accordingly. Next, we statistically identify informative terms that best represent the topic-relevant document set using a background set composed of topic irrelevant articles. Lastly, the selected representative terms are combined with Boolean operators and evaluated on benchmark datasets to derive the final filter with the best performance. Results We applied our method to develop filters for four clinical topics: nephrology, diabetes, pregnancy, and depression. For the nephrology filter, our method obtained performance comparable to the state of the art (sensitivity of 91.3%, specificity of 98.7%, precision of 94.6%, and accuracy of 97.2%). Similarly, high-performing results (over 90% in all measures) were obtained for the other three search filters. Conclusion Based on PubMed click-through data, we successfully developed a high-performance method for generating topic-specific search filters that is significantly more efficient than existing manual methods. All data sets (topic-relevant and irrelevant document sets) used in this study and a demonstration system are publicly available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/downloads/CQ_filter/ PMID:23666447
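The statistical term-selection step, ranking terms by how sharply they separate the topic-relevant document set from the background set, can be sketched with a smoothed log-odds-ratio score on document frequencies. This is a generic stand-in with illustrative names and toy documents, not the authors' exact statistic:

```python
import math
from collections import Counter

def informative_terms(relevant_docs, background_docs, top_k=3):
    """Rank terms by smoothed log odds of occurring in a relevant vs. a
    background document (document frequency: one count per document)."""
    rel_df = Counter(t for doc in relevant_docs for t in set(doc.lower().split()))
    bg_df = Counter(t for doc in background_docs for t in set(doc.lower().split()))
    n_rel, n_bg = len(relevant_docs), len(background_docs)

    def log_odds(term):
        rel_odds = (rel_df[term] + 0.5) / (n_rel - rel_df[term] + 0.5)
        bg_odds = (bg_df[term] + 0.5) / (n_bg - bg_df[term] + 0.5)
        return math.log(rel_odds / bg_odds)

    vocabulary = set(rel_df) | set(bg_df)
    return sorted(vocabulary, key=log_odds, reverse=True)[:top_k]

relevant = ["chronic kidney disease and dialysis outcomes",
            "dialysis adequacy in kidney failure",
            "kidney transplant rejection markers"]
background = ["asthma treatment in children",
              "migraine prevention trial",
              "hip fracture repair outcomes"]
```

In the full pipeline described above, the top-scoring terms would then be combined with Boolean operators and the candidate filters evaluated on benchmark sets.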

  17. Networking—a statistical physics perspective

    NASA Astrophysics Data System (ADS)

    Yeung, Chi Ho; Saad, David

    2013-03-01

    Networking encompasses a variety of tasks related to the communication of information on networks; it has a substantial economic and societal impact on a broad range of areas including transportation systems, wired and wireless communications and a range of Internet applications. As transportation and communication networks become increasingly more complex, the ever increasing demand for congestion control, higher traffic capacity, quality of service, robustness and reduced energy consumption requires new tools and methods to meet these conflicting requirements. The new methodology should serve for gaining better understanding of the properties of networking systems at the macroscopic level, as well as for the development of new principled optimization and management algorithms at the microscopic level. Methods of statistical physics seem best placed to provide new approaches as they have been developed specifically to deal with nonlinear large-scale systems. This review aims at presenting an overview of tools and methods that have been developed within the statistical physics community and that can be readily applied to address the emerging problems in networking. These include diffusion processes, methods from disordered systems and polymer physics, probabilistic inference, which have direct relevance to network routing, file and frequency distribution, the exploration of network structures and vulnerability, and various other practical networking applications.

  18. Water Quality Sensing and Spatio-Temporal Monitoring Structure with Autocorrelation Kernel Methods.

    PubMed

    Vizcaíno, Iván P; Carrera, Enrique V; Muñoz-Romero, Sergio; Cumbal, Luis H; Rojo-Álvarez, José Luis

    2017-10-16

Pollution on water resources is usually analyzed with monitoring campaigns, which consist of programmed sampling, measurement, and recording of the most representative water quality parameters. These campaign measurements yield a non-uniform spatio-temporal sampled data structure to characterize complex dynamics phenomena. In this work, we propose an enhanced statistical interpolation method to provide water quality managers with statistically interpolated representations of spatial-temporal dynamics. Specifically, our proposal makes efficient use of the a priori available information of the quality parameter measurements through Support Vector Regression (SVR) based on Mercer's kernels. The methods are benchmarked against previously proposed methods in three segments of the Machángara River and one segment of the San Pedro River in Ecuador, and their different dynamics are shown by statistically interpolated spatial-temporal maps. The best interpolation performance in terms of mean absolute error was obtained by the SVR with Mercer's kernel given by either the Mahalanobis spatial-temporal covariance matrix or by the bivariate estimated autocorrelation function. In particular, the autocorrelation kernel provides a significant improvement in the estimation quality, consistently for all six water quality variables, which points out the relevance of including a priori knowledge of the problem.
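A minimal numerical sketch of the kernel-interpolation idea described above. It uses kernel ridge regression as a simpler stand-in for the paper's SVR (the same Mercer-kernel machinery, with a closed-form solution), and an anisotropic Gaussian kernel as a stand-in for the autocorrelation kernel; all data, length scales, and parameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical non-uniform spatio-temporal samples: (position km, day) -> quality variable
X = rng.uniform([0.0, 0.0], [5.0, 30.0], size=(80, 2))
y = 8.0 - 0.3 * X[:, 0] + 0.5 * np.sin(2 * np.pi * X[:, 1] / 30.0) \
    + rng.normal(0.0, 0.1, 80)

def rbf_kernel(A, B, length=(2.0, 10.0)):
    """Anisotropic Gaussian (Mercer) kernel over space and time."""
    d = (A[:, None, :] - B[None, :, :]) / np.asarray(length)
    return np.exp(-0.5 * np.sum(d * d, axis=-1))

# Kernel ridge regression: alpha = (K + lam*I)^-1 y
K = rbf_kernel(X, X)
alpha = np.linalg.solve(K + 1e-3 * np.eye(len(X)), y)

# Statistically interpolated spatio-temporal map on a dense grid
xs, ts = np.meshgrid(np.linspace(0, 5, 25), np.linspace(0, 30, 25))
grid = np.column_stack([xs.ravel(), ts.ravel()])
surface = (rbf_kernel(grid, X) @ alpha).reshape(xs.shape)

mae = np.mean(np.abs(K @ alpha - y))  # in-sample mean absolute error
```

Swapping in a kernel built from the estimated autocorrelation function, as the paper does, only changes `rbf_kernel`; the interpolation step is unchanged.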

  19. Water Quality Sensing and Spatio-Temporal Monitoring Structure with Autocorrelation Kernel Methods

    PubMed Central

    Vizcaíno, Iván P.; Muñoz-Romero, Sergio; Cumbal, Luis H.

    2017-01-01

Pollution on water resources is usually analyzed with monitoring campaigns, which consist of programmed sampling, measurement, and recording of the most representative water quality parameters. These campaign measurements yield a non-uniform spatio-temporal sampled data structure to characterize complex dynamics phenomena. In this work, we propose an enhanced statistical interpolation method to provide water quality managers with statistically interpolated representations of spatial-temporal dynamics. Specifically, our proposal makes efficient use of the a priori available information of the quality parameter measurements through Support Vector Regression (SVR) based on Mercer’s kernels. The methods are benchmarked against previously proposed methods in three segments of the Machángara River and one segment of the San Pedro River in Ecuador, and their different dynamics are shown by statistically interpolated spatial-temporal maps. The best interpolation performance in terms of mean absolute error was obtained by the SVR with Mercer’s kernel given by either the Mahalanobis spatial-temporal covariance matrix or by the bivariate estimated autocorrelation function. In particular, the autocorrelation kernel provides a significant improvement in the estimation quality, consistently for all six water quality variables, which points out the relevance of including a priori knowledge of the problem. PMID:29035333

  20. Precipitation intensity probability distribution modelling for hydrological and construction design purposes

    NASA Astrophysics Data System (ADS)

    Koshinchanov, Georgy; Dimitrov, Dobri

    2008-11-01

The characteristics of rainfall intensity are important for many purposes, including the design of sewage and drainage systems, tuning flood warning procedures, etc. Those estimates are usually statistical estimates of the intensity of precipitation realized for a certain period of time (e.g. 5, 10 min, etc.) with a given return period (e.g. 20, 100 years, etc.). The traditional approach to evaluating the mentioned precipitation intensities is to process the pluviometer's records and fit a probability distribution to samples of intensities valid for certain locations or regions. Those estimates further become part of the state regulations to be used for various economic activities. Two problems occur using the mentioned approach: 1. Due to various factors the climate conditions have changed and the precipitation intensity estimates need regular updates; 2. As far as the extremes of the probability distribution are of particular importance for the practice, the methodology of the distribution fitting needs specific attention to those parts of the distribution. The aim of this paper is to review the existing methodologies for processing intensive rainfalls and to refresh some of the statistical estimates for the studied areas. The methodologies used in Bulgaria for analyzing intensive rainfalls and producing relevant statistical estimates are: the method of the maximum intensity, used in the National Institute of Meteorology and Hydrology to process and decode the pluviometer's records, followed by distribution fitting for each precipitation duration period; the same as above, but with separate modelling of the probability distribution for the middle and high probability quantiles; a method similar to the first one, but with a threshold of 0.36 mm/min of intensity; another method, proposed by the Russian hydrologist G. A. Aleksiev for regionalization of estimates over a territory, improved and adapted by S. Gerasimov for Bulgaria; and, finally, a method considering only the intensive rainfalls (if any) during the day with the maximal annual daily precipitation total for a given year. Conclusions are drawn on the relevance and adequacy of the applied methods.
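A small sketch of the return-period estimation underlying this kind of analysis, not any of the specific Bulgarian methodologies named above: a method-of-moments Gumbel fit to a hypothetical series of annual-maximum 10-minute intensities, from which 20- and 100-year design intensities are read off.

```python
import numpy as np

# Hypothetical annual-maximum 10-min rainfall intensities (mm/min)
intens = np.array([0.62, 0.48, 0.95, 0.71, 0.55, 1.10, 0.66, 0.83,
                   0.59, 0.74, 0.91, 0.52, 1.25, 0.68, 0.79, 0.61])

# Method-of-moments Gumbel fit (a common choice for annual maxima)
euler_gamma = 0.5772156649
beta = intens.std(ddof=1) * np.sqrt(6.0) / np.pi   # scale
mu = intens.mean() - euler_gamma * beta            # location

def return_level(T):
    """Intensity exceeded on average once every T years (Gumbel quantile)."""
    return mu - beta * np.log(-np.log(1.0 - 1.0 / T))

for T in (20, 100):
    print(f"{T:>3}-year 10-min intensity: {return_level(T):.2f} mm/min")
```

The paper's second methodology (separate modelling of middle and high quantiles) addresses exactly the weakness of such a single global fit: the tail quantiles drive design decisions but are estimated from very few observations.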

  1. Estimation of gene induction enables a relevance-based ranking of gene sets.

    PubMed

    Bartholomé, Kilian; Kreutz, Clemens; Timmer, Jens

    2009-07-01

In order to handle and interpret the vast amounts of data produced by microarray experiments, the analysis of sets of genes with a common biological functionality has been shown to be advantageous compared to single gene analyses. Some statistical methods have been proposed to analyse the differential gene expression of gene sets in microarray experiments. However, most of these methods either require threshold values to be chosen for the analysis, or they need some reference set for the determination of significance. We present a method that estimates the number of differentially expressed genes in a gene set without requiring a threshold value for the significance of genes. The method is self-contained (i.e., it does not require a reference set for comparison). In contrast to other methods which are focused on significance, our approach emphasizes the relevance of the regulation of gene sets. The presented method measures the degree of regulation of a gene set and is a useful tool to compare the induction of different gene sets and place the results of microarray experiments into the biological context. An R-package is available.

  2. Selection and Reporting of Statistical Methods to Assess Reliability of a Diagnostic Test: Conformity to Recommended Methods in a Peer-Reviewed Journal

    PubMed Central

    Park, Ji Eun; Han, Kyunghwa; Sung, Yu Sub; Chung, Mi Sun; Koo, Hyun Jung; Yoon, Hee Mang; Choi, Young Jun; Lee, Seung Soo; Kim, Kyung Won; Shin, Youngbin; An, Suah; Cho, Hyo-Min

    2017-01-01

Objective To evaluate the frequency and adequacy of statistical analyses in a general radiology journal when reporting a reliability analysis for a diagnostic test. Materials and Methods Sixty-three studies of diagnostic test accuracy (DTA) and 36 studies reporting reliability analyses published in the Korean Journal of Radiology between 2012 and 2016 were analyzed. Studies were judged using the methodological guidelines of the Radiological Society of North America-Quantitative Imaging Biomarkers Alliance (RSNA-QIBA), and COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) initiative. DTA studies were evaluated by nine editorial board members of the journal. Reliability studies were evaluated by study reviewers experienced with reliability analysis. Results Thirty-one (49.2%) of the 63 DTA studies did not include a reliability analysis when deemed necessary. Among the 36 reliability studies, proper statistical methods were used in all (5/5) studies dealing with dichotomous/nominal data, 46.7% (7/15) of studies dealing with ordinal data, and 95.2% (20/21) of studies dealing with continuous data. Statistical methods were described in sufficient detail regarding weighted kappa in 28.6% (2/7) of studies and regarding the model and assumptions of the intraclass correlation coefficient in 35.3% (6/17) and 29.4% (5/17) of studies, respectively. Reliability parameters were used as if they were agreement parameters in 23.1% (3/13) of studies. Reproducibility and repeatability were used incorrectly in 20% (3/15) of studies. Conclusion Greater attention to the importance of reporting reliability, thorough description of the related statistical methods, efforts not to neglect agreement parameters, and better use of relevant terminology are necessary. PMID:29089821
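One of the statistics the record flags as under-described is weighted kappa for ordinal ratings. A self-contained sketch of quadratic-weighted kappa for two raters, with hypothetical ratings; the weighting scheme (quadratic vs linear) is exactly the detail the guidelines ask authors to report.

```python
import numpy as np

def weighted_kappa(r1, r2, k, weights="quadratic"):
    """Weighted kappa for two raters' ordinal scores in {0..k-1}."""
    O = np.zeros((k, k))
    for a, b in zip(r1, r2):
        O[a, b] += 1
    O /= O.sum()                               # observed agreement table
    E = np.outer(O.sum(axis=1), O.sum(axis=0))  # chance-expected table
    i, j = np.indices((k, k))
    if weights == "quadratic":
        W = ((i - j) / (k - 1)) ** 2            # disagreement weights
    else:                                       # linear weights
        W = np.abs(i - j) / (k - 1)
    return 1.0 - (W * O).sum() / (W * E).sum()

# Hypothetical: two radiologists grading 10 lesions on a 4-point ordinal scale
reader1 = [0, 1, 1, 2, 2, 3, 3, 2, 1, 0]
reader2 = [0, 1, 2, 2, 2, 3, 2, 2, 1, 1]
print(f"quadratic weighted kappa: {weighted_kappa(reader1, reader2, 4):.3f}")
```

Perfect agreement gives kappa = 1 (all observed mass sits on the zero-weight diagonal), and quadratic weights penalize large ordinal disagreements more than adjacent-category ones.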

  3. Guidelines for Genome-Scale Analysis of Biological Rhythms.

    PubMed

    Hughes, Michael E; Abruzzi, Katherine C; Allada, Ravi; Anafi, Ron; Arpat, Alaaddin Bulak; Asher, Gad; Baldi, Pierre; de Bekker, Charissa; Bell-Pedersen, Deborah; Blau, Justin; Brown, Steve; Ceriani, M Fernanda; Chen, Zheng; Chiu, Joanna C; Cox, Juergen; Crowell, Alexander M; DeBruyne, Jason P; Dijk, Derk-Jan; DiTacchio, Luciano; Doyle, Francis J; Duffield, Giles E; Dunlap, Jay C; Eckel-Mahan, Kristin; Esser, Karyn A; FitzGerald, Garret A; Forger, Daniel B; Francey, Lauren J; Fu, Ying-Hui; Gachon, Frédéric; Gatfield, David; de Goede, Paul; Golden, Susan S; Green, Carla; Harer, John; Harmer, Stacey; Haspel, Jeff; Hastings, Michael H; Herzel, Hanspeter; Herzog, Erik D; Hoffmann, Christy; Hong, Christian; Hughey, Jacob J; Hurley, Jennifer M; de la Iglesia, Horacio O; Johnson, Carl; Kay, Steve A; Koike, Nobuya; Kornacker, Karl; Kramer, Achim; Lamia, Katja; Leise, Tanya; Lewis, Scott A; Li, Jiajia; Li, Xiaodong; Liu, Andrew C; Loros, Jennifer J; Martino, Tami A; Menet, Jerome S; Merrow, Martha; Millar, Andrew J; Mockler, Todd; Naef, Felix; Nagoshi, Emi; Nitabach, Michael N; Olmedo, Maria; Nusinow, Dmitri A; Ptáček, Louis J; Rand, David; Reddy, Akhilesh B; Robles, Maria S; Roenneberg, Till; Rosbash, Michael; Ruben, Marc D; Rund, Samuel S C; Sancar, Aziz; Sassone-Corsi, Paolo; Sehgal, Amita; Sherrill-Mix, Scott; Skene, Debra J; Storch, Kai-Florian; Takahashi, Joseph S; Ueda, Hiroki R; Wang, Han; Weitz, Charles; Westermark, Pål O; Wijnen, Herman; Xu, Ying; Wu, Gang; Yoo, Seung-Hee; Young, Michael; Zhang, Eric Erquan; Zielinski, Tomasz; Hogenesch, John B

    2017-10-01

    Genome biology approaches have made enormous contributions to our understanding of biological rhythms, particularly in identifying outputs of the clock, including RNAs, proteins, and metabolites, whose abundance oscillates throughout the day. These methods hold significant promise for future discovery, particularly when combined with computational modeling. However, genome-scale experiments are costly and laborious, yielding "big data" that are conceptually and statistically difficult to analyze. There is no obvious consensus regarding design or analysis. Here we discuss the relevant technical considerations to generate reproducible, statistically sound, and broadly useful genome-scale data. Rather than suggest a set of rigid rules, we aim to codify principles by which investigators, reviewers, and readers of the primary literature can evaluate the suitability of different experimental designs for measuring different aspects of biological rhythms. We introduce CircaInSilico, a web-based application for generating synthetic genome biology data to benchmark statistical methods for studying biological rhythms. Finally, we discuss several unmet analytical needs, including applications to clinical medicine, and suggest productive avenues to address them.

  4. Guidelines for Genome-Scale Analysis of Biological Rhythms

    PubMed Central

    Hughes, Michael E.; Abruzzi, Katherine C.; Allada, Ravi; Anafi, Ron; Arpat, Alaaddin Bulak; Asher, Gad; Baldi, Pierre; de Bekker, Charissa; Bell-Pedersen, Deborah; Blau, Justin; Brown, Steve; Ceriani, M. Fernanda; Chen, Zheng; Chiu, Joanna C.; Cox, Juergen; Crowell, Alexander M.; DeBruyne, Jason P.; Dijk, Derk-Jan; DiTacchio, Luciano; Doyle, Francis J.; Duffield, Giles E.; Dunlap, Jay C.; Eckel-Mahan, Kristin; Esser, Karyn A.; FitzGerald, Garret A.; Forger, Daniel B.; Francey, Lauren J.; Fu, Ying-Hui; Gachon, Frédéric; Gatfield, David; de Goede, Paul; Golden, Susan S.; Green, Carla; Harer, John; Harmer, Stacey; Haspel, Jeff; Hastings, Michael H.; Herzel, Hanspeter; Herzog, Erik D.; Hoffmann, Christy; Hong, Christian; Hughey, Jacob J.; Hurley, Jennifer M.; de la Iglesia, Horacio O.; Johnson, Carl; Kay, Steve A.; Koike, Nobuya; Kornacker, Karl; Kramer, Achim; Lamia, Katja; Leise, Tanya; Lewis, Scott A.; Li, Jiajia; Li, Xiaodong; Liu, Andrew C.; Loros, Jennifer J.; Martino, Tami A.; Menet, Jerome S.; Merrow, Martha; Millar, Andrew J.; Mockler, Todd; Naef, Felix; Nagoshi, Emi; Nitabach, Michael N.; Olmedo, Maria; Nusinow, Dmitri A.; Ptáček, Louis J.; Rand, David; Reddy, Akhilesh B.; Robles, Maria S.; Roenneberg, Till; Rosbash, Michael; Ruben, Marc D.; Rund, Samuel S.C.; Sancar, Aziz; Sassone-Corsi, Paolo; Sehgal, Amita; Sherrill-Mix, Scott; Skene, Debra J.; Storch, Kai-Florian; Takahashi, Joseph S.; Ueda, Hiroki R.; Wang, Han; Weitz, Charles; Westermark, Pål O.; Wijnen, Herman; Xu, Ying; Wu, Gang; Yoo, Seung-Hee; Young, Michael; Zhang, Eric Erquan; Zielinski, Tomasz; Hogenesch, John B.

    2017-01-01

    Genome biology approaches have made enormous contributions to our understanding of biological rhythms, particularly in identifying outputs of the clock, including RNAs, proteins, and metabolites, whose abundance oscillates throughout the day. These methods hold significant promise for future discovery, particularly when combined with computational modeling. However, genome-scale experiments are costly and laborious, yielding “big data” that are conceptually and statistically difficult to analyze. There is no obvious consensus regarding design or analysis. Here we discuss the relevant technical considerations to generate reproducible, statistically sound, and broadly useful genome-scale data. Rather than suggest a set of rigid rules, we aim to codify principles by which investigators, reviewers, and readers of the primary literature can evaluate the suitability of different experimental designs for measuring different aspects of biological rhythms. We introduce CircaInSilico, a web-based application for generating synthetic genome biology data to benchmark statistical methods for studying biological rhythms. Finally, we discuss several unmet analytical needs, including applications to clinical medicine, and suggest productive avenues to address them. PMID:29098954

  5. Frequent statistics of link-layer bit stream data based on AC-IM algorithm

    NASA Astrophysics Data System (ADS)

    Cao, Chenghong; Lei, Yingke; Xu, Yiming

    2017-08-01

At present, there are many relevant studies on data processing using classical pattern matching and its improved algorithms, but few on frequency statistics over link-layer bit stream data. This paper adopts a frequent statistical method for link-layer bit stream data based on the AC-IM algorithm, because classical multi-pattern matching algorithms such as the AC algorithm have high computational complexity and low efficiency, and cannot be applied directly to binary bit stream data. The method's maximum jump distance in the pattern tree is the length of the shortest pattern string plus 3, without missing any matches. In this paper, theoretical analysis is first made of the principle of the algorithm's construction, and then the experimental results show that the algorithm can adapt to the binary bit stream data environment and extract frequent sequences more accurately; the effect is obvious. Meanwhile, compared with the classical AC algorithm and other improved algorithms, the AC-IM algorithm has a greater maximum jump distance and is less time-consuming.
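The details of AC-IM's enlarged jump distance are not given in the abstract, so the sketch below shows only the baseline it improves on: a classical Aho-Corasick (AC) automaton restricted to the binary alphabet, counting every occurrence of each bit pattern in a single pass over the stream.

```python
from collections import deque

def build_ac(patterns):
    """Build an Aho-Corasick automaton over the binary alphabet {'0', '1'}."""
    goto, fail, out = [{}], [0], [set()]
    for p in patterns:                       # build the pattern trie
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({}); fail.append(0); out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)
    q = deque(goto[0].values())              # depth-1 nodes fail to the root
    while q:                                 # BFS to set failure links
        s = q.popleft()
        for ch, t in goto[s].items():
            q.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] |= out[fail[t]]           # inherit matches of the fallback
    return goto, fail, out

def count_matches(bits, patterns):
    """Count occurrences of each bit pattern in one pass over the bit stream."""
    goto, fail, out = build_ac(patterns)
    counts, s = {p: 0 for p in patterns}, 0
    for ch in bits:
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            counts[p] += 1
    return counts

print(count_matches("1011011", ["101", "11"]))  # {'101': 2, '11': 2}
```

Classical AC advances one symbol at a time; improved variants such as AC-IM gain speed by skipping ahead (here, reportedly up to the shortest pattern length plus 3) whenever the automaton state proves no match can start in the skipped region.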

  6. [The principal components analysis--method to classify the statistical variables with applications in medicine].

    PubMed

    Dascălu, Cristina Gena; Antohe, Magda Ecaterina

    2009-01-01

Based on eigenvalue and eigenvector analysis, principal component analysis has the purpose of identifying the subspace of the main components from a set of parameters, which are enough to characterize the whole set of parameters. Interpreting the data for analysis as a cloud of points, we find through geometrical transformations the directions where the cloud's dispersion is maximal--the lines that pass through the cloud's center of weight and have a maximal density of points around them (by defining an appropriate criterion function and minimizing it). This method can be successfully used to simplify the statistical analysis of questionnaires--because it helps us to select from a set of items only the most relevant ones, which cover the variations of the whole set of data. For instance, in the presented sample we started from a questionnaire with 28 items and, applying principal component analysis, we identified 7 principal components--or main items--a fact that simplifies significantly the further statistical analysis of the data.
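A minimal sketch of the eigenvalue/eigenvector route to principal components described above, on hypothetical questionnaire data (100 respondents, 28 items generated from a few latent factors); the 90% variance cutoff is an illustrative choice, not the paper's criterion.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical questionnaire: 100 respondents x 28 items, where items are
# noisy mixtures of 4 latent factors
latent = rng.normal(size=(100, 4))
loadings = rng.normal(size=(4, 28))
X = latent @ loadings + 0.3 * rng.normal(size=(100, 28))

# Principal components from eigenvalues/eigenvectors of the covariance matrix
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]               # sort descending by variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = np.cumsum(eigvals) / eigvals.sum()
k = int(np.searchsorted(explained, 0.90) + 1)   # components for 90% variance
scores = Xc @ eigvecs[:, :k]                    # reduced representation
print(f"{k} components explain {explained[k-1]:.1%} of the variance")
```

With only a handful of true latent factors, far fewer than 28 components carry almost all of the variance, which is the item-reduction effect the abstract reports.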

  7. Spatial statistical analysis of tree deaths using airborne digital imagery

    NASA Astrophysics Data System (ADS)

    Chang, Ya-Mei; Baddeley, Adrian; Wallace, Jeremy; Canci, Michael

    2013-04-01

    High resolution digital airborne imagery offers unprecedented opportunities for observation and monitoring of vegetation, providing the potential to identify, locate and track individual vegetation objects over time. Analytical tools are required to quantify relevant information. In this paper, locations of trees over a large area of native woodland vegetation were identified using morphological image analysis techniques. Methods of spatial point process statistics were then applied to estimate the spatially-varying tree death risk, and to show that it is significantly non-uniform. [Tree deaths over the area were detected in our previous work (Wallace et al., 2008).] The study area is a major source of ground water for the city of Perth, and the work was motivated by the need to understand and quantify vegetation changes in the context of water extraction and drying climate. The influence of hydrological variables on tree death risk was investigated using spatial statistics (graphical exploratory methods, spatial point pattern modelling and diagnostics).
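A simplified numerical sketch of the core estimate, not the authors' full point-process modelling: spatially varying death risk approximated as the ratio of a kernel-smoothed intensity of dead-tree locations to that of all tree locations, on a hypothetical plot where risk rises toward the east.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical tree locations in a unit plot; death probability increases
# toward the east (x -> 1), giving a non-uniform risk surface to recover
trees = rng.uniform(size=(500, 2))
dead = rng.random(500) < 0.1 + 0.4 * trees[:, 0]

def kernel_intensity(points, grid, h=0.1):
    """Gaussian kernel estimate of point-pattern intensity on a grid."""
    d2 = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * h * h)).sum(axis=1) / (2 * np.pi * h * h)

gx, gy = np.meshgrid(np.linspace(0, 1, 20), np.linspace(0, 1, 20))
grid = np.column_stack([gx.ravel(), gy.ravel()])

# Spatially varying risk: dead-tree intensity relative to all-tree intensity
risk = kernel_intensity(trees[dead], grid) / kernel_intensity(trees, grid)
west, east = risk[grid[:, 0] < 0.3].mean(), risk[grid[:, 0] > 0.7].mean()
print(f"mean estimated risk, west: {west:.2f}  east: {east:.2f}")
```

Testing whether such a surface is significantly non-uniform is where the formal spatial point pattern modelling and diagnostics cited in the abstract come in.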

  8. [Evaluation of using statistical methods in selected national medical journals].

    PubMed

    Sych, Z

    1996-01-01

The paper presents an evaluation of the frequency with which statistical methods were applied in works published in six selected national medical journals in the years 1988-1992. For the analysis the following journals were chosen: Klinika Oczna, Medycyna Pracy, Pediatria Polska, Polski Tygodnik Lekarski, Roczniki Państwowego Zakładu Higieny, and Zdrowie Publiczne. An appropriate number of works, up to the average in the remaining medical journals, was randomly selected from the respective volumes of Pol. Tyg. Lek. The studies did not include works in which no statistical analysis was implemented, which applied to both national and international publications. That exclusion also extended to review papers, case reports, reviews of books, handbooks, monographs, reports from scientific congresses, and papers on historical topics. The number of works was defined in each volume. Next, an analysis was performed to establish how a suitable sample was obtained in the respective studies, differentiating two categories: random and targeted selection. Attention was also paid to the presence of a control sample in the individual works. The analysis also considered the presence of sample characteristics, setting up three categories: complete, partial and lacking. In evaluating the analyzed works an effort was made to present the results of the studies in tables and figures (Tab. 1, 3). Analysis was accomplished with regard to the rate of employing statistical methods in the analyzed works in the relevant volumes of the six selected national medical journals for the years 1988-1992, simultaneously determining the number of works in which no statistical methods were used. Concurrently, the frequency of applying individual statistical methods was analyzed in the scrutinized works. Prominence was given to fundamental statistical methods in the field of descriptive statistics (measures of position, measures of dispersion) as well as the most important methods of mathematical statistics, such as parametric tests of significance, analysis of variance (in single and dual classifications), non-parametric tests of significance, and correlation and regression. The works in which use was made of multiple correlation, multiple regression, or more complex methods of studying the relationship between two or more variables were incorporated into the works whose statistical methods were constituted by correlation and regression, as well as other methods, e.g. statistical methods used in epidemiology (coefficients of incidence and morbidity, standardization of coefficients, survival tables), factor analysis conducted by the Jacobi-Hotelling method, taxonomic methods and others. On the basis of the performed studies it has been established that the frequency of employing statistical methods in the six selected national medical journals in the years 1988-1992 was 61.1-66.0% of the analyzed works (Tab. 3), generally almost similar to the frequency reported in English-language medical journals. On the whole, no significant differences were disclosed in the frequency of the applied statistical methods (Tab. 4) or in the frequency of random samples (Tab. 3) in the analyzed works appearing in the medical journals in the respective years 1988-1992. The most frequently used statistical methods in the analyzed works for 1988-1992 were the measures of position (44.2-55.6%) and measures of dispersion (32.5-38.5%), as well as parametric tests of significance (26.3-33.1% of the works analyzed) (Tab. 4). For the purpose of increasing the frequency and reliability of the statistical methods used, the teaching of biostatistics should be expanded in medical studies and in postgraduate training designed for physicians and scientific-didactic workers.

  9. Myorelaxant Effect of Bee Venom Topical Skin Application in Patients with RDC/TMD Ia and RDC/TMD Ib: A Randomized, Double Blinded Study

    PubMed Central

    Baron, Stefan

    2014-01-01

The aim of the study was the evaluation of the myorelaxant action of bee venom (BV) ointment compared to placebo. A parallel-group, randomized, double-blinded trial was performed. Experimental group patients applied BV for 14 days, locally over the masseter muscles, during a 3-minute massage. Placebo group patients used Vaseline for the massage. Muscle tension was measured twice (TON1 and TON2) in rest muscle tonus (RMT) and maximal muscle contraction (MMC) on both sides, right and left, with Easy Train Myo EMG (Schwa-medico, Version 3.1). Reduction of muscle tonus was statistically relevant in the BV group and irrelevant in the placebo group. VAS scale reduction was statistically relevant in both groups: BV and placebo. Physiotherapy is an effective method for myofascial pain treatment, but 0.0005% BV ointment provides better relief in muscle tension reduction and analgesic effect. This trial is registered with Clinicaltrials.gov NCT02101632. PMID:25050337

  10. A MULTIPLE TESTING OF THE ABC METHOD AND THE DEVELOPMENT OF A SECOND-GENERATION MODEL. PART II, TEST RESULTS AND AN ANALYSIS OF RECALL RATIO.

    ERIC Educational Resources Information Center

    ALTMANN, BERTHOLD

After a brief summary of the test program (described more fully in LI 000 318), the statistical results tabulated as overall "ABC (Approach by Concept)-relevance ratios" and "ABC-recall figures" are presented and reviewed. An abstract model developed in accordance with Max Weber's "Idealtypus" ("Die Objektivitaet…

  11. Hazardous Communication and Tools for Quality: Basic Statistics. Responsive Text. Educational Materials for the Workplace.

    ERIC Educational Resources Information Center

    Vermont Inst. for Self-Reliance, Rutland.

This guide provides a description of Responsive Text (RT), a method for presenting job-relevant information within a computer-based support system. A summary of what RT is and why it is important is provided first. The first section of the guide provides a brief overview of what research tells us about the reading process and how the general design…

  12. Developing topic-specific search filters for PubMed with click-through data.

    PubMed

    Li, J; Lu, Z

    2013-01-01

    Search filters have been developed and demonstrated for better information access to the immense and ever-growing body of publications in the biomedical domain. However, to date the number of filters remains quite limited because the current filter development methods require significant human efforts in manual document review and filter term selection. In this regard, we aim to investigate automatic methods for generating search filters. We present an automated method to develop topic-specific filters on the basis of users' search logs in PubMed. Specifically, for a given topic, we first detect its relevant user queries and then include their corresponding clicked articles to serve as the topic-relevant document set accordingly. Next, we statistically identify informative terms that best represent the topic-relevant document set using a background set composed of topic irrelevant articles. Lastly, the selected representative terms are combined with Boolean operators and evaluated on benchmark datasets to derive the final filter with the best performance. We applied our method to develop filters for four clinical topics: nephrology, diabetes, pregnancy, and depression. For the nephrology filter, our method obtained performance comparable to the state of the art (sensitivity of 91.3%, specificity of 98.7%, precision of 94.6%, and accuracy of 97.2%). Similarly, high-performing results (over 90% in all measures) were obtained for the other three search filters. Based on PubMed click-through data, we successfully developed a high-performance method for generating topic-specific search filters that is significantly more efficient than existing manual methods. All data sets (topic-relevant and irrelevant document sets) used in this study and a demonstration system are publicly available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/downloads/CQ_filter/
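A toy sketch of the statistical term-selection step described above: score each term by its smoothed log-odds of appearing in the topic-relevant set versus the background set, keep the top-scoring terms, OR them into a filter, and evaluate sensitivity/specificity. The documents, stopword list, and scoring formula are illustrative assumptions, not the paper's exact procedure.

```python
import math
from collections import Counter

# Hypothetical stand-ins for the topic-relevant (clicked) and background sets
relevant = [
    "chronic kidney disease and dialysis outcomes",
    "renal failure treated with dialysis",
    "glomerular filtration rate in kidney disease",
    "kidney transplantation and renal function",
]
background = [
    "asthma therapy in children",
    "myocardial infarction risk factors",
    "renal cell carcinoma imaging",
    "migraine treatment trial",
]
STOP = {"and", "with", "in", "of", "the"}

def term_scores(rel, bg):
    """Smoothed log-odds of each term appearing in relevant vs background docs."""
    rc = Counter(t for d in rel for t in set(d.split()) - STOP)
    bc = Counter(t for d in bg for t in set(d.split()) - STOP)
    return {t: math.log((rc[t] + 0.5) / (bc[t] + 0.5)) for t in set(rc) | set(bc)}

scores = term_scores(relevant, background)
# Keep the top-scoring terms (ties broken alphabetically) and OR them together
terms = [t for t, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:3]]
query = " OR ".join(terms)

hits = lambda docs: sum(any(t in d.split() for t in terms) for d in docs)
sensitivity = hits(relevant) / len(relevant)
specificity = 1 - hits(background) / len(background)
print(f"filter: {query}")
print(f"sensitivity: {sensitivity:.2f}  specificity: {specificity:.2f}")
```

The paper's final step, choosing the Boolean combination that performs best on benchmark datasets, would iterate over candidate term sets and operators rather than fixing the top three as done here.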

  13. Quantifying the statistical importance of utilizing regression over classic energy intensity calculations for tracking efficiency improvements in industry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nimbalkar, Sachin U.; Wenning, Thomas J.; Guo, Wei

In the United States, manufacturing facilities accounted for about 32% of total domestic energy consumption in 2014. Robust energy tracking methodologies are critical to understanding energy performance in manufacturing facilities. Due to its simplicity and intuitiveness, the classic energy intensity method (i.e. the ratio of total energy use over total production) is the most widely adopted. However, the classic energy intensity method does not take into account the variation of other relevant parameters (e.g. product type, feedstock type, weather, etc.). Furthermore, the energy intensity method assumes that the facility's base energy consumption (energy use at zero production) is zero, which rarely holds true. Therefore, it is commonly recommended to utilize regression models rather than the energy intensity approach for tracking improvements at the facility level. Unfortunately, many energy managers have difficulty understanding why regression models are statistically better than the classic energy intensity method. While anecdotes and qualitative information may convince some, many have major reservations about the accuracy of regression models and whether it is worth the time and effort to gather data and build quality regression models. This paper will explain why regression models are theoretically and quantitatively more accurate for tracking energy performance improvements. Based on the analysis of data from 114 manufacturing plants over 12 years, this paper will present quantitative results on the importance of utilizing regression models over the energy intensity methodology. This paper will also document scenarios where regression models offer no significant advantage over the energy intensity method.
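The base-load argument above can be demonstrated on synthetic data: when a plant has nonzero energy use at zero production, the zero-intercept intensity model is systematically biased while a regression with an intercept recovers the base load. All numbers below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical monthly data: production (units) and energy use (MMBtu),
# with a nonzero base load that the classic intensity method ignores
production = rng.uniform(500, 1500, 24)
energy = 2000 + 3.0 * production + rng.normal(0, 50, 24)  # base + marginal

# Classic energy intensity: total energy / total production (zero intercept)
intensity = energy.sum() / production.sum()
pred_intensity = intensity * production

# Regression with an intercept captures the base load explicitly
slope, intercept = np.polyfit(production, energy, 1)
pred_regression = intercept + slope * production

mae_int = np.mean(np.abs(energy - pred_intensity))
mae_reg = np.mean(np.abs(energy - pred_regression))
print(f"MAE, intensity model:  {mae_int:7.1f} MMBtu")
print(f"MAE, regression model: {mae_reg:7.1f} MMBtu")
```

When the true base load is near zero the two models converge, which matches the paper's observation that regression is not always significantly better.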

  14. Statistics on continuous IBD data: Exact distribution evaluation for a pair of full(half)-sibs and a pair of a (great-) grandchild with a (great-) grandparent

    PubMed Central

    Stefanov, Valeri T

    2002-01-01

Background Pairs of related individuals are widely used in linkage analysis. Most tests for linkage analysis are based on statistics associated with identity-by-descent (IBD) data. Current biotechnology provides data on very densely packed loci and may therefore provide almost continuous IBD data for pairs of closely related individuals, so the distribution theory for statistics on continuous IBD data is of interest. In particular, distributional results which allow the evaluation of p-values for relevant tests are of importance. Results A technology is provided for numerical evaluation, with any given accuracy, of the cumulative probabilities of some statistics on continuous genome data for pairs of closely related individuals. In the case of a pair of full sibs, the following statistics are considered: (i) the proportion of genome with 2 (at least 1) haplotypes shared identical-by-descent (IBD) on a chromosomal segment, (ii) the number of distinct pieces (subsegments) of a chromosomal segment, on each of which exactly 2 (at least 1) haplotypes are shared IBD. The natural counterparts of these statistics for the other relationships are also considered. Relevant Maple codes are provided for rapid evaluation of the cumulative probabilities of such statistics. The genomic continuum model, with Haldane's model for the crossover process, is assumed. Conclusions A technology, together with relevant software codes for its automated implementation, is provided for exact evaluation of the distributions of relevant statistics associated with continuous genome data on closely related individuals. PMID:11996673
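The paper computes these distributions exactly; as a rough complement, the sketch below Monte Carlo-simulates statistic (i) for full sibs under the same assumptions (genomic continuum, Haldane's Poisson crossover model), on a grid approximation of a hypothetical 2-Morgan chromosome. At any locus full sibs share at least 1 haplotype IBD with probability 3/4 and both haplotypes with probability 1/4, so the simulated proportions should average near those values.

```python
import numpy as np

rng = np.random.default_rng(4)

length, grid = 2.0, np.linspace(0, 2.0, 2001)  # 2-Morgan chromosome, grid approx.

def transmission():
    """Grandparental phase (0/1) transmitted along the chromosome.

    Crossovers follow a Poisson process (Haldane's model, rate 1 per Morgan).
    """
    cuts = np.sort(rng.uniform(0, length, rng.poisson(length)))
    return (rng.integers(2) + np.searchsorted(cuts, grid)) % 2

def sib_ibd_proportions():
    """(proportion with >= 1 haplotype IBD, proportion with 2 IBD) for full sibs."""
    paternal = transmission() == transmission()   # sibs match father's phase?
    maternal = transmission() == transmission()   # sibs match mother's phase?
    return np.mean(paternal | maternal), np.mean(paternal & maternal)

sims = np.array([sib_ibd_proportions() for _ in range(2000)])
print(f"mean P(>=1 IBD): {sims[:, 0].mean():.3f}  (locus-wise theory 0.75)")
print(f"mean P(2 IBD):   {sims[:, 1].mean():.3f}  (locus-wise theory 0.25)")
```

The spread of `sims` around those means is exactly the distribution the paper evaluates analytically; simulation only approximates it, but is useful as a sanity check.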

  15. Statistical scaling of geometric characteristics in stochastically generated pore microstructures

    DOE PAGES

    Hyman, Jeffrey D.; Guadagnini, Alberto; Winter, C. Larrabee

    2015-05-21

In this study, we analyze the statistical scaling of structural attributes of virtual porous microstructures that are stochastically generated by thresholding Gaussian random fields. Characterization of the extent to which randomly generated pore spaces can be considered representative of a particular rock sample depends on the metrics employed to compare the virtual sample against its physical counterpart. Typically, comparisons against features and/or patterns of geometric observables, e.g., porosity and specific surface area, flow-related macroscopic parameters, e.g., permeability, or autocorrelation functions are used to assess the representativeness of a virtual sample, and thereby the quality of the generation method. Here, we rely on manifestations of statistical scaling of geometric observables which were recently observed in real millimeter-scale rock samples [13] as additional relevant metrics by which to characterize a virtual sample. We explore the statistical scaling of two geometric observables, namely porosity (Φ) and specific surface area (SSA), of porous microstructures generated using the method of Smolarkiewicz and Winter [42] and Hyman and Winter [22]. Our results suggest that the method can produce virtual pore space samples displaying the symptoms of statistical scaling observed in real rock samples. Order-q sample structure functions (statistical moments of absolute increments) of Φ and SSA scale as a power of the separation distance (lag) over a range of lags, and extended self-similarity (a linear relationship between log structure functions of successive orders) appears to be an intrinsic property of the generated media. The width of the range of lags where power-law scaling is observed and the Hurst coefficient associated with the variables we consider can be controlled by the generation parameters of the method.
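A simplified 1D analogue of the analysis pipeline, not the authors' 3D generation method: threshold a smoothed Gaussian field to get a binary pore indicator, compute a local porosity profile, and fit the power-law scaling of order-q structure functions of its increments. All scales and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

# 1D Gaussian random field: white noise smoothed with a Gaussian kernel
noise = rng.normal(size=5000)
kern = np.exp(-0.5 * (np.arange(-30, 31) / 10.0) ** 2)
field = np.convolve(noise, kern / kern.sum(), mode="valid")

# Threshold to a binary pore indicator, then a local porosity profile
pores = (field > np.quantile(field, 0.6)).astype(float)
win = 50
phi = np.convolve(pores, np.ones(win) / win, mode="valid")

def structure_function(x, lag, q):
    """Order-q statistical moment of absolute increments at a given lag."""
    return np.mean(np.abs(x[lag:] - x[:-lag]) ** q)

lags = np.array([2, 4, 8, 16, 32])
S1 = np.array([structure_function(phi, l, 1) for l in lags])
S2 = np.array([structure_function(phi, l, 2) for l in lags])

# Power-law scaling: slope of log S_q vs log lag estimates q * Hurst coefficient
h1 = np.polyfit(np.log(lags), np.log(S1), 1)[0]
h2 = np.polyfit(np.log(lags), np.log(S2), 1)[0] / 2.0
print(f"Hurst estimates from S1, S2: {h1:.2f}, {h2:.2f}")
```

Extended self-similarity would be checked by regressing log S2 against log S1 directly; agreement between `h1` and `h2` is the simplest symptom of it.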

  16. Statistics of some atmospheric turbulence records relevant to aircraft response calculations

    NASA Technical Reports Server (NTRS)

    Mark, W. D.; Fischer, R. W.

    1981-01-01

    Methods for characterizing atmospheric turbulence are described. The methods illustrated include maximum likelihood estimation of the integral scale and intensity of records obeying the von Karman transverse power spectral form, constrained least-squares estimation of the parameters of a parametric representation of autocorrelation functions, estimation of the power spectral density of the instantaneous variance of a record with temporally fluctuating variance, and estimation of the probability density functions of various turbulence components. Descriptions of the computer programs used in the computations are given, and a full listing of these programs is included.

  17. Promoting Statistical Thinking in Schools with Road Injury Data

    ERIC Educational Resources Information Center

    Woltman, Marie

    2017-01-01

    Road injury is an immediately relevant topic for 9-19 year olds. Current availability of Open Data makes it increasingly possible to find locally relevant data. Statistical lessons developed from these data can mutually reinforce life lessons about minimizing risk on the road. Devon County Council demonstrate how a wide array of statistical…

  18. Utilisation of Local Inputs in the Funding and Administration of Education in Nigeria

    ERIC Educational Resources Information Center

    Akiri, Agharuwhe A.

    2014-01-01

    The article discussed how and why schools in Nigeria are administered and funded, and who is in charge of doing so. The author utilised a relevant statistical approach to examine and discuss various political and historical trends affecting education. Besides this, relevant documented statistical data were used to both buttress and substantiate related…

  19. Heart Rate Variability Dynamics for the Prognosis of Cardiovascular Risk

    PubMed Central

    Ramirez-Villegas, Juan F.; Lam-Espinosa, Eric; Ramirez-Moreno, David F.; Calvo-Echeverry, Paulo C.; Agredo-Rodriguez, Wilfredo

    2011-01-01

    Statistical, spectral, multi-resolution and non-linear methods were applied to heart rate variability (HRV) series linked with classification schemes for the prognosis of cardiovascular risk. A total of 90 HRV records were analyzed: 45 from healthy subjects and 45 from cardiovascular risk patients. A total of 52 features from all the analysis methods were evaluated using the standard two-sample Kolmogorov-Smirnov test (KS-test). The results of the statistical procedure provided input to multi-layer perceptron (MLP) neural networks, radial basis function (RBF) neural networks and support vector machines (SVM) for data classification. These schemes showed high performance with both training and test sets and many combinations of features (with a maximum accuracy of 96.67%). Additionally, breathing frequency emerged as a relevant feature in the HRV analysis. PMID:21386966
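The screening-then-classification pipeline described in this record can be sketched as follows. The data here are synthetic stand-ins (group sizes and feature count mirror the abstract, but the features, effect sizes, and classifier settings are illustrative assumptions, not the authors' dataset):

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Synthetic stand-in for the HRV dataset: 45 healthy vs. 45 at-risk subjects,
# 52 features of which only the first 5 truly differ between the groups.
healthy = rng.normal(0.0, 1.0, size=(45, 52))
at_risk = rng.normal(0.0, 1.0, size=(45, 52))
at_risk[:, :5] += 1.5

# Screen each feature with the two-sample Kolmogorov-Smirnov test.
pvals = np.array([ks_2samp(healthy[:, j], at_risk[:, j]).pvalue
                  for j in range(52)])
selected = np.where(pvals < 0.05)[0]

# Feed the screened features to a classifier, as in the record above.
X = np.vstack([healthy, at_risk])[:, selected]
y = np.array([0] * 45 + [1] * 45)
acc = cross_val_score(SVC(kernel="rbf"), X, y, cv=5).mean()
```

With an unadjusted 0.05 threshold a few null features slip through by chance; in practice the cutoff (or a multiplicity correction) is a design choice.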

  20. Pitfalls of national routine death statistics for maternal mortality study.

    PubMed

    Saucedo, Monica; Bouvier-Colle, Marie-Hélène; Chantry, Anne A; Lamarche-Vadel, Agathe; Rey, Grégoire; Deneux-Tharaux, Catherine

    2014-11-01

    The lessons learned from the study of maternal deaths depend on the accuracy of data. Our objective was to assess time trends in the underestimation of maternal mortality (MM) in the national routine death statistics in France and to evaluate their current accuracy for the selection and causes of maternal deaths. National data obtained by enhanced methods in 1989, 1999, and 2007-09 were used as the gold standard to assess time trends in the underestimation of MM ratios (MMRs) in death statistics. Enhanced data and death statistics for 2007-09 were further compared by characterising false negatives (FNs) and false positives (FPs). The distribution of cause-specific MMRs, as assessed by each system, was described. Underestimation of MM in death statistics decreased from 55.6% in 1989 to 11.4% in 2007-09 (P < 0.001). In 2007-09, of 787 pregnancy-associated deaths, 254 were classified as maternal by the enhanced system and 211 by the death statistics; 34% of maternal deaths in the enhanced system were FNs in the death statistics, and 20% of maternal deaths in the death statistics were FPs. The hierarchy of causes of MM differed between the two systems. The discordances were mainly explained by the lack of precision in the drafting of death certificates by clinicians. Although the underestimation of MM in routine death statistics has decreased substantially over time, one third of maternal deaths remain unidentified, and the main causes of death are incorrectly identified in these data. Defining relevant priorities in maternal health requires the use of enhanced methods for MM study. © 2014 John Wiley & Sons Ltd.

  1. Establishment of a Uniform Format for Data Reporting of Structural Material Properties for Reliability Analysis

    DTIC Science & Technology

    1994-06-30

    …"Crack Tip Opening Displacement (CTOD) Fracture Toughness Measurement". The method has found application in elastic-plastic fracture mechanics (EPFM)… 6.1 Proposed Material Property Database Format and Hierarchy… 6.2 Sample Application of the Material Property Database… the E 49.05 sub-committee. The relevant quality indicators applicable to the present program are: source of data, statistical basis of data…

  2. Statistical approach for selection of biologically informative genes.

    PubMed

    Das, Samarendra; Rai, Anil; Mishra, D C; Rai, Shesh N

    2018-05-20

    Selection of informative genes from high-dimensional gene expression data has emerged as an important research area in genomics. Many gene selection techniques proposed so far are based on either a relevancy or a redundancy measure. Further, the performance of these techniques has typically been judged through post-selection classification accuracy computed by a classifier using the selected genes. This performance metric may be statistically sound but may not be biologically relevant. A statistical approach, Boot-MRMR, was proposed based on a composite measure of maximum relevance and minimum redundancy, which is both statistically sound and biologically relevant for informative gene selection. For comparative evaluation of the proposed approach, we developed two biological sufficiency criteria, i.e. Gene Set Enrichment with QTL (GSEQ) and a biological similarity score based on Gene Ontology (GO). Further, a systematic and rigorous evaluation of the proposed technique against 12 existing gene selection techniques was carried out using five gene expression datasets. This evaluation was based on a broad spectrum of statistically sound (e.g. subject classification) and biologically relevant (based on QTL and GO) criteria under a multiple-criteria decision-making framework. The performance analysis showed that the proposed technique selects informative genes which are more biologically relevant. The proposed technique is also quite competitive with the existing techniques with respect to subject classification and computational time. Our results also showed that under the multiple-criteria decision-making setup, the proposed technique is best for informative gene selection over the available alternatives. Based on the proposed approach, an R package, BootMRMR, has been developed and is available at https://cran.r-project.org/web/packages/BootMRMR. This study will provide a practical guide for selecting statistical techniques to identify informative genes from high-dimensional expression data for breeding and systems biology studies. Published by Elsevier B.V.
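The maximum-relevance/minimum-redundancy idea underlying this record can be sketched with a simple correlation-based greedy search. This is not the published Boot-MRMR method (which adds bootstrap resampling and different relevance/redundancy measures); the data, effect sizes, and scoring rule below are illustrative assumptions:

```python
import numpy as np

def greedy_mrmr(X, y, k):
    """Greedy max-relevance / min-redundancy selection (correlation-based sketch).

    Relevance: |Pearson correlation| of each feature with the class label.
    Redundancy: mean |correlation| with the features already selected.
    """
    n_feat = X.shape[1]
    relevance = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_feat)])
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_feat):
            if j in selected:
                continue
            redundancy = np.mean([abs(np.corrcoef(X[:, j], X[:, s])[0, 1])
                                  for s in selected])
            if relevance[j] - redundancy > best_score:
                best_j, best_score = j, relevance[j] - redundancy
        selected.append(best_j)
    return selected

rng = np.random.default_rng(2)
n = 300
y = rng.integers(0, 2, n).astype(float)
informative = y[:, None] + 0.5 * rng.standard_normal((n, 3))          # "genes" 0-2
copies = informative[:, [0, 0]] + 0.05 * rng.standard_normal((n, 2))  # near-copies 3-4
noise = rng.standard_normal((n, 5))                                   # noise 5-9
X = np.hstack([informative, copies, noise])

picked = greedy_mrmr(X, y, k=3)
```

The redundancy penalty is what keeps the near-duplicate "genes" 3 and 4 from being selected alongside gene 0, which is the behaviour a pure relevance ranking cannot deliver.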

  3. Discovering Conformational Sub-States Relevant to Protein Function

    PubMed Central

    Ramanathan, Arvind; Savol, Andrej J.; Langmead, Christopher J.; Agarwal, Pratul K.; Chennubhotla, Chakra S.

    2011-01-01

    Background Internal motions enable proteins to explore a range of conformations, even in the vicinity of the native state. The role of conformational fluctuations in the designated function of a protein is widely debated. Emerging evidence suggests that sub-groups within the range of conformations (or sub-states) contain properties that may be functionally relevant. However, low populations in these sub-states and the transient nature of conformational transitions between these sub-states present significant challenges for their identification and characterization. Methods and Findings To overcome these challenges we have developed a new computational technique, quasi-anharmonic analysis (QAA). QAA utilizes higher-order statistics of protein motions to identify sub-states in the conformational landscape. Further, the focus on anharmonicity allows identification of conformational fluctuations that enable transitions between sub-states. QAA applied to equilibrium simulations of human ubiquitin and T4 lysozyme reveals functionally relevant sub-states and protein motions involved in molecular recognition. In combination with a reaction pathway sampling method, QAA characterizes conformational sub-states associated with cis/trans peptidyl-prolyl isomerization catalyzed by the enzyme cyclophilin A. In these three proteins, QAA allows identification of conformational sub-states, with critical structural and dynamical features relevant to protein function. Conclusions Overall, QAA provides a novel framework to intuitively understand the biophysical basis of conformational diversity and its relevance to protein function. PMID:21297978

  4. Estimating the Proportion of True Null Hypotheses Using the Pattern of Observed p-values

    PubMed Central

    Tong, Tiejun; Feng, Zeny; Hilton, Julia S.; Zhao, Hongyu

    2013-01-01

    Estimating the proportion of true null hypotheses, π0, has attracted much attention in the recent statistical literature. Besides its apparent relevance for a set of specific scientific hypotheses, an accurate estimate of this parameter is key for many multiple testing procedures. Most existing methods for estimating π0 in the literature are motivated from the independence assumption of test statistics, which is often not true in reality. Simulations indicate that most existing estimators in the presence of the dependence among test statistics can be poor, mainly due to the increase of variation in these estimators. In this paper, we propose several data-driven methods for estimating π0 by incorporating the distribution pattern of the observed p-values as a practical approach to address potential dependence among test statistics. Specifically, we use a linear fit to give a data-driven estimate for the proportion of true-null p-values in (λ, 1] over the whole range [0, 1] instead of using the expected proportion at 1 − λ. We find that the proposed estimators may substantially decrease the variance of the estimated true null proportion and thus improve the overall performance. PMID:24078762
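As context for this record, the classical threshold ("Storey-type") estimator that such pattern-based refinements build on can be sketched in a few lines. This is not the authors' linear-fit method; the simulation settings below (null fraction, alternative distribution) are illustrative assumptions:

```python
import numpy as np

def pi0_threshold_estimate(pvals, lam=0.5):
    """Classical threshold estimate of the true-null proportion pi0:
    p-values above lambda are assumed to be mostly nulls, and null
    p-values are uniform, so their frequency is rescaled by (1 - lambda)."""
    pvals = np.asarray(pvals)
    return min(1.0, float(np.mean(pvals > lam)) / (1.0 - lam))

rng = np.random.default_rng(3)
m, pi0_true = 10_000, 0.8
n_null = int(m * pi0_true)
null_p = rng.uniform(size=n_null)             # true nulls: Uniform(0, 1)
alt_p = rng.beta(0.2, 5.0, size=m - n_null)   # alternatives: concentrated near 0
pvals = np.concatenate([null_p, alt_p])

pi0_hat = pi0_threshold_estimate(pvals)       # close to the true 0.8 here
```

The paper's contribution addresses the weakness visible in this sketch: the estimate depends on a single arbitrary λ and becomes unstable under dependence, which a fit to the whole p-value pattern over (λ, 1] is designed to mitigate.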

  5. Estimating the Proportion of True Null Hypotheses Using the Pattern of Observed p-values.

    PubMed

    Tong, Tiejun; Feng, Zeny; Hilton, Julia S; Zhao, Hongyu

    2013-01-01

    Estimating the proportion of true null hypotheses, π0, has attracted much attention in the recent statistical literature. Besides its apparent relevance for a set of specific scientific hypotheses, an accurate estimate of this parameter is key for many multiple testing procedures. Most existing methods for estimating π0 in the literature are motivated from the independence assumption of test statistics, which is often not true in reality. Simulations indicate that most existing estimators in the presence of the dependence among test statistics can be poor, mainly due to the increase of variation in these estimators. In this paper, we propose several data-driven methods for estimating π0 by incorporating the distribution pattern of the observed p-values as a practical approach to address potential dependence among test statistics. Specifically, we use a linear fit to give a data-driven estimate for the proportion of true-null p-values in (λ, 1] over the whole range [0, 1] instead of using the expected proportion at 1 − λ. We find that the proposed estimators may substantially decrease the variance of the estimated true null proportion and thus improve the overall performance.

  6. VALUE - A Framework to Validate Downscaling Approaches for Climate Change Studies

    NASA Astrophysics Data System (ADS)

    Maraun, Douglas; Widmann, Martin; Gutiérrez, José M.; Kotlarski, Sven; Chandler, Richard E.; Hertig, Elke; Wibig, Joanna; Huth, Radan; Wilcke, Renate A. I.

    2015-04-01

    VALUE is an open European network to validate and compare downscaling methods for climate change research. VALUE aims to foster collaboration and knowledge exchange between climatologists, impact modellers, statisticians, and stakeholders to establish an interdisciplinary downscaling community. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. Here, we present the key ingredients of this framework. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur: what is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Do methods fail in representing regional climate change? How well is regional climate represented overall, including errors inherited from global climate models? The framework will be the basis for a comprehensive community-open downscaling intercomparison study, but is intended also to provide general guidance for other validation studies.

  7. VALUE: A framework to validate downscaling approaches for climate change studies

    NASA Astrophysics Data System (ADS)

    Maraun, Douglas; Widmann, Martin; Gutiérrez, José M.; Kotlarski, Sven; Chandler, Richard E.; Hertig, Elke; Wibig, Joanna; Huth, Radan; Wilcke, Renate A. I.

    2015-01-01

    VALUE is an open European network to validate and compare downscaling methods for climate change research. VALUE aims to foster collaboration and knowledge exchange between climatologists, impact modellers, statisticians, and stakeholders to establish an interdisciplinary downscaling community. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. In this paper, we present the key ingredients of this framework. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur: what is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Do methods fail in representing regional climate change? How well is regional climate represented overall, including errors inherited from global climate models? The framework will be the basis for a comprehensive community-open downscaling intercomparison study, but is intended also to provide general guidance for other validation studies.

  8. Implementation of novel statistical procedures and other advanced approaches to improve analysis of CASA data.

    PubMed

    Ramón, M; Martínez-Pastor, F

    2018-04-23

    Computer-aided sperm analysis (CASA) produces a wealth of data that is frequently ignored. The use of multiparametric statistical methods can help explore these datasets, unveiling the subpopulation structure of sperm samples. In this review we analyse the significance of the internal heterogeneity of sperm samples and its relevance. We also provide a brief description of the statistical tools used for extracting sperm subpopulations from the datasets, namely unsupervised clustering (with non-hierarchical, hierarchical and two-step methods) and the most advanced supervised methods, based on machine learning. The former approach has allowed exploration of subpopulation patterns in many species, whereas the latter offers further possibilities, especially for functional studies and the practical use of subpopulation analysis. We also consider novel approaches, such as the use of geometric morphometrics or imaging flow cytometry. Finally, although applying clustering analyses to the data provided by CASA systems yields valuable information on sperm samples, there are several caveats. Protocols for capturing and analysing motility or morphometry should be standardised and adapted to each experiment, and the algorithms should be open in order to allow comparison of results between laboratories. Moreover, we must be aware of new technology that could change the paradigm for studying sperm motility and morphology.

  9. Analytics for Cyber Network Defense

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Plantenga, Todd.; Kolda, Tamara Gibson

    2011-06-01

    This report provides a brief survey of analytics tools considered relevant to cyber network defense (CND). Ideas and tools come from fields such as statistics, data mining, and knowledge discovery. Some analytics are considered standard mathematical or statistical techniques, while others reflect current research directions. In all cases the report attempts to explain the relevance to CND with brief examples.

  10. The Effects of Clinically Relevant Multiple-Choice Items on the Statistical Discrimination of Physician Clinical Competence.

    ERIC Educational Resources Information Center

    Downing, Steven M.; Maatsch, Jack L.

    To test the effect of clinically relevant multiple-choice item content on the validity of statistical discriminations of physicians' clinical competence, data were collected from a field test of the Emergency Medicine Examination, test items for the certification of specialists in emergency medicine. Two 91-item multiple-choice subscales were…

  11. Semantically enabled and statistically supported biological hypothesis testing with tissue microarray databases

    PubMed Central

    2011-01-01

    Background Although many biological databases are applying semantic web technologies, meaningful biological hypothesis testing cannot be easily achieved. Database-driven high throughput genomic hypothesis testing requires both the capability of obtaining semantically relevant experimental data and that of performing relevant statistical testing on the retrieved data. Tissue Microarray (TMA) data are semantically rich and contain many biologically important hypotheses waiting for high throughput conclusions. Methods An application-specific ontology was developed for managing TMA and DNA microarray databases with semantic web technologies. Data were represented as Resource Description Framework (RDF) according to the framework of the ontology. Applications for hypothesis testing (Xperanto-RDF) for TMA data were designed and implemented by (1) formulating the syntactic and semantic structures of the hypotheses derived from TMA experiments, (2) formulating SPARQLs to reflect the semantic structures of the hypotheses, and (3) performing statistical tests with the result sets returned by the SPARQLs. Results When a user designs a hypothesis in Xperanto-RDF and submits it, the hypothesis can be tested against TMA experimental data stored in Xperanto-RDF. When we evaluated four previously validated hypotheses as an illustration, all the hypotheses were supported by Xperanto-RDF. Conclusions We demonstrated the utility of high throughput biological hypothesis testing. We believe that preliminary investigation can benefit from this approach before highly controlled experiments are performed. PMID:21342584

  12. Use of FEV1 in Cystic Fibrosis Epidemiologic Studies and Clinical Trials: A Statistical Perspective for the Clinical Researcher

    PubMed Central

    Szczesniak, Rhonda; Heltshe, Sonya L.; Stanojevic, Sanja; Mayer-Hamblett, Nicole

    2017-01-01

    Background Forced expiratory volume in 1 second (FEV1) is an established marker of cystic fibrosis (CF) disease progression that is used to capture clinical course and evaluate therapeutic efficacy. The research community has established FEV1 surveillance data through a variety of observational data sources such as patient registries, and there is a growing pipeline of new CF therapies demonstrated to be efficacious in clinical trials by establishing improvements in FEV1. Results In this review, we summarize from a statistical perspective the clinical relevance of FEV1 based on its association with morbidity and mortality in CF, its role in epidemiologic studies of disease progression and comparative effectiveness, and its utility in clinical trials. In addition, we identify opportunities to advance epidemiologic research and the clinical development pipeline through further statistical considerations. Conclusions Our understanding of CF disease course, therapeutics, and clinical care has evolved immensely in the past decades, in large part due to the thoughtful application of rigorous research methods and meaningful clinical endpoints such as FEV1. A continued commitment to conduct research that minimizes the potential for bias, maximizes the limited patient population, and harmonizes approaches to FEV1 analysis while maintaining clinical relevance, will facilitate further opportunities to advance CF care. PMID:28117136

  13. Radiomic analysis in prediction of Human Papilloma Virus status.

    PubMed

    Yu, Kaixian; Zhang, Youyi; Yu, Yang; Huang, Chao; Liu, Rongjie; Li, Tengfei; Yang, Liuqing; Morris, Jeffrey S; Baladandayuthapani, Veerabhadran; Zhu, Hongtu

    2017-12-01

    Human Papilloma Virus (HPV) has been associated with oropharyngeal cancer prognosis. Traditionally, HPV status has been tested through invasive lab tests. Recently, the rapid development of statistical image analysis techniques has enabled precise quantitative analysis of medical images. The quantitative analysis of Computed Tomography (CT) provides a non-invasive way to assess HPV status for oropharynx cancer patients. We designed a statistical radiomics approach analyzing CT images to predict HPV status. Various radiomics features were extracted from CT scans and analyzed using statistical feature selection and prediction methods. Our approach ranked the highest in the 2016 Medical Image Computing and Computer Assisted Intervention (MICCAI) grand challenge: Oropharynx Cancer (OPC) Radiomics Challenge, Human Papilloma Virus (HPV) Status Prediction. Further analysis of the most relevant radiomic features distinguishing HPV positive and negative subjects suggested that HPV positive patients usually have smaller and simpler tumors.

  14. Statistical deprojection of galaxy pairs

    NASA Astrophysics Data System (ADS)

    Nottale, Laurent; Chamaraux, Pierre

    2018-06-01

    Aims: The purpose of the present paper is to provide methods of statistical analysis of the physical properties of galaxy pairs. We perform this study to apply it later to catalogs of isolated pairs of galaxies, especially two new catalogs we recently constructed that contain ≈1000 and ≈13 000 pairs, respectively. We are particularly interested in the dynamics of those pairs, including the determination of their masses. Methods: We could not compute the dynamical parameters directly since the necessary data are incomplete. Indeed, we only have at our disposal one component of the intervelocity between the members, namely along the line of sight, and two components of their interdistance, i.e., the projection on the sky-plane. Moreover, we know only one point of each galaxy orbit. Hence we need statistical methods to find the probability distribution of 3D interdistances and 3D intervelocities from their projections; we designed such methods, which we group under the term deprojection. Results: We proceed in two steps to determine and use the deprojection methods. First we derive the probability distributions expected for the various relevant projected quantities, namely the intervelocity v_z, the interdistance r_p, their ratio, and the product r_p v_z^2, which is involved in mass determination. In a second step, we propose various methods of deprojection of those parameters based on the previous analysis. We start from a histogram of the projected data and we apply inversion formulae to obtain the deprojected distributions; lastly, we test the methods by numerical simulations, which also allow us to determine the uncertainties involved.
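The forward problem that deprojection inverts is easy to simulate. The sketch below draws isotropic orientations for a pair of fixed 3D separation and intervelocity and recovers the known average projection factors; for simplicity it assumes the intervelocity vector lies along the pair axis (an illustrative assumption, not the paper's full orbit model):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000
r_true, v_true = 1.0, 1.0      # fixed 3D interdistance and intervelocity

# Isotropic orientation of the pair axis: cos(theta) uniform on [-1, 1].
cos_t = rng.uniform(-1.0, 1.0, n)
sin_t = np.sqrt(1.0 - cos_t ** 2)

r_p = r_true * sin_t           # sky-plane projection of the interdistance
v_z = v_true * np.abs(cos_t)   # line-of-sight projection of the intervelocity

# Average projection factors that any deprojection must undo:
mean_rp = r_p.mean()           # E[r_p] = (pi/4) * r for isotropic orientations
mean_vz = v_z.mean()           # E[v_z] = v / 2
```

Comparing simulated projected histograms like these against the analytic distributions is exactly the kind of numerical test the abstract describes for validating the inversion formulae.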

  15. Pharmacy Students’ Perceptions of Natural Science and Mathematics Subjects

    PubMed Central

    Wilson, Sarah Ellen; Wan, Kai-Wai

    2014-01-01

    Objective. To determine the level of importance pharmacy students placed on science and mathematics subjects for pursuing a career in pharmacy. Method. Two hundred fifty-four students completed a survey instrument developed to investigate students’ perceptions of the relevance of science and mathematics subjects to a career in pharmacy. Pharmacy students in all 4 years of a master of pharmacy (MPharm) degree program were invited to complete the survey instrument. Results. Students viewed chemistry-based and biology-based subjects as relevant to a pharmacy career, whereas physics and mathematics-based subjects such as logarithms, statistics, and algebra were not viewed as important to a career in pharmacy. Conclusion. Students’ experience in pharmacy and year of study influenced their perceptions of subjects relevant to a pharmacy career. Pharmacy educators need to consider how they can help students recognize the importance of scientific knowledge earlier in the pharmacy curriculum. PMID:25147390

  16. Assessment methodologies and statistical issues for computer-aided diagnosis of lung nodules in computed tomography: contemporary research topics relevant to the lung image database consortium.

    PubMed

    Dodd, Lori E; Wagner, Robert F; Armato, Samuel G; McNitt-Gray, Michael F; Beiden, Sergey; Chan, Heang-Ping; Gur, David; McLennan, Geoffrey; Metz, Charles E; Petrick, Nicholas; Sahiner, Berkman; Sayre, Jim

    2004-04-01

    Cancer of the lung and bronchus is the leading fatal malignancy in the United States. Five-year survival is low, but treatment of early stage disease considerably improves chances of survival. Advances in multidetector-row computed tomography technology provide detection of smaller lung nodules and offer a potentially effective screening tool. The large number of images per exam, however, requires considerable radiologist time for interpretation and is an impediment to clinical throughput. Thus, computer-aided diagnosis (CAD) methods are needed to assist radiologists with their decision making. To promote the development of CAD methods, the National Cancer Institute formed the Lung Image Database Consortium (LIDC). The LIDC is charged with developing the consensus and standards necessary to create an image database of multidetector-row computed tomography lung images as a resource for CAD researchers. To develop such a prospective database, its potential uses must be anticipated. The ultimate applications will influence the information that must be included along with the images, the relevant measures of algorithm performance, and the number of required images. In this article we outline assessment methodologies and statistical issues as they relate to several potential uses of the LIDC database. We review methods for performance assessment and discuss issues of defining "truth" as well as the complications that arise when truth information is not available. We also discuss issues about sizing and populating a database.

  17. Outcomes associated with community-based research projects in teaching undergraduate public health.

    PubMed

    Bouhaimed, Manal; Thalib, Lukman; Doi, Suhail A R

    2008-01-01

    Community based research projects have been widely used in teaching public health in many institutions. Nevertheless, there is a paucity of information on the learning outcomes of such a teaching strategy. We therefore attempted to evaluate our experience with such a project based teaching process. The objective of this study was to evaluate the factors related to quality, impact and relevance of a 6-week student project for teaching public health in the faculty of medicine at Kuwait University. Interactive sessions familiarized students with research methods. Concurrently, they designed and completed a participatory project with a Community Medicine mentor. Questionnaires were used to assess quality, impact and relevance of the project, and these were correlated with multiple demographic, statistical and research design factors. We evaluated a total of 104 projects that were completed during the period of September 2001 to June 2006. Three dimensions of outcome were assessed: quality, impact and relevance. The average (mean ± SE; maximum of 5) scores across all projects were 2.6 ± 0.05 (range 1.7-3.7) for quality, 2.8 ± 0.06 (range 1.7-4.3) for impact and 3.3 ± 0.08 (range 1.3-5) for relevance. The analysis of the relationship between various factors and the scores on each dimension of assessment revealed that various factors were associated with improved quality, impact or relevance to public health practice. We conclude that use of more objective measurement instruments with better a priori conceptualization along with appropriate use of statistics and a more developed study design were likely to result in more meaningful research outcomes. We also found that a biostatistics or epidemiology mentor improved the research outcome.

  18. Supervised learning methods for pathological arterial pulse wave differentiation: A SVM and neural networks approach.

    PubMed

    Paiva, Joana S; Cardoso, João; Pereira, Tânia

    2018-01-01

    The main goal of this study was to develop an automatic method based on supervised learning methods, able to distinguish healthy from pathologic arterial pulse wave (APW), and those two from noisy waveforms (non-relevant segments of the signal), from the data acquired during a clinical examination with a novel optical system. The APW dataset analysed was composed of signals acquired in a clinical environment from a total of 213 subjects, including healthy volunteers and non-healthy patients. The signals were parameterised by means of 39 pulse features: morphologic features, time-domain statistics, cross-correlation features, and wavelet features. The multiclass Support Vector Machine Recursive Feature Elimination (SVM-RFE) method was used to select the most relevant features. A comparative study was performed in order to evaluate the performance of the two classifiers: Support Vector Machine (SVM) and Artificial Neural Network (ANN). SVM achieved a statistically significant better performance for this problem, with an average accuracy of 0.9917±0.0024 and an F-Measure of 0.9925±0.0019, in comparison with ANN, which reached values of 0.9847±0.0032 and 0.9852±0.0031 for Accuracy and F-Measure, respectively. A significant difference was observed between the performances obtained with the SVM classifier using different numbers of features from the original set available. The comparison between SVM and ANN allowed us to reassert the higher performance of SVM. The results obtained in this study showed the potential of the proposed method to differentiate those three important signal outcomes (healthy, pathologic and noise) and to reduce bias associated with clinical diagnosis of cardiovascular disease using APW. Copyright © 2017 Elsevier B.V. All rights reserved.
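The SVM-RFE step described in this record can be sketched with scikit-learn. The three-class dataset below is a synthetic stand-in (the class structure, effect sizes, and number of informative features are illustrative assumptions; only the 39-feature count mirrors the abstract):

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(4)
# Synthetic stand-in for the pulse-wave data: 3 classes (healthy, pathologic,
# noise) and 39 features, of which only the first 6 are informative.
n_per_class, n_feat = 70, 39
X = rng.standard_normal((3 * n_per_class, n_feat))
y = np.repeat([0, 1, 2], n_per_class)
for c in range(3):
    X[y == c, :6] += 2.0 * c      # class-dependent shift on informative features

# RFE with a linear SVM repeatedly drops the feature with the smallest weight.
selector = RFE(SVC(kernel="linear"), n_features_to_select=6).fit(X, y)
kept = np.where(selector.support_)[0]

acc = cross_val_score(SVC(kernel="linear"), X[:, kept], y, cv=5).mean()
```

A linear kernel is used here because RFE needs per-feature weights (`coef_`) to rank features; the ranked subset can then be handed to any downstream classifier, RBF-SVM or ANN alike.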

  19. Do polymorphisms of 5,10-methylenetetrahydrofolate reductase (MTHFR) gene affect the risk of childhood acute lymphoblastic leukemia?

    PubMed

    Pereira, Tiago Veiga; Rudnicki, Martina; Pereira, Alexandre Costa; Pombo-de-Oliveira, Maria S; Franco, Rendrik França

    2006-01-01

    Meta-analysis has become an important statistical tool in genetic association studies, since it may provide more powerful and precise estimates. However, meta-analytic studies are prone to several potential biases, not only because of the preferential publication of "positive" studies but also due to difficulties in obtaining all relevant information during the study selection process. In this letter, we point out major problems in meta-analysis that may lead to biased conclusions, illustrated by an empirical example of two recent meta-analyses on the relation between MTHFR polymorphisms and risk of acute lymphoblastic leukemia that, despite the similarity in statistical methods and period of study selection, provided partially conflicting results.

  20. Supplementing land-use statistics with landscape metrics: some methodological considerations.

    PubMed

    Herzog, F; Lausch, A

    2001-11-01

    Landscape monitoring usually relies on land-use statistics which reflect the share of land-use/land-cover types. In order to understand the functioning of landscapes, landscape pattern must be considered as well. Indicators which address the spatial configuration of landscapes are therefore needed. The suitability of landscape metrics, which are computed from the type, geometry and arrangement of patches, is examined. Two case studies in a surface mining region show that landscape metrics capture landscape structure but are highly dependent on the data model and on the methods of data analysis. For landscape metrics to become part of policy-relevant sets of environmental indicators, standardised procedures for their computation from remote sensing images must be developed.
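
    One of the simplest landscape metrics the record alludes to, the number of patches of a cover type, can be computed from a rasterised map with a flood fill. The grid below is synthetic; real analyses use dedicated tools such as FRAGSTATS.

```python
# Count patches (4-connected regions of one cover type) in a binary
# land-cover raster via iterative flood fill.

def count_patches(grid, cover):
    rows, cols = len(grid), len(grid[0])
    seen = set()
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == cover and (r, c) not in seen:
                patches += 1
                stack = [(r, c)]            # flood-fill one patch
                while stack:
                    i, j = stack.pop()
                    if (i, j) in seen:
                        continue
                    seen.add((i, j))
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < rows and 0 <= nj < cols \
                                and grid[ni][nj] == cover and (ni, nj) not in seen:
                            stack.append((ni, nj))
    return patches

grid = [[1, 1, 0],
        [0, 0, 0],
        [0, 1, 1]]
print(count_patches(grid, 1))  # → 2
```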

  1. Reporting quality of statistical methods in surgical observational studies: protocol for systematic review.

    PubMed

    Wu, Robert; Glen, Peter; Ramsay, Tim; Martel, Guillaume

    2014-06-28

    Observational studies dominate the surgical literature. Statistical adjustment is an important strategy to account for confounders in observational studies. Research has shown that published articles are often poor in statistical quality, which may jeopardize their conclusions. The Statistical Analyses and Methods in the Published Literature (SAMPL) guidelines have been published to help establish standards for statistical reporting. This study will seek to determine whether the quality of statistical adjustment and the reporting of these methods are adequate in surgical observational studies. We hypothesize that incomplete reporting will be found in all surgical observational studies, and that the quality and reporting of these methods will be of lower quality in surgical journals when compared with medical journals. Finally, this work will seek to identify predictors of high-quality reporting. This work will examine the top five general surgical and medical journals, based on a 5-year impact factor (2007-2012). All observational studies investigating an intervention related to an essential component area of general surgery (defined by the American Board of Surgery), with an exposure, outcome, and comparator, will be included in this systematic review. Essential elements related to statistical reporting and quality were extracted from the SAMPL guidelines and include domains such as intent of analysis, primary analysis, multiple comparisons, numbers and descriptive statistics, association and correlation analyses, linear regression, logistic regression, Cox proportional hazard analysis, analysis of variance, survival analysis, propensity analysis, and independent and correlated analyses. Each article will be scored as a proportion based on fulfilling criteria in relevant analyses used in the study. A logistic regression model will be built to identify variables associated with high-quality reporting.
A comparison will be made between the scores of surgical observational studies published in medical versus surgical journals. Secondary outcomes will pertain to individual domains of analysis. Sensitivity analyses will be conducted. This study will explore the reporting and quality of statistical analyses in surgical observational studies published in the most referenced surgical and medical journals in 2013 and examine whether variables (including the type of journal) can predict high-quality reporting.

  2. Determining Primary Care Physician Information Needs to Inform Ambulatory Visit Note Display

    PubMed Central

    Clarke, M.A.; Steege, L.M.; Moore, J.L.; Koopman, R.J.; Belden, J.L.; Kim, M.S.

    2014-01-01

    Summary Background With the increase in the adoption of electronic health records (EHR) across the US, primary care physicians are experiencing information overload. The purpose of this pilot study was to determine the information needs of primary care physicians (PCPs) as they review clinic visit notes, to inform EHR display. Method Data collection was conducted with 15 primary care physicians during semi-structured interviews, including a third-party observer to control bias. Physicians reviewed major sections of an artificial but typical acute and chronic care visit note to identify the note sections that were relevant to their information needs. Statistical methods used were McNemar-Mosteller's and Cochran's Q. Results Physicians identified History of Present Illness (HPI), Assessment, and Plan (A&P) as the most important sections of a visit note. In contrast, they largely judged the Review of Systems (ROS) to be superfluous. There was also a statistical difference in physicians' highlighting among all seven major note sections in acute (p = 0.00) and chronic (p = 0.00) care visit notes. Conclusion A&P and HPI sections were most frequently identified as important, which suggests that physicians may have to identify a few key sections out of a long, unnecessarily verbose visit note. ROS is viewed by doctors as mostly "not needed," but can contain relevant information. The ROS can contain information needed for patient care when other sections of the visit note, such as the HPI, lack the relevant information. Future studies should include producing a display that provides only relevant information to increase physician efficiency at the point of care. Also, research on moving A&P to the top of visit notes instead of at the bottom of the page is needed, since those are usually the first sections physicians refer to, and reviewing from top to bottom may cause cognitive load. PMID:24734131
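
    The McNemar test named in this record compares paired yes/no judgements, such as whether the same note section was highlighted under two conditions. A minimal exact (binomial) version can be sketched as follows; the discordant-pair counts below are invented for illustration.

```python
# Exact two-sided McNemar test from the two discordant-pair counts:
# b pairs changed one way, c pairs the other way. Under H0 each
# discordant pair is a fair coin flip, so the p-value is a binomial tail.
import math

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value."""
    n = b + c
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# 1 pair discordant one way, 8 the other way
print(round(mcnemar_exact(1, 8), 4))  # → 0.0391
```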

  3. Parallel object-oriented decision tree system

    DOEpatents

    Kamath, Chandrika [Dublin, CA]; Cantu-Paz, Erick [Oakland, CA]

    2006-02-28

    A data mining decision tree system that uncovers patterns, associations, anomalies, and other statistically significant structures in data by reading and displaying data files, extracting relevant features for each of the objects, and recognizing patterns among the objects based upon object features. The decision tree reads the data, sorts the data if necessary, determines the best manner to split the data into subsets according to some criterion, and splits the data.
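
    The "best split according to some criterion" step described in this record can be sketched for one numeric feature. The Gini criterion and the toy data are our illustrative choices, not taken from the patent.

```python
# Find the threshold on one feature that minimises weighted Gini
# impurity of the two resulting subsets (binary labels).

def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)  # binary-class Gini impurity

def best_split(values, labels):
    """Return the threshold with the lowest weighted child impurity."""
    order = sorted(zip(values, labels))
    best = (float("inf"), None)
    for i in range(1, len(order)):
        thr = (order[i - 1][0] + order[i][0]) / 2   # midpoint candidate
        left = [l for v, l in order[:i]]
        right = [l for v, l in order[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(order)
        best = min(best, (score, thr))
    return best[1]

values = [1.0, 2.0, 8.0, 9.0]
labels = [0, 0, 1, 1]
print(best_split(values, labels))  # → 5.0  (perfectly separates the classes)
```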

  4. Structural damage detection based on stochastic subspace identification and statistical pattern recognition: II. Experimental validation under varying temperature

    NASA Astrophysics Data System (ADS)

    Lin, Y. Q.; Ren, W. X.; Fang, S. E.

    2011-11-01

    Although most vibration-based damage detection methods can acquire satisfactory verification on analytical or numerical structures, most of them may encounter problems when applied to real-world structures under varying environments. The damage detection methods that directly extract damage features from the periodically sampled dynamic time history response measurements are desirable but relevant research and field application verification are still lacking. In this second part of a two-part paper, the robustness and performance of the statistics-based damage index using the forward innovation model by stochastic subspace identification of a vibrating structure proposed in the first part have been investigated against two prestressed reinforced concrete (RC) beams tested in the laboratory and a full-scale RC arch bridge tested in the field under varying environments. Experimental verification is focused on temperature effects. It is demonstrated that the proposed statistics-based damage index is insensitive to temperature variations but sensitive to the structural deterioration or state alteration. This makes it possible to detect the structural damage for the real-scale structures experiencing ambient excitations and varying environmental conditions.

  5. Weighted analysis of composite endpoints with simultaneous inference for flexible weight constraints.

    PubMed

    Duc, Anh Nguyen; Wolbers, Marcel

    2017-02-10

    Composite endpoints are widely used as primary endpoints of randomized controlled trials across clinical disciplines. A common critique of the conventional analysis of composite endpoints is that all disease events are weighted equally, whereas their clinical relevance may differ substantially. We address this by introducing a framework for the weighted analysis of composite endpoints and interpretable test statistics, which are applicable to both binary and time-to-event data. To cope with the difficulty of selecting an exact set of weights, we propose a method for constructing simultaneous confidence intervals and tests that asymptotically preserve the family-wise type I error in the strong sense across families of weights satisfying flexible inequality or order constraints, based on the theory of chi-bar-squared distributions. We show that the method achieves the nominal simultaneous coverage rate with substantial efficiency gains over Scheffé's procedure in a simulation study and apply it to trials in cardiovascular disease and enteric fever. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

  6. Defining functioning levels in patients with schizophrenia: A combination of a novel clustering method and brain SPECT analysis.

    PubMed

    Faget-Agius, Catherine; Vincenti, Aurélie; Guedj, Eric; Michel, Pierre; Richieri, Raphaëlle; Alessandrini, Marine; Auquier, Pascal; Lançon, Christophe; Boyer, Laurent

    2017-12-30

    This study aims to define functioning levels of patients with schizophrenia by using a method of interpretable clustering based on a specific functioning scale, the Functional Remission Of General Schizophrenia (FROGS) scale, and to test their validity against clinical and neuroimaging characterization. In this observational study, patients with schizophrenia were classified using a hierarchical top-down method called clustering using unsupervised binary trees (CUBT). Socio-demographic, clinical, and neuroimaging SPECT perfusion data were compared between the different clusters to ensure their clinical relevance. A total of 242 patients were analyzed. A four-group functioning level structure was identified: 54 patients were classified as "minimal", 81 as "low", 64 as "moderate", and 43 as "high". The clustering shows satisfactory statistical properties, including reproducibility and discriminative ability. The four clusters consistently differentiate patients: "high" functioning level patients reported significantly the lowest scores on the PANSS and the CDSS, and the highest scores on the GAF, the MARS and the S-QoL 18. Functioning levels were significantly associated with cerebral perfusion of two relevant areas: the left inferior parietal cortex and the anterior cingulate. Our study provides relevant functioning levels in schizophrenia, and may enhance the use of functioning scales. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Back to the basics: Identifying and addressing underlying challenges in achieving high quality and relevant health statistics for indigenous populations in Canada

    PubMed Central

    Smylie, Janet; Firestone, Michelle

    2015-01-01

    Canada is known internationally for excellence in both the quality and public policy relevance of its health and social statistics. There is a double standard, however, with respect to the relevance and quality of statistics for Indigenous populations in Canada. Indigenous-specific health and social statistics gathering is informed by unique ethical, rights-based, policy and practice imperatives regarding the need for Indigenous participation and leadership in Indigenous data processes throughout the spectrum of indicator development, data collection, management, analysis and use. We demonstrate how current Indigenous data quality challenges, including misclassification errors and non-response bias, systematically contribute to a significant underestimate of inequities in health determinants, health status, and health care access between Indigenous and non-Indigenous people in Canada. The major quality challenge underlying these errors and biases is the lack of Indigenous-specific identifiers that are consistent and relevant in major health and social data sources. The recent removal of an Indigenous identity question from the Canadian census has resulted in further deterioration of an already suboptimal system. A revision of core health data sources to include relevant, consistent, and inclusive Indigenous self-identification is urgently required. These changes need to be carried out in partnership with Indigenous peoples and their representative and governing organizations. PMID:26793283

  8. Back to basics: an introduction to statistics.

    PubMed

    Halfens, R J G; Meijers, J M M

    2013-05-01

    In the second in the series, Professor Ruud Halfens and Dr Judith Meijers give an overview of statistics, both descriptive and inferential. They describe the first principles of statistics, including some relevant inferential tests.

  9. Statistical methods for meta-analyses including information from studies without any events-add nothing to nothing and succeed nevertheless.

    PubMed

    Kuss, O

    2015-03-30

    Meta-analyses with rare events, especially those that include studies with no events in one ('single-zero') or even both ('double-zero') treatment arms, are still a statistical challenge. In the case of double-zero studies, researchers generally delete these studies or use continuity corrections to avoid them. A number of arguments against both options have been given, and statistical methods that use the information from double-zero studies without continuity corrections have been proposed. In this paper, we collect them and compare them by simulation. This simulation study tries to mirror real-life situations as completely as possible by deriving true underlying parameters from empirical data on actually performed meta-analyses. It is shown that for each of the commonly encountered effect estimators, valid statistical methods are available that use the information from double-zero studies without continuity corrections. Interestingly, all of them are truly random effects models, so even the current standard method for very sparse data recommended by the Cochrane Collaboration, the Yusuf-Peto odds ratio, can be improved on. For actual analysis, we recommend using beta-binomial regression methods to arrive at summary estimates for the odds ratio, the relative risk, or the risk difference. Methods that ignore information from double-zero studies or use continuity corrections should no longer be used. We illustrate the situation with an example where the original analysis ignores 35 double-zero studies and a superior analysis discovers a clinically relevant advantage of off-pump surgery in coronary artery bypass grafting. Copyright © 2014 John Wiley & Sons, Ltd.
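
    The beta-binomial building block behind the recommended regression approach can be sketched directly: a study arm with x events out of n patients, where the event probability is itself Beta-distributed, has a positive likelihood even when x = 0, so double-zero studies carry information without any continuity correction. The parameter values below are illustrative only.

```python
# Beta-binomial pmf via log-gamma functions:
# P(X = x) = C(n, x) * B(x + a, n - x + b) / B(a, b)
import math

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def betabinom_pmf(x, n, a, b):
    return math.exp(math.log(math.comb(n, x))
                    + log_beta(x + a, n - x + b) - log_beta(a, b))

# A "double-zero" arm (0 events in 50 patients) still has likelihood > 0,
# so it contributes to the fit rather than being deleted or corrected.
p0 = betabinom_pmf(0, 50, 0.5, 10.0)
print(p0 > 0)  # → True
```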

  10. Regional projection of climate impact indices over the Mediterranean region

    NASA Astrophysics Data System (ADS)

    Casanueva, Ana; Frías, M. Dolores; Herrera, Sixto; Bedia, Joaquín; San Martín, Daniel; Gutiérrez, José Manuel; Zaninovic, Ksenija

    2014-05-01

    Climate Impact Indices (CIIs) are being increasingly used in different socioeconomic sectors to transfer information about climate change impacts and risks to stakeholders. CIIs are typically based on different weather variables such as temperature, wind speed, precipitation or humidity and comprise, in a single index, the relevant meteorological information for the particular impact sector (in this study, wildfires and tourism). This dependence on several climate variables poses important limitations on the application of statistical downscaling techniques, since physical consistency among variables is required in most cases to obtain reliable local projections. The present study assesses the suitability of the "direct" downscaling approach, in which the downscaling method is applied directly to the CII. In particular, for illustrative purposes, we consider two popular indices used in the wildfire and tourism sectors, the Fire Weather Index (FWI) and the Physiological Equivalent Temperature (PET), respectively. As an example, two case studies are analysed over two representative Mediterranean regions of interest for the EU CLIM-RUN project: continental Spain for the FWI and Croatia for the PET. Results obtained with this "direct" downscaling approach are similar to those found by applying statistical downscaling to the individual meteorological drivers prior to the index calculation ("component" downscaling); thus, a wider range of statistical downscaling methods could be used. As an illustration, future changes in both indices are projected by applying two direct statistical downscaling methods, analogs and linear regression, to the ECHAM5 model. Larger differences were found between the two direct statistical downscaling approaches than between the direct and the component approaches with a single downscaling method.
While these examples focus on particular indices and Mediterranean regions of interest for CLIM-RUN stakeholders, the same study could be extended to other indices and regions.
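
    The analog method mentioned in this record can be sketched in its simplest form: for a model day's large-scale predictors, find the most similar observed day in a historical catalog and reuse its locally observed index value. All predictor and index values below are made up for illustration.

```python
# One-nearest-neighbour analog downscaling: Euclidean distance in
# predictor space selects the historical analog day.
import math

def analog(predictors, catalog):
    """catalog: list of (predictor_vector, observed_local_value)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(catalog, key=lambda entry: dist(entry[0], predictors))[1]

catalog = [((290.0, 0.2), 12.5),   # (temperature K, precip mm) -> local index
           ((300.0, 0.0), 45.0),
           ((295.0, 1.5), 8.0)]
print(analog((299.0, 0.1), catalog))  # → 45.0  (closest catalog day)
```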

  11. Combining Shapley value and statistics to the analysis of gene expression data in children exposed to air pollution

    PubMed Central

    Moretti, Stefano; van Leeuwen, Danitsja; Gmuender, Hans; Bonassi, Stefano; van Delft, Joost; Kleinjans, Jos; Patrone, Fioravante; Merlo, Domenico Franco

    2008-01-01

    Background In gene expression analysis, statistical tests for differential gene expression provide lists of candidate genes having, individually, a sufficiently low p-value. However, the interpretation of each single p-value within complex systems involving several interacting genes is problematic. In parallel, over the last sixty years, game theory has been applied to political and social problems to assess the power of interacting agents in forcing a decision and, more recently, to represent the relevance of genes in response to certain conditions. Results In this paper we introduce a bootstrap procedure to test the null hypothesis that each gene has the same relevance between two conditions, where the relevance is represented by the Shapley value of a particular coalitional game defined on a microarray data-set. This method, called Comparative Analysis of Shapley value (CASh for short), is applied to data concerning gene expression in children differentially exposed to air pollution. The results provided by CASh are compared with the results from a parametric statistical test for differential gene expression. Both the lists of genes provided by CASh and by the t-test are informative enough to discriminate exposed subjects on the basis of their gene expression profiles. While many genes are selected in common by CASh and the parametric test, the biological interpretation of the differences between these two selections is more interesting, suggesting a different interpretation of the main biological pathways in gene expression regulation for exposed individuals. A simulation study suggests that CASh offers more power than the t-test for the detection of differential gene expression variability. Conclusion CASh is successfully applied to gene expression analysis of a data-set where the joint expression behavior of genes may be critical to characterize the expression response to air pollution.
We demonstrate a synergistic effect between coalitional games and statistics that resulted in a selection of genes with a potential impact in the regulation of complex pathways. PMID:18764936
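
    The Shapley value underlying CASh is the average marginal contribution of a player (here, a gene) over all orderings of the players. The characteristic function below is a toy game, not the microarray game of the paper.

```python
# Shapley value by enumerating all player orderings; feasible only for
# small games, but it makes the definition concrete.
from itertools import permutations

def shapley(players, v):
    phi = {p: 0.0 for p in players}
    perms = list(permutations(players))
    for order in perms:
        coalition = frozenset()
        for p in order:
            phi[p] += v(coalition | {p}) - v(coalition)  # marginal contribution
            coalition = coalition | {p}
    return {p: phi[p] / len(perms) for p in players}

# Toy game: a coalition has value 1 iff it contains gene 'a'.
v = lambda s: 1.0 if 'a' in s else 0.0
print(shapley(['a', 'b', 'c'], v))  # 'a' gets all the credit
```

    By construction the values are efficient: they sum to the value of the grand coalition.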

  12. Assessing conservation relevance of organism-environment relations using predicted changes in response variables

    USGS Publications Warehouse

    Gutzwiller, Kevin J.; Barrow, Wylie C.; White, Joseph D.; Johnson-Randall, Lori; Cade, Brian S.; Zygo, Lisa M.

    2010-01-01

    1. Organism–environment models are used widely in conservation. The degree to which they are useful for informing conservation decisions – the conservation relevance of these relations – is important because lack of relevance may lead to misapplication of scarce conservation resources or failure to resolve important conservation dilemmas. Even when models perform well based on model fit and predictive ability, conservation relevance of associations may not be clear without also knowing the magnitude and variability of predicted changes in response variables. 2. We introduce a method for evaluating the conservation relevance of organism–environment relations that employs confidence intervals for predicted changes in response variables. The confidence intervals are compared to a preselected magnitude of change that marks a threshold (trigger) for conservation action. To demonstrate the approach, we used a case study from the Chihuahuan Desert involving relations between avian richness and broad-scale patterns of shrubland. We considered relations for three winters and two spatial extents (1- and 2-km-radius areas) and compared predicted changes in richness to three thresholds (10%, 20% and 30% change). For each threshold, we examined 48 relations. 3. The method identified seven, four and zero conservation-relevant changes in mean richness for the 10%, 20% and 30% thresholds respectively. These changes were associated with major (20%) changes in shrubland cover, mean patch size, the coefficient of variation for patch size, or edge density but not with major changes in shrubland patch density. The relative rarity of conservation-relevant changes indicated that, overall, the relations had little practical value for informing conservation decisions about avian richness. 4. The approach we illustrate is appropriate for various response and predictor variables measured at any temporal or spatial scale. 
The method is broadly applicable across ecological environments, conservation objectives, types of statistical predictive models and levels of biological organization. By focusing on magnitudes of change that have practical significance, and by using the span of confidence intervals to incorporate uncertainty of predicted changes, the method can be used to help improve the effectiveness of conservation efforts.

  13. Interchangeability of Procalcitonin Measurements Using the Point-of-Care Testing i-CHROMA™ Reader and the Automated Liaison XL.

    PubMed

    Stenner, Elisabetta; Barbati, Giulia; West, Nicole; Del Ben, Fabia; Martin, Francesca; Ruscio, Maurizio

    2018-06-01

    Our aim was to verify whether procalcitonin (PCT) measurements using the new point-of-care testing i-CHROMA™ reader are interchangeable with those of the Liaison XL. One hundred seventeen serum samples were processed sequentially on a Liaison XL and an i-CHROMA™. Statistical analysis was done using Passing-Bablok regression, Bland-Altman analysis, and Cohen's kappa statistic. Proportional and constant differences were observed between the i-CHROMA™ and the Liaison XL. The 95% CI of the mean bias% was very large, exceeding the maximum allowable TE% and the clinical reference change value. However, the concordance between the methods at the clinically relevant cutoffs was strong, with the exception of the 0.25 ng/mL cutoff, where it was moderate. Our data suggest that the i-CHROMA™ is not interchangeable with the Liaison XL. Nevertheless, the strong concordance at the clinically relevant cutoffs allows us to consider the i-CHROMA™ a suitable alternative to the Liaison XL to support clinicians' decision-making, although the moderate agreement at the 0.25 ng/mL cutoff recommends caution in interpreting data around this cutoff.
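
    The Bland-Altman quantities used in this comparison, the mean bias and the 95% limits of agreement between paired measurements, can be sketched as follows. The paired PCT values below are invented for illustration.

```python
# Bland-Altman summary statistics for two paired measurement methods:
# bias = mean of pairwise differences; limits of agreement = bias ± 1.96 SD.
from statistics import mean, stdev

def bland_altman(a, b):
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    s = stdev(diffs)
    return bias, bias - 1.96 * s, bias + 1.96 * s

poc  = [0.30, 0.55, 1.10, 2.40, 5.10]   # point-of-care reader (ng/mL)
auto = [0.25, 0.50, 1.00, 2.60, 4.80]   # automated analyser (ng/mL)
bias, lo, hi = bland_altman(poc, auto)
print(round(bias, 3))  # → 0.06
```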

  14. Probabilistic biological network alignment.

    PubMed

    Todor, Andrei; Dobra, Alin; Kahveci, Tamer

    2013-01-01

    Interactions between molecules are probabilistic events. An interaction may or may not happen with some probability, depending on a variety of factors such as the size, abundance, or proximity of the interacting molecules. In this paper, we consider the problem of aligning two biological networks. Unlike existing methods, we allow one of the two networks to contain probabilistic interactions. Allowing interaction probabilities makes the alignment more biologically relevant, at the expense of explosive growth in the number of alternative topologies that may arise from the different subsets of interactions that take place. We develop a novel method that efficiently and precisely characterizes this massive search space. We represent the topological similarity between pairs of aligned molecules (i.e., proteins) with the help of random variables and compute their expected values. We validate our method by showing that, without sacrificing running-time performance, it can produce novel alignments. Our results also demonstrate that our method identifies biologically meaningful mappings under a comprehensive set of criteria used in the literature, as well as under the statistical coherence measure that we developed to analyze the statistical significance of the similarity of the functions of the aligned protein pairs.

  15. Event coincidence analysis for quantifying statistical interrelationships between event time series. On the role of flood events as triggers of epidemic outbreaks

    NASA Astrophysics Data System (ADS)

    Donges, J. F.; Schleussner, C.-F.; Siegmund, J. F.; Donner, R. V.

    2016-05-01

    Studying event time series is a powerful approach for analyzing the dynamics of complex dynamical systems in many fields of science. In this paper, we describe the method of event coincidence analysis, which provides a framework for quantifying the strength, directionality and time lag of statistical interrelationships between event series. Event coincidence analysis makes it possible to formulate and test null hypotheses on the origin of the observed interrelationships, including tests based on Poisson processes or, more generally, stochastic point processes with a prescribed inter-event time distribution and other higher-order properties. Applying the framework to country-level observational data yields evidence that flood events have acted as triggers of epidemic outbreaks globally since the 1950s. Facing projected future changes in the statistics of climatic extreme events, statistical techniques such as event coincidence analysis will be relevant for investigating the impacts of anthropogenic climate change on human societies and ecosystems worldwide.
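
    The basic coincidence rate of event coincidence analysis can be sketched directly: the fraction of events in one series that are preceded by at least one event in the other series within a tolerance window. The event years below are invented to mimic the flood/outbreak setting, not taken from the paper's data.

```python
# Precursor coincidence rate: fraction of B-events with an A-event
# at most delta_t time units before them.

def precursor_coincidence_rate(a_times, b_times, delta_t):
    hits = sum(1 for tb in b_times
               if any(0 <= tb - ta <= delta_t for ta in a_times))
    return hits / len(b_times)

floods    = [1950, 1962, 1970, 1988]   # series A (candidate triggers)
outbreaks = [1951, 1963, 1980, 1989]   # series B
print(precursor_coincidence_rate(floods, outbreaks, delta_t=2))  # → 0.75
```

    Significance testing then compares this observed rate against its distribution under a null model such as independent Poisson event series.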

  16. An ANOVA approach for statistical comparisons of brain networks.

    PubMed

    Fraiman, Daniel; Fraiman, Ricardo

    2018-03-16

    The study of brain networks has developed extensively over the last couple of decades. By contrast, techniques for the statistical analysis of these networks are less developed. In this paper, we focus on the statistical comparison of brain networks in a nonparametric framework and discuss the associated detection and identification problems. We tested network differences between groups with an analysis of variance (ANOVA) test we developed specifically for networks. We also propose and analyse the behaviour of a new statistical procedure designed to identify differing subnetworks. As an example, we show the application of this tool to resting-state fMRI data obtained from the Human Connectome Project. We identify, among other variables, that the amount of sleep in the days before the scan is a relevant variable that must be controlled for. Finally, we discuss the potential bias in neuroimaging findings that is generated by some behavioural and brain structure variables. Our method can also be applied to other kinds of networks, such as protein interaction networks, gene networks or social networks.

  17. Enhancing efficiency and quality of statistical estimation of immunogenicity assay cut points through standardization and automation.

    PubMed

    Su, Cheng; Zhou, Lei; Hu, Zheng; Weng, Winnie; Subramani, Jayanthi; Tadkod, Vineet; Hamilton, Kortney; Bautista, Ami; Wu, Yu; Chirmule, Narendra; Zhong, Zhandong Don

    2015-10-01

    Biotherapeutics can elicit immune responses, which can alter the exposure, safety, and efficacy of the therapeutics. A well-designed and robust bioanalytical method is critical for the detection and characterization of relevant anti-drug antibodies (ADA) and for the success of an immunogenicity study. As a fundamental criterion in immunogenicity testing, assay cut points need to be statistically established with a risk-based approach to reduce subjectivity. This manuscript describes the development of a validated, web-based, multi-tier customized assay statistical tool (CAST) for assessing cut points of ADA assays. The tool provides an intuitive web interface that allows users to import experimental data generated from a standardized experimental design, select the assay factors, run the standardized analysis algorithms, and generate tables, figures, and listings (TFL). It allows bioanalytical scientists to perform complex statistical analysis at the click of a button to produce reliable assay parameters in support of immunogenicity studies. Copyright © 2015 Elsevier B.V. All rights reserved.

  18. Monte Carlo errors with less errors

    NASA Astrophysics Data System (ADS)

    Wolff, Ulli; Alpha Collaboration

    2004-01-01

    We explain in detail how to estimate mean values and assess statistical errors for arbitrary functions of elementary observables in Monte Carlo simulations. The method is to estimate and sum the relevant autocorrelation functions, which is argued to produce more reliable error estimates than binning techniques and hence to help toward a better exploitation of expensive simulations. An effective integrated autocorrelation time is computed which is suitable for benchmarking the efficiency of simulation algorithms with regard to specific observables of interest. A Matlab code that implements the method is offered for download. It can also combine independent runs (replica), allowing one to judge their consistency.
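
    The central quantity of this record, the integrated autocorrelation time of a Monte Carlo time series, can be sketched with a fixed summation window; the full Gamma-method of the paper chooses the window automatically and propagates errors, which is omitted here. The AR(1) test chain is our own illustration.

```python
# Integrated autocorrelation time tau_int = 1/2 + sum_{t=1}^{W} rho(t),
# with rho(t) the normalised autocorrelation function of the series.
import random

def tau_int(x, W):
    n = len(x)
    m = sum(x) / n
    c0 = sum((v - m) ** 2 for v in x) / n
    def rho(t):
        return sum((x[i] - m) * (x[i + t] - m) for i in range(n - t)) / ((n - t) * c0)
    return 0.5 + sum(rho(t) for t in range(1, W + 1))

# AR(1) chain with coupling 0.8; its theoretical tau_int is 4.5.
random.seed(0)
x, v = [], 0.0
for _ in range(20000):
    v = 0.8 * v + random.gauss(0.0, 1.0)
    x.append(v)
print(tau_int(x, 50) > 1.0)  # → True  (strongly autocorrelated chain)
```

    The effective variance of the sample mean is then inflated by a factor 2·tau_int relative to the naive independent-sample estimate.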

  19. Development of a method to extend by boron neutron capture process the therapeutic possibilities of a liver autograft

    NASA Astrophysics Data System (ADS)

    Pinelli, Tazio; Altieri, Saverio; Fossati, F.; Zonta, Aris; Prati, U.; Roveda, L.; Nano, Rosanna

    1997-02-01

    We present results on the surgical technique, neutron field and irradiation facility concerning an original treatment of diffuse liver metastases. Our method plans to irradiate the isolated organ in a thermal neutron field soon after it has been explanted and boron-enriched, and before it is grafted back into the same donor. In particular, the crucial point of boron uptake was investigated in a rat model with a relevant number of procedures. We give for the first time statistically significant results on the selective boron absorption by tumor tissues.

  20. Confidence interval or p-value?: part 4 of a series on evaluation of scientific publications.

    PubMed

    du Prel, Jean-Baptist; Hommel, Gerhard; Röhrig, Bernd; Blettner, Maria

    2009-05-01

    An understanding of p-values and confidence intervals is necessary for the evaluation of scientific articles. This article will inform the reader of the meaning and interpretation of these two statistical concepts. The uses of these two statistical concepts and the differences between them are discussed on the basis of a selective literature search concerning the methods employed in scientific articles. P-values in scientific studies are used to determine whether a null hypothesis formulated before the performance of the study is to be accepted or rejected. In exploratory studies, p-values enable the recognition of any statistically noteworthy findings. Confidence intervals provide information about a range in which the true value lies with a certain degree of probability, as well as about the direction and strength of the demonstrated effect. This enables conclusions to be drawn about the statistical plausibility and clinical relevance of the study findings. It is often useful for both statistical measures to be reported in scientific articles, because they provide complementary types of information.
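
    The complementarity described above can be made concrete: the p-value answers whether an effect is statistically noteworthy, while the confidence interval adds direction and magnitude, and the two agree on the reject/accept decision. A minimal sketch using a two-sample z-test with the normal approximation (the data are invented for illustration):

```python
import math
import statistics

def compare_means(a, b):
    """Two-sample z-test (normal approximation) plus a 95% confidence
    interval for the difference in means."""
    na, nb = len(a), len(b)
    diff = statistics.fmean(a) - statistics.fmean(b)
    se = math.sqrt(statistics.variance(a) / na + statistics.variance(b) / nb)
    z = diff / se
    p = math.erfc(abs(z) / math.sqrt(2))          # two-sided p-value
    ci = (diff - 1.96 * se, diff + 1.96 * se)
    return diff, ci, p

a = [5.1, 4.9, 5.3, 5.0, 5.2, 5.4, 4.8, 5.1]
b = [4.6, 4.7, 4.5, 4.9, 4.4, 4.8, 4.6, 4.5]
diff, ci, p = compare_means(a, b)
print(diff, ci, p)  # CI excludes 0 and p < 0.05: the two conclusions agree
```

    The p-value alone would only say "significant"; the interval additionally reports that the difference is positive and how large it plausibly is.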

  1. Mastication Evaluation With Unsupervised Learning: Using an Inertial Sensor-Based System.

    PubMed

    Lucena, Caroline Vieira; Lacerda, Marcelo; Caldas, Rafael; De Lima Neto, Fernando Buarque; Rativa, Diego

    2018-01-01

    There is a direct relationship between the prevalence of musculoskeletal disorders of the temporomandibular joint and orofacial disorders. A well-elaborated analysis of jaw movements provides relevant information for healthcare professionals to conclude their diagnosis. Different approaches have been explored to track jaw movements and make mastication analysis less subjective; even so, existing methods remain highly subjective, and the quality of the assessments depends heavily on the experience of the health professional. In this paper, an accurate and non-invasive method based on a commercial low-cost inertial sensor (MPU6050) to measure jaw movements is proposed. The jaw-movement feature values are compared to those obtained with clinical analysis, showing no statistically significant difference between the two methods. Moreover, we propose using unsupervised learning approaches to cluster mastication patterns of healthy subjects and simulated patients with facial trauma. Two techniques were used in this paper to instantiate the method: Kohonen's Self-Organizing Maps and K-Means Clustering. Both algorithms perform well in processing jaw-movement data, showing encouraging results and the potential to provide a full assessment of masticatory function. The proposed method can be applied in real time, providing relevant dynamic information for healthcare professionals.
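
    For readers unfamiliar with the clustering step, here is a minimal k-means sketch on hypothetical two-dimensional jaw-movement features; the feature names and values are invented stand-ins, and this is plain k-means, not the study's pipeline or its Self-Organizing Map variant.

```python
def kmeans(points, centers, iters=20):
    """Plain k-means on 2-D feature vectors: assign each point to the
    nearest center, then move each center to the mean of its group."""
    k = len(centers)
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: (p[0] - centers[c][0]) ** 2
                                            + (p[1] - centers[c][1]) ** 2)
            groups[j].append(p)
        centers = [(sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g))
                   if g else centers[j] for j, g in enumerate(groups)]
    return centers, groups

# invented features: (chew-cycle duration in s, jaw-opening amplitude in mm)
healthy = [(0.7 + 0.05 * (i % 4), 18 + i % 3) for i in range(10)]
trauma  = [(1.2 + 0.05 * (i % 4), 9 + i % 3) for i in range(10)]
pts = healthy + trauma
centers, groups = kmeans(pts, [pts[0], pts[-1]])  # deterministic init for the sketch
print(sorted(len(g) for g in groups))  # the two patterns separate cleanly
```

    With well-separated feature clusters like these, the assignment stabilizes after one pass; real sensor data would need feature scaling and a better initialization (e.g. k-means++).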

  2. Comparison of the Cellient(™) automated cell block system and agar cell block method.

    PubMed

    Kruger, A M; Stevens, M W; Kerley, K J; Carter, C D

    2014-12-01

    To compare the Cellient(TM) automated cell block system with the agar cell block method in terms of quantity and quality of diagnostic material and morphological, histochemical and immunocytochemical features. Cell blocks were prepared from 100 effusion samples using the agar method and Cellient system, and routinely sectioned and stained for haematoxylin and eosin and periodic acid-Schiff with diastase (PASD). A preliminary immunocytochemical study was performed on selected cases (27/100 cases). Sections were evaluated using a three-point grading system to compare a set of morphological parameters. Statistical analysis was performed using Fisher's exact test. Parameters assessing cellularity, presence of single cells and definition of nuclear membrane, nucleoli, chromatin and cytoplasm showed a statistically significant improvement on Cellient cell blocks compared with agar cell blocks (P < 0.05). No significant difference was seen for definition of cell groups, PASD staining or the intensity or clarity of immunocytochemical staining. A discrepant immunocytochemistry (ICC) result was seen in 21% (13/63) of immunostains. The Cellient technique is comparable with the agar method, with statistically significant results achieved for important morphological features. It demonstrates potential as an alternative cell block preparation method which is relevant for the rapid processing of fine needle aspiration samples, malignant effusions and low-cellularity specimens, where optimal cell morphology and architecture are essential. Further investigation is required to optimize immunocytochemical staining using the Cellient method. © 2014 John Wiley & Sons Ltd.
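
    Fisher's exact test, used above, computes exact hypergeometric probabilities of 2x2 tables, which suits the small per-category counts a graded comparison like this yields. A self-contained sketch (the illustrative table is hypothetical, not data from this study):

```python
from math import comb

def fisher_exact(table):
    """Two-sided Fisher's exact test for a 2x2 table [[a, b], [c, d]]:
    sum the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row1, col1 = a + b, a + c

    def p_hyper(x):  # P(x successes in row 1 | fixed margins)
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = p_hyper(a)
    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    return sum(p for p in (p_hyper(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# invented example: a feature graded "well defined" in 9/10 Cellient
# blocks versus 2/10 agar blocks
print(fisher_exact([[9, 1], [2, 8]]))  # < 0.05, so the difference is significant
```

    Unlike a chi-squared test, no large-sample approximation is involved, which is why the test is preferred for sparse grading tables.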

  3. An exploration into study design for biomarker identification: issues and recommendations.

    PubMed

    Hall, Jacqueline A; Brown, Robert; Paul, Jim

    2007-01-01

    Genomic profiling produces large amounts of data and a challenge remains in identifying relevant biological processes associated with clinical outcome. Many candidate biomarkers have been identified but few have been successfully validated and make an impact clinically. This review focuses on some of the study design issues encountered in data mining for biomarker identification with illustrations of how study design may influence the final results. This includes issues of clinical endpoint use and selection, power, statistical, biological and clinical significance. We give particular attention to study design for the application of supervised clustering methods for identification of gene networks associated with clinical outcome and provide recommendations for future work to increase the success of identification of clinically relevant biomarkers.

  4. A metric to search for relevant words

    NASA Astrophysics Data System (ADS)

    Zhou, Hongding; Slater, Gary W.

    2003-11-01

    We propose a new metric to evaluate and rank the relevance of words in a text. The method uses the density fluctuations of a word to compute an index that measures its degree of clustering. Highly significant words tend to form clusters, while common words are essentially uniformly spread in a text. If a word is not rare, the metric is stable when we move any individual occurrence of this word in the text. Furthermore, we prove that the metric always increases when words are moved to form larger clusters, or when several independent documents are merged. Using the Holy Bible as an example, we show that our approach reduces the significance of common words when compared to a recently proposed statistical metric.
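
    The intuition that significant words cluster while common words spread uniformly can be captured by the variability of the gaps between successive occurrences of a word. The sketch below is a simplified score in that spirit, not the authors' exact metric, and the toy text is invented:

```python
def clustering_score(words, target):
    """Ratio of the standard deviation of the gaps between successive
    occurrences of `target` to the mean gap: near or below 1 for a word
    spread uniformly, well above 1 for a clustered word.
    (A simplified score in the spirit of the paper, not the authors' metric.)"""
    pos = [i for i, w in enumerate(words) if w == target]
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    mean = sum(gaps) / len(gaps)
    var = sum((g - mean) ** 2 for g in gaps) / len(gaps)
    return var ** 0.5 / mean

# invented toy text: 'storm' occurs in two bursts, 'the' is spread evenly
text = (["the", "a", "x"] * 40 + ["storm"] * 6
        + ["the", "a", "x"] * 40 + ["storm"] * 6
        + ["the", "a", "x"] * 40)
print(clustering_score(text, "storm"))  # clustered: well above 1
print(clustering_score(text, "the"))    # uniform: well below 1
```

    A Poisson-distributed (randomly placed) word would score near 1, which is the baseline against which relevance is judged.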

  5. Gender-, age-, and race/ethnicity-based differential item functioning analysis of the movement disorder society-sponsored revision of the Unified Parkinson's disease rating scale.

    PubMed

    Goetz, Christopher G; Liu, Yuanyuan; Stebbins, Glenn T; Wang, Lu; Tilley, Barbara C; Teresi, Jeanne A; Merkitch, Douglas; Luo, Sheng

    2016-12-01

    Assess MDS-UPDRS items for gender-, age-, and race/ethnicity-based differential item functioning. Assessing differential item functioning is a core rating scale validation step. For the MDS-UPDRS, differential item functioning occurs if item-score probabilities among people with similar levels of parkinsonism differ according to selected covariates (gender, age, race/ethnicity). If the magnitude of differential item functioning is clinically relevant, item-score interpretation must consider influences by these covariates. Differential item functioning can be nonuniform (covariate variably influences an item-score across different levels of parkinsonism) or uniform (covariate influences an item-score consistently over all levels of parkinsonism). Using the MDS-UPDRS translation database of more than 5,000 PD patients from 14 languages, we tested gender-, age-, and race/ethnicity-based differential item functioning. To designate an item as having clinically relevant differential item functioning, we required statistical confirmation by 2 independent methods, along with a McFadden pseudo-R² magnitude statistic greater than "negligible." Most items showed no gender-, age- or race/ethnicity-based differential item functioning. When differential item functioning was identified, the magnitude statistic was always in the "negligible" range, and the scale-level impact was minimal. The absence of clinically relevant differential item functioning across all items and all parts of the MDS-UPDRS is strong evidence that the scale can be used confidently. As studies of Parkinson's disease increasingly involve multinational efforts and the MDS-UPDRS has several validated non-English translations, the findings support the scale's broad applicability in populations with varying gender, age, and race/ethnicity distributions. © 2016 International Parkinson and Movement Disorder Society.

  6. Clinical relevance of IL-6 gene polymorphism in severely injured patients

    PubMed Central

    Jeremić, Vasilije; Alempijević, Tamara; Mijatović, Srđan; Šijački, Ana; Dragašević, Sanja; Pavlović, Sonja; Miličić, Biljana; Krstić, Slobodan

    2014-01-01

    In polytrauma, injuries that could be treated surgically under regular circumstances become life-threatening because of the systemic inflammatory response. The inflammatory response involves a complex pattern of humoral and cellular responses, and the expression of related factors is thought to be governed by genetic variations. The aim of this paper is to examine the influence of the interleukin (IL) 6 single nucleotide polymorphisms (SNPs) -174C/G and -596G/A on treatment outcome in severely injured patients. Forty-seven severely injured patients were included in this study. Patients were assigned an Injury Severity Score. Blood samples were drawn within 24 h after admission (designated day 1) and on subsequent days (24, 48 and 72 hours and 7 days) of hospitalization. IL-6 levels were determined by ELISA. Polymorphisms were analyzed by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). Among subjects with different outcomes, no statistically relevant difference was found with regard to the IL-6 SNP-174G/C polymorphism. More than half of the subjects who died had the SNP-174G/C polymorphism, while it was represented in a slightly lower number of survivors. The incidence of subjects without polymorphism and of those with the heterozygous or homozygous IL-6 SNP-596G/A polymorphism did not differ significantly between survivors and those who died. The levels of IL-6 over the observation period did not present any statistically relevant difference between subjects without the IL-6 SNP-174 or IL-6 SNP-596 gene polymorphism and those who had either a heterozygous or a homozygous polymorphism. PMID:24856384

  7. Development of partial life-cycle experiments to assess the effects of endocrine disruptors on the freshwater gastropod Lymnaea stagnalis: a case-study with vinclozolin.

    PubMed

    Ducrot, Virginie; Teixeira-Alves, Mickaël; Lopes, Christelle; Delignette-Muller, Marie-Laure; Charles, Sandrine; Lagadic, Laurent

    2010-10-01

    Long-term effects of endocrine disruptors (EDs) on aquatic invertebrates remain difficult to assess, mainly due to the lack of appropriate sensitive toxicity test methods and relevant data analysis procedures. This study aimed at identifying windows of sensitivity to EDs along the life-cycle of the freshwater snail Lymnaea stagnalis, a candidate species for the development of forthcoming test guidelines. Juveniles, sub-adults, young adults and adults were exposed for 21 days to the fungicide vinclozolin (VZ). Survival, growth, onset of reproduction, fertility and fecundity were monitored weekly. Data were analyzed using standard statistical analysis procedures and mixed-effect models. No deleterious effect on survival and growth occurred in snails exposed to VZ at environmentally relevant concentrations. A significant impairment of the male function occurred in young adults, leading to infertility at concentrations exceeding 0.025 μg/L. Furthermore, fecundity was impaired in adults exposed to concentrations exceeding 25 μg/L. Biological responses depended on VZ concentration, exposure duration and on their interaction, leading to complex response patterns. The use of a standard statistical approach to analyze those data led to underestimation of VZ effects on reproduction, whereas effects could reliably be analyzed by mixed-effect models. L. stagnalis may be among the most sensitive invertebrate species to VZ, a 21-day reproduction test allowing the detection of deleterious effects at environmentally relevant concentrations of the fungicide. These results thus reinforce the relevance of L. stagnalis as a good candidate species for the development of guidelines devoted to the risk assessment of EDs.

  8. Cost-Effectiveness Analysis: a proposal of new reporting standards in statistical analysis

    PubMed Central

    Bang, Heejung; Zhao, Hongwei

    2014-01-01

    Cost-effectiveness analysis (CEA) is a method for evaluating the outcomes and costs of competing strategies designed to improve health, and has been applied to a variety of different scientific fields. Yet, there are inherent complexities in cost estimation and CEA from statistical perspectives (e.g., skewness, bi-dimensionality, and censoring). The incremental cost-effectiveness ratio that represents the additional cost per one unit of outcome gained by a new strategy has served as the most widely accepted methodology in the CEA. In this article, we call for expanded perspectives and reporting standards reflecting a more comprehensive analysis that can elucidate different aspects of available data. Specifically, we propose that mean and median-based incremental cost-effectiveness ratios and average cost-effectiveness ratios be reported together, along with relevant summary and inferential statistics as complementary measures for informed decision making. PMID:24605979
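
    The two ratios the authors propose reporting together are straightforward to compute. A sketch with invented cost and effectiveness figures (mean-based only; the article also advocates median-based versions and accompanying inferential statistics):

```python
def icer(cost_new, eff_new, cost_old, eff_old):
    """Incremental cost-effectiveness ratio: extra cost per additional
    unit of outcome gained by the new strategy over the old one."""
    return (cost_new - cost_old) / (eff_new - eff_old)

def acer(cost, eff):
    """Average cost-effectiveness ratio of a single strategy."""
    return cost / eff

# hypothetical means: new therapy $48,000 for 6.0 QALYs,
# standard therapy $30,000 for 5.2 QALYs
print(icer(48000, 6.0, 30000, 5.2))  # about 22,500 dollars per QALY gained
print(acer(48000, 6.0))              # 8,000 dollars per QALY overall
```

    The ICER drives the adoption decision at the margin, while the ACER summarizes each strategy on its own, which is why reporting both gives a fuller picture.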

  9. Statistics of optimal information flow in ensembles of regulatory motifs

    NASA Astrophysics Data System (ADS)

    Crisanti, Andrea; De Martino, Andrea; Fiorentino, Jonathan

    2018-02-01

    Genetic regulatory circuits universally cope with different sources of noise that limit their ability to coordinate input and output signals. In many cases, optimal regulatory performance can be thought to correspond to configurations of variables and parameters that maximize the mutual information between inputs and outputs. Since the mid-2000s, such optima have been well characterized in several biologically relevant cases. Here we use methods of statistical field theory to calculate the statistics of the maximal mutual information (the "capacity") achievable by tuning the input variable only in an ensemble of regulatory motifs, such that a single controller regulates N targets. Assuming (i) sufficiently large N , (ii) quenched random kinetic parameters, and (iii) small noise affecting the input-output channels, we can accurately reproduce numerical simulations both for the mean capacity and for the whole distribution. Our results provide insight into the inherent variability in effectiveness occurring in regulatory systems with heterogeneous kinetic parameters.

  10. Use of observational and model-derived fields and regime model output statistics in mesoscale forecasting

    NASA Technical Reports Server (NTRS)

    Forbes, G. S.; Pielke, R. A.

    1985-01-01

    Various empirical and statistical weather-forecasting studies which utilize stratification by weather regime are described. Objective classification was used to determine weather regime in some studies. In other cases, the weather pattern was determined on the basis of a parameter representing the physical and dynamical processes relevant to the anticipated mesoscale phenomena, such as low level moisture convergence and convective precipitation, or the Froude number and the occurrence of cold-air damming. For mesoscale phenomena already in existence, new forecasting techniques were developed. The use of cloud models in operational forecasting is discussed. Models to calculate the spatial scales of forcings and resultant response for mesoscale systems are presented. The use of these models to represent the climatologically most prevalent systems, and to perform case-by-case simulations, is reviewed. Operational implementation of mesoscale data into weather forecasts, using both actual simulation output and model output statistics, is discussed.

  11. Features of statistical dynamics in a finite system

    NASA Astrophysics Data System (ADS)

    Yan, Shiwei; Sakata, Fumihiko; Zhuo, Yizhong

    2002-03-01

    We study features of statistical dynamics in a finite Hamiltonian system composed of one relevant degree of freedom coupled to an irrelevant multidegree-of-freedom system through a weak interaction. Special attention is paid to how the statistical dynamics changes depending on the number of degrees of freedom in the irrelevant system. It is found that the macrolevel statistical aspects are strongly related to the appearance of microlevel chaotic motion, and that dissipation of the relevant motion is realized through three distinct stages: dephasing, statistical relaxation, and equilibrium regimes. It is clarified that the dynamical description and the conventional transport approach provide almost the same macrolevel and microlevel mechanisms only for a system with a very large number of irrelevant degrees of freedom. It is also shown that the statistical relaxation in the finite system is an anomalous diffusion and that the fluctuation effects have a finite correlation time.

  12. Features of statistical dynamics in a finite system.

    PubMed

    Yan, Shiwei; Sakata, Fumihiko; Zhuo, Yizhong

    2002-03-01

    We study features of statistical dynamics in a finite Hamiltonian system composed of one relevant degree of freedom coupled to an irrelevant multidegree-of-freedom system through a weak interaction. Special attention is paid to how the statistical dynamics changes depending on the number of degrees of freedom in the irrelevant system. It is found that the macrolevel statistical aspects are strongly related to the appearance of microlevel chaotic motion, and that dissipation of the relevant motion is realized through three distinct stages: dephasing, statistical relaxation, and equilibrium regimes. It is clarified that the dynamical description and the conventional transport approach provide almost the same macrolevel and microlevel mechanisms only for a system with a very large number of irrelevant degrees of freedom. It is also shown that the statistical relaxation in the finite system is an anomalous diffusion and that the fluctuation effects have a finite correlation time.

  13. A statistical approach to bioclimatic trend detection in the airborne pollen records of Catalonia (NE Spain)

    NASA Astrophysics Data System (ADS)

    Fernández-Llamazares, Álvaro; Belmonte, Jordina; Delgado, Rosario; De Linares, Concepción

    2014-04-01

    Airborne pollen records are a suitable indicator for the study of climate change. The present work focuses on the role of annual pollen indices in the detection of bioclimatic trends through the analysis of the aerobiological spectra of 11 taxa of great biogeographical relevance in Catalonia over an 18-year period (1994-2011), by means of different parametric and non-parametric statistical methods. Among others, two non-parametric rank-based statistical tests were performed for detecting monotonic trends in time-series data of the selected airborne pollen types, and we observed that they have similar power in detecting trends. Except for those cases in which the pollen data can be well modeled by a normal distribution, it is better to apply non-parametric statistical methods in aerobiological studies. Our results provide a reliable representation of the pollen trends in the region and suggest that greater pollen quantities have been released into the atmosphere in recent years, especially by Mediterranean taxa such as Pinus, Total Quercus and Evergreen Quercus, although the trends may differ geographically. Longer aerobiological monitoring periods are required to corroborate these results and survey the increasing levels of certain pollen types that could exert an impact on public health.
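
    A common non-parametric rank-based trend test of the kind the abstract describes is the Mann-Kendall test, which counts concordant minus discordant pairs. The abstract does not name its two tests, so Mann-Kendall here is an assumption for illustration, and the annual pollen-index series below is invented:

```python
import math

def mann_kendall(x):
    """Mann-Kendall trend test (no ties assumed): S counts concordant
    minus discordant pairs; z uses the normal approximation with a
    continuity correction."""
    n = len(x)
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var = n * (n - 1) * (2 * n + 5) / 18
    z = (s - math.copysign(1, s)) / math.sqrt(var) if s != 0 else 0.0
    return s, z

# invented annual pollen-index series with an upward drift
series = [102, 98, 110, 115, 109, 121, 130, 126, 138, 144,
          141, 150, 158, 155, 167, 171, 169, 180]
s, z = mann_kendall(series)
print(s, z)  # z > 1.96: an increasing monotonic trend at the 5% level
```

    Because only the ranks of the pairwise comparisons enter S, the test needs no normality assumption, which is the point the abstract makes about non-normal pollen data.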

  14. When Educational Material Is Delivered: A Mixed Methods Content Validation Study of the Information Assessment Method

    PubMed Central

    2017-01-01

    Background The Information Assessment Method (IAM) allows clinicians to report the cognitive impact, clinical relevance, intention to use, and expected patient health benefits associated with clinical information received by email. More than 15,000 Canadian physicians and pharmacists use the IAM in continuing education programs. In addition, information providers can use IAM ratings and feedback comments from clinicians to improve their products. Objective Our general objective was to validate the IAM questionnaire for the delivery of educational material (ecological and logical content validity). Our specific objectives were to measure the relevance and evaluate the representativeness of IAM items for assessing information received by email. Methods A 3-part mixed methods study was conducted (convergent design). In part 1 (quantitative longitudinal study), the relevance of IAM items was measured. Participants were 5596 physician members of the Canadian Medical Association who used the IAM. A total of 234,196 ratings were collected in 2012. The relevance of IAM items with respect to their main construct was calculated using descriptive statistics (relevance ratio R). In part 2 (qualitative descriptive study), the representativeness of IAM items was evaluated. A total of 15 family physicians completed semistructured face-to-face interviews. For each construct, we evaluated the representativeness of IAM items using a deductive-inductive thematic qualitative data analysis. In part 3 (mixing quantitative and qualitative parts), results from quantitative and qualitative analyses were reviewed, juxtaposed in a table, discussed with experts, and integrated. Thus, our final results are derived from the views of users (ecological content validation) and experts (logical content validation). Results Of the 23 IAM items, 21 were validated for content, while 2 were removed. 
In part 1 (quantitative results), 21 items were deemed relevant, while 2 items were deemed not relevant (R=4.86% [N=234,196] and R=3.04% [n=45,394], respectively). In part 2 (qualitative results), 22 items were deemed representative, while 1 item was not representative. In part 3 (mixing quantitative and qualitative results), the content validity of 21 items was confirmed, and the 2 nonrelevant items were excluded. A fully validated version was generated (IAM-v2014). Conclusions This study produced a content validated IAM questionnaire that is used by clinicians and information providers to assess the clinical information delivered in continuing education programs. PMID:28292738

  15. Ecological content validation of the Information Assessment Method for parents (IAM-parent): A mixed methods study.

    PubMed

    Bujold, M; El Sherif, R; Bush, P L; Johnson-Lafleur, J; Doray, G; Pluye, P

    2018-02-01

    This mixed methods study content validated the Information Assessment Method for parents (IAM-parent) that allows users to systematically rate and comment on online parenting information. Quantitative data and results: 22,407 IAM ratings were collected; of the initial 32 items, descriptive statistics showed that 10 had low relevance. Qualitative data and results: IAM-based comments were collected, and 20 IAM users were interviewed (maximum variation sample); the qualitative data analysis assessed the representativeness of IAM items, and identified items with problematic wording. Researchers, the program director, and Web editors integrated quantitative and qualitative results, which led to a shorter and clearer IAM-parent. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  16. Surveys Assessing Students' Attitudes toward Statistics: A Systematic Review of Validity and Reliability

    ERIC Educational Resources Information Center

    Nolan, Meaghan M.; Beran, Tanya; Hecker, Kent G.

    2012-01-01

    Students with positive attitudes toward statistics are likely to show strong academic performance in statistics courses. Multiple surveys measuring students' attitudes toward statistics exist; however, a comparison of the validity and reliability of interpretations based on their scores is needed. A systematic review of relevant electronic…

  17. Relating Climate Change Risks to Water Supply Planning Assumptions: Recent Applications by the U.S. Bureau of Reclamation (Invited)

    NASA Astrophysics Data System (ADS)

    Brekke, L. D.

    2009-12-01

    Presentation highlights recent methods carried out by Reclamation to incorporate climate change and variability information into water supply assumptions for longer-term planning. Presentation also highlights limitations of these methods, and possible method adjustments that might be made to address these limitations. Reclamation was established more than one hundred years ago with a mission centered on the construction of irrigation and hydropower projects in the Western United States. Reclamation’s mission has evolved since its creation to include other activities, such as municipal and industrial water supply projects, ecosystem restoration, and the protection and management of water supplies. Reclamation continues to explore ways to better address mission objectives, often considering proposals to develop new infrastructure and/or modify long-term criteria for operations. Such studies typically feature operations analysis to disclose benefits and effects of a given proposal, which are sensitive to assumptions made about future water supplies, water demands, and operating constraints. Development of these assumptions requires consideration to more fundamental future drivers such as land use, demographics, and climate. On the matter of establishing planning assumptions for water supplies under climate change, Reclamation has applied several methods. This presentation highlights two activities where the first focuses on potential changes in hydroclimate frequencies and the second focuses on potential changes in hydroclimate period-statistics. The first activity took place in the Colorado River Basin where there was interest in the interarrival possibilities of drought and surplus events of varying severity relevant to proposals on new criteria for handling lower basin shortages. 
The second activity occurred in California’s Central Valley where stakeholders were interested in how projected climate change possibilities translated into changes in hydrologic and water supply statistics relevant to a long-term federal Endangered Species Act consultation. Projected climate change possibilities were characterized by surveying a large ensemble of climate projections for change in period climate-statistics and then selecting a small set of projections featuring a bracketing set of period-changes relative to those from the complete ensemble. Although both methods served the needs of their respective planning activities, each has limited applicability for other planning activities. First, each method addresses only one climate change aspect and not the other. Some planning activities may need to consider potential changes in both period-statistics and frequencies. Second, neither method addresses CMIP3 projected changes in climate variability. The first method bases frequency possibilities on historical information while the second method only surveys CMIP3 projections for change in period-mean and then superimposes those changes on historical variability. Third, artifacts of CMIP3 design lead to interpretation challenges when implementing the second method (e.g., inconsistent projection initialization, model-dependent expressions of multi-decadal variability). Presentation summarizes these issues and also potential method adjustments to address them when defining planning assumptions for water supplies.

  18. Estimating trends in the global mean temperature record

    NASA Astrophysics Data System (ADS)

    Poppick, Andrew; Moyer, Elisabeth J.; Stein, Michael L.

    2017-06-01

    Given uncertainties in physical theory and numerical climate simulations, the historical temperature record is often used as a source of empirical information about climate change. Many historical trend analyses appear to de-emphasize physical and statistical assumptions: examples include regression models that treat time rather than radiative forcing as the relevant covariate, and time series methods that account for internal variability in nonparametric rather than parametric ways. However, given a limited data record and the presence of internal variability, estimating radiatively forced temperature trends in the historical record necessarily requires some assumptions. Ostensibly empirical methods can also involve an inherent conflict in assumptions: they require data records that are short enough for naive trend models to be applicable, but long enough for long-timescale internal variability to be accounted for. In the context of global mean temperatures, empirical methods that appear to de-emphasize assumptions can therefore produce misleading inferences, because the trend over the twentieth century is complex and the scale of temporal correlation is long relative to the length of the data record. We illustrate here how a simple but physically motivated trend model can provide better-fitting and more broadly applicable trend estimates and can allow for a wider array of questions to be addressed. In particular, the model allows one to distinguish, within a single statistical framework, between uncertainties in the shorter-term vs. longer-term response to radiative forcing, with implications not only on historical trends but also on uncertainties in future projections. We also investigate the consequence on inferred uncertainties of the choice of a statistical description of internal variability. 
While nonparametric methods may seem to avoid making explicit assumptions, we demonstrate how even misspecified parametric statistical methods, if attuned to the important characteristics of internal variability, can result in more accurate uncertainty statements about trends.
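
    The point about treating radiative forcing rather than time as the covariate can be made with a toy regression: if temperature responds linearly to a forcing that accelerates in time, regressing on forcing recovers the sensitivity exactly, while a single time-slope only averages the changing trend. A sketch with invented numbers (not the paper's model, which also treats internal variability parametrically):

```python
def ols_slope(x, y):
    """Least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return b, my - b * mx

# invented series: temperature responds linearly to a radiative forcing
# that accelerates in time, with no noise
years   = list(range(20))
forcing = [0.01 * t ** 2 for t in years]   # accelerating forcing
temp    = [0.5 * f for f in forcing]       # linear response, sensitivity 0.5

b_f, a_f = ols_slope(forcing, temp)  # forcing as covariate: recovers 0.5
b_t, a_t = ols_slope(years, temp)    # time as covariate: one averaged slope
print(b_f, b_t)
```

    With noise and internal variability added, the time-based fit would also misattribute low-frequency variability to trend, which is the paper's argument for a physically motivated covariate.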

  19. Anatomical shape analysis of the mandible in Caucasian and Chinese for the production of preformed mandible reconstruction plates.

    PubMed

    Metzger, Marc C; Vogel, Mathias; Hohlweg-Majert, Bettina; Mast, Hansjörg; Fan, Xianqun; Rüdell, Alexandra; Schlager, Stefan

    2011-09-01

    The purpose of this study was to evaluate and analyze statistical shapes of the outer mandible contour in Caucasian and Chinese people, offering data for the production of preformed mandible reconstruction plates. A CT database of 925 Caucasians (male: n=463, female: n=462) and 960 Chinese (male: n=469, female: n=491), including scans of unaffected mandibles, was used and imported into the 3D modeling software Voxim (IVS-Solutions, Chemnitz, Germany). Anatomical landmarks (n=22 points for both sides) were set in the 3D view along the outer contour of the mandible in the area where reconstruction plates are commonly located. We used morphometric methods for statistical shape analysis. We found statistically relevant differences between the populations, including a distinct discrimination given by the landmarks on the mandible. After generating a metric model, however, this shape information separating the populations appeared to be of no clinical relevance. The metric size information given by ramus length, in contrast, provided a profound base for the production of standard reconstruction plates. Clustering by ramus length into three sizes and calculating the means of these size clusters seems to be a good solution for constructing preformed reconstruction plates that will fit the vast majority. Copyright © 2010 European Association for Cranio-Maxillo-Facial Surgery. Published by Elsevier Ltd. All rights reserved.

  20. Level statistics of words: Finding keywords in literary texts and symbolic sequences

    NASA Astrophysics Data System (ADS)

    Carpena, P.; Bernaola-Galván, P.; Hackenberg, M.; Coronado, A. V.; Oliver, J. L.

    2009-03-01

    Using a generalization of the level statistics analysis of quantum disordered systems, we present an approach able to automatically extract keywords from literary texts. Our approach takes into account not only the frequencies of the words present in the text but also their spatial distribution along the text, and is based on the fact that relevant words are significantly clustered (i.e., they self-attract each other), while irrelevant words are distributed randomly in the text. Since a reference corpus is not needed, our approach is especially suitable for single documents for which no a priori information is available. In addition, we show that our method also works on generic symbolic sequences (continuous texts without spaces), suggesting its general applicability.
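
The clustering idea can be sketched with a simplified statistic (the published method additionally corrects for word frequency and finite-size effects, which this toy omits): for the positions at which a word occurs, take the ratio of the standard deviation to the mean of the inter-occurrence spacings. Random placement gives values near 1, self-attracting (clustered) words give values well above 1, and perfectly even spacing gives 0.

```python
from statistics import mean, pstdev

def sigma_nor(positions):
    """Normalized spacing fluctuation of a word's occurrence positions:
    std(spacings) / mean(spacings). Poisson-like placement gives a value
    near 1; clustering pushes it above 1."""
    spacings = [b - a for a, b in zip(positions, positions[1:])]
    return pstdev(spacings) / mean(spacings)

# Evenly spaced occurrences: no clustering signal at all.
even = sigma_nor(list(range(0, 80, 10)))
# Two tight bursts far apart: strong self-attraction.
burst = sigma_nor([0, 1, 2, 3, 100, 101, 102, 103])
```

A keyword detector built on this idea would flag words whose statistic significantly exceeds the value expected for a randomly shuffled text of the same length and word frequency.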

  1. Modularity and the spread of perturbations in complex dynamical systems

    NASA Astrophysics Data System (ADS)

    Kolchinsky, Artemy; Gates, Alexander J.; Rocha, Luis M.

    2015-12-01

    We propose a method to decompose dynamical systems based on the idea that modules constrain the spread of perturbations. We find partitions of system variables that maximize "perturbation modularity," defined as the autocovariance of coarse-grained perturbed trajectories. The measure effectively separates the fast intramodular from the slow intermodular dynamics of perturbation spreading (in this respect, it is a generalization of the "Markov stability" method of network community detection). Our approach captures variation of modular organization across different system states, time scales, and in response to different kinds of perturbations: aspects of modularity which are all relevant to real-world dynamical systems. It offers a principled alternative to detecting communities in networks of statistical dependencies between system variables (e.g., "relevance networks" or "functional networks"). Using coupled logistic maps, we demonstrate that the method uncovers hierarchical modular organization planted in a system's coupling matrix. Additionally, in homogeneously coupled map lattices, it identifies the presence of self-organized modularity that depends on the initial state, dynamical parameters, and type of perturbations. Our approach offers a powerful tool for exploring the modular organization of complex dynamical systems.

  2. Modularity and the spread of perturbations in complex dynamical systems.

    PubMed

    Kolchinsky, Artemy; Gates, Alexander J; Rocha, Luis M

    2015-12-01

    We propose a method to decompose dynamical systems based on the idea that modules constrain the spread of perturbations. We find partitions of system variables that maximize "perturbation modularity," defined as the autocovariance of coarse-grained perturbed trajectories. The measure effectively separates the fast intramodular from the slow intermodular dynamics of perturbation spreading (in this respect, it is a generalization of the "Markov stability" method of network community detection). Our approach captures variation of modular organization across different system states, time scales, and in response to different kinds of perturbations: aspects of modularity which are all relevant to real-world dynamical systems. It offers a principled alternative to detecting communities in networks of statistical dependencies between system variables (e.g., "relevance networks" or "functional networks"). Using coupled logistic maps, we demonstrate that the method uncovers hierarchical modular organization planted in a system's coupling matrix. Additionally, in homogeneously coupled map lattices, it identifies the presence of self-organized modularity that depends on the initial state, dynamical parameters, and type of perturbations. Our approach offers a powerful tool for exploring the modular organization of complex dynamical systems.

  3. Anchoring quartet-based phylogenetic distances and applications to species tree reconstruction.

    PubMed

    Sayyari, Erfan; Mirarab, Siavash

    2016-11-11

    Inferring species trees from gene trees using coalescent-based summary methods has been the subject of much attention, yet new scalable and accurate methods are needed. We introduce DISTIQUE, a new statistically consistent summary method for inferring species trees from gene trees under the coalescent model. We generalize our results to arbitrary phylogenetic inference problems; we show that two arbitrarily chosen leaves, called anchors, can be used to estimate relative distances between all other pairs of leaves by inferring relevant quartet trees. This results in a family of distance-based tree inference methods, with running times ranging from quadratic to quartic in the number of leaves. We show in simulation studies that DISTIQUE has accuracy comparable to leading coalescent-based summary methods and reduced running times.

  4. Patch test results in children and adolescents. Study from the Santa Casa de Belo Horizonte Dermatology Clinic, Brazil, from 2003 to 2010*

    PubMed Central

    Rodrigues, Dulcilea Ferraz; Goulart, Eugênio Marcos Andrade

    2015-01-01

    BACKGROUND Patch testing is an efficient method to identify the allergen responsible for allergic contact dermatitis. OBJECTIVE To evaluate the results of patch tests in children and adolescents, comparing the results of these two age groups. METHODS Cross-sectional study assessing the patch test results of 125 children and adolescents aged 1-19 years, with suspected allergic contact dermatitis, in a dermatology clinic in Brazil. Two Brazilian standardized series were used. RESULTS Seventy-four (59.2%) patients had "at least one positive reaction" to the patch test. Among these positive tests, 77.0% were deemed relevant. The most frequent allergens were nickel (36.8%), thimerosal (18.4%), tosylamide formaldehyde resin (6.8%), neomycin (6.4%), cobalt (4.0%) and fragrance mix I (4.0%). Positive tests were most frequent among adolescents (p=0.0014) and females (p=0.0002). There was no statistically significant difference in contact sensitization between patients with and without an atopic history. There were, however, significant differences in sensitization to nickel (p=0.029) and thimerosal (p=0.042) between the two age groups, with adolescents the most affected. CONCLUSION Nickel and fragrances were the only positive (and relevant) allergens in children. Nickel and tosylamide formaldehyde resin were the most frequent and relevant allergens among adolescents. PMID:26560213

  5. Air ions and respiratory function outcomes: a comprehensive review

    PubMed Central

    2013-01-01

    Background From a mechanistic or physical perspective there is no basis to suspect that electric charges on clusters of air molecules (air ions) would have beneficial or deleterious effects on respiratory function. Yet, there is a large lay and scientific literature spanning 80 years that asserts exposure to air ions affects the respiratory system and has other biological effects. Aims This review evaluates the scientific evidence in published human experimental studies regarding the effects of exposure to air ions on respiratory performance and symptoms. Methods We identified 23 studies (published 1933–1993) that met our inclusion criteria. Relevant data pertaining to study population characteristics, study design, experimental methods, statistical techniques, and study results were assessed. Where relevant, random-effects meta-analysis models were used to summarize similar exposure and outcome groupings quantitatively. Results The included studies examined the therapeutic benefits of exposure to negative air ions on respiratory outcomes, such as ventilatory function and asthmatic symptoms. Study-specific sample sizes ranged between 7 and 23, and studies varied considerably by subject characteristics (e.g., infants with asthma, adults with emphysema), experimental method, outcomes measured (e.g., subjective symptoms, sensitivity, clinical pulmonary function), analytical design, and statistical reporting. Conclusions Despite numerous experimental and analytical differences across studies, the literature does not clearly support a beneficial effect of exposure to negative air ions on respiratory function or asthmatic symptoms. Further, collectively, the human experimental studies do not indicate a significant detrimental effect of exposure to positive air ions on respiratory measures. Exposure to negative or positive air ions does not appear to play an appreciable role in respiratory function. PMID:24016271
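
The random-effects pooling step mentioned here can be sketched with a DerSimonian-Laird estimator (the numbers below are invented for illustration, not data from the review): estimate the between-study variance from Cochran's Q, then re-weight each study by the inverse of its total variance.

```python
def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooled estimate.

    Estimates between-study variance tau^2 from Cochran's Q, then
    re-weights each study by 1 / (v_i + tau^2)."""
    w = [1.0 / v for v in variances]
    fixed = sum(wi * ei for wi, ei in zip(w, effects)) / sum(w)
    q = sum(wi * (ei - fixed) ** 2 for wi, ei in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    wr = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * ei for wi, ei in zip(wr, effects)) / sum(wr)
    return pooled, 1.0 / sum(wr), tau2

# Three perfectly homogeneous toy studies: tau^2 collapses to zero and
# the pooled effect equals the common study effect.
pooled, var, tau2 = random_effects_pool([0.5, 0.5, 0.5], [0.1, 0.1, 0.1])
```

With heterogeneous study effects, Q exceeds its degrees of freedom, tau^2 becomes positive, and the pooled confidence interval widens accordingly.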

  6. MetaboLyzer: A Novel Statistical Workflow for Analyzing Post-Processed LC/MS Metabolomics Data

    PubMed Central

    Mak, Tytus D.; Laiakis, Evagelia C.; Goudarzi, Maryam; Fornace, Albert J.

    2014-01-01

    Metabolomics, the global study of small molecules in a particular system, has in the last few years risen to become a primary omics platform for the study of metabolic processes. With the ever-increasing pool of quantitative data yielded from metabolomic research, specialized methods and tools with which to analyze and extract meaningful conclusions from these data are becoming more and more crucial. Furthermore, the depth of knowledge and expertise required to undertake a metabolomics-oriented study is a daunting obstacle to investigators new to the field. As such, we have created a new statistical analysis workflow, MetaboLyzer, which aims both to simplify analysis for investigators new to metabolomics and to provide experienced investigators the flexibility to conduct sophisticated analysis. MetaboLyzer's workflow is specifically tailored to the unique characteristics and idiosyncrasies of post-processed liquid chromatography/mass spectrometry (LC/MS) based metabolomic datasets. It utilizes a wide gamut of statistical tests, procedures, and methodologies that belong to classical biostatistics, as well as several novel statistical techniques that we have developed specifically for metabolomics data. Furthermore, MetaboLyzer conducts rapid putative ion identification and putative biologically relevant analysis via incorporation of four major small molecule databases: KEGG, HMDB, Lipid Maps, and BioCyc. MetaboLyzer incorporates these aspects into a comprehensive workflow that outputs easy-to-understand, statistically significant and potentially biologically relevant information in the form of heatmaps, volcano plots, 3D visualization plots, correlation maps, and metabolic pathway hit histograms. For demonstration purposes, a urine metabolomics data set from a previously reported radiobiology study, in which samples were collected from mice exposed to gamma radiation, was analyzed.
MetaboLyzer was able to identify 243 statistically significant ions out of a total of 1942. Numerous putative metabolites and pathways were found to be biologically significant from the putative ion identification workflow. PMID:24266674

  7. MSW Students' Perceptions of Relevance and Application of Statistics: Implications for Field Education

    ERIC Educational Resources Information Center

    Davis, Ashley; Mirick, Rebecca G.

    2015-01-01

    Many social work students feel anxious when taking a statistics course. Their attitudes, beliefs, and behaviors after learning statistics are less known. However, such information could help instructors support students' ongoing development of statistical knowledge. With a sample of MSW students (N = 101) in one program, this study examined…

  8. Automated MRI parcellation of the frontal lobe

    PubMed Central

    Ranta, Marin E.; Chen, Min; Crocetti, Deana; Prince, Jerry L.; Subramaniam, Krish; Fischl, Bruce; Kaufmann, Walter E.; Mostofsky, Stewart H.

    2014-01-01

    Examination of associations between specific disorders and physical properties of functionally relevant frontal lobe sub-regions is a fundamental goal in neuropsychiatry. Here we present and evaluate automated methods of frontal lobe parcellation with the programs FreeSurfer (FS) and TOADS-CRUISE (T-C), based on the manual method described in Ranta et al. (2009), in which sulcal-gyral landmarks were used to manually delimit functionally relevant regions within the frontal lobe: i.e., primary motor cortex, anterior cingulate, deep white matter, premotor cortex regions (supplementary motor complex, frontal eye field and lateral premotor cortex) and prefrontal cortex (PFC) regions (medial PFC, dorsolateral PFC, inferior PFC, lateral orbitofrontal cortex (OFC) and medial OFC). Dice's coefficient, a measure of overlap, and percent volume difference were used to measure the reliability between manual and automated delineations for each frontal lobe region. For FS, the mean Dice's coefficient for all regions was 0.75 and the mean percent volume difference was 21.2%. For T-C, the mean Dice's coefficient was 0.77 and the mean percent volume difference for all regions was 20.2%. These results, along with a high degree of agreement between the two automated methods (mean Dice's coefficient = 0.81, percent volume difference = 12.4%) and a proof-of-principle group difference analysis that highlights the consistency and sensitivity of the automated methods, indicate that the automated methods are valid techniques for parcellation of the frontal lobe into functionally relevant sub-regions. Thus, the methodology has the potential to increase efficiency, statistical power and reproducibility for population analyses of neuropsychiatric disorders with hypothesized frontal lobe contributions. PMID:23897577
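
The two agreement measures used in this record are straightforward to compute on voxel label sets; a minimal sketch (toy voxel IDs, not the study's data):

```python
def dice(a, b):
    """Dice's coefficient: 2|A ∩ B| / (|A| + |B|); 1.0 for identical sets."""
    return 2.0 * len(a & b) / (len(a) + len(b))

def pct_volume_diff(a, b):
    """Absolute volume difference as a percentage of the mean volume."""
    return 200.0 * abs(len(a) - len(b)) / (len(a) + len(b))

# Two fake 100-voxel parcels that share half their voxels.
manual = set(range(0, 100))
auto = set(range(50, 150))
```

Here `dice(manual, auto)` is 0.5 while `pct_volume_diff(manual, auto)` is 0, which illustrates why both measures are reported: equal volumes say nothing about spatial overlap.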

  9. gsSKAT: Rapid gene set analysis and multiple testing correction for rare-variant association studies using weighted linear kernels.

    PubMed

    Larson, Nicholas B; McDonnell, Shannon; Cannon Albright, Lisa; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A; Isaacs, William B; Xu, Jianfeng; Cooney, Kathleen A; Lange, Ethan; Schleutker, Johanna; Carpten, John D; Powell, Isaac; Bailey-Wilson, Joan E; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham G; MacInnis, Robert J; Maier, Christiane; Whittemore, Alice S; Hsieh, Chih-Lin; Wiklund, Fredrik; Catalona, William J; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote-Jarai, Zsofia; Ackerman, Michael J; Olson, Timothy M; Klein, Christopher J; Thibodeau, Stephen N; Schaid, Daniel J

    2017-05-01

    Next-generation sequencing technologies have afforded unprecedented characterization of low-frequency and rare genetic variation. Due to low power for single-variant testing, aggregative methods are commonly used to combine observed rare variation within a single gene. Causal variation may also aggregate across multiple genes within relevant biomolecular pathways. Kernel-machine regression and adaptive testing methods for aggregative rare-variant association testing have been demonstrated to be powerful approaches for pathway-level analysis, although these methods tend to be computationally intensive at high-variant dimensionality and require access to complete data. An additional analytical issue in scans of large pathway definition sets is multiple testing correction. Gene set definitions may exhibit substantial genic overlap, and the impact of the resultant correlation in test statistics on Type I error rate control for large agnostic gene set scans has not been fully explored. Herein, we first outline a statistical strategy for aggregative rare-variant analysis using component gene-level linear kernel score test summary statistics as well as derive simple estimators of the effective number of tests for family-wise error rate control. We then conduct extensive simulation studies to characterize the behavior of our approach relative to direct application of kernel and adaptive methods under a variety of conditions. We also apply our method to two case-control studies, respectively, evaluating rare variation in hereditary prostate cancer and schizophrenia. Finally, we provide open-source R code for public use to facilitate easy application of our methods to existing rare-variant analysis results. © 2017 WILEY PERIODICALS, INC.

  10. ArrayVigil: a methodology for statistical comparison of gene signatures using segregated-one-tailed (SOT) Wilcoxon's signed-rank test.

    PubMed

    Khan, Haseeb Ahmad

    2005-01-28

    Due to their versatile diagnostic and prognostic fidelity, molecular signatures or fingerprints are anticipated to be among the most powerful tools for cancer management in the near future. Notwithstanding the experimental advancements in microarray technology, methods for analyzing either whole arrays or gene signatures have not been firmly established. Recently, an algorithm, ArraySolver, was reported by Khan for two-group comparison of microarray gene expression data using the two-tailed Wilcoxon signed-rank test. Most molecular signatures are composed of two sets of genes (hybrid signatures) wherein up-regulation of one set and down-regulation of the other set collectively define the purpose of the signature. Since the direction of a selected gene's expression (positive or negative) with respect to a particular disease condition is known, application of one-tailed statistics could be a more relevant choice. A novel method, ArrayVigil, is described for comparing hybrid signatures using a segregated-one-tailed (SOT) Wilcoxon signed-rank test, and the results are compared with integrated-two-tailed (ITT) procedures (SPSS and ArraySolver). ArrayVigil resulted in lower P values than those obtained from ITT statistics when comparing real data from four signatures.

  11. NetNorM: Capturing cancer-relevant information in somatic exome mutation data with gene networks for cancer stratification and prognosis.

    PubMed

    Le Morvan, Marine; Zinovyev, Andrei; Vert, Jean-Philippe

    2017-06-01

    Genome-wide somatic mutation profiles of tumours can now be assessed efficiently and promise to move precision medicine forward. Statistical analysis of mutation profiles is however challenging due to the low frequency of most mutations, the varying mutation rates across tumours, and the presence of a majority of passenger events that hide the contribution of driver events. Here we propose a method, NetNorM, to represent whole-exome somatic mutation data in a form that enhances cancer-relevant information using a gene network as background knowledge. We evaluate its relevance for two tasks: survival prediction and unsupervised patient stratification. Using data from 8 cancer types from The Cancer Genome Atlas (TCGA), we show that it improves over the raw binary mutation data and network diffusion for these two tasks. In doing so, we also provide a thorough assessment of the prognostic power of somatic mutations, which has been overlooked by previous studies because of the sparse and binary nature of mutations.

  12. NetNorM: Capturing cancer-relevant information in somatic exome mutation data with gene networks for cancer stratification and prognosis

    PubMed Central

    2017-01-01

    Genome-wide somatic mutation profiles of tumours can now be assessed efficiently and promise to move precision medicine forward. Statistical analysis of mutation profiles is however challenging due to the low frequency of most mutations, the varying mutation rates across tumours, and the presence of a majority of passenger events that hide the contribution of driver events. Here we propose a method, NetNorM, to represent whole-exome somatic mutation data in a form that enhances cancer-relevant information using a gene network as background knowledge. We evaluate its relevance for two tasks: survival prediction and unsupervised patient stratification. Using data from 8 cancer types from The Cancer Genome Atlas (TCGA), we show that it improves over the raw binary mutation data and network diffusion for these two tasks. In doing so, we also provide a thorough assessment of the prognostic power of somatic mutations, which has been overlooked by previous studies because of the sparse and binary nature of mutations. PMID:28650955

  13. Upgrades to the REA method for producing probabilistic climate change projections

    NASA Astrophysics Data System (ADS)

    Xu, Ying; Gao, Xuejie; Giorgi, Filippo

    2010-05-01

    We present an augmented version of the Reliability Ensemble Averaging (REA) method designed to generate probabilistic climate change information from ensembles of climate model simulations. Compared to the original version, the augmented one includes consideration of multiple variables and statistics in the calculation of the performance-based weights. In addition, the model convergence criterion previously employed is removed. The method is applied to the calculation of changes in mean and variability for temperature and precipitation over different sub-regions of East Asia based on the recently completed CMIP3 multi-model ensemble. Comparison of the new and old REA methods, along with a simple averaging procedure, and the use of different combinations of performance metrics shows that at fine sub-regional scales the choice of weighting is relevant. This is mostly because the models show a substantial spread in performance for the simulation of precipitation statistics, a result that supports model weighting as a useful option for accounting for the wide range in model quality. The REA method, and in particular the upgraded version, provides a simple and flexible framework for assessing the uncertainty related to the aggregation of results from ensembles of models in order to produce climate change information at the regional scale. KEY WORDS: REA method, Climate change, CMIP3
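
The performance-weighting idea can be sketched minimally (hypothetical model biases and projected changes; the full REA reliability factors also combine multiple variables and statistics, which this toy omits):

```python
def rea_style_weights(biases, eps=1e-6):
    """Weight each model by the inverse of its absolute bias versus
    observations, normalized so the weights sum to 1."""
    raw = [1.0 / max(abs(b), eps) for b in biases]
    total = sum(raw)
    return [r / total for r in raw]

def weighted_change(changes, weights):
    """Performance-weighted ensemble-mean change."""
    return sum(c * w for c, w in zip(changes, weights))

# Two toy models: the low-bias model dominates the weighted mean, so the
# result (24/11 ≈ 2.18) sits closer to 2.0 than the plain average of 3.0.
w = rea_style_weights([0.1, 1.0])
dT = weighted_change([2.0, 4.0], w)
```

This is why the choice of weighting matters most where model performance spreads widely, as the abstract notes for sub-regional precipitation.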

  14. A New Method for Automated Identification and Morphometry of Myelinated Fibers Through Light Microscopy Image Analysis.

    PubMed

    Novas, Romulo Bourget; Fazan, Valeria Paula Sassoli; Felipe, Joaquim Cezar

    2016-02-01

    Nerve morphometry is known to produce relevant information for the evaluation of several phenomena, such as nerve repair, regeneration, implant, transplant, aging, and different human neuropathies. Manual morphometry is laborious, tedious, time consuming, and subject to many sources of error. Therefore, in this paper, we propose a new method for the automated morphometry of myelinated fibers in cross-section light microscopy images. Images from the recurrent laryngeal nerve of adult rats and the vestibulocochlear nerve of adult guinea pigs were used herein. The proposed pipeline for fiber segmentation is based on the techniques of competitive clustering and concavity analysis. The segmentation was evaluated by comparing the automatic results with manual segmentation. To further evaluate the method on the morphometric features extracted from the segmented images, the distributions of these features were tested for statistically significant differences. The method achieved high overall sensitivity and very low false-positive rates per image. We detected no statistically significant difference between the distributions of the features extracted from the manual and pipeline segmentations. The method presented good overall performance, showing widespread potential in experimental and clinical settings by allowing large-scale image analysis and thus leading to more reliable results.

  15. Statistical classification of road pavements using near field vehicle rolling noise measurements.

    PubMed

    Paulo, Joel Preto; Coelho, J L Bento; Figueiredo, Mário A T

    2010-10-01

    Low noise surfaces have been increasingly considered as a viable and cost-effective alternative to acoustical barriers. However, road planners and administrators frequently lack information on the correlation between the type of road surface and the resulting noise emission profile. To address this problem, a method to identify and classify different types of road pavements was developed, whereby near field road noise is analyzed using statistical learning methods. The vehicle rolling sound signal near the tires and close to the road surface was acquired by two microphones in a special arrangement which implements the Close-Proximity method. A set of features, characterizing the properties of the road pavement, was extracted from the corresponding sound profiles. A feature selection method was used to automatically select those that are most relevant in predicting the type of pavement, while reducing the computational cost. A set of different types of road pavement segments were tested and the performance of the classifier was evaluated. Results of pavement classification performed during a road journey are presented on a map, together with geographical data. This procedure leads to a considerable improvement in the quality of road pavement noise data, thereby increasing the accuracy of road traffic noise prediction models.

  16. Assessment of protein set coherence using functional annotations

    PubMed Central

    Chagoyen, Monica; Carazo, Jose M; Pascual-Montano, Alberto

    2008-01-01

    Background Analysis of large-scale experimental datasets frequently produces one or more sets of proteins that are subsequently mined for functional interpretation and validation. To this end, a number of computational methods have been devised that rely on the analysis of functional annotations. Although current methods provide valuable information (e.g. significantly enriched annotations, pairwise functional similarities), they do not specifically measure the degree of homogeneity of a protein set. Results In this work we present a method that scores the degree of functional homogeneity, or coherence, of a set of proteins on the basis of the global similarity of their functional annotations. The method uses statistical hypothesis testing to assess the significance of the set in the context of the functional space of a reference set. As such, it can be used as a first step in the validation of sets expected to be homogeneous prior to further functional interpretation. Conclusion We evaluate our method by analysing known biologically relevant sets as well as random ones. The known relevant sets comprise macromolecular complexes, cellular components and pathways described for Saccharomyces cerevisiae, which are mostly significantly coherent. Finally, we illustrate the usefulness of our approach for validating 'functional modules' obtained from computational analysis of protein-protein interaction networks. Matlab code and supplementary data are available at PMID:18937846

  17. Mastication Evaluation With Unsupervised Learning: Using an Inertial Sensor-Based System

    PubMed Central

    Lucena, Caroline Vieira; Lacerda, Marcelo; Caldas, Rafael; De Lima Neto, Fernando Buarque

    2018-01-01

    There is a direct relationship between the prevalence of musculoskeletal disorders of the temporomandibular joint and orofacial disorders. A well-elaborated analysis of jaw movements provides relevant information for healthcare professionals to reach a diagnosis. Different approaches have been explored to track jaw movements so that mastication analysis becomes less subjective; however, all methods remain highly subjective, and the quality of the assessments depends greatly on the experience of the health professional. In this paper, an accurate and non-invasive method based on a commercial low-cost inertial sensor (MPU6050) to measure jaw movements is proposed. The jaw-movement feature values are compared to those obtained with clinical analysis, showing no statistically significant difference between the two methods. Moreover, we propose using unsupervised approaches to cluster the mastication patterns of healthy subjects and simulated patients with facial trauma. Two techniques were used to instantiate the method: Kohonen's Self-Organizing Maps and K-Means Clustering. Both algorithms perform well on jaw-movement data, showing encouraging results and the potential to support a full assessment of masticatory function. The proposed method can be applied in real time, providing relevant dynamic information for healthcare professionals. PMID:29651365
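
One of the two clustering techniques named here, K-Means, reduces to a few lines on a one-dimensional feature (the amplitudes below are invented toy values, not the study's sensor data):

```python
import random

def kmeans_1d(xs, k, iters=25, seed=0):
    """Lloyd's algorithm on scalar features: assign each point to the
    nearest center, then move each center to its group mean."""
    centers = random.Random(seed).sample(xs, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[min(range(k), key=lambda c: abs(x - centers[c]))].append(x)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return sorted(centers)

# Two simulated mastication patterns: small vs large movement amplitude.
amplitudes = [1.0, 1.1, 0.9, 10.0, 10.2, 9.8]
centers = kmeans_1d(amplitudes, 2)
```

On real multi-axis inertial features the same assignment/update loop runs on vectors with Euclidean distance; the scalar case just makes the mechanics visible.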

  18. Pilot study for the registry of complications in rheumatic diseases from the German Society of Surgery (DGORh): evaluation of methods and data from the first 1000 patients

    PubMed Central

    Kostuj, Tanja; Rehart, Stefan; Matta-Hurtado, Ronald; Biehl, Christoph; Willburger, Roland E; Schmidt, Klaus

    2017-01-01

    Objective Most patients with rheumatic diseases who undergo surgical treatment are receiving immune-modulating therapy. To determine whether these medications affect their outcomes, a national registry was established in Germany by the German Society of Surgery (DGORh). Data from the first 1000 patients were used in a pilot study to identify relevant co-risk factors and to determine whether such a registry is suitable for developing accurate and relevant recommendations. Design and participants Data were collected from patients undergoing surgical treatment with their written consent. A second consent form was used if complications occurred. During this pilot study, to obtain a quicker overview, risk factors were considered only in patients with complications. Only descriptive statistical analysis was employed in this pilot study, due to the limited number of observed complications and inhomogeneous data regarding the surgery and the medications the patients received. Analytical statistics will be performed to confirm the results in a future outcome study. Results Complications occurred in 26 patients and were distributed equally among the different types of surgery. Twenty-one of these patients were receiving immune-modulating therapy at the time, while five were not. Infections were observed in 2.3% of patients receiving and in 5.1% of patients not receiving immunosuppression. Conclusions Due to the low number of cases and the inhomogeneity of the diseases and treatments received by the patients in this pilot study, it is not possible to develop standardised best-practice recommendations to optimise their care. Based on this observation, we conclude that to be suitable for developing accurate and relevant recommendations, a national registry must include the most important and relevant variables that affect the care and outcomes of these patients. PMID:29018066

  19. Teaching Biology through Statistics: Application of Statistical Methods in Genetics and Zoology Courses

    PubMed Central

    Colon-Berlingeri, Migdalisel; Burrowes, Patricia A.

    2011-01-01

    Incorporation of mathematics into biology curricula is critical to underscore for undergraduate students the relevance of mathematics to most fields of biology and the usefulness of developing quantitative process skills demanded in modern biology. At our institution, we have made significant changes to better integrate mathematics into the undergraduate biology curriculum. The curricular revision included changes in the suggested course sequence, addition of statistics and precalculus as prerequisites to core science courses, and incorporating interdisciplinary (math–biology) learning activities in genetics and zoology courses. In this article, we describe the activities developed for these two courses and the assessment tools used to measure the learning that took place with respect to biology and statistics. We distinguished the effectiveness of these learning opportunities in helping students improve their understanding of the math and statistical concepts addressed and, more importantly, their ability to apply them to solve a biological problem. We also identified areas that need emphasis in both biology and mathematics courses. In light of our observations, we recommend best practices that biology and mathematics academic departments can implement to train undergraduates for the demands of modern biology. PMID:21885822

  20. Teaching biology through statistics: application of statistical methods in genetics and zoology courses.

    PubMed

    Colon-Berlingeri, Migdalisel; Burrowes, Patricia A

    2011-01-01

    Incorporation of mathematics into biology curricula is critical to underscore for undergraduate students the relevance of mathematics to most fields of biology and the usefulness of developing quantitative process skills demanded in modern biology. At our institution, we have made significant changes to better integrate mathematics into the undergraduate biology curriculum. The curricular revision included changes in the suggested course sequence, addition of statistics and precalculus as prerequisites to core science courses, and incorporating interdisciplinary (math-biology) learning activities in genetics and zoology courses. In this article, we describe the activities developed for these two courses and the assessment tools used to measure the learning that took place with respect to biology and statistics. We distinguished the effectiveness of these learning opportunities in helping students improve their understanding of the math and statistical concepts addressed and, more importantly, their ability to apply them to solve a biological problem. We also identified areas that need emphasis in both biology and mathematics courses. In light of our observations, we recommend best practices that biology and mathematics academic departments can implement to train undergraduates for the demands of modern biology.

  1. Exploring dangerous neighborhoods: Latent Semantic Analysis and computing beyond the bounds of the familiar

    PubMed Central

    Cohen, Trevor; Blatter, Brett; Patel, Vimla

    2005-01-01

    Certain applications require computer systems to approximate intended human meaning. This is achievable in constrained domains with a finite number of concepts. Areas such as psychiatry, however, draw on concepts from the world-at-large. A knowledge structure with broad scope is required to comprehend such domains. Latent Semantic Analysis (LSA) is an unsupervised corpus-based statistical method that derives quantitative estimates of the similarity between words and documents from their contextual usage statistics. The aim of this research was to evaluate the ability of LSA to derive meaningful associations between concepts relevant to the assessment of dangerousness in psychiatry. An expert reference model of dangerousness was used to guide the construction of a relevant corpus. Derived associations between words in the corpus were evaluated qualitatively. A similarity-based scoring function was used to assign dangerousness categories to discharge summaries. LSA was shown to derive intuitive relationships between concepts and correlated significantly better than random with human categorization of psychiatric discharge summaries according to dangerousness. The use of LSA to derive a simulated knowledge structure can extend the scope of computer systems beyond the boundaries of constrained conceptual domains. PMID:16779020
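    The core computation behind LSA as described in this record can be sketched in a few lines: build a term-document count matrix, take a truncated SVD, and compare terms by cosine similarity in the reduced space. The toy terms and counts below are invented for illustration and are not from the study's corpus.

```python
import numpy as np

# Toy term-document count matrix: rows = terms, columns = documents.
# Terms and counts are illustrative, not from the study's corpus.
terms = ["knife", "weapon", "threat", "therapy", "medication"]
X = np.array([
    [2, 1, 0, 0],   # knife
    [1, 2, 0, 0],   # weapon
    [1, 1, 1, 0],   # threat
    [0, 0, 2, 1],   # therapy
    [0, 0, 1, 2],   # medication
], dtype=float)

# LSA: truncated SVD of the term-document matrix.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2                          # number of latent dimensions kept
term_vecs = U[:, :k] * s[:k]   # term representations in latent space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Terms sharing contexts ("knife"/"weapon") should come out more similar
# than terms from unrelated contexts ("knife"/"medication").
sim_related = cosine(term_vecs[0], term_vecs[1])
sim_unrelated = cosine(term_vecs[0], term_vecs[4])
print(sim_related > sim_unrelated)
```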

  2. An Overview of data science uses in bioimage informatics.

    PubMed

    Chessel, Anatole

    2017-02-15

    This review aims to provide a practical overview of the use of statistical features and associated data science methods in bioimage informatics. To achieve a quantitative link between images and biological concepts, one typically replaces an object coming from an image (a segmented cell or intracellular object, a pattern of expression or localisation, even a whole image) with a vector of numbers. These range from carefully crafted, biologically relevant measurements to features learnt through deep neural networks. This replacement allows the use of practical algorithms for visualisation, comparison and inference, such as those from machine learning or multivariate statistics. While, in biology, these methods originated mainly in high-content screening, they are integral to the use of data science for the quantitative analysis of microscopy images to gain biological insight, and they are sure to gather more interest as the need to make sense of the increasing amount of acquired imaging data grows more pressing. Copyright © 2017 Elsevier Inc. All rights reserved.

  3. Biomechanical analysis of tension band fixation for olecranon fracture treatment.

    PubMed

    Kozin, S H; Berglund, L J; Cooney, W P; Morrey, B F; An, K N

    1996-01-01

    This study assessed the strength of various tension band fixation methods, with wire and cable applied to simulated olecranon fractures, to compare stability and potential failure or complications between the two materials. Transverse olecranon fractures were simulated by osteotomy. Each fracture was anatomically reduced, and various tension band fixation techniques were applied with monofilament wire or multifilament cable. Load-displacement curves were obtained with a material testing machine, and statistical significance was determined by analysis of variance. Two loading modes were tested: loading on the posterior surface of the olecranon to simulate triceps pull, and loading on the anterior olecranon tip to recreate potential compressive loading on the fragment during resisted flexion. All fixation methods were more resistant to posterior loading than to an anterior load. Individual comparative analysis for the various loading conditions concluded that tension band fixation is more resistant to the tensile forces exerted by the triceps than to compressive forces on the anterior olecranon tip. Neither wire passage anterior to the K-wires nor the multifilament cable provided statistically significant increased stability.

  4. Linear regression analysis: part 14 of a series on evaluation of scientific publications.

    PubMed

    Schneider, Astrid; Hommel, Gerhard; Blettner, Maria

    2010-11-01

    Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
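    The univariate case of the regression model discussed in this record reduces to two textbook formulas, which can be checked by hand. The data below are invented purely for illustration.

```python
import statistics

# Illustrative data: a single prognostic factor x and an outcome y
# (values invented; a real analysis would use study data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

# Ordinary least squares for y = a + b*x:
#   b = cov(x, y) / var(x),  a = mean(y) - b * mean(x)
mx, my = statistics.fmean(x), statistics.fmean(y)
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx

# R^2: the share of outcome variance explained by the model.
ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
ss_tot = sum((yi - my) ** 2 for yi in y)
r2 = 1 - ss_res / ss_tot
print(round(b, 2), round(a, 2))  # slope and intercept
```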

  5. Diagnosis of students' ability in a statistical course based on Rasch probabilistic outcome

    NASA Astrophysics Data System (ADS)

    Mahmud, Zamalia; Ramli, Wan Syahira Wan; Sapri, Shamsiah; Ahmad, Sanizah

    2017-06-01

    Measuring students' ability and performance is important in assessing how well students have learned and mastered statistical courses. Any improvement in learning will depend on the students' approaches to learning, which relate to several factors, namely assessment methods comprising quizzes, tests, assignments and a final examination. This study attempted an alternative approach to measuring students' ability in an undergraduate statistics course based on the Rasch probabilistic model. Firstly, this study aims to explore the learning outcome patterns of students in a statistics course (Applied Probability and Statistics) based on an Entrance-Exit survey. This is followed by investigating students' perceived learning ability based on four Course Learning Outcomes (CLOs) and students' actual learning ability based on their final examination scores. Rasch analysis revealed that students perceived themselves as lacking the ability to understand about 95% of the statistics concepts at the beginning of the class, but eventually they had a good understanding at the end of the 14-week class. In terms of students' performance in the final examination, their ability to understand the topics varied, with the probability of success depending on both student ability and question difficulty. The majority found the probability and counting rules topic the most difficult to learn.
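    The Rasch model underlying this kind of analysis gives the probability of a correct response as a logistic function of the difference between person ability theta and item difficulty b. A minimal sketch (parameter values invented):

```python
import math

def rasch_probability(theta, b):
    """Rasch model: probability that a person with ability theta
    answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When ability equals difficulty, the probability is exactly 0.5.
print(rasch_probability(0.0, 0.0))  # 0.5

# A hard topic (high b, e.g. probability and counting rules) yields a
# low success probability for a student of average ability.
print(rasch_probability(0.0, 2.0) < 0.5)  # True
```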

  6. Passage relevance models for genomics search.

    PubMed

    Urbain, Jay; Frieder, Ophir; Goharian, Nazli

    2009-03-19

    We present a passage relevance model for integrating syntactic and semantic evidence of biomedical concepts and topics using a probabilistic graphical model. Component models of topics, concepts, terms, and document are represented as potential functions within a Markov Random Field. The probability of a passage being relevant to a biologist's information need is represented as the joint distribution across all potential functions. Relevance model feedback of top ranked passages is used to improve distributional estimates of query concepts and topics in context, and a dimensional indexing strategy is used for efficient aggregation of concept and term statistics. By integrating multiple sources of evidence including dependencies between topics, concepts, and terms, we seek to improve genomics literature passage retrieval precision. Using this model, we are able to demonstrate statistically significant improvements in retrieval precision using a large genomics literature corpus.

  7. Higher-order statistical moments and a procedure that detects potentially anomalous years as two alternative methods describing alterations in continuous environmental data

    USGS Publications Warehouse

    Arismendi, Ivan; Johnson, Sherri L.; Dunham, Jason B.

    2015-01-01

    Statistics of central tendency and dispersion may not capture relevant or desired characteristics of the distribution of continuous phenomena and, thus, they may not adequately describe temporal patterns of change. Here, we present two methodological approaches that can help to identify temporal changes in environmental regimes. First, we use higher-order statistical moments (skewness and kurtosis) to examine potential changes of empirical distributions at decadal extents. Second, we adapt a statistical procedure combining a non-metric multidimensional scaling technique and higher density region plots to detect potentially anomalous years. We illustrate the use of these approaches by examining long-term stream temperature data from minimally and highly human-influenced streams. In particular, we contrast predictions about thermal regime responses to changing climates and human-related water uses. Using these methods, we effectively diagnose years with unusual thermal variability and patterns in variability through time, as well as spatial variability linked to regional and local factors that influence stream temperature. Our findings highlight the complexity of responses of thermal regimes of streams and reveal their differential vulnerability to climate warming and human-related water uses. The two approaches presented here can be applied with a variety of other continuous phenomena to address historical changes, extreme events, and their associated ecological responses.
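    The higher-order moments used in the first approach of this record are straightforward to compute; a sketch with invented temperature-like records (a symmetric series versus one with a hot-year tail):

```python
def moments(xs):
    """Sample skewness and excess kurtosis (simple moment-based forms)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3.0   # excess kurtosis: 0 for a normal distribution
    return skew, kurt

# Invented series: a symmetric record vs. one with a right (hot) tail.
symmetric = [10, 11, 12, 13, 14, 15, 16]
right_tailed = [10, 10, 11, 11, 12, 13, 20]

s_sym, _ = moments(symmetric)
s_tail, _ = moments(right_tailed)
print(abs(s_sym) < 1e-9, s_tail > 1.0)  # tail shows up in the skewness
```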

  8. The Statistical Segment Length of DNA: Opportunities for Biomechanical Modeling in Polymer Physics and Next-Generation Genomics.

    PubMed

    Dorfman, Kevin D

    2018-02-01

    The development of bright bisintercalating dyes for deoxyribonucleic acid (DNA) in the 1990s, most notably YOYO-1, revolutionized the field of polymer physics in the ensuing years. These dyes, in conjunction with modern molecular biology techniques, permit the facile observation of polymer dynamics via fluorescence microscopy and thus direct tests of different theories of polymer dynamics. At the same time, they have played a key role in advancing an emerging next-generation method known as genome mapping in nanochannels. The effect of intercalation on the bending energy of DNA as embodied by a change in its statistical segment length (or, alternatively, its persistence length) has been the subject of significant controversy. The precise value of the statistical segment length is critical for the proper interpretation of polymer physics experiments and controls the phenomena underlying the aforementioned genomics technology. In this perspective, we briefly review the model of DNA as a wormlike chain and a trio of methods (light scattering, optical or magnetic tweezers, and atomic force microscopy (AFM)) that have been used to determine the statistical segment length of DNA. We then outline the disagreement in the literature over the role of bisintercalation on the bending energy of DNA, and how a multiscale biomechanical approach could provide an important model for this scientifically and technologically relevant problem.

  9. Relevance and reliability of experimental data in human health risk assessment of pesticides.

    PubMed

    Kaltenhäuser, Johanna; Kneuer, Carsten; Marx-Stoelting, Philip; Niemann, Lars; Schubert, Jens; Stein, Bernd; Solecki, Roland

    2017-08-01

    Evaluation of data relevance, reliability and contribution to uncertainty is crucial in regulatory health risk assessment if robust conclusions are to be drawn. Whether a specific study is used as key study, as additional information or not accepted depends in part on the criteria according to which its relevance and reliability are judged. In addition to GLP-compliant regulatory studies following OECD Test Guidelines, data from peer-reviewed scientific literature have to be evaluated in regulatory risk assessment of pesticide active substances. Publications should be taken into account if they are of acceptable relevance and reliability. Their contribution to the overall weight of evidence is influenced by factors including test organism, study design and statistical methods, as well as test item identification, documentation and reporting of results. Various reports make recommendations for improving the quality of risk assessments and different criteria catalogues have been published to support evaluation of data relevance and reliability. Their intention was to guide transparent decision making on the integration of the respective information into the regulatory process. This article describes an approach to assess the relevance and reliability of experimental data from guideline-compliant studies as well as from non-guideline studies published in the scientific literature in the specific context of uncertainty and risk assessment of pesticides. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  10. Statistically Validated Networks in Bipartite Complex Systems

    PubMed Central

    Tumminello, Michele; Miccichè, Salvatore; Lillo, Fabrizio; Piilo, Jyrki; Mantegna, Rosario N.

    2011-01-01

    Many complex systems present an intrinsic bipartite structure where elements of one set link to elements of the second set. In these complex systems, such as the system of actors and movies, elements of one set are qualitatively different than elements of the other set. The properties of these complex systems are typically investigated by constructing and analyzing a projected network on one of the two sets (for example the actor network or the movie network). Complex systems are often very heterogeneous in the number of relationships that the elements of one set establish with the elements of the other set, and this heterogeneity makes it very difficult to discriminate links of the projected network that are just reflecting system's heterogeneity from links relevant to unveil the properties of the system. Here we introduce an unsupervised method to statistically validate each link of a projected network against a null hypothesis that takes into account system heterogeneity. We apply the method to a biological, an economic and a social complex system. The method we propose is able to detect network structures which are very informative about the organization and specialization of the investigated systems, and identifies those relationships between elements of the projected network that cannot be explained simply by system heterogeneity. We also show that our method applies to bipartite systems in which different relationships might have different qualitative nature, generating statistically validated networks in which such difference is preserved. PMID:21483858
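    The null model in this validation approach is hypergeometric: given the degrees of two nodes in the bipartite system, it asks how surprising their observed number of common neighbours is. A sketch with invented counts (the full procedure in the paper also applies multiple-hypothesis correction, omitted here):

```python
from math import comb

def hypergeom_sf(x, N, K, n):
    """P(X >= x) for a hypergeometric variable: x co-occurrences when
    node A has K links and node B has n links out of N total items."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(x, min(K, n) + 1)) / comb(N, n)

# Toy bipartite system (invented): 50 movies; actor A appears in 10,
# actor B in 8, and they co-star in 6 -- far more than chance predicts
# (the expected overlap under the null is 10 * 8 / 50 = 1.6).
p = hypergeom_sf(6, 50, 10, 8)
print(p < 0.01)  # a small p-value flags the link as statistically validated
```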

  11. A Quantile Mapping Bias Correction Method Based on Hydroclimatic Classification of the Guiana Shield

    PubMed Central

    Ringard, Justine; Seyler, Frederique; Linguet, Laurent

    2017-01-01

    Satellite precipitation products (SPPs) provide alternative precipitation data for regions with sparse rain gauge measurements. However, SPPs are subject to different types of error that need correction. Most SPP bias correction methods use the statistical properties of the rain gauge data to adjust the corresponding SPP data. The statistical adjustment does not make it possible to correct the pixels of SPP data for which there is no rain gauge data. The solution proposed in this article is to correct the daily SPP data for the Guiana Shield using a novel two-step approach that uses not the daily gauge data of the pixel to be corrected, but the daily gauge data from surrounding pixels. In this case, a spatial analysis must be involved. The first step defines hydroclimatic areas using a spatial classification that considers precipitation data with the same temporal distributions. The second step uses the Quantile Mapping bias correction method to correct the daily SPP data contained within each hydroclimatic area. We validate the results by comparing the corrected SPP data and daily rain gauge measurements using relative RMSE and relative bias statistical errors. The results show that analysis scale variation reduces rBIAS and rRMSE significantly. The spatial classification avoids mixing rainfall data with different temporal characteristics in each hydroclimatic area, and the defined bias correction parameters are more realistic and appropriate. This study demonstrates that hydroclimatic classification is relevant for implementing bias correction methods at the local scale. PMID:28621723

  12. A Quantile Mapping Bias Correction Method Based on Hydroclimatic Classification of the Guiana Shield.

    PubMed

    Ringard, Justine; Seyler, Frederique; Linguet, Laurent

    2017-06-16

    Satellite precipitation products (SPPs) provide alternative precipitation data for regions with sparse rain gauge measurements. However, SPPs are subject to different types of error that need correction. Most SPP bias correction methods use the statistical properties of the rain gauge data to adjust the corresponding SPP data. The statistical adjustment does not make it possible to correct the pixels of SPP data for which there is no rain gauge data. The solution proposed in this article is to correct the daily SPP data for the Guiana Shield using a novel two-step approach that uses not the daily gauge data of the pixel to be corrected, but the daily gauge data from surrounding pixels. In this case, a spatial analysis must be involved. The first step defines hydroclimatic areas using a spatial classification that considers precipitation data with the same temporal distributions. The second step uses the Quantile Mapping bias correction method to correct the daily SPP data contained within each hydroclimatic area. We validate the results by comparing the corrected SPP data and daily rain gauge measurements using relative RMSE and relative bias statistical errors. The results show that analysis scale variation reduces rBIAS and rRMSE significantly. The spatial classification avoids mixing rainfall data with different temporal characteristics in each hydroclimatic area, and the defined bias correction parameters are more realistic and appropriate. This study demonstrates that hydroclimatic classification is relevant for implementing bias correction methods at the local scale.
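    The Quantile Mapping step described in both versions of this record can be sketched as follows: estimate matching quantiles of the satellite and gauge distributions, then map each satellite value through the resulting transfer function. The synthetic "wet-biased" data below are purely illustrative.

```python
import numpy as np

# Hypothetical daily rainfall (mm): biased satellite estimates vs. gauges
# from the same hydroclimatic area (synthetic data for illustration).
rng = np.random.default_rng(0)
gauge = rng.gamma(shape=2.0, scale=5.0, size=2000)   # "truth"
satellite = gauge * 1.3 + 2.0                        # wet-biased product

# Quantile mapping: replace each satellite value by the gauge value
# sitting at the same quantile of its distribution.
qs = np.linspace(0.0, 1.0, 101)
sat_q = np.quantile(satellite, qs)
gauge_q = np.quantile(gauge, qs)

def correct(x):
    # Piecewise-linear transfer function between matching quantiles.
    return np.interp(x, sat_q, gauge_q)

corrected = correct(satellite)
bias_before = satellite.mean() - gauge.mean()
bias_after = corrected.mean() - gauge.mean()
print(abs(bias_after) < abs(bias_before))  # correction shrinks the bias
```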

  13. Identification of Chemical Attribution Signatures of Fentanyl Syntheses Using Multivariate Statistical Analysis of Orthogonal Analytical Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mayer, B. P.; Mew, D. A.; DeHope, A.

    Attribution of the origin of an illicit drug relies on identification of compounds indicative of its clandestine production and is a key component of many modern forensic investigations. The results of these studies can yield detailed information on method of manufacture, starting material source, and final product - all critical forensic evidence. In the present work, chemical attribution signatures (CAS) associated with the synthesis of the analgesic fentanyl, N-(1-phenethylpiperidin-4-yl)-N-phenylpropanamide, were investigated. Six synthesis methods, all previously published fentanyl synthetic routes or hybrid versions thereof, were studied in an effort to identify and classify route-specific signatures. 160 distinct compounds and inorganic species were identified using gas and liquid chromatographies combined with mass spectrometric methods (GC-MS and LC-MS/MS-TOF) in conjunction with inductively coupled plasma mass spectrometry (ICP-MS). The complexity of the resultant data matrix urged the use of multivariate statistical analysis. Using partial least squares discriminant analysis (PLS-DA), 87 route-specific CAS were classified and a statistical model capable of predicting the method of fentanyl synthesis was validated and tested against CAS profiles from crude fentanyl products deposited on and later extracted from two operationally relevant surfaces: stainless steel and vinyl tile. This work provides the most detailed fentanyl CAS investigation to date by using orthogonal mass spectral data to identify CAS of forensic significance for illicit drug detection, profiling, and attribution.

  14. Improved accuracy of supervised CRM discovery with interpolated Markov models and cross-species comparison

    PubMed Central

    Kazemian, Majid; Zhu, Qiyun; Halfon, Marc S.; Sinha, Saurabh

    2011-01-01

    Despite recent advances in experimental approaches for identifying transcriptional cis-regulatory modules (CRMs, ‘enhancers’), direct empirical discovery of CRMs for all genes in all cell types and environmental conditions is likely to remain an elusive goal. Effective methods for computational CRM discovery are thus a critically needed complement to empirical approaches. However, existing computational methods that search for clusters of putative binding sites are ineffective if the relevant TFs and/or their binding specificities are unknown. Here, we provide a significantly improved method for ‘motif-blind’ CRM discovery that does not depend on knowledge or accurate prediction of TF-binding motifs and is effective when limited knowledge of functional CRMs is available to ‘supervise’ the search. We propose a new statistical method, based on ‘Interpolated Markov Models’, for motif-blind, genome-wide CRM discovery. It captures the statistical profile of variable length words in known CRMs of a regulatory network and finds candidate CRMs that match this profile. The method also uses orthologs of the known CRMs from closely related genomes. We perform in silico evaluation of predicted CRMs by assessing whether their neighboring genes are enriched for the expected expression patterns. This assessment uses a novel statistical test that extends the widely used Hypergeometric test of gene set enrichment to account for variability in intergenic lengths. We find that the new CRM prediction method is superior to existing methods. Finally, we experimentally validate 12 new CRM predictions by examining their regulatory activity in vivo in Drosophila; 10 of the tested CRMs were found to be functional, while 6 of the top 7 predictions showed the expected activity patterns. We make our program available as downloadable source code, and as a plugin for a genome browser installed on our servers. PMID:21821659
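    The "Interpolated Markov Model" idea at the core of this record, scoring a candidate sequence by mixing conditional probabilities across several Markov orders, can be sketched as follows. The training sequences, orders and interpolation weights are invented and far smaller than anything used in the paper.

```python
import math
from collections import defaultdict

def train(seqs, order):
    """Context -> next-base counts for a fixed Markov order."""
    counts = defaultdict(lambda: defaultdict(int))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1
    return counts

def prob(counts, context, base):
    nexts = counts.get(context)
    if not nexts:
        return 0.25                    # uniform fallback over A, C, G, T
    return nexts.get(base, 0) / sum(nexts.values())

# Toy "known CRM" training set (synthetic; real CRMs are far longer).
crms = ["ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC"]
orders = [0, 1, 2]
weights = [0.2, 0.3, 0.5]              # interpolation weights (assumed, fixed)
models = [train(crms, k) for k in orders]

def imm_loglike(seq):
    """Log-likelihood of seq under the interpolated Markov model."""
    ll = 0.0
    for i in range(max(orders), len(seq)):
        p = sum(w * prob(models[k], seq[i - k:i], seq[i])
                for k, w in zip(orders, weights))
        ll += math.log(p)
    return ll

# A candidate matching the CRM profile outscores an unrelated sequence.
print(imm_loglike("ACGTACGT") > imm_loglike("GGGGCCCC"))
```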

  15. A Study of the Effectiveness of the Contextual Lab Activity in the Teaching and Learning Statistics at the UTHM (Universiti Tun Hussein Onn Malaysia)

    ERIC Educational Resources Information Center

    Kamaruddin, Nafisah Kamariah Md; Jaafar, Norzilaila bt; Amin, Zulkarnain Md

    2012-01-01

    Inaccurate concepts in statistics contribute to students' assumption that statistics does not relate to the real world and is not relevant to the engineering field. Some universities have introduced the learning of statistics through statistics lab activities. However, the learning focuses more on how to use software and not to…

  16. The application of feature selection to the development of Gaussian process models for percutaneous absorption.

    PubMed

    Lam, Lun Tak; Sun, Yi; Davey, Neil; Adams, Rod; Prapopoulou, Maria; Brown, Marc B; Moss, Gary P

    2010-06-01

    The aim was to employ Gaussian processes to assess mathematically the nature of a skin permeability dataset and to employ these methods, particularly feature selection, to determine the key physicochemical descriptors which exert the most significant influence on percutaneous absorption, and to compare such models with established existing models. Gaussian processes, including automatic relevance determination (GPRARD) methods, were employed to develop models of percutaneous absorption that identified key physicochemical descriptors of percutaneous absorption. Using MatLab software, the statistical performance of these models was compared with single linear networks (SLN) and quantitative structure-permeability relationships (QSPRs). Feature selection methods were used to examine in more detail the physicochemical parameters used in this study. A range of statistical measures to determine model quality were used. The inherently nonlinear nature of the skin data set was confirmed. The Gaussian process regression (GPR) methods yielded predictive models that offered statistically significant improvements over SLN and QSPR models with regard to predictivity (where the rank order was: GPR > SLN > QSPR). Feature selection analysis determined that the best GPR models were those that contained log P, melting point and the number of hydrogen bond donor groups as significant descriptors. Further statistical analysis also found that great synergy existed between certain parameters. It suggested that a number of the descriptors employed were effectively interchangeable, thus questioning the use of models where discrete variables are output, usually in the form of an equation. The use of a nonlinear GPR method produced models with significantly improved predictivity, compared with SLN or QSPR models. Feature selection methods were able to provide important mechanistic information.
However, it was also shown that significant synergy existed between certain parameters, and as such it was possible to interchange certain descriptors (i.e. molecular weight and melting point) without incurring a loss of model quality. Such synergy suggested that a model constructed from discrete terms in an equation may not be the most appropriate way of representing mechanistic understandings of skin absorption.
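    A minimal Gaussian process regression with a squared-exponential kernel, the family of models compared in this record, can be sketched in numpy. The one-dimensional toy function below stands in for a single descriptor; the actual models used several physicochemical descriptors with ARD lengthscales, and all values here are invented.

```python
import numpy as np

def rbf(a, b, length=1.0, var=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

# Toy 1-D stand-in for a single descriptor vs. a permeability-like
# response (synthetic: the true function is sin).
x_train = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_train = np.sin(x_train)
x_test = np.array([0.5])

noise = 1e-6                                   # small jitter for stability
K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
k_star = rbf(x_train, x_test)

# GPR posterior mean: k*^T K^{-1} y (solve instead of explicit inverse).
mean = k_star.T @ np.linalg.solve(K, y_train)
print(abs(mean[0] - np.sin(0.5)) < 0.05)  # close to the true function
```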

  17. Health measurement using the ICF: Test-retest reliability study of ICF codes and qualifiers in geriatric care

    PubMed Central

    Okochi, Jiro; Utsunomiya, Sakiko; Takahashi, Tai

    2005-01-01

    Background The International Classification of Functioning, Disability and Health (ICF) was published by the World Health Organization (WHO) to standardize descriptions of health and disability. Little is known about the reliability and clinical relevance of measurements using the ICF and its qualifiers. This study examines the test-retest reliability of ICF codes, and the rate of immeasurability in long-term care settings of the elderly to evaluate the clinical applicability of the ICF and its qualifiers, and the ICF checklist. Methods Reliability of 85 body function (BF) items and 152 activity and participation (AP) items of the ICF was studied using a test-retest procedure with a sample of 742 elderly persons from 59 institutional and at home care service centers. Test-retest reliability was estimated using the weighted kappa statistic. The clinical relevance of the ICF was estimated by calculating immeasurability rate. The effect of the measurement settings and evaluators' experience was analyzed by stratification of these variables. The properties of each item were evaluated using both the kappa statistic and immeasurability rate to assess the clinical applicability of WHO's ICF checklist in the elderly care setting. Results The median of the weighted kappa statistics of 85 BF and 152 AP items were 0.46 and 0.55 respectively. The reproducibility statistics improved when the measurements were performed by experienced evaluators. Some chapters such as genitourinary and reproductive functions in the BF domain and major life area in the AP domain contained more items with lower test-retest reliability measures and rated as immeasurable than in the other chapters. Some items in the ICF checklist were rated as unreliable and immeasurable. Conclusion The reliability of the ICF codes when measured with the current ICF qualifiers is relatively low. 
    The increase in reliability with evaluators' experience suggests that proper training will have a positive effect on reliability. The ICF checklist contains some items that are difficult to apply in geriatric care settings. Improvements should be achieved by selecting the most relevant items for each measurement and by developing appropriate qualifiers for each code according to the interests of the users. PMID:16050960
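    The weighted kappa statistic used for the test-retest comparison in this record can be computed directly; a sketch with invented ratings on a 0-4 qualifier scale (linear weights are assumed here; the study may have used a different weighting scheme).

```python
def weighted_kappa(r1, r2, n_cat):
    """Linearly weighted kappa for two ratings of the same subjects on an
    ordinal scale 0..n_cat-1 (e.g. ICF qualifiers): 1 - observed/expected
    disagreement."""
    n = len(r1)
    w = [[abs(i - j) / (n_cat - 1) for j in range(n_cat)] for i in range(n_cat)]
    obs = sum(w[a][b] for a, b in zip(r1, r2)) / n
    p1 = [r1.count(c) / n for c in range(n_cat)]
    p2 = [r2.count(c) / n for c in range(n_cat)]
    exp = sum(w[i][j] * p1[i] * p2[j]
              for i in range(n_cat) for j in range(n_cat))
    return 1.0 - obs / exp

# Hypothetical test-retest ratings on a 0-4 ICF qualifier scale.
test = [0, 1, 2, 2, 3, 4, 1, 0, 2, 3]
retest = [0, 1, 2, 3, 3, 4, 1, 1, 2, 3]
kappa = weighted_kappa(test, retest, 5)
print(0.0 < kappa <= 1.0)  # mostly matching ratings give a high kappa
```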

  18. Measurement of turbulent spatial structure and kinetic energy spectrum by exact temporal-to-spatial mapping

    NASA Astrophysics Data System (ADS)

    Buchhave, Preben; Velte, Clara M.

    2017-08-01

    We present a method for converting a time record of turbulent velocity measured at a point in a flow to a spatial velocity record consisting of consecutive convection elements. The spatial record allows computation of dynamic statistical moments such as turbulent kinetic wavenumber spectra and spatial structure functions in a way that completely bypasses the need for Taylor's hypothesis. The spatial statistics agree with the classical counterparts, such as the total kinetic energy spectrum, at least for spatial extents up to the Taylor microscale. The requirements for applying the method are access to the instantaneous velocity magnitude, in addition to the desired flow quantity, and a high temporal resolution in comparison to the relevant time scales of the flow. We map, without distortion and bias, notoriously difficult developing turbulent high intensity flows using three main aspects that distinguish these measurements from previous work in the field: (1) The measurements are conducted using laser Doppler anemometry and are therefore not contaminated by directional ambiguity (in contrast to, e.g., frequently employed hot-wire anemometers); (2) the measurement data are extracted using a correctly and transparently functioning processor and are analysed using methods derived from first principles to provide unbiased estimates of the velocity statistics; (3) the exact mapping proposed herein has been applied to the high turbulence intensity flows investigated to avoid the significant distortions caused by Taylor's hypothesis. The method is first confirmed to produce the correct statistics using computer simulations and later applied to measurements in some of the most difficult regions of a round turbulent jet—the non-equilibrium developing region and the outermost parts of the developed jet. 
The proposed mapping is successfully validated using corresponding directly measured spatial statistics in the fully developed jet, even in the difficult outer regions of the jet where the average convection velocity is negligible and turbulence intensities increase dramatically. The measurements in the developing region reveal interesting features of an incomplete Richardson-Kolmogorov cascade under development.
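
In outline, the mapping replaces Taylor's frozen-field convection velocity with the measured instantaneous velocity magnitude: each sample is assigned a spatial extent |u|·dt, and consecutive samples are concatenated into a spatial record. A minimal Python sketch of the idea (synthetic sinusoidal data; the record and sampling step are illustrative assumptions, not the authors' measurements):

```python
import math

def temporal_to_spatial(velocities, dt):
    """Map a uniformly sampled time record of velocity at a point to a
    spatial record of consecutive convection elements.

    Each sample is taken to convect past the probe a distance |u|*dt, so
    samples are placed at cumulative positions along the convection path;
    no frozen-turbulence (Taylor) assumption is needed.
    """
    positions = []
    x = 0.0
    for u in velocities:
        x += abs(u) * dt
        positions.append(x)
    return positions, velocities

# Synthetic record: mean flow of 1 m/s plus a 50 Hz sinusoidal fluctuation.
dt = 1e-3
u = [1.0 + 0.3 * math.sin(2 * math.pi * 50 * i * dt) for i in range(1000)]
xs, us = temporal_to_spatial(u, dt)
```

The spatial record (xs, us) could then feed spatial structure functions or, after resampling to a uniform grid, wavenumber spectra.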

  19. Image segmentation with a novel regularized composite shape prior based on surrogate study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhao, Tingting, E-mail: tingtingzhao@mednet.ucla.edu; Ruan, Dan, E-mail: druan@mednet.ucla.edu

Purpose: Incorporating training into image segmentation is a good approach to achieve additional robustness. This work aims to develop an effective strategy to utilize shape prior knowledge, so that the segmentation label evolution can be driven toward the desired global optimum. Methods: In the variational image segmentation framework, a regularization for the composite shape prior is designed to incorporate the geometric relevance of individual training data to the target, which is inferred by an image-based surrogate relevance metric. Specifically, this regularization is imposed on the linear weights of composite shapes and serves as a hyperprior. The overall problem is formulated in a unified optimization setting and a variational block-descent algorithm is derived. Results: The performance of the proposed scheme is assessed in both corpus callosum segmentation from an MR image set and clavicle segmentation based on CT images. The resulting shape composition provides a proper preference for the geometrically relevant training data. A paired Wilcoxon signed rank test demonstrates statistically significant improvement of image segmentation accuracy, when compared to a multiatlas label fusion method and three other benchmark active contour schemes. Conclusions: This work has developed a novel composite shape prior regularization, which achieves superior segmentation performance compared to typical benchmark schemes.

  20. Health Care Financing In Iran; Is Privatization A Good Solution?

    PubMed Central

    Davari, M; Haycox, A; Walley, T

    2012-01-01

Background: This paper considers a range of issues related to the financing of the health care system and relevant government policies in Iran. Methods: This study used mixed methods. A systematic literature review was undertaken to identify relevant publications. This was supplemented by hand searching in books and journals, including government publications. The issues and uncertainties identified in the literature were explored in detail through semi-structured interviews with key informants. These were triangulated with empirical evidence in the form of the literature, government statistics and independent expert opinions to validate the views expressed in the interviews. Results: The systematic review of published literature showed that no previous publication has addressed issues relating to the financing of healthcare services in Iran. However, a range of opinion pieces outlined issues to be explored further in the interviews. These issues were summarised into four main categories. Conclusion: The health care market in Iran has faced a period in which financial issues have enhanced managerial complexity. Privatization of health care services would appear to be a step too far in assisting the system to confront its challenges at the current time. The most important step toward solving such challenges is to focus on a feasible, relevant and comprehensive policy, which optimises the use of health care resources in Iran. PMID:23113205

  1. Effects of consanguineous marriage on reproductive behaviour, adverse pregnancy outcomes and offspring mortality in Oman.

    PubMed

    Islam, M Mazharul

    2013-05-01

The long tradition of high prevalence of consanguineous marriages in Omani society may have ramifications for reproductive behaviour and the health of offspring. This study examined the relevance of consanguinity to reproductive behaviour, adverse pregnancy outcomes and offspring mortality in Oman. The data analysed came from the 2000 Oman National Health Survey. Selected indicators related to reproductive behaviour, adverse pregnancy outcome and offspring mortality were considered as explanatory variables. Various statistical methods and tests were used for data analysis. Consanguineous marriage was found to be associated with lower age at first birth, higher preference for larger family size, a lower level of husband-wife communication about use of family planning methods and a lower rate of contraceptive use. Although bivariate analysis showed elevated fertility and childhood mortality among women in consanguineous marriages, after controlling for relevant socio-demographic factors in multivariate analysis, fertility, childhood mortality and foetal loss showed no significant association with consanguinity in Oman. Consanguinity thus plays an important role in determining some aspects of reproduction and the health of newborns, but showed no detrimental effects on fertility and offspring mortality. The high level of consanguinity and its relevance to reproduction in Oman need to be considered in its public health strategy in a culturally acceptable manner.

  2. Salaried and Professional Women: Relevant Statistics. Publication #92-3.

    ERIC Educational Resources Information Center

    Wilson, Pamela, Ed.

This document contains 29 statistical tables grouped into five sections: "General Statistics," "Occupations and Earnings," "Earnings of Selected Professional Occupations," "Women and Higher Education," and "Family Income and Composition." Among the tables are those that show the following: (1) 1991 annual average U.S. civilian work force by…

  3. 34 CFR Appendix A to Subpart N of... - Sample Default Prevention Plan

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... relevant default prevention statistics, including a statistical analysis of the borrowers who default on...'s delinquency status by obtaining reports from data managers and FFEL Program lenders. 5. Enhance... academic study. III. Statistics for Measuring Progress 1. The number of students enrolled at your...

  4. Statistical Cost Estimation in Higher Education: Some Alternatives.

    ERIC Educational Resources Information Center

    Brinkman, Paul T.; Niwa, Shelley

    Recent developments in econometrics that are relevant to the task of estimating costs in higher education are reviewed. The relative effectiveness of alternative statistical procedures for estimating costs are also tested. Statistical cost estimation involves three basic parts: a model, a data set, and an estimation procedure. Actual data are used…

  5. ZERODUR: deterministic approach for strength design

    NASA Astrophysics Data System (ADS)

    Hartmann, Peter

    2012-12-01

There is an increasing demand for zero-expansion glass ceramic ZERODUR substrates capable of enduring higher operational static loads or accelerations. The integrity of structures such as optical or mechanical elements for satellites surviving rocket launches, filigree lightweight mirrors, wobbling mirrors, and reticle and wafer stages in microlithography must be guaranteed with low failure probability. Their design requires statistically relevant strength data. The traditional approach using the statistical two-parameter Weibull distribution suffered from two problems. The data sets were too small to obtain distribution parameters with sufficient accuracy, and also too small to decide on the validity of the model. This holds especially for the low failure probability levels that are required for reliable applications. Extrapolation to 0.1% failure probability and below led to design strengths so low that higher load applications seemed not to be feasible. New data have been collected with numbers per set large enough to enable tests on the applicability of the three-parameter Weibull distribution. This distribution proved to provide a much better fit to the data. Moreover, it delivers a lower threshold value, i.e. a minimum value for breakage stress, allowing statistical uncertainty to be removed by introducing a deterministic method to calculate design strength. Considerations from the theory of fracture mechanics, which have proven reliable in proof-test qualifications of delicate structures made from brittle materials, enable fatigue due to stress corrosion to be included in a straightforward way. With the formulae derived, either lifetime can be calculated from a given stress, or allowable stress from a minimum required lifetime. The data, distributions, and design strength calculations for several practically relevant surface conditions of ZERODUR are given.
The values obtained are significantly higher than those resulting from the two-parameter Weibull distribution approach and no longer subject to statistical uncertainty.
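
The role of the threshold (location) parameter in the three-parameter Weibull model can be illustrated with a short sketch: below the threshold stress the failure probability is exactly zero, which is what turns the statistical model into a deterministic design statement. The parameter values below are hypothetical, chosen for illustration only (not measured ZERODUR data):

```python
import math

def weibull3_failure_probability(stress, threshold, scale, shape):
    """Failure probability under a three-parameter Weibull distribution.

    threshold (MPa): location parameter, the minimum breakage stress;
    below it the failure probability is exactly zero, which permits a
    deterministic design-strength statement.
    """
    if stress <= threshold:
        return 0.0
    return 1.0 - math.exp(-(((stress - threshold) / scale) ** shape))

# Hypothetical parameters for illustration (not measured ZERODUR data).
threshold, scale, shape = 50.0, 30.0, 5.0  # MPa, MPa, dimensionless

p_low = weibull3_failure_probability(45.0, threshold, scale, shape)   # below threshold
p_high = weibull3_failure_probability(80.0, threshold, scale, shape)  # above threshold
```

In the two-parameter model (threshold = 0) every positive stress carries a nonzero failure probability, which is why extrapolation to very low probabilities drove design strengths down.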

  6. Easy and accurate variance estimation of the nonparametric estimator of the partial area under the ROC curve and its application.

    PubMed

    Yu, Jihnhee; Yang, Luge; Vexler, Albert; Hutson, Alan D

    2016-06-15

The receiver operating characteristic (ROC) curve is a popular technique with applications, for example, in investigating the accuracy of a biomarker to delineate between disease and non-disease groups. A common measure of accuracy of a given diagnostic marker is the area under the ROC curve (AUC). In contrast with the AUC, the partial area under the ROC curve (pAUC) looks only at the area with certain specificities (i.e., true negative rates), and it can often be clinically more relevant than examining the entire ROC curve. The pAUC is commonly estimated based on a U-statistic with a plug-in sample quantile, making the estimator a non-traditional U-statistic. In this article, we propose an accurate and easy method to obtain the variance of the nonparametric pAUC estimator. The proposed method is easy to implement for both a one-biomarker test and the comparison of two correlated biomarkers because it simply adapts the existing variance estimator of U-statistics. In this article, we show the accuracy and other advantages of the proposed variance estimation method by broadly comparing it with previously existing methods. Further, we develop an empirical likelihood inference method based on the proposed variance estimator through a simple implementation. In an application, we demonstrate that, depending on whether inference is based on the AUC or the pAUC, we can reach a different decision on the prognostic ability of the same set of biomarkers. Copyright © 2016 John Wiley & Sons, Ltd.
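
The point estimator described above, a U-statistic restricted by a plug-in sample quantile of the non-diseased scores, can be sketched as follows (hypothetical Gaussian biomarker data; this illustrates the estimator only, not the authors' variance method):

```python
import random

def pauc(controls, cases, fpr_max):
    """Nonparametric partial AUC over false-positive rates in [0, fpr_max].

    U-statistic form with a plug-in sample quantile: only control scores at
    or above the empirical (1 - fpr_max)-quantile contribute, restricting
    the area to the high-specificity part of the ROC curve. Ties count 1/2.
    """
    xs = sorted(controls)
    k = int((1.0 - fpr_max) * len(xs))   # plug-in quantile index
    q = xs[min(k, len(xs) - 1)]
    total = 0.0
    for x in controls:
        if x < q:
            continue
        for y in cases:
            if y > x:
                total += 1.0
            elif y == x:
                total += 0.5
    return total / (len(controls) * len(cases))

random.seed(0)
controls = [random.gauss(0.0, 1.0) for _ in range(400)]  # non-diseased scores
cases = [random.gauss(1.0, 1.0) for _ in range(400)]     # diseased scores
partial = pauc(controls, cases, 0.2)   # pAUC over FPR in [0, 0.2]
full = pauc(controls, cases, 1.0)      # reduces to the ordinary AUC
```

With fpr_max = 1 the plug-in quantile is the sample minimum and the estimator reduces to the usual Wilcoxon-Mann-Whitney form of the AUC.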

  7. KECSA-Movable Type Implicit Solvation Model (KMTISM)

    PubMed Central

    2015-01-01

    Computation of the solvation free energy for chemical and biological processes has long been of significant interest. The key challenges to effective solvation modeling center on the choice of potential function and configurational sampling. Herein, an energy sampling approach termed the “Movable Type” (MT) method, and a statistical energy function for solvation modeling, “Knowledge-based and Empirical Combined Scoring Algorithm” (KECSA) are developed and utilized to create an implicit solvation model: KECSA-Movable Type Implicit Solvation Model (KMTISM) suitable for the study of chemical and biological systems. KMTISM is an implicit solvation model, but the MT method performs energy sampling at the atom pairwise level. For a specific molecular system, the MT method collects energies from prebuilt databases for the requisite atom pairs at all relevant distance ranges, which by its very construction encodes all possible molecular configurations simultaneously. Unlike traditional statistical energy functions, KECSA converts structural statistical information into categorized atom pairwise interaction energies as a function of the radial distance instead of a mean force energy function. Within the implicit solvent model approximation, aqueous solvation free energies are then obtained from the NVT ensemble partition function generated by the MT method. Validation is performed against several subsets selected from the Minnesota Solvation Database v2012. Results are compared with several solvation free energy calculation methods, including a one-to-one comparison against two commonly used classical implicit solvation models: MM-GBSA and MM-PBSA. Comparison against a quantum mechanics based polarizable continuum model is also discussed (Cramer and Truhlar’s Solvation Model 12). PMID:25691832
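
The final step described above, obtaining a free energy from an ensemble partition function, can be illustrated with a toy sketch (the pairwise energies and temperature factor below are hypothetical placeholders, not KECSA values):

```python
import math

KT = 0.593  # approx. RT in kcal/mol near 298 K (illustrative constant)

def free_energy(energies, kT=KT):
    """Free energy from an ensemble of sampled state energies:
    F = -kT * ln( sum_i exp(-E_i / kT) )."""
    z = sum(math.exp(-e / kT) for e in energies)  # partition function
    return -kT * math.log(z)

# Toy "atom-pairwise" energies (kcal/mol) over binned distances; hypothetical.
pair_energies = [-1.2, -0.8, -0.3, 0.1, 0.6]
f = free_energy(pair_energies)
```

Because every sampled state contributes to the sum, the resulting free energy lies below the single lowest energy, reflecting the entropic contribution of the ensemble.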

  8. A methodology for the design of experiments in computational intelligence with multiple regression models.

    PubMed

    Fernandez-Lozano, Carlos; Gestal, Marcos; Munteanu, Cristian R; Dorado, Julian; Pazos, Alejandro

    2016-01-01

The design of experiments and the validation of the results achieved with them are vital in any research study. This paper focuses on the use of different Machine Learning approaches for regression tasks in the field of Computational Intelligence, and especially on a correct comparison between the results provided by different methods, as those techniques are complex systems that require further study to be fully understood. A methodology commonly accepted in Computational Intelligence is implemented in an R package called RRegrs. This package includes ten simple and complex regression models to carry out predictive modeling using Machine Learning and well-known regression algorithms. The framework for experimental design presented herein is evaluated and validated against RRegrs. Our results differ for three out of five state-of-the-art simple datasets, and it can be stated that the selection of the best model according to our proposal is statistically significant and relevant. It is important to use a statistical approach to indicate whether the differences are statistically significant with this kind of algorithm. Furthermore, our results with three real complex datasets report different best models than the previously published methodology. Our final goal is to provide a complete methodology for the use of different steps in order to compare the results obtained in Computational Intelligence problems, as well as in other fields, such as bioinformatics, cheminformatics, etc., given that our proposal is open and modifiable.

  9. A methodology for the design of experiments in computational intelligence with multiple regression models

    PubMed Central

    Gestal, Marcos; Munteanu, Cristian R.; Dorado, Julian; Pazos, Alejandro

    2016-01-01

The design of experiments and the validation of the results achieved with them are vital in any research study. This paper focuses on the use of different Machine Learning approaches for regression tasks in the field of Computational Intelligence, and especially on a correct comparison between the results provided by different methods, as those techniques are complex systems that require further study to be fully understood. A methodology commonly accepted in Computational Intelligence is implemented in an R package called RRegrs. This package includes ten simple and complex regression models to carry out predictive modeling using Machine Learning and well-known regression algorithms. The framework for experimental design presented herein is evaluated and validated against RRegrs. Our results differ for three out of five state-of-the-art simple datasets, and it can be stated that the selection of the best model according to our proposal is statistically significant and relevant. It is important to use a statistical approach to indicate whether the differences are statistically significant with this kind of algorithm. Furthermore, our results with three real complex datasets report different best models than the previously published methodology. Our final goal is to provide a complete methodology for the use of different steps in order to compare the results obtained in Computational Intelligence problems, as well as in other fields, such as bioinformatics, cheminformatics, etc., given that our proposal is open and modifiable. PMID:27920952

  10. A framework for medical image retrieval using machine learning and statistical similarity matching techniques with relevance feedback.

    PubMed

    Rahman, Md Mahmudur; Bhattacharya, Prabir; Desai, Bipin C

    2007-01-01

    A content-based image retrieval (CBIR) framework for diverse collection of medical images of different imaging modalities, anatomic regions with different orientations and biological systems is proposed. Organization of images in such a database (DB) is well defined with predefined semantic categories; hence, it can be useful for category-specific searching. The proposed framework consists of machine learning methods for image prefiltering, similarity matching using statistical distance measures, and a relevance feedback (RF) scheme. To narrow down the semantic gap and increase the retrieval efficiency, we investigate both supervised and unsupervised learning techniques to associate low-level global image features (e.g., color, texture, and edge) in the projected PCA-based eigenspace with their high-level semantic and visual categories. Specially, we explore the use of a probabilistic multiclass support vector machine (SVM) and fuzzy c-mean (FCM) clustering for categorization and prefiltering of images to reduce the search space. A category-specific statistical similarity matching is proposed in a finer level on the prefiltered images. To incorporate a better perception subjectivity, an RF mechanism is also added to update the query parameters dynamically and adjust the proposed matching functions. Experiments are based on a ground-truth DB consisting of 5000 diverse medical images of 20 predefined categories. Analysis of results based on cross-validation (CV) accuracy and precision-recall for image categorization and retrieval is reported. It demonstrates the improvement, effectiveness, and efficiency achieved by the proposed framework.
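
As an illustration of category-specific statistical similarity matching, a Mahalanobis-type distance under a diagonal-covariance assumption can score a query feature vector against per-category statistics (all names and numbers below are hypothetical, not the paper's actual features or categories):

```python
def mahalanobis_diag(x, mean, var):
    """Mahalanobis distance under a diagonal-covariance assumption."""
    return sum((xi - m) ** 2 / v for xi, m, v in zip(x, mean, var)) ** 0.5

# Hypothetical per-category low-level feature statistics (means, variances).
categories = {
    "xray_chest": ([0.2, 0.5, 0.1], [0.01, 0.04, 0.01]),
    "mri_brain":  ([0.6, 0.3, 0.4], [0.02, 0.02, 0.03]),
}
query = [0.25, 0.45, 0.12]  # hypothetical query-image feature vector
best = min(categories, key=lambda c: mahalanobis_diag(query, *categories[c]))
```

Dividing by the per-feature variance weights each feature by the category's spread, so a tight category tolerates only small deviations in that feature.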

  11. Twitter: A Novel Tool for Studying the Health and Social Needs of Transgender Communities

    PubMed Central

    Young, Sean D

    2015-01-01

Background Limited research has examined the health and social needs of transgender and gender nonconforming populations. Due to high levels of stigma, transgender individuals may avoid disclosing their identities to researchers, hindering this type of work. Further, researchers have traditionally relied on clinic-based sampling methods, which may mask the true heterogeneity of transgender and gender nonconforming communities. Online social networking websites present a novel platform for studying this diverse, difficult-to-reach population. Objective The objective of this study was to examine the perceived health and social needs of transgender and gender nonconforming communities by examining messages posted to the popular microblogging platform, Twitter. Methods Tweets were collected from 13 transgender-related hashtags on July 11, 2014. They were read and coded according to general themes addressed, and a content analysis was performed. Qualitative and descriptive statistics are presented. Results A total of 1135 tweets were collected. Both “positive” and “negative” events were discussed, in both personal and social contexts. Violence, discrimination, suicide, and sexual risk behavior were discussed. 34.36% (390/1135) of tweets addressed transgender-relevant current events, and 60.79% (690/1135) provided a link to a relevant news article or resource. Conclusions This study found that transgender individuals and allies use Twitter to discuss health and social needs relevant to the population. Real-time social media sites like Twitter can be used to study issues relevant to transgender communities. PMID:26082941

  12. Assessment of statistical significance and clinical relevance.

    PubMed

    Kieser, Meinhard; Friede, Tim; Gondan, Matthias

    2013-05-10

    In drug development, it is well accepted that a successful study will demonstrate not only a statistically significant result but also a clinically relevant effect size. Whereas standard hypothesis tests are used to demonstrate the former, it is less clear how the latter should be established. In the first part of this paper, we consider the responder analysis approach and study the performance of locally optimal rank tests when the outcome distribution is a mixture of responder and non-responder distributions. We find that these tests are quite sensitive to their planning assumptions and have therefore not really any advantage over standard tests such as the t-test and the Wilcoxon-Mann-Whitney test, which perform overall well and can be recommended for applications. In the second part, we present a new approach to the assessment of clinical relevance based on the so-called relative effect (or probabilistic index) and derive appropriate sample size formulae for the design of studies aiming at demonstrating both a statistically significant and clinically relevant effect. Referring to recent studies in multiple sclerosis, we discuss potential issues in the application of this approach. Copyright © 2012 John Wiley & Sons, Ltd.
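
The relative effect (probabilistic index) mentioned in the second part can be estimated directly from two samples as the normalized Wilcoxon-Mann-Whitney statistic; a minimal sketch with made-up outcome scores:

```python
def relative_effect(x, y):
    """Relative effect (probabilistic index) P(Y > X) + 0.5 * P(Y = X),
    estimated by the normalized Wilcoxon-Mann-Whitney statistic.

    A value of 0.5 means no tendency for either group to show larger
    outcomes; the distance from 0.5 quantifies the size, not merely the
    presence, of an effect.
    """
    wins = 0.0
    for xi in x:
        for yj in y:
            if yj > xi:
                wins += 1.0
            elif yj == xi:
                wins += 0.5
    return wins / (len(x) * len(y))

# Made-up outcome scores for a control and a treatment group.
control = [3, 5, 4, 6, 2]
treatment = [5, 7, 6, 8, 4]
p_hat = relative_effect(control, treatment)
```

Here p_hat is 0.82: a randomly chosen treatment outcome exceeds a randomly chosen control outcome about four times out of five, a statement about relevance that a bare p-value does not convey.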

  13. A statistical analysis of the elastic distortion and dislocation density fields in deformed crystals

    DOE PAGES

    Mohamed, Mamdouh S.; Larson, Bennett C.; Tischler, Jonathan Z.; ...

    2015-05-18

The statistical properties of the elastic distortion fields of dislocations in deforming crystals are investigated using the method of discrete dislocation dynamics to simulate dislocation structures and dislocation density evolution under tensile loading. Probability distribution functions (PDF) and pair correlation functions (PCF) of the simulated internal elastic strains and lattice rotations are generated for tensile strain levels up to 0.85%. The PDFs of simulated lattice rotation are compared with sub-micrometer resolution three-dimensional X-ray microscopy measurements of rotation magnitudes and deformation length scales in 1.0% and 2.3% compression strained Cu single crystals to explore the linkage between experiment and the theoretical analysis. The statistical properties of the deformation simulations are analyzed through determinations of the Nye and Kröner dislocation density tensors. The significance of the magnitudes and the length scales of the elastic strain and the rotation parts of dislocation density tensors are demonstrated, and their relevance to understanding the fundamental aspects of deformation is discussed.
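
As a small illustration of the first analysis step, a probability distribution function can be estimated from simulated rotation samples with a normalized histogram (synthetic Gaussian stand-in data; bin settings are arbitrary):

```python
import random

def pdf_histogram(samples, bins, lo, hi):
    """Normalized-histogram estimate of a probability density function."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in samples:
        if lo <= s < hi:
            counts[int((s - lo) / width)] += 1
    n = len(samples)
    return [c / (n * width) for c in counts]

random.seed(1)
# Synthetic lattice-rotation samples (degrees); Gaussian stand-in data.
rotations = [random.gauss(0.0, 0.1) for _ in range(10000)]
density = pdf_histogram(rotations, bins=40, lo=-0.5, hi=0.5)
area = sum(d * 0.025 for d in density)  # bin width is 0.025
```

Dividing counts by n·width (rather than just n) makes the estimate a density, so it integrates to approximately one over the binned range.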

  14. Statistical functions and relevant correlation coefficients of clearness index

    NASA Astrophysics Data System (ADS)

    Pavanello, Diego; Zaaiman, Willem; Colli, Alessandra; Heiser, John; Smith, Scott

    2015-08-01

This article presents a statistical analysis of the sky conditions, during the years 2010 to 2012, for three different locations: the Joint Research Centre site in Ispra (Italy, European Solar Test Installation - ESTI laboratories), the site of the National Renewable Energy Laboratory in Golden (Colorado, USA) and the site of Brookhaven National Laboratory in Upton (New York, USA). The key parameter is the clearness index kT, a dimensionless expression of the global irradiance impinging upon a horizontal surface at a given instant of time. In the first part, the sky conditions are characterized using daily averages, giving a general overview of the three sites. In the second part the analysis is performed using data sets with a short-term resolution of 1 sample per minute, demonstrating remarkable properties of the statistical distributions of the clearness index, reinforced by a proof using fuzzy logic methods. Subsequently, some time-dependent correlations between different meteorological variables are presented in terms of the Pearson and Spearman correlation coefficients, and a new coefficient is introduced.
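
The two standard correlation coefficients used in the final part can be computed from first principles; a compact sketch with hypothetical clearness-index and temperature series (Spearman is simply Pearson applied to ranks, with average ranks for ties):

```python
def pearson(x, y):
    """Pearson (linear) correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman (rank) correlation: Pearson correlation of the ranks."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        i = 0
        while i < len(order):
            j = i
            while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
                j += 1
            avg = (i + j) / 2.0 + 1.0  # average rank for tied values
            for k in range(i, j + 1):
                r[order[k]] = avg
            i = j + 1
        return r
    return pearson(ranks(x), ranks(y))

kt = [0.1, 0.3, 0.5, 0.7, 0.8]         # hypothetical daily clearness indices
temp = [12.0, 15.0, 19.0, 24.0, 30.0]  # hypothetical daily temperatures
pear = pearson(kt, temp)
rho = spearman(kt, temp)
```

With a monotone but nonlinear relationship, Spearman reaches exactly 1 while Pearson falls slightly short of it, which is why the two are reported together.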

  15. Parametric Analysis to Study the Influence of Aerogel-Based Renders' Components on Thermal and Mechanical Performance.

    PubMed

    Ximenes, Sofia; Silva, Ana; Soares, António; Flores-Colen, Inês; de Brito, Jorge

    2016-05-04

    Statistical models using multiple linear regression are some of the most widely used methods to study the influence of independent variables in a given phenomenon. This study's objective is to understand the influence of the various components of aerogel-based renders on their thermal and mechanical performance, namely cement (three types), fly ash, aerial lime, silica sand, expanded clay, type of aerogel, expanded cork granules, expanded perlite, air entrainers, resins (two types), and rheological agent. The statistical analysis was performed using SPSS (Statistical Package for Social Sciences), based on 85 mortar mixes produced in the laboratory and on their values of thermal conductivity and compressive strength obtained using tests in small-scale samples. The results showed that aerial lime assumes the main role in improving the thermal conductivity of the mortars. Aerogel type, fly ash, expanded perlite and air entrainers are also relevant components for a good thermal conductivity. Expanded clay can improve the mechanical behavior and aerogel has the opposite effect.
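
The underlying method, multiple linear regression fitted by ordinary least squares, can be sketched from first principles via the normal equations (the mini-dataset below is hypothetical, not the 85 laboratory mixes):

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X: list of rows (with a leading 1.0 for the intercept); y: responses.
    Solves the small system by Gaussian elimination with partial pivoting.
    """
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    for col in range(p):                       # forward elimination
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * p                           # back substitution
    for i in range(p - 1, -1, -1):
        beta[i] = (b[i] - sum(A[i][j] * beta[j] for j in range(i + 1, p))) / A[i][i]
    return beta

# Hypothetical mini-dataset: response vs. two component contents (% by mass).
X = [[1.0, x1, x2] for x1, x2 in [(10, 40), (20, 30), (30, 35), (40, 20), (50, 25)]]
y = [2.0 + 0.05 * r[1] - 0.01 * r[2] for r in X]  # exact linear relation
beta = ols(X, y)
```

Because the responses here follow an exact linear relation, the fit recovers the generating coefficients (2.0, 0.05, -0.01) up to rounding; real mortar data would of course carry residual error.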

  16. Test of the statistical model in {sup 96}Mo with the BaF{sub 2}{gamma} calorimeter DANCE array

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sheets, S. A.; Mitchell, G. E.; Agvaanluvsan, U.

    2009-02-15

The {gamma}-ray cascades following the {sup 95}Mo(n,{gamma}){sup 96}Mo reaction were studied with the {gamma} calorimeter DANCE (Detector for Advanced Neutron Capture Experiments) consisting of 160 BaF{sub 2} scintillation detectors at the Los Alamos Neutron Science Center. The {gamma}-ray energy spectra for different multiplicities were measured for s- and p-wave resonances below 2 keV. The shapes of these spectra were found to be in very good agreement with simulations using the DICEBOX statistical model code. The relevant model parameters used for the level density and photon strength functions were identical with those that provided the best fit of the data from a recent measurement of the thermal {sup 95}Mo(n,{gamma}){sup 96}Mo reaction with the two-step-cascade method. The reported results strongly suggest that the extreme statistical model works very well in the mass region near A=100.

  17. Parametric Analysis to Study the Influence of Aerogel-Based Renders’ Components on Thermal and Mechanical Performance

    PubMed Central

    Ximenes, Sofia; Silva, Ana; Soares, António; Flores-Colen, Inês; de Brito, Jorge

    2016-01-01

    Statistical models using multiple linear regression are some of the most widely used methods to study the influence of independent variables in a given phenomenon. This study’s objective is to understand the influence of the various components of aerogel-based renders on their thermal and mechanical performance, namely cement (three types), fly ash, aerial lime, silica sand, expanded clay, type of aerogel, expanded cork granules, expanded perlite, air entrainers, resins (two types), and rheological agent. The statistical analysis was performed using SPSS (Statistical Package for Social Sciences), based on 85 mortar mixes produced in the laboratory and on their values of thermal conductivity and compressive strength obtained using tests in small-scale samples. The results showed that aerial lime assumes the main role in improving the thermal conductivity of the mortars. Aerogel type, fly ash, expanded perlite and air entrainers are also relevant components for a good thermal conductivity. Expanded clay can improve the mechanical behavior and aerogel has the opposite effect. PMID:28773460

  18. Inventory of DOT Statistical Information Systems

    DOT National Transportation Integrated Search

    1983-01-01

    The inventory represents an update of relevant systems described in the Transportation Statistical Reference File (TSRF), coordinated with the GAO update of Congressional Sources and Systems, and the Information Collection Budget. The inventory compi...

  19. Study of on-board compression of earth resources data

    NASA Technical Reports Server (NTRS)

    Habibi, A.

    1975-01-01

    The current literature on image bandwidth compression was surveyed and those methods relevant to compression of multispectral imagery were selected. Typical satellite multispectral data was then analyzed statistically and the results used to select a smaller set of candidate bandwidth compression techniques particularly relevant to earth resources data. These were compared using both theoretical analysis and simulation, under various criteria of optimality such as mean square error (MSE), signal-to-noise ratio, classification accuracy, and computational complexity. By concatenating some of the most promising techniques, three multispectral data compression systems were synthesized which appear well suited to current and future NASA earth resources applications. The performance of these three recommended systems was then examined in detail by all of the above criteria. Finally, merits and deficiencies were summarized and a number of recommendations for future NASA activities in data compression proposed.

  20. [The GIPSY-RECPAM model: a versatile approach for integrated evaluation in cardiologic care].

    PubMed

    Carinci, F

    2009-01-01

Tree-structured methodology applied in the GISSI-PSICOLOGIA project, although performed in the framework of the earliest GISSI studies, represents a powerful tool to analyze different aspects of cardiologic care. The GISSI-PSICOLOGIA project has delivered a novel methodology based on the joint application of psychometric tools and sophisticated statistical techniques. Its prospective use could allow the building of effective epidemiological models relevant to the prognosis of the cardiologic patient. The various features of the RECPAM method allow versatile use in the framework of modern e-health projects. The study used the Cognitive Behavioral Assessment H Form (CBA-H) psychometric scales. The potential for its future application in Italian cardiology is considerable, and it is particularly indicated to assist the planning of systems for integrated care and routine evaluation of the cardiologic patient.

  1. Automated MRI parcellation of the frontal lobe.

    PubMed

    Ranta, Marin E; Chen, Min; Crocetti, Deana; Prince, Jerry L; Subramaniam, Krish; Fischl, Bruce; Kaufmann, Walter E; Mostofsky, Stewart H

    2014-05-01

Examination of associations between specific disorders and physical properties of functionally relevant frontal lobe sub-regions is a fundamental goal in neuropsychiatry. Here, we present and evaluate automated methods of frontal lobe parcellation with the programs FreeSurfer (FS) and TOADS-CRUISE (T-C), based on the manual method described in Ranta et al. [2009]: Psychiatry Res 172:147-154, in which sulcal-gyral landmarks were used to manually delimit functionally relevant regions within the frontal lobe: i.e., primary motor cortex, anterior cingulate, deep white matter, premotor cortex regions (supplementary motor complex, frontal eye field, and lateral premotor cortex) and prefrontal cortex (PFC) regions (medial PFC, dorsolateral PFC, inferior PFC, lateral orbitofrontal cortex [OFC] and medial OFC). Dice's coefficient, a measure of overlap, and percent volume difference were used to measure the reliability between manual and automated delineations for each frontal lobe region. For FS, the mean Dice's coefficient for all regions was 0.75 and the percent volume difference was 21.2%. For T-C, the mean Dice's coefficient was 0.77 and the mean percent volume difference for all regions was 20.2%. These results, along with a high degree of agreement between the two automated methods (mean Dice's coefficient = 0.81, percent volume difference = 12.4%) and a proof-of-principle group difference analysis that highlights the consistency and sensitivity of the automated methods, indicate that the automated methods are valid techniques for parcellation of the frontal lobe into functionally relevant sub-regions. Thus, the methodology has the potential to increase efficiency, statistical power and reproducibility for population analyses of neuropsychiatric disorders with hypothesized frontal lobe contributions. Copyright © 2013 Wiley Periodicals, Inc.
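
The two reliability measures used here are straightforward to compute from voxel label sets; a minimal sketch with hypothetical parcellations:

```python
def dice(a, b):
    """Dice's coefficient: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    return 2.0 * len(a & b) / (len(a) + len(b))

def percent_volume_difference(a, b):
    """Absolute volume difference as a percentage of the mean volume."""
    va, vb = len(set(a)), len(set(b))
    return 100.0 * abs(va - vb) / ((va + vb) / 2.0)

# Hypothetical voxel index sets for a manual and an automated parcellation.
manual = {(x, y, 0) for x in range(10) for y in range(10)}   # 100 voxels
auto = {(x, y, 0) for x in range(1, 10) for y in range(10)}  # 90 voxels
d = dice(manual, auto)
pvd = percent_volume_difference(manual, auto)
```

Dice rewards spatial overlap while percent volume difference ignores location entirely, which is why the two are reported together: a parcellation can match in volume yet sit in the wrong place.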

  2. High-speed data search

    NASA Technical Reports Server (NTRS)

    Driscoll, James N.

    1994-01-01

    The high-speed data search system developed for KSC incorporates existing and emerging information retrieval technology to help a user intelligently and rapidly locate information found in large textual databases. This technology includes: natural language input; statistical ranking of retrieved information; an artificial intelligence concept called semantics, where 'surface level' knowledge found in text is used to improve the ranking of retrieved information; and relevance feedback, where user judgements about viewed information are used to automatically modify the search for further information. Semantics and relevance feedback are features of the system which are not available commercially. The system also focuses on paragraphs of information to decide relevance, and it can be used (without modification) to intelligently search all kinds of document collections, such as collections of legal documents, medical documents, news stories, patents, and so forth. The purpose of this paper is to demonstrate the usefulness of statistical ranking, our semantic improvement, and relevance feedback.
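    Relevance feedback of the kind described, where user judgements modify the next search, is classically formalized by the Rocchio update. The sketch below is a textbook illustration, not the KSC system's actual algorithm; the toy term space and weights are invented for the demo:

```python
import numpy as np

def rocchio_update(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Classic Rocchio relevance feedback: move the query vector toward the
    mean of judged-relevant documents and away from judged-nonrelevant ones.
    (A standard textbook method; the KSC system's exact update is not
    specified in the abstract.)"""
    q = alpha * np.asarray(query, float)
    if len(relevant):
        q = q + beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q = q - gamma * np.mean(nonrelevant, axis=0)
    return np.clip(q, 0.0, None)      # negative term weights are usually dropped

# Toy term space: [shuttle, launch, weather, football]
query       = np.array([1.0, 1.0, 0.0, 0.0])
relevant    = np.array([[1.0, 1.0, 1.0, 0.0]])   # user judged this doc useful
nonrelevant = np.array([[0.0, 0.0, 0.0, 1.0]])   # judged not useful
q2 = rocchio_update(query, relevant, nonrelevant)
print(q2)   # "weather" gains weight; "football" is clipped to zero
```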

  3. RAId_aPS: MS/MS Analysis with Multiple Scoring Functions and Spectrum-Specific Statistics

    PubMed Central

    Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo

    2010-01-01

    Statistically meaningful comparison/combination of peptide identification results from various search methods is impeded by the lack of a universal statistical standard. Providing an E-value calibration protocol, we demonstrated earlier the feasibility of translating either the score or heuristic E-value reported by any method into the textbook-defined E-value, which may serve as the universal statistical standard. This protocol, although robust, may lose spectrum-specific statistics and might require a new calibration when changes in experimental setup occur. To mitigate these issues, we developed a new MS/MS search tool, RAId_aPS, that is able to provide spectrum-specific E-values for additive scoring functions. Given a selection of scoring functions out of RAId score, K-score, Hyperscore and XCorr, RAId_aPS generates the corresponding score histograms of all possible peptides using dynamic programming. Using these score histograms to assign E-values enables a calibration-free protocol for accurate significance assignment for each scoring function. RAId_aPS features four different modes: (i) compute the total number of possible peptides for a given molecular mass range, (ii) generate the score histogram given an MS/MS spectrum and a scoring function, (iii) reassign E-values for a list of candidate peptides given an MS/MS spectrum and the scoring functions chosen, and (iv) perform database searches using selected scoring functions. In modes (iii) and (iv), RAId_aPS is also capable of combining results from different scoring functions using spectrum-specific statistics. The web link is http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/raid_aps/index.html. Relevant binaries for Linux, Windows, and Mac OS X are available from the same page. PMID:21103371
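    For an additive scoring function, the exact histogram of total scores over all candidate sequences is the convolution of per-position score histograms, which is the dynamic-programming idea behind RAId_aPS. A toy sketch with invented integer residue scores (not any of the actual RAId/K-score/Hyperscore/XCorr functions):

```python
import numpy as np

def score_histogram(per_position_scores):
    """Exact histogram of total scores over all sequences, assuming an
    additive (per-position independent) integer scoring function. The
    distribution of a sum is the convolution of the per-position histograms."""
    hist = np.array([1.0])                   # one empty sequence, score 0
    for scores in per_position_scores:
        pos_hist = np.bincount(scores).astype(float)
        hist = np.convolve(hist, pos_hist)   # fold in one position's scores
    return hist                              # hist[s] = number of sequences with total s

# Three positions, two residue choices each, integer score per residue
per_pos = [np.array([0, 2]), np.array([1, 3]), np.array([0, 1])]
hist = score_histogram(per_pos)
total = hist.sum()                           # 2**3 = 8 sequences in all
tail = hist[4:].sum() / total                # fraction scoring >= 4 (the tail
print(total, tail)                           # probability behind a significance
                                             # assignment) -> 8.0 0.5
```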

  4. Source Separation of Heartbeat Sounds for Effective E-Auscultation

    NASA Astrophysics Data System (ADS)

    Geethu, R. S.; Krishnakumar, M.; Pramod, K. V.; George, Sudhish N.

    2016-03-01

    This paper proposes a cost-effective solution for improving the effectiveness of e-auscultation. Auscultation is one of the most difficult skills for a doctor, since it can be acquired only through experience. The heart sound mixtures are captured by placing four sensors at the appropriate auscultation areas on the body. These sound mixtures are separated into their relevant components by a statistical method, independent component analysis. The separated heartbeat sounds can be further processed or stored for future reference. This idea can be used to build a low-cost, easy-to-use portable instrument that will benefit people living in remote areas who are unable to take advantage of advanced diagnostic methods.
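    The separation step can be illustrated with a minimal FastICA implementation. This is a sketch only: two synthetic sources, two sensors, and an invented mixing matrix, whereas the system described uses four sensors and recorded heart sounds:

```python
import numpy as np

# Blind source separation with a minimal symmetric FastICA (cube nonlinearity).
# The sources, mixing matrix, and two-sensor setup are invented for the demo.
rng = np.random.default_rng(0)
n = 4000
t = np.linspace(0, 8, n)
s1 = np.sin(2 * np.pi * 5 * t)              # smooth tonal component (illustrative)
s2 = np.sign(np.sin(2 * np.pi * 3 * t))     # clicky component (illustrative)
S = np.vstack([s1, s2])

A = np.array([[1.0, 0.6], [0.4, 1.0]])      # unknown mixing at the two sensors
X = A @ S                                    # what the sensors record

# Center and whiten the mixtures
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(X @ X.T / n)
Z = E @ np.diag(d ** -0.5) @ E.T @ X

# Symmetric FastICA iterations with g(u) = u^3
W = rng.standard_normal((2, 2))
for _ in range(300):
    WZ = W @ Z
    W_new = (WZ ** 3) @ Z.T / n - 3 * W      # fixed-point update
    U, _, Vt = np.linalg.svd(W_new)          # symmetric decorrelation
    W = U @ Vt

Y = W @ Z   # recovered sources, up to sign and permutation
```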

  5. Parameterization of light absorption by components of seawater in optically complex coastal waters of the Crimea Peninsula (Black Sea).

    PubMed

    Dmitriev, Egor V; Khomenko, Georges; Chami, Malik; Sokolov, Anton A; Churilova, Tatyana Y; Korotaev, Gennady K

    2009-03-01

    The absorption of sunlight by oceanic constituents significantly contributes to the spectral distribution of the water-leaving radiance. Here it is shown that current parameterizations of absorption coefficients do not apply to the optically complex waters of the Crimea Peninsula. Based on in situ measurements, parameterizations of phytoplankton, nonalgal, and total particulate absorption coefficients are proposed. Their performance is evaluated using a log-log regression combined with a low-pass filter and the nonlinear least-square method. Statistical significance of the estimated parameters is verified using the bootstrap method. The parameterizations are relevant for chlorophyll a concentrations ranging from 0.45 up to 2 mg/m^3.
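    The bootstrap assessment of fitted parameters can be sketched as follows. The power-law coefficients, noise level, and sample size below are illustrative stand-ins, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for a power-law absorption parameterization
# a(chl) = A * chl**B, which is linear in log-log space.
chl = rng.uniform(0.45, 2.0, 60)                   # chlorophyll a, mg/m^3
A_true, B_true = 0.05, 0.65                        # invented coefficients
a_meas = A_true * chl**B_true * np.exp(0.05 * rng.standard_normal(60))

x, y = np.log(chl), np.log(a_meas)

def fit_slope(xs, ys):
    return np.polyfit(xs, ys, 1)[0]

# Nonparametric bootstrap: resample (x, y) pairs, refit, collect the slope B
boot = np.array([fit_slope(x[i], y[i])
                 for i in (rng.integers(0, x.size, x.size) for _ in range(2000))])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"B = {fit_slope(x, y):.3f}, 95% bootstrap CI = [{lo:.3f}, {hi:.3f}]")
```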

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bennink, Ryan S.; Ferragut, Erik M.; Humble, Travis S.

    Modeling and simulation are essential for predicting and verifying the behavior of fabricated quantum circuits, but existing simulation methods are either impractically costly or require an unrealistic simplification of error processes. In this paper, we present a method of simulating noisy Clifford circuits that is both accurate and practical in experimentally relevant regimes. In particular, the cost is weakly exponential in the size and the degree of non-Cliffordness of the circuit. Our approach is based on the construction of exact representations of quantum channels as quasiprobability distributions over stabilizer operations, which are then sampled, simulated, and weighted to yield unbiased statistical estimates of circuit outputs and other observables. As a demonstration of these techniques, we simulate a Steane [[7,1,3

  7. Integrative analysis of transcriptomic and metabolomic data via sparse canonical correlation analysis with incorporation of biological information.

    PubMed

    Safo, Sandra E; Li, Shuzhao; Long, Qi

    2018-03-01

    Integrative analysis of high dimensional omics data is becoming increasingly popular. At the same time, incorporating known functional relationships among variables in analysis of omics data has been shown to help elucidate underlying mechanisms for complex diseases. In this article, our goal is to assess association between transcriptomic and metabolomic data from a Predictive Health Institute (PHI) study that includes healthy adults at a high risk of developing cardiovascular diseases. Adopting a strategy that is both data-driven and knowledge-based, we develop statistical methods for sparse canonical correlation analysis (CCA) with incorporation of known biological information. Our proposed methods use prior network structural information among genes and among metabolites to guide selection of relevant genes and metabolites in sparse CCA, providing insight on the molecular underpinning of cardiovascular disease. Our simulations demonstrate that the structured sparse CCA methods outperform several existing sparse CCA methods in selecting relevant genes and metabolites when structural information is informative and are robust to mis-specified structural information. Our analysis of the PHI study reveals that a number of gene and metabolic pathways including some known to be associated with cardiovascular diseases are enriched in the set of genes and metabolites selected by our proposed approach. © 2017, The International Biometric Society.
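    As background for the sparse, structured variant the authors develop, plain CCA can be computed from an SVD of the whitened cross-covariance. A sketch on synthetic data with one shared latent factor; the matrix sizes and loadings are invented for the demo:

```python
import numpy as np

# Plain (non-sparse) CCA, the starting point that structured sparse CCA
# extends. Synthetic "gene" and "metabolite" matrices share one latent factor.
rng = np.random.default_rng(7)
n, p, q = 200, 10, 8
z = rng.standard_normal(n)                         # shared latent signal
X = np.outer(z, np.ones(p)) + rng.standard_normal((n, p))  # equal loadings for simplicity
Y = np.outer(z, np.ones(q)) + rng.standard_normal((n, q))

Xc, Yc = X - X.mean(0), Y - Y.mean(0)
Sxx, Syy, Sxy = Xc.T @ Xc / n, Yc.T @ Yc / n, Xc.T @ Yc / n

def inv_sqrt(S):
    """Symmetric inverse square root via an eigendecomposition."""
    d, E = np.linalg.eigh(S)
    return E @ np.diag(d ** -0.5) @ E.T

# Canonical correlations = singular values of Sxx^{-1/2} Sxy Syy^{-1/2}
corrs = np.linalg.svd(inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy), compute_uv=False)
print(corrs[0])   # first canonical correlation, driven by the shared factor
```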

  8. A bird strike handbook for base-level managers

    NASA Astrophysics Data System (ADS)

    Payson, R. P.; Vance, J. D.

    1984-09-01

    To help raise awareness of bird strikes and bird strike reduction techniques, this thesis compiled all relevant information through an extensive literature search, review of base-level documents, and personal interviews. The final product--A Bird Strike Handbook for Base-Level Managers--provides information on bird strike statistics, methods to reduce the strike hazards, and means to obtain additional assistance. The handbook is organized for use by six major base agencies: Maintenance, Civil Engineering, Operations, Air Field Management, Safety, and Air Traffic Control. An appendix concludes the handbook.

  9. Cancer Imaging Phenomics Software Suite: Application to Brain and Breast Cancer | Informatics Technology for Cancer Research (ITCR)

    Cancer.gov

    The transition of oncologic imaging from its “industrial era” to its “information era” demands analytical methods that 1) extract clinically and biologically relevant information from these data; 2) integrate imaging, clinical, and genomic data via rigorous statistical and computational methodologies in order to derive models valuable for understanding cancer mechanisms, diagnosis, prognostic assessment, response evaluation, and personalized treatment management; 3) are available to the biomedical community for easy use and application, with the aim of understanding, diagnosing, an

  10. Keyword extraction by nonextensivity measure.

    PubMed

    Mehri, Ali; Darooneh, Amir H

    2011-05-01

    The presence of a long-range correlation in the spatial distribution of a relevant word type, in spite of random occurrences of an irrelevant word type, is an important feature of human-written texts. We classify the correlation between the occurrences of words by nonextensive statistical mechanics for the word-ranking process. In particular, we look at the nonextensivity parameter as an alternative metric to measure the spatial correlation in the text, from which the words may be ranked in terms of this measure. Finally, we compare different methods for keyword extraction. © 2011 American Physical Society
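    The underlying intuition, that relevant words cluster spatially while irrelevant words spread evenly through a text, can be illustrated with a simple spread statistic on recurrence gaps. This heuristic score is only a stand-in for the paper's nonextensivity parameter; the word positions are invented:

```python
import numpy as np

def spatial_score(positions):
    """Normalized spread of the gaps between successive occurrences of a word.
    Evenly spaced occurrences give 0; bursty, clustered occurrences give a
    large value. (A common spatial-statistics heuristic; the paper instead
    uses a nonextensivity parameter in a related role.)"""
    gaps = np.diff(np.sort(np.asarray(positions, float)))
    return gaps.std() / gaps.mean()

# Invented word positions in a 1000-token text
clustered = [10, 12, 14, 16, 500, 502, 504, 990]    # bursty, keyword-like
uniform   = [0, 125, 250, 375, 500, 625, 750, 875]  # evenly spread, "the"-like
print(spatial_score(clustered), spatial_score(uniform))
```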

  11. Clopper-Pearson bounds from HEP data cuts

    NASA Astrophysics Data System (ADS)

    Berg, B. A.

    2001-08-01

    For the measurement of N_s signals in N events, rigorous confidence bounds on the true signal probability p_exact were established in a classical paper by Clopper and Pearson [Biometrika 26, 404 (1934)]. Here, their bounds are generalized to the HEP situation where cuts on the data tag signals with probability P_s and background data with likelihood P_b
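    The classical Clopper-Pearson bounds themselves can be computed exactly with nothing more than the binomial tail and bisection. A self-contained sketch of the standard two-sided bounds, not of this paper's generalization to data cuts:

```python
from math import comb

def binom_tail_ge(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def clopper_pearson(k, n, alpha=0.05, tol=1e-10):
    """Exact (conservative) Clopper-Pearson bounds on the true success
    probability, found by bisection on the monotone binomial tails; no
    special functions needed. Handles k = 0 and k = n automatically."""
    def crossing(predicate):
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if predicate(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    lower = crossing(lambda p: binom_tail_ge(n, k, p) < alpha / 2)
    upper = crossing(lambda p: binom_tail_ge(n, k + 1, p) < 1 - alpha / 2)
    return lower, upper

# 10 tagged signals out of 10 events: the lower bound solves p**10 = alpha/2
lo10, hi10 = clopper_pearson(10, 10)
print(round(lo10, 4), round(hi10, 4))   # 0.6915 1.0
```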

  12. Which Type of Risk Information to Use for Whom? Moderating Role of Outcome-Relevant Involvement in the Effects of Statistical and Exemplified Risk Information on Risk Perceptions.

    PubMed

    So, Jiyeon; Jeong, Se-Hoon; Hwang, Yoori

    2017-04-01

    The extant empirical research examining the effectiveness of statistical and exemplar-based health information is largely inconsistent. Under the premise that the inconsistency may be due to an unacknowledged moderator (O'Keefe, 2002), this study examined a moderating role of outcome-relevant involvement (Johnson & Eagly, 1989) in the effects of statistical and exemplified risk information on risk perception. Consistent with predictions based on elaboration likelihood model (Petty & Cacioppo, 1984), findings from an experiment (N = 237) concerning alcohol consumption risks showed that statistical risk information predicted risk perceptions of individuals with high, rather than low, involvement, while exemplified risk information predicted risk perceptions of those with low, rather than high, involvement. Moreover, statistical risk information contributed to negative attitude toward drinking via increased risk perception only for highly involved individuals, while exemplified risk information influenced the attitude through the same mechanism only for individuals with low involvement. Theoretical and practical implications for health risk communication are discussed.

  13. Employability and career experiences of international graduates of MSc Public Health: a mixed methods study.

    PubMed

    Buunaaisie, C; Manyara, A M; Annett, H; Bird, E L; Bray, I; Ige, J; Jones, M; Orme, J; Pilkington, P; Evans, D

    2018-05-08

    This article aims to describe the public health career experiences of international graduates of a Master of Science in Public Health (MSc PH) programme and to contribute to developing the evidence base on international public health workforce capacity development. A sequential mixed methods study was conducted between January 2017 and April 2017. Ninety-seven international graduates of one UK university's MSc PH programme were invited to take part in an online survey followed by semistructured interviews, for respondents who consented to be interviewed. We computed the descriptive statistics of the quantitative data obtained, and the qualitative data were thematically analysed. The response rate was 48.5%. Most respondents (63%) were employed by various agencies within 1 year after graduation. Others (15%) were at different stages of doctor of philosophy studies. Respondents reported enhanced roles after graduation in areas such as public health policy analysis (74%); planning, implementation and evaluation of public health interventions (74%); leadership roles (72%); and research (70%). The common perceived skills that were relevant to the respondents' present jobs were critical analysis (87%), multidisciplinary thinking (86%), demonstrating public health leadership skills (84%) and research (77%). Almost all respondents (90%) were confident in conducting research. Respondents recommended the provision of longer public health placement opportunities, elective courses on project management and advanced statistics, and 'internationalisation' of the programme's curriculum. The study has revealed the relevance of higher education in public health in developing the career prospects and skills of graduates. International graduates of this MSc PH programme were satisfied with the relevance and impact of the skills they acquired during their studies. The outcomes of this study can be used for curriculum reform.
Employers' perspectives of the capabilities of these graduates, however, need further consideration. Copyright © 2018 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.

  14. A Methodology for Anatomic Ultrasound Image Diagnostic Quality Assessment.

    PubMed

    Hemmsen, Martin Christian; Lange, Theis; Brandt, Andreas Hjelm; Nielsen, Michael Bachmann; Jensen, Jorgen Arendt

    2017-01-01

    This paper discusses the methods for the assessment of ultrasound image quality based on our experiences with evaluating new methods for anatomic imaging. It presents a methodology to ensure a fair assessment between competing imaging methods using clinically relevant evaluations. The methodology is valuable in the continuing process of method optimization and guided development of new imaging methods. It includes a three-phased study plan covering initial prototype development through clinical assessment. Recommendations to the clinical assessment protocol, software, and statistical analysis are presented. Earlier uses of the methodology have shown that it ensures validity of the assessment, as it separates the influences between developer, investigator, and assessor once a research protocol has been established. This separation reduces confounding influences on the result from the developer to properly reveal the clinical value. This paper exemplifies the methodology using recent studies of synthetic aperture sequential beamforming tissue harmonic imaging.

  15. Auxiliary Parameter MCMC for Exponential Random Graph Models

    NASA Astrophysics Data System (ADS)

    Byshkin, Maksym; Stivala, Alex; Mira, Antonietta; Krause, Rolf; Robins, Garry; Lomi, Alessandro

    2016-11-01

    Exponential random graph models (ERGMs) are a well-established family of statistical models for analyzing social networks. Computational complexity has so far limited the appeal of ERGMs for the analysis of large social networks. Efficient computational methods are highly desirable in order to extend the empirical scope of ERGMs. In this paper we report results of a research project on the development of snowball sampling methods for ERGMs. We propose an auxiliary parameter Markov chain Monte Carlo (MCMC) algorithm for sampling from the relevant probability distributions. The method is designed to decrease the number of allowed network states without worsening the mixing of the Markov chains, and suggests a new approach for the development of MCMC samplers for ERGMs. We demonstrate the method on both simulated and actual (empirical) network data and show that it reduces CPU time for parameter estimation by an order of magnitude compared to current MCMC methods.
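    The accept/reject mechanics shared by all such samplers can be sketched with a generic random-walk Metropolis-Hastings algorithm on a toy one-dimensional target; the paper's auxiliary-parameter MCMC for ERGMs is considerably more involved:

```python
import numpy as np

def metropolis(log_target, x0, n_steps, step=1.0, seed=0):
    """Generic random-walk Metropolis-Hastings sampler (illustrative only;
    not the paper's auxiliary-parameter algorithm)."""
    rng = np.random.default_rng(seed)
    x, lp = x0, log_target(x0)
    samples = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.standard_normal()
        lp_prop = log_target(prop)
        if np.log(rng.random()) < lp_prop - lp:   # accept with min(1, ratio)
            x, lp = prop, lp_prop
        samples[i] = x                             # record (possibly repeated) state
    return samples

# Toy target: standard normal, via its log density up to a constant
draws = metropolis(lambda x: -0.5 * x * x, x0=0.0, n_steps=50_000)
burned = draws[5_000:]                             # discard burn-in
print(burned.mean(), burned.var())                 # near 0 and 1
```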

  16. Visual attention based bag-of-words model for image classification

    NASA Astrophysics Data System (ADS)

    Wang, Qiwei; Wan, Shouhong; Yue, Lihua; Wang, Che

    2014-04-01

    Bag-of-words is a classical method for image classification. The core problems are how to count the frequency of the visual words and which visual words to select. In this paper, we propose a visual attention based bag-of-words model (VABOW model) for the image classification task. The VABOW model utilizes a visual attention method to generate a saliency map, and uses the saliency map as a weighting matrix to guide the counting of visual-word frequencies. In addition, the VABOW model combines shape, color, and texture cues and uses an L1-regularized logistic regression to select the most relevant and most efficient features. We compare our approach with a traditional bag-of-words based method on two datasets, and the results show that our VABOW model outperforms the state-of-the-art method for image classification.
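    The saliency-weighted counting idea can be sketched in a few lines: each patch votes for its visual word with a weight from the saliency map rather than a plain count. The word assignments and saliency values below are invented for the demo:

```python
import numpy as np

# Saliency-weighted bag-of-words histogram (a sketch of the weighting idea;
# codebook construction and feature selection are omitted).
word_ids = np.array([0, 1, 1, 2, 2, 2])               # nearest visual word per patch
saliency = np.array([0.9, 0.1, 0.2, 0.8, 0.7, 0.1])   # attention at each patch
n_words = 3

plain    = np.bincount(word_ids, minlength=n_words).astype(float)
weighted = np.bincount(word_ids, weights=saliency, minlength=n_words)

plain    /= plain.sum()      # normalize both histograms
weighted /= weighted.sum()
print(plain)     # counts only
print(weighted)  # attention shifts mass toward the salient words 0 and 2
```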

  17. Graphical method for comparative statistical study of vaccine potency tests.

    PubMed

    Pay, T W; Hingley, P J

    1984-03-01

    Producers and consumers are interested in some of the intrinsic characteristics of vaccine potency assays for the comparative evaluation of suitable experimental design. A graphical method is developed which represents the precision of test results, the sensitivity of such results to changes in dosage, and the relevance of the results in the way they reflect the protection afforded in the host species. The graphs can be constructed from Producer's scores and Consumer's scores on each of the scales of test score, antigen dose and probability of protection against disease. A method for calculating these scores is suggested and illustrated for single and multiple component vaccines, for tests which do or do not employ a standard reference preparation, and for tests which employ quantitative or quantal systems of scoring.

  18. Analysis of Nonstationary Time Series for Biological Rhythms Research.

    PubMed

    Leise, Tanya L

    2017-06-01

    This article is part of a Journal of Biological Rhythms series exploring analysis and statistics topics relevant to researchers in biological rhythms and sleep research. The goal is to provide an overview of the most common issues that arise in the analysis and interpretation of data in these fields. In this article on time series analysis for biological rhythms, we describe some methods for assessing the rhythmic properties of time series, including tests of whether a time series is indeed rhythmic. Because biological rhythms can exhibit significant fluctuations in their period, phase, and amplitude, their analysis may require methods appropriate for nonstationary time series, such as wavelet transforms, which can measure how these rhythmic parameters change over time. We illustrate these methods using simulated and real time series.
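    As a baseline for the stationary case, a dominant period can be estimated with an FFT periodogram; the article's point is that rhythms with drifting period or amplitude call for time-localized tools such as wavelets. The sampling scheme and the 24.2-hour rhythm below are invented for the demo:

```python
import numpy as np

# Dominant-period estimation with an FFT periodogram (stationary analysis).
rng = np.random.default_rng(1)
dt = 0.5                                     # sampling interval, hours
t = np.arange(0, 240, dt)                    # ten days of samples
signal = np.sin(2 * np.pi * t / 24.2) + 0.3 * rng.standard_normal(t.size)

power = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
freqs = np.fft.rfftfreq(t.size, d=dt)        # cycles per hour

peak = np.argmax(power[1:]) + 1              # skip the zero-frequency bin
period = 1.0 / freqs[peak]                   # hours per cycle
print(period)                                # near the built-in 24.2 h rhythm
```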

  19. Developing a statistically powerful measure for quartet tree inference using phylogenetic identities and Markov invariants.

    PubMed

    Sumner, Jeremy G; Taylor, Amelia; Holland, Barbara R; Jarvis, Peter D

    2017-12-01

    Recently there has been renewed interest in phylogenetic inference methods based on phylogenetic invariants, alongside the related Markov invariants. Broadly speaking, both these approaches give rise to polynomial functions of sequence site patterns that, in expectation value, either vanish for particular evolutionary trees (in the case of phylogenetic invariants) or have well-understood transformation properties (in the case of Markov invariants). While both approaches have been valued for their intrinsic mathematical interest, it is not clear how they relate to each other, and to what extent they can be used as practical tools for inference of phylogenetic trees. In this paper, by focusing on the special case of binary sequence data and quartets of taxa, we are able to view these two different polynomial-based approaches within a common framework. To motivate the discussion, we present three desirable statistical properties that we argue any invariant-based phylogenetic method should satisfy: (1) sensible behaviour under reordering of input sequences; (2) stability as the taxa evolve independently according to a Markov process; and (3) explicit dependence on the assumption of a continuous-time process. Motivated by these statistical properties, we develop and explore several new phylogenetic inference methods. In particular, we develop a statistically bias-corrected version of the Markov invariants approach which satisfies all three properties. We also extend previous work by showing that the phylogenetic invariants can be implemented in such a way as to satisfy property (3). A simulation study shows that, in comparison to other methods, our new proposed approach based on bias-corrected Markov invariants is extremely powerful for phylogenetic inference. The binary case is of particular theoretical interest as, in this case only, the Markov invariants can be expressed as linear combinations of the phylogenetic invariants.
    A wider implication of this is that, for models with more than two states (for example, DNA sequence alignments with four-state models), we find that methods which rely on phylogenetic invariants are incapable of satisfying all three of the stated statistical properties. This is because in these cases the relevant Markov invariants belong to a class of polynomials independent from the phylogenetic invariants.

  20. Replicability of time-varying connectivity patterns in large resting state fMRI samples.

    PubMed

    Abrol, Anees; Damaraju, Eswar; Miller, Robyn L; Stephen, Julia M; Claus, Eric D; Mayer, Andrew R; Calhoun, Vince D

    2017-12-01

    The past few years have seen an emergence of approaches that leverage temporal changes in whole-brain patterns of functional connectivity (the chronnectome). In this chronnectome study, we investigate the replicability of the human brain's inter-regional coupling dynamics during rest by evaluating two different dynamic functional network connectivity (dFNC) analysis frameworks using 7500 functional magnetic resonance imaging (fMRI) datasets. To quantify the extent to which the emergent functional connectivity (FC) patterns are reproducible, we characterize the temporal dynamics by deriving several summary measures across multiple large, independent age-matched samples. Reproducibility was demonstrated through the existence of basic connectivity patterns (FC states) amidst an ensemble of inter-regional connections. Furthermore, application of the methods to conservatively configured (statistically stationary, linear and Gaussian) surrogate datasets revealed that some of the studied state summary measures were indeed statistically significant and also suggested that this class of null model did not explain the fMRI data fully. This extensive testing of reproducibility of similarity statistics also suggests that the estimated FC states are robust against variation in data quality, analysis, grouping, and decomposition methods. We conclude that future investigations probing the functional and neurophysiological relevance of time-varying connectivity assume critical importance. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  1. Replicability of time-varying connectivity patterns in large resting state fMRI samples

    PubMed Central

    Abrol, Anees; Damaraju, Eswar; Miller, Robyn L.; Stephen, Julia M.; Claus, Eric D.; Mayer, Andrew R.; Calhoun, Vince D.

    2018-01-01

    The past few years have seen an emergence of approaches that leverage temporal changes in whole-brain patterns of functional connectivity (the chronnectome). In this chronnectome study, we investigate the replicability of the human brain’s inter-regional coupling dynamics during rest by evaluating two different dynamic functional network connectivity (dFNC) analysis frameworks using 7500 functional magnetic resonance imaging (fMRI) datasets. To quantify the extent to which the emergent functional connectivity (FC) patterns are reproducible, we characterize the temporal dynamics by deriving several summary measures across multiple large, independent age-matched samples. Reproducibility was demonstrated through the existence of basic connectivity patterns (FC states) amidst an ensemble of inter-regional connections. Furthermore, application of the methods to conservatively configured (statistically stationary, linear and Gaussian) surrogate datasets revealed that some of the studied state summary measures were indeed statistically significant and also suggested that this class of null model did not explain the fMRI data fully. This extensive testing of reproducibility of similarity statistics also suggests that the estimated FC states are robust against variation in data quality, analysis, grouping, and decomposition methods. We conclude that future investigations probing the functional and neurophysiological relevance of time-varying connectivity assume critical importance. PMID:28916181

  2. Statistical Modeling of Single Target Cell Encapsulation

    PubMed Central

    Moon, SangJun; Ceyhan, Elvan; Gurkan, Umut Atakan; Demirci, Utkan

    2011-01-01

    High throughput drop-on-demand systems for separation and encapsulation of individual target cells from heterogeneous mixtures of multiple cell types are an emerging method in biotechnology with broad applications in tissue engineering and regenerative medicine, genomics, and cryobiology. However, cell encapsulation in droplets is a random process that is hard to control. Statistical models can provide an understanding of the underlying processes and estimation of the relevant parameters, and enable reliable and repeatable control over the encapsulation of cells in droplets during the isolation process with a high confidence level. We have modeled and experimentally verified a microdroplet-based cell encapsulation process for various combinations of cell loading and target cell concentrations. Here, we explain theoretically and validate experimentally a model to isolate and pattern single target cells from heterogeneous mixtures without using complex peripheral systems. PMID:21814548
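    Random cell loading into droplets is commonly modeled with Poisson statistics; that is an assumption here, since the abstract does not name the distribution, and the paper's model additionally distinguishes target from non-target cells. Under that assumption, the single-cell probability is lam * exp(-lam), which peaks at lam = 1:

```python
import numpy as np

# Poisson loading model (assumed for illustration): with mean occupancy lam,
# the probability that a droplet contains exactly one cell is lam * exp(-lam).
# It is maximized at lam = 1 with value 1/e, the classic tradeoff between
# empty droplets and multi-cell droplets.
lam = np.linspace(0.01, 4.0, 400)       # mean cells per droplet
p_single = lam * np.exp(-lam)           # P(exactly one cell in a droplet)

best = lam[np.argmax(p_single)]
print(best, p_single.max())             # peak at lam = 1, probability 1/e
```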

  3. Learning physical descriptors for materials science by compressed sensing

    NASA Astrophysics Data System (ADS)

    Ghiringhelli, Luca M.; Vybiral, Jan; Ahmetcik, Emre; Ouyang, Runhai; Levchenko, Sergey V.; Draxl, Claudia; Scheffler, Matthias

    2017-02-01

    The availability of big data in materials science offers new routes for analyzing materials properties and functions and achieving scientific understanding. Finding structure in these data that is not directly visible with standard tools, and exploiting the scientific information, requires new and dedicated methodology based on approaches from statistical learning, compressed sensing, and other recent methods from applied mathematics, computer science, statistics, signal processing, and information science. In this paper, we explain and demonstrate a compressed-sensing based methodology for feature selection, specifically for discovering physical descriptors, i.e., physical parameters that describe the material and its properties of interest, and associated equations that explicitly and quantitatively describe those relevant properties. As showcase application and proof of concept, we describe how to build a physical model for the quantitative prediction of the crystal structure of binary compound semiconductors.
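    One of the simplest compressed-sensing selection procedures, orthogonal matching pursuit, already shows the descriptor-selection idea: from many candidate features, greedily pick the few that explain the response. This is an illustrative sketch, not the authors' actual method; the matrix sizes, support, and coefficients are invented:

```python
import numpy as np

# Sparse recovery with orthogonal matching pursuit (OMP).
rng = np.random.default_rng(3)
n_samples, n_features, k = 60, 100, 3

A = rng.standard_normal((n_samples, n_features))
A /= np.linalg.norm(A, axis=0)           # unit-norm candidate "descriptors"
x_true = np.zeros(n_features)
x_true[[5, 40, 77]] = [1.5, -2.0, 1.0]   # only three descriptors matter
y = A @ x_true                            # noiseless "property" measurements

support, residual = [], y.copy()
for _ in range(k):
    support.append(int(np.argmax(np.abs(A.T @ residual))))  # greedy pick
    coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
    residual = y - A[:, support] @ coef   # re-fit on support, then deflate

x_hat = np.zeros(n_features)
x_hat[support] = coef
print(sorted(support))
```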

  4. Statistical analysis of field data for aircraft warranties

    NASA Astrophysics Data System (ADS)

    Lakey, Mary J.

    Air Force and Navy maintenance data collection systems were researched to determine their scientific applicability to the warranty process. New and unique algorithms were developed to extract failure distributions, which were then used to characterize how selected families of equipment typically fail. Families of similar equipment were identified in terms of function, technology and failure patterns. Statistical analyses and applications such as goodness-of-fit tests, maximum likelihood estimation, and derivation of confidence intervals for the probability density function parameters were applied to characterize the distributions and their failure patterns. Statistical and reliability theory, with relevance to equipment design and operational failures, were also determining factors in characterizing the failure patterns of the equipment families. Inferences about the families with relevance to warranty needs were then made.

  5. Constructing and Modifying Sequence Statistics for relevent Using informR in R

    PubMed Central

    Marcum, Christopher Steven; Butts, Carter T.

    2015-01-01

    The informR package greatly simplifies the analysis of complex event histories in R by providing user-friendly tools to build sufficient statistics for the relevent package. Historically, building sufficient statistics to model event sequences (of the form a→b) using the egocentric generalization of Butts’ (2008) relational event framework for modeling social action has been cumbersome. The informR package simplifies the construction of the complex list of arrays needed by the rem() model-fitting routine for a variety of cases involving egocentric event data, multiple event types, and/or support constraints. This paper introduces these tools using examples from real data extracted from the American Time Use Survey. PMID:26185488

  6. Constructing a Reward-Related Quality of Life Statistic in Daily Life—a Proof of Concept Study Using Positive Affect

    PubMed Central

    Verhagen, Simone J. W.; Simons, Claudia J. P.; van Zelst, Catherine; Delespaul, Philippe A. E. G.

    2017-01-01

    Background: Mental healthcare needs person-tailored interventions. Experience Sampling Method (ESM) can provide daily life monitoring of personal experiences. This study aims to operationalize and test a measure of momentary reward-related Quality of Life (rQoL). Intuitively, quality of life improves by spending more time on rewarding experiences. ESM clinical interventions can use this information to coach patients to find a realistic, optimal balance of positive experiences (maximize reward) in daily life. rQoL combines the frequency of engaging in a relevant context (a ‘behavior setting’) with concurrent (positive) affect. High rQoL occurs when the most frequent behavior settings are combined with positive affect or infrequent behavior settings co-occur with low positive affect. Methods: Resampling procedures (Monte Carlo experiments) were applied to assess the reliability of rQoL using various behavior setting definitions under different sampling circumstances, for real or virtual subjects with low-, average- and high contextual variability. Furthermore, resampling was used to assess whether rQoL is a distinct concept from positive affect. Virtual ESM beep datasets were extracted from 1,058 valid ESM observations for virtual and real subjects. Results: Behavior settings defined by Who-What contextual information were most informative. Simulations of at least 100 ESM observations are needed for reliable assessment. Virtual ESM beep datasets of a real subject can be defined by Who-What-Where behavior setting combinations. Large sample sizes are necessary for reliable rQoL assessments, except for subjects with low contextual variability. rQoL is distinct from positive affect. Conclusion: rQoL is a feasible concept. Monte Carlo experiments should be used to assess the reliable implementation of an ESM statistic. Future research in ESM should assess the behavior of summary statistics under different sampling situations.
This exploration is especially relevant in clinical implementation, where often only small datasets are available. PMID:29163294

  7. Constructing a Reward-Related Quality of Life Statistic in Daily Life-a Proof of Concept Study Using Positive Affect.

    PubMed

    Verhagen, Simone J W; Simons, Claudia J P; van Zelst, Catherine; Delespaul, Philippe A E G

    2017-01-01

Background: Mental healthcare needs person-tailored interventions. Experience Sampling Method (ESM) can provide daily life monitoring of personal experiences. This study aims to operationalize and test a measure of momentary reward-related Quality of Life (rQoL). Intuitively, quality of life improves by spending more time on rewarding experiences. ESM clinical interventions can use this information to coach patients to find a realistic, optimal balance of positive experiences (maximize reward) in daily life. rQoL combines the frequency of engaging in a relevant context (a 'behavior setting') with concurrent (positive) affect. High rQoL occurs when the most frequent behavior settings are combined with positive affect or infrequent behavior settings co-occur with low positive affect. Methods: Resampling procedures (Monte Carlo experiments) were applied to assess the reliability of rQoL using various behavior setting definitions under different sampling circumstances, for real or virtual subjects with low, average, and high contextual variability. Furthermore, resampling was used to assess whether rQoL is a distinct concept from positive affect. Virtual ESM beep datasets were extracted from 1,058 valid ESM observations for virtual and real subjects. Results: Behavior settings defined by Who-What contextual information were most informative. Simulations of at least 100 ESM observations are needed for reliable assessment. Virtual ESM beep datasets of a real subject can be defined by Who-What-Where behavior setting combinations. Large sample sizes are necessary for reliable rQoL assessments, except for subjects with low contextual variability. rQoL is distinct from positive affect. Conclusion: rQoL is a feasible concept. Monte Carlo experiments should be used to assess the reliable implementation of an ESM statistic. Future research in ESM should assess the behavior of summary statistics under different sampling situations.
This exploration is especially relevant in clinical implementation, where often only small datasets are available.
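The resampling idea behind this record is easy to sketch. The following is a minimal, hypothetical illustration (not the authors' rQoL implementation, whose exact statistic is not given here): draw virtual ESM "beep" datasets of a given size from a pool of observations and watch how the spread of a summary statistic shrinks as the number of beeps grows, which is how one judges the sample size needed for a reliable ESM statistic.

```python
import random
import statistics

def resample_reliability(observations, n_beeps, n_sims=1000, seed=1):
    """Monte Carlo check of a summary statistic's stability: draw
    n_beeps observations with replacement many times and report the
    mean and spread of the statistic across simulated beep datasets."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        sample = [rng.choice(observations) for _ in range(n_beeps)]
        estimates.append(statistics.mean(sample))  # stand-in summary statistic
    return statistics.mean(estimates), statistics.stdev(estimates)

# Hypothetical positive-affect ratings (1-7 scale) from one subject's beeps
beeps = [4, 5, 3, 6, 5, 4, 4, 7, 5, 3, 4, 5, 6, 4, 5]
for n in (10, 50, 100):
    mean_est, spread = resample_reliability(beeps, n)
    print(n, round(mean_est, 2), round(spread, 3))
```

The spread at 100 beeps is markedly smaller than at 10, mirroring the paper's finding that roughly 100 observations are needed for reliable assessment.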

  8. Implementation of a standardized out-of-hospital management method for Parkinson dysphagia.

    PubMed

    Wei, Hongying; Sun, Dongxiu; Liu, Meiping

    2017-12-01

Our objective is to explore the effectiveness and feasibility of establishing a swallowing management clinic to implement out-of-hospital management for Parkinson disease (PD) patients with dysphagia. Two-hundred seventeen (217) voluntary PD patients with dysphagia in a PD outpatient clinic were divided into a control group with 100 people and an experimental group with 117 people. The control group was given dysphagia rehabilitation guidance. The experimental group received the standardized out-of-hospital management method, comprising overall management plus information and education materials. Rehabilitation efficiency and the incidence rate of dysphagia, as well as relevant complications, were compared between the two groups after a 6-month intervention. Patients treated with the standardized out-of-hospital management showed better rehabilitation efficiency and a lower incidence of dysphagia and related complications than the control group; the differences were statistically significant (p < 0.01). Establishing a swallowing management protocol in the outpatient setting can effectively support recovery of swallowing function, reduce the incidence of dysphagia complications and improve the quality of life in patients with PD.
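A two-group comparison of incidence rates like this is typically a Pearson chi-square test on a 2x2 table. A minimal sketch with hypothetical counts (the record does not give the raw complication counts):

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for the
    2x2 table [[a, b], [c, d]]; returns (statistic, p_value)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # chi2(1) is the square of a standard normal, so its survival
    # function is erfc(sqrt(x/2))
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p

# Hypothetical: complications in 15/117 managed vs 40/100 control patients
stat, p = chi2_2x2(15, 102, 40, 60)
print(round(stat, 2), p < 0.01)
```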

  9. Stabilized FE simulation of prototype thermal-hydraulics problems with integrated adjoint-based capabilities

    NASA Astrophysics Data System (ADS)

    Shadid, J. N.; Smith, T. M.; Cyr, E. C.; Wildey, T. M.; Pawlowski, R. P.

    2016-09-01

A critical aspect of applying modern computational solution methods to complex multiphysics systems of relevance to nuclear reactor modeling is the assessment of the predictive capability of specific proposed mathematical models. In this respect the understanding of numerical error, the sensitivity of the solution to parameters associated with input data, boundary condition uncertainty, and mathematical models is critical. Additionally, the ability to evaluate and/or approximate the model efficiently, to allow development of a reasonable level of statistical diagnostics of the mathematical model and the physical system, is of central importance. In this study we report on initial efforts to apply integrated adjoint-based computational analysis and automatic differentiation tools to begin to address these issues. The study is carried out in the context of a Reynolds averaged Navier-Stokes approximation to turbulent fluid flow and heat transfer using a particular spatial discretization based on implicit fully-coupled stabilized FE methods. Initial results are presented that show the promise of these computational techniques in the context of nuclear reactor relevant prototype thermal-hydraulics problems.
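The appeal of adjoint-based analysis is that the sensitivity of one scalar output to many parameters costs a single extra (adjoint) solve. A toy sketch, far removed from a stabilized FE Navier-Stokes solver but showing the mechanism: for a parameterized linear system A(p) u = b and quantity of interest J = g . u, solving the adjoint system A^T lam = g gives dJ/dp_i = -lam^T (dA/dp_i) u for every parameter at once. The 2x2 matrix and data below are invented for illustration.

```python
def solve2(A, b):
    """Solve a 2x2 linear system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - b[1] * A[0][1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def adjoint_sensitivities(p1, p2, b=(1.0, 2.0), g=(1.0, 1.0)):
    """For A(p) u = b with A = [[p1, 1], [1, p2]] and QoI J = g . u,
    one adjoint solve A^T lam = g yields all parameter sensitivities:
    dJ/dp_i = -lam^T (dA/dp_i) u."""
    A = [[p1, 1.0], [1.0, p2]]
    u = solve2(A, list(b))
    At = [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]  # transpose
    lam = solve2(At, list(g))
    # dA/dp1 = [[1, 0], [0, 0]] and dA/dp2 = [[0, 0], [0, 1]]
    return -lam[0] * u[0], -lam[1] * u[1]

print(adjoint_sensitivities(3.0, 4.0))
```

A finite-difference check on J confirms both sensitivities, which is the standard sanity test for an adjoint implementation.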

  10. Forensic Science Research and Development at the National Institute of Justice: Opportunities in Applied Physics

    NASA Astrophysics Data System (ADS)

    Dutton, Gregory

Forensic science is a collection of applied disciplines that draws from all branches of science. A key question in forensic analysis is: to what degree do a piece of evidence and a known reference sample share characteristics? Quantification of similarity, estimation of uncertainty, and determination of relevant population statistics are of current concern. A 2016 PCAST report questioned the foundational validity and the validity in practice of several forensic disciplines, including latent fingerprints, firearms comparisons and DNA mixture interpretation. One recommendation was the advancement of objective, automated comparison methods based on image analysis and machine learning. These concerns parallel the National Institute of Justice's ongoing R&D investments in applied chemistry, biology and physics. NIJ maintains a funding program spanning fundamental research with potential for forensic application to the validation of novel instruments and methods. Since 2009, NIJ has funded over $179M in external research to support the advancement of accuracy, validity and efficiency in the forensic sciences. An overview of NIJ's programs will be presented, with examples of relevant projects from fluid dynamics, 3D imaging, acoustics, and materials science.

  11. Stabilized FE simulation of prototype thermal-hydraulics problems with integrated adjoint-based capabilities

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shadid, J.N., E-mail: jnshadi@sandia.gov; Department of Mathematics and Statistics, University of New Mexico; Smith, T.M.

A critical aspect of applying modern computational solution methods to complex multiphysics systems of relevance to nuclear reactor modeling is the assessment of the predictive capability of specific proposed mathematical models. In this respect the understanding of numerical error, the sensitivity of the solution to parameters associated with input data, boundary condition uncertainty, and mathematical models is critical. Additionally, the ability to evaluate and/or approximate the model efficiently, to allow development of a reasonable level of statistical diagnostics of the mathematical model and the physical system, is of central importance. In this study we report on initial efforts to apply integrated adjoint-based computational analysis and automatic differentiation tools to begin to address these issues. The study is carried out in the context of a Reynolds averaged Navier–Stokes approximation to turbulent fluid flow and heat transfer using a particular spatial discretization based on implicit fully-coupled stabilized FE methods. Initial results are presented that show the promise of these computational techniques in the context of nuclear reactor relevant prototype thermal-hydraulics problems.

  12. Stabilized FE simulation of prototype thermal-hydraulics problems with integrated adjoint-based capabilities

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shadid, J. N.; Smith, T. M.; Cyr, E. C.

A critical aspect of applying modern computational solution methods to complex multiphysics systems of relevance to nuclear reactor modeling is the assessment of the predictive capability of specific proposed mathematical models. The understanding of numerical error, the sensitivity of the solution to parameters associated with input data, boundary condition uncertainty, and mathematical models is critical. Additionally, the ability to evaluate and/or approximate the model efficiently, to allow development of a reasonable level of statistical diagnostics of the mathematical model and the physical system, is of central importance. In our study we report on initial efforts to apply integrated adjoint-based computational analysis and automatic differentiation tools to begin to address these issues. The study is carried out in the context of a Reynolds averaged Navier–Stokes approximation to turbulent fluid flow and heat transfer using a particular spatial discretization based on implicit fully-coupled stabilized FE methods. We present the initial results that show the promise of these computational techniques in the context of nuclear reactor relevant prototype thermal-hydraulics problems.

  13. Stabilized FE simulation of prototype thermal-hydraulics problems with integrated adjoint-based capabilities

    DOE PAGES

    Shadid, J. N.; Smith, T. M.; Cyr, E. C.; ...

    2016-05-20

A critical aspect of applying modern computational solution methods to complex multiphysics systems of relevance to nuclear reactor modeling is the assessment of the predictive capability of specific proposed mathematical models. The understanding of numerical error, the sensitivity of the solution to parameters associated with input data, boundary condition uncertainty, and mathematical models is critical. Additionally, the ability to evaluate and/or approximate the model efficiently, to allow development of a reasonable level of statistical diagnostics of the mathematical model and the physical system, is of central importance. In our study we report on initial efforts to apply integrated adjoint-based computational analysis and automatic differentiation tools to begin to address these issues. The study is carried out in the context of a Reynolds averaged Navier–Stokes approximation to turbulent fluid flow and heat transfer using a particular spatial discretization based on implicit fully-coupled stabilized FE methods. We present the initial results that show the promise of these computational techniques in the context of nuclear reactor relevant prototype thermal-hydraulics problems.

  14. Chemometrics-based process analytical technology (PAT) tools: applications and adaptation in pharmaceutical and biopharmaceutical industries.

    PubMed

    Challa, Shruthi; Potumarthi, Ravichandra

    2013-01-01

Process analytical technology (PAT) is used to monitor and control critical process parameters in raw materials and in-process products to maintain the critical quality attributes and build quality into the product. Process analytical technology can be successfully implemented in pharmaceutical and biopharmaceutical industries not only to impart quality into the products but also to prevent out-of-specification results and improve productivity. PAT implementation eliminates the drawbacks of traditional methods, which involve excessive sampling, and facilitates rapid, non-destructive testing through direct sampling. However, to successfully adapt PAT tools into pharmaceutical and biopharmaceutical environments, thorough understanding of the process is needed along with mathematical and statistical tools to analyze the large multidimensional spectral data generated by PAT tools. Chemometrics is a chemical discipline which incorporates both statistical and mathematical methods to obtain and analyze relevant information from PAT spectral tools. Applications of commonly used PAT tools in combination with appropriate chemometric methods, along with their advantages and working principles, are discussed. Finally, systematic application of PAT tools in the biopharmaceutical environment to control critical process parameters for achieving product quality is diagrammatically represented.

  15. Networks and the Epidemiology of Infectious Disease

    PubMed Central

    Danon, Leon; Ford, Ashley P.; House, Thomas; Jewell, Chris P.; Keeling, Matt J.; Roberts, Gareth O.; Ross, Joshua V.; Vernon, Matthew C.

    2011-01-01

    The science of networks has revolutionised research into the dynamics of interacting elements. It could be argued that epidemiology in particular has embraced the potential of network theory more than any other discipline. Here we review the growing body of research concerning the spread of infectious diseases on networks, focusing on the interplay between network theory and epidemiology. The review is split into four main sections, which examine: the types of network relevant to epidemiology; the multitude of ways these networks can be characterised; the statistical methods that can be applied to infer the epidemiological parameters on a realised network; and finally simulation and analytical methods to determine epidemic dynamics on a given network. Given the breadth of areas covered and the ever-expanding number of publications, a comprehensive review of all work is impossible. Instead, we provide a personalised overview into the areas of network epidemiology that have seen the greatest progress in recent years or have the greatest potential to provide novel insights. As such, considerable importance is placed on analytical approaches and statistical methods which are both rapidly expanding fields. Throughout this review we restrict our attention to epidemiological issues. PMID:21437001
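The simulation methods the review covers can be illustrated with a minimal discrete-time SIR epidemic on a network, here a hypothetical ring graph (not taken from the review itself):

```python
import random

def sir_on_network(adj, beta, gamma, seed_node=0, rng_seed=7):
    """Discrete-time SIR epidemic on a network given as an adjacency
    dict {node: [neighbours]}. Each step, every infectious node infects
    each susceptible neighbour with probability beta and recovers with
    probability gamma. Returns the final number recovered."""
    rng = random.Random(rng_seed)
    state = {v: "S" for v in adj}
    state[seed_node] = "I"
    while any(s == "I" for s in state.values()):
        new_state = dict(state)
        for v, s in state.items():
            if s == "I":
                for w in adj[v]:
                    if state[w] == "S" and rng.random() < beta:
                        new_state[w] = "I"
                if rng.random() < gamma:
                    new_state[v] = "R"
        state = new_state
    return sum(1 for s in state.values() if s == "R")

# Hypothetical example: a ring of 20 nodes, each linked to two neighbours
ring = {i: [(i - 1) % 20, (i + 1) % 20] for i in range(20)}
print(sir_on_network(ring, beta=0.8, gamma=0.2))
```

Swapping the ring for other adjacency structures (lattices, random graphs, contact networks) is exactly the kind of comparison the network-epidemiology literature makes.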

  16. THE CAUSAL ANALYSIS / DIAGNOSIS DECISION ...

    EPA Pesticide Factsheets

    CADDIS is an on-line decision support system that helps investigators in the regions, states and tribes find, access, organize, use and share information to produce causal evaluations in aquatic systems. It is based on the US EPA's Stressor Identification process which is a formal method for identifying causes of impairments in aquatic systems. CADDIS 2007 increases access to relevant information useful for causal analysis and provides methods and tools that practitioners can use to analyze their own data. The new Candidate Cause section provides overviews of commonly encountered causes of impairments to aquatic systems: metals, sediments, nutrients, flow alteration, temperature, ionic strength, and low dissolved oxygen. CADDIS includes new Conceptual Models that illustrate the relationships from sources to stressors to biological effects. An Interactive Conceptual Model for phosphorus links the diagram with supporting literature citations. The new Analyzing Data section helps practitioners analyze their data sets and interpret and use those results as evidence within the USEPA causal assessment process. Downloadable tools include a graphical user interface statistical package (CADStat), and programs for use with the freeware R statistical package, and a Microsoft Excel template. These tools can be used to quantify associations between causes and biological impairments using innovative methods such as species-sensitivity distributions, biological inferenc
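One of the methods named above, the species-sensitivity distribution (SSD), can be sketched compactly: fit a log-normal distribution to species-level toxicity endpoints and read off the HC5, the concentration expected to protect 95% of species. The endpoint values below are invented for illustration and are not from CADDIS or CADStat.

```python
import math
import statistics

def hc5(toxicity_values):
    """Species-sensitivity distribution sketch: fit a log-normal to
    species-level toxicity endpoints and return the HC5, the hazardous
    concentration for 5% of species."""
    logs = [math.log10(v) for v in toxicity_values]
    mu, sigma = statistics.fmean(logs), statistics.stdev(logs)
    z05 = statistics.NormalDist().inv_cdf(0.05)  # about -1.645
    return 10 ** (mu + z05 * sigma)

# Hypothetical EC50 values (ug/L) for eight aquatic species
ec50s = [12.0, 25.0, 40.0, 55.0, 80.0, 120.0, 200.0, 310.0]
print(round(hc5(ec50s), 1))
```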

  17. Trends and associated uncertainty in the global mean temperature record

    NASA Astrophysics Data System (ADS)

    Poppick, A. N.; Moyer, E. J.; Stein, M.

    2016-12-01

    Physical models suggest that the Earth's mean temperature warms in response to changing CO2 concentrations (and hence increased radiative forcing); given physical uncertainties in this relationship, the historical temperature record is a source of empirical information about global warming. A persistent thread in many analyses of the historical temperature record, however, is the reliance on methods that appear to deemphasize both physical and statistical assumptions. Examples include regression models that treat time rather than radiative forcing as the relevant covariate, and time series methods that account for natural variability in nonparametric rather than parametric ways. We show here that methods that deemphasize assumptions can limit the scope of analysis and can lead to misleading inferences, particularly in the setting considered where the data record is relatively short and the scale of temporal correlation is relatively long. A proposed model that is simple but physically informed provides a more reliable estimate of trends and allows a broader array of questions to be addressed. In accounting for uncertainty, we also illustrate how parametric statistical models that are attuned to the important characteristics of natural variability can be more reliable than ostensibly more flexible approaches.
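The contrast the authors draw, regressing temperature on radiative forcing rather than on time, is easy to illustrate. A minimal least-squares sketch with made-up numbers (the slope is then a climate-sensitivity-like quantity in K per W m^-2, not a trend in K per year):

```python
import statistics

def ols_slope(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical illustration: anomalies regressed on forcing, not on time
forcing = [0.5, 0.8, 1.1, 1.5, 1.9, 2.3]        # W m^-2
anomaly = [0.10, 0.22, 0.31, 0.45, 0.55, 0.70]  # K
sens, intercept = ols_slope(forcing, anomaly)
print(round(sens, 3))  # K per (W m^-2)
```

A fuller treatment along the record's lines would add a parametric noise model (e.g. AR(1)) for the residuals rather than treating them as independent.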

  18. In pursuit of a science of agriculture: the role of statistics in field experiments.

    PubMed

    Parolini, Giuditta

    2015-09-01

Since the beginning of the twentieth century statistics has reshaped the experimental cultures of agricultural research, taking part in the subtle dialectic between the epistemic and the material that is proper to experimental systems. This transformation has become especially relevant in field trials, and the paper will examine the British agricultural institution, Rothamsted Experimental Station, where statistical methods nowadays popular in the planning and analysis of field experiments were developed in the 1920s. At Rothamsted statistics promoted randomisation over systematic arrangements, factorisation over one-question trials, and emphasised the importance of the experimental error in assessing field trials. These changes in methodology also transformed the material culture of agricultural science, and a new body, the Field Plots Committee, was created to manage the field research of the agricultural institution. Although successful, the vision of field experimentation proposed by the Rothamsted statisticians was not unproblematic. Experimental scientists closely linked to the farming community questioned it in favour of a field research that could be more easily understood by farmers. The clash between the two agendas reveals how the role attributed to statistics in field experimentation defined different pursuits of agricultural research, alternately conceived of as a scientists' science or as a farmers' science.

  19. BTS guide to good statistical practice

    DOT National Transportation Integrated Search

    2002-09-01

Quality of data has many faces. Primarily, it has to be relevant to its users. Relevance is an outcome that is achieved through a series of steps, starting with a planning process that links user needs to data requirements. It continues through acq...

  20. Electroencephalogram Signal Classification for Automated Epileptic Seizure Detection Using Genetic Algorithm

    PubMed Central

    Nanthini, B. Suguna; Santhi, B.

    2017-01-01

Background: Epilepsy occurs when repeated seizures arise in the brain. The electroencephalogram (EEG) test provides valuable information about brain function and can be useful to detect brain disorders, especially epilepsy. In this study, an automated seizure detection model is introduced. Materials and Methods: The EEG signals are decomposed into sub-bands by discrete wavelet transform using the db2 (Daubechies) wavelet. Sixteen features (eight statistical features, four gray-level co-occurrence matrix features, and Renyi entropy estimates with four different orders) are extracted from the raw EEG and its sub-bands. A genetic algorithm (GA) is used to select eight relevant features from the 16-dimensional feature set. The model has been trained and tested using a support vector machine (SVM) classifier for EEG signals. The performance of the SVM classifier is evaluated on two different databases. Results: The study was conducted through two different analyses and achieved satisfactory performance for automated seizure detection using the relevant features as input to the SVM classifier. Conclusion: Relevant features selected by the GA give better accuracy for seizure detection. PMID:28781480
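GA-based feature selection like the above can be sketched in a few lines. This is a toy illustration, not the paper's pipeline: the "fitness" below is a stand-in for cross-validated SVM accuracy, namely the summed usefulness scores (hypothetical numbers) of the selected features.

```python
import random

def ga_select(weights, k, pop=30, gens=40, seed=3):
    """Toy genetic algorithm choosing k of len(weights) features.
    Fitness is a stand-in for classifier accuracy: the sum of the
    (hypothetical) usefulness weights of the selected features."""
    rng = random.Random(seed)
    n = len(weights)

    def fitness(mask):
        if sum(mask) != k:            # enforce exactly k features
            return float("-inf")
        return sum(w for w, m in zip(weights, mask) if m)

    def random_mask():
        idx = rng.sample(range(n), k)
        return tuple(1 if i in idx else 0 for i in range(n))

    population = [random_mask() for _ in range(pop)]
    for _ in range(gens):
        # tournament selection, one-point crossover, point mutation
        parents = [max(rng.sample(population, 3), key=fitness)
                   for _ in range(pop)]
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            cut = rng.randrange(1, n)
            child = list(a[:cut] + b[cut:])
            i = rng.randrange(n)
            child[i] = 1 - child[i]
            children.append(tuple(child))
        population = parents + children
    return max(population, key=fitness)

# 16 hypothetical feature scores; ask the GA for the 8 most useful
scores = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4,
          0.85, 0.15, 0.75, 0.25, 0.65, 0.35, 0.55, 0.45]
best = ga_select(scores, k=8)
print(best, sum(best))
```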

  1. Binding specificity of anti-HNK-1 IgM M-protein in anti-MAG neuropathy: possible clinical relevance.

    PubMed

    Hamada, Yukihiro; Hirano, Makito; Kuwahara, Motoi; Samukawa, Makoto; Takada, Kazuo; Morise, Jyoji; Yabuno, Keiko; Oka, Shogo; Kusunoki, Susumu

    2015-02-01

Anti-myelin-associated-glycoprotein (MAG) neuropathy is an intractable autoimmune polyneuropathy. The antigenic region of MAG is the human natural killer-1 (HNK-1) carbohydrate. We and others previously suggested, though without statistical analysis, that the extension of antibody reactivities to HNK-1-bearing proteins other than MAG was associated with treatment resistance. In this study, we established an ELISA method with recombinant proteins to test the binding specificities of currently available monoclonal antibodies to MAG and another HNK-1-bearing protein, phosphacan. Using this system, we found distinct binding specificities of anti-MAG antibodies in 19 patients with anti-MAG neuropathy. Their clinical relevance was then determined retrospectively with the adjusted 10-point INCAT disability score (0 = normal and 10 = highly disabled). The results showed that strong reactivities of anti-MAG antibodies to phosphacan were significantly associated with treatment resistance or progressive clinical courses, indicating a possible clinical relevance of the binding specificities. Copyright © 2014 Elsevier Ireland Ltd and the Japan Neuroscience Society. All rights reserved.

  2. Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline

    PubMed Central

    2013-01-01

Background As high-throughput genomic technologies become accurate and affordable, an increasing number of data sets have been accumulated in the public domain and genomic information integration and meta-analysis have become routine in biomedical research. In this paper, we focus on microarray meta-analysis, where multiple microarray studies with relevant biological hypotheses are combined in order to improve candidate marker detection. Many methods have been developed and applied in the literature, but their performance and properties have only been minimally investigated. There is currently no clear conclusion or guideline as to the proper choice of a meta-analysis method given an application; the decision essentially requires both statistical and biological considerations. Results We applied 12 microarray meta-analysis methods for combining multiple simulated expression profiles; these methods can be categorized by hypothesis setting: (1) HS(A): DE genes with non-zero effect sizes in all studies, (2) HS(B): DE genes with non-zero effect sizes in one or more studies and (3) HS(r): DE genes with non-zero effect in the "majority" of studies. We then performed a comprehensive comparative analysis through six large-scale real applications using four quantitative statistical evaluation criteria: detection capability, biological association, stability and robustness. We elucidated the hypothesis settings behind the methods and further applied multi-dimensional scaling (MDS) and an entropy measure to characterize the meta-analysis methods and data structure, respectively. Conclusions The aggregated results from the simulation study categorized the 12 methods into three hypothesis settings (HS(A), HS(B), and HS(r)). Evaluation in real data and results from the MDS and entropy analyses provided an insightful and practical guideline to the choice of the most suitable method in a given application.
All source files for simulation and real data are available on the author’s publication website. PMID:24359104

  3. Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline.

    PubMed

    Chang, Lun-Ching; Lin, Hui-Min; Sibille, Etienne; Tseng, George C

    2013-12-21

As high-throughput genomic technologies become accurate and affordable, an increasing number of data sets have been accumulated in the public domain and genomic information integration and meta-analysis have become routine in biomedical research. In this paper, we focus on microarray meta-analysis, where multiple microarray studies with relevant biological hypotheses are combined in order to improve candidate marker detection. Many methods have been developed and applied in the literature, but their performance and properties have only been minimally investigated. There is currently no clear conclusion or guideline as to the proper choice of a meta-analysis method given an application; the decision essentially requires both statistical and biological considerations. We applied 12 microarray meta-analysis methods for combining multiple simulated expression profiles; these methods can be categorized by hypothesis setting: (1) HS(A): DE genes with non-zero effect sizes in all studies, (2) HS(B): DE genes with non-zero effect sizes in one or more studies and (3) HS(r): DE genes with non-zero effect in the "majority" of studies. We then performed a comprehensive comparative analysis through six large-scale real applications using four quantitative statistical evaluation criteria: detection capability, biological association, stability and robustness. We elucidated the hypothesis settings behind the methods and further applied multi-dimensional scaling (MDS) and an entropy measure to characterize the meta-analysis methods and data structure, respectively. The aggregated results from the simulation study categorized the 12 methods into three hypothesis settings (HS(A), HS(B), and HS(r)). Evaluation in real data and results from the MDS and entropy analyses provided an insightful and practical guideline to the choice of the most suitable method in a given application.
All source files for simulation and real data are available on the author's publication website.
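Two classical p-value combiners illustrate how the hypothesis settings differ; this is a generic sketch, not taken from the paper's 12-method list. Fisher's method reacts strongly when even one study carries signal (in the spirit of HS(B)), while Stouffer's method rewards consistent moderate evidence across all studies (closer to HS(A)).

```python
import math
from statistics import NormalDist

def fisher_combine(pvals):
    """Fisher's method: X = -2*sum(ln p) ~ chi-square with 2k df under
    the global null; sensitive when any one study carries signal."""
    x = -2.0 * sum(math.log(p) for p in pvals)
    k = len(pvals)
    half = x / 2.0
    # the chi-square survival function has a closed form for even df = 2k
    p_comb = math.exp(-half) * sum(half ** i / math.factorial(i)
                                   for i in range(k))
    return x, p_comb

def stouffer_combine(pvals):
    """Stouffer's method: average the normal quantiles of the p-values;
    rewards consistent moderate evidence across studies."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1.0 - p) for p in pvals) / math.sqrt(len(pvals))
    return z, 1.0 - nd.cdf(z)

one_strong = [0.001, 0.5, 0.6, 0.55]   # signal concentrated in one study
consistent = [0.06, 0.07, 0.08, 0.06]  # moderate signal in every study
print(fisher_combine(one_strong)[1], stouffer_combine(one_strong)[1])
print(fisher_combine(consistent)[1], stouffer_combine(consistent)[1])
```

On the first set Fisher gives the smaller combined p-value; on the second Stouffer does, which is exactly the setting-dependence the guideline paper is about.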

  4. XCOM intrinsic dimensionality for low-Z elements at diagnostic energies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bornefalk, Hans

    2012-02-15

Purpose: To determine the intrinsic dimensionality of linear attenuation coefficients (LACs) from XCOM for elements with low atomic number (Z = 1-20) at diagnostic x-ray energies (25-120 keV). H_0^q, the hypothesis that the space of LACs is spanned by q bases, is tested for various q-values. Methods: Principal component analysis is first applied and the LACs are projected onto the first q principal component bases. The residuals of the model values vs XCOM data are determined for all energies and atomic numbers. Heteroscedasticity invalidates the prerequisite of i.i.d. errors necessary for bootstrapping residuals. Instead wild bootstrap is applied, which, by not mixing residuals, allows the effect of the non-i.i.d. residuals to be reflected in the result. Credible regions for the eigenvalues of the correlation matrix for the bootstrapped LAC data are determined. If subsequent credible regions for the eigenvalues overlap, the corresponding principal component is not considered to represent true data structure but noise. If this happens for eigenvalues l and l + 1, for any l ≤ q, H_0^q is rejected. Results: The largest value of q for which H_0^q is nonrejectable at the 5%-level is q = 4. This indicates that the statistically significant intrinsic dimensionality of low-Z XCOM data at diagnostic energies is four. Conclusions: The method presented allows determination of the statistically significant dimensionality of any noisy linear subspace. Knowledge of such significant dimensionality is of interest for any method making assumptions on intrinsic dimensionality and evaluating results on noisy reference data. For LACs, knowledge of the low-Z dimensionality might be relevant when parametrization schemes are tuned to XCOM data. For x-ray imaging techniques based on the basis decomposition method (Alvarez and Macovski, Phys. Med. Biol. 21, 733-744, 1976), an underlying dimensionality of two is commonly assigned to the LAC of human tissue at diagnostic energies. The finding of a higher statistically significant dimensionality thus raises the question whether a higher assumed model dimensionality (now feasible with the advent of multibin x-ray systems) might also be practically relevant, i.e., if better tissue characterization results can be obtained.
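The key device in this record is the wild bootstrap, which keeps each residual attached to its own data point and perturbs only its sign, so heteroscedastic error structure is preserved rather than mixed across points. A minimal sketch of that resampling step on a straight-line fit (not the record's full PCA-eigenvalue pipeline; data are invented, with noise growing along x):

```python
import random
import statistics

def fit_line(x, y):
    """Least-squares slope and intercept of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return b, my - b * mx

def wild_bootstrap_slopes(x, y, n_boot=2000, seed=11):
    """Wild bootstrap: regenerate each point from the fitted value plus
    its own residual times a random Rademacher sign, then refit.
    Returns an approximate 95% interval for the slope."""
    rng = random.Random(seed)
    b, a = fit_line(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    slopes = []
    for _ in range(n_boot):
        y_star = [a + b * xi + ri * rng.choice((-1.0, 1.0))
                  for xi, ri in zip(x, resid)]
        slopes.append(fit_line(x, y_star)[0])
    slopes.sort()
    return slopes[int(0.025 * n_boot)], slopes[int(0.975 * n_boot)]

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 1.9, 3.3, 3.8, 5.4, 5.7, 7.5, 7.6]  # hypothetical, noisier at large x
lo, hi = wild_bootstrap_slopes(x, y)
print(round(lo, 2), round(hi, 2))
```

In the record the same resampling idea is applied to PCA residuals, and credible regions for the correlation-matrix eigenvalues take the place of this slope interval.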

  5. On the Way to 2020: Data for Vocational Education and Training Policies. Country Statistical Overviews. Update 2013

    ERIC Educational Resources Information Center

    Cedefop - European Centre for the Development of Vocational Training, 2014

    2014-01-01

    This report provides an updated statistical overview of vocational education and training (VET) and lifelong learning in European countries. These country statistical snapshots illustrate progress on indicators selected for their policy relevance and contribution to Europe 2020 objectives. The indicators take 2010 as the baseline year and present…

  6. An open source Bayesian Monte Carlo isotope mixing model with applications in Earth surface processes

    NASA Astrophysics Data System (ADS)

    Arendt, Carli A.; Aciego, Sarah M.; Hetland, Eric A.

    2015-05-01

    The implementation of isotopic tracers as constraints on source contributions has become increasingly relevant to understanding Earth surface processes. Interpretation of these isotopic tracers has become more accessible with the development of Bayesian Monte Carlo (BMC) mixing models, which allow uncertainty in mixing end-members and provide methodology for systems with multicomponent mixing. This study presents an open source multiple isotope BMC mixing model that is applicable to Earth surface environments with sources exhibiting distinct end-member isotopic signatures. Our model is first applied to new δ18O and δD measurements from the Athabasca Glacier, which showed expected seasonal melt evolution trends and vigorously assessed the statistical relevance of the resulting fraction estimations. To highlight the broad applicability of our model to a variety of Earth surface environments and relevant isotopic systems, we expand our model to two additional case studies: deriving melt sources from δ18O, δD, and 222Rn measurements of Greenland Ice Sheet bulk water samples and assessing nutrient sources from ɛNd and 87Sr/86Sr measurements of Hawaiian soil cores. The model produces results for the Greenland Ice Sheet and Hawaiian soil data sets that are consistent with the originally published fractional contribution estimates. The advantage of this method is that it quantifies the error induced by variability in the end-member compositions, unrealized by the models previously applied to the above case studies. Results from all three case studies demonstrate the broad applicability of this statistical BMC isotopic mixing model for estimating source contribution fractions in a variety of Earth surface systems.
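The core of such a Bayesian Monte Carlo mixing model is simple to sketch for two end-members and one tracer: draw the uncertain end-member compositions from priors, solve the mixing equation for the source fraction, and keep only physically valid draws. The isotope values below are hypothetical, not from the Athabasca or Greenland datasets.

```python
import random
import statistics

def bmc_mixing(sample_value, em_a, em_b, n=20000, seed=5):
    """Bayesian Monte Carlo two-end-member mixing: draw end-member
    tracer values from normal priors (mean, sd), solve
    sample = f*A + (1-f)*B for f, and keep physically valid draws."""
    rng = random.Random(seed)
    fractions = []
    for _ in range(n):
        a = rng.gauss(*em_a)
        b = rng.gauss(*em_b)
        if a == b:
            continue
        f = (sample_value - b) / (a - b)
        if 0.0 <= f <= 1.0:           # reject unphysical mixtures
            fractions.append(f)
    return statistics.mean(fractions), statistics.stdev(fractions)

# Hypothetical d18O: snow melt (-22 +/- 1) vs ice melt (-18 +/- 0.5)
mean_f, sd_f = bmc_mixing(-19.5, em_a=(-22.0, 1.0), em_b=(-18.0, 0.5))
print(round(mean_f, 2), round(sd_f, 2))
```

The spread of the accepted fractions is precisely the end-member-variability-induced uncertainty that the authors highlight as the advantage of the BMC approach.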

  7. Chemical Attribution of Fentanyl Using Multivariate Statistical Analysis of Orthogonal Mass Spectral Data

    DOE PAGES

    Mayer, Brian P.; DeHope, Alan J.; Mew, Daniel A.; ...

    2016-03-24

    Attribution of the origin of an illicit drug relies on identification of compounds indicative of its clandestine production and is a key component of many modern forensic investigations. The results of such studies can yield detailed information on method of manufacture, starting material source, and final product, all critical forensic evidence. In the present work, chemical attribution signatures (CAS) associated with the synthesis of the analgesic fentanyl, N-(1-phenylethylpiperidin-4-yl)-N-phenylpropanamide, were investigated. Six synthesis methods, all previously published fentanyl synthetic routes or hybrid versions thereof, were studied in an effort to identify and classify route-specific signatures. A total of 160 distinct compounds and inorganic species were identified using gas and liquid chromatographies combined with mass spectrometric methods (gas chromatography/mass spectrometry (GC/MS) and liquid chromatography–tandem mass spectrometry–time-of-flight (LC–MS/MS-TOF)) in conjunction with inductively coupled plasma mass spectrometry (ICPMS). The complexity of the resultant data matrix necessitated the use of multivariate statistical analysis. Using partial least-squares discriminant analysis (PLS-DA), 87 route-specific CAS were classified, and a statistical model capable of predicting the method of fentanyl synthesis was validated and tested against CAS profiles from crude fentanyl products deposited on, and later extracted from, two operationally relevant surfaces: stainless steel and vinyl tile. This work provides the most detailed fentanyl CAS investigation to date, using orthogonal mass spectral data to identify CAS of forensic significance for illicit drug detection, profiling, and attribution.

  8. Application of the modified chi-square ratio statistic in a stepwise procedure for cascade impactor equivalence testing.

    PubMed

    Weber, Benjamin; Lee, Sau L; Delvadia, Renishkumar; Lionberger, Robert; Li, Bing V; Tsong, Yi; Hochhaus, Guenther

    2015-03-01

    Equivalence testing of aerodynamic particle size distribution (APSD) through multi-stage cascade impactors (CIs) is important for establishing bioequivalence of orally inhaled drug products. Recent work demonstrated that the median of the modified chi-square ratio statistic (MmCSRS) is a promising metric for APSD equivalence testing of test (T) and reference (R) products, as it can be applied to a reduced number of CI sites that are more relevant for lung deposition. This metric is also less sensitive to the increased variability often observed at low-deposition sites. A method for establishing critical values for the MmCSRS is described here. This method considers the variability of the R product by employing a reference variance scaling approach that allows critical values to be defined as a function of the observed variability of the R product. A stepwise CI equivalence test is proposed that integrates the MmCSRS as a method for comparing the relative shapes of CI profiles and incorporates statistical tests for assessing equivalence of single actuation content and impactor sized mass. This stepwise CI equivalence test was applied to 55 published CI profile scenarios, which were classified as equivalent or inequivalent by members of the Product Quality Research Institute working group (PQRI WG). Using a 25% difference in MmCSRS as the acceptance criterion, the stepwise CI equivalence test provided the best match with the PQRI WG classifications, with the two methods agreeing in 75% of the 55 CI profile scenarios.

  9. Optical turbulence forecast: ready for an operational application

    NASA Astrophysics Data System (ADS)

    Masciadri, E.; Lascaux, F.; Turchi, A.; Fini, L.

    2017-04-01

    One of the main goals of the feasibility study MOSE (MOdelling ESO Sites) is to evaluate the performance of a method conceived to forecast the optical turbulence (OT) above the European Southern Observatory (ESO) sites of the Very Large Telescope (VLT) and the European Extremely Large Telescope (E-ELT) in Chile. The method relies on a dedicated code for the OT called ASTRO-MESO-NH. In this paper, we present the results obtained at the conclusion of this project concerning the performance of this method in forecasting the most relevant parameters related to the OT (CN^2, seeing ɛ, isoplanatic angle θ0 and wavefront coherence time τ0). Numerical predictions for a very rich statistical sample of nights, uniformly distributed along a solar year and drawn from different years, were compared to observations, and different statistical operators were analysed: the classical bias, root-mean-squared error and σ, as well as more sophisticated operators derived from contingency tables that quantify the skill of a predictive method, such as the percentage of correct detection (PC) and the probability of detecting a parameter within a specific range of values (POD). The main conclusion of the study is that the ASTRO-MESO-NH model already performs well enough to guarantee a non-negligible positive impact on the service mode of top-class telescopes and ELTs. A demonstrator for an automatic and operational version of the ASTRO-MESO-NH model will soon be implemented at the VLT and E-ELT sites.
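    The contingency-table scores mentioned above are standard forecast-verification quantities. A minimal sketch with hypothetical counts (the false-alarm ratio is an extra, commonly paired score not named in the abstract):

```python
def contingency_scores(hits, false_alarms, misses, correct_negatives):
    """Forecast-verification scores from a 2x2 contingency table."""
    n = hits + false_alarms + misses + correct_negatives
    pc = (hits + correct_negatives) / n          # percentage of correct detections
    pod = hits / (hits + misses)                 # probability of detection
    far = false_alarms / (hits + false_alarms)   # false-alarm ratio (extra score)
    return pc, pod, far

# Hypothetical tally of nights where the forecast seeing fell inside
# (or outside) the predicted range.
pc, pod, far = contingency_scores(hits=60, false_alarms=10,
                                  misses=15, correct_negatives=40)
print(f"PC={pc:.2f}  POD={pod:.2f}  FAR={far:.2f}")
```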

  10. Chemical Attribution of Fentanyl Using Multivariate Statistical Analysis of Orthogonal Mass Spectral Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mayer, Brian P.; DeHope, Alan J.; Mew, Daniel A.

    Attribution of the origin of an illicit drug relies on identification of compounds indicative of its clandestine production and is a key component of many modern forensic investigations. The results of such studies can yield detailed information on method of manufacture, starting material source, and final product, all critical forensic evidence. In the present work, chemical attribution signatures (CAS) associated with the synthesis of the analgesic fentanyl, N-(1-phenylethylpiperidin-4-yl)-N-phenylpropanamide, were investigated. Six synthesis methods, all previously published fentanyl synthetic routes or hybrid versions thereof, were studied in an effort to identify and classify route-specific signatures. A total of 160 distinct compounds and inorganic species were identified using gas and liquid chromatographies combined with mass spectrometric methods (gas chromatography/mass spectrometry (GC/MS) and liquid chromatography–tandem mass spectrometry–time-of-flight (LC–MS/MS-TOF)) in conjunction with inductively coupled plasma mass spectrometry (ICPMS). The complexity of the resultant data matrix necessitated the use of multivariate statistical analysis. Using partial least-squares discriminant analysis (PLS-DA), 87 route-specific CAS were classified, and a statistical model capable of predicting the method of fentanyl synthesis was validated and tested against CAS profiles from crude fentanyl products deposited on, and later extracted from, two operationally relevant surfaces: stainless steel and vinyl tile. This work provides the most detailed fentanyl CAS investigation to date, using orthogonal mass spectral data to identify CAS of forensic significance for illicit drug detection, profiling, and attribution.

  11. Improve Ay101 teaching by single modifications versus unchanged controls: statistically-supported examples

    NASA Astrophysics Data System (ADS)

    Byrd, Gene G.; Byrd, Dana

    2017-06-01

    The two main purposes of this paper on improving Ay101 courses are to present (1) some very effective single changes and (2) a method for improving teaching by making single changes that are evaluated statistically against a control-group class. We show how a simple statistical comparison can be done even with Excel in Windows; of course, more sophisticated and powerful methods can be used if available. One of several examples discussed on our poster is our modification of an online introductory astronomy lab course, evaluated by the multiple-choice final exam. We composed questions related to the learning objectives of the course modules (LOQs). Students could “talk to themselves” by discursively answering these for extra credit prior to the final. Results were compared to an otherwise identical previous unmodified class. Modified classes showed statistically significantly better final exam average scores (78% vs. 66%). This modification helped the students who most need help: students in the lower third of the class preferentially answered the LOQs, improving their scores and the class average on the exam. These results also show the effectiveness of relevant extra-credit work. Other examples will be discussed as specific instances of evaluating improvement by making one change and then testing it against a control. Essentially, this is an evolutionary approach in which single favorable “mutations” are retained and unfavorable ones removed. The temptation to make more than one change at a time must be resisted!
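    The simple statistical comparison the authors describe (two class averages, doable even in Excel) corresponds to a two-sample t test. A plain-Python sketch using Welch's unequal-variance form; the exam scores below are invented for illustration, not the paper's data:

```python
import math

def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic and approximate degrees of
    freedom (Welch-Satterthwaite), for comparing two class averages
    without assuming equal variances."""
    na, nb = len(sample_a), len(sample_b)
    ma = sum(sample_a) / na
    mb = sum(sample_b) / nb
    va = sum((x - ma) ** 2 for x in sample_a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in sample_b) / (nb - 1)
    se2 = va / na + vb / nb
    t = (ma - mb) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical final-exam scores: modified class vs. unmodified control.
modified = [82, 75, 90, 68, 85, 79, 88, 72, 81, 77]
control  = [70, 62, 75, 58, 69, 66, 73, 60, 67, 64]
t, df = welch_t(modified, control)
print(f"t = {t:.2f} on ~{df:.1f} df")
```

    A t statistic this large on ~17 degrees of freedom would be significant at conventional levels; in practice one would look the p-value up in a t table or spreadsheet function.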

  12. Uniting statistical and individual-based approaches for animal movement modelling.

    PubMed

    Latombe, Guillaume; Parrott, Lael; Basille, Mathieu; Fortin, Daniel

    2014-01-01

    The dynamic nature of animals' internal states and environment directly shapes their spatial behaviours and gives rise to emergent properties at broader scales in natural systems. However, integrating these dynamic features into habitat selection studies remains challenging, because field work to access internal states is practically impossible and current statistical models cannot produce dynamic outputs. To address these issues, we developed a robust method that combines statistical and individual-based modelling (IBM). Using a statistical technique for forward modelling of the IBM has the advantage of being faster to parameterize than a pure inverse modelling technique and allows for robust selection of parameters. Using GPS locations from caribou monitored in Québec, caribou movements were modelled based on generative mechanisms accounting for dynamic variables at a low level of emergence. These variables were accessed by replicating real individuals' movements in parallel sub-models, and movement parameters were then empirically parameterized using Step Selection Functions. The final IBM was validated using both k-fold cross-validation and emergent-pattern validation, and was tested for two different scenarios with varying hardwood encroachment. Our results highlighted a functional response in habitat selection, which suggests that our method was able to capture the complexity of the natural system and adequately provided projections of future possible states of the system in response to different management plans. This is especially relevant for testing the long-term impact of scenarios corresponding to environmental configurations that have yet to be observed in real systems.
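    The k-fold cross-validation used to validate the final IBM can be illustrated with a minimal index splitter. This is a generic sketch, not the authors' code: sample indices are shuffled once and partitioned into k disjoint folds, each serving as the held-out test set exactly once.

```python
import random

def k_fold_indices(n, k, seed=1):
    """Shuffle n sample indices and split them into k folds;
    yields (train_indices, test_indices) pairs for cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fi, fold in enumerate(folds) if fi != i for j in fold]
        yield train, test

# Hypothetical: 10 monitored movement paths split into 5 folds.
splits = list(k_fold_indices(10, 5))
for train, test in splits:
    print(len(train), "train /", len(test), "test")
```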

  13. Uniting Statistical and Individual-Based Approaches for Animal Movement Modelling

    PubMed Central

    Latombe, Guillaume; Parrott, Lael; Basille, Mathieu; Fortin, Daniel

    2014-01-01

    The dynamic nature of animals' internal states and environment directly shapes their spatial behaviours and gives rise to emergent properties at broader scales in natural systems. However, integrating these dynamic features into habitat selection studies remains challenging, because field work to access internal states is practically impossible and current statistical models cannot produce dynamic outputs. To address these issues, we developed a robust method that combines statistical and individual-based modelling (IBM). Using a statistical technique for forward modelling of the IBM has the advantage of being faster to parameterize than a pure inverse modelling technique and allows for robust selection of parameters. Using GPS locations from caribou monitored in Québec, caribou movements were modelled based on generative mechanisms accounting for dynamic variables at a low level of emergence. These variables were accessed by replicating real individuals' movements in parallel sub-models, and movement parameters were then empirically parameterized using Step Selection Functions. The final IBM was validated using both k-fold cross-validation and emergent-pattern validation, and was tested for two different scenarios with varying hardwood encroachment. Our results highlighted a functional response in habitat selection, which suggests that our method was able to capture the complexity of the natural system and adequately provided projections of future possible states of the system in response to different management plans. This is especially relevant for testing the long-term impact of scenarios corresponding to environmental configurations that have yet to be observed in real systems. PMID:24979047

  14. Statistics Poster Challenge for Schools

    ERIC Educational Resources Information Center

    Payne, Brad; Freeman, Jenny; Stillman, Eleanor

    2013-01-01

    The analysis and interpretation of data are important life skills. A poster challenge for schoolchildren provides an innovative outlet for these skills and demonstrates their relevance to daily life. We discuss our Statistics Poster Challenge and the lessons we have learned.

  15. Network analysis of named entity co-occurrences in written texts

    NASA Astrophysics Data System (ADS)

    Amancio, Diego Raphael

    2016-06-01

    The use of methods borrowed from statistics and physics to analyze written texts has allowed the discovery of unprecedented patterns of human behavior and cognition by establishing links between model features and language structure. While current models have been useful to unveil patterns via analysis of syntactic and semantic networks, only a few works have probed the relevance of investigating the structure arising from the relationship between relevant entities such as characters, locations and organizations. In this study, we represent entities appearing in the same context as a co-occurrence network, where links are established according to a null model based on random, shuffled texts. Computational simulations performed on novels revealed that the proposed model displays interesting topological features, such as the small-world property, characterized by high values of the clustering coefficient. The effectiveness of our model was verified in a practical pattern recognition task on real networks: when compared with traditional word adjacency networks, our model displayed better results in identifying unknown references in texts. Because the proposed representation plays a complementary role in characterizing unstructured documents via topological analysis of named entities, we believe that it could be useful to improve the characterization of written texts (and related systems), especially if combined with traditional approaches based on statistical and deeper paradigms.
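    A minimal version of the co-occurrence construction, together with the local clustering coefficient that characterizes the small-world property, can be sketched as follows. The entity names and contexts are hypothetical, and the published model's filtering of links against a shuffled-text null model is omitted here:

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_graph(contexts):
    """Link named entities that appear in the same context window."""
    adj = defaultdict(set)
    for entities in contexts:
        for u, v in combinations(set(entities), 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj

def clustering_coefficient(adj, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))

# Hypothetical contexts (e.g. paragraphs) listing the entities they mention.
contexts = [
    ["Alice", "Bob", "Carol"],
    ["Alice", "Bob"],
    ["Bob", "Dave"],
]
g = cooccurrence_graph(contexts)
print(clustering_coefficient(g, "Alice"))  # Bob and Carol co-occur -> 1.0
```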

  16. A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data

    PubMed Central

    Vinaixa, Maria; Samino, Sara; Saez, Isabel; Duran, Jordi; Guinovart, Joan J.; Yanes, Oscar

    2012-01-01

    Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest in the research sample, metabolites can then be unequivocally identified. This paper reports a comprehensive overview of a workflow for statistical analysis to rank relevant metabolite features that will be selected for further MS/MS experiments. We focus on univariate data analysis applied in parallel to all detected features. Characteristics and challenges of this analysis are discussed and illustrated using four different real LC/MS untargeted metabolomic datasets. We demonstrate the influence of honoring or violating the mathematical assumptions on which univariate statistical tests rely, using high-dimensional LC/MS datasets. Issues in data analysis such as determination of sample size, analytical variation, assumptions of normality and homoscedasticity, and correction for multiple testing are discussed and illustrated in the context of our four untargeted LC/MS working examples. PMID:24957762
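    Of the issues listed, correction for multiple testing is the most mechanical to demonstrate. The Benjamini-Hochberg step-up procedure (one common FDR-controlling choice, not necessarily the paper's) can be sketched as follows, with made-up per-feature p-values:

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure: returns the indices of
    features whose p-values pass FDR control at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    # Find the largest rank k with p_(k) <= k * alpha / m.
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k_max = rank
    return sorted(order[:k_max])

# Hypothetical p-values from univariate tests run in parallel on 10 features.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(pvals, alpha=0.05))
```

    Note that a naive per-test threshold of 0.05 would declare five features significant here, whereas FDR control retains only the two strongest.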

  17. A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data.

    PubMed

    Vinaixa, Maria; Samino, Sara; Saez, Isabel; Duran, Jordi; Guinovart, Joan J; Yanes, Oscar

    2012-10-18

    Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest in the research sample, metabolites can then be unequivocally identified. This paper reports a comprehensive overview of a workflow for statistical analysis to rank relevant metabolite features that will be selected for further MS/MS experiments. We focus on univariate data analysis applied in parallel to all detected features. Characteristics and challenges of this analysis are discussed and illustrated using four different real LC/MS untargeted metabolomic datasets. We demonstrate the influence of honoring or violating the mathematical assumptions on which univariate statistical tests rely, using high-dimensional LC/MS datasets. Issues in data analysis such as determination of sample size, analytical variation, assumptions of normality and homoscedasticity, and correction for multiple testing are discussed and illustrated in the context of our four untargeted LC/MS working examples.

  18. Statistical considerations on prognostic models for glioma

    PubMed Central

    Molinaro, Annette M.; Wrensch, Margaret R.; Jenkins, Robert B.; Eckel-Passow, Jeanette E.

    2016-01-01

    Given the lack of beneficial treatments in glioma, there is a need for prognostic models for therapeutic decision making and life planning. Recently several studies defining subtypes of glioma have been published. Here, we review the statistical considerations of how to build and validate prognostic models, explain the models presented in the current glioma literature, and discuss advantages and disadvantages of each model. The 3 statistical considerations to establishing clinically useful prognostic models are: study design, model building, and validation. Careful study design helps to ensure that the model is unbiased and generalizable to the population of interest. During model building, a discovery cohort of patients can be used to choose variables, construct models, and estimate prediction performance via internal validation. Via external validation, an independent dataset can assess how well the model performs. It is imperative that published models properly detail the study design and methods for both model building and validation. This provides readers the information necessary to assess the bias in a study, compare other published models, and determine the model's clinical usefulness. As editors, reviewers, and readers of the relevant literature, we should be cognizant of the needed statistical considerations and insist on their use. PMID:26657835

  19. Research methodology in dentistry: Part II — The relevance of statistics in research

    PubMed Central

    Krithikadatta, Jogikalmat; Valarmathi, Srinivasan

    2012-01-01

    The lifeline of original research depends on adept statistical analysis. However, there have been reports of statistical misconduct in studies that could arise from an inadequate understanding of the fundamentals of statistics. There have been several such reports across the medical and dental literature. This article aims at encouraging the reader to approach statistics from its logic rather than its theoretical perspective. The article also provides information on statistical misuse in the Journal of Conservative Dentistry between the years 2008 and 2011. PMID:22876003

  20. When Educational Material Is Delivered: A Mixed Methods Content Validation Study of the Information Assessment Method.

    PubMed

    Badran, Hani; Pluye, Pierre; Grad, Roland

    2017-03-14

    The Information Assessment Method (IAM) allows clinicians to report the cognitive impact, clinical relevance, intention to use, and expected patient health benefits associated with clinical information received by email. More than 15,000 Canadian physicians and pharmacists use the IAM in continuing education programs. In addition, information providers can use IAM ratings and feedback comments from clinicians to improve their products. Our general objective was to validate the IAM questionnaire for the delivery of educational material (ecological and logical content validity). Our specific objectives were to measure the relevance and evaluate the representativeness of IAM items for assessing information received by email. A 3-part mixed methods study was conducted (convergent design). In part 1 (quantitative longitudinal study), the relevance of IAM items was measured. Participants were 5596 physician members of the Canadian Medical Association who used the IAM. A total of 234,196 ratings were collected in 2012. The relevance of IAM items with respect to their main construct was calculated using descriptive statistics (relevance ratio R). In part 2 (qualitative descriptive study), the representativeness of IAM items was evaluated. A total of 15 family physicians completed semistructured face-to-face interviews. For each construct, we evaluated the representativeness of IAM items using a deductive-inductive thematic qualitative data analysis. In part 3 (mixing quantitative and qualitative parts), results from quantitative and qualitative analyses were reviewed, juxtaposed in a table, discussed with experts, and integrated. Thus, our final results are derived from the views of users (ecological content validation) and experts (logical content validation). Of the 23 IAM items, 21 were validated for content, while 2 were removed. 
In part 1 (quantitative results), 21 items were deemed relevant, while 2 items were deemed not relevant (R=4.86% [N=234,196] and R=3.04% [n=45,394], respectively). In part 2 (qualitative results), 22 items were deemed representative, while 1 item was not representative. In part 3 (mixing quantitative and qualitative results), the content validity of 21 items was confirmed, and the 2 nonrelevant items were excluded. A fully validated version was generated (IAM-v2014). This study produced a content validated IAM questionnaire that is used by clinicians and information providers to assess the clinical information delivered in continuing education programs. ©Hani Badran, Pierre Pluye, Roland Grad. Originally published in JMIR Medical Education (http://mededu.jmir.org), 14.03.2017.

  1. Validation of surrogate endpoints in advanced solid tumors: systematic review of statistical methods, results, and implications for policy makers.

    PubMed

    Ciani, Oriana; Davis, Sarah; Tappenden, Paul; Garside, Ruth; Stein, Ken; Cantrell, Anna; Saad, Everardo D; Buyse, Marc; Taylor, Rod S

    2014-07-01

    Licensing of, and coverage decisions on, new therapies should rely on evidence from patient-relevant endpoints such as overall survival (OS). Nevertheless, evidence from surrogate endpoints may also be useful, as it may not only expedite the regulatory approval of new therapies but also inform coverage decisions. It is, therefore, essential that candidate surrogate endpoints be properly validated. However, there is no consensus on statistical methods for such validation and on how the evidence thus derived should be applied by policy makers. We review current statistical approaches to surrogate-endpoint validation based on meta-analysis in various advanced-tumor settings. We assessed the suitability of two surrogates (progression-free survival [PFS] and time-to-progression [TTP]) using three current validation frameworks: Elston and Taylor's framework, the German Institute of Quality and Efficiency in Health Care's (IQWiG) framework and the Biomarker-Surrogacy Evaluation Schema (BSES3). A wide variety of statistical methods have been used to assess surrogacy. The strength of the association between the two surrogates and OS was generally low. The level of evidence (observation-level versus treatment-level) available varied considerably by cancer type and by evaluation tool, and was not always consistent even within one specific cancer type. The treatment-level association between PFS or TTP and OS has not been investigated in all solid tumors. According to IQWiG's framework, only PFS achieved acceptable evidence of surrogacy in metastatic colorectal and ovarian cancer treated with cytotoxic agents. Our study emphasizes the challenges of surrogate-endpoint validation and the importance of building consensus on the development of evaluation frameworks.

  2. Statistical issues in the design, conduct and analysis of two large safety studies.

    PubMed

    Gaffney, Michael

    2016-10-01

    The emergence, post approval, of serious medical events that may be associated with the use of a particular drug or class of drugs is an important public health and regulatory issue. The best method to address this issue is through a large, rigorously designed safety study. It is therefore important to elucidate the statistical issues involved in these large safety studies. Two such studies are PRECISION and EAGLES; PRECISION is the primary focus of this article. PRECISION is a non-inferiority design with a clinically relevant non-inferiority margin. Statistical issues in the design, conduct and analysis of PRECISION are discussed. Quantitative and clinical aspects of the selection of the composite primary endpoint, the determination and role of the non-inferiority margin in a large safety study, and the intent-to-treat and modified intent-to-treat analyses in a non-inferiority safety study are shown. Protocol changes that were necessary during the conduct of PRECISION are discussed from a statistical perspective. Issues regarding the complex analysis and interpretation of the results of PRECISION are outlined. EAGLES is presented as a large, rigorously designed safety study for which a non-inferiority margin could not be determined by a strong clinical/scientific method. In general, when a non-inferiority margin cannot be determined, the width of the 95% confidence interval is a way to size the study and to assess the cost-benefit of relative trial size. A non-inferiority margin, when it can be determined by a strong scientific method, should be included in a large safety study. Although these studies could not be called "pragmatic," they are examples of best real-world designs to address safety and regulatory concerns. © The Author(s) 2016.
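    Sizing a study by the width of the 95% confidence interval, as described for the case without a non-inferiority margin, can be sketched with the normal approximation for a difference in event proportions. The event rates and target half-width below are hypothetical, not the EAGLES values:

```python
import math

def n_per_group_for_ci_halfwidth(p1, p2, half_width, z=1.96):
    """Per-group sample size so that the 95% CI for a difference in
    event proportions has roughly the requested half-width
    (normal approximation, equal group sizes)."""
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(z ** 2 * var / half_width ** 2)

# Hypothetical: ~2% event rates, target half-width of 1 percentage point.
print(n_per_group_for_ci_halfwidth(0.02, 0.02, 0.01))
```

    Halving the target half-width quadruples the required sample size, which is exactly the cost-benefit trade-off of relative trial size mentioned in the abstract.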

  3. Characteristics of genomic signatures derived using univariate methods and mechanistically anchored functional descriptors for predicting drug- and xenobiotic-induced nephrotoxicity.

    PubMed

    Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J

    2008-01-01

    The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, the Hotelling T-square test, and, finally, out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps (sets of genes coordinately involved in key biological processes) with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. 
Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.
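    Ranking genes by univariate t-statistics and fold change, two of the feature-selection methods compared above, can be sketched as follows. The expression values are invented for illustration, and the study's other methods (B-statistics, RankProd, enrichment analysis, out-of-bag selection) are not shown:

```python
import math

def t_statistic(xs, ys):
    """Unpaired t statistic (pooled variance) for one gene."""
    nx, ny = len(xs), len(ys)
    mx, my = sum(xs) / nx, sum(ys) / ny
    sx = sum((v - mx) ** 2 for v in xs) / (nx - 1)
    sy = sum((v - my) ** 2 for v in ys) / (ny - 1)
    sp = math.sqrt(((nx - 1) * sx + (ny - 1) * sy) / (nx + ny - 2))
    return (mx - my) / (sp * math.sqrt(1 / nx + 1 / ny))

def rank_genes(expr_treated, expr_control):
    """Rank genes by |t|; also report log2 fold change of group means.
    expr_* map gene -> list of (already log2) expression values."""
    rows = []
    for gene in expr_treated:
        xs, ys = expr_treated[gene], expr_control[gene]
        t = t_statistic(xs, ys)
        lfc = sum(xs) / len(xs) - sum(ys) / len(ys)  # difference of log2 means
        rows.append((gene, t, lfc))
    return sorted(rows, key=lambda r: abs(r[1]), reverse=True)

# Hypothetical log2 expression for three genes, treated vs. control kidneys.
treated = {"g1": [8.1, 8.3, 8.2], "g2": [5.0, 5.4, 5.2], "g3": [7.0, 7.1, 6.9]}
control = {"g1": [6.0, 6.2, 6.1], "g2": [5.1, 5.3, 5.2], "g3": [7.2, 7.0, 7.1]}
ranked = rank_genes(treated, control)
for gene, t, lfc in ranked:
    print(gene, round(t, 2), round(lfc, 2))
```

    The two criteria can disagree: a gene with a large but noisy fold change may rank below a small, consistent shift, which is why the study compares several selection statistics.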

  4. Guide to good statistical practice in the transportation field

    DOT National Transportation Integrated Search

    2003-05-01

    Quality of data has many faces. Primarily, it has to be relevant (i.e., useful) to its users. Relevance is achieved through a series of steps starting with a planning process that links user needs to data requirements. It continues through acquisitio...

  5. Permutation testing of orthogonal factorial effects in a language-processing experiment using fMRI.

    PubMed

    Suckling, John; Davis, Matthew H; Ooi, Cinly; Wink, Alle Meije; Fadili, Jalal; Salvador, Raymond; Welchew, David; Sendur, Levent; Maxim, Vochita; Bullmore, Edward T

    2006-05-01

    The block-paradigm of the Functional Image Analysis Contest (FIAC) dataset was analysed with the Brain Activation and Morphological Mapping software. Permutation methods in the wavelet domain were used for inference on cluster-based test statistics of orthogonal contrasts relevant to the factorial design of the study, namely: the average response across all active blocks, the main effect of speaker, the main effect of sentence, and the interaction between sentence and speaker. Extensive activation was seen with all these contrasts. In particular, different vs. same-speaker blocks produced elevated activation in bilateral regions of the superior temporal lobe and repetition suppression for linguistic materials (same vs. different-sentence blocks) in left inferior frontal regions. These are regions previously reported in the literature. Additional regions were detected in this study, perhaps due to the enhanced sensitivity of the methodology. Within-block sentence suppression was tested post-hoc by regression of an exponential decay model onto the extracted time series from the left inferior frontal gyrus, but no strong evidence of such an effect was found. The significance levels set for the activation maps are P-values at which we expect <1 false-positive cluster per image. Nominal type I error control was verified by empirical testing of a test statistic corresponding to a randomly ordered design matrix. The small size of the BOLD effect necessitates sensitive methods of detection of brain activation. Permutation methods permit the necessary flexibility to develop novel test statistics to meet this challenge.
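    The core logic of a permutation test (compare an observed statistic against its distribution under shuffled labels) can be sketched for a simple difference of means. This is a generic illustration; the paper's cluster-based, wavelet-domain statistics are far more involved, and the effect sizes below are invented:

```python
import random

def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation test for a difference in group means:
    labels are shuffled n_perm times and the observed absolute
    difference is compared against the resulting null distribution."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    na = len(group_a)
    observed = abs(sum(group_a) / na - sum(group_b) / len(group_b))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        a, b = pooled[:na], pooled[na:]
        if abs(sum(a) / len(a) - sum(b) / len(b)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one smoothing keeps p > 0

# Hypothetical BOLD effect sizes under two conditions.
same      = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7]
different = [1.6, 1.9, 1.4, 2.1, 1.8, 1.5]
p = permutation_test(same, different)
print(f"p = {p:.4f}")
```

    Because the null distribution is built from the data themselves, the test makes no normality assumption, which is what gives permutation methods their flexibility for novel test statistics.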

  6. Reproducible and Verifiable Equations of State Using Microfabricated Materials

    NASA Astrophysics Data System (ADS)

    Martin, J. F.; Pigott, J. S.; Panero, W. R.

    2017-12-01

    Accurate interpretation of observable geophysical data, relevant to the structure, composition, and evolution of planetary interiors, requires precise determination of appropriate equations of state. We present the synthesis of controlled-geometry nanofabricated samples and insulation layers for the laser-heated diamond anvil cell. We present electron-gun evaporation, sputter deposition, and photolithography methods to mass-produce Pt/SiO2/Fe/SiO2 stacks and MgO insulating disks to be used in LHDAC experiments to reduce uncertainties in equation of state measurements due to large temperature gradients. We present a reanalysis of published iron PVT data to establish a statistically-valid extrapolation of the equation of state to inner core conditions with quantified uncertainties, addressing the complication of covariance in equation of state parameters. We use this reanalysis, together with the synthesized samples, to propose a scheme for measurement and validation of high-precision equations of state relevant to the Earth and super-Earth exoplanets.

  7. The role of different PI-RADS versions in prostate multiparametric magnetic resonance tomography assessment.

    PubMed

    Aliukonis, Paulius; Letauta, Tadas; Briedienė, Rūta; Naruševičiūtė, Ieva; Letautienė, Simona

    2017-01-01

    Background. The standardised Prostate Imaging Reporting and Data System (PI-RADS) guidelines were designed for the assessment of prostate pathology. Published by the ESUR in 2012, PI-RADS v1 was based on a total score summed over different MRI sequences. PI-RADS v2 was published by the American College of Radiology in 2015 and introduced different assessment criteria for the prostate peripheral and transition zones. Aim. To assess the correlations of PI-RADS v1 and PI-RADS v2 with Gleason score values and to define their predictive values for the diagnosis of prostate cancer. Materials and methods. A retrospective analysis of 66 patients. The prostate specific antigen (PSA) value and the Gleason score (GS) were assessed. The most malignant focal lesion was selected in the peripheral zone of each lobe of the prostate (91 lesions in total). Statistical analysis was carried out with SPSS software, v.23; p < 0.05 was considered significant. Results. Focal lesions assessed by PI-RADS v1 scored 1 in 10% of cases, 2 in 12%, 3 in 41%, 4 in 23%, and 5 in 14%. Assessed by PI-RADS v2, lesions scored 1 in 20% of cases, 2 in 7.5%, and 3, 4, and 5 in 26%, 29.5%, and 17%, respectively. A statistically significant correlation was found only between GS and PI-RADS (p = 0.033). The positive predictive value of both PI-RADS versions was 75%; the negative predictive value was 46% for PI-RADS v1 and 43% for PI-RADS v2. Conclusions. PI-RADS v1 was more statistically relevant in assessing tumour grade. Predictive values were similar for both versions.

  8. Generalized theory of semiflexible polymers.

    PubMed

    Wiggins, Paul A; Nelson, Philip C

    2006-03-01

    DNA bending on length scales shorter than a persistence length plays an integral role in the translation of genetic information from DNA to cellular function. Quantitative experimental studies of these biological systems have led to a renewed interest in the polymer mechanics relevant for describing the conformational free energy of DNA bending induced by protein-DNA complexes. Recent experimental results from DNA cyclization studies have cast doubt on the applicability of the canonical semiflexible polymer theory, the wormlike chain (WLC) model, to DNA bending on biologically relevant length scales. This paper develops a theory of the chain statistics of a class of generalized semiflexible polymer models. Our focus is on the theoretical development of these models and the calculation of experimental observables. To illustrate our methods, we focus on a specific, illustrative model of DNA bending. We show that the WLC model generically describes the long-length-scale chain statistics of semiflexible polymers, as predicted by renormalization group arguments. In particular, we show that either the WLC or our present model adequately describes force-extension, solution scattering, and long-contour-length cyclization experiments, regardless of the details of DNA bend elasticity. In contrast, experiments sensitive to short-length-scale chain behavior can in principle reveal dramatic departures from the linear elastic behavior assumed in the WLC model. We demonstrate this explicitly by showing that our toy model can reproduce the anomalously large short-contour-length cyclization factors recently measured by Cloutier and Widom. Finally, we discuss the applicability of these models to DNA chain statistics in the context of future experiments.

  9. UniEnt: uniform entropy model for the dynamics of a neuronal population

    NASA Astrophysics Data System (ADS)

    Hernandez Lahme, Damian; Nemenman, Ilya

    Sensory information and motor responses are encoded in the brain in the collective spiking activity of large numbers of neurons. Understanding the neural code requires inferring statistical properties of such collective dynamics from multicellular neurophysiological recordings. Questions of whether synchronous activity or silence of multiple neurons carries information about the stimuli or the motor responses are especially interesting. Unfortunately, detection of such high-order statistical interactions from data is especially challenging due to the exponentially large dimensionality of the state space of neural collectives. Here we present UniEnt, a method for inferring the strengths of multivariate neural interaction patterns. The method is based on a Bayesian prior that makes no assumptions (uniform a priori expectations) about the value of the entropy of the observed multivariate neural activity, in contrast to popular approaches that maximize this entropy. We then study previously published multi-electrode recording data from the salamander retina, exposing the relevance of higher-order neural interaction patterns for information encoding in this system. This work was supported in part by Grants JSMF/220020321 and NSF/IOS/1208126.

  10. Statistical Physics of Population Genetics in the Low Population Size Limit

    NASA Astrophysics Data System (ADS)

    Atwal, Gurinder

    The understanding of evolutionary processes lends itself naturally to theory and computation, and the entire field of population genetics has benefited greatly from the influx of methods from applied mathematics for decades. However, in spite of all this effort, there are a number of key dynamical models of evolution that have resisted analytical treatment. In addition, modern DNA sequencing technologies have magnified the amount of genetic data available, revealing an excess of rare genetic variants in human genomes, challenging the predictions of conventional theory. Here I will show that methods from statistical physics can be used to model the distribution of genetic variants, incorporating selection and spatial degrees of freedom. In particular, a functional path-integral formulation of the Wright-Fisher process maps exactly to the dynamics of a particle in an effective potential, beyond the mean field approximation. In the small population size limit, the dynamics are dominated by instanton-like solutions which determine the probability of fixation in short timescales. These results are directly relevant for understanding the unusual genetic variant distribution at moving frontiers of populations.

  11. Health care research by degrees N Reid Health care research by degrees Blackwell Scientific 162pp £13.99 0632-03466-1 [Formula: see text].

    PubMed

    1993-03-03

    Inadequately understood statistics so often cloud both the argument of the researcher and the judgement of the reader. Norma Reid brings a refreshing clarity to a complex topic; she takes the mystification and mystique out of statistics. Her basic premise that theory ought to be based on practical utility and relevance shines through her text and helps to make the subject accessible to clinicians who want to understand the underpinnings of their practice. Research methods, particularly qualitative approaches, are sketchily dealt with when compared with the wealth of detail on the mechanics of computing. Also, it is awkward to find methods and analysis not clearly separated in places (e.g., Delphi studies), but ample references direct the reader to more expansive sources. Any attempt to steer the uninitiated through the minefields of computing is fraught with difficulties, and some will be disappointed to find one system used exclusively, but, perhaps, it serves as an illustration rather than a course to be slavishly followed.

  12. 12 CFR 348.2 - Definitions.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... means a natural person, corporation, or other business entity. (m) Relevant metropolitan statistical... median family income for the metropolitan statistical area (MSA), if a depository organization is located... exclusively to the business of retail merchandising or manufacturing; (ii) A person whose management functions...

  13. Effect of postmortem sampling technique on the clinical significance of autopsy blood cultures.

    PubMed

    Hove, M; Pencil, S D

    1998-02-01

    Our objective was to investigate the value of postmortem autopsy blood cultures performed with an iodine-subclavian technique relative to the classical method of atrial heat searing and to antemortem blood cultures. The study consisted of a prospective autopsy series, with each case serving as its own control relative to subsequent testing, and a retrospective survey of patients coming to autopsy who had both autopsy blood cultures and premortem blood cultures. A busy academic autopsy service (600 cases per year) at University of Texas Medical Branch Hospitals, Galveston, Texas, served as the setting for this work. The incidence of non-clinically relevant (false-positive) culture results was compared using different methods for collecting blood samples in a prospective series of 38 adult autopsy specimens. One hundred eleven adult autopsy specimens in which both postmortem and antemortem blood cultures were obtained were studied retrospectively. For both studies, positive culture results were scored as either clinically relevant or false positive based on analysis of the autopsy findings and the clinical summary. The rate of false-positive culture results obtained with the iodine-subclavian technique from blood drawn soon after death was statistically significantly lower (13%) than that obtained with the classical method of drawing blood through the atrium after heat searing at the time of autopsy (34%) in the same set of autopsy subjects. When autopsy results were compared with subjects' antemortem blood culture results, there was no significant difference in the rate of non-clinically relevant culture results in a paired retrospective series of antemortem and postmortem blood cultures using the iodine-subclavian postmortem method (11.7% vs 13.5%). The results indicate that autopsy blood cultures obtained using the iodine-subclavian technique have reliability equivalent to that of antemortem blood cultures.

  14. Comparative study of methods to calibrate the stiffness of a single-beam gradient-force optical tweezers over various laser trapping powers

    PubMed Central

    Sarshar, Mohammad; Wong, Winson T.; Anvari, Bahman

    2014-01-01

    Optical tweezers have become an important instrument in force measurements associated with various physical, biological, and biophysical phenomena. Quantitative use of optical tweezers relies on accurate calibration of the stiffness of the optical trap. Using the same optical tweezers platform operating at 1064 nm and beads of two different diameters, we present a comparative study of the viscous drag force, equipartition theorem, Boltzmann statistics, and power spectral density (PSD) methods for calibrating the stiffness of a single-beam gradient-force optical trap at trapping laser powers in the range of 0.05 to 1.38 W at the focal plane. The equipartition theorem and Boltzmann statistics methods demonstrate a linear stiffness with trapping laser powers up to 355 mW when used in conjunction with video position sensing. The PSD of a trapped particle's Brownian motion or measurements of the particle displacement against known viscous drag forces can be reliably used for stiffness calibration of an optical trap over a greater range of trapping laser powers. The viscous drag calibration method produces results relevant to applications where the trapped particle undergoes large displacements and, at a given position sensing resolution, can be used for stiffness calibration at higher trapping laser powers than the PSD method. PMID:25375348
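    The equipartition method referenced above has a particularly compact form: the trap stiffness along one axis is k = k_B·T / Var(x), where Var(x) is the variance of the trapped bead's position fluctuations. A minimal sketch on synthetic data (the stiffness, temperature, and sample size below are hypothetical, not values from this study):

```python
import numpy as np

KB = 1.380649e-23  # Boltzmann constant, J/K

def equipartition_stiffness(positions_m, temperature_k=295.0):
    """Estimate trap stiffness (N/m) from position fluctuations: k = kB*T / Var(x)."""
    return KB * temperature_k / np.var(positions_m)

# Synthetic example: equilibrium Brownian fluctuations in a harmonic trap
rng = np.random.default_rng(0)
k_true = 1e-5                      # N/m (hypothetical)
T = 295.0                          # K
sigma = np.sqrt(KB * T / k_true)   # equilibrium position spread, ~20 nm here
x = rng.normal(0.0, sigma, size=200_000)

k_est = equipartition_stiffness(x, T)
```

Because the variance estimate converges as 1/sqrt(n), long position records are needed; the method also inherits any miscalibration of the position detector, which is one reason the study compares it against drag-force and PSD calibrations.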

  15. A nonlinear quality-related fault detection approach based on modified kernel partial least squares.

    PubMed

    Jiao, Jianfang; Zhao, Ning; Wang, Guang; Yin, Shen

    2017-01-01

    In this paper, a new nonlinear quality-related fault detection method is proposed based on kernel partial least squares (KPLS) model. To deal with the nonlinear characteristics among process variables, the proposed method maps these original variables into feature space in which the linear relationship between kernel matrix and output matrix is realized by means of KPLS. Then the kernel matrix is decomposed into two orthogonal parts by singular value decomposition (SVD) and the statistics for each part are determined appropriately for the purpose of quality-related fault detection. Compared with relevant existing nonlinear approaches, the proposed method has the advantages of simple diagnosis logic and stable performance. A widely used literature example and an industrial process are used for the performance evaluation for the proposed method. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.

  16. Unbiased simulation of near-Clifford quantum circuits

    DOE PAGES

    Bennink, Ryan S.; Ferragut, Erik M.; Humble, Travis S.; ...

    2017-06-28

    Modeling and simulation are essential for predicting and verifying the behavior of fabricated quantum circuits, but existing simulation methods are either impractically costly or require an unrealistic simplification of error processes. In this paper, we present a method of simulating noisy Clifford circuits that is both accurate and practical in experimentally relevant regimes. In particular, the cost is weakly exponential in the size and the degree of non-Cliffordness of the circuit. Our approach is based on the construction of exact representations of quantum channels as quasiprobability distributions over stabilizer operations, which are then sampled, simulated, and weighted to yield unbiased statistical estimates of circuit outputs and other observables. As a demonstration of these techniques, we simulate a Steane [[7,1,3

  17. A statistical learning strategy for closed-loop control of fluid flows

    NASA Astrophysics Data System (ADS)

    Guéniat, Florimond; Mathelin, Lionel; Hussaini, M. Yousuff

    2016-12-01

    This work discusses a closed-loop control strategy for complex systems utilizing scarce and streaming data. A discrete embedding space is first built using hash functions applied to the sensor measurements, from which a Markov process model is derived, approximating the complex system's dynamics. A control strategy is then learned using reinforcement learning once rewards relevant to the control objective are identified. This method is designed for experimental configurations, requiring neither computations nor prior knowledge of the system, and enjoys intrinsic robustness. It is illustrated on two systems: the control of the transitions of a Lorenz '63 dynamical system, and the control of the drag of a cylinder flow. The method is shown to perform well.

  18. Combined slope ratio analysis and linear-subtraction: An extension of the Pearce ratio method

    NASA Astrophysics Data System (ADS)

    De Waal, Sybrand A.

    1996-07-01

    A new technique, called combined slope ratio analysis, has been developed by extending the Pearce element ratio or conserved-denominator method (Pearce, 1968) to its logical conclusions. If two stoichiometric substances are mixed and certain chemical components are uniquely contained in either one of the two mixing substances, then by treating these unique components as conserved, the composition of the substance not containing the relevant component can be accurately calculated within the limits allowed by analytical and geological error. The calculated composition can then be subjected to rigorous statistical testing using the linear-subtraction method recently advanced by Woronow (1994). Application of combined slope ratio analysis to the rocks of the Uwekahuna Laccolith, Hawaii, USA, and the lavas of the 1959-summit eruption of Kilauea Volcano, Hawaii, USA, yields results that are consistent with field observations.
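    The conserved-denominator logic can be made concrete with a two-component mass balance (a generic illustration with hypothetical oxide values, not De Waal's full procedure): if a component occurs only in phase B, its concentration in the mixture fixes the mass fraction of B, and the composition of A then follows by linear subtraction.

```python
def unmix_two_phase(c_mix, c_b, conserved):
    """Recover the composition of phase A from a binary mixture, given that
    `conserved` occurs only in phase B (simple mass-balance illustration)."""
    f_b = c_mix[conserved] / c_b[conserved]   # mass fraction of phase B
    if not 0.0 < f_b < 1.0:
        raise ValueError("mixture inconsistent with the conserved-component assumption")
    return {ox: (c_mix[ox] - f_b * c_b.get(ox, 0.0)) / (1.0 - f_b)
            for ox in c_mix if ox != conserved}

# Hypothetical wt% compositions: K2O occurs only in phase B
phase_b = {"SiO2": 40.0, "MgO": 2.0, "K2O": 4.0}
mixture = {"SiO2": 47.0, "MgO": 7.6, "K2O": 1.2}   # a 70% A + 30% B blend
phase_a = unmix_two_phase(mixture, phase_b, "K2O")
```

In practice the recovered composition carries analytical and geological error, which is why the abstract pairs this calculation with statistical testing via linear subtraction.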

  19. Problems With Risk Reclassification Methods for Evaluating Prediction Models

    PubMed Central

    Pepe, Margaret S.

    2011-01-01

    For comparing the performance of a baseline risk prediction model with one that includes an additional predictor, a risk reclassification analysis strategy has been proposed. The first step is to cross-classify risks calculated according to the 2 models for all study subjects. Summary measures including the percentage of reclassification and the percentage of correct reclassification are calculated, along with 2 reclassification calibration statistics. The author shows that interpretations of the proposed summary measures and P values are problematic. The author's recommendation is to display the reclassification table, because it shows interesting information, but to use alternative methods for summarizing and comparing model performance. The Net Reclassification Index has been suggested as one alternative method. The author argues for reporting components of the Net Reclassification Index because they are more clinically relevant than is the single numerical summary measure. PMID:21555714
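    The components the author recommends reporting are simple to compute: the event component is the net proportion of events whose predicted risk moves up under the new model, and the non-event component is the net proportion of non-events whose risk moves down. A small sketch with hypothetical risk vectors:

```python
import numpy as np

def nri_components(risk_old, risk_new, event):
    """Net Reclassification Index components for up/down risk movements."""
    risk_old, risk_new, event = map(np.asarray, (risk_old, risk_new, event))
    up, down = risk_new > risk_old, risk_new < risk_old
    ev, ne = event == 1, event == 0
    nri_events = up[ev].mean() - down[ev].mean()       # net up-movement among events
    nri_nonevents = down[ne].mean() - up[ne].mean()    # net down-movement among non-events
    return nri_events, nri_nonevents, nri_events + nri_nonevents

# Hypothetical example: 6 subjects, old vs new predicted risks, outcomes y
old = [0.10, 0.40, 0.30, 0.20, 0.70, 0.15]
new = [0.25, 0.35, 0.50, 0.10, 0.80, 0.15]
y   = [1,    1,    1,    0,    0,    0]
ev_nri, ne_nri, total = nri_components(old, new, y)
```

Reporting ev_nri and ne_nri separately, as the author argues, shows where the new predictor helps; the single total can mask an improvement among events bought at the expense of non-events.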

  20. Multi-focus image fusion algorithm using NSCT and MPCNN

    NASA Astrophysics Data System (ADS)

    Liu, Kang; Wang, Lianli

    2018-04-01

    Based on the nonsubsampled contourlet transform (NSCT) and a modified pulse coupled neural network (MPCNN), this paper proposes an effective image fusion method. First, the source image is decomposed into low-frequency and high-frequency components using NSCT, and the low-frequency components are processed with regional statistical fusion rules. For the high-frequency components, the spatial frequency (SF) is calculated and input into the MPCNN model to obtain the relevant coefficients according to the fire-mapping image of the MPCNN. Finally, the fused image is reconstructed by inverse transformation of the low-frequency and high-frequency components. Compared with the wavelet transform (WT) and the traditional NSCT algorithm, experimental results indicate that the proposed method achieves an improvement in both human visual perception and objective evaluation, showing that it is effective, practical, and performs well.

  1. Stochastic Growth of Ion Cyclotron And Mirror Waves In Earth's Magnetosheath

    NASA Technical Reports Server (NTRS)

    Cairns, Iver H.; Grubits, K. A.

    2001-01-01

    Electromagnetic ion cyclotron and mirror waves in Earth's magnetosheath are bursty, have widely variable fields, and are unexpectedly persistent, properties difficult to reconcile with uniform secular growth. Here it is shown for specific periods that stochastic growth theory (SGT) quantitatively accounts for the functional form of the wave statistics and qualitatively explains the wave properties. The wave statistics are inconsistent with uniform secular growth or self-organized criticality, but nonlinear processes sometimes play a role at high fields. The results show SGT's relevance near marginal stability and suggest that it is widely relevant to space and astrophysical plasmas.

  2. Decoding 2D-PAGE complex maps: relevance to proteomics.

    PubMed

    Pietrogrande, Maria Chiara; Marchetti, Nicola; Dondi, Francesco; Righetti, Pier Giorgio

    2006-03-20

    This review describes two mathematical approaches useful for decoding the complex signal of 2D-PAGE maps of protein mixtures. These methods are helpful for interpreting the large amount of data in each 2D-PAGE map by extracting the analytical information hidden therein by spot overlapping. Here the basic theory and its application to 2D-PAGE maps are reviewed: the means for extracting information from the experimental data and their relevance to proteomics are discussed. One method is based on the quantitative theory of the statistical model of peak overlapping (SMO), using the spot experimental data (intensity and spatial coordinates). The second method is based on the study of the 2D-autocovariance function (2D-ACVF) computed on the experimental digitised map. They are two independent methods able to extract equal and complementary information from the 2D-PAGE map. Both methods make it possible to obtain fundamental information on the sample complexity and the separation performance and to single out ordered patterns present in spot positions: the availability of two independent procedures to compute the same separation parameters is a powerful tool for estimating the reliability of the obtained results. The SMO procedure is a unique tool for quantitatively estimating the degree of spot overlapping present in the map, while the 2D-ACVF method is particularly powerful in singling out the presence of order in spot positions from the complexity of the whole 2D map, i.e., spot trains. The procedures were validated by extensive numerical computation on computer-generated maps describing experimental 2D-PAGE gels of protein mixtures. Their applicability to real samples was tested on reference maps obtained from literature sources. The review describes the most relevant information for proteomics: sample complexity, separation performance, overlapping extent, and identification of spot trains related to post-translational modifications (PTMs).
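    The 2D-autocovariance idea can be illustrated with a standard FFT-based computation (a generic Wiener-Khinchin sketch, not the authors' implementation): ordered spot positions, such as a spot train, produce periodic peaks in the ACVF, while the zero-lag value equals the map variance.

```python
import numpy as np

def acvf2d(image):
    """Circular 2D autocovariance via the Wiener-Khinchin theorem:
    ACVF = IFFT2(|FFT2(z)|^2) / N, where z is the mean-centred map."""
    z = image - image.mean()
    power = np.abs(np.fft.fft2(z)) ** 2
    return np.fft.ifft2(power).real / z.size

# Synthetic "map" with a regular spot train along one axis (period 8 pixels)
_, x = np.mgrid[0:64, 0:64]
spots = np.exp(-((x % 8 - 4) ** 2) / 2.0)   # hypothetical ordered pattern
acvf = acvf2d(spots)
```

For this pattern the ACVF repeats with the same 8-pixel period along the ordered axis, which is exactly the signature used to single out spot trains against the disordered background of a real map.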

  3. An Experiment Comparing Lexical and Statistical Methods for Extracting MeSH Terms from Clinical Free Text

    PubMed Central

    Cooper, Gregory F.; Miller, Randolph A.

    1998-01-01

    Objective: A primary goal of the University of Pittsburgh's 1990-94 UMLS-sponsored effort was to develop and evaluate PostDoc (a lexical indexing system) and Pindex (a statistical indexing system) comparatively, and then in combination as a hybrid system. Each system takes as input a portion of the free text from a narrative part of a patient's electronic medical record and returns a list of suggested MeSH terms to use in formulating a Medline search that includes concepts in the text. This paper describes the systems and reports an evaluation. The intent is for this evaluation to serve as a step toward the eventual realization of systems that assist healthcare personnel in using the electronic medical record to construct patient-specific searches of Medline. Design: The authors tested the performances of PostDoc, Pindex, and a hybrid system, using text taken from randomly selected clinical records, which were stratified to include six radiology reports, six pathology reports, and six discharge summaries. They identified concepts in the clinical records that might conceivably be used in performing a patient-specific Medline search. Each system was given the free text of each record as an input. The extent to which a system-derived list of MeSH terms captured the relevant concepts in these documents was determined based on blinded assessments by the authors. Results: PostDoc output a mean of approximately 19 MeSH terms per report, which included about 40% of the relevant report concepts. Pindex output a mean of approximately 57 terms per report and captured about 45% of the relevant report concepts. A hybrid system captured approximately 66% of the relevant concepts and output about 71 terms per report. Conclusion: The outputs of PostDoc and Pindex are complementary in capturing MeSH terms from clinical free text.
The results suggest possible approaches to reduce the number of terms output while maintaining the percentage of terms captured, including the use of UMLS semantic types to constrain the output list to contain only clinically relevant MeSH terms. PMID:9452986

  4. Pilot study for the registry of complications in rheumatic diseases from the German Society of Surgery (DGORh): evaluation of methods and data from the first 1000 patients.

    PubMed

    Kostuj, Tanja; Rehart, Stefan; Matta-Hurtado, Ronald; Biehl, Christoph; Willburger, Roland E; Schmidt, Klaus

    2017-10-10

    Most patients suffering from rheumatic diseases who undergo surgical treatment are receiving immune-modulating therapy. To determine whether these medications affect their outcomes, a national registry was established in Germany by the German Society of Surgery (DGORh). Data from the first 1000 patients were used in a pilot study to identify relevant co-risk factors and to determine whether such a registry is suitable for developing accurate and relevant recommendations. Data were collected from patients undergoing surgical treatment with their written consent. A second consent form was used if complications occurred. To obtain a quicker overview during this pilot study, risk factors were considered only in patients with complications. Only descriptive statistical analysis was employed, due to the limited number of observed complications and the inhomogeneous data regarding the surgeries and medications the patients received. Analytical statistics will be performed to confirm the results in a future outcome study. Complications occurred in 26 patients and were distributed equally among the different types of surgery. Twenty-one of these patients were receiving immune-modulating therapy at the time, while five were not. Infections were observed in 2.3% of patients receiving and in 5.1% of patients not receiving immunosuppression. Due to the low number of cases and the inhomogeneity of the diseases and treatments in this pilot study, it is not possible to develop standardised best-practice recommendations to optimise care. Based on this observation, we conclude that to be suitable for developing accurate and relevant recommendations, a national registry must include the most important and relevant variables that affect the care and outcomes of these patients. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. 
No commercial use is permitted unless otherwise expressly granted.

  5. Attitudes towards statistics of graduate entry medical students: the role of prior learning experiences

    PubMed Central

    2014-01-01

    Background While statistics is increasingly taught as part of the medical curriculum, it can be an unpopular subject and feedback from students indicates that some find it more difficult than other subjects. Understanding attitudes towards statistics on entry to graduate entry medical programmes is particularly important, given that many students may have been exposed to quantitative courses in their previous degree and hence bring preconceptions of their ability and interest to their medical education programme. The aim of this study therefore is to explore, for the first time, attitudes towards statistics of graduate entry medical students from a variety of backgrounds and focus on understanding the role of prior learning experiences. Methods 121 first year graduate entry medical students completed the Survey of Attitudes toward Statistics instrument together with information on demographics and prior learning experiences. Results Students tended to appreciate the relevance of statistics in their professional life and be prepared to put effort into learning statistics. They had neutral to positive attitudes about their interest in statistics and their intellectual knowledge and skills when applied to it. Their feelings towards statistics were slightly less positive e.g. feelings of insecurity, stress, fear and frustration and they tended to view statistics as difficult. Even though 85% of students had taken a quantitative course in the past, only 24% of students described it as likely that they would take any course in statistics if the choice was theirs. How well students felt they had performed in mathematics in the past was a strong predictor of many of the components of attitudes. Conclusion The teaching of statistics to medical students should start with addressing the association between students’ past experiences in mathematics and their attitudes towards statistics and encouraging students to recognise the difference between the two disciplines. 
Addressing these issues may reduce students’ anxiety and perception of difficulty at the start of their learning experience and encourage students to engage with statistics in their future careers. PMID:24708762

  6. Theory of the Sea Ice Thickness Distribution

    NASA Astrophysics Data System (ADS)

    Toppaladoddi, Srikanth; Wettlaufer, J. S.

    2015-10-01

    We use concepts from statistical physics to transform the original evolution equation for the sea ice thickness distribution g(h) from Thorndike et al. into a Fokker-Planck-like conservation law. The steady solution is g(h) = N(q) h^q e^{-h/H}, where q and H are expressible in terms of moments over the transition probabilities between thickness categories. The solution exhibits the functional form used in observational fits and shows that for h ≪ 1, g(h) is controlled by both thermodynamics and mechanics, whereas for h ≫ 1 only mechanics controls g(h). Finally, we derive the underlying Langevin equation governing the dynamics of the ice thickness h, from which we predict the observed g(h). The genericity of our approach provides a framework for studying the geophysical-scale structure of the ice pack using methods of broad relevance in statistical mechanics.
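    For completeness, the normalization constant implied by the quoted steady solution (not stated explicitly in the abstract) follows from requiring g(h) to integrate to one, using the standard gamma-function integral (valid for q > -1):

```latex
\int_{0}^{\infty} g(h)\,\mathrm{d}h
  = N(q)\int_{0}^{\infty} h^{q}\,e^{-h/H}\,\mathrm{d}h
  = N(q)\,\Gamma(q+1)\,H^{\,q+1} = 1
\quad\Longrightarrow\quad
N(q) = \frac{1}{\Gamma(q+1)\,H^{\,q+1}} .
```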

  7. How to Perform a Systematic Review and Meta-analysis of Diagnostic Imaging Studies.

    PubMed

    Cronin, Paul; Kelly, Aine Marie; Altaee, Duaa; Foerster, Bradley; Petrou, Myria; Dwamena, Ben A

    2018-05-01

    A systematic review is a comprehensive search, critical evaluation, and synthesis of all the relevant studies on a specific (clinical) topic that can be applied to the evaluation of diagnostic and screening imaging studies. It can be a qualitative or a quantitative (meta-analysis) review of available literature. A meta-analysis uses statistical methods to combine and summarize the results of several studies. In this review, a 12-step approach to performing a systematic review (and meta-analysis) is outlined under the four domains: (1) Problem Formulation and Data Acquisition, (2) Quality Appraisal of Eligible Studies, (3) Statistical Analysis of Quantitative Data, and (4) Clinical Interpretation of the Evidence. This review is specifically geared toward the performance of a systematic review and meta-analysis of diagnostic test accuracy (imaging) studies. Copyright © 2018 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
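    Within domain (3), the simplest pooling device is fixed-effect inverse-variance weighting. The sketch below pools hypothetical per-study log diagnostic odds ratios; note that modern diagnostic meta-analyses generally prefer bivariate or HSROC models, which jointly model sensitivity and specificity and which this simplification does not capture:

```python
import math

def inverse_variance_pool(estimates, variances):
    """Fixed-effect inverse-variance pooling: weights w_i = 1 / v_i."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))   # standard error of the pooled estimate
    return pooled, se

# Hypothetical per-study log diagnostic odds ratios and their variances
log_dor = [2.1, 1.8, 2.5, 1.6]
var     = [0.20, 0.10, 0.30, 0.15]
pooled, se = inverse_variance_pool(log_dor, var)
ci = (pooled - 1.96 * se, pooled + 1.96 * se)   # approximate 95% CI
```

Heterogeneity assessment (e.g., random-effects weighting) belongs in the same analysis step and would widen the interval when between-study variation is present.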

  8. Theory of the Sea Ice Thickness Distribution.

    PubMed

    Toppaladoddi, Srikanth; Wettlaufer, J S

    2015-10-02

    We use concepts from statistical physics to transform the original evolution equation for the sea ice thickness distribution g(h) from Thorndike et al. into a Fokker-Planck-like conservation law. The steady solution is g(h) = N(q) h^q e^(-h/H), where q and H are expressible in terms of moments over the transition probabilities between thickness categories. The solution exhibits the functional form used in observational fits and shows that for h ≪ 1, g(h) is controlled by both thermodynamics and mechanics, whereas for h ≫ 1 only mechanics controls g(h). Finally, we derive the underlying Langevin equation governing the dynamics of the ice thickness h, from which we predict the observed g(h). The genericity of our approach provides a framework for studying the geophysical-scale structure of the ice pack using methods of broad relevance in statistical mechanics.
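The steady solution quoted above is a gamma distribution in disguise. As a consistency check (not stated in the abstract itself), requiring the thickness distribution to normalize fixes the prefactor N(q):

```latex
\int_0^\infty N(q)\, h^{q} e^{-h/H}\, \mathrm{d}h
  \;=\; N(q)\, H^{q+1}\, \Gamma(q+1) \;=\; 1
\quad\Longrightarrow\quad
N(q) \;=\; \frac{1}{H^{q+1}\,\Gamma(q+1)} .
```

The paper may define N(q) differently (e.g. absorbing other constants); the form above is just the normalization implied by treating g(h) as a probability density on h ∈ (0, ∞).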

  9. Linear models: permutation methods

    USGS Publications Warehouse

    Cade, B.S.; Everitt, B.S.; Howell, D.C.

    2005-01-01

    Permutation tests (see Permutation Based Inference) for the linear model have applications in behavioral studies when traditional parametric assumptions about the error term in a linear model are not tenable. Improved validity of Type I error rates can be achieved with properly constructed permutation tests. Perhaps more importantly, increased statistical power, improved robustness to effects of outliers, and detection of alternative distributional differences can be achieved by coupling permutation inference with alternative linear model estimators. For example, it is well-known that estimates of the mean in a linear model are extremely sensitive to even a single outlying value of the dependent variable compared to estimates of the median [7, 19]. Traditionally, linear modeling focused on estimating changes in the center of distributions (means or medians). However, quantile regression allows distributional changes to be estimated in all or any selected part of a distribution of responses, providing a more complete statistical picture that has relevance to many biological questions [6]...
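A minimal sketch of the idea in this record: under the null hypothesis of no association, the response can be shuffled relative to the predictor, and the observed slope is compared against the resulting permutation null. The data below are synthetic; this is a generic illustration, not the estimators discussed in the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

def perm_test_slope(x, y, n_perm=2000):
    """Permutation test for the slope of a simple linear model y ~ x.

    Shuffling y breaks any x-y association while preserving both
    marginal distributions, so the permuted slopes form the null.
    """
    slope = np.polyfit(x, y, 1)[0]
    null = np.empty(n_perm)
    for i in range(n_perm):
        null[i] = np.polyfit(x, rng.permutation(y), 1)[0]
    # two-sided p-value with the +1 correction for an exact test
    return (1 + np.sum(np.abs(null) >= abs(slope))) / (n_perm + 1)

# toy data with a genuine effect
x = rng.normal(size=40)
y = 0.8 * x + rng.normal(size=40)
p = perm_test_slope(x, y)
```

The same wrapper applies to any statistic (e.g. a median-based or quantile-regression slope) by swapping out the `np.polyfit` line, which is the point the record makes about coupling permutation inference with alternative estimators.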

  10. Descriptive epidemiology of breast cancer in China: incidence, mortality, survival and prevalence.

    PubMed

    Li, Tong; Mello-Thoms, Claudia; Brennan, Patrick C

    2016-10-01

    Breast cancer is the most common neoplasm diagnosed amongst women worldwide and is the leading cause of female cancer death. However, breast cancer in China is not comprehensively understood compared with Westernised countries, although the 5-year prevalence statistics indicate that approximately 11 % of worldwide breast cancer occurs in China and that the incidence has increased rapidly in recent decades. This paper reviews the descriptive epidemiology of Chinese breast cancer in terms of incidence, mortality, survival and prevalence, and explores relevant factors such as age of manifestation and geographic locations. The statistics are compared with data from the Westernised world with particular emphasis on the United States and Australia. Potential causal agents responsible for differences in breast cancer epidemiology between Chinese and other populations are also explored. The need to minimise variability and discrepancies in methods of data acquisition, analysis and presentation is highlighted.

  11. Improving the Prognostic Ability through Better Use of Standard Clinical Data - The Nottingham Prognostic Index as an Example

    PubMed Central

    Winzer, Klaus-Jürgen; Buchholz, Anika; Schumacher, Martin; Sauerbrei, Willi

    2016-01-01

    Background Prognostic factors and prognostic models play a key role in medical research and patient management. The Nottingham Prognostic Index (NPI) is a well-established prognostic classification scheme for patients with breast cancer. In a very simple way, it combines the information from tumor size, lymph node stage and tumor grade. For the resulting index, cutpoints are proposed to classify it into three to six groups with different prognosis. As not all prognostic information from the three and other standard factors is used, we will consider improvement of the prognostic ability using suitable analysis approaches. Methods and Findings Reanalyzing overall survival data of 1560 patients from a clinical database by using multivariable fractional polynomials and further modern statistical methods, we illustrate suitable multivariable modelling and methods to derive and assess the prognostic ability of an index. Using a REMARK type profile we summarize relevant steps of the analysis. Adding the information from hormonal receptor status and using the full information from the three NPI components, specifically concerning the number of positive lymph nodes, an extended NPI with improved prognostic ability is derived. Conclusions The prognostic ability of even one of the best-established prognostic indices in medicine can be improved by using suitable statistical methodology to extract the full information from standard clinical data. This extended version of the NPI can serve as a benchmark to assess the added value of new information, ranging from a new single clinical marker to a derived index from omics data. An established benchmark would also help to harmonize the statistical analyses of such studies and protect against the propagation of many false promises concerning the prognostic value of new measurements. Statistical methods used are generally available and can be used for similar analyses in other diseases. PMID:26938061

  12. Successful classification of cocaine dependence using brain imaging: a generalizable machine learning approach.

    PubMed

    Mete, Mutlu; Sakoglu, Unal; Spence, Jeffrey S; Devous, Michael D; Harris, Thomas S; Adinoff, Bryon

    2016-10-06

    Neuroimaging studies have yielded significant advances in the understanding of neural processes relevant to the development and persistence of addiction. However, these advances have not been explored extensively for diagnostic accuracy in human subjects. The aim of this study was to develop a statistical approach, using a machine learning framework, to correctly classify brain images of cocaine-dependent participants and healthy controls. In this study, a framework suitable for educing potential brain regions that differed between the two groups was developed and implemented. Single Photon Emission Computerized Tomography (SPECT) images obtained during rest or a saline infusion in three cohorts of 2-4 week abstinent cocaine-dependent participants (n = 93) and healthy controls (n = 69) were used to develop a classification model. An information theoretic-based feature selection algorithm was first conducted to reduce the number of voxels. A density-based clustering algorithm was then used to form spatially connected voxel clouds in three-dimensional space. A statistical classifier, Support Vector Machine (SVM), was then used for participant classification. Statistically insignificant voxels of spatially connected brain regions were removed iteratively and classification accuracy was reported through the iterations. The voxel-based analysis identified 1,500 spatially connected voxels in 30 distinct clusters after a grid search in SVM parameters. Participants were successfully classified with 0.88 and 0.89 F-measure accuracies in 10-fold cross validation (10xCV) and leave-one-out (LOO) approaches, respectively. Sensitivity and specificity were 0.90 and 0.89 for LOO; 0.83 and 0.83 for 10xCV. Many of the 30 selected clusters are highly relevant to the addictive process, including regions relevant to cognitive control, default mode network related self-referential thought, behavioral inhibition, and contextual memories. Relative hyperactivity and hypoactivity of regional cerebral blood flow in brain regions in cocaine-dependent participants are presented with corresponding level of significance. The SVM-based approach successfully classified cocaine-dependent and healthy control participants using voxels selected with information theoretic-based and statistical methods from participants' SPECT data. The regions found in this study align with brain regions reported in the literature. These findings support the future use of brain imaging and SVM-based classifiers in the diagnosis of substance use disorders and furthering an understanding of their underlying pathology.
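The classification stage described in this record (SVM with leave-one-out cross-validation on a reduced feature set) can be sketched on synthetic data. Everything below is a stand-in: the feature matrix, group sizes, and signal structure are invented for illustration, and the paper's information-theoretic feature selection and density-based clustering steps are not reproduced.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import f1_score

rng = np.random.default_rng(1)
# Synthetic stand-in for selected voxel-cluster features:
# 60 participants x 30 clusters, signal confined to the first 10.
X = rng.normal(size=(60, 30))
y = np.array([0] * 30 + [1] * 30)   # 0 = control, 1 = dependent
X[y == 1, :10] += 1.5

clf = SVC(kernel="rbf", C=1.0, gamma="scale")
# leave-one-out: each participant is classified by a model trained
# on all the others, mirroring the LOO evaluation in the record
pred = cross_val_predict(clf, X, y, cv=LeaveOneOut())
f1 = f1_score(y, pred)
```

With real imaging data, the grid search over SVM parameters mentioned in the abstract would be nested inside the cross-validation loop to avoid optimistic bias.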

  13. Joint principal trend analysis for longitudinal high-dimensional data.

    PubMed

    Zhang, Yuping; Ouyang, Zhengqing

    2018-06-01

    We consider a research scenario motivated by integrating multiple sources of information for better knowledge discovery in diverse dynamic biological processes. Given two longitudinal high-dimensional datasets for a group of subjects, we want to extract shared latent trends and identify relevant features. To solve this problem, we present a new statistical method named as joint principal trend analysis (JPTA). We demonstrate the utility of JPTA through simulations and applications to gene expression data of the mammalian cell cycle and longitudinal transcriptional profiling data in response to influenza viral infections. © 2017, The International Biometric Society.

  14. Crisis as opportunity: international health work during the economic depression.

    PubMed

    Borowy, Iris

    2008-01-01

    The economic depression of the 1930s represented the most important economic and social crisis of its time. Surprisingly, its effect on health did not show in available morbidity and mortality rates. In 1932, the League of Nations Health Organisation embarked on a six-point program addressing statistical methods of measuring the effect, its influence on mental health and nutrition, and ways to safeguard public health through more efficient health systems. Some of these studies resulted in considerations of general relevance beyond crisis management. Unexpectedly, the crisis offered an opportunity to reconsider key concepts of individual and public health.

  15. Zone clearance in an infinite TASEP with a step initial condition

    NASA Astrophysics Data System (ADS)

    Cividini, Julien; Appert-Rolland, Cécile

    2017-06-01

    The TASEP is a paradigmatic model of out-of-equilibrium statistical physics, for which many quantities have been computed, either exactly or by approximate methods. In this work we study two new kinds of observables that have some relevance in biological or traffic models. They represent the probability for a given clearance zone of the lattice to be empty (for the first time) at a given time, starting from a step density profile. Exact expressions are obtained for single-time quantities, while more involved history-dependent observables are studied by Monte Carlo simulation, and partially predicted by a phenomenological approach.
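The Monte Carlo side of this record can be illustrated with a toy simulation: a TASEP with a step initial condition (left half occupied), random-sequential updates, and an estimate of the probability that a fixed zone is empty at a given time. This is a rough illustrative sketch only; the exact single-time expressions derived in the paper are not reproduced, and the lattice size, zone, and time are arbitrary choices.

```python
import random

random.seed(0)

def tasep_zone_empty_prob(L=60, t_max=30.0, zone=range(35, 40), trials=200):
    """Monte Carlo estimate of P(zone empty at time t_max) for a TASEP
    with step initial condition. One unit of time corresponds to L
    random single-site update attempts (random-sequential dynamics)."""
    hits = 0
    for _ in range(trials):
        occ = [1] * (L // 2) + [0] * (L - L // 2)  # step profile
        for _ in range(int(t_max * L)):
            i = random.randrange(L - 1)
            # a particle hops right only into an empty site;
            # attempts on blocked or empty sites simply fail
            if occ[i] == 1 and occ[i + 1] == 0:
                occ[i], occ[i + 1] = 0, 1
        if all(occ[j] == 0 for j in zone):
            hits += 1
    return hits / trials

p = tasep_zone_empty_prob()
```

The history-dependent "first time empty" observables studied in the paper would additionally require checking the zone at every update step, not just at t_max.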

  16. Models of Pilot Behavior and Their Use to Evaluate the State of Pilot Training

    NASA Astrophysics Data System (ADS)

    Jirgl, Miroslav; Jalovecky, Rudolf; Bradac, Zdenek

    2016-07-01

    This article discusses the possibilities of obtaining new information related to human behavior, namely the changes or progressive development of pilots' abilities during training. The main assumption is that a pilot's ability can be evaluated based on a corresponding behavioral model whose parameters are estimated using mathematical identification procedures. The mean values of the identified parameters are obtained via statistical methods. These parameters are then monitored and their changes evaluated. In this context, the paper introduces and examines relevant mathematical models of human (pilot) behavior, the pilot-aircraft interaction, and an example of the mathematical analysis.

  17. Recommendations for research design of telehealth studies.

    PubMed

    Chumbler, Neale R; Kobb, Rita; Brennan, David M; Rabinowitz, Terry

    2008-11-01

    Properly designed randomized controlled trials (RCTs) are the gold standard to use when examining the effectiveness of telehealth interventions on clinical outcomes. Some published telehealth studies have employed well-designed RCTs. However, such methods are not always feasible and practical in particular settings. This white paper addresses not only the need for properly designed RCTs, but also offers alternative research designs, such as quasi-experimental designs, and statistical techniques that can be employed to rigorously assess the effectiveness of telehealth studies. This paper further offers design and measurement recommendations aimed at and relevant to administrative decision-makers, policymakers, and practicing clinicians.

  18. Arthroscopic Debridement for Primary Degenerative Osteoarthritis of the Elbow Leads to Significant Improvement in Range of Motion and Clinical Outcomes: A Systematic Review.

    PubMed

    Sochacki, Kyle R; Jack, Robert A; Hirase, Takashi; McCulloch, Patrick C; Lintner, David M; Liberman, Shari R; Harris, Joshua D

    2017-12-01

    The purpose of this investigation was to determine whether arthroscopic debridement of primary elbow osteoarthritis results in statistically significant and clinically relevant improvement in (1) elbow range of motion and (2) clinical outcomes with (3) low complication and reoperation rates. A systematic review was registered with PROSPERO and performed using PRISMA guidelines. Databases were searched for studies that investigated the outcomes of arthroscopic debridement for the treatment of primary osteoarthritis of the elbow in adult human patients. Study methodological quality was analyzed. Studies that included post-traumatic arthritis were excluded. Elbow motion and all elbow-specific patient-reported outcome scores were eligible for analysis. Comparisons between preoperative and postoperative values from each study were made using 2-sample Z-tests (http://in-silico.net/tools/statistics/ztest) using a P value < .05. Nine articles (209 subjects, 213 elbows, 187 males, 22 females, mean age 45.7 ± 7.1 years, mean follow-up 41.7 ± 16.3 months; 75% right, 25% left; 79% dominant elbow, 21% nondominant) were analyzed. Elbow extension (23.4°-10.7°, Δ 12.7°), flexion (115.9°-128.7°, Δ 12.8°), and global arc of motion (94.5°-117.6°, Δ 23.1°) had statistically significant and clinically relevant improvement following arthroscopic debridement (P < .0001 for all). There was also a statistically significant (P < .0001) and clinically relevant improvement in the Mayo Elbow Performance Score (60.7-84.6, Δ 23.9) postoperatively. Six patients (2.8%) had postoperative complications. Nine (4.2%) underwent reoperation. Elbow arthroscopic debridement for primary degenerative osteoarthritis results in statistically significant and clinically relevant improvement in elbow range of motion and clinical outcomes with low complication and reoperation rates. Systematic review of level IV studies. Copyright © 2017 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
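The 2-sample Z-test used in this review is easy to reproduce from summary statistics. The sketch below uses the pre/post flexion means quoted in the abstract (115.9° vs 128.7°, 213 elbows); the standard deviations are hypothetical, since the abstract does not report them.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample Z-test from summary statistics, as used when only
    group means and SDs (not raw data) are available."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / se
    # two-sided p-value via the standard normal CDF
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# flexion example from the review; SDs of 12.0 are assumed for illustration
z, p = two_sample_z(128.7, 12.0, 213, 115.9, 12.0, 213)
```

Note that treating pooled per-study means as two independent samples, as the review does, ignores the paired pre/post structure within patients; a paired analysis would generally be more powerful.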

  19. Improved accuracy of supervised CRM discovery with interpolated Markov models and cross-species comparison.

    PubMed

    Kazemian, Majid; Zhu, Qiyun; Halfon, Marc S; Sinha, Saurabh

    2011-12-01

    Despite recent advances in experimental approaches for identifying transcriptional cis-regulatory modules (CRMs, 'enhancers'), direct empirical discovery of CRMs for all genes in all cell types and environmental conditions is likely to remain an elusive goal. Effective methods for computational CRM discovery are thus a critically needed complement to empirical approaches. However, existing computational methods that search for clusters of putative binding sites are ineffective if the relevant TFs and/or their binding specificities are unknown. Here, we provide a significantly improved method for 'motif-blind' CRM discovery that does not depend on knowledge or accurate prediction of TF-binding motifs and is effective when limited knowledge of functional CRMs is available to 'supervise' the search. We propose a new statistical method, based on 'Interpolated Markov Models', for motif-blind, genome-wide CRM discovery. It captures the statistical profile of variable length words in known CRMs of a regulatory network and finds candidate CRMs that match this profile. The method also uses orthologs of the known CRMs from closely related genomes. We perform in silico evaluation of predicted CRMs by assessing whether their neighboring genes are enriched for the expected expression patterns. This assessment uses a novel statistical test that extends the widely used Hypergeometric test of gene set enrichment to account for variability in intergenic lengths. We find that the new CRM prediction method is superior to existing methods. Finally, we experimentally validate 12 new CRM predictions by examining their regulatory activity in vivo in Drosophila; 10 of the tested CRMs were found to be functional, while 6 of the top 7 predictions showed the expected activity patterns. We make our program available as downloadable source code, and as a plugin for a genome browser installed on our servers. © The Author(s) 2011. Published by Oxford University Press.
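The in silico evaluation in this record builds on the standard Hypergeometric test of gene-set enrichment. A minimal sketch of that baseline test follows; the numbers are invented for illustration, and the paper's extension accounting for variable intergenic lengths is not reproduced.

```python
from scipy.stats import hypergeom

# Illustrative numbers (not from the paper): a genome of 10,000 genes,
# 300 of which show the expected expression pattern; the genes
# neighboring the predicted CRMs form a set of 50, of which 8 match.
pop, pattern, drawn, matched = 10000, 300, 50, 8

# P(X >= matched) when drawing `drawn` genes without replacement
# from the population: the survival function at matched - 1.
p = hypergeom.sf(matched - 1, pop, pattern, drawn)
```

A small p-value here says the predicted CRMs' neighboring genes are enriched for the expected pattern beyond what random gene sampling would give, which is the paper's criterion for a successful prediction.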

  20. A bootstrap based Neyman-Pearson test for identifying variable importance.

    PubMed

    Ditzler, Gregory; Polikar, Robi; Rosen, Gail

    2015-04-01

    Selection of the most informative features, those that lead to a small loss on future data, is arguably one of the most important steps in classification, data analysis and model selection. Several feature selection (FS) algorithms are available; however, due to noise present in any data set, FS algorithms are typically accompanied by an appropriate cross-validation scheme. In this brief, we propose a statistical hypothesis test derived from the Neyman-Pearson lemma for determining if a feature is statistically relevant. The proposed approach can be applied as a wrapper to any FS algorithm, regardless of the FS criteria used by that algorithm, to determine whether a feature belongs in the relevant set. Perhaps more importantly, this procedure efficiently determines the number of relevant features given an initial starting point. We provide freely available software implementations of the proposed methodology.
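The wrapper idea can be sketched with a generic resampling scheme: rerun a simple FS criterion on bootstrap resamples and compare each feature's selection frequency to the chance rate. This is an illustrative criterion only; the paper derives a formal Neyman-Pearson threshold for this kind of decision, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)

def bootstrap_selection_freq(X, y, k=3, n_boot=200):
    """For each feature, estimate how often it ranks in the top-k by
    absolute correlation with y across bootstrap resamples. A feature
    selected far more often than the chance rate k/d is a candidate
    for the relevant set."""
    n, d = X.shape
    counts = np.zeros(d)
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)          # bootstrap resample
        Xb, yb = X[idx], y[idx]
        corr = np.abs([np.corrcoef(Xb[:, j], yb)[0, 1] for j in range(d)])
        counts[np.argsort(corr)[-k:]] += 1   # top-k by the FS criterion
    return counts / n_boot

# toy data: only the first two of ten features carry signal
n, d = 120, 10
X = rng.normal(size=(n, d))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)
freq = bootstrap_selection_freq(X, y)
```

Any FS algorithm can replace the correlation ranking inside the loop, which is what makes the approach a wrapper in the sense of the record.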

  1. The effect of numerical methods on the simulation of mid-ocean ridge hydrothermal models

    NASA Astrophysics Data System (ADS)

    Carpio, J.; Braack, M.

    2012-01-01

    This work considers the effect of the numerical method on the simulation of a 2D model of hydrothermal systems located in the high-permeability axial plane of mid-ocean ridges. The behavior of hot plumes, formed in a porous medium between volcanic lava and the ocean floor, is very irregular due to convective instabilities. Therefore, we discuss and compare two different numerical methods for solving the mathematical model of this system. Specifically, we consider two ways to treat the temperature equation of the model: a semi-Lagrangian formulation of the advective terms in combination with a Galerkin finite element method for the parabolic part of the equations and a stabilized finite element scheme. Both methods are very robust and accurate. However, due to physical instabilities in the system at high Rayleigh number, the effect of the numerical method is significant with regard to the temperature distribution at a certain time instant. The good news is that relevant statistical quantities remain relatively stable and coincide for the two numerical schemes. The agreement is larger in the case of a mathematical model with constant water properties. In the case of a model with nonlinear dependence of the water properties on the temperature and pressure, the agreement in the statistics is clearly less pronounced. Hence, the presented work accentuates the need for a strengthened validation of the compatibility between numerical scheme (accuracy/resolution) and complex (realistic/nonlinear) models.

  2. Determinants of Judgments of Explanatory Power: Credibility, Generality, and Statistical Relevance.

    PubMed

    Colombo, Matteo; Bucher, Leandra; Sprenger, Jan

    2017-01-01

    Explanation is a central concept in human psychology. Drawing upon philosophical theories of explanation, psychologists have recently begun to examine the relationship between explanation, probability and causality. Our study advances this growing literature at the intersection of psychology and philosophy of science by systematically investigating how judgments of explanatory power are affected by (i) the prior credibility of an explanatory hypothesis, (ii) the causal framing of the hypothesis, (iii) the perceived generalizability of the explanation, and (iv) the relation of statistical relevance between hypothesis and evidence. Collectively, the results of our five experiments support the hypothesis that the prior credibility of a causal explanation plays a central role in explanatory reasoning: first, because of the presence of strong main effects on judgments of explanatory power, and second, because of the gate-keeping role it has for other factors. Highly credible explanations are not susceptible to causal framing effects, but they are sensitive to the effects of normatively relevant factors: the generalizability of an explanation, and its statistical relevance for the evidence. These results advance current literature in the philosophy and psychology of explanation in three ways. First, they yield a more nuanced understanding of the determinants of judgments of explanatory power, and the interaction between these factors. Second, they show the close relationship between prior beliefs and explanatory power. Third, they elucidate the nature of abductive reasoning.

  3. Determinants of Judgments of Explanatory Power: Credibility, Generality, and Statistical Relevance

    PubMed Central

    Colombo, Matteo; Bucher, Leandra; Sprenger, Jan

    2017-01-01

    Explanation is a central concept in human psychology. Drawing upon philosophical theories of explanation, psychologists have recently begun to examine the relationship between explanation, probability and causality. Our study advances this growing literature at the intersection of psychology and philosophy of science by systematically investigating how judgments of explanatory power are affected by (i) the prior credibility of an explanatory hypothesis, (ii) the causal framing of the hypothesis, (iii) the perceived generalizability of the explanation, and (iv) the relation of statistical relevance between hypothesis and evidence. Collectively, the results of our five experiments support the hypothesis that the prior credibility of a causal explanation plays a central role in explanatory reasoning: first, because of the presence of strong main effects on judgments of explanatory power, and second, because of the gate-keeping role it has for other factors. Highly credible explanations are not susceptible to causal framing effects, but they are sensitive to the effects of normatively relevant factors: the generalizability of an explanation, and its statistical relevance for the evidence. These results advance current literature in the philosophy and psychology of explanation in three ways. First, they yield a more nuanced understanding of the determinants of judgments of explanatory power, and the interaction between these factors. Second, they show the close relationship between prior beliefs and explanatory power. Third, they elucidate the nature of abductive reasoning. PMID:28928679

  4. Learning semantic and visual similarity for endomicroscopy video retrieval.

    PubMed

    Andre, Barbara; Vercauteren, Tom; Buchner, Anna M; Wallace, Michael B; Ayache, Nicholas

    2012-06-01

    Content-based image retrieval (CBIR) is a valuable computer vision technique which is increasingly being applied in the medical community for diagnosis support. However, traditional CBIR systems only deliver visual outputs, i.e., images having a similar appearance to the query, which is not directly interpretable by the physicians. Our objective is to provide a system for endomicroscopy video retrieval which delivers both visual and semantic outputs that are consistent with each other. In a previous study, we developed an adapted bag-of-visual-words method for endomicroscopy retrieval, called "Dense-Sift," that computes a visual signature for each video. In this paper, we present a novel approach to complement visual similarity learning with semantic knowledge extraction, in the field of in vivo endomicroscopy. We first leverage a semantic ground truth based on eight binary concepts, in order to transform these visual signatures into semantic signatures that reflect how much the presence of each semantic concept is expressed by the visual words describing the videos. Using cross-validation, we demonstrate that, in terms of semantic detection, our intuitive Fisher-based method transforming visual-word histograms into semantic estimations outperforms support vector machine (SVM) methods with statistical significance. In a second step, we propose to improve retrieval relevance by learning an adjusted similarity distance from a perceived similarity ground truth. As a result, our distance learning method allows us to statistically improve the correlation with the perceived similarity. We also demonstrate that, in terms of perceived similarity, the recall performance of the semantic signatures is close to that of visual signatures and significantly better than those of several state-of-the-art CBIR methods. The semantic signatures are thus able to communicate high-level medical knowledge while being consistent with the low-level visual signatures and much shorter than them. In our resulting retrieval system, we decide to use visual signatures for perceived similarity learning and retrieval, and semantic signatures to output additional information, expressed in the endoscopist's own language, which provides a relevant semantic translation of the visual retrieval outputs.

  5. Randomized clinical trials in implant therapy: relationships among methodological, statistical, clinical, paratextual features and number of citations.

    PubMed

    Nieri, Michele; Clauser, Carlo; Franceschi, Debora; Pagliaro, Umberto; Saletta, Daniele; Pini-Prato, Giovanpaolo

    2007-08-01

    The aim of the present study was to investigate the relationships among reported methodological, statistical, clinical and paratextual variables of randomized clinical trials (RCTs) in implant therapy, and their influence on subsequent research. The material consisted of the RCTs in implant therapy published through the end of the year 2000. Methodological, statistical, clinical and paratextual features of the articles were assessed and recorded. The perceived clinical relevance was subjectively evaluated by an experienced clinician on anonymous abstracts. The impact on research was measured by the number of citations found in the Science Citation Index. A new statistical technique (Structural learning of Bayesian Networks) was used to assess the relationships among the considered variables. Descriptive statistics revealed that the reported methodology and statistics of RCTs in implant therapy were defective. Follow-up of the studies was generally short. The perceived clinical relevance appeared to be associated with the objectives of the studies and with the number of published images in the original articles. The impact on research was related to the nationality of the involved institutions and to the number of published images. RCTs in implant therapy (until 2000) show important methodological and statistical flaws and may not be appropriate for guiding clinicians in their practice. The methodological and statistical quality of the studies did not appear to affect their impact on practice and research. Bayesian Networks suggest new and unexpected relationships among the methodological, statistical, clinical and paratextual features of RCTs.

  6. Methods for specifying spatial boundaries of cities in the world: The impacts of delineation methods on city sustainability indices.

    PubMed

    Uchiyama, Yuta; Mori, Koichiro

    2017-08-15

    The purpose of this paper is to analyze how different definitions and methods for delineating the spatial boundaries of cities have an impact on the values of city sustainability indicators. It is necessary to distinguish the inside of cities from the outside when calculating the values of sustainability indicators that assess the impacts of human activities within cities on areas beyond their boundaries. For this purpose, spatial boundaries of cities should be practically detected on the basis of a relevant definition of a city. Although no definition of a city is commonly shared among academic fields, three practical methods for identifying urban areas are available in remote sensing science. Those practical methods are based on population density, landcover, and night-time lights. These methods are correlated, but non-negligible differences exist in their determination of urban extents and urban population. Furthermore, critical and statistically significant differences in some urban environmental sustainability indicators result from the three different urban detection methods. For example, the average values of CO2 emissions per capita and PM10 concentration in cities with more than 1 million residents are significantly different among the definitions. When analyzing city sustainability indicators and disseminating the implication of the results, the values based on the different definitions should be simultaneously investigated. It is necessary to carefully choose a relevant definition to analyze sustainability indicators for policy making. Otherwise, ineffective and inefficient policies will be developed. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Long-term variability of global statistical properties of epileptic brain networks

    NASA Astrophysics Data System (ADS)

    Kuhnert, Marie-Therese; Elger, Christian E.; Lehnertz, Klaus

    2010-12-01

    We investigate the influence of various pathophysiologic and physiologic processes on global statistical properties of epileptic brain networks. We construct binary functional networks from long-term, multichannel electroencephalographic data recorded from 13 epilepsy patients, and the average shortest path length and the clustering coefficient serve as global statistical network characteristics. For time-resolved estimates of these characteristics we observe large fluctuations over time, however, with some periodic temporal structure. These fluctuations can—to a large extent—be attributed to daily rhythms while relevant aspects of the epileptic process contribute only marginally. Particularly, we could not observe clear cut changes in network states that can be regarded as predictive of an impending seizure. Our findings are of particular relevance for studies aiming at an improved understanding of the epileptic process with graph-theoretical approaches.
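The two global network characteristics used in this record are standard graph statistics and can be computed directly. The sketch below implements both for a small undirected graph given as adjacency sets (a toy example, not EEG-derived data); libraries such as networkx provide the same quantities.

```python
from collections import deque

def avg_shortest_path(adj):
    """Average shortest path length over all node pairs of a connected
    undirected graph, via BFS from every node."""
    n = len(adj)
    total, pairs = 0, 0
    for s in range(n):
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        # count each unordered pair once
        total += sum(d for node, d in dist.items() if node > s)
        pairs += sum(1 for node in dist if node > s)
    return total / pairs

def avg_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    coeffs = []
    for u, nbrs in enumerate(adj):
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for i in nbrs for j in nbrs if i < j and j in adj[i])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)

# toy graph: triangle (0-1-2) with a pendant node 3 attached to node 0
adj = [{1, 2, 3}, {0, 2}, {0, 1}, {0}]
```

In the study's setting these two numbers are recomputed on a binary functional network for every analysis window, yielding the time-resolved estimates whose fluctuations are then examined.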

  8. diffHic: a Bioconductor package to detect differential genomic interactions in Hi-C data.

    PubMed

    Lun, Aaron T L; Smyth, Gordon K

    2015-08-19

    Chromatin conformation capture with high-throughput sequencing (Hi-C) is a technique that measures the in vivo intensity of interactions between all pairs of loci in the genome. Most conventional analyses of Hi-C data focus on the detection of statistically significant interactions. However, an alternative strategy involves identifying significant changes in the interaction intensity (i.e., differential interactions) between two or more biological conditions. This is more statistically rigorous and may provide more biologically relevant results. Here, we present the diffHic software package for the detection of differential interactions from Hi-C data. diffHic provides methods for read pair alignment and processing, counting into bin pairs, filtering out low-abundance events and normalization of trended or CNV-driven biases. It uses the statistical framework of the edgeR package to model biological variability and to test for significant differences between conditions. Several options for the visualization of results are also included. The use of diffHic is demonstrated with real Hi-C data sets. Performance against existing methods is also evaluated with simulated data. On real data, diffHic is able to successfully detect interactions with significant differences in intensity between biological conditions. It also compares favourably to existing software tools on simulated data sets. These results suggest that diffHic is a viable approach for differential analyses of Hi-C data.

  9. Predicting Flowering Behavior and Exploring Its Genetic Determinism in an Apple Multi-family Population Based on Statistical Indices and Simplified Phenotyping.

    PubMed

    Durand, Jean-Baptiste; Allard, Alix; Guitton, Baptiste; van de Weg, Eric; Bink, Marco C A M; Costes, Evelyne

    2017-01-01

    Irregular flowering over years is commonly observed in fruit trees. The early prediction of tree behavior is highly desirable in breeding programmes. This study aims at performing such predictions, combining simplified phenotyping and statistical methods. Sequences of vegetative vs. floral annual shoots (AS) were observed along axes in trees belonging to five apple-related full-sib families. Sequences were analyzed using Markovian and linear mixed models including year and site effects. Indices of flowering irregularity, periodicity and synchronicity were estimated at tree and axis scales. They were used to predict tree behavior and detect QTL with a Bayesian pedigree-based analysis, using an integrated genetic map containing 6,849 SNPs. The combination of a Biennial Bearing Index (BBI) with an autoregressive coefficient (γg) efficiently predicted and classified the genotype behaviors, despite a few misclassifications. Four QTLs common to BBIs and γg, and one for synchronicity, were highlighted, revealing the complex genetic architecture of the traits. Irregularity resulted from high AS synchronism, whereas regularity resulted from either asynchronous locally alternating or continual regular AS flowering. A relevant and time-saving method, based on a posteriori sampling of axes and statistical indices, is proposed; it efficiently evaluates tree breeding values for flowering regularity and could be transferred to other species.
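
    As an illustration of the kinds of indices involved, one common form of the biennial bearing index and a lag-1 autocorrelation can be computed as below. The exact definitions of BBI and γg used by the authors may differ, and the yield series here are invented:

```python
def biennial_bearing_index(yields):
    """One common BBI variant: mean normalized year-to-year change.
    Values near 1 suggest strict alternation; values near 0 suggest regular bearing."""
    terms = [abs(b - a) / (a + b) for a, b in zip(yields, yields[1:]) if a + b > 0]
    return sum(terms) / len(terms)

def lag1_autocorrelation(yields):
    """Sample lag-1 autocorrelation; strongly negative for alternating series."""
    n = len(yields)
    mean = sum(yields) / n
    num = sum((yields[t] - mean) * (yields[t + 1] - mean) for t in range(n - 1))
    den = sum((y - mean) ** 2 for y in yields)
    return num / den

alternating = [10, 2, 11, 1, 12, 2, 10, 1]   # biennial bearer
regular     = [6, 7, 6, 6, 7, 6, 7, 6]       # regular bearer
```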

  10. Suicide Risk Assessment and Prevention: A Systematic Review Focusing on Veterans.

    PubMed

    Nelson, Heidi D; Denneson, Lauren M; Low, Allison R; Bauer, Brian W; O'Neil, Maya; Kansagara, Devan; Teo, Alan R

    2017-10-01

    Suicide rates in veteran and military populations in the United States are high. This article reviews studies of the accuracy of methods to identify individuals at increased risk of suicide and the effectiveness and adverse effects of health care interventions relevant to U.S. veteran and military populations in reducing suicide and suicide attempts. Trials, observational studies, and systematic reviews relevant to U.S. veterans and military personnel were identified in searches of MEDLINE, PsycINFO, SocINDEX, and Cochrane databases (January 1, 2008, to September 11, 2015), on Web sites, and in reference lists. Investigators extracted and confirmed data and dual-rated risk of bias for included studies. Nineteen studies evaluated accuracy of risk assessment methods, including models using retrospective electronic records data and clinician- or patient-rated instruments. Most methods demonstrated sensitivity ≥80% or area-under-the-curve values ≥.70 in single studies, including two studies based on electronic records of veterans and military personnel, but specificity varied. Suicide rates were reduced in six of eight observational studies of population-level interventions. Only two of ten trials of individual-level psychotherapy reported statistically significant differences between treatment and usual care. Risk assessment methods have been shown to be sensitive predictors of suicide and suicide attempts, but the frequency of false positives limits their clinical utility. Research to refine these methods and examine clinical applications is needed. Studies of suicide prevention interventions are inconclusive; trials of population-level interventions and promising therapies are required to support their clinical use.
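
    The trade-off described here, sensitive predictors whose false positives limit clinical utility, comes straight from the confusion matrix. A small sketch with hypothetical screening labels (1 = case, 0 = non-case):

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical screen: high sensitivity but many false positives
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
```

    With a low base rate, even a specific test flags mostly non-cases, which is why the review stresses the false-positive burden despite sensitivity ≥80%.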

  11. Statistical learning algorithms for identifying contrasting tillage practices with landsat thematic mapper data

    USDA-ARS?s Scientific Manuscript database

    Tillage management practices have direct impact on water holding capacity, evaporation, carbon sequestration, and water quality. This study examines the feasibility of two statistical learning algorithms, Least Square Support Vector Machine (LSSVM) and Relevance Vector Machine (RVM), for cla...

  12. 50 CFR 600.133 - Scientific and Statistical Committee (SSC).

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... information as is relevant to such Council's development and amendment of any fishery management plan. (b...). 600.133 Section 600.133 Wildlife and Fisheries FISHERY CONSERVATION AND MANAGEMENT, NATIONAL OCEANIC... Fishery Management Councils § 600.133 Scientific and Statistical Committee (SSC). (a) Each Council shall...

  13. 50 CFR 600.133 - Scientific and Statistical Committee (SSC).

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... information as is relevant to such Council's development and amendment of any fishery management plan. (b...). 600.133 Section 600.133 Wildlife and Fisheries FISHERY CONSERVATION AND MANAGEMENT, NATIONAL OCEANIC... Fishery Management Councils § 600.133 Scientific and Statistical Committee (SSC). (a) Each Council shall...

  14. Major bleeding after percutaneous coronary intervention and risk of subsequent mortality: a systematic review and meta-analysis

    PubMed Central

    Kwok, Chun Shing; Rao, Sunil V; Myint, Phyo K; Keavney, Bernard; Nolan, James; Ludman, Peter F; de Belder, Mark A; Loke, Yoon K; Mamas, Mamas A

    2014-01-01

    Objectives To examine the relationship between periprocedural bleeding complications and major adverse cardiovascular events (MACEs) and mortality outcomes following percutaneous coronary intervention (PCI) and study differences in the prognostic impact of different bleeding definitions. Methods We conducted a systematic review and meta-analysis of PCI studies that evaluated periprocedural bleeding complications and their impact on MACEs and mortality outcomes. A systematic search of MEDLINE and EMBASE was conducted to identify relevant studies. Data from relevant studies were extracted and random effects meta-analysis was used to estimate the risk of adverse outcomes with periprocedural bleeding. Statistical heterogeneity was assessed by considering the I2 statistic. Results 42 relevant studies were identified including 533 333 patients. Meta-analysis demonstrated that periprocedural major bleeding complications were independently associated with increased risk of mortality (OR 3.31 (2.86 to 3.82), I2=80%) and MACEs (OR 3.89 (3.26 to 4.64), I2=42%). A differential impact of major bleeding as defined by different bleeding definitions on mortality outcomes was observed, in which REPLACE-2 (OR 6.69, 95% CI 2.26 to 19.81), STEEPLE (OR 6.59, 95% CI 3.89 to 11.16) and BARC (OR 5.40, 95% CI 1.74 to 16.74) had the worst prognostic impact while HORIZONS-AMI (OR 1.51, 95% CI 1.11 to 2.05) had the least impact on mortality outcomes. Conclusions Major bleeding after PCI is independently associated with a threefold increase in mortality and MACEs. Different contemporary bleeding definitions have differential impacts on mortality outcomes, with 1.5–6.7-fold increases in mortality observed depending on the definition of major bleeding used. PMID:25332786
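
    The random-effects pooling and the I² heterogeneity statistic reported here follow a standard recipe (DerSimonian-Laird). A sketch with illustrative log odds ratios and standard errors, not the study's data:

```python
import math

def random_effects_pool(log_ors, ses):
    """DerSimonian-Laird random-effects pooling of log odds ratios,
    returning the pooled OR and the I^2 heterogeneity statistic."""
    w = [1.0 / s ** 2 for s in ses]           # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))  # Cochran's Q
    df = len(log_ors) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)             # between-study variance estimate
    w_star = [1.0 / (s ** 2 + tau2) for s in ses]
    pooled = sum(wi * y for wi, y in zip(w_star, log_ors)) / sum(w_star)
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return math.exp(pooled), i2

# Hypothetical per-study estimates (log OR, SE)
log_ors = [math.log(3.0), math.log(3.6), math.log(2.6), math.log(4.1)]
ses = [0.12, 0.18, 0.15, 0.22]
or_pooled, i2 = random_effects_pool(log_ors, ses)
```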

  15. Health-Related Quality of Life up to Six Years After ¹²⁵I Brachytherapy for Early-Stage Prostate Cancer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roeloffzen, Ellen M.A., E-mail: E.M.A.Roeloffzen@UMCUtrecht.n; Lips, Irene M.; Gellekom, Marion P.R. van

    2010-03-15

    Purpose: Health-related quality of life (HRQOL) after prostate brachytherapy has been extensively described in published reports but hardly any long-term data are available. The aim of the present study was to prospectively assess long-term HRQOL 6 years after ¹²⁵I prostate brachytherapy. Methods and Materials: A total of 127 patients treated with ¹²⁵I brachytherapy for early-stage prostate cancer between December 2000 and June 2003 completed a HRQOL questionnaire at five time-points: before treatment and 1 month, 6 months, 1 year, and 6 years after treatment. The questionnaire included the RAND-36 generic health survey, the cancer-specific European Organization for Research and Treatment of Cancer core questionnaire (EORTC QLQ-C30), and the tumor-specific EORTC prostate cancer module (EORTC-PR25). A change in a score of ≥10 points was considered clinically relevant. Results: Overall, the HRQOL at 6 years after ¹²⁵I prostate brachytherapy did not significantly differ from baseline. Although a statistically significant deterioration in HRQOL at 6 years was seen for urinary symptoms, bowel symptoms, pain, physical functioning, and sexual activity (p <.01), most changes were not clinically relevant. A statistically significant improvement at 6 years was seen for mental health, emotional functioning, and insomnia (p <.01). The only clinically relevant changes were seen for emotional functioning and sexual activity. Conclusion: This is the first study presenting prospective HRQOL data up to 6 years after ¹²⁵I prostate brachytherapy. HRQOL scores returned to approximately baseline values at 1 year and remained stable up to 6 years after treatment. ¹²⁵I prostate brachytherapy did not adversely affect patients' long-term HRQOL.

  16. Use of a Self-Reflection Tool to Enhance Resident Learning on an Adolescent Medicine Rotation.

    PubMed

    Greenberg, Katherine Blumoff; Baldwin, Constance

    2016-08-01

    Adolescent Medicine (AM) educators in pediatric residency programs are seeking new ways to engage learners in adolescent health. This mixed-methods study presents a novel self-reflection tool and addresses whether self-reflection enhanced residents' perception of the value of an adolescent rotation, in particular, its relevance to their future practice. The self-reflection tool included 17 Likert scale items on residents' comfort with the essential tasks of adolescent care and open-ended questions that promoted self-reflection and goal setting. Semi-structured, postrotation interviews encouraged residents to discuss their experiences. Likert scale data were analyzed using descriptive statistics, and interview notes and written comments on the self-reflection tool were combined for qualitative data analysis. Residents' pre- to post-self-evaluations showed statistically significant increases in comfort with most of the adolescent health care tasks. Four major themes emerged from our qualitative analysis: (1) the value of observing skilled attendings as role models; (2) the comfort gained through broad and frequent adolescent care experiences; (3) the career relevance of AM; and (4) the ability to set personally meaningful goals for the rotation. Residents used the self-reflection tool to mindfully set goals and found their AM education valuable and relevant to their future careers. Our tool helped make explicit to residents the norms, values, and beliefs of the hidden curriculum applied to the care of adolescents and helped them to improve the self-assessed quality of their rapport and communications with adolescents. We conclude that a structured self-reflection exercise can enhance residents' experiences on an AM rotation. Copyright © 2016 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.

  17. Inferring general relations between network characteristics from specific network ensembles.

    PubMed

    Cardanobile, Stefano; Pernice, Volker; Deger, Moritz; Rotter, Stefan

    2012-01-01

    Different network models have been suggested for the topology underlying complex interactions in natural systems. These models are aimed at replicating specific statistical features encountered in real-world networks. However, it is rarely considered to what degree the results obtained for one particular network class can be extrapolated to real-world networks. We address this issue by comparing different classical and more recently developed network models with respect to their ability to generate networks with large structural variability. In particular, we consider the statistical constraints which the respective construction scheme imposes on the generated networks. After having identified the most variable networks, we address the issue of which constraints are common to all network classes and are thus suitable candidates for being generic statistical laws of complex networks. In fact, we find that generic, not model-related dependencies between different network characteristics do exist. This makes it possible to infer global features from local ones using regression models trained on networks with high generalization power. Our results confirm and extend previous findings regarding the synchronization properties of neural networks. Our method seems especially relevant for large networks, which are difficult to map completely, like the neural networks in the brain. The structure of such large networks cannot be fully sampled with the present technology. Our approach provides a method to estimate global properties of under-sampled networks to a good approximation. Finally, we demonstrate on three different data sets (C. elegans neuronal network, R. prowazekii metabolic network, and a network of synonyms extracted from Roget's Thesaurus) that real-world networks have statistical relations compatible with those obtained using regression models.

  18. Diagnostic potential of real-time elastography (RTE) and shear wave elastography (SWE) to differentiate benign and malignant thyroid nodules

    PubMed Central

    Hu, Xiangdong; Liu, Yujiang; Qian, Linxue

    2017-01-01

    Abstract Background: Real-time elastography (RTE) and shear wave elastography (SWE) are noninvasive and easily available imaging techniques that measure tissue strain, and it has been reported that the sensitivity and specificity of elastography are better than those of conventional technologies for differentiating between benign and malignant thyroid nodules. Methods: Relevant articles were searched in multiple databases; the comparison of elasticity index (EI) was conducted with Review Manager 5.0. Forest plots of sensitivity and specificity and the SROC curves of RTE and SWE were generated with STATA 10.0 software. In addition, sensitivity analysis and bias analysis of the studies were conducted to examine the quality of the articles; to estimate possible publication bias, a funnel plot was used and the Egger test was conducted. Results: Finally, 22 articles that satisfied the inclusion criteria were included in this study. After exclusions, the numbers of benign and malignant nodules were 2106 and 613, respectively. The meta-analysis suggested that the difference in EI between benign and malignant nodules was statistically significant (SMD = 2.11, 95% CI [1.67, 2.55], P < .00001). The overall sensitivities of RTE and SWE were roughly comparable, whereas the difference in specificities between these 2 methods was statistically significant. In addition, a statistically significant difference in AUC between RTE and SWE was observed (P < .01). Conclusion: The specificity of RTE was statistically higher than that of SWE, which suggests that, compared with SWE, RTE may be more accurate in differentiating benign and malignant thyroid nodules. PMID:29068996
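
    The SMD reported here is a standardized mean difference. A minimal sketch using the pooled-SD (Cohen's d) form; the group sizes echo the record, but the means and SDs are invented for illustration:

```python
import math

def standardized_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d form of the SMD: difference in group means divided by the pooled SD."""
    pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled

# Hypothetical elasticity-index summaries: malignant (n=613) vs benign (n=2106)
d = standardized_mean_difference(3.2, 0.6, 613, 1.9, 0.6, 2106)
```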

  19. Addressing the mischaracterization of extreme rainfall in regional climate model simulations - A synoptic pattern based bias correction approach

    NASA Astrophysics Data System (ADS)

    Li, Jingwan; Sharma, Ashish; Evans, Jason; Johnson, Fiona

    2018-01-01

    Addressing systematic biases in regional climate model simulations of extreme rainfall is a necessary first step before assessing changes in future rainfall extremes. Commonly used bias correction methods are designed to match statistics of the overall simulated rainfall with observations. This assumes that change in the mix of different types of extreme rainfall events (i.e. convective and non-convective) in a warmer climate is of little relevance in the estimation of overall change, an assumption that is not supported by empirical or physical evidence. This study proposes an alternative approach to account for the potential change of alternate rainfall types, characterized here by synoptic weather patterns (SPs) using self-organizing maps classification. The objective of this study is to evaluate the added influence of SPs on the bias correction, which is achieved by comparing the corrected distribution of future extreme rainfall with that using conventional quantile mapping. A comprehensive synthetic experiment is first defined to investigate the conditions under which the additional information of SPs makes a significant difference to the bias correction. Using over 600,000 synthetic cases, statistically significant differences are found to be present in 46% of cases. This is followed by a case study over the Sydney region using a high-resolution run of the Weather Research and Forecasting (WRF) regional climate model, which indicates a small change in the proportions of the SPs and a statistically significant change in the extreme rainfall over the region, although the differences between the changes obtained from the two bias correction methods are not statistically significant.
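
    Conventional quantile mapping, the baseline against which the proposed SP-conditioned approach is compared, can be sketched as empirical quantile mapping on synthetic rainfall (all distributions below are invented for illustration):

```python
import numpy as np

def quantile_map(model_hist, obs, model_future):
    """Empirical quantile mapping: replace each future model value by the
    observed value at the same quantile it occupies in the historical simulation."""
    ranks = np.searchsorted(np.sort(model_hist), model_future, side="right")
    q = np.clip(ranks / len(model_hist), 0.0, 1.0)
    return np.quantile(obs, q)

rng = np.random.default_rng(0)
obs = rng.gamma(2.0, 10.0, 5000)          # "observed" rainfall
model_hist = obs * 0.7                    # model systematically underestimates
model_future = rng.gamma(2.0, 7.0, 5000)  # future simulation, same bias
corrected = quantile_map(model_hist, obs, model_future)
```

    The SP-based variant in the record applies this kind of mapping separately within each synoptic pattern class, so that shifts in the mix of rainfall types carry through to the corrected distribution.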

  20. Changes in physiotherapy students' knowledge and perceptions of EBP from first year to graduation: a mixed methods study.

    PubMed

    McEvoy, Maureen P; Lewis, Lucy K; Luker, Julie

    2018-05-11

    Dedicated Evidence-Based Practice (EBP) courses are often included in health professional education programs. It is important to understand the effectiveness of this training. This study investigated EBP outcomes in entry-level physiotherapy students from baseline to completion of all EBP training (graduation). Mixed methods with an explanatory sequential design. Physiotherapy students completed two psychometrically-tested health professional EBP instruments at baseline and graduation. The Evidence-Based Practice Profile questionnaire collected self-reported data (Terminology, Confidence, Practice, Relevance, Sympathy), and the Knowledge of Research Evidence Competencies instrument collected objective data (Actual Knowledge). Focus groups with students were conducted at graduation to gain a deeper understanding of the factors impacting changes in students' EBP knowledge, attitudes, behaviour and competency. Descriptive statistics, paired t-tests, 95% CI and effect sizes (ES) were used to examine changes in outcome scores from baseline to graduation. Transcribed focus group data were analysed following a qualitative descriptive approach with thematic analysis. A second stage of merged data analysis for mixed methods studies was undertaken using side-by-side comparisons to explore quantitatively assessed EBP measures with participants' personal perceptions. Data were analysed from 56 participants who completed both instruments at baseline and graduation, and from 21 focus group participants. Large ES were reported across most outcomes: Relevance (ES 2.29, p ≤ 0.001), Practice (1.8, p ≤ 0.001), Confidence (1.67, p ≤ 0.001), Terminology (3.13, p ≤ 0.001) and Actual Knowledge (4.3, p ≤ 0.001). A medium ES was found for Sympathy (0.49, p = 0.008). Qualitative and quantitative findings mostly aligned, but for statistical terminology participants' self-reported understanding was at odds with the experiences reported in focus groups. 
Qualitative findings highlighted the importance of providing relevant context and positive role models for students during EBP training. Following EBP training across an entry-level physiotherapy program, there were qualitative and significant quantitative changes in participants' knowledge and perceptions of EBP. The qualitative and quantitative findings were mainly well-aligned with the exception of the Terminology domain, where the qualitative findings did not support the strength of the effect reported quantitatively. The findings of this study have implications for the timing and content of EBP curricula in entry-level health professional programs.
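
    The paired t-tests and effect sizes used in this study are standard. A sketch computing a paired t statistic and a Cohen's d for paired data (mean difference divided by the SD of the differences) on invented pre/post scores:

```python
import math

def paired_t_and_d(pre, post):
    """Paired t statistic and Cohen's d for paired samples."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean = sum(diffs) / n
    sd = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))  # SD of differences
    t = mean / (sd / math.sqrt(n))
    return t, mean / sd

# Hypothetical baseline and graduation scores for 8 students
pre = [40, 35, 50, 45, 38, 42, 48, 36]
post = [62, 58, 70, 66, 55, 64, 72, 59]
t, d = paired_t_and_d(pre, post)
```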

  1. Sampling of temporal networks: Methods and biases

    NASA Astrophysics Data System (ADS)

    Rocha, Luis E. C.; Masuda, Naoki; Holme, Petter

    2017-11-01

    Temporal networks have been increasingly used to model a diversity of systems that evolve in time; for example, human contact structures over which dynamic processes such as epidemics take place. A fundamental aspect of real-life networks is that they are sampled within temporal and spatial frames. Furthermore, one might wish to subsample networks to reduce their size for better visualization or to perform computationally intensive simulations. The sampling method may affect the network structure and thus caution is necessary to generalize results based on samples. In this paper, we study four sampling strategies applied to a variety of real-life temporal networks. We quantify the biases generated by each sampling strategy on a number of relevant statistics such as link activity, temporal paths and epidemic spread. We find that some biases are common in a variety of networks and statistics, but one strategy, uniform sampling of nodes, shows improved performance in most scenarios. Given the particularities of temporal network data and the variety of network structures, we recommend that the choice of sampling methods be problem-oriented to minimize the potential biases for the specific research questions at hand. Our results help researchers to better design network data collection protocols and to understand the limitations of sampled temporal network data.
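
    Uniform sampling of nodes, the strategy found to perform best, can be sketched as follows on a toy contact list of (time, node, node) events; only contacts between retained nodes survive the subsampling:

```python
import random

def sample_nodes_uniform(contacts, keep_fraction, seed=0):
    """Uniformly subsample nodes and keep only contacts whose
    endpoints were both retained."""
    rng = random.Random(seed)
    nodes = sorted({x for _, u, v in contacts for x in (u, v)})
    kept = set(rng.sample(nodes, int(keep_fraction * len(nodes))))
    return [(t, u, v) for t, u, v in contacts if u in kept and v in kept]

# Toy temporal network: a ring of 10 nodes, each contact repeated over 5 time steps
contacts = [(t, u, (u + 1) % 10) for t in range(5) for u in range(10)]
sub = sample_nodes_uniform(contacts, 0.5)
```

    Comparing statistics such as link activity or reachability between `contacts` and `sub` is the kind of bias quantification the paper performs at scale.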

  2. A retrospective likelihood approach for efficient integration of multiple omics factors in case-control association studies.

    PubMed

    Balliu, Brunilda; Tsonaka, Roula; Boehringer, Stefan; Houwing-Duistermaat, Jeanine

    2015-03-01

    Integrative omics, the joint analysis of outcome and multiple types of omics data, such as genomics, epigenomics, and transcriptomics data, constitute a promising approach for powerful and biologically relevant association studies. These studies often employ a case-control design, and often include nonomics covariates, such as age and gender, that may modify the underlying omics risk factors. An open question is how best to integrate multiple omics and nonomics information to maximize statistical power in case-control studies that ascertain individuals based on the phenotype. Recent work on integrative omics has used prospective approaches, modeling case-control status conditional on omics and nonomics risk factors. Compared to univariate approaches, jointly analyzing multiple risk factors with a prospective approach increases power in nonascertained cohorts. However, these prospective approaches often lose power in case-control studies. In this article, we propose a novel statistical method for integrating multiple omics and nonomics factors in case-control association studies. Our method is based on a retrospective likelihood function that models the joint distribution of omics and nonomics factors conditional on case-control status. The new method provides accurate control of Type I error rate and has increased efficiency over prospective approaches in both simulated and real data. © 2015 Wiley Periodicals, Inc.

  3. climwin: An R Toolbox for Climate Window Analysis.

    PubMed

    Bailey, Liam D; van de Pol, Martijn

    2016-01-01

    When studying the impacts of climate change, there is a tendency to select climate data from a small set of arbitrary time periods or climate windows (e.g., spring temperature). However, these arbitrary windows may not encompass the strongest periods of climatic sensitivity and may lead to erroneous biological interpretations. Therefore, there is a need to consider a wider range of climate windows to better predict the impacts of future climate change. We introduce the R package climwin that provides a number of methods to test the effect of different climate windows on a chosen response variable and compare these windows to identify potential climate signals. climwin extracts the relevant data for each possible climate window and uses this data to fit a statistical model, the structure of which is chosen by the user. Models are then compared using an information criteria approach. This allows users to determine how well each window explains variation in the response variable and compare model support between windows. climwin also contains methods to detect type I and II errors, which are often a problem with this type of exploratory analysis. This article presents the statistical framework and technical details behind the climwin package and demonstrates the applicability of the method with a number of worked examples.
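
    The window-comparison idea behind climwin (fit a model per candidate window, then rank windows by an information criterion) can be sketched in Python. This is not the package itself (climwin is an R package), and the data are synthetic with a known true window:

```python
import numpy as np

def window_scan(daily_temp, response, max_open=60, min_dur=7):
    """Fit response ~ mean temperature over each candidate window (OLS)
    and return the window with the lowest Gaussian AIC."""
    n_years, _ = daily_temp.shape
    best = None
    for open_day in range(max_open):
        for dur in range(min_dur, max_open - open_day + 1):
            x = daily_temp[:, open_day:open_day + dur].mean(axis=1)
            X = np.column_stack([np.ones(n_years), x])
            beta, *_ = np.linalg.lstsq(X, response, rcond=None)
            resid = response - X @ beta
            rss = float(resid @ resid)
            aic = n_years * np.log(rss / n_years) + 2 * 3  # intercept, slope, variance
            if best is None or aic < best[0]:
                best = (aic, open_day, dur)
    return best

rng = np.random.default_rng(1)
temp = rng.normal(10, 3, size=(40, 90))      # 40 years x 90 days of "temperature"
signal = temp[:, 20:34].mean(axis=1)         # true climate window: days 20-33
resp = 2.0 * signal + rng.normal(0, 0.5, 40)
aic, open_day, dur = window_scan(temp, resp)
```

    climwin additionally randomizes the data to estimate how often a comparable signal arises by chance, guarding against the type I errors such exploratory scans invite.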

  4. Evaluating methods of correcting for multiple comparisons implemented in SPM12 in social neuroscience fMRI studies: an example from moral psychology.

    PubMed

    Han, Hyemin; Glenn, Andrea L

    2018-06-01

    In fMRI research, the goal of correcting for multiple comparisons is to identify areas of activity that reflect true effects, and thus would be expected to replicate in future studies. Finding an appropriate balance between trying to minimize false positives (Type I error) while not being too stringent and omitting true effects (Type II error) can be challenging. Furthermore, the advantages and disadvantages of these types of errors may differ for different areas of study. In many areas of social neuroscience that involve complex processes and considerable individual differences, such as the study of moral judgment, effects are typically smaller and statistical power weaker, leading to the suggestion that less stringent corrections that allow for more sensitivity may be beneficial and also result in more false positives. Using moral judgment fMRI data, we evaluated four commonly used methods for multiple comparison correction implemented in Statistical Parametric Mapping 12 by examining which method produced the most precise overlap with results from a meta-analysis of relevant studies and with results from nonparametric permutation analyses. We found that voxelwise thresholding with familywise error correction based on Random Field Theory provides a more precise overlap (i.e., without omitting too few regions or encompassing too many additional regions) than either clusterwise thresholding, Bonferroni correction, or false discovery rate correction methods.
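
    Two of the corrections compared here, Bonferroni (familywise error) and Benjamini-Hochberg (false discovery rate), are easy to state exactly. A sketch on an invented p-value list, illustrating how the less stringent FDR procedure admits more rejections:

```python
def bonferroni(pvals, alpha=0.05):
    """Reject H0 where p < alpha / m (familywise error control)."""
    m = len(pvals)
    return [p < alpha / m for p in pvals]

def benjamini_hochberg(pvals, alpha=0.05):
    """BH step-up procedure: find the largest rank k with
    p_(k) <= k * alpha / m and reject the k smallest p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.21, 0.5]
```

    Voxelwise RFT-based familywise correction, the method favoured in the record, replaces the naive Bonferroni count with an estimate of the number of independent resolution elements in the smooth statistical map.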

  5. HYPE: a WFD tool for the identification of significant and sustained upward trends in groundwater time series

    NASA Astrophysics Data System (ADS)

    Lopez, Benjamin; Croiset, Nolwenn; Laurence, Gourcy

    2014-05-01

    The Water Framework Directive 2006/118/EC (WFD) on the protection of groundwater against pollution and deterioration asks Member States to identify significant and sustained upward trends in all bodies or groups of bodies of groundwater that are characterised as being at risk in accordance with Annex II to Directive 2000/60/EC. The Directive indicates that the procedure for the identification of significant and sustained upward trends must be based on a statistical method. Moreover, for significant increases in concentrations of pollutants, trend reversals must be identified, which requires the ability to detect significant trend reversals. A specific tool, named HYPE, has been developed to help stakeholders working on groundwater trend assessment. The R-encoded tool HYPE provides statistical analysis of groundwater time series. It follows several studies on the relevance of statistical tests for groundwater data series (Lopez et al., 2011) and other case studies on this theme (Bourgine et al., 2012). It integrates the most powerful and robust statistical tests for hydrogeological applications. HYPE is linked to the French national database on groundwater data (ADES), so monitoring data gathered by the Water Agencies can be processed directly. HYPE has two main modules: - a characterisation module, which allows users to visualize time series. HYPE calculates the main statistical characteristics and provides graphical representations; - a trend module, which identifies significant breaks, trends and trend reversals in time series, providing a results table and graphical representations. Additional modules are also implemented to identify regional and seasonal trends and to sample time series in a relevant way. 
    HYPE has been used successfully in 2012 by the French Water Agencies to satisfy requirements of the WFD concerning characterization of groundwater bodies' qualitative status and evaluation of the risk of non-achievement of good status. Bourgine B. et al. 2012, Ninth International Geostatistics Congress, Oslo, Norway, June 11-15. Lopez B. et al. 2011, Final Report BRGM/RP-59515-FR, 166p.
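
    A typical ingredient of such trend modules is the Mann-Kendall test. The sketch below (normal approximation, no tie correction) is illustrative and is not HYPE's implementation:

```python
import math

def mann_kendall(series):
    """Mann-Kendall S statistic and two-sided p-value for a monotonic trend,
    using the large-sample normal approximation without tie correction."""
    n = len(series)
    s = sum((series[j] > series[i]) - (series[j] < series[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return s, p

# Invented concentration series: a sustained upward trend vs. noise
rising = [1.0, 1.3, 1.2, 1.6, 1.8, 1.7, 2.1, 2.4, 2.3, 2.8]
flat = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 0.8, 1.1, 1.0, 0.9]
```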

  6. SU-E-I-71: Quality Assessment of Surrogate Metrics in Multi-Atlas-Based Image Segmentation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhao, T; Ruan, D

    Purpose: With ever-growing data of heterogeneous quality, relevance assessment of atlases becomes increasingly critical for multi-atlas-based image segmentation. However, there is no universally recognized best relevance metric and even a standard to compare amongst candidates remains elusive. This study, for the first time, designs a quantification to assess relevance metrics' quality, based on a novel perspective of the metric as a surrogate for inferring the inaccessible oracle geometric agreement. Methods: We first develop an inference model to relate surrogate metrics in image space to the underlying oracle relevance metric in segmentation label space, with a monotonically non-decreasing function subject to random perturbations. Subsequently, we investigate model parameters to reveal key contributing factors to surrogates' ability in prognosticating the oracle relevance value, for the specific task of atlas selection. Finally, we design an effective contrast-to-noise ratio (eCNR) to quantify surrogates' quality based on insights from these analyses and empirical observations. Results: The inference model was specialized to a linear function with normally distributed perturbations, with the surrogate metric exemplified by several widely used image similarity metrics, i.e., MSD/NCC/(N)MI. Surrogates' behaviors in selecting the most relevant atlases were assessed under varying eCNR, showing that surrogates with high eCNR dominated those with low eCNR in retaining the most relevant atlases. In an end-to-end validation, NCC/(N)MI with eCNR of 0.12 resulted in statistically better segmentation (mean DSC of about 0.85, with first and third quartiles of 0.83 and 0.89) than MSD with eCNR of 0.10 (mean DSC of 0.84, with first and third quartiles of 0.81 and 0.89). Conclusion: The designed eCNR is capable of characterizing surrogate metrics' quality in prognosticating the oracle relevance value. 
    It has been demonstrated to be correlated with the performance of relevant atlas selection and ultimate label fusion.

  7. Analyzing Large Gene Expression and Methylation Data Profiles Using StatBicRM: Statistical Biclustering-Based Rule Mining

    PubMed Central

    Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra

    2015-01-01

    Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify special types of rules and potential biomarkers from biological datasets using integrated statistical and binary inclusion-maximal biclustering techniques. First, a novel statistical strategy is used to eliminate insignificant, low-significance, or redundant genes in such a way that the significance level satisfies the data distribution property (viz., either normal or non-normal distribution). The data are then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets, and the corresponding rules are extracted from the selected itemsets. Our proposed rule mining method performs better than other rule mining algorithms because it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets; thus, it saves elapsed time and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we classify the data to assess how accurately the evolved rules describe the remaining (unknown) test data, and we compare the average classification accuracy and other related factors with other rule-based classifiers. Statistical significance tests are performed to verify the statistical relevance of the comparative results. Here, each of the other rule mining methods or rule-based classifiers also starts from the same post-discretized data matrix. Finally, we include an integrated analysis of gene expression and methylation to determine the epigenetic effect (viz., the effect of methylation) on gene expression levels. PMID:25830807
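
    The pipeline above discretizes the data and then mines maximal frequent closed itemsets. As a toy illustration of the "closed" notion (an itemset is closed when no proper superset is supported by exactly the same rows), here is a brute-force stdlib-Python sketch on a hypothetical post-discretized binary matrix — not the binary inclusion-maximal (Bimax-style) algorithm StatBicRM actually uses:

```python
from itertools import combinations

def closed_frequent_itemsets(rows, min_support):
    """Brute-force closed frequent itemsets over a binary matrix.
    An itemset is closed if no proper superset has the same supporting rows.
    Exponential in the number of columns -- a toy for small matrices only."""
    n_items = len(rows[0])
    support = {}
    for r in range(1, n_items + 1):
        for combo in combinations(range(n_items), r):
            sup = frozenset(i for i, row in enumerate(rows)
                            if all(row[c] for c in combo))
            if len(sup) >= min_support:
                support[combo] = sup
    return [s for s in support
            if not any(set(s) < set(t) and support[t] == support[s]
                       for t in support)]

# Hypothetical post-discretized sample-by-gene matrix (1 = expressed)
rows = [[1, 1, 0, 1],
        [1, 1, 0, 0],
        [0, 1, 1, 0],
        [1, 1, 0, 1]]
print(sorted(closed_frequent_itemsets(rows, min_support=2)))
# → [(0, 1), (0, 1, 3), (1,)]
```

    Note how itemset (0,) is absorbed into the closed itemset (0, 1), since both are supported by exactly the same rows — this pruning is what lets closed-itemset mining save time over plain frequent-itemset mining.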

  9. Mild cognitive impairment and fMRI studies of brain functional connectivity: the state of the art

    PubMed Central

    Farràs-Permanyer, Laia; Guàrdia-Olmos, Joan; Peró-Cebollero, Maribel

    2015-01-01

    In the last 15 years, many articles have studied brain connectivity in Mild Cognitive Impairment (MCI) patients with fMRI techniques, seemingly using a different connectivity statistical model in each investigation to identify complex connectivity structures and recognize typical behavior in this type of patient. This diversity of statistical approaches can make results difficult to compare. This paper describes how researchers approached the study of brain connectivity in MCI patients using fMRI techniques from 2002 to 2014. The focus is on the statistical analysis proposed by each research group, with reference to the limitations and possibilities of those techniques, in order to identify recommendations for improving the study of functional connectivity. The included articles came from a search of Web of Science and PsycINFO using the following keywords: fMRI, MCI, and functional connectivity. Eighty-one papers were found, but two were discarded for lack of statistical analysis; accordingly, 79 articles were included in this review. We summarized parts of each article, including the goal of the investigation, the cognitive paradigm and methods used, the brain regions involved, the use of ROI analysis, and the statistical analysis, with emphasis on the connectivity estimation model used in each investigation. The present analysis confirmed the remarkable variability of the statistical analysis methods found. Additionally, the study of brain connectivity in this population is not, at the moment, providing significant information or results related to clinical aspects relevant for prediction and treatment. We propose following guidelines for publishing fMRI data as a solution to the problem of study replication. The latter aspect could be important for future publications, because greater homogeneity would benefit the comparison between publications and the generalization of results. PMID:26300802

  10. Content validation: clarity/relevance, reliability and internal consistency of enunciative signs of language acquisition.

    PubMed

    Crestani, Anelise Henrich; Moraes, Anaelena Bragança de; Souza, Ana Paula Ramos de

    2017-08-10

    To analyze the results of the validation of enunciative signs of language acquisition built for children aged 3 to 12 months. The signs were built based on mechanisms of language acquisition in an enunciative perspective and on clinical experience with language disorders. The signs were submitted to judgments of clarity and relevance by a sample of six experts, doctors in linguistics with knowledge of psycholinguistics and the language clinic. In the validation of reliability, two judges/evaluators helped to apply the instruments to videos of 20% of the total sample of mother-infant dyads using the inter-evaluator method. The internal consistency method was applied to the total sample, which consisted of 94 mother-infant dyads for the contents of Phase 1 (3 to 6 months) and 61 mother-infant dyads for the contents of Phase 2 (7 to 12 months). The data were collected through the analysis of mother-infant interaction based on filming of the dyads and application of the parameters to be validated according to the child's age. Data were organized in a spreadsheet and then converted to computer applications for statistical analysis. The judgments of clarity/relevance indicated no modifications to be made in the instruments. The reliability test showed almost perfect agreement between judges (0.8 ≤ κ ≤ 1.0); only item 2 of Phase 1 showed substantial agreement (0.6 ≤ κ ≤ 0.79). The internal consistency for Phase 1 had alpha = 0.84, and for Phase 2, alpha = 0.74. This demonstrates the reliability of the instruments. The results suggest adequate content validity of the instruments created for both age groups, demonstrating the relevance of the content of enunciative signs of language acquisition.
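
    The agreement bands cited above (0.6–0.79 substantial, 0.8–1.0 almost perfect) refer to Cohen's kappa, which corrects raw inter-rater agreement for agreement expected by chance. A minimal stdlib-Python sketch with hypothetical ratings, not the study's data:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: sum over categories of p_a(c) * p_b(c)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical ratings (1 = sign present, 0 = absent), not the study's data
a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
b = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
print(round(cohens_kappa(a, b), 2))  # → 0.78
```

    Here the raters agree on 9 of 10 items, but because both mostly rate "present", chance agreement is high and kappa lands at about 0.78 — "substantial" rather than "almost perfect".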

  11. Clinical Research Informatics: Supporting the Research Study Lifecycle.

    PubMed

    Johnson, S B

    2017-08-01

    Objectives: The primary goal of this review is to summarize significant developments in the field of Clinical Research Informatics (CRI) over the years 2015-2016. The secondary goal is to contribute to a deeper understanding of CRI as a field, through the development of a strategy for searching and classifying CRI publications. Methods: A search strategy was developed to query the PubMed database, using medical subject headings to both select and exclude articles, and filtering publications by date and other characteristics. A manual review classified publications using stages in the "research study lifecycle", with key stages that include study definition, participant enrollment, data management, data analysis, and results dissemination. Results: The search strategy generated 510 publications. The manual classification identified 125 publications as relevant to CRI, which were classified into seven different stages of the research lifecycle, and one additional class that pertained to multiple stages, referring to general infrastructure or standards. Important cross-cutting themes included new applications of electronic media (Internet, social media, mobile devices), standardization of data and procedures, and increased automation through the use of data mining and big data methods. Conclusions: The review revealed increased interest and support for CRI in large-scale projects across institutions, regionally, nationally, and internationally. A search strategy based on medical subject headings can find many relevant papers, but a large number of non-relevant papers need to be detected using text words which pertain to closely related fields such as computational statistics and clinical informatics. The research lifecycle was useful as a classification scheme by highlighting the relevance to the users of clinical research informatics solutions. Georg Thieme Verlag KG Stuttgart.

  12. Factors Associated with HIV Testing Among Participants from Substance Use Disorder Treatment Programs in the US: A Machine Learning Approach.

    PubMed

    Pan, Yue; Liu, Hongmei; Metsch, Lisa R; Feaster, Daniel J

    2017-02-01

    HIV testing is the foundation for consolidated HIV treatment and prevention. In this study, we aim to discover the most relevant variables for predicting HIV testing uptake among substance users in substance use disorder treatment programs by applying random forest (RF), a robust multivariate statistical learning method. We also provide a descriptive introduction to this method for those who are unfamiliar with it. We used data from the National Institute on Drug Abuse Clinical Trials Network HIV testing and counseling study (CTN-0032). A total of 1281 HIV-negative or status-unknown participants from 12 US community-based substance use disorder treatment programs were included and were randomized into three HIV testing and counseling treatment groups. The a priori primary outcome was self-reported receipt of HIV test results. Classification accuracy of RF was compared to logistic regression, a standard statistical approach for binary outcomes. Variable importance measures for the RF model were used to select the most relevant variables. RF-based models produced much higher classification accuracy than those based on logistic regression. Treatment group is the most important predictor among all covariates, with a variable importance index of 12.9%. RF variable importance revealed that several types of condomless sex behaviors, condom use self-efficacy and attitudes towards condom use, and level of depression are the most important predictors of receipt of HIV testing results. There is a non-linear negative relationship between the count of condomless sex acts and the receipt of HIV testing. In conclusion, RF seems promising for discovering important factors related to HIV testing uptake among large numbers of predictors and should be encouraged in future HIV prevention and treatment research and intervention program evaluations.

  13. Hidden treasures in "ancient" microarrays: gene-expression portrays biology and potential resistance pathways of major lung cancer subtypes and normal tissue.

    PubMed

    Kerkentzes, Konstantinos; Lagani, Vincenzo; Tsamardinos, Ioannis; Vyberg, Mogens; Røe, Oluf Dimitri

    2014-01-01

    Novel statistical methods and increasingly more accurate gene annotations can transform "old" biological data into a renewed source of knowledge with potential clinical relevance. Here, we provide an in silico proof-of-concept by extracting novel information from a high-quality mRNA expression dataset, originally published in 2001, using state-of-the-art bioinformatics approaches. The dataset consists of histologically defined cases of lung adenocarcinoma (AD), squamous (SQ) cell carcinoma, small-cell lung cancer, carcinoid, metastasis (breast and colon AD), and normal lung specimens (203 samples in total). A battery of statistical tests was used for identifying differential gene expressions, diagnostic and prognostic genes, enriched gene ontologies, and signaling pathways. Our results showed that gene expressions faithfully recapitulate immunohistochemical subtype markers, such as chromogranin A in carcinoids; cytokeratin 5 and p63 in SQ; and TTF1 in non-squamous types. Moreover, biological information with putative clinical relevance was revealed, such as potentially novel diagnostic genes for each subtype with specificity of 93-100% (AUC = 0.93-1.00). Cancer subtypes were characterized by (a) differential expression of treatment target genes such as TYMS, HER2, and HER3 and (b) overrepresentation of treatment-related pathways like cell cycle, DNA repair, and ERBB pathways. The vascular smooth muscle contraction, leukocyte trans-endothelial migration, and actin cytoskeleton pathways were overexpressed in normal tissue. Reanalysis of this public dataset displayed the known biological features of lung cancer subtypes and revealed novel pathways of potentially clinical importance. The findings also support our hypothesis that even old omics data of high quality can be a source of significant biological information when appropriate bioinformatics methods are used.
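
    The AUC values reported above can be read as the probability that a randomly chosen sample of the subtype scores higher on the candidate marker than a randomly chosen sample outside it. A minimal sketch of that rank (Mann-Whitney) formulation, with hypothetical expression values:

```python
def auc(scores_pos, scores_neg):
    """AUC via the rank (Mann-Whitney) formulation: the probability that a
    randomly chosen positive case scores higher than a negative one."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1
            elif p == n:
                wins += 0.5  # ties count half
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical expression values of a candidate marker gene
subtype = [8.2, 7.9, 9.1, 8.5, 8.8]   # samples of the subtype
others  = [6.1, 7.0, 6.5, 7.9, 6.8]   # all other samples
print(auc(subtype, others))  # → 0.98
```

    An AUC of 1.00 means the marker separates the subtype perfectly; 0.93-1.00, as in the abstract, means near-perfect separation.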

  14. Climate Change Education in Earth System Science

    NASA Astrophysics Data System (ADS)

    Hänsel, Stephanie; Matschullat, Jörg

    2013-04-01

    The course "Atmospheric Research - Climate Change" is offered to master Earth System Science students within the specialisation "Climate and Environment" at the Technical University Bergakademie Freiberg. This module takes a comprehensive approach to climate sciences, reaching from the natural sciences background of climate change via the social components of the issue to the statistical analysis of changes in climate parameters. The course aims at qualifying the students to structure the physical and chemical basics of the climate system including relevant feedbacks. The students can evaluate relevant drivers of climate variability and change on various temporal and spatial scales and can transform knowledge from climate history to the present and the future. Special focus is given to the assessment of uncertainties related to climate observations and projections as well as the specific challenges of extreme weather and climate events. At the end of the course the students are able to critically reflect and evaluate climate change related results of scientific studies and related issues in media. The course is divided into two parts - "Climate Change" and "Climate Data Analysis" and encompasses two lectures, one seminar and one exercise. The weekly "Climate change" lecture transmits the physical and chemical background for climate variation and change. (Pre)historical, observed and projected climate changes and their effects on various sectors are being introduced and discussed regarding their implications for society, economics, ecology and politics. The related seminar presents and discusses the multiple reasons for controversy in climate change issues, based on various texts. Students train the presentation of scientific content and the discussion of climate change aspects. The biweekly lecture on "Climate data analysis" introduces the most relevant statistical tools and methods in climate science. 
Starting with data-quality checks using tools of exploratory data analysis, the course explains approaches to climate time series, trend analysis, and extreme-event analysis. Significance tests and tools for describing relations within data sets complement these methods. In the weekly exercises, which have to be prepared at home, the students work with self-selected climate data sets and apply the methods they have learned. The presentation and discussion of intermediate results by the students is as much a part of the exercises as the illustration of possible methodological procedures by the teacher using exemplary data sets. The total time expenditure of the course is 270 hours, with 90 attendance hours. The remainder consists of individual studies, e.g., preparation of discussions and presentations, statistical data analysis, and scientific writing. Different forms of examination are applied, including written or oral examination, scientific report, presentation, and portfolio work.

  15. Learning predictive statistics from temporal sequences: Dynamics and strategies.

    PubMed

    Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E; Kourtzi, Zoe

    2017-10-01

    Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics-that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments.
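
    The two sequence families described above — simple frequency statistics versus context-based statistics — can be sketched as zeroth-order and first-order Markov generators. A minimal stdlib-Python illustration with hypothetical symbols and probabilities (not the study's actual stimulus parameters):

```python
import random

rng = random.Random(42)

def frequency_sequence(probs, length):
    """Zeroth-order: symbols drawn independently from fixed frequencies."""
    symbols, weights = zip(*probs.items())
    return rng.choices(symbols, weights=weights, k=length)

def context_sequence(transitions, start, length):
    """First-order Markov: each symbol's probability depends on the previous."""
    seq = [start]
    for _ in range(length - 1):
        symbols, weights = zip(*transitions[seq[-1]].items())
        seq.append(rng.choices(symbols, weights=weights)[0])
    return seq

# Hypothetical symbol set and probabilities
freq = {"A": 0.6, "B": 0.2, "C": 0.2}
ctx = {"A": {"B": 0.9, "C": 0.1},
       "B": {"C": 0.9, "A": 0.1},
       "C": {"A": 0.9, "B": 0.1}}

print("".join(frequency_sequence(freq, 12)))
print("".join(context_sequence(ctx, "A", 12)))
```

    In the first family, predicting the most frequent symbol ("A") is optimal regardless of context; in the second, the best prediction is contingent on the preceding symbol — the distinction behind the maximizing-versus-matching strategies discussed above.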

  16. How Historical Information Can Improve Extreme Value Analysis of Coastal Water Levels

    NASA Astrophysics Data System (ADS)

    Le Cozannet, G.; Bulteau, T.; Idier, D.; Lambert, J.; Garcin, M.

    2016-12-01

    The knowledge of extreme coastal water levels is useful for coastal flooding studies or the design of coastal defences. While deriving such extremes with standard analyses using tide gauge measurements, one often needs to deal with limited effective duration of observation which can result in large statistical uncertainties. This is even truer when one faces outliers, those particularly extreme values distant from the others. In a recent work (Bulteau et al., 2015), we investigated how historical information of past events reported in archives can reduce statistical uncertainties and relativize such outlying observations. We adapted a Bayesian Markov Chain Monte Carlo method, initially developed in the hydrology field (Reis and Stedinger, 2005), to the specific case of coastal water levels. We applied this method to the site of La Rochelle (France), where the storm Xynthia in 2010 generated a water level considered so far as an outlier. Based on 30 years of tide gauge measurements and 8 historical events since 1890, the results showed a significant decrease in statistical uncertainties on return levels when historical information is used. Also, Xynthia's water level no longer appeared as an outlier and we could have reasonably predicted the annual exceedance probability of that level beforehand (predictive probability for 2010 based on data until the end of 2009 of the same order of magnitude as the standard estimative probability using data until the end of 2010). Such results illustrate the usefulness of historical information in extreme value analyses of coastal water levels, as well as the relevance of the proposed method to integrate heterogeneous data in such analyses.
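
    The benefit of historical information can be illustrated with a toy Bayesian Metropolis sampler: adding historical exceedance counts to a short systematic record narrows the posterior for the annual exceedance probability. This is a simplified stdlib-Python sketch with a binomial likelihood and invented counts — not the extreme-value machinery of Bulteau et al. (2015) or Reis and Stedinger (2005):

```python
import math
import random
import statistics

def log_post(p, data):
    """Log-posterior of the annual exceedance probability p (uniform prior).
    data is a list of (exceedances, years) blocks: systematic + historical."""
    if not 0 < p < 1:
        return -math.inf
    return sum(k * math.log(p) + (n - k) * math.log(1 - p) for k, n in data)

def metropolis(data, n_iter=20000, step=0.05, seed=0):
    """Random-walk Metropolis sampler; returns the second half of the chain."""
    rng = random.Random(seed)
    p = 0.1
    lp = log_post(p, data)
    samples = []
    for _ in range(n_iter):
        cand = p + rng.gauss(0, step)
        lc = log_post(cand, data)
        if math.log(rng.random()) < lc - lp:  # accept/reject
            p, lp = cand, lc
        samples.append(p)
    return samples[n_iter // 2:]  # discard burn-in

# Invented counts for illustration only
gauge_only   = [(1, 30)]             # 1 exceedance in 30 years of gauge data
with_history = [(1, 30), (8, 120)]   # plus 8 reported events over 120 years

s1 = metropolis(gauge_only)
s2 = metropolis(with_history)
# Historical information should narrow the posterior spread
print(round(statistics.pstdev(s1), 3), round(statistics.pstdev(s2), 3))
```

    The same mechanism drives the reported reduction in return-level uncertainty: the historical block contributes likelihood terms that concentrate the posterior, even though those years have no gauge record.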

  17. Three-dimensional virtual planning in orthognathic surgery enhances the accuracy of soft tissue prediction.

    PubMed

    Van Hemelen, Geert; Van Genechten, Maarten; Renier, Lieven; Desmedt, Maria; Verbruggen, Elric; Nadjmi, Nasser

    2015-07-01

    Throughout the history of computing, closing the gap between the physical world and the digital world behind the screen has been a constant pursuit. Recent advances in three-dimensional (3D) virtual surgery programs have reduced this gap significantly. Although 3D-assisted surgery is now widely available for orthognathic surgery, one might still ask whether a 3D virtual planning approach is a better alternative to a conventional two-dimensional (2D) planning technique. The purpose of this study was to compare the accuracy of a traditional 2D technique and a 3D computer-aided prediction method. A double-blind randomised prospective study was performed to compare the prediction accuracy of a traditional 2D planning technique versus a 3D computer-aided planning approach. The accuracy of the hard and soft tissue profile predictions using both planning methods was investigated. There was a statistically significant difference between 2D and 3D soft tissue planning (p < 0.05). The statistically significant difference found between each planning method and the actual soft tissue outcome was not confirmed by a statistically significant difference between the methods. The 3D planning approach provides more accurate soft tissue planning; however, 2D orthognathic planning is comparable to 3D planning when it comes to hard tissue planning. This study provides relevant results for choosing between 3D and 2D planning in clinical practice. Copyright © 2015 European Association for Cranio-Maxillo-Facial Surgery. Published by Elsevier Ltd. All rights reserved.

  18. Resemblance profiles as clustering decision criteria: Estimating statistical power, error, and correspondence for a hypothesis test for multivariate structure.

    PubMed

    Kilborn, Joshua P; Jones, David L; Peebles, Ernst B; Naar, David F

    2017-04-01

    Clustering data continues to be a highly active area of data analysis, and resemblance profiles are being incorporated into ecological methodologies as a hypothesis testing-based approach to clustering multivariate data. However, these new clustering techniques have not been rigorously tested to determine the performance variability based on the algorithm's assumptions or any underlying data structures. Here, we use simulation studies to estimate the statistical error rates for the hypothesis test for multivariate structure based on dissimilarity profiles (DISPROF). We concurrently tested a widely used algorithm that employs the unweighted pair group method with arithmetic mean (UPGMA) to estimate the proficiency of clustering with DISPROF as a decision criterion. We simulated unstructured multivariate data from different probability distributions with increasing numbers of objects and descriptors, and grouped data with increasing overlap, overdispersion for ecological data, and correlation among descriptors within groups. Using simulated data, we measured the resolution and correspondence of clustering solutions achieved by DISPROF with UPGMA against the reference grouping partitions used to simulate the structured test datasets. Our results highlight the dynamic interactions between dataset dimensionality, group overlap, and the properties of the descriptors within a group (i.e., overdispersion or correlation structure) that are relevant to resemblance profiles as a clustering criterion for multivariate data. These methods are particularly useful for multivariate ecological datasets that benefit from distance-based statistical analyses. We propose guidelines for using DISPROF as a clustering decision tool that will help future users avoid potential pitfalls during the application of methods and the interpretation of results.
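
    UPGMA, the linkage algorithm tested above, repeatedly merges the pair of clusters with the smallest average pairwise distance. A minimal stdlib-Python sketch on a hypothetical 4-sample distance matrix (DISPROF itself, the stopping criterion studied in the abstract, is not implemented here):

```python
def upgma(dist, labels):
    """UPGMA hierarchical clustering on a symmetric distance matrix.
    Cluster-to-cluster distance is the arithmetic mean of all pairwise
    member distances; returns the dendrogram as nested tuples."""
    clusters = [(lab, [i]) for i, lab in enumerate(labels)]
    while len(clusters) > 1:
        best = None  # (avg_distance, i, j) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i][1], clusters[j][1]
                d = sum(dist[x][y] for x in a for y in b) / (len(a) * len(b))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Toy distances (hypothetical) among four samples
labels = ["s1", "s2", "s3", "s4"]
dist = [[0, 2, 8, 8],
        [2, 0, 8, 8],
        [8, 8, 0, 4],
        [8, 8, 4, 0]]
print(upgma(dist, labels))  # → (('s1', 's2'), ('s3', 's4'))
```

    In the DISPROF workflow described above, a hypothesis test would decide at each node of such a dendrogram whether further splitting reflects real multivariate structure or noise.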

  19. Hematological change parameters in patients with pressure ulcer at long-term care hospital

    PubMed Central

    Neiva, Giselle Protta; Carnevalli, Julia Romualdo; Cataldi, Rodrigo Lessa; Furtado, Denise Mendes; Fabri, Rodrigo Luiz; Silva, Pâmela Souza

    2014-01-01

    Objective: To assess factors associated with the development of pressure ulcers, and to compare the effectiveness of pharmacological treatments. Methods: The factors associated with the development of pressure ulcers were compared between lesion-carrying patients (n=14) and non-carriers (n=16). Lesion-carrying patients were treated with 1% silver sulfadiazine or 0.6 IU/g collagenase and were observed for 8 weeks. The collected data were analyzed, with p<0.05 considered statistically significant. Results: The prevalence of pressure ulcers was about 6%. The comparison of the carrier and non-carrier groups revealed no statistically significant difference in occurrence with respect to age, sex, skin color, mobility, or the use of diapers. However, levels of hemoglobin, hematocrit, and red blood cells were statistically different between groups, being lower in lesion-carrying patients. No significant difference in lesion area was found between patients treated with collagenase or silver sulfadiazine, although both groups showed an overall reduction in lesion area over the treatment course. Conclusion: Hematologic parameters showed a statistically significant difference between the two groups. Regarding the treatment of ulcers, there was no difference in lesion area between the groups treated with collagenase and silver sulfadiazine. PMID:25295450
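
    Group comparisons at p<0.05, as in the hematological analysis above, can be illustrated with a simple two-sample permutation test: shuffle the group labels and count how often the shuffled mean difference is at least as large as the observed one. A stdlib-Python sketch with hypothetical hemoglobin values (the study's actual test statistic is not specified in the abstract):

```python
import random
from statistics import mean

def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the two groups
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        count += diff >= observed
    return count / n_perm

# Hypothetical hemoglobin values (g/dL), not the study's data
carriers = [10.1, 9.8, 11.0, 10.4, 9.5, 10.9, 10.0]
controls = [12.3, 11.8, 13.0, 12.1, 11.5, 12.8, 12.0]
print(permutation_test(carriers, controls) < 0.05)  # → True
```

    Because the two toy groups barely overlap, almost no random relabelling reproduces the observed difference, so the estimated p-value falls well below 0.05.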

  20. Library Statistical Data Base Formats and Definitions.

    ERIC Educational Resources Information Center

    Jones, Dennis; And Others

    Represented are the detailed set of data structures relevant to the categorization of information, terminology, and definitions employed in the design of the library statistical data base. The data base, or management information system, provides administrators with a framework of information and standardized data for library management, planning,…

  1. Understanding Broadscale Wildfire Risks in a Human-Dominated Landscape

    Treesearch

    Jeffrey P. Prestemon; John M. Pye; David T. Butry; Thomas P. Holmes; D. Evan Mercer

    2002-01-01

    Broadscale statistical evaluations of wildfire incidence can answer policy relevant questions about the effectiveness of microlevel vegetation management and can identify subjects needing further study. A dynamic time series cross-sectional model was used to evaluate the statistical links between forest wildfire and vegetation management, human land use, and climatic...

  2. A Bifactor Approach to Model Multifaceted Constructs in Statistical Mediation Analysis

    ERIC Educational Resources Information Center

    Gonzalez, Oscar; MacKinnon, David P.

    2018-01-01

    Statistical mediation analysis allows researchers to identify the most important mediating constructs in the causal process studied. Identifying specific mediators is especially relevant when the hypothesized mediating construct consists of multiple related facets. The general definition of the construct and its facets might relate differently to…

  3. A Statistical Decision Model for Periodical Selection for a Specialized Information Center

    ERIC Educational Resources Information Center

    Dym, Eleanor D.; Shirey, Donald L.

    1973-01-01

    An experiment is described which attempts to define a quantitative methodology for the identification and evaluation of all possibly relevant periodical titles containing toxicological-biological information. A statistical decision model was designed and employed, along with yes/no criteria questions, a training technique and a quality control…

  4. Introductory Statistics and Fish Management.

    ERIC Educational Resources Information Center

    Jardine, Dick

    2002-01-01

    Describes how fisheries research and management data (available on a website) have been incorporated into an Introductory Statistics course. In addition to the motivation gained from seeing the practical relevance of the course, some students have participated in the data collection and analysis for the New Hampshire Fish and Game Department. (MM)

  5. The Importance of Introductory Statistics Students Understanding Appropriate Sampling Techniques

    ERIC Educational Resources Information Center

    Menil, Violeta C.

    2005-01-01

    In this paper the author discusses the meaning of sampling, the reasons for sampling, the Central Limit Theorem, and the different techniques of sampling. Practical and relevant examples are given to make the appropriate sampling techniques understandable to students of Introductory Statistics courses. With a thorough knowledge of sampling…

  6. Teaching Primary School Mathematics and Statistics: Evidence-Based Practice

    ERIC Educational Resources Information Center

    Averill, Robin; Harvey, Roger

    2010-01-01

    Here is the only reference book you will ever need for teaching primary school mathematics and statistics. It is full of exciting and engaging snapshots of excellent classroom practice relevant to "The New Zealand Curriculum" and national mathematics standards. There are many fascinating examples of investigative learning experiences,…

  7. Forecast in foreign exchange markets

    NASA Astrophysics Data System (ADS)

    Baviera, R.; Pasquini, M.; Serva, M.; Vergni, D.; Vulpiani, A.

    2001-04-01

    We perform a statistical study of weak efficiency in Deutschemark/US dollar exchange rates using high frequency data. The presence of correlations in the returns sequence implies the possibility of a statistical forecast of market behavior. We show the existence of correlations and how information theory can be relevant in this context.
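
    The correlations in return sequences referred to above are typically quantified by the sample autocorrelation at a given lag. A minimal stdlib-Python sketch on a toy return series (invented values, not the Deutschemark/US dollar data):

```python
from statistics import mean

def autocorrelation(x, lag):
    """Sample autocorrelation of a series at the given lag."""
    m = mean(x)
    var = sum((v - m) ** 2 for v in x)
    cov = sum((x[i] - m) * (x[i + lag] - m) for i in range(len(x) - lag))
    return cov / var

# Toy return series (hypothetical): alternating signs
returns = [0.01, -0.012, 0.009, -0.011, 0.01, -0.008, 0.012, -0.01]
print(autocorrelation(returns, 1) < 0)  # → True
```

    Under weak efficiency, such autocorrelations should be indistinguishable from zero; any statistically significant nonzero value, as the abstract reports, opens the door to a statistical forecast.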

  8. Structure of Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition Criteria for Obsessive–Compulsive Personality Disorder in Patients With Binge Eating Disorder

    PubMed Central

    Ansell, Emily B; Pinto, Anthony; Edelen, Maria Orlando; Grilo, Carlos M

    2013-01-01

    Objective To examine 1-, 2-, and 3-factor model structures through confirmatory analytic procedures for Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) obsessive–compulsive personality disorder (OCPD) criteria in patients with binge eating disorder (BED). Method Participants were consecutive outpatients (n = 263) with binge eating disorder and were assessed with semi-structured interviews. The 8 OCPD criteria were submitted to confirmatory factor analyses in Mplus Version 4.2 (Los Angeles, CA) in which previously identified factor models of OCPD were compared for fit, theoretical relevance, and parsimony. Nested models were compared for significant improvements in model fit. Results Evaluation of indices of fit in combination with theoretical considerations suggest a multifactorial model is a significant improvement in fit over the current DSM-IV single-factor model of OCPD. Though the data support both 2- and 3-factor models, the 3-factor model is hindered by an underspecified third factor. Conclusion A multifactorial model of OCPD incorporating the factors perfectionism and rigidity represents the best compromise of fit and theory in modelling the structure of OCPD in patients with BED. A third factor representing miserliness may be relevant in BED populations but needs further development. The perfectionism and rigidity factors may represent distinct intrapersonal and interpersonal attempts at control and may have implications for the assessment of OCPD. PMID:19087485

  9. In vivo clinical and radiological effects of platelet-rich plasma on interstitial supraspinatus lesion: Case series.

    PubMed

    Lädermann, A; Zumstein, M A; Kolo, F C; Grosclaude, M; Koglin, L; Schwitzguebel, A J P

    2016-12-01

    Rotator cuff tear (RCT) is a frequent condition of clinical relevance that can be managed with symptomatic conservative treatment, but surgery is often needed. Biological components such as leukocyte- and platelet-rich plasma (L-PRP) could represent an alternative curative method for interstitial RCT. It has been hypothesized that an ultrasound-guided L-PRP injection into a supraspinatus interstitial RCT could induce radiological healing. A prospective case series including 25 patients was performed to assess the effect of L-PRP infiltration into supraspinatus interstitial RCTs. The primary outcome was the change in tear size determined by magnetic resonance arthrography (MRA) before and 6 months after L-PRP infiltration. Secondary outcomes were the Constant score, SANE score, and pain visual analog scale (VAS) after L-PRP infiltration. Tear volume diminution was statistically significant (P=.007), and a >50% tear volume diminution was observed in 15 patients. A statistically significant improvement of the Constant score (P<.001), SANE score (P=.001), and VAS (P<.001) was observed. In 21 patients, Constant score improvement reached the minimal clinically important difference of 10.4 points. We observed a statistically significant and clinically relevant effect on RCT size and clinical parameters after L-PRP infiltration. Such an important improvement of supraspinatus interstitial RCT with conservative management is uncommon; therefore, the intratendinous L-PRP infiltrations may have been beneficial. This encouraging result could pave the way for future randomized studies to formally determine whether L-PRP infiltrations are a possible alternative to surgical treatment of interstitial RCT. Prospective observational study; Level of evidence II. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  10. Additive scales in degenerative disease--calculation of effect sizes and clinical judgment.

    PubMed

    Riepe, Matthias W; Wilkinson, David; Förstl, Hans; Brieden, Andreas

    2011-12-16

    The therapeutic efficacy of an intervention is often assessed in clinical trials by scales measuring multiple diverse activities that are added to produce a cumulative global score. Medical communities and health care systems subsequently use these data to calculate pooled effect sizes to compare treatments. This is done because major doubt has been cast on the clinical relevance of statistically significant findings that rely on p values, which can report chance findings. Hence, in an aim to overcome this, pooling the results of clinical studies into a meta-analysis with a statistical calculus has been assumed to be a more definitive way of deciding on efficacy. We simulate therapeutic effects as measured with additive scales in patient cohorts with different disease severity and assess the limitations of effect size calculations for additive scales; these limitations are proven mathematically. We demonstrate that the major problem, which cannot be overcome by current numerical methods, is the complex nature and neurobiological foundation of clinical psychiatric endpoints in particular and additive scales in general. This is particularly relevant for endpoints used in dementia research. 'Cognition' is composed of functions such as memory, attention, orientation and many more. These individual functions decline in varied and non-linear ways. Here we demonstrate that with progressive diseases, cumulative values from multidimensional scales are subject to distortion by the limitations of the additive scale. The non-linearity of the decline of function impedes the calculation of effect sizes based on cumulative values from these multidimensional scales. Statistical analysis needs to be guided by the boundaries of the biological condition. Alternatively, we suggest a different approach that avoids the error imposed by over-analysis of cumulative global scores from additive scales.
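    The effect size calculation this record critiques is, in its simplest form, Cohen's d: a standardized mean difference using a pooled standard deviation. As a minimal sketch (the scale scores below are invented for illustration, not from the study):

```python
import math

def cohens_d(group_a, group_b):
    """Cohen's d: standardized mean difference using the pooled SD."""
    na, nb = len(group_a), len(group_b)
    ma = sum(group_a) / na
    mb = sum(group_b) / nb
    # Unbiased sample variances
    va = sum((x - ma) ** 2 for x in group_a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in group_b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / pooled_sd

# Hypothetical cumulative global scores on an additive scale
treated = [24, 26, 23, 27, 25, 28]
control = [21, 22, 20, 23, 21, 24]
print(round(cohens_d(treated, control), 2))  # prints 2.18
```

    The record's point is that when the components of such a cumulative score decline non-linearly, a single d computed from the summed score can be distorted regardless of how carefully it is calculated.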

  11. Serum Endotoxins and Flagellin and Risk of Colorectal Cancer in the European Prospective Investigation into Cancer and Nutrition (EPIC) Cohort

    PubMed Central

    Kong, So Yeon; Tran, Hao Quang; Gewirtz, Andrew T.; McKeown-Eyssen, Gail; Fedirko, Veronika; Romieu, Isabelle; Tjønneland, Anne; Olsen, Anja; Overvad, Kim; Boutron-Ruault, Marie-Christine; Bastide, Nadia; Affret, Aurélie; Kühn, Tilman; Kaaks, Rudolf; Boeing, Heiner; Aleksandrova, Krasimira; Trichopoulou, Antonia; Kritikou, Maria; Vasilopoulou, Effie; Palli, Domenico; Krogh, Vittorio; Mattiello, Amalia; Tumino, Rosario; Naccarati, Alessio; Bueno-de-Mesquita, H.Bas; Peeters, Petra H.; Weiderpass, Elisabete; Quirós, J. Ramón; Sala, Núria; Sánchez, María-José; Huerta Castaño, José María; Barricarte, Aurelio; Dorronsoro, Miren; Werner, Mårten; Wareham, Nicholas J.; Khaw, Kay-Tee; Bradbury, Kathryn E.; Freisling, Heinz; Stavropoulou, Faidra; Ferrari, Pietro; Gunter, Marc J.; Cross, Amanda J.; Riboli, Elio; Bruce, W. Robert

    2017-01-01

    Background Chronic inflammation and oxidative stress are thought to be involved in colorectal cancer (CRC) development. These processes may be contributed to by leakage of bacterial products, such as lipopolysaccharide (LPS) and flagellin, across the gut barrier. The objective of this study, nested within a prospective cohort, was to examine associations between circulating LPS and flagellin serum antibody levels and CRC risk. Methods 1,065 incident CRC cases (colon n=667; rectal n=398) were matched (1:1) to control subjects. Serum flagellin- and LPS-specific IgA and IgG levels were quantitated by ELISA. Multivariable conditional logistic regression models were used to calculate odds ratios (OR) and 95% confidence intervals (CI), adjusting for multiple relevant confounding factors. Results Overall, elevated anti-LPS and anti-flagellin biomarker levels were not associated with CRC risk. After testing potential interactions with various factors relevant to CRC risk and anti-LPS and anti-flagellin, sex was identified as a statistically significant interaction factor (P for interaction < 0.05 for all biomarkers). Analyses stratified by sex showed a statistically significant positive CRC risk association for men (fully adjusted OR for highest vs. lowest quartile of total anti-LPS+flagellin = 1.66; 95% CI, 1.10-2.51; P-trend = 0.049), while a borderline statistically significant inverse association was observed for women (fully adjusted OR = 0.70; 95% CI, 0.47-1.02; P-trend = 0.18). Conclusion In this prospective study of European populations, we found bacterial exposure levels to be positively associated with CRC risk among men, while in women a possible inverse association may exist. Impact Further studies are warranted to better clarify these preliminary observations. PMID:26823475
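    The ORs and CIs reported in this record come from multivariable conditional logistic regression; an unadjusted version of the same quantities can be sketched from a simple 2x2 table using the Wald interval on the log odds ratio. The counts below are hypothetical, chosen only to illustrate the arithmetic:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table:
    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls."""
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(or_) - z * se_log)
    hi = math.exp(math.log(or_) + z * se_log)
    return or_, lo, hi

# Hypothetical counts: highest vs. lowest biomarker quartile
or_, lo, hi = odds_ratio_ci(60, 40, 45, 50)
print(f"OR={or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

    Note that a CI crossing 1.0, as here, mirrors the borderline associations discussed in the record; the matched design and confounder adjustment of the actual study are not reproduced by this crude estimate.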

  12. Cost effectiveness of brace, physiotherapy, or both for treatment of tennis elbow

    PubMed Central

    Struijs, P A A; Bos, I B C Korthals‐de; van Tulder, M W; van Dijk, C N; Bouter, L M

    2006-01-01

    Background The annual incidence of tennis elbow in the general population is high (1–3%). Tennis elbow often leads to limitation of activities of daily living and work absenteeism. Physiotherapy and braces are the most common treatments. Objectives The hypothesis of the trial was that no difference exists in the cost effectiveness of physiotherapy, braces, and a combination of the two for treatment of tennis elbow. Methods The trial was designed as a randomised controlled trial with intention to treat analysis. A total of 180 patients with tennis elbow were randomised to brace only (n = 68), physiotherapy (n = 56), or a combination of the two (n = 56). Outcome measures were success rate, severity of complaints, pain, functional disability, and quality of life. Follow-up was at six, 26, and 52 weeks. Direct healthcare and non-healthcare costs and indirect costs were measured. Mean cost differences over 12 months were evaluated by applying non-parametric bootstrap techniques. Results No clinically relevant or statistically significant differences were found between the groups. Success rate at 12 months was 89% in the physiotherapy group, 86% in the brace group, and 87% in the combination group. Mean total costs per patient were €2069 in the brace only group, €978 in the physiotherapy group, and €1256 in the combination group. The mean difference in total costs between the physiotherapy and brace groups was substantial (€1005), although not significant. Cost effectiveness ratios and cost utility ratios showed physiotherapy to be the most cost effective, although this also was not statistically significant. Conclusion No clinically relevant or statistically significant differences in costs were identified between the three strategies. PMID:16687482
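    The non-parametric bootstrap used for the mean cost differences in this trial can be sketched as a percentile bootstrap: resample each group with replacement, recompute the mean difference, and take the empirical 2.5th and 97.5th percentiles. The cost figures below are invented for illustration; the trial's actual data are not reproduced here:

```python
import random

def bootstrap_mean_diff_ci(costs_a, costs_b, n_boot=5000, alpha=0.05, seed=42):
    """Percentile-bootstrap confidence interval for the difference
    in mean costs between two independent groups."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resample_a = [rng.choice(costs_a) for _ in costs_a]
        resample_b = [rng.choice(costs_b) for _ in costs_b]
        diffs.append(sum(resample_a) / len(resample_a)
                     - sum(resample_b) / len(resample_b))
    diffs.sort()
    lo = diffs[int(n_boot * alpha / 2)]
    hi = diffs[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

# Hypothetical per-patient total costs (euros) for two treatment arms
costs_brace = [1800, 2400, 950, 3100, 2200, 1500, 2800, 1900]
costs_physio = [700, 1200, 850, 1100, 600, 1400, 900, 1050]
lo, hi = bootstrap_mean_diff_ci(costs_brace, costs_physio)
```

    Bootstrapping is preferred for cost data because costs are typically right-skewed, so normal-theory intervals around the mean difference can be misleading.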

  13. Research on the development efficiency of regional high-end talent in China: A complex network approach

    PubMed Central

    Zhang, Wenbin

    2017-01-01

    In this paper, based on panel data from 31 provinces and cities in China from 1991 to 2016, the regional development efficiency matrix of high-end talent is obtained by the DEA method, and the matrix is converted into a continuously changing complex network through the construction of a sliding window. Using a series of continuous changes in the complex network topology statistics, the characteristics of the regional high-end talent development efficiency system are analyzed. The results show that the average development efficiency of high-end talent in the western region is at a low level. After 2005, the national regional high-end talent development efficiency network has exhibited both short-range and long-range relevance in its evolution. The central region plays an important intermediary role in the national regional high-end talent development system, and the western region has high clustering characteristics. With the implementation of high-end talent policies with regional characteristics by different provinces and cities, the relevance of high-end talent development efficiency across provinces and cities presents a weakening trend, and the geographical characteristics of high-end talent are increasingly pronounced. PMID:29272286

  14. The Gaussian CLs method for searches of new physics

    DOE PAGES

    Qian, X.; Tan, A.; Ling, J. J.; ...

    2016-04-23

    Here we describe a method based on the CLs approach to present results in searches of new physics, under the condition that the relevant parameter space is continuous. Our method relies on a class of test statistics developed for non-nested hypothesis testing problems, denoted by ΔT, which has a Gaussian approximation to its parent distribution when the sample size is large. This leads to a simple procedure of forming exclusion sets for the parameters of interest, which we call the Gaussian CLs method. Our work provides a self-contained mathematical proof for the Gaussian CLs method that explicitly outlines the required conditions. These conditions are milder than those required by Wilks' theorem to set confidence intervals (CIs). We illustrate the Gaussian CLs method in an example of searching for a sterile neutrino, where the CLs approach was rarely used before. We also compare data analysis results produced by the Gaussian CLs method and various CI methods to showcase their differences.
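    The generic CLs ratio underlying this method can be sketched for a Gaussian-distributed test statistic: the p-value of the tested (signal) hypothesis divided by that of the null (background-only) hypothesis, with exclusion declared when the ratio is small. This is a schematic illustration of the CLs idea only, not the paper's ΔT construction; all means, widths, and the observed value below are invented:

```python
import math

def gaussian_sf(x, mu, sigma):
    """Survival function P(X >= x) for X ~ N(mu, sigma)."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2)))

def cls(t_obs, mu_alt, sigma_alt, mu_null, sigma_null):
    """Schematic CLs: p-value under the tested hypothesis divided by
    the p-value under the null, both from Gaussian approximations."""
    p_alt = gaussian_sf(t_obs, mu_alt, sigma_alt)
    p_null = gaussian_sf(t_obs, mu_null, sigma_null)
    return p_alt / p_null

# A parameter point is excluded at 95% CL when CLs < 0.05
print(cls(3.0, mu_alt=0.0, sigma_alt=1.0, mu_null=3.0, sigma_null=1.0))
```

    Dividing by the null p-value is what protects the CLs method from excluding hypotheses to which the experiment has no sensitivity, at the cost of conservative coverage.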

  15. Automatic Artifact Removal from Electroencephalogram Data Based on A Priori Artifact Information.

    PubMed

    Zhang, Chi; Tong, Li; Zeng, Ying; Jiang, Jingfang; Bu, Haibing; Yan, Bin; Li, Jianxin

    2015-01-01

    Electroencephalogram (EEG) is susceptible to various nonneural physiological artifacts. Automatic artifact removal from EEG data remains a key challenge for extracting relevant information from brain activities. To adapt to variable subjects and EEG acquisition environments, this paper presents an automatic online artifact removal method based on a priori artifact information. The combination of discrete wavelet transform and independent component analysis (ICA), wavelet-ICA, was utilized to separate artifact components. The artifact components were then automatically identified using a priori artifact information, which was acquired in advance. Subsequently, signal reconstruction without artifact components was performed to obtain artifact-free signals. The results showed that, using this automatic online artifact removal method, there were statistically significant improvements in the classification accuracies in both experiments, namely, motor imagery and emotion recognition.

  16. Automatic Artifact Removal from Electroencephalogram Data Based on A Priori Artifact Information

    PubMed Central

    Zhang, Chi; Tong, Li; Zeng, Ying; Jiang, Jingfang; Bu, Haibing; Li, Jianxin

    2015-01-01

    Electroencephalogram (EEG) is susceptible to various nonneural physiological artifacts. Automatic artifact removal from EEG data remains a key challenge for extracting relevant information from brain activities. To adapt to variable subjects and EEG acquisition environments, this paper presents an automatic online artifact removal method based on a priori artifact information. The combination of discrete wavelet transform and independent component analysis (ICA), wavelet-ICA, was utilized to separate artifact components. The artifact components were then automatically identified using a priori artifact information, which was acquired in advance. Subsequently, signal reconstruction without artifact components was performed to obtain artifact-free signals. The results showed that, using this automatic online artifact removal method, there were statistically significant improvements in the classification accuracies in both experiments, namely, motor imagery and emotion recognition. PMID:26380294

  17. Exploratory Visual Analysis of Statistical Results from Microarray Experiments Comparing High and Low Grade Glioma

    PubMed Central

    Reif, David M.; Israel, Mark A.; Moore, Jason H.

    2007-01-01

    The biological interpretation of gene expression microarray results is a daunting challenge. For complex diseases such as cancer, wherein the body of published research is extensive, the incorporation of expert knowledge provides a useful analytical framework. We have previously developed the Exploratory Visual Analysis (EVA) software for exploring data analysis results in the context of annotation information about each gene, as well as biologically relevant groups of genes. We present EVA as a flexible combination of statistics and biological annotation that provides a straightforward visual interface for the interpretation of microarray analyses of gene expression in the most commonly occurring class of brain tumors, glioma. We demonstrate the utility of EVA for the biological interpretation of statistical results by analyzing publicly available gene expression profiles of two important glial tumors. The results of a statistical comparison between 21 malignant, high-grade glioblastoma multiforme (GBM) tumors and 19 indolent, low-grade pilocytic astrocytomas were analyzed using EVA. By using EVA to examine the results of a relatively simple statistical analysis, we were able to identify tumor class-specific gene expression patterns having both statistical and biological significance. Our interactive analysis highlighted the potential importance of genes involved in cell cycle progression, proliferation, signaling, adhesion, migration, motility, and structure, as well as candidate gene loci on a region of Chromosome 7 that has been implicated in glioma. Because EVA does not require statistical or computational expertise and has the flexibility to accommodate any type of statistical analysis, we anticipate EVA will prove a useful addition to the repertoire of computational methods used for microarray data analysis. EVA is available at no charge to academic users and can be found at http://www.epistasis.org. PMID:19390666

  18. Reliability studies of diagnostic methods in Indian traditional Ayurveda medicine: An overview

    PubMed Central

    Kurande, Vrinda Hitendra; Waagepetersen, Rasmus; Toft, Egon; Prasad, Ramjee

    2013-01-01

    Recently, a need to develop supportive new scientific evidence for contemporary Ayurveda has emerged. One of the research objectives is an assessment of the reliability of diagnoses and treatment. Reliability is a quantitative measure of consistency. It is a crucial issue in classification (such as prakriti classification), method development (pulse diagnosis), quality assurance for diagnosis and treatment, and the conduct of clinical studies. Several reliability studies have been conducted in Western medicine. The investigation of the reliability of traditional Chinese, Japanese, and Sasang medicine diagnoses is in the formative stage, while reliability studies in Ayurveda are still preliminary. In this paper, examples are provided to illustrate relevant concepts of reliability studies of diagnostic methods and their implications for practice, education, and training. An introduction to reliability estimates and different study designs and statistical analyses is given for future studies in Ayurveda. PMID:23930037

  19. Geometry and Dynamics for Markov Chain Monte Carlo

    NASA Astrophysics Data System (ADS)

    Barp, Alessandro; Briol, François-Xavier; Kennedy, Anthony D.; Girolami, Mark

    2018-03-01

    Markov Chain Monte Carlo methods have revolutionised mathematical computation and enabled statistical inference within many previously intractable models. In this context, Hamiltonian dynamics have been proposed as an efficient way of building chains that explore probability densities effectively. The method emerges from physics and geometry, and these links have been extensively studied by a series of authors over the last thirty years. However, there is currently a gap between the intuitions and knowledge of users of the methodology and our deep understanding of these theoretical foundations. The aim of this review is to provide a comprehensive introduction to the geometric tools used in Hamiltonian Monte Carlo at a level accessible to statisticians, machine learners and other users of the methodology with only a basic understanding of Monte Carlo methods. This will be complemented with some discussion of the most recent advances in the field, which we believe will become increasingly relevant to applied scientists.
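    The Hamiltonian dynamics described in this record can be sketched in one dimension: augment the position with a Gaussian momentum, simulate the dynamics with a leapfrog integrator, and apply a Metropolis correction on the total energy. This is a minimal illustration (step size, trajectory length, and the standard-normal target are arbitrary choices), not the geometric machinery the review covers:

```python
import numpy as np

def hmc_sample(logp, logp_grad, n_samples=2000, step=0.2, n_leap=20, seed=0):
    """Minimal 1-D Hamiltonian Monte Carlo with leapfrog integration."""
    rng = np.random.default_rng(seed)
    q = 0.0
    samples = []
    for _ in range(n_samples):
        p = rng.normal()                       # resample momentum
        q_new, p_new = q, p
        # Leapfrog: half momentum step, alternating full steps, half step
        p_new += 0.5 * step * logp_grad(q_new)
        for _ in range(n_leap - 1):
            q_new += step * p_new
            p_new += step * logp_grad(q_new)
        q_new += step * p_new
        p_new += 0.5 * step * logp_grad(q_new)
        # Metropolis correction on the Hamiltonian H = -log p(q) + p^2/2
        h_old = -logp(q) + 0.5 * p ** 2
        h_new = -logp(q_new) + 0.5 * p_new ** 2
        if rng.random() < np.exp(h_old - h_new):
            q = q_new
        samples.append(q)
    return np.array(samples)

# Target: standard normal, log p(q) = -q^2/2 (up to a constant)
draws = hmc_sample(lambda q: -0.5 * q ** 2, lambda q: -q)
```

    Because the leapfrog integrator is volume-preserving and time-reversible, the Metropolis step needs only the energy difference, which is the geometric property the review elaborates in far greater generality.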

  20. [Clinical research IV. Relevancy of the statistical test chosen].

    PubMed

    Talavera, Juan O; Rivas-Ruiz, Rodolfo

    2011-01-01

    When we look at the difference between two therapies or the association of a risk factor or prognostic indicator with its outcome, we need to evaluate the accuracy of the result. This assessment is based on a judgment that uses information about the study design and the statistical management of the information. This paper specifically addresses the relevance of the statistical test selected. Statistical tests are chosen mainly according to two characteristics: the objective of the study and the type of variables. The objective can be divided into three test groups: a) those in which you want to show differences between groups or within a group before and after a maneuver, b) those that seek to show the relationship (correlation) between variables, and c) those that aim to predict an outcome. The types of variables are divided into two: quantitative (continuous and discontinuous) and qualitative (ordinal and dichotomous). For example, if we seek to demonstrate differences in age (quantitative variable) among patients with systemic lupus erythematosus (SLE) with and without neurological disease (two groups), the appropriate test is the "Student t test for independent samples." But if the comparison concerns the frequency of females (binomial variable), then the appropriate statistical test is the χ² test.
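    The two test choices in the record's example can be sketched numerically: a pooled two-sample t statistic for the quantitative comparison and Pearson's chi-square statistic for the binomial one. The ages and counts below are invented for illustration; p-values (from the t and χ² distributions) are omitted to keep the sketch dependency-free:

```python
import math

def t_statistic(a, b):
    """Student t for independent samples (equal-variance, pooled form)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

def chi2_statistic(table):
    """Pearson chi-square for an r x c contingency table (list of rows)."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    total = sum(row)
    return sum((table[i][j] - row[i] * col[j] / total) ** 2
               / (row[i] * col[j] / total)
               for i in range(len(row)) for j in range(len(col)))

# Quantitative outcome (age), two groups -> t test
ages_neuro = [34, 41, 29, 38, 45]
ages_no_neuro = [28, 31, 25, 30, 27]
print(round(t_statistic(ages_neuro, ages_no_neuro), 2))  # prints 3.1

# Binomial outcome (female yes/no) by group -> chi-square
table = [[18, 7], [12, 13]]
print(round(chi2_statistic(table), 2))  # prints 3.0
```

    The choice tracks the record's rule exactly: the variable type (quantitative vs. qualitative) and the objective (difference between two groups) jointly determine the test.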

  1. Using Algal Metrics and Biomass to Evaluate Multiple Ways of Defining Concentration-Based Nutrient Criteria in Streams and their Ecological Relevance

    EPA Science Inventory

    We examined the utility of nutrient criteria derived solely from total phosphorus (TP) concentrations in streams (regression models and percentile distributions) and evaluated their ecological relevance to diatom and algal biomass responses. We used a variety of statistics to cha...

  2. Effects of Hearing Impairment and Hearing Aid Amplification on Listening Effort: A Systematic Review.

    PubMed

    Ohlenforst, Barbara; Zekveld, Adriana A; Jansma, Elise P; Wang, Yang; Naylor, Graham; Lorens, Artur; Lunner, Thomas; Kramer, Sophia E

    To undertake a systematic review of available evidence on the effect of hearing impairment and hearing aid amplification on listening effort. Two research questions were addressed: Q1) does hearing impairment affect listening effort? and Q2) can hearing aid amplification affect listening effort during speech comprehension? English language articles were identified through systematic searches in PubMed, EMBASE, Cinahl, the Cochrane Library, and PsycINFO from inception to August 2014. References of eligible studies were checked. The Population, Intervention, Control, Outcomes, and Study design strategy was used to create inclusion criteria for relevance. It was not feasible to apply a meta-analysis of the results from comparable studies. For the articles identified as relevant, a quality rating, based on the 2011 Grading of Recommendations Assessment, Development, and Evaluation Working Group guidelines, was carried out to judge the reliability and confidence of the estimated effects. The primary search produced 7017 unique hits using the keywords: hearing aids OR hearing impairment AND listening effort OR perceptual effort OR ease of listening. Of these, 41 articles fulfilled the Population, Intervention, Control, Outcomes, and Study design selection criteria of: experimental work on hearing impairment OR hearing aid technologies AND listening effort OR fatigue during speech perception. The methods applied in those articles were categorized into subjective, behavioral, and physiological assessment of listening effort. For each study, the statistical analysis addressing research question Q1 and/or Q2 was extracted. In seven articles more than one measure of listening effort was provided. Evidence relating to Q1 was provided by 21 articles that reported 41 relevant findings. Evidence relating to Q2 was provided by 27 articles that reported 56 relevant findings. 
The quality of evidence on both research questions (Q1 and Q2) was very low, according to the Grading of Recommendations Assessment, Development, and Evaluation Working Group guidelines. We tested the statistical evidence across studies with nonparametric tests. The testing revealed only one consistent effect across studies, namely that listening effort was higher for hearing-impaired listeners compared with normal-hearing listeners (Q1), as measured by electroencephalographic measures. For all other studies, the evidence across studies failed to reveal consistent effects on listening effort. In summary, we could only identify scientific evidence from physiological measurement methods suggesting that hearing impairment increases listening effort during speech perception (Q1). There was no scientific finding across studies indicating that hearing aid amplification decreases listening effort (Q2). In general, there were large differences in the study populations, the control groups and conditions, and the outcome measures applied between the studies included in this review. The results of this review indicate that published listening effort studies lack consistency, lack standardization across studies, and have insufficient statistical power. The findings underline the need for a common conceptual framework for listening effort to address the current shortcomings.

  3. Effects of Hearing Impairment and Hearing Aid Amplification on Listening Effort: A Systematic Review

    PubMed Central

    Ohlenforst, Barbara; Jansma, Elise P.; Wang, Yang; Naylor, Graham; Lorens, Artur; Lunner, Thomas; Kramer, Sophia E.

    2017-01-01

    Objectives: To undertake a systematic review of available evidence on the effect of hearing impairment and hearing aid amplification on listening effort. Two research questions were addressed: Q1) does hearing impairment affect listening effort? and Q2) can hearing aid amplification affect listening effort during speech comprehension? Design: English language articles were identified through systematic searches in PubMed, EMBASE, Cinahl, the Cochrane Library, and PsycINFO from inception to August 2014. References of eligible studies were checked. The Population, Intervention, Control, Outcomes, and Study design strategy was used to create inclusion criteria for relevance. It was not feasible to apply a meta-analysis of the results from comparable studies. For the articles identified as relevant, a quality rating, based on the 2011 Grading of Recommendations Assessment, Development, and Evaluation Working Group guidelines, was carried out to judge the reliability and confidence of the estimated effects. Results: The primary search produced 7017 unique hits using the keywords: hearing aids OR hearing impairment AND listening effort OR perceptual effort OR ease of listening. Of these, 41 articles fulfilled the Population, Intervention, Control, Outcomes, and Study design selection criteria of: experimental work on hearing impairment OR hearing aid technologies AND listening effort OR fatigue during speech perception. The methods applied in those articles were categorized into subjective, behavioral, and physiological assessment of listening effort. For each study, the statistical analysis addressing research question Q1 and/or Q2 was extracted. In seven articles more than one measure of listening effort was provided. Evidence relating to Q1 was provided by 21 articles that reported 41 relevant findings. Evidence relating to Q2 was provided by 27 articles that reported 56 relevant findings. 
The quality of evidence on both research questions (Q1 and Q2) was very low, according to the Grading of Recommendations Assessment, Development, and Evaluation Working Group guidelines. We tested the statistical evidence across studies with nonparametric tests. The testing revealed only one consistent effect across studies, namely that listening effort was higher for hearing-impaired listeners compared with normal-hearing listeners (Q1), as measured by electroencephalographic measures. For all other studies, the evidence across studies failed to reveal consistent effects on listening effort. Conclusion: In summary, we could only identify scientific evidence from physiological measurement methods suggesting that hearing impairment increases listening effort during speech perception (Q1). There was no scientific finding across studies indicating that hearing aid amplification decreases listening effort (Q2). In general, there were large differences in the study populations, the control groups and conditions, and the outcome measures applied between the studies included in this review. The results of this review indicate that published listening effort studies lack consistency, lack standardization across studies, and have insufficient statistical power. The findings underline the need for a common conceptual framework for listening effort to address the current shortcomings. PMID:28234670

  4. Development of ecological indicator guilds for land management

    USGS Publications Warehouse

    Krzysik, A.J.; Balbach, H.E.; Duda, J.J.; Emlen, J.M.; Freeman, D.C.; Graham, J.H.; Kovacic, D.A.; Smith, L.M.; Zak, J.C.

    2005-01-01

    Agency land-use must be efficiently and cost-effectively monitored to assess conditions and trends in ecosystem processes and natural resources relevant to mission requirements and legal mandates. Ecological Indicators represent important land management tools for tracking ecological changes and preventing irreversible environmental damage in disturbed landscapes. The overall objective of the research was to develop both individual and integrated sets (i.e., statistically derived guilds) of Ecological Indicators to: quantify habitat conditions and trends, track and monitor ecological changes, provide early warning or threshold detection, and provide guidance for land managers. The derivation of Ecological Indicators was based on statistical criteria, ecosystem relevance, reliability and robustness, economy and ease of use for land managers, multi-scale performance, and stress response criteria. The basis for the development of statistically based Ecological Indicators was the identification of ecosystem metrics that analytically tracked a landscape disturbance gradient.

  5. Low order models for uncertainty quantification in acoustic propagation problems

    NASA Astrophysics Data System (ADS)

    Millet, Christophe

    2016-11-01

    Long-range sound propagation problems are characterized by both a large number of length scales and a large number of normal modes. In the atmosphere, these modes are confined within waveguides, causing the sound to propagate through multiple paths to the receiver. For uncertain atmospheres, the modes are described as random variables. Concise mathematical models and analysis reveal fundamental limitations in classical projection techniques, due to different manifestations of the fact that modes carrying small variance can have important effects on the large-variance modes. In the present study, we propose a systematic strategy for obtaining statistically accurate low order models. The normal modes are sorted in decreasing Sobol indices using asymptotic expansions, and the relevant modes are extracted using a modified iterative Krylov-based method. The statistics of acoustic signals are computed by decomposing the original pulse into a truncated sum of modal pulses that can be described by a stationary phase method. As the low-order acoustic model preserves the overall structure of waveforms under perturbations of the atmosphere, it can be applied to uncertainty quantification. The result of this study is a new algorithm that applies to the entire phase space of acoustic fields.

  6. Perspectives on statistics education: observations from statistical consulting in an academic nursing environment.

    PubMed

    Hayat, Matthew J; Schmiege, Sarah J; Cook, Paul F

    2014-04-01

    Statistics knowledge is essential for understanding the nursing and health care literature, as well as for applying rigorous science in nursing research. Statistical consultants providing services to faculty and students in an academic nursing program have the opportunity to identify gaps and challenges in statistics education for nursing students. This information may be useful to curriculum committees and statistics educators. This article aims to provide perspective on statistics education stemming from the experiences of three experienced statistics educators who regularly collaborate and consult with nurse investigators. The authors share their knowledge and express their views about data management, data screening and manipulation, statistical software, types of scientific investigation, and advanced statistical topics not covered in the usual coursework. The suggestions provided amount to a call for data with which to study these topics. Relevant data about statistics education can assist educators in developing comprehensive statistics coursework for nursing students. Copyright 2014, SLACK Incorporated.

  7. Book review: Bird census techniques, Second edition

    USGS Publications Warehouse

    Sauer, John R.

    2002-01-01

    Conservation concerns, federal mandates to monitor birds, and citizen science programs have spawned a variety of surveys that collect information on bird populations. Unfortunately, all too frequently these surveys are poorly designed and use inappropriate counting methods. Some of the flawed approaches reflect a lack of understanding of statistical design; many ornithologists simply are not aware that many of our most entrenched counting methods (such as point counts) cannot appropriately be used in studies that compare densities of birds over space and time. It is likely that most of the readers of The Condor have participated in a bird population survey that has been criticized for poor sampling methods. For example, North American readers may be surprised to read in Bird Census Techniques that the North American Breeding Bird Survey 'is seriously flawed in its design,' and that 'Analysis of trends is impossible from points that are positioned along roads' (p. 109). Our conservation efforts are at risk if we do not acknowledge these concerns and improve our survey designs. Other surveys suffer from a lack of focus. In Bird Census Techniques, the authors emphasize that all surveys require clear statements of objectives and an understanding of appropriate survey designs to meet their objectives. Too often, we view survey design as the realm of ornithologists who know the life histories and logistical issues relevant to counting birds. This view reflects pure hubris: survey design is a collaboration between ornithologists, statisticians, and managers, in which goals based on management needs are met by applying statistical principles for design to the biological context of the species of interest. Poor survey design is often due to exclusion of some of these partners from survey development. 
Because ornithologists are too frequently unaware of these issues, books such as Bird Census Techniques take on added importance as manuals for educating ornithologists about the relevance of survey design and methods and the often subtle interdisciplinary nature of surveys. Review info: Bird Census Techniques, Second Edition. By Colin J. Bibby, Neil D. Burgess, David A. Hill, and Simon H. Mustoe. 2000. Academic Press, London, UK. xvii + 302 pp. ISBN 0-12-095831-7.

  8. Demographic inference under the coalescent in a spatial continuum.

    PubMed

    Guindon, Stéphane; Guo, Hongbin; Welch, David

    2016-10-01

    Understanding population dynamics from the analysis of molecular and spatial data requires sound statistical modeling. Current approaches assume that populations are naturally partitioned into discrete demes, thereby failing to be relevant in cases where individuals are scattered on a spatial continuum. Other models predict the formation of increasingly tight clusters of individuals in space, which, again, conflicts with biological evidence. Building on recent theoretical work, we introduce a new genealogy-based inference framework that alleviates these issues. This approach effectively implements a stochastic model in which the distribution of individuals is homogeneous and stationary, thereby providing a relevant null model for the fluctuation of genetic diversity in time and space. Importantly, the spatial density of individuals in a population and their range of dispersal during the course of evolution are two parameters that can be inferred separately with this method. The validity of the new inference framework is confirmed with extensive simulations and the analysis of influenza sequences collected over five seasons in the USA. Copyright © 2016 Elsevier Inc. All rights reserved.

  9. Improvement of Biocatalysts for Industrial and Environmental Purposes by Saturation Mutagenesis

    PubMed Central

    Valetti, Francesca; Gilardi, Gianfranco

    2013-01-01

    Laboratory evolution techniques are becoming increasingly widespread among protein engineers for the development of novel and designed biocatalysts. The palette of different approaches ranges from completely randomized strategies to rational and structure-guided mutagenesis, with a wide variety of costs, impacts, drawbacks and degrees of relevance to biotechnology. A technique that strikes a convincing compromise between the extremes of fully randomized and rational mutagenesis, with a high benefit/cost ratio, is saturation mutagenesis. Here we will present and discuss this approach in its many facets, also tackling the issues of randomization, statistical evaluation of library completeness and throughput efficiency of screening methods. Successful recent applications covering different classes of enzymes will be presented, referring to the literature and to research lines pursued in our group. The focus is on saturation mutagenesis as a tool for designing novel biocatalysts relevant to the production of fine chemicals, to improving bulk enzymes for industry, and to engineering technical enzymes involved in the treatment of waste, detoxification and the production of clean energy from renewable sources. PMID:24970191
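The statistical evaluation of library completeness mentioned in this abstract is commonly based on a standard sampling result (a generic illustration, not necessarily the formula used in the paper): if a library contains V equally likely variants, the expected fraction observed after picking N clones with replacement is 1 - (1 - 1/V)^N. A minimal sketch:

```python
import math

def completeness(num_variants: int, num_clones: int) -> float:
    """Expected fraction of distinct variants observed after sampling
    num_clones clones uniformly at random, with replacement."""
    return 1.0 - (1.0 - 1.0 / num_variants) ** num_clones

def clones_needed(num_variants: int, target: float = 0.95) -> int:
    """Smallest N with expected completeness >= target; follows from
    solving 1 - (1 - 1/V)**N >= target for N."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - 1.0 / num_variants))

# Example: one position randomized with an NNK codon (32 possible codons).
V = 32
N = clones_needed(V, 0.95)
print(N, completeness(V, N))
```

For a single NNK-randomized position this gives roughly a hundred clones for 95% expected coverage, which illustrates why screening throughput becomes the limiting factor as more positions are saturated simultaneously.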

  10. Cognitive capitalism: the effect of cognitive ability on wealth, as mediated through scientific achievement and economic freedom.

    PubMed

    Rindermann, Heiner; Thompson, James

    2011-06-01

    Traditional economic theories stress the relevance of political, institutional, geographic, and historical factors for economic growth. In contrast, human-capital theories suggest that people's competences, mediated by technological progress, are the deciding factor in a nation's wealth. Using three large-scale assessments, we calculated cognitive-competence sums for the mean and for upper- and lower-level groups for 90 countries and compared the influence of each group's intellectual ability on gross domestic product. In our cross-national analyses, we applied different statistical methods (path analyses, bootstrapping) and measures developed by different research groups to various country samples and historical periods. Our results underscore the decisive relevance of cognitive ability, particularly of an intellectual class with high cognitive ability and accomplishments in science, technology, engineering, and math, for national wealth. Furthermore, this group's cognitive ability predicts the quality of economic and political institutions, which further determines the economic affluence of the nation. Cognitive resources enable the evolution of capitalism and the rise of wealth.
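As a generic illustration of the bootstrapping listed among the study's statistical methods, here is a percentile-bootstrap confidence interval for a correlation coefficient. The country-level numbers below are hypothetical stand-ins, not the study's data:

```python
import random

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def bootstrap_ci(xs, ys, stat=pearson_r, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap: resample (x, y) pairs with replacement and
    take empirical quantiles of the recomputed statistic."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(stat([xs[i] for i in idx], [ys[i] for i in idx]))
    stats.sort()
    return stats[int(n_boot * alpha / 2)], stats[int(n_boot * (1 - alpha / 2)) - 1]

# Hypothetical data: mean ability score vs. log GDP per capita, 12 countries.
ability = [85, 90, 95, 98, 100, 101, 103, 105, 106, 107, 108, 110]
log_gdp = [7.2, 7.8, 8.1, 8.6, 9.0, 8.9, 9.3, 9.5, 9.4, 9.8, 9.9, 10.2]
lo, hi = bootstrap_ci(ability, log_gdp)
print(f"r = {pearson_r(ability, log_gdp):.3f}, 95% CI ~ ({lo:.3f}, {hi:.3f})")
```

The appeal of the bootstrap in cross-national work is that it makes no normality assumption about the sampling distribution of the statistic, which matters with only 90 countries.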

  11. A review of downscaling procedures - a contribution to the research on climate change impacts at city scale

    NASA Astrophysics Data System (ADS)

    Smid, Marek; Costa, Ana; Pebesma, Edzer; Granell, Carlos; Bhattacharya, Devanjan

    2016-04-01

    Humankind is now predominantly urban-based, and the majority of continuing population growth will take place in urban agglomerations. Urban systems are not only major drivers of climate change, but also impact hot spots. Furthermore, climate change impacts are commonly managed at city scale. Therefore, assessing climate change impacts on urban systems is a highly relevant subject of research. Climate and its impacts at all levels (local, meso and global scale), as well as the inter-scale dependencies of those processes, should be subject to detailed analysis. While global and regional projections of future climate are currently available, local-scale information is lacking. Hence, statistical downscaling methodologies represent a potentially efficient way to help close this gap. In general, methodological reviews of downscaling procedures cover the various methods according to their application (e.g., downscaling for hydrological modelling). Some of the most recent and comprehensive studies, such as the ESSEM COST Action ES1102 (VALUE), use the concepts of Perfect Prognosis (Perfect Prog) and Model Output Statistics (MOS). Other classification schemes for downscaling techniques consider three main categories: linear methods, weather classifications and weather generators. Downscaling and climate modelling represent a multidisciplinary field, where researchers from various backgrounds combine their efforts, resulting in specific terminology that may be somewhat confusing. For instance, polynomial regression (also called trend surface analysis) is a statistical technique, yet in the context of spatial interpolation procedures it is commonly classified as deterministic, while kriging approaches are classified as stochastic. 
Furthermore, the terms "statistical" and "stochastic" (frequently used as names of sub-classes in downscaling methodological reviews) are not always treated as synonymous, even though both could be seen as identical in that they refer to methods handling input modelling factors as variables with certain probability distributions. In addition, recent development is moving towards multi-step methodologies containing both deterministic and stochastic components. This evolution has led to the introduction of new terms such as hybrid or semi-stochastic approaches, which makes systematically classifying downscaling methods into the previously defined categories even more challenging. This work presents a review of statistical downscaling procedures that classifies the methods in two steps. In the first step, we describe several techniques that produce a single climatic surface based on observations. The methods are classified into two categories using an approximation to the broadest consensual statistical terms: linear and non-linear methods. The second step covers techniques that use simulations to generate alternative surfaces corresponding to different realizations of the same processes. Those simulations are essential because real observational data are limited, and such procedures are crucial for modelling extremes. This work emphasises the link between statistical downscaling methods and research on climate change impacts at city scale.
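As a toy illustration of the "linear methods" category of statistical downscaling discussed in this abstract (a generic sketch, not a method from the review itself), one can regress a local station observation on a coarse-grid predictor and apply the fitted transfer function to a new model value. The calibration values below are invented:

```python
# Minimal "linear" statistical downscaling step: fit local station
# temperature as a linear function of the coarse grid-cell value, then
# apply the fit to a new coarse projection.
def fit_ols(x, y):
    """Ordinary least squares for a single predictor: returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
            sum((xi - mx) ** 2 for xi in x)
    return slope, my - slope * mx

# Hypothetical calibration pairs (coarse model value, station observation), degC.
coarse = [10.0, 12.5, 15.0, 17.5, 20.0, 22.5]
station = [8.9, 11.8, 14.6, 17.1, 19.8, 22.4]
a, b = fit_ols(coarse, station)

# Downscale a new coarse projection to the station scale.
projected_coarse = 21.0
print(round(a * projected_coarse + b, 2))
```

Real transfer functions are of course multivariate and validated out of sample; this only shows the structure of the deterministic step, to which the stochastic components mentioned above would add variability.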

  12. A randomized trial comparing INR monitoring devices in patients with anticoagulation self-management: evaluation of a novel error-grid approach.

    PubMed

    Hemkens, Lars G; Hilden, Kristian M; Hartschen, Stephan; Kaiser, Thomas; Didjurgeit, Ulrike; Hansen, Roland; Bender, Ralf; Sawicki, Peter T

    2008-08-01

    In addition to the metrological quality of international normalized ratio (INR) monitoring devices used in patients' self-management of long-term anticoagulation, the effectiveness of self-monitoring with such devices has to be evaluated under real-life conditions with a focus on clinical implications. One approach to evaluating the clinical significance of inaccuracies is error-grid analysis, as already established in self-monitoring of blood glucose. Two anticoagulation monitors were compared in a real-life setting, and a novel error-grid instrument for oral anticoagulation was evaluated. In a randomized crossover study, 16 patients performed self-management of anticoagulation using the INRatio and the CoaguChek S system. Main outcome measures were clinically relevant INR differences according to established criteria and to the error-grid approach. A lower rate of clinically relevant disagreements according to Anderson's criteria was found with CoaguChek S than with INRatio, but the difference was not statistically significant (10.77% vs. 12.90%; P = 0.787). Using the error-grid, we found principally consistent results: more measurement pairs with discrepancies of no or low clinical relevance were found with CoaguChek S, whereas with INRatio we found more differences of moderate clinical relevance. A high rate of patient satisfaction with both point-of-care devices was found, with only marginal differences. The investigated point-of-care devices are shown to be principally appropriate for monitoring the INR. The error-grid is useful for comparing monitoring methods with a focus on clinical relevance under real-life conditions, beyond assessing pure metrological quality, but we emphasize that additional trials using this instrument with larger patient populations are needed to detect differences in clinically relevant disagreements.
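The error-grid idea can be sketched generically: each pair of device/reference readings is assigned to a zone by clinical impact. The zone thresholds below are hypothetical placeholders for illustration only; they are not the grid published in this study:

```python
def error_grid_zone(inr_device, inr_reference):
    """Toy error-grid classifier for paired INR readings.
    Zone thresholds are HYPOTHETICAL, chosen only to illustrate the idea:
    A = no clinical relevance, B = low, C = moderate, D = high."""
    rel = abs(inr_device - inr_reference) / inr_reference
    if rel <= 0.10:
        return "A"
    if rel <= 0.20:
        return "B"
    if rel <= 0.30:
        return "C"
    return "D"

# Hypothetical (device, reference) INR pairs.
pairs = [(2.4, 2.5), (3.1, 2.6), (2.0, 3.2), (2.5, 2.49)]
print([error_grid_zone(d, r) for d, r in pairs])
```

A published error grid would additionally make zones depend on where the reference value sits relative to the therapeutic range, since the same absolute error matters more near dosing decision boundaries.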

  13. Overview of the SAE G-11 RMSL (Reliability, Maintainability, Supportability, and Logistics) Division Activities and Technical Projects

    NASA Technical Reports Server (NTRS)

    Singhal, Surendra N.

    2003-01-01

    The SAE G-11 RMSL (Reliability, Maintainability, Supportability, and Logistics) Division activities include identification and fulfillment of joint industry, government, and academia needs for development and implementation of RMSL technologies. Four projects in the probabilistic methods area and two in the area of RMSL have been identified. These are: (1) Evaluation of Probabilistic Technology - progress has been made toward the selection of probabilistic application cases. Future effort will focus on assessment of multiple probabilistic software packages in solving selected engineering problems using probabilistic methods. Relevance to Industry & Government - Case studies of typical problems encountering uncertainties, results of solutions to these problems run by different codes, and recommendations on which code is applicable for which problems; (2) Probabilistic Input Preparation - progress has been made in identifying problem cases such as those with no data, little data and sufficient data. Future effort will focus on developing guidelines for preparing input for probabilistic analysis, especially with no or little data. Relevance to Industry & Government - Too often, we get bogged down thinking we need a lot of data before we can quantify uncertainties. Not true: there are ways to do credible probabilistic analysis with little data; (3) Probabilistic Reliability - a literature search on probabilistic reliability has been completed, along with a clarification of what differentiates it from statistical reliability. Work on computation of reliability based on quantification of uncertainties in primitive variables is in progress. Relevance to Industry & Government - Correct reliability computations at both the component and system level are needed so one can design an item based on its expected usage and life span; (4) Real World Applications of Probabilistic Methods (PM) - a draft of Volume 1, comprising aerospace applications, has been released. 
Volume 2, a compilation of real-world applications of probabilistic methods, with essential information demonstrating application type and time/cost savings from the use of probabilistic methods for generic applications, is in progress. Relevance to Industry & Government - Too often, we say, 'The proof is in the pudding.' With help from many contributors, we hope to produce such a document. The problem is that not many people are coming forward, due to the proprietary nature of their work; so we ask contributors to document only minimal information, including the problem description, what method was used, whether it resulted in any savings, and how much; (5) Software Reliability - software reliability concepts, programs, implementation, guidelines, and standards are being documented. Relevance to Industry & Government - Software reliability is a complex issue that must be understood and addressed in all facets of business in industry, government, and other institutions. We address issues, concepts, ways to implement solutions, and guidelines for maximizing software reliability; (6) Maintainability Standards - maintainability/serviceability industry standards/guidelines and industry best practices and methodologies used in performing maintainability/serviceability tasks are being documented. Relevance to Industry & Government - Any industry or government process, project, and/or tool must be maintained and serviced to realize the life and performance it was designed for. We address issues and develop guidelines for optimum performance and life.

  14. Using Landslide Failure Forecast Models in Near Real Time: the Mt. de La Saxe case-study

    NASA Astrophysics Data System (ADS)

    Manconi, Andrea; Giordan, Daniele

    2014-05-01

    Forecasting the occurrence of landslide phenomena in space and time is a major scientific challenge. The approaches used to forecast landslides depend mainly on the spatial scale analyzed (regional vs. local), the temporal range of the forecast (long- vs. short-term), and the triggering factor and landslide typology considered. Focusing on short-term forecast methods for large, deep-seated slope instabilities, the potential time of failure (ToF) can be estimated by studying the evolution of the landslide deformation over time (i.e., the strain rate), provided that, under constant stress conditions, landslide materials follow a creep mechanism before reaching rupture. In recent decades, different procedures have been proposed to estimate the ToF using simplified empirical and/or graphical methods applied to time series of deformation data. Fukuzono (1985) proposed a failure forecast method based on large-scale laboratory experiments aimed at observing the kinematic evolution of a landslide induced by rain. This approach, also known as the inverse-velocity method, considers the evolution over time of the inverse of the surface velocity (v) as an indicator of the ToF, assuming that failure approaches as 1/v tends to zero. Here we present an innovative method aimed at forecasting landslide failure from near-real-time monitoring data. Starting from the inverse-velocity theory, we analyze landslide surface displacements over different temporal windows, and then apply straightforward statistical methods to obtain confidence intervals on the time of failure. Our results can be relevant to support the management of early warning systems during landslide emergency conditions, including when predefined displacement and/or velocity thresholds are exceeded. 
In addition, our statistical approach for defining confidence intervals and forecast reliability can also be applied to other failure forecast methods. We applied the approach presented here for the first time in near real time during the emergency caused by the reactivation of the La Saxe rockslide, a large mass movement threatening the population of Courmayeur, northern Italy, and the important European route E25. We show how the application of simplified but robust forecast models can be a convenient way to manage and support early warning systems during critical situations. References: Fukuzono T. (1985), A New Method for Predicting the Failure Time of a Slope, Proc. IVth International Conference and Field Workshop on Landslides, Tokyo.
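The inverse-velocity method described in this abstract lends itself to a compact sketch: fit a straight line to 1/v versus time and read off the time at which the fit crosses zero. This is a generic illustration on synthetic data, not the authors' confidence-interval procedure:

```python
# Fukuzono's inverse-velocity method: as failure approaches, 1/v decays
# roughly linearly toward zero, so the x-intercept of a line fitted to
# 1/v vs. time estimates the time of failure (ToF).
def fit_line(t, y):
    """Ordinary least squares: returns (slope, intercept)."""
    n = len(t)
    mt, my = sum(t) / n, sum(y) / n
    slope = sum((ti - mt) * (yi - my) for ti, yi in zip(t, y)) / \
            sum((ti - mt) ** 2 for ti in t)
    return slope, my - slope * mt

# Synthetic monitoring series: time in days, surface velocity in mm/day
# (accelerating, so v grows and 1/v shrinks).
t = [0, 1, 2, 3, 4, 5]
v = [2.0, 2.5, 3.3, 5.0, 10.0, 20.0]
inv_v = [1.0 / vi for vi in v]

slope, intercept = fit_line(t, inv_v)
tof = -intercept / slope   # x-intercept: where the fitted 1/v reaches zero
print(round(tof, 2))
```

The paper's contribution sits on top of this: repeating such fits over different temporal windows and using the spread of the resulting ToF estimates to attach confidence intervals to the forecast.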

  15. Identifying candidate drivers of drug response in heterogeneous cancer by mining high throughput genomics data.

    PubMed

    Nabavi, Sheida

    2016-08-15

    With advances in technologies, huge amounts of multiple types of high-throughput genomics data are available. These data have tremendous potential to identify new and clinically valuable biomarkers to guide the diagnosis, assessment of prognosis, and treatment of complex diseases, such as cancer. Integrating, analyzing, and interpreting big and noisy genomics data to obtain biologically meaningful results, however, remains highly challenging. Mining genomics datasets by utilizing advanced computational methods can help to address these issues. To facilitate the identification of a short list of biologically meaningful genes as candidate drivers of anti-cancer drug resistance from an enormous amount of heterogeneous data, we employed statistical machine-learning techniques and integrated genomics datasets. We developed a computational method that integrates gene expression, somatic mutation, and copy number aberration data of sensitive and resistant tumors. In this method, an integrative approach based on module network analysis is applied to identify potential driver genes. This is followed by cross-validation and a comparison of the results of the sensitive and resistant groups to obtain the final list of candidate biomarkers. We applied this method to the ovarian cancer data from The Cancer Genome Atlas. The final result contains biologically relevant genes, such as COL11A1, which has been reported as a cis-platinum resistance biomarker for epithelial ovarian carcinoma in several recent studies. The described method yields a short list of aberrant genes that also control the expression of their co-regulated genes. The results suggest that this unbiased, data-driven computational method can identify biologically relevant candidate biomarkers. It can be utilized in a wide range of applications that compare two conditions with highly heterogeneous datasets.

  16. FUN-LDA: A Latent Dirichlet Allocation Model for Predicting Tissue-Specific Functional Effects of Noncoding Variation: Methods and Applications.

    PubMed

    Backenroth, Daniel; He, Zihuai; Kiryluk, Krzysztof; Boeva, Valentina; Pethukova, Lynn; Khurana, Ekta; Christiano, Angela; Buxbaum, Joseph D; Ionita-Laza, Iuliana

    2018-05-03

    We describe a method based on a latent Dirichlet allocation model for predicting functional effects of noncoding genetic variants in a cell-type- and/or tissue-specific way (FUN-LDA). Using this unsupervised approach, we predict tissue-specific functional effects for every position in the human genome in 127 different tissues and cell types. We demonstrate the usefulness of our predictions by using several validation experiments. Using eQTL data from several sources, including the GTEx project, Geuvadis project, and TwinsUK cohort, we show that eQTLs in specific tissues tend to be most enriched among the predicted functional variants in relevant tissues in Roadmap. We further show how these integrated functional scores can be used for (1) deriving the most likely cell or tissue type causally implicated for a complex trait by using summary statistics from genome-wide association studies and (2) estimating a tissue-based correlation matrix of various complex traits. We found large enrichment of heritability in functional components of relevant tissues for various complex traits, and FUN-LDA yielded higher enrichment estimates than existing methods. Finally, using experimentally validated functional variants from the literature and variants possibly implicated in disease by previous studies, we rigorously compare FUN-LDA with state-of-the-art functional annotation methods and show that FUN-LDA has better prediction accuracy and higher resolution than these methods. In particular, our results suggest that tissue- and cell-type-specific functional prediction methods tend to have substantially better prediction accuracy than organism-level prediction methods. Scores for each position in the human genome and for each ENCODE and Roadmap tissue are available online (see Web Resources). Copyright © 2018 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
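Enrichment statements like those in this abstract are often summarized as a fold enrichment: the observed overlap between predicted functional positions and eQTLs divided by the overlap expected under independence. A minimal sketch with invented counts, not figures from the paper:

```python
def fold_enrichment(overlap, n_predicted, n_eqtl, n_background):
    """Ratio of observed overlap to the overlap expected by chance if
    predicted-functional positions and eQTLs were independent draws
    from the same background."""
    expected = n_predicted * n_eqtl / n_background
    return overlap / expected

# Hypothetical counts: 400 of 10,000 eQTLs fall within 1,000,000 predicted
# functional positions, out of a 100,000,000-position background.
print(round(fold_enrichment(400, 1_000_000, 10_000, 100_000_000), 1))
```

A value well above 1 in the matched tissue, and near 1 in unrelated tissues, is the pattern the validation experiments above are checking; a significance test (e.g., hypergeometric) would normally accompany the ratio.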

  17. Anatomical masking of pressure footprints based on the Oxford Foot Model: validation and clinical relevance.

    PubMed

    Giacomozzi, Claudia; Stebbins, Julie A

    2017-03-01

    Plantar pressure analysis is widely used in the assessment of foot function. In order to assess regional loading, a mask is applied to the footprint to sub-divide it into regions of interest (ROIs). The most common masking method is based on geometric features of the footprint (GM). Footprint masking based on anatomical landmarks of the foot has been implemented more recently, and involves the integration of a 3D motion capture system, a plantar pressure measurement device, and a multi-segment foot model. However, thorough validation of anatomical masking (AM) using pathological footprints has not yet been presented. In the present study, an AM method based on the Oxford Foot Model (OFM) was compared to an equivalent GM. Pressure footprints from 20 young healthy subjects (HG) and 20 patients with clubfoot (CF) were anatomically divided into 5 ROIs using a subset of the OFM markers. The same foot regions were also identified using a standard GM method. Comparisons of the intra-subject coefficient of variation (CV) showed that the OFM-based AM was at least as reliable as the GM for all investigated pressure parameters in all foot regions. The clinical relevance of AM was investigated by comparing footprints from the HG and CF groups. Contact time, maximum force, force-time integral and contact area proved to be sensitive parameters that were able to distinguish the HG and CF groups using both AM and GM methods. However, the AM method revealed statistically significant differences between groups in 75% of measured variables, compared to 62% using a standard GM method, indicating that the AM method is more sensitive for revealing differences between groups. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Teaching statistics to nursing students: an expert panel consensus.

    PubMed

    Hayat, Matthew J; Eckardt, Patricia; Higgins, Melinda; Kim, MyoungJin; Schmiege, Sarah J

    2013-06-01

    Statistics education is a necessary element of nursing education, and its inclusion is recommended in the American Association of Colleges of Nursing guidelines for nurse training at all levels. This article presents a cohesive summary of an expert panel discussion, "Teaching Statistics to Nursing Students," held at the 2012 Joint Statistical Meetings. All panelists were statistics experts, had extensive teaching and consulting experience, and held faculty appointments in a U.S.-based nursing college or school. The panel discussed degree-specific curriculum requirements, course content, how to ensure nursing students understand the relevance of statistics, approaches to integrating statistics consulting knowledge, experience with classroom instruction, use of knowledge from the statistics education research field to make improvements in statistics education for nursing students, and classroom pedagogy and instruction on the use of statistical software. Panelists also discussed the need for evidence to make data-informed decisions about statistics education and training for nurses. Copyright 2013, SLACK Incorporated.

  19. Lung nodule malignancy classification using only radiologist-quantified image features as inputs to statistical learning algorithms: probing the Lung Image Database Consortium dataset with two statistical learning methods

    PubMed Central

    Hancock, Matthew C.; Magnan, Jerry F.

    2016-01-01

    Abstract. In the assessment of nodules in CT scans of the lungs, a number of image-derived features are diagnostically relevant. Currently, many of these features are defined only qualitatively, so they are difficult to quantify from first principles. Nevertheless, these features (through their qualitative definitions and interpretations thereof) are often quantified via a variety of mathematical methods for the purpose of computer-aided diagnosis (CAD). To determine the potential usefulness of quantified diagnostic image features as inputs to a CAD system, we investigate the predictive capability of statistical learning methods for classifying nodule malignancy. We utilize the Lung Image Database Consortium dataset and only employ the radiologist-assigned diagnostic feature values for the lung nodules therein, as well as our derived estimates of the diameter and volume of the nodules from the radiologists’ annotations. We calculate theoretical upper bounds on the classification accuracy that are achievable by an ideal classifier that only uses the radiologist-assigned feature values, and we obtain an accuracy of 85.74 (±1.14)%, which is, on average, 4.43% below the theoretical maximum of 90.17%. The corresponding area-under-the-curve (AUC) score is 0.932 (±0.012), which increases to 0.949 (±0.007) when diameter and volume features are included and has an accuracy of 88.08 (±1.11)%. Our results are comparable to those in the literature that use algorithmically derived image-based features, which supports our hypothesis that lung nodules can be classified as malignant or benign using only quantified, diagnostic image features, and indicates the competitiveness of this approach. We also analyze how the classification accuracy depends on specific features and feature subsets, and we rank the features according to their predictive power, statistically demonstrating the top four to be spiculation, lobulation, subtlety, and calcification. PMID:27990453

  20. Lung nodule malignancy classification using only radiologist-quantified image features as inputs to statistical learning algorithms: probing the Lung Image Database Consortium dataset with two statistical learning methods.

    PubMed

    Hancock, Matthew C; Magnan, Jerry F

    2016-10-01

    In the assessment of nodules in CT scans of the lungs, a number of image-derived features are diagnostically relevant. Currently, many of these features are defined only qualitatively, so they are difficult to quantify from first principles. Nevertheless, these features (through their qualitative definitions and interpretations thereof) are often quantified via a variety of mathematical methods for the purpose of computer-aided diagnosis (CAD). To determine the potential usefulness of quantified diagnostic image features as inputs to a CAD system, we investigate the predictive capability of statistical learning methods for classifying nodule malignancy. We utilize the Lung Image Database Consortium dataset and only employ the radiologist-assigned diagnostic feature values for the lung nodules therein, as well as our derived estimates of the diameter and volume of the nodules from the radiologists' annotations. We calculate theoretical upper bounds on the classification accuracy that are achievable by an ideal classifier that only uses the radiologist-assigned feature values, and we obtain an accuracy of 85.74 (±1.14)%, which is, on average, 4.43% below the theoretical maximum of 90.17%. The corresponding area-under-the-curve (AUC) score is 0.932 (±0.012), which increases to 0.949 (±0.007) when diameter and volume features are included and has an accuracy of 88.08 (±1.11)%. Our results are comparable to those in the literature that use algorithmically derived image-based features, which supports our hypothesis that lung nodules can be classified as malignant or benign using only quantified, diagnostic image features, and indicates the competitiveness of this approach. 
We also analyze how the classification accuracy depends on specific features and feature subsets, and we rank the features according to their predictive power, statistically demonstrating the top four to be spiculation, lobulation, subtlety, and calcification.
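To make the setup concrete, here is a generic stand-in for the statistical learning step: a plain logistic-regression classifier trained on hypothetical radiologist-style ordinal feature scores. The data, scale, and model below are illustrative assumptions, not the LIDC dataset or the classifiers used in the paper:

```python
import math

def train_logistic(X, y, lr=0.1, epochs=500):
    """Per-sample gradient descent for logistic regression; a simple
    stand-in for a statistical learning method on ordinal features."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - yi                      # gradient of log-loss w.r.t. z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b

def predict(w, b, xi):
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0

# Hypothetical radiologist-style scores (spiculation, lobulation) on a
# 1-5 scale; label 1 = malignant. Invented values, not LIDC annotations.
X = [[1, 1], [1, 2], [2, 1], [2, 2], [4, 4], [4, 5], [5, 4], [5, 5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic(X, y)
accuracy = sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)
print(accuracy)
```

Training accuracy on a toy separable set like this is trivially high; the paper's point is the comparison of cross-validated accuracy against a theoretical upper bound derived from label noise among radiologists.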

  1. Incorporating molecular and functional context into the analysis and prioritization of human variants associated with cancer

    PubMed Central

    Peterson, Thomas A; Nehrt, Nathan L; Park, DoHwan

    2012-01-01

    Background and objective: With recent breakthroughs in high-throughput sequencing, identifying deleterious mutations is one of the key challenges for personalized medicine. At the gene and protein level, it has proven difficult to determine the impact of previously unknown variants. A statistical method has been developed to assess the significance of disease mutation clusters on protein domains by incorporating domain functional annotations to assist in the functional characterization of novel variants. Methods: Disease mutations aggregated from multiple databases were mapped to domains, and were classified as either cancer- or non-cancer-related. The statistical method for identifying significantly disease-associated domain positions was applied to both sets of mutations and to randomly generated mutation sets for comparison. To leverage the known function of protein domain regions, the method optionally distributes significant scores to associated functional feature positions. Results: Most disease mutations are localized within protein domains and display a tendency to cluster at individual domain positions. The method identified significant disease mutation hotspots in both the cancer and non-cancer datasets. The domain significance scores (DS-scores) for cancer form a bimodal distribution, with hotspots in oncogenes forming a second peak at higher DS-scores than non-cancer, while hotspots in tumor suppressors have scores more similar to non-cancers. In addition, on an independent mutation benchmarking set, the DS-score method identified mutations known to alter protein function with very high precision. Conclusion: By aggregating mutations with known disease association at the domain level, the method was able to discover domain positions enriched with multiple occurrences of deleterious mutations while incorporating relevant functional annotations. 
The method can be incorporated into translational bioinformatics tools to characterize rare and novel variants within large-scale sequencing studies. PMID:22319177

  2. A novel bi-level meta-analysis approach: applied to biological pathway analysis.

    PubMed

    Nguyen, Tin; Tagett, Rebecca; Donato, Michele; Mitrea, Cristina; Draghici, Sorin

    2016-02-01

    The accumulation of high-throughput data in public repositories creates a pressing need for integrative analysis of multiple datasets from independent experiments. However, study heterogeneity, study bias, outliers and the lack of power of available methods present real challenges in integrating genomic data. One practical drawback of many P-value-based meta-analysis methods, including Fisher's, Stouffer's, minP and maxP, is that they are sensitive to outliers. Another drawback is that, because they perform just one statistical test for each individual experiment, they may not fully exploit the potentially large number of samples within each study. We propose a novel bi-level meta-analysis approach that employs the additive method and the Central Limit Theorem within each individual experiment and also across multiple experiments. We prove that the bi-level framework is robust against bias, less sensitive to outliers than other methods, and more sensitive to small changes in signal. For comparative analysis, we demonstrate that the intra-experiment analysis has more power than the equivalent statistical test performed on a single large experiment. For pathway analysis, we compare the proposed framework against classical meta-analysis approaches (Fisher's, Stouffer's and the additive method) as well as against a dedicated pathway meta-analysis package (MetaPath), using 1252 samples from 21 datasets related to three human diseases: acute myeloid leukemia (9 datasets), type II diabetes (5 datasets) and Alzheimer's disease (7 datasets). Our framework outperforms its competitors in correctly identifying pathways relevant to the phenotypes. The framework is sufficiently general to be applied to any type of statistical meta-analysis. The R scripts are available on demand from the authors (sorin@wayne.edu). Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. 
For Permissions, please e-mail: journals.permissions@oup.com.
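    The P-value combination rules named in this abstract are standard enough to sketch. The snippet below is an illustrative pure-Python rendering of the textbook formulas for Fisher's method, Stouffer's method, and the additive (sum-of-P-values) method with its Central Limit Theorem approximation; it is not the authors' R code, and the bi-level machinery of the paper is not reproduced here.

```python
import math
from statistics import NormalDist

def fisher_combine(pvals):
    """Fisher: -2 * sum(ln p) ~ chi-square with 2k degrees of freedom."""
    k = len(pvals)
    stat = -2.0 * sum(math.log(p) for p in pvals)
    # For even degrees of freedom 2k, the chi-square survival function
    # has the closed form P(X > x) = exp(-x/2) * sum_{i<k} (x/2)^i / i!
    x = stat / 2.0
    return math.exp(-x) * sum(x**i / math.factorial(i) for i in range(k))

def stouffer_combine(pvals):
    """Stouffer: z = sum(Phi^{-1}(1 - p)) / sqrt(k), combined p = 1 - Phi(z)."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1.0 - p) for p in pvals) / math.sqrt(len(pvals))
    return 1.0 - nd.cdf(z)

def additive_combine(pvals):
    """Additive method: under the null, the sum of k uniform P-values is
    approximately Normal(k/2, k/12) by the Central Limit Theorem."""
    k = len(pvals)
    return NormalDist(mu=k / 2.0, sigma=math.sqrt(k / 12.0)).cdf(sum(pvals))
```

    As a sanity check, a set of uninformative inputs (all P = 0.5) combines to roughly 0.5 under Stouffer's and the additive method, while uniformly small inputs yield a much smaller combined P-value than uniformly moderate ones.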

  3. VALUE - Validating and Integrating Downscaling Methods for Climate Change Research

    NASA Astrophysics Data System (ADS)

    Maraun, Douglas; Widmann, Martin; Benestad, Rasmus; Kotlarski, Sven; Huth, Radan; Hertig, Elke; Wibig, Joanna; Gutierrez, Jose

    2013-04-01

    Our understanding of global climate change is mainly based on General Circulation Models (GCMs) with a relatively coarse resolution. Since climate change impacts are mainly experienced on regional scales, high-resolution climate change scenarios need to be derived from GCM simulations by downscaling. Several projects have been carried out in recent years to validate the performance of statistical and dynamical downscaling, yet several aspects have not been systematically addressed: variability on sub-daily, decadal and longer time-scales, extreme events, spatial variability and inter-variable relationships. Different downscaling approaches such as dynamical downscaling, statistical downscaling and bias correction have not been systematically compared. Furthermore, collaboration between different communities, in particular regional climate modellers, statistical downscalers and statisticians, has been limited. To address these gaps, the EU Cooperation in Science and Technology (COST) action VALUE (www.value-cost.eu) has been established. VALUE is a research network with participants from currently 23 European countries, running from 2012 to 2015. Its main aim is to systematically validate and develop downscaling methods for climate change research in order to improve regional climate change scenarios for use in climate impact studies. Inspired by the co-design idea of the international research initiative "Future Earth", stakeholders of climate change information have been involved in defining the research questions to be addressed and are actively participating in the network. The key idea of VALUE is to identify the relevant weather and climate characteristics required as input for a wide range of impact models and to define an open framework to systematically validate these characteristics. Based on a range of benchmark data sets, in principle every downscaling method can be validated and compared with competing methods. 
The results of this exercise will directly provide end users with important information about the uncertainty of regional climate scenarios, and will furthermore provide the basis for further developing downscaling methods. This presentation will provide background information on VALUE and discuss the identified characteristics and the validation framework.

  4. Decrease of Fisher information and the information geometry of evolution equations for quantum mechanical probability amplitudes.

    PubMed

    Cafaro, Carlo; Alsing, Paul M

    2018-04-01

    The relevance of the concept of Fisher information is increasing in both statistical physics and quantum computing. From a statistical mechanical standpoint, the application of Fisher information in the kinetic theory of gases is characterized by its decrease along the solutions of the Boltzmann equation for Maxwellian molecules in the two-dimensional case. From a quantum mechanical standpoint, the output state in Grover's quantum search algorithm follows a geodesic path obtained from the Fubini-Study metric on the manifold of Hilbert-space rays. Additionally, Grover's algorithm is specified by constant Fisher information. In this paper, we present an information geometric characterization of the oscillatory or monotonic behavior of statistically parametrized squared probability amplitudes originating from special functional forms of the Fisher information function: constant, exponential decay, and power-law decay. Furthermore, for each case, we compute both the computational speed and the availability loss of the corresponding physical processes by exploiting a convenient Riemannian geometrization of useful thermodynamical concepts. Finally, we briefly comment on the possibility of using the proposed methods of information geometry to help identify a suitable trade-off between speed and thermodynamic efficiency in quantum search algorithms.
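    For orientation, the Fisher information of a one-parameter family of probability densities p(x|θ), the quantity whose constant, exponential-decay, and power-law forms organize the analysis above, has the standard definition:

```latex
\mathcal{F}(\theta)
  = \int p(x\,|\,\theta)\left(\frac{\partial \ln p(x\,|\,\theta)}{\partial\theta}\right)^{2} dx
  = 4\int \left(\frac{\partial \sqrt{p(x\,|\,\theta)}}{\partial\theta}\right)^{2} dx .
```

    The second form makes the link to squared probability amplitudes explicit: for p = |ψ|², the integrand measures the squared speed of the parametrized evolution, so constant Fisher information corresponds to motion at uniform statistical speed, consistent with the geodesic picture of Grover's algorithm described above.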

  5. Non-equilibrium Statistical Mechanics and the Sea Ice Thickness Distribution

    NASA Astrophysics Data System (ADS)

    Wettlaufer, John; Toppaladoddi, Srikanth

    We use concepts from non-equilibrium statistical physics to transform the original evolution equation for the sea ice thickness distribution g(h) due to Thorndike et al. (1975) into a Fokker-Planck-like conservation law. The steady solution is g(h) = N(q) h^q exp(-h/H), where q and H are expressible in terms of moments over the transition probabilities between thickness categories. The solution exhibits the functional form used in observational fits and shows that for h << 1, g(h) is controlled by both thermodynamics and mechanics, whereas for h >> 1 only mechanics controls g(h). Finally, we derive the underlying Langevin equation governing the dynamics of the ice thickness h, from which we predict the observed g(h). This allows us to demonstrate that the ice thickness field is ergodic. The genericity of our approach provides a framework for studying the geophysical scale structure of the ice pack using methods of broad relevance in statistical mechanics. We acknowledge Swedish Research Council Grant No. 638-2013-9243, NASA Grant NNH13ZDA001N-CRYO, and the National Science Foundation and the Office of Naval Research under OCE-1332750 for support.
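    The steady solution quoted above is a gamma-type density, so its normalization constant can be checked numerically. The sketch below (with illustrative parameter values, not values from the paper) verifies that a trapezoid-rule integration of h^q exp(-h/H) reproduces the closed-form constant 1 / (H^(q+1) Γ(q+1)).

```python
import math

def g_unnormalized(h, q, H):
    """Un-normalized steady-state thickness distribution h^q * exp(-h/H)."""
    return h**q * math.exp(-h / H)

def normalization_numeric(q, H, hmax=60.0, n=120000):
    """N(q) from a trapezoid-rule integral over [0, hmax]."""
    dh = hmax / n
    area = 0.5 * (g_unnormalized(0.0, q, H) + g_unnormalized(hmax, q, H))
    area += sum(g_unnormalized(i * dh, q, H) for i in range(1, n))
    return 1.0 / (area * dh)

def normalization_closed(q, H):
    """Closed form: integral of h^q e^{-h/H} dh over [0, inf) = H^(q+1) Gamma(q+1)."""
    return 1.0 / (H**(q + 1) * math.gamma(q + 1))
```

    With q = 2 and H = 1.5, both routes agree to several decimal places, which is a quick way to validate any re-implementation of the fit.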

  7. Inactive Hepatitis B Carrier and Pregnancy Outcomes: A Systematic Review and Meta-analysis.

    PubMed

    Keramat, Afsaneh; Younesian, Masud; Gholami Fesharaki, Mohammad; Hasani, Maryam; Mirzaei, Samaneh; Ebrahimi, Elham; Alavian, Seyed Moaed; Mohammadi, Fatemeh

    2017-04-01

    We aimed to explore whether maternal asymptomatic hepatitis B (HB) infection affects pre-term rupture of membranes (PROM), stillbirth, preeclampsia, eclampsia, gestational hypertension, or antepartum hemorrhage. We searched PubMed, Scopus, and the ISI Web of Science from 1990 to Feb 2015. In addition, the electronic literature searches were supplemented by searching the gray literature (e.g., conference abstracts, theses, and technical reports) and scanning the reference lists of included studies and relevant systematic reviews. We explored statistical heterogeneity using the I2 and tau-squared (Tau2) statistics. Eighteen studies were included. Pre-term rupture of membranes (PROM), stillbirth, preeclampsia, eclampsia, gestational hypertension, and antepartum hemorrhage were the outcomes considered in this survey. The results showed no significant association between inactive HB and these complications in pregnancy. The small P-values and chi-square values and the large I2 suggested probable heterogeneity, which we tried to account for with statistical methods such as subgroup analysis. Inactive HB infection did not increase the risk of the adverse outcomes examined in this study. Further well-designed studies should be performed to confirm the results.
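    The heterogeneity statistics named above are conventionally computed from Cochran's Q. The sketch below uses the common DerSimonian-Laird formulas for I2 and tau-squared (an assumption on our part, since the abstract does not say which estimator was used); inputs are study effect sizes with their within-study variances.

```python
def heterogeneity(effects, variances):
    """Cochran's Q, I^2 (in %), and the DerSimonian-Laird tau^2 estimate."""
    w = [1.0 / v for v in variances]           # inverse-variance weights
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    k = len(effects)
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    return q, i2, tau2
```

    Identical study effects give Q near zero (hence I2 = 0% and tau^2 = 0), while two sharply conflicting precise studies drive I2 toward 100%.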

  8. Rank Dynamics of Word Usage at Multiple Scales

    NASA Astrophysics Data System (ADS)

    Morales, José A.; Colman, Ewan; Sánchez, Sergio; Sánchez-Puig, Fernanda; Pineda, Carlos; Iñiguez, Gerardo; Cocho, Germinal; Flores, Jorge; Gershenson, Carlos

    2018-05-01

    The recent dramatic increase in online data availability has allowed researchers to explore human culture with unprecedented detail, such as the growth and diversification of language. In particular, it provides statistical tools to explore whether word use is similar across languages, and if so, whether these generic features appear at different scales of language structure. Here we use the Google Books N-grams dataset to analyze the temporal evolution of word usage in several languages. We apply measures proposed recently to study rank dynamics, such as the diversity of N-grams in a given rank, the probability that an N-gram changes rank between successive time intervals, the rank entropy, and the rank complexity. Across these different methods, our results show that there are generic properties for different languages at different scales, such as a core of words necessary to minimally understand a language. We also propose a null model to explore the relevance of linguistic structure across multiple scales, concluding that N-gram statistics cannot be reduced to word statistics. We expect our results to be useful in improving text prediction algorithms, as well as in shedding light on the large-scale features of language use, beyond linguistic and cultural differences across human populations.
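    One of the rank-dynamics measures mentioned, rank diversity (the number of distinct items that occupy a given rank across time slices, normalized by the number of slices), is simple to compute. The following is a generic sketch on toy data, not the authors' pipeline.

```python
def rank_diversity(rankings):
    """rankings: list of rank-ordered item lists, one list per time slice.
    Returns d(k) for each rank k: distinct items seen at rank k, divided
    by the number of time slices."""
    n_slices = len(rankings)
    n_ranks = min(len(r) for r in rankings)
    return [len({r[k] for r in rankings}) / n_slices for k in range(n_ranks)]

# Toy example: top-3 word lists for three consecutive years.
slices = [["the", "of", "data"],
          ["the", "of", "model"],
          ["the", "to", "model"]]
```

    On this toy input, rank 1 is perfectly stable (diversity 1/3, the minimum for three slices) while the lower ranks churn, which is the qualitative pattern the paper reports for real corpora.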

  9. A study of correlations between crude oil spot and futures markets: A rolling sample test

    NASA Astrophysics Data System (ADS)

    Liu, Li; Wan, Jieqiu

    2011-10-01

    In this article, we investigate the asymmetries of exceedance correlations and cross-correlations between West Texas Intermediate (WTI) spot and futures markets. First, employing the test statistic proposed by Hong et al. [Asymmetries in stock returns: statistical tests and economic evaluation, Review of Financial Studies 20 (2007) 1547-1581], we find that the exceedance correlations were overall symmetric. However, the results from rolling windows show that some occasional events could induce significant asymmetries in the exceedance correlations. Second, employing the test statistic proposed by Podobnik et al. [Quantifying cross-correlations using local and global detrending approaches, European Physical Journal B 71 (2009) 243-250], we find that the cross-correlations were significant even for large lagged orders. Using the detrended cross-correlation analysis proposed by Podobnik and Stanley [Detrended cross-correlation analysis: a new method for analyzing two nonstationary time series, Physical Review Letters 100 (2008) 084102], we find that the cross-correlations were weakly persistent and were stronger between the spot and futures contracts with larger maturity. Our results from the rolling sample test also show the apparent effects of exogenous events. Additionally, we offer some relevant discussion of the obtained evidence.
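    The Podobnik et al. cross-correlation statistic cited above has a compact form: Qcc(m) = N^2 * sum over i = 1..m of c_i^2 / (N - i), where c_i is the lag-i sample cross-correlation; under the null of no cross-correlation it is approximately chi-square distributed with m degrees of freedom. A minimal sketch (our own rendering of the published formula, not the authors' code):

```python
import math

def cross_corr_stat(x, y, m):
    """Qcc(m) = N^2 * sum_{i=1}^m c_i^2 / (N - i); approximately
    chi-square with m degrees of freedom when x and y are uncorrelated."""
    n = len(x)
    denom = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    q = 0.0
    for i in range(1, m + 1):
        c_i = sum(x[k] * y[k - i] for k in range(i, n)) / denom
        q += c_i * c_i / (n - i)
    return n * n * q
```

    In practice one compares Qcc(m) against the chi-square(m) critical value; large values reject the no-cross-correlation null at lag depth m.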

  10. Quantitatively measured tremor in hand-arm vibration-exposed workers.

    PubMed

    Edlund, Maria; Burström, Lage; Hagberg, Mats; Lundström, Ronnie; Nilsson, Tohr; Sandén, Helena; Wastensson, Gunilla

    2015-04-01

    The aim of the present study was to investigate the possible increase in hand tremor in relation to hand-arm vibration (HAV) exposure in a cohort of exposed and unexposed workers. Participants were 178 male workers with or without exposure to HAV. The study is cross-sectional regarding the outcome of tremor and has a longitudinal design with respect to exposure. The dose of HAV exposure was collected via questionnaires and measurements at several follow-ups. The CATSYS Tremor Pen(®) was used for measuring postural tremor. Multiple linear regression methods were used to analyze associations between different tremor variables and HAV exposure, along with predictor variables with biological relevance. There were no statistically significant associations between the different tremor variables and cumulative HAV or current exposure. Age was a statistically significant predictor of variation in tremor outcomes for three of the four tremor variables, whereas nicotine use was a statistically significant predictor of either left or right hand or both hands for all four tremor variables. In the present study, there was no evidence of an exposure-response association between HAV exposure and measured postural tremor. Increase in age and nicotine use appeared to be the strongest predictors of tremor.

  11. Model independent approach to the single photoelectron calibration of photomultiplier tubes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Saldanha, R.; Grandi, L.; Guardincerri, Y.

    2017-08-01

    The accurate calibration of photomultiplier tubes is critical in a wide variety of applications in which it is necessary to know the absolute number of detected photons or precisely determine the resolution of the signal. Conventional calibration methods rely on fitting the photomultiplier response to a low intensity light source with analytical approximations to the single photoelectron distribution, often leading to biased estimates due to the inability to accurately model the full distribution, especially at low charge values. In this paper we present a simple statistical method to extract the relevant single photoelectron calibration parameters without making any assumptions about the underlying single photoelectron distribution. We illustrate the use of this method through the calibration of a Hamamatsu R11410 photomultiplier tube and study the accuracy and precision of the method using Monte Carlo simulations. The method is found to have significantly reduced bias compared to conventional methods and works under a wide range of light intensities, making it suitable for simultaneously calibrating large arrays of photomultiplier tubes.
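    As a loose illustration of the kind of assumption-free Poisson reasoning such calibrations rely on (this is a generic occupancy-based estimator, not the specific method of the paper): if triggers are Poisson-distributed in photoelectron count, the fraction f0 of pedestal-only triggers fixes the mean occupancy lambda = -ln(f0), and the mean measured charge divided by lambda estimates the single-photoelectron gain without any model of the charge distribution's shape.

```python
import math

def estimate_spe_gain(charges, zero_fraction):
    """Estimate single-photoelectron gain from low-intensity data.
    zero_fraction: fraction of triggers with no photoelectron (pedestal only).
    Poisson statistics: P(0 pe) = exp(-lambda), so lambda = -ln(zero_fraction);
    the mean charge equals lambda * gain, giving gain = mean / lambda."""
    lam = -math.log(zero_fraction)
    mean_charge = sum(charges) / len(charges)
    return mean_charge / lam
```

    For example, a mean charge of 1.0 (arbitrary units) together with f0 = e^(-0.5) implies lambda = 0.5 and a gain estimate of 2.0.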

  12. Predictive distributions for between-study heterogeneity and simple methods for their application in Bayesian meta-analysis

    PubMed Central

    Turner, Rebecca M; Jackson, Dan; Wei, Yinghui; Thompson, Simon G; Higgins, Julian P T

    2015-01-01

    Numerous meta-analyses in healthcare research combine results from only a small number of studies, for which the variance representing between-study heterogeneity is estimated imprecisely. A Bayesian approach to estimation allows external evidence on the expected magnitude of heterogeneity to be incorporated. The aim of this paper is to provide tools that improve the accessibility of Bayesian meta-analysis. We present two methods for implementing Bayesian meta-analysis, using numerical integration and importance sampling techniques. Based on 14 886 binary outcome meta-analyses in the Cochrane Database of Systematic Reviews, we derive a novel set of predictive distributions for the degree of heterogeneity expected in 80 settings depending on the outcomes assessed and comparisons made. These can be used as prior distributions for heterogeneity in future meta-analyses. The two methods are implemented in R, for which code is provided. Both methods produce equivalent results to standard but more complex Markov chain Monte Carlo approaches. The priors are derived as log-normal distributions for the between-study variance, applicable to meta-analyses of binary outcomes on the log odds-ratio scale. The methods are applied to two example meta-analyses, incorporating the relevant predictive distributions as prior distributions for between-study heterogeneity. We have provided resources to facilitate Bayesian meta-analysis, in a form accessible to applied researchers, which allow relevant prior information on the degree of heterogeneity to be incorporated. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:25475839
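    The numerical-integration route described above can be sketched generically: place the log-normal prior for the between-study variance on a grid, weight each grid point by the marginal likelihood of the observed log odds-ratios (with the pooled effect integrated out under a flat prior), and normalize. The hyperparameters below are illustrative placeholders, not one of the paper's 80 derived predictive distributions.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Density of a log-normal distribution at x > 0."""
    return (math.exp(-(math.log(x) - mu) ** 2 / (2.0 * sigma ** 2))
            / (x * sigma * math.sqrt(2.0 * math.pi)))

def marginal_loglik(tau2, y, v):
    """Log marginal likelihood of effects y (variances v) given tau^2,
    with the pooled mean integrated out under a flat prior."""
    w = [1.0 / (vi + tau2) for vi in v]
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, y))
    return 0.5 * sum(math.log(wi) for wi in w) - 0.5 * math.log(sum(w)) - 0.5 * q

def grid_posterior(y, v, mu=-2.56, sigma=1.74, grid=None):
    """Normalized posterior weights for tau^2 on a grid: log-normal prior
    times the flat-mean marginal likelihood."""
    if grid is None:
        grid = [0.001 * 1.05 ** i for i in range(200)]
    logpost = [math.log(lognormal_pdf(t, mu, sigma)) + marginal_loglik(t, y, v)
               for t in grid]
    m = max(logpost)
    w = [math.exp(lp - m) for lp in logpost]
    s = sum(w)
    return grid, [wi / s for wi in w]
```

    The grid posterior can then be summarized (posterior mean, credible interval) or used to weight downstream quantities, mimicking the paper's numerical-integration implementation in spirit.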

  13. [Studies on localized low-risk prostate cancer: Do we know enough?]

    PubMed

    Weißbach, L; Roloff, C

    2018-06-05

    Treatment of localized low-risk prostate cancer (PCa) is undergoing a paradigm shift: invasive treatments such as surgery and radiation therapy are being replaced by defensive strategies such as active surveillance (AS) and watchful waiting (WW). The aim of this work is to evaluate the significance of current studies regarding defensive strategies (AS and WW). The best-known AS studies are critically evaluated for their significance in terms of input criteria, follow-up criteria, and statistical significance. The difficulties faced by randomized studies in answering the question of the best treatment for low-risk cancer in two or even more study groups with known low tumor-specific mortality are clearly shown. Some studies fail because of the objective; others, like PIVOT, are underpowered. For ProtecT, a renowned randomized controlled trial (RCT), systematic and statistical shortcomings are listed in detail. The time and effort required for RCTs to answer the question of which therapy is best for locally limited low-risk cancer is very large, because the low specific mortality rate requires a large number of participants and a long study duration. In any case, RCTs create hand-picked cohorts for statistical evaluation that have little to do with care in daily clinical practice. The necessary randomization is also offset by the decision-making of the informed patient. If further studies of low-risk PCa are needed, they will need real-world conditions that an RCT cannot provide. To obtain clinically relevant results, we need to rethink: when planning a study, biometricians and clinicians must understand that the statistical methods used in RCTs are of limited use, and they must select a method (e.g. propensity scores) appropriate for health care research.

  14. A Bayesian statistical analysis of mouse dermal tumor promotion assay data for evaluating cigarette smoke condensate.

    PubMed

    Kathman, Steven J; Potts, Ryan J; Ayres, Paul H; Harp, Paul R; Wilson, Cody L; Garner, Charles D

    2010-10-01

    The mouse dermal assay has long been used to assess the dermal tumorigenicity of cigarette smoke condensate (CSC). This mouse skin model has been developed for use in carcinogenicity testing utilizing the SENCAR mouse as the standard strain. Though the model has limitations, it remains as the most relevant method available to study the dermal tumor promoting potential of mainstream cigarette smoke. In the typical SENCAR mouse CSC bioassay, CSC is applied for 29 weeks following the application of a tumor initiator such as 7,12-dimethylbenz[a]anthracene (DMBA). Several endpoints are considered for analysis including: the percentage of animals with at least one mass, latency, and number of masses per animal. In this paper, a relatively straightforward analytic model and procedure is presented for analyzing the time course of the incidence of masses. The procedure considered here takes advantage of Bayesian statistical techniques, which provide powerful methods for model fitting and simulation. Two datasets are analyzed to illustrate how the model fits the data, how well the model may perform in predicting data from such trials, and how the model may be used as a decision tool when comparing the dermal tumorigenicity of cigarette smoke condensate from multiple cigarette types. The analysis presented here was developed as a statistical decision tool for differentiating between two or more prototype products based on the dermal tumorigenicity. Copyright (c) 2010 Elsevier Inc. All rights reserved.

  15. Unravelling the complex MRI pattern in glutaric aciduria type I using statistical models-a cohort study in 180 patients.

    PubMed

    Garbade, Sven F; Greenberg, Cheryl R; Demirkol, Mübeccel; Gökçay, Gülden; Ribes, Antonia; Campistol, Jaume; Burlina, Alberto B; Burgard, Peter; Kölker, Stefan

    2014-09-01

    Glutaric aciduria type I (GA-I) is a cerebral organic aciduria caused by inherited deficiency of glutaryl-CoA dehydrogenase and is characterized biochemically by an accumulation of putatively neurotoxic dicarboxylic metabolites. The majority of untreated patients develops a complex movement disorder with predominant dystonia during age 3-36 months. Magnetic resonance imaging (MRI) studies have demonstrated striatal and extrastriatal abnormalities. The major aim of this study was to elucidate the complex neuroradiological pattern of patients with GA-I and to associate the MRI findings with the severity of predominant neurological symptoms. In 180 patients, detailed information about the neurological presentation and brain region-specific MRI abnormalities were obtained via a standardized questionnaire. Patients with a movement disorder had more often MRI abnormalities in putamen, caudate, cortex, ventricles and external CSF spaces than patients without or with minor neurological symptoms. Putaminal MRI changes and strongly dilated ventricles were identified as the most reliable predictors of a movement disorder. In contrast, abnormalities in globus pallidus were not clearly associated with a movement disorder. Caudate and putamen as well as cortex, ventricles and external CSF spaces clearly collocalized on a two-dimensional map demonstrating statistical similarity and suggesting the same underlying pathomechanism. This study demonstrates that complex statistical methods are useful to decipher the age-dependent and region-specific MRI patterns of rare neurometabolic diseases and that these methods are helpful to elucidate the clinical relevance of specific MRI findings.

  16. Enrichment analysis in high-throughput genomics - accounting for dependency in the NULL.

    PubMed

    Gold, David L; Coombes, Kevin R; Wang, Jing; Mallick, Bani

    2007-03-01

    Translating the overwhelming amount of data generated in high-throughput genomics experiments into biologically meaningful evidence, which may for example point to a series of biomarkers or hint at a relevant pathway, is a matter of great interest in bioinformatics these days. Genes showing similar experimental profiles, it is hypothesized, share biological mechanisms that if understood could provide clues to the molecular processes leading to pathological events. It is the topic of further study to learn if or how a priori information about the known genes may serve to explain coexpression. One popular method of knowledge discovery in high-throughput genomics experiments, enrichment analysis (EA), seeks to infer if an interesting collection of genes is 'enriched' for a particular set of a priori Gene Ontology Consortium (GO) classes. For the purposes of statistical testing, the conventional methods offered in EA software implicitly assume independence between the GO classes. Genes may be annotated for more than one biological classification, and therefore the resulting test statistics of enrichment between GO classes can be highly dependent if the overlapping gene sets are relatively large. There is a need to formally determine if conventional EA results are robust to the independence assumption. We derive the exact null distribution for testing enrichment of GO classes by relaxing the independence assumption using well-known statistical theory. In applications with publicly available data sets, our test results are similar to the conventional approach which assumes independence. We argue that the independence assumption is not detrimental.
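    The per-class test that conventional EA software applies is typically the one-sided hypergeometric tail probability (equivalent to a one-sided Fisher's exact test). A self-contained version, for context on what the "independence between classes" assumption is layered on top of:

```python
from math import comb

def enrichment_pvalue(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n):
    N genes on the array, K annotated to the GO class,
    n genes in the interesting list, k of those n annotated to the class."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total
```

    For instance, drawing 5 genes from a universe of 10 and hitting all 5 members of a class has probability 1 / C(10, 5) = 1/252, while requiring at least 0 hits trivially gives probability 1.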

  17. Problematic smartphone use: A conceptual overview and systematic review of relations with anxiety and depression psychopathology.

    PubMed

    Elhai, Jon D; Dvorak, Robert D; Levine, Jason C; Hall, Brian J

    2017-01-01

    Research literature on problematic smartphone use, or smartphone addiction, has proliferated. However, relationships with existing categories of psychopathology are not well defined. We discuss the concept of problematic smartphone use, including possible causal pathways to such use. We conducted a systematic review of the relationship between problematic use and psychopathology. Using scholarly bibliographic databases, we screened 117 total citations, resulting in 23 peer-reviewed papers examining statistical relations between standardized measures of problematic smartphone use/use severity and the severity of psychopathology. Most papers examined problematic use in relation to depression, anxiety, chronic stress and/or low self-esteem. Across this literature, without statistically adjusting for other relevant variables, depression severity was consistently related to problematic smartphone use, demonstrating at least medium effect sizes. Anxiety was also consistently related to problem use, but with small effect sizes. Stress was somewhat consistently related, with small to medium effects. Self-esteem was inconsistently related, with small to medium effects when found. Statistically adjusting for other relevant variables yielded similar but somewhat smaller effects. We included only correlational studies in our systematic review, but also address the few relevant experimental studies. We discuss causal explanations for relationships between problem smartphone use and psychopathology. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. Two-stage atlas subset selection in multi-atlas based image segmentation.

    PubMed

    Zhao, Tingting; Ruan, Dan

    2015-06-01

    Fast growing access to large databases and cloud stored data presents a unique opportunity for multi-atlas based image segmentation and also presents challenges in heterogeneous atlas quality and computation burden. This work aims to develop a novel two-stage method tailored to the special needs in the face of a large atlas collection with varied quality, so that high-accuracy segmentation can be achieved with low computational cost. An atlas subset selection scheme is proposed to substitute a significant portion of the computationally expensive full-fledged registration in the conventional scheme with a low-cost alternative. More specifically, the authors introduce a two-stage atlas subset selection method. In the first stage, an augmented subset is obtained based on a low-cost registration configuration and a preliminary relevance metric; in the second stage, the subset is further narrowed down to a fusion set of desired size, based on full-fledged registration and a refined relevance metric. An inference model is developed to characterize the relationship between the preliminary and refined relevance metrics, and a proper augmented subset size is derived to ensure that the desired atlases survive the preliminary selection with high probability. The performance of the proposed scheme has been assessed with cross validation based on two clinical datasets consisting of manually segmented prostate and brain magnetic resonance images, respectively. The proposed scheme demonstrates comparable end-to-end segmentation performance as the conventional single-stage selection method, but with significant computation reduction. Compared with the alternative computation reduction method, the proposed scheme improves the mean and median Dice similarity coefficient values from (0.74, 0.78) to (0.83, 0.85) and from (0.82, 0.84) to (0.95, 0.95) for prostate and corpus callosum segmentation, respectively, with statistical significance. 
The authors have developed a novel two-stage atlas subset selection scheme for multi-atlas based segmentation. It achieves good segmentation accuracy with significantly reduced computation cost, making it a suitable configuration in the presence of extensive heterogeneous atlases.
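    The Dice similarity coefficient used to score these segmentations is DSC = 2|A ∩ B| / (|A| + |B|) for two label masks; a minimal version over sets of foreground voxel indices (a generic formula, independent of the authors' pipeline):

```python
def dice(a, b):
    """Dice similarity coefficient of two sets of foreground voxel indices.
    Two empty masks are treated as a perfect match."""
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))
```

    Two masks of four voxels each sharing two voxels score 2*2/8 = 0.5; identical masks score 1.0 and disjoint masks score 0.0.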

  19. Use of keyword hierarchies to interpret gene expression patterns.

    PubMed

    Masys, D R; Welsh, J B; Lynn Fink, J; Gribskov, M; Klacansky, I; Corbeil, J

    2001-04-01

    High-density microarray technology permits the quantitative and simultaneous monitoring of thousands of genes. The interpretation challenge is to extract relevant information from this large amount of data. A growing variety of statistical analysis approaches are available to identify clusters of genes that share common expression characteristics, but provide no information regarding the biological similarities of genes within clusters. The published literature provides a potential source of information to assist in interpretation of clustering results. We describe a data mining method that uses indexing terms ('keywords') from the published literature linked to specific genes to present a view of the conceptual similarity of genes within a cluster or group of interest. The method takes advantage of the hierarchical nature of Medical Subject Headings used to index citations in the MEDLINE database, and the registry numbers applied to enzymes.

  20. Framing Electronic Medical Records as Polylingual Documents in Query Expansion

    PubMed Central

    Huang, Edward W; Wang, Sheng; Lee, Doris Jung-Lin; Zhang, Runshun; Liu, Baoyan; Zhou, Xuezhong; Zhai, ChengXiang

    2017-01-01

    We present a study of electronic medical record (EMR) retrieval that emulates situations in which a doctor treats a new patient. Given a query consisting of a new patient’s symptoms, the retrieval system returns the set of most relevant records of previously treated patients. However, due to semantic, functional, and treatment synonyms in medical terminology, queries are often incomplete and thus require enhancement. In this paper, we present a topic model that frames symptoms and treatments as separate languages. Our experimental results show that this method improves retrieval performance over several baselines with statistical significance. These baselines include methods used in prior studies as well as state-of-the-art embedding techniques. Finally, we show that our proposed topic model discovers all three types of synonyms to improve medical record retrieval. PMID:29854161

  1. Waste management barriers in developing country hospitals: Case study and AHP analysis.

    PubMed

    Delmonico, Diego V de Godoy; Santos, Hugo H Dos; Pinheiro, Marco Ap; de Castro, Rosani; de Souza, Regiane M

    2018-01-01

    Healthcare waste management is an essential field for both researchers and practitioners. Although few studies have used statistical methods for its evaluation, it has been the subject of several studies in different contexts. Furthermore, the known precarious waste management practices in developing countries raise questions about potential barriers. This study aims to investigate the barriers in healthcare waste management and their relevance. For this purpose, this paper analyses waste management practices in two Brazilian hospitals using a case study approach and the Analytic Hierarchy Process (AHP) method. The barriers were organized into three categories (human factors, management, and infrastructure), and the main findings suggest that cost and employee awareness were the most significant barriers. These results highlight the main barriers to more sustainable waste management and provide an empirical basis for multi-criteria evaluation of the literature.
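    For reference, the computational core of the Analytic Hierarchy Process is extracting a priority vector from a reciprocal pairwise comparison matrix, usually via its principal eigenvector, together with a consistency check. A generic sketch using power iteration (illustrative only, with Saaty's published random-index values; not the authors' implementation):

```python
def ahp_priorities(M, iters=500):
    """Principal-eigenvector priorities and consistency ratio (CR) for a
    reciprocal pairwise comparison matrix M (list of lists)."""
    n = len(M)
    v = [1.0 / n] * n
    for _ in range(iters):                      # power iteration
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        v = [wi / s for wi in w]
    mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
    lam = sum(mv[i] / v[i] for i in range(n)) / n   # lambda_max estimate
    ci = (lam - n) / (n - 1)                         # consistency index
    ri = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}[n]  # Saaty's random index
    cr = ci / ri if ri > 0 else 0.0
    return v, cr
```

    A perfectly consistent matrix (entries w_i / w_j for true weights w) returns those weights exactly and a consistency ratio of zero; CR above roughly 0.1 conventionally signals inconsistent judgments.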

  2. Translating statistical species-habitat models to interactive decision support tools

    USGS Publications Warehouse

    Wszola, Lyndsie S.; Simonsen, Victoria L.; Stuber, Erica F.; Gillespie, Caitlyn R.; Messinger, Lindsey N.; Decker, Karie L.; Lusk, Jeffrey J.; Jorgensen, Christopher F.; Bishop, Andrew A.; Fontaine, Joseph J.

    2017-01-01

    Understanding species-habitat relationships is vital to successful conservation, but the tools used to communicate species-habitat relationships are often poorly suited to the information needs of conservation practitioners. Here we present a novel method for translating a statistical species-habitat model, a regression analysis relating ring-necked pheasant abundance to landcover, into an interactive online tool. The Pheasant Habitat Simulator combines the analytical power of the R programming environment with the user-friendly Shiny web interface to create an online platform in which wildlife professionals can explore the effects of variation in local landcover on relative pheasant habitat suitability within spatial scales relevant to individual wildlife managers. Our tool allows users to virtually manipulate the landcover composition of a simulated space to explore how changes in landcover may affect pheasant relative habitat suitability, and guides users through the economic tradeoffs of landscape changes. We offer suggestions for development of similar interactive applications and demonstrate their potential as innovative science delivery tools for diverse professional and public audiences.

  3. Extracting semantic representations from word co-occurrence statistics: stop-lists, stemming, and SVD.

    PubMed

    Bullinaria, John A; Levy, Joseph P

    2012-09-01

    In a previous article, we presented a systematic computational study of the extraction of semantic representations from the word-word co-occurrence statistics of large text corpora. The conclusion was that semantic vectors of pointwise mutual information values from very small co-occurrence windows, together with a cosine distance measure, consistently resulted in the best representations across a range of psychologically relevant semantic tasks. This article extends that study by investigating the use of three further factors--namely, the application of stop-lists, word stemming, and dimensionality reduction using singular value decomposition (SVD)--that have been used to provide improved performance elsewhere. It also introduces an additional semantic task and explores the advantages of using a much larger corpus. This leads to the discovery and analysis of improved SVD-based methods for generating semantic representations (that provide new state-of-the-art performance on a standard TOEFL task) and the identification and discussion of problems and misleading results that can arise without a full systematic study.
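    The core pipeline the study evaluates (small co-occurrence windows, pointwise mutual information vectors, cosine similarity) can be sketched on a toy corpus. This is an illustrative reimplementation, not the authors' code, and the corpus is invented:

    ```python
    import math
    from collections import Counter, defaultdict

    # Toy corpus and a +/-1-word co-occurrence window (the article found that
    # very small windows with PMI vectors and a cosine measure work best).
    corpus = ("the cat sat on the mat the dog sat on the rug "
              "a cat and a dog are pets").split()
    window = 1

    cooc = defaultdict(Counter)
    for i, w in enumerate(corpus):
        for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
            if j != i:
                cooc[w][corpus[j]] += 1

    total = sum(sum(c.values()) for c in cooc.values())
    wcount = Counter(corpus)

    def ppmi_vector(word):
        # Positive pointwise mutual information: max(0, log p(w,c)/(p(w)p(c)))
        vec = {}
        for ctx, n in cooc[word].items():
            pmi = math.log((n / total) /
                           ((wcount[word] / len(corpus)) * (wcount[ctx] / len(corpus))))
            if pmi > 0:
                vec[ctx] = pmi
        return vec

    def cosine(u, v):
        dot = sum(u[k] * v.get(k, 0.0) for k in u)
        nu = math.sqrt(sum(x * x for x in u.values()))
        nv = math.sqrt(sum(x * x for x in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    # Similarity lies in [0, 1] for non-negative PPMI vectors.
    print(cosine(ppmi_vector("cat"), ppmi_vector("dog")))
    ```

    Stop-lists, stemming, and SVD, the factors the article investigates, would be applied to `cooc` before the PPMI step.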

  4. Statistical against dynamical PLF fission as seen by the IMF-IMF correlation functions and comparisons with CoMD model

    NASA Astrophysics Data System (ADS)

    Pagano, E. V.; Acosta, L.; Auditore, L.; Cap, T.; Cardella, G.; Colonna, M.; De Filippo, E.; Geraci, E.; Gnoffo, B.; Lanzalone, G.; Maiolino, C.; Martorana, N.; Pagano, A.; Papa, M.; Piasecki, E.; Pirrone, S.; Politi, G.; Porto, F.; Quattrocchi, L.; Rizzo, F.; Russotto, P.; Trifirò, A.; Trimarchi, M.; Siwek-Wilczynska, K.

    2018-05-01

    In nuclear reactions at Fermi energies, two- and multi-particle intensity interferometry correlation methods are powerful tools for pinning down the characteristic time scale of emission processes. In this paper we summarize an improved application of the fragment-fragment correlation function to the specific physics case of a heavy projectile-like fragment (PLF) undergoing binary massive splitting into two intermediate mass fragments (IMF). Results are shown for the reverse-kinematics reaction 124Sn + 64Ni at 35 AMeV, investigated using the forward part of the CHIMERA multi-detector. The analysis was performed as a function of the charge asymmetry of the observed IMF pairs and shows a coexistence of dynamical and statistical components as a function of that asymmetry. Transport CoMD simulations are compared with the data to constrain the timescale of fragment production and the relevant ingredients of the in-medium effective interaction used in the transport calculations.

  5. Translating statistical species-habitat models to interactive decision support tools.

    PubMed

    Wszola, Lyndsie S; Simonsen, Victoria L; Stuber, Erica F; Gillespie, Caitlyn R; Messinger, Lindsey N; Decker, Karie L; Lusk, Jeffrey J; Jorgensen, Christopher F; Bishop, Andrew A; Fontaine, Joseph J

    2017-01-01

    Understanding species-habitat relationships is vital to successful conservation, but the tools used to communicate species-habitat relationships are often poorly suited to the information needs of conservation practitioners. Here we present a novel method for translating a statistical species-habitat model, a regression analysis relating ring-necked pheasant abundance to landcover, into an interactive online tool. The Pheasant Habitat Simulator combines the analytical power of the R programming environment with the user-friendly Shiny web interface to create an online platform in which wildlife professionals can explore the effects of variation in local landcover on relative pheasant habitat suitability within spatial scales relevant to individual wildlife managers. Our tool allows users to virtually manipulate the landcover composition of a simulated space to explore how changes in landcover may affect pheasant relative habitat suitability, and guides users through the economic tradeoffs of landscape changes. We offer suggestions for development of similar interactive applications and demonstrate their potential as innovative science delivery tools for diverse professional and public audiences.

  6. Translating statistical species-habitat models to interactive decision support tools

    PubMed Central

    Simonsen, Victoria L.; Stuber, Erica F.; Gillespie, Caitlyn R.; Messinger, Lindsey N.; Decker, Karie L.; Lusk, Jeffrey J.; Jorgensen, Christopher F.; Bishop, Andrew A.; Fontaine, Joseph J.

    2017-01-01

    Understanding species-habitat relationships is vital to successful conservation, but the tools used to communicate species-habitat relationships are often poorly suited to the information needs of conservation practitioners. Here we present a novel method for translating a statistical species-habitat model, a regression analysis relating ring-necked pheasant abundance to landcover, into an interactive online tool. The Pheasant Habitat Simulator combines the analytical power of the R programming environment with the user-friendly Shiny web interface to create an online platform in which wildlife professionals can explore the effects of variation in local landcover on relative pheasant habitat suitability within spatial scales relevant to individual wildlife managers. Our tool allows users to virtually manipulate the landcover composition of a simulated space to explore how changes in landcover may affect pheasant relative habitat suitability, and guides users through the economic tradeoffs of landscape changes. We offer suggestions for development of similar interactive applications and demonstrate their potential as innovative science delivery tools for diverse professional and public audiences. PMID:29236707

  7. Initial Steps toward Validating and Measuring the Quality of Computerized Provider Documentation

    PubMed Central

    Hammond, Kenric W.; Efthimiadis, Efthimis N.; Weir, Charlene R.; Embi, Peter J.; Thielke, Stephen M.; Laundry, Ryan M.; Hedeen, Ashley

    2010-01-01

    Background: Concerns exist about the quality of electronic health care documentation. Prior studies have focused on physicians. This investigation studied document quality perceptions of practitioners (including physicians), nurses and administrative staff. Methods: An instrument developed from staff interviews and literature sources was administered to 110 practitioners, nurses and administrative staff. Short, long and original versions of records were rated. Results: Length transformation did not affect quality ratings. On several scales practitioners rated notes less favorably than administrators or nurses. The original source document was associated with the quality rating, as was tf·idf, a relevance statistic computed from document text. Tf·idf was strongly associated with practitioner quality ratings. Conclusion: Document quality estimates were not sensitive to modifying redundancy in documents. Some perceptions of quality differ by role. Intrinsic document properties are associated with staff judgments of document quality. For practitioners, the tf·idf statistic was strongly associated with the quality dimensions evaluated. PMID:21346983
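    The tf·idf statistic associated with the practitioners' quality ratings is standard information-retrieval machinery. A minimal sketch, with an invented toy collection of notes (not the study's documents):

    ```python
    import math
    from collections import Counter

    # Minimal tf.idf sketch over a toy collection of clinical notes.
    docs = [
        "patient stable vitals stable plan continue medication",
        "patient reports pain plan adjust medication",
        "routine follow up no acute issues",
    ]
    tokenized = [d.split() for d in docs]
    N = len(tokenized)

    def tfidf(term, doc_tokens):
        # term frequency within the document, weighted by inverse document
        # frequency across the collection: rare-but-present terms score high.
        tf = Counter(doc_tokens)[term] / len(doc_tokens)
        df = sum(1 for toks in tokenized if term in toks)
        idf = math.log(N / df) if df else 0.0
        return tf * idf

    # A crude per-document score: mean tf.idf over the document's vocabulary.
    for toks in tokenized:
        score = sum(tfidf(t, toks) for t in set(toks)) / len(set(toks))
        print(round(score, 4))
    ```

    Terms appearing in every document (here, none do in all three) get idf 0 and contribute nothing, which is why boilerplate-heavy notes tend to score low on such a statistic.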

  8. minet: A R/Bioconductor package for inferring large transcriptional networks using mutual information.

    PubMed

    Meyer, Patrick E; Lafitte, Frédéric; Bontempi, Gianluca

    2008-10-29

    This paper presents the R/Bioconductor package minet (version 1.1.6), which provides a set of functions to infer mutual information networks from a dataset. Once fed with a microarray dataset, the package returns a network where nodes denote genes, edges model statistical dependencies between genes, and the weight of an edge quantifies the statistical evidence of a specific (e.g. transcriptional) gene-to-gene interaction. Four different entropy estimators are made available in the package minet (empirical, Miller-Madow, Schurmann-Grassberger and shrink) as well as four different inference methods, namely relevance networks, ARACNE, CLR and MRNET. The package also integrates accuracy assessment tools, such as F-scores, PR curves and ROC curves, to compare the inferred network with a reference one. The package minet provides a series of tools for inferring transcriptional networks from microarray data. It is freely available from the Comprehensive R Archive Network (CRAN) as well as from the Bioconductor website.
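    The simplest of the four inference methods, the relevance network, connects gene pairs whose empirical mutual information exceeds a threshold. A language-agnostic sketch (in Python, not the minet package itself; data and threshold are invented):

    ```python
    import math
    from collections import Counter
    from itertools import combinations

    # Empirical mutual information (in nats) between two discrete vectors.
    def mutual_info(x, y):
        n = len(x)
        px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
        return sum((c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
                   for (a, b), c in pxy.items())

    # Toy "expression" profiles, already discretized into low/mid/high (0/1/2).
    genes = {
        "g1": [0, 0, 1, 1, 2, 2, 0, 1],
        "g2": [0, 0, 1, 1, 2, 2, 0, 1],   # tracks g1 exactly
        "g3": [2, 0, 1, 2, 0, 1, 2, 0],   # unrelated pattern
    }

    # Relevance-network step: keep edges whose MI exceeds a threshold.
    threshold = 0.5
    edges = [(a, b) for a, b in combinations(genes, 2)
             if mutual_info(genes[a], genes[b]) > threshold]
    print(edges)
    ```

    ARACNE, CLR and MRNET then refine this raw MI matrix, e.g. by pruning indirect edges (ARACNE's data-processing-inequality step) or normalizing each MI value against its row and column background (CLR).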

  9. Single-molecule photon emission statistics for systems with explicit time dependence: Generating function approach

    NASA Astrophysics Data System (ADS)

    Peng, Yonggang; Xie, Shijie; Zheng, Yujun; Brown, Frank L. H.

    2009-12-01

    Generating function calculations are extended to allow for laser pulse envelopes of arbitrary shape in numerical applications. We investigate photon emission statistics for two-level and V- and Λ-type three-level systems under time-dependent excitation. Applications relevant to electromagnetically induced transparency and photon emission from single quantum dots are presented.

  10. Making Online Instruction Count: Statistical Reporting of Web-Based Library Instruction Activities

    ERIC Educational Resources Information Center

    Bottorff, Tim; Todd, Andrew

    2012-01-01

    Statistical reporting of library instruction (LI) activities has historically focused on measures relevant to face-to-face (F2F) settings. However, newer forms of LI conducted in the online realm may be difficult to count in traditional ways, leading to inaccurate reporting to both internal and external stakeholders. A thorough literature review…

  11. Learning predictive statistics from temporal sequences: Dynamics and strategies

    PubMed Central

    Wang, Rui; Shen, Yuan; Tino, Peter; Welchman, Andrew E.; Kourtzi, Zoe

    2017-01-01

    Human behavior is guided by our expectations about the future. Often, we make predictions by monitoring how event sequences unfold, even though such sequences may appear incomprehensible. Event structures in the natural environment typically vary in complexity, from simple repetition to complex probabilistic combinations. How do we learn these structures? Here we investigate the dynamics of structure learning by tracking human responses to temporal sequences that change in structure unbeknownst to the participants. Participants were asked to predict the upcoming item following a probabilistic sequence of symbols. Using a Markov process, we created a family of sequences, from simple frequency statistics (e.g., some symbols are more probable than others) to context-based statistics (e.g., symbol probability is contingent on preceding symbols). We demonstrate the dynamics with which individuals adapt to changes in the environment's statistics—that is, they extract the behaviorally relevant structures to make predictions about upcoming events. Further, we show that this structure learning relates to individual decision strategy; faster learning of complex structures relates to selection of the most probable outcome in a given context (maximizing) rather than matching of the exact sequence statistics. Our findings provide evidence for alternate routes to learning of behaviorally relevant statistics that facilitate our ability to predict future events in variable environments. PMID:28973111
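    The contrast between zero-order frequency statistics and context-based statistics can be made concrete with a small first-order Markov simulation. This is an illustrative sketch, not the study's stimulus generator; the transition matrix is invented:

    ```python
    import random
    from collections import Counter, defaultdict

    random.seed(0)

    # First-order (context-based) structure over symbols A/B/C: the next
    # symbol's probability depends on the preceding one.
    T = {"A": [("A", 0.1), ("B", 0.8), ("C", 0.1)],
         "B": [("A", 0.1), ("B", 0.1), ("C", 0.8)],
         "C": [("A", 0.8), ("B", 0.1), ("C", 0.1)]}

    def step(sym):
        return random.choices([s for s, _ in T[sym]], [p for _, p in T[sym]])[0]

    seq = ["A"]
    for _ in range(5000):
        seq.append(step(seq[-1]))

    # Zero-order "frequency statistics": overall symbol proportions.
    freq = {s: c / len(seq) for s, c in Counter(seq).items()}

    # Context-based statistics: estimated P(next | previous).
    trans = defaultdict(Counter)
    for prev, nxt in zip(seq, seq[1:]):
        trans[prev][nxt] += 1
    p_B_given_A = trans["A"]["B"] / sum(trans["A"].values())

    print(freq)          # roughly uniform: frequencies alone look uninformative
    print(p_B_given_A)   # close to 0.8: structure is visible only in context
    ```

    A "maximizing" strategy in this setting always predicts the most probable continuation of the current context (B after A), whereas "matching" reproduces the conditional probabilities themselves.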

  12. Approaches for estimating minimal clinically important differences in systemic lupus erythematosus.

    PubMed

    Rai, Sharan K; Yazdany, Jinoos; Fortin, Paul R; Aviña-Zubieta, J Antonio

    2015-06-03

    A minimal clinically important difference (MCID) is an important concept used to determine whether a medical intervention improves perceived outcomes in patients. Prior to the introduction of the concept in 1989, studies focused primarily on statistical significance. As most recent clinical trials in systemic lupus erythematosus (SLE) have failed to show significant effects, determining a clinically relevant threshold for outcome scores (that is, the MCID) of existing instruments may be critical for conducting and interpreting meaningful clinical trials as well as for facilitating the establishment of treatment recommendations for patients. To that effect, methods to determine the MCID can be divided into two well-defined categories: distribution-based and anchor-based approaches. Distribution-based approaches are based on statistical characteristics of the obtained samples. There are various methods within the distribution-based approach, including the standard error of measurement, the standard deviation, the effect size, the minimal detectable change, the reliable change index, and the standardized response mean. Anchor-based approaches compare the change in a patient-reported outcome to a second, external measure of change (that is, one that is more clearly understood, such as a global assessment), which serves as the anchor. Finally, the Delphi technique can be applied as an adjunct to defining a clinically important difference. Despite an abundance of methods reported in the literature, little work in MCID estimation has been done in the context of SLE. As the MCID can help determine the effect of a given therapy on a patient and add meaning to statistical inferences made in clinical research, we believe there ought to be renewed focus on this area. Here, we provide an update on the use of MCIDs in clinical research, review some of the work done in this area in SLE, and propose an agenda for future research.
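    The distribution-based quantities listed above follow standard formulas. A minimal sketch with illustrative inputs (the numbers below are hypothetical, not from any SLE trial):

    ```python
    import math

    # Hypothetical 0-100 patient-reported outcome.
    baseline_sd = 12.0   # standard deviation of baseline scores
    reliability = 0.85   # test-retest reliability (e.g. an ICC)
    mean_change = 6.0    # observed mean change after treatment
    sd_change = 10.0     # standard deviation of the change scores

    # Standard error of measurement; 1 SEM is a common distribution-based anchor.
    sem = baseline_sd * math.sqrt(1 - reliability)

    # Minimal detectable change at 95% confidence: the smallest change
    # distinguishable from measurement noise.
    mdc95 = 1.96 * math.sqrt(2) * sem

    # Effect size (mean change over baseline SD) and standardized response
    # mean (mean change over SD of change).
    effect_size = mean_change / baseline_sd
    srm = mean_change / sd_change

    print(f"SEM={sem:.2f}  MDC95={mdc95:.2f}  ES={effect_size:.2f}  SRM={srm:.2f}")
    ```

    Anchor-based approaches, by contrast, calibrate these numbers against an external judgement (e.g. a patient global rating of "somewhat better") rather than against the sample's dispersion.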

  13. Developing and validating a nutrition knowledge questionnaire: key methods and considerations.

    PubMed

    Trakman, Gina Louise; Forsyth, Adrienne; Hoye, Russell; Belski, Regina

    2017-10-01

    To outline key statistical considerations and detailed methodologies for the development and evaluation of a valid and reliable nutrition knowledge questionnaire. Literature on questionnaire development in a range of fields was reviewed and a set of evidence-based guidelines specific to the creation of a nutrition knowledge questionnaire have been developed. The recommendations describe key qualitative methods and statistical considerations, and include relevant examples from previous papers and existing nutrition knowledge questionnaires. Where details have been omitted for the sake of brevity, the reader has been directed to suitable references. We recommend an eight-step methodology for nutrition knowledge questionnaire development as follows: (i) definition of the construct and development of a test plan; (ii) generation of the item pool; (iii) choice of the scoring system and response format; (iv) assessment of content validity; (v) assessment of face validity; (vi) purification of the scale using item analysis, including item characteristics, difficulty and discrimination; (vii) evaluation of the scale including its factor structure and internal reliability, or Rasch analysis, including assessment of dimensionality and internal reliability; and (viii) gathering of data to re-examine the questionnaire's properties, assess temporal stability and confirm construct validity. Several of these methods have previously been overlooked. The measurement of nutrition knowledge is an important consideration for individuals working in the nutrition field. Improved methods in the development of nutrition knowledge questionnaires, such as the use of factor analysis or Rasch analysis, will enable more confidence in reported measures of nutrition knowledge.

  14. Identification and characterization of earthquake clusters: a comparative analysis for selected sequences in Italy

    NASA Astrophysics Data System (ADS)

    Peresan, Antonella; Gentili, Stefania

    2017-04-01

    Identification and statistical characterization of seismic clusters may provide useful insights about the features of seismic energy release and their relation to physical properties of the crust within a given region. Moreover, a number of studies based on spatio-temporal analysis of main-shocks occurrence require preliminary declustering of the earthquake catalogs. Since various methods, relying on different physical/statistical assumptions, may lead to diverse classifications of earthquakes into main events and related events, we aim to investigate the classification differences among different declustering techniques. Accordingly, a formal selection and comparative analysis of earthquake clusters is carried out for the most relevant earthquakes in North-Eastern Italy, as reported in the local OGS-CRS bulletins, compiled at the National Institute of Oceanography and Experimental Geophysics since 1977. The comparison is then extended to selected earthquake sequences associated with a different seismotectonic setting, namely to events that occurred in the region struck by the recent Central Italy destructive earthquakes, making use of INGV data. Various techniques, ranging from classical space-time windows methods to ad hoc manual identification of aftershocks, are applied for detection of earthquake clusters. In particular, a statistical method based on nearest-neighbor distances of events in space-time-energy domain, is considered. Results from clusters identification by the nearest-neighbor method turn out quite robust with respect to the time span of the input catalogue, as well as to minimum magnitude cutoff. The identified clusters for the largest events reported in North-Eastern Italy since 1977 are well consistent with those reported in earlier studies, which were aimed at detailed manual aftershocks identification. 
The study shows that the data-driven approach, based on the nearest-neighbor distances, can be satisfactorily applied to decompose the seismic catalog into background seismicity and individual sequences of earthquake clusters, also in areas characterized by moderate seismic activity, where the standard declustering techniques may turn out rather gross approximations. With these results acquired, the main statistical features of seismic clusters are explored, including complex interdependence of related events, with the aim to characterize the space-time patterns of earthquakes occurrence in North-Eastern Italy and capture their basic differences with Central Italy sequences.
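    The nearest-neighbor space-time-energy distance underlying this declustering is, in the formulation usually attributed to Zaliapin and Ben-Zion, η(i,j) = t·r^df·10^(−b·m), from each event to its closest earlier event. A sketch with an invented three-event catalog and conventional parameter values (b ≈ 1, df ≈ 1.6), purely for illustration:

    ```python
    import math

    # Nearest-neighbor event distance: eta = t * r**df * 10**(-b * m_parent),
    # with t the interevent time, r the epicentral distance, m the parent
    # magnitude. Parameters and catalog below are illustrative assumptions.
    b, df = 1.0, 1.6

    # (time in years, x in km, y in km, magnitude)
    catalog = [
        (0.00,  0.0,  0.0, 5.5),  # main shock
        (0.01,  2.0,  1.0, 3.2),  # close in space-time: expected to cluster
        (0.90, 80.0, 60.0, 3.1),  # far away, much later: expected background
    ]

    def eta(parent, child):
        t = child[0] - parent[0]
        r = math.hypot(child[1] - parent[1], child[2] - parent[2])
        return t * (r ** df) * 10 ** (-b * parent[3])

    # Nearest-neighbor distance of each later event over all earlier events;
    # small eta marks a clustered (related) event, large eta background.
    for j in range(1, len(catalog)):
        d = min(eta(catalog[i], catalog[j]) for i in range(j))
        print(f"event {j}: log10(eta) = {math.log10(d):.2f}")
    ```

    In practice the bimodal distribution of log η over the whole catalog supplies the threshold separating clustered events from background, rather than a hand-picked cutoff.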

  15. Visual analytics in cheminformatics: user-supervised descriptor selection for QSAR methods.

    PubMed

    Martínez, María Jimena; Ponzoni, Ignacio; Díaz, Mónica F; Vazquez, Gustavo E; Soto, Axel J

    2015-01-01

    The design of QSAR/QSPR models is a challenging problem, where the selection of the most relevant descriptors constitutes a key step of the process. Several feature selection methods that address this step concentrate on statistical associations among descriptors and target properties, while chemical knowledge is left out of the analysis. For this reason, the interpretability and generality of the QSAR/QSPR models obtained by these feature selection methods are drastically affected. Therefore, an approach for integrating domain experts' knowledge into the selection process is needed to increase confidence in the final set of descriptors. In this paper we propose a software tool, Visual and Interactive DEscriptor ANalysis (VIDEAN), that combines statistical methods with interactive visualizations for choosing a set of descriptors to predict a target property. Domain expertise can be added to the feature selection process by means of an interactive visual exploration of data, aided by statistical tools and metrics based on information theory. Coordinated visual representations are presented for capturing different relationships and interactions among descriptors, target properties and candidate subsets of descriptors. The capabilities of the proposed software were assessed through different scenarios. These scenarios reveal how an expert can use this tool to choose one subset of descriptors from a group of candidate subsets, or to modify existing descriptor subsets and even incorporate new descriptors according to his or her own knowledge of the target property. The reported experiences showed the suitability of our software for selecting sets of descriptors with low cardinality, high interpretability, low redundancy and high statistical performance in a visual, exploratory way.
    It is therefore possible to conclude that the resulting tool allows the integration of a chemist's expertise into the descriptor selection process with low cognitive effort, in contrast with the alternative of an ad hoc manual analysis of the selected descriptors. Graphical abstract: VIDEAN allows the visual analysis of candidate subsets of descriptors for QSAR/QSPR. In the two panels on the top, users can interactively explore numerical correlations as well as co-occurrences in the candidate subsets through two interactive graphs.

  16. Ascertainment-adjusted parameter estimation approach to improve robustness against misspecification of health monitoring methods

    NASA Astrophysics Data System (ADS)

    Juesas, P.; Ramasso, E.

    2016-12-01

    Condition monitoring aims at ensuring system safety, a fundamental requirement for industrial applications that has become an inescapable social demand. This objective is attained by instrumenting the system and developing data analytics methods, such as statistical models, able to turn data into relevant knowledge. One difficulty is correctly estimating the parameters of those methods from time-series data. This paper suggests the use of the Weighted Distribution Theory together with the Expectation-Maximization algorithm to improve parameter estimation in statistical models with latent variables, with an application to health monitoring under uncertainty. The improvement of estimates is made possible by incorporating uncertain and possibly noisy prior knowledge on latent variables in a sound manner. The latent variables are exploited to build a degradation model of a dynamical system represented as a sequence of discrete states. Examples on Gaussian Mixture Models and Hidden Markov Models (HMM) with discrete and continuous outputs are presented on both simulated data and benchmarks using the turbofan engine datasets. A focus on the application of a discrete HMM to health monitoring under uncertainty emphasizes the interest of the proposed approach in the presence of different operating conditions and fault modes. It is shown that the proposed model is highly robust in the presence of noisy and uncertain priors.

  17. How large must a treatment effect be before it matters to practitioners? An estimation method and demonstration.

    PubMed

    Miller, William R; Manuel, Jennifer Knapp

    2008-09-01

    Treatment research is sometimes criticised as lacking in clinical relevance, and one potential source of this friction is a disconnection between statistical significance and what clinicians regard to be a meaningful difference in outcomes. This report demonstrates a novel methodology for estimating what substance abuse practitioners regard to be clinically important differences. To illustrate the estimation method, we surveyed 50 substance abuse treatment providers participating in the National Institute on Drug Abuse (NIDA) Clinical Trials Network. Practitioners identified thresholds for clinically meaningful differences on nine common outcome variables, indicated the size of effect that would justify their learning a new treatment method and estimated current outcomes from their services. Clinicians judged a difference between two treatments to be meaningful if outcomes were improved by about 10 - 12 points on the percentage of patients totally abstaining, arrested for driving while intoxicated, employed or having abnormal liver enzymes. A 5 percentage-point reduction in patient mortality was regarded as clinically significant. On continuous outcome measures (such as percentage of days abstinent or drinks per drinking day), practitioners judged an outcome to be significant when it doubled or halved the base rate. When a new treatment meets such criteria, practitioners were interested in learning it. Effects that are statistically significant in clinical trials may be unimpressive to practitioners. Clinicians' judgements of meaningful differences can inform the powering of clinical trials.

  18. Entropy of Ultrasound-Contrast-Agent Velocity Fields for Angiogenesis Imaging in Prostate Cancer.

    PubMed

    van Sloun, Ruud J G; Demi, Libertario; Postema, Arnoud W; Jmch De La Rosette, Jean; Wijkstra, Hessel; Mischi, Massimo

    2017-03-01

    Prostate cancer care can benefit from accurate and cost-efficient imaging modalities that are able to reveal prognostic indicators for cancer. Angiogenesis is known to play a central role in the growth of tumors towards a metastatic or a lethal phenotype. With the aim of localizing angiogenic activity in a non-invasive manner, Dynamic Contrast Enhanced Ultrasound (DCE-US) has been widely used. Usually, the passage of ultrasound contrast agents through the organ of interest is analyzed for the assessment of tissue perfusion. However, the heterogeneous nature of blood flow in angiogenic vasculature hampers the diagnostic effectiveness of perfusion parameters. In this regard, quantification of the heterogeneity of flow may provide a relevant additional feature for localizing angiogenesis. Statistics based on flow magnitude as well as its orientation can be exploited for this purpose. In this paper, we estimate the microbubble velocity fields from a standard bolus injection and provide a first statistical characterization by performing a spatial entropy analysis. By testing the method on 24 patients with biopsy-proven prostate cancer, we show that the proposed method can be applied effectively to clinically acquired DCE-US data. The method permits estimation of the in-plane flow vector fields and their local intricacy, and yields promising results (receiver-operating-characteristic curve area of 0.85) for the detection of prostate cancer.
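    The entropy-of-orientation idea can be sketched simply: organized flow concentrates vector angles in few histogram bins (low Shannon entropy), whereas the tortuous flow of angiogenic vasculature spreads them out (high entropy). This is an illustrative reconstruction, not the authors' pipeline, and the angle data are invented:

    ```python
    import math

    # Shannon entropy (nats) of flow-vector orientations binned on the circle.
    def orientation_entropy(angles, bins=8):
        counts = [0] * bins
        for a in angles:
            counts[int((a % (2 * math.pi)) / (2 * math.pi) * bins) % bins] += 1
        n = len(angles)
        return -sum((c / n) * math.log(c / n) for c in counts if c)

    organized = [0.10, 0.12, 0.09, 0.11, 0.10, 0.08]  # nearly parallel vectors
    tortuous = [0.1, 1.3, 2.8, 4.0, 5.2, 6.0]         # spread over directions

    print(orientation_entropy(organized))  # 0.0: all vectors in one bin
    print(orientation_entropy(tortuous))   # high: many occupied bins
    ```

    In a full implementation this statistic would be computed over local spatial neighborhoods of the estimated velocity field, yielding an entropy map rather than a single number.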

  19. When the Ostrich-Algorithm Fails: Blanking Method Affects Spike Train Statistics.

    PubMed

    Joseph, Kevin; Mottaghi, Soheil; Christ, Olaf; Feuerstein, Thomas J; Hofmann, Ulrich G

    2018-01-01

    Modern electroceuticals are bound to employ electrical high frequency (130-180 Hz) stimulation carried out under closed loop control, most prominently in the case of movement disorders. However, particular challenges are faced when electrical recordings of neuronal tissue are carried out during high frequency electrical stimulation, both in-vivo and in-vitro. This stimulation produces undesired artifacts and can render the recorded signal only partially useful. The extent of these artifacts is often reduced by temporarily grounding the recording input during stimulation pulses. In the following study, we quantify the effects of this method, "blanking," on the spike count and spike train statistics. Starting from a theoretical standpoint, we calculate a loss in the absolute number of action potentials, depending on: width of the blanking window, frequency of stimulation, and intrinsic neuronal activity. These calculations were then corroborated by actual high signal-to-noise ratio (SNR) single cell recordings. We state that, for clinically relevant frequencies of 130 Hz (used for movement disorders) and realistic blanking windows of 2 ms, up to 27% of actually occurring spikes are lost. We strongly advise cautious use of the blanking method when spike rate quantification is attempted. Blanking (artifact removal by temporarily grounding input), depending on recording parameters, can lead to significant spike loss. Very careful use of blanking circuits is advised.
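    The headline figure follows from a simple duty-cycle argument: if spike times are independent of the stimulation phase, the expected fraction of spikes falling inside blanked periods is the fraction of time blanked. A back-of-the-envelope sketch (the function name is ours, not the paper's):

    ```python
    # Expected fraction of spikes lost to blanking, assuming spike times are
    # independent of stimulation phase: each pulse hides a window of
    # t_blank seconds, f_stim times per second.
    def blanking_loss(f_stim_hz, t_blank_s):
        return min(1.0, f_stim_hz * t_blank_s)

    # The clinically relevant case from the abstract: 130 Hz stimulation
    # with a 2 ms blanking window.
    loss = blanking_loss(130, 0.002)
    print(f"{loss:.0%} of spikes fall inside blanked periods")  # 26%
    ```

    This 26% duty-cycle estimate is consistent with the "up to 27%" the authors report; refractory effects and any phase-locking of spiking to the stimulus would move the true loss away from this independence baseline.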

  20. When the Ostrich-Algorithm Fails: Blanking Method Affects Spike Train Statistics

    PubMed Central

    Joseph, Kevin; Mottaghi, Soheil; Christ, Olaf; Feuerstein, Thomas J.; Hofmann, Ulrich G.

    2018-01-01

    Modern electroceuticals are bound to employ electrical high frequency (130–180 Hz) stimulation carried out under closed loop control, most prominently in the case of movement disorders. However, particular challenges are faced when electrical recordings of neuronal tissue are carried out during high frequency electrical stimulation, both in-vivo and in-vitro. This stimulation produces undesired artifacts and can render the recorded signal only partially useful. The extent of these artifacts is often reduced by temporarily grounding the recording input during stimulation pulses. In the following study, we quantify the effects of this method, “blanking,” on the spike count and spike train statistics. Starting from a theoretical standpoint, we calculate a loss in the absolute number of action potentials, depending on: width of the blanking window, frequency of stimulation, and intrinsic neuronal activity. These calculations were then corroborated by actual high signal-to-noise ratio (SNR) single cell recordings. We state that, for clinically relevant frequencies of 130 Hz (used for movement disorders) and realistic blanking windows of 2 ms, up to 27% of actually occurring spikes are lost. We strongly advise cautious use of the blanking method when spike rate quantification is attempted. Impact statement: Blanking (artifact removal by temporarily grounding input), depending on recording parameters, can lead to significant spike loss. Very careful use of blanking circuits is advised. PMID:29780301

  1. Invited Commentary: The Need for Cognitive Science in Methodology.

    PubMed

    Greenland, Sander

    2017-09-15

    There is no complete solution for the problem of abuse of statistics, but methodological training needs to cover cognitive biases and other psychosocial factors affecting inferences. The present paper discusses 3 common cognitive distortions: 1) dichotomania, the compulsion to perceive quantities as dichotomous even when dichotomization is unnecessary and misleading, as in inferences based on whether a P value is "statistically significant"; 2) nullism, the tendency to privilege the hypothesis of no difference or no effect when there is no scientific basis for doing so, as when testing only the null hypothesis; and 3) statistical reification, treating hypothetical data distributions and statistical models as if they reflect known physical laws rather than speculative assumptions for thought experiments. As commonly misused, null-hypothesis significance testing combines these cognitive problems to produce highly distorted interpretation and reporting of study results. Interval estimation has so far proven to be an inadequate solution because it involves dichotomization, an avenue for nullism. Sensitivity and bias analyses have been proposed to address reproducibility problems (Am J Epidemiol. 2017;186(6):646-647); these methods can indeed address reification, but they can also introduce new distortions via misleading specifications for bias parameters. P values can be reframed to lessen distortions by presenting them without reference to a cutoff, providing them for relevant alternatives to the null, and recognizing their dependence on all assumptions used in their computation; they nonetheless require rescaling for measuring evidence. I conclude that methodological development and training should go beyond coverage of mechanistic biases (e.g., confounding, selection bias, measurement error) to cover distortions of conclusions produced by statistical methods and psychosocial forces. © The Author(s) 2017. 
Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
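The rescaling of P values mentioned in the closing remarks can be made concrete with the Shannon-information (S-value) transform that Greenland has advocated elsewhere; a minimal Python sketch (the interpretation comment is illustrative, not from the article):

```python
import math

def s_value(p):
    """Shannon surprisal of a P value: s = -log2(p), i.e. the bits of
    information against the test hypothesis (one proposed rescaling)."""
    return -math.log2(p)

# p = 0.05 carries only ~4.3 bits: about as surprising as 4 heads in a row.
for p in (0.05, 0.01, 0.005):
    print(f"p = {p}  ->  s = {s_value(p):.2f} bits")
```

Presenting S-values alongside P values avoids the dichotomous "significant / not significant" reading the commentary criticizes.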

  2. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gecow, Andrzej

On the way to simulating adaptive evolution of a complex system describing a living object or a human-developed project, a fitness should be defined on node states or network external outputs. Feedbacks lead to circular attractors of these states or outputs, which makes it difficult to define a fitness. The main statistical effects of the adaptive condition result from the small-change tendency; to appear, they only need a statistically correct size of damage initiated by an evolutionary change of the system. This observation allows us to cut the feedback loops and, in effect, to obtain a particular statistically correct state instead of a long circular attractor, which in the quenched model is expected for a chaotic network with feedback. Defining fitness on such states is simple. We calculate only damaged nodes, and only once. Such an algorithm is optimal for investigating damage spreading, i.e., the statistical connections of the structural parameters of an initial change with the size of the resulting damage. It is a reversed-annealed method: functions and states (signals) may be randomly substituted, but connections are important and are preserved. The small damages important for adaptive evolution are correctly depicted, in comparison to the Derrida annealed approximation, which expects equilibrium levels for large networks. The algorithm indicates these levels correctly. The relevant program in Pascal, which executes the algorithm for a wide range of parameters, can be obtained from the author.

  3. Assessing Attitudes towards Statistics among Medical Students: Psychometric Properties of the Serbian Version of the Survey of Attitudes Towards Statistics (SATS)

    PubMed Central

    Stanisavljevic, Dejana; Trajkovic, Goran; Marinkovic, Jelena; Bukumiric, Zoran; Cirkovic, Andja; Milic, Natasa

    2014-01-01

Background Medical statistics has become important and relevant for future doctors, enabling them to practice evidence-based medicine. Recent studies report that students’ attitudes towards statistics play an important role in their statistics achievement. The aim of the study was to test the psychometric properties of the Serbian version of the Survey of Attitudes Towards Statistics (SATS) in order to acquire a valid instrument to measure attitudes within the Serbian educational context. Methods The validation study was performed on a cohort of 417 medical students who were enrolled in an obligatory introductory statistics course. The SATS adaptation was based on an internationally accepted methodology for translation and cultural adaptation. Psychometric properties of the Serbian version of the SATS were analyzed through the examination of factorial structure and internal consistency. Results Most medical students held positive attitudes towards statistics. The average total SATS score was above neutral (4.3±0.8), and varied from 1.9 to 6.2. Confirmatory factor analysis validated the six-factor structure of the questionnaire (Affect, Cognitive Competence, Value, Difficulty, Interest and Effort). Values for the fit indices TLI (0.940) and CFI (0.961) were above the cut-off of ≥0.90. The RMSEA value of 0.064 (0.051–0.078) was below the suggested value of ≤0.08. Cronbach’s alpha for the entire scale was 0.90, indicating scale reliability. In a multivariate regression model, self-rating of ability in mathematics and current grade point average were significantly associated with the total SATS score after adjusting for age and gender. Conclusion The present study provided evidence for the appropriate metric properties of the Serbian version of the SATS. Confirmatory factor analysis validated the six-factor structure of the scale.
The SATS might be a reliable and valid instrument for identifying medical students’ attitudes towards statistics in the Serbian educational context. PMID:25405489
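The internal-consistency figure reported above (Cronbach’s alpha of 0.90) can be computed directly from an item-score matrix; a minimal sketch with hypothetical toy scores, not the study’s data:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents x k_items) score matrix:
    k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of respondents' totals
    return k / (k - 1) * (1 - item_vars / total_var)

# Hypothetical 5-respondent x 4-item Likert scores (illustrative only):
scores = np.array([
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 5],
])
print(round(cronbach_alpha(scores), 2))  # → 0.91
```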

  4. A Double-Blind Placebo-Controlled Randomized Clinical Trial With Magnesium Oxide to Reduce Intrafraction Prostate Motion for Prostate Cancer Radiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lips, Irene M., E-mail: i.m.lips@umcutrecht.nl; Gils, Carla H. van; Kotte, Alexis N.T.J.

    2012-06-01

Purpose: To investigate whether magnesium oxide during external-beam radiotherapy for prostate cancer reduces intrafraction prostate motion in a double-blind, placebo-controlled randomized trial. Methods and Materials: At the Department of Radiotherapy, prostate cancer patients scheduled for intensity-modulated radiotherapy (77 Gy in 35 fractions) using fiducial marker-based position verification were randomly assigned to receive magnesium oxide (500 mg twice a day) or placebo during radiotherapy. The primary outcome was the proportion of patients with clinically relevant intrafraction prostate motion, defined as the proportion of patients who demonstrated in ≥50% of the fractions an intrafraction motion outside a range of 2 mm. Secondary outcome measures included quality of life and acute toxicity. Results: In total, 46 patients per treatment arm were enrolled. The primary endpoint did not show a statistically significant difference between the treatment arms with a percentage of patients with clinically relevant intrafraction motion of 83% in the magnesium oxide arm as compared with 80% in the placebo arm (p = 1.00). Concerning the secondary endpoints, exploratory analyses demonstrated a trend towards worsened quality of life and slightly more toxicity in the magnesium oxide arm than in the placebo arm; however, these differences were not statistically significant. Conclusions: Magnesium oxide is not effective in reducing the intrafraction prostate motion during external-beam radiotherapy, and therefore there is no indication to use it in clinical practice for this purpose.
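The reported null result can be approximately reproduced from the published figures. Assuming counts of 38/46 and 37/46 (reconstructed from the 83% and 80% percentages, an assumption) and a two-sided Fisher exact test (the abstract does not name the test used), a stdlib-only sketch:

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]:
    sum the probabilities of all tables (same margins) that are no more
    likely than the observed one."""
    row1, col1, n = a + b, a + c, a + b + c + d
    def prob(x):  # hypergeometric probability that cell (1,1) equals x
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)
    p_obs = prob(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Counts reconstructed from the reported percentages (an assumption):
# 38/46 (~83%) with relevant motion on magnesium oxide, 37/46 (~80%) on placebo.
p = fisher_exact_two_sided(38, 8, 37, 9)
print(round(p, 2))  # → 1.0, matching the reported p = 1.00
```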

  5. geneCommittee: a web-based tool for extensively testing the discriminatory power of biologically relevant gene sets in microarray data classification.

    PubMed

    Reboiro-Jato, Miguel; Arrais, Joel P; Oliveira, José Luis; Fdez-Riverola, Florentino

    2014-01-30

The diagnosis and prognosis of several diseases can be accelerated through the use of different large-scale genome experiments. In this context, microarrays can generate expression data for a huge set of genes. However, to obtain solid statistical evidence from the resulting data, it is necessary to train and validate many classification techniques in order to find the best discriminative method. This is a time-consuming process that normally depends on intricate statistical tools. geneCommittee is a web-based interactive tool for routinely evaluating the discriminative classification power of custom hypotheses in the form of biologically relevant gene sets. While the user can work with different gene set collections and several microarray data files to configure specific classification experiments, the tool is able to run several tests in parallel. Provided with a straightforward and intuitive interface, geneCommittee is able to render valuable information for diagnostic analyses and clinical management decisions by systematically evaluating custom hypotheses over different data sets using complementary classifiers, a key aspect in clinical research. geneCommittee allows the enrichment of raw microarray data with gene functional annotations, producing integrated datasets that simplify the construction of better discriminative hypotheses, and allows the creation of a set of complementary classifiers. The trained committees can then be used for clinical research and diagnosis. Full documentation, including common use cases and guided analysis workflows, is freely available at http://sing.ei.uvigo.es/GC/.

  6. Trends in statistical methods in articles published in Archives of Plastic Surgery between 2012 and 2017.

    PubMed

    Han, Kyunghwa; Jung, Inkyung

    2018-05-01

This review article presents an assessment of trends in statistical methods and an evaluation of their appropriateness in articles published in the Archives of Plastic Surgery (APS) from 2012 to 2017. We reviewed 388 original articles published in APS between 2012 and 2017. We categorized the articles that used statistical methods according to the type of statistical method, the number of statistical methods, and the type of statistical software used. We checked whether there were errors in the description of statistical methods and results. A total of 230 articles (59.3%) published in APS between 2012 and 2017 used one or more statistical methods. Within these articles, there were 261 applications of statistical methods with continuous or ordinal outcomes, and 139 applications of statistical methods with categorical outcomes. The Pearson chi-square test (17.4%) and the Mann-Whitney U test (14.4%) were the most frequently used methods. Errors in describing statistical methods and results were found in 133 of the 230 articles (57.8%). Inadequate description of P-values was the most common error (39.1%). Among the 230 articles that used statistical methods, 71.7% provided details about the statistical software programs used for the analyses. SPSS was predominantly used in the articles that presented statistical analyses. We found that the use of statistical methods in APS has increased over the last 6 years. It seems that researchers have been paying more attention to the proper use of statistics in recent years. It is expected that these positive trends will continue in APS.

  7. An enhancement of ROC curves made them clinically relevant for diagnostic-test comparison and optimal-threshold determination.

    PubMed

    Subtil, Fabien; Rabilloud, Muriel

    2015-07-01

Receiver operating characteristic (ROC) curves are often used to compare continuous diagnostic tests or to determine the optimal threshold of a test; however, they do not consider the costs of misclassification or the disease prevalence. The ROC graph was extended to allow for these aspects. Two new lines are added to the ROC graph: a sensitivity line and a specificity line. Their slopes depend on the disease prevalence and on the ratio of the net benefit of treating a diseased subject to the net cost of treating a nondiseased one. First, these lines help researchers determine the range of specificities within which comparison of the partial areas under the curves is clinically relevant. Second, the point of the ROC curve farthest from the specificity line is shown to be the optimal threshold in terms of expected utility. This method was applied: (1) to determine the optimal threshold of the ratio of specific immunoglobulin G (IgG) to total IgG for the diagnosis of congenital toxoplasmosis and (2) to select, between two markers, the more accurate one for the diagnosis of left ventricular hypertrophy in hypertensive subjects. The two additional lines transform the statistically valid ROC graph into a clinically relevant tool for test selection and threshold determination. Copyright © 2015 Elsevier Inc. All rights reserved.
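The optimal-threshold criterion described above is equivalent to maximizing expected utility along the ROC curve, where the iso-utility slope combines prevalence and the benefit/cost ratio. A sketch of that expected-utility selection (not the authors' graphical construction with the added lines; the operating points are hypothetical):

```python
def optimal_threshold(points, prevalence, benefit_cost_ratio):
    """Expected-utility-optimal operating point on an empirical ROC curve.

    points: (threshold, sensitivity, specificity) triples.
    benefit_cost_ratio: net benefit of treating a diseased subject divided
    by the net cost of treating a nondiseased one.
    The iso-utility slope is m = ((1 - prevalence) / prevalence) / ratio;
    the optimal point maximizes sensitivity - m * (1 - specificity).
    """
    m = (1 - prevalence) / prevalence / benefit_cost_ratio
    return max(points, key=lambda t: t[1] - m * (1 - t[2]))

# Hypothetical operating points for a continuous marker:
roc = [(0.2, 0.90, 0.60), (0.5, 0.80, 0.80), (0.8, 0.60, 0.95)]
best = optimal_threshold(roc, prevalence=0.5, benefit_cost_ratio=1.0)
print(best[0])  # → 0.5
```

With equal benefit and cost and 50% prevalence the slope is 1, so the chosen point is the one maximizing Youden-style net sensitivity.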

  8. Process defects and in situ monitoring methods in metal powder bed fusion: a review

    NASA Astrophysics Data System (ADS)

    Grasso, Marco; Colosimo, Bianca Maria

    2017-04-01

Despite continuous technological enhancements of metal Additive Manufacturing (AM) systems, the lack of process repeatability and stability still represents a barrier to industrial breakthrough. The most relevant metal AM applications currently involve industrial sectors (e.g., aerospace and bio-medical) where defect avoidance is fundamental. Because of this, there is a need to develop novel in situ monitoring tools able to keep the stability of the process under control on a layer-by-layer basis, and to detect the onset of defects as soon as possible. On the one hand, AM systems must be equipped with in situ sensing devices able to measure relevant quantities during the process, a.k.a. process signatures. On the other hand, in-process data analytics and statistical monitoring techniques are required to detect and localize defects in an automated way. This paper reviews the literature and the commercial tools for in situ monitoring of powder bed fusion (PBF) processes. It explores the different categories of defects and their main causes, the most relevant process signatures, and the in situ sensing approaches proposed so far. Particular attention is devoted to the development of automated defect detection rules and the study of process control strategies, which represent two critical fields for the development of future smart PBF systems.
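The layer-by-layer statistical monitoring idea can be illustrated with the simplest possible scheme, a Shewhart-style control chart on a scalar process signature; this is a generic sketch with hypothetical data, not a method from the review:

```python
def control_limits(baseline, k=3.0):
    """Shewhart-style control limits (mean ± k·std) estimated from
    in-control baseline layers of a layerwise process signature."""
    n = len(baseline)
    mean = sum(baseline) / n
    std = (sum((x - mean) ** 2 for x in baseline) / (n - 1)) ** 0.5
    return mean - k * std, mean + k * std

def flag_layers(signature, limits):
    """Indices of layers whose signature value falls outside the limits."""
    lo, hi = limits
    return [i for i, x in enumerate(signature) if not lo <= x <= hi]

# Hypothetical melt-pool intensity signature, one value per layer:
baseline = [10.0, 10.2, 9.8, 10.1, 9.9]
limits = control_limits(baseline)
print(flag_layers([10.0, 10.1, 12.0, 9.9], limits))  # → [2]
```

Real PBF monitoring uses far richer signatures (images, spectra) and multivariate charts, but the detect-and-localize logic is the same.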

  9. Health care financing in iran; is privatization a good solution?

    PubMed

    Davari, M; Haycox, A; Walley, T

    2012-01-01

This paper considers a range of issues related to the financing of the health care system and relevant government policies in Iran. This study used mixed methods. A systematic literature review was undertaken to identify relevant publications. This was supplemented by hand searching of books and journals, including government publications. The issues and uncertainties identified in the literature were explored in detail through semi-structured interviews with key informants. These were triangulated with empirical evidence in the form of the literature, government statistics, and independent expert opinions to validate the views expressed in the interviews. The systematic review of published literature showed that no previous publication had addressed issues relating to the financing of healthcare services in Iran. However, a range of opinion pieces outlined issues to be explored further in the interviews. These issues were summarised into four main categories. The health care market in Iran has faced a period in which financial issues have increased managerial complexity. Privatization of health care services would appear to be a step too far in assisting the system to confront its challenges at the current time. The most important step toward solving such challenges is to focus on a feasible, relevant, and comprehensive policy that optimises the use of health care resources in Iran.

  10. Enrichment methods provide a feasible approach to comprehensive and adequately powered investigations of the brain methylome

    PubMed Central

    Chan, Robin F.; Shabalin, Andrey A.; Xie, Lin Y.; Adkins, Daniel E.; Zhao, Min; Turecki, Gustavo; Clark, Shaunna L.; Aberg, Karolina A.

    2017-01-01

    Abstract Methylome-wide association studies are typically performed using microarray technologies that only assay a very small fraction of the CG methylome and entirely miss two forms of methylation that are common in brain and likely of particular relevance for neuroscience and psychiatric disorders. The alternative is to use whole genome bisulfite (WGB) sequencing but this approach is not yet practically feasible with sample sizes required for adequate statistical power. We argue for revisiting methylation enrichment methods that, provided optimal protocols are used, enable comprehensive, adequately powered and cost-effective genome-wide investigations of the brain methylome. To support our claim we use data showing that enrichment methods approximate the sensitivity obtained with WGB methods and with slightly better specificity. However, this performance is achieved at <5% of the reagent costs. Furthermore, because many more samples can be sequenced simultaneously, projects can be completed about 15 times faster. Currently the only viable option available for comprehensive brain methylome studies, enrichment methods may be critical for moving the field forward. PMID:28334972

  11. The Problem Solving Method in Teaching Physics in Elementary School

    NASA Astrophysics Data System (ADS)

    Jandrić, Gordana Hajduković; Obadović, Dušanka Ž.; Stojanović, Maja

    2010-01-01

Most teachers ask whether there is a "best" way to teach. The most effective teaching method depends on the specific goals of the course and the needs of the students. An investigation was carried out to compare the effect on overall achievement of teaching selected physics topics using the problem-solving method with teaching the same material by the traditional method. The investigation was performed as a pedagogical experiment of the parallel-groups type with a randomly chosen sample of eighth-grade students. The control and experimental groups were equalized in the relevant pedagogical parameters. The obtained results were treated statistically. The comparison showed a significant difference in the speed of acquiring knowledge, the problem-solving method being advantageous over the traditional method.

  12. An Update on Statistical Boosting in Biomedicine.

    PubMed

    Mayr, Andreas; Hofner, Benjamin; Waldmann, Elisabeth; Hepp, Tobias; Meyer, Sebastian; Gefeller, Olaf

    2017-01-01

    Statistical boosting algorithms have triggered a lot of research during the last decade. They combine a powerful machine learning approach with classical statistical modelling, offering various practical advantages like automated variable selection and implicit regularization of effect estimates. They are extremely flexible, as the underlying base-learners (regression functions defining the type of effect for the explanatory variables) can be combined with any kind of loss function (target function to be optimized, defining the type of regression setting). In this review article, we highlight the most recent methodological developments on statistical boosting regarding variable selection, functional regression, and advanced time-to-event modelling. Additionally, we provide a short overview on relevant applications of statistical boosting in biomedicine.
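The automated variable selection mentioned above comes from the component-wise updates of statistical boosting: at each iteration every base-learner is fit to the current residuals, but only the best-fitting one is updated. A minimal component-wise L2 boosting sketch with simple linear base-learners (illustrative; not any specific package's implementation):

```python
import numpy as np

def componentwise_l2_boost(X, y, steps=100, nu=0.1):
    """Component-wise L2 boosting with univariate linear base-learners.
    Each step fits every column to the residuals and updates only the
    best-fitting coefficient by a shrunken amount nu (implicit selection
    and regularization)."""
    n, p = X.shape
    coef = np.zeros(p)
    intercept = y.mean()
    resid = y - intercept
    for _ in range(steps):
        best = None
        for j in range(p):
            xj = X[:, j]
            b = xj @ resid / (xj @ xj)          # univariate least squares
            rss = ((resid - b * xj) ** 2).sum()
            if best is None or rss < best[2]:
                best = (j, b, rss)
        j, b, _ = best
        coef[j] += nu * b                       # shrunken update of one component
        resid = resid - nu * b * X[:, j]
    return intercept, coef

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0]                               # only the first variable is informative
intercept, coef = componentwise_l2_boost(X, y, steps=200)
print(np.round(coef, 2))                        # first coefficient near 2, others near 0
```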

  13. Statistical methods used in articles published by the Journal of Periodontal and Implant Science.

    PubMed

    Choi, Eunsil; Lyu, Jiyoung; Park, Jinyoung; Kim, Hae-Young

    2014-12-01

    The purposes of this study were to assess the trend of use of statistical methods including parametric and nonparametric methods and to evaluate the use of complex statistical methodology in recent periodontal studies. This study analyzed 123 articles published in the Journal of Periodontal & Implant Science (JPIS) between 2010 and 2014. Frequencies and percentages were calculated according to the number of statistical methods used, the type of statistical method applied, and the type of statistical software used. Most of the published articles considered (64.4%) used statistical methods. Since 2011, the percentage of JPIS articles using statistics has increased. On the basis of multiple counting, we found that the percentage of studies in JPIS using parametric methods was 61.1%. Further, complex statistical methods were applied in only 6 of the published studies (5.0%), and nonparametric statistical methods were applied in 77 of the published studies (38.9% of a total of 198 studies considered). We found an increasing trend towards the application of statistical methods and nonparametric methods in recent periodontal studies and thus, concluded that increased use of complex statistical methodology might be preferred by the researchers in the fields of study covered by JPIS.

  14. An analysis of the positional distribution of DNA motifs in promoter regions and its biological relevance.

    PubMed

    Casimiro, Ana C; Vinga, Susana; Freitas, Ana T; Oliveira, Arlindo L

    2008-02-07

Motif finding algorithms have improved in their ability to use computationally efficient methods to detect patterns in biological sequences. However, the posterior classification of the output still suffers from some limitations, which makes it difficult to assess the biological significance of the motifs found. Previous work has highlighted the existence of positional bias of motifs in DNA sequences, which might indicate not only that the pattern is important, but also provide hints about the positions where these patterns occur preferentially. We propose to integrate positional uniformity tests and over-representation tests to improve the accuracy of the classification of motifs. Using artificial data, we compared three different statistical tests (chi-square, Kolmogorov-Smirnov, and a chi-square bootstrap) to assess whether a given motif occurs uniformly in the promoter region of a gene. Using the test that performed best on this dataset, we proceeded to study the positional distribution of several well-known cis-regulatory elements in the promoter sequences of different organisms (S. cerevisiae, H. sapiens, D. melanogaster, E. coli, and several dicotyledonous plants). The results show that positional conservation is relevant for the transcriptional machinery. We conclude that many biologically relevant motifs appear heterogeneously distributed in the promoter region of genes and, therefore, that non-uniformity is a good indicator of biological relevance that can be used to complement the over-representation tests commonly used. In this article we present the results obtained for the S. cerevisiae data sets.
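A positional uniformity test of the kind compared in the paper can be sketched as a binned chi-square test; the promoter length, bin count, and motif positions below are hypothetical:

```python
from collections import Counter

def position_uniformity_chi2(positions, promoter_len, n_bins=10):
    """Chi-square statistic for positional uniformity: bin the motif start
    positions and compare observed bin counts with the equal expected
    count (degrees of freedom: n_bins - 1)."""
    bins = Counter(min(pos * n_bins // promoter_len, n_bins - 1) for pos in positions)
    expected = len(positions) / n_bins
    return sum((bins.get(b, 0) - expected) ** 2 / expected for b in range(n_bins))

# Hypothetical motif positions piled up near the end of a 1000-bp promoter:
clustered = [950 + (i % 40) for i in range(200)]
stat = position_uniformity_chi2(clustered, promoter_len=1000)
print(stat > 16.92)  # chi-square critical value for df = 9, alpha = 0.05 → True
```

A large statistic flags the non-uniform (and hence potentially biologically relevant) positional distribution; uniformly spread positions would fall below the critical value.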

  15. Twitter: A Novel Tool for Studying the Health and Social Needs of Transgender Communities.

    PubMed

    Krueger, Evan A; Young, Sean D

    2015-01-01

    Limited research has examined the health and social needs of transgender and gender nonconforming populations. Due to high levels of stigma, transgender individuals may avoid disclosing their identities to researchers, hindering this type of work. Further, researchers have traditionally relied on clinic-based sampling methods, which may mask the true heterogeneity of transgender and gender nonconforming communities. Online social networking websites present a novel platform for studying this diverse, difficult-to-reach population. The objective of this study was to attempt to examine the perceived health and social needs of transgender and gender nonconforming communities by examining messages posted to the popular microblogging platform, Twitter. Tweets were collected from 13 transgender-related hashtags on July 11, 2014. They were read and coded according to general themes addressed, and a content analysis was performed. Qualitative and descriptive statistics are presented. There were 1135 tweets that were collected in total. Both "positive" and "negative" events were discussed, in both personal and social contexts. Violence, discrimination, suicide, and sexual risk behavior were discussed. There were 34.36% (390/1135) of tweets that addressed transgender-relevant current events, and 60.79% (690/1135) provided a link to a relevant news article or resource. This study found that transgender individuals and allies use Twitter to discuss health and social needs relevant to the population. Real-time social media sites like Twitter can be used to study issues relevant to transgender communities.

  16. A gene-signature progression approach to identifying candidate small-molecule cancer therapeutics with connectivity mapping.

    PubMed

    Wen, Qing; Kim, Chang-Sik; Hamilton, Peter W; Zhang, Shu-Dong

    2016-05-11

Gene expression connectivity mapping has gained much popularity recently, with a number of successful applications in biomedical research testifying to its utility and promise. Previously, methodological research in connectivity mapping mainly focused on two of the key components in the framework, namely the reference gene expression profiles and the connectivity mapping algorithms. The other key component in this framework, the query gene signature, has been left to users to construct without much consensus on how this should be done, although it is the issue most relevant to end users. As a key input to the connectivity mapping process, the gene signature is crucially important for returning biologically meaningful and relevant results. This paper intends to formulate a standardized procedure for constructing high-quality gene signatures from a user's perspective. We describe a two-stage process for making quality gene signatures using gene expression data as initial inputs. First, a differential gene expression analysis compares two distinct biological states; only the genes that have passed stringent statistical criteria are considered in the second stage of the process, which involves ranking genes based on statistical as well as biological significance. We introduce a "gene signature progression" method as a standard procedure in connectivity mapping. Starting from the highest-ranked gene, we progressively determine the minimum length of the gene signature that allows connections to the reference profiles (drugs) to be established with a preset target false discovery rate. We use a lung cancer dataset and a breast cancer dataset as two case studies to demonstrate how this standardized procedure works, and we show that highly relevant and interesting biological connections are returned. Of particular note is gefitinib, identified as among the candidate therapeutics in our lung cancer case study.
Our gene signature was based on gene expression data from Taiwan female non-smoker lung cancer patients, while there is evidence from independent studies that gefitinib is highly effective in treating women, non-smoker or former light smoker, advanced non-small cell lung cancer patients of Asian origin. In summary, we introduced a gene signature progression method into connectivity mapping, which enables a standardized procedure for constructing high quality gene signatures. This progression method is particularly useful when the number of differentially expressed genes identified is large, and when there is a need to prioritize them to be included in the query signature. The results from two case studies demonstrate that the approach we have developed is capable of obtaining pertinent candidate drugs with high precision.
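The progression step can be sketched as follows; the `estimated_fdr` callback stands in for the connectivity-mapping scoring machinery and is a hypothetical interface, not part of the paper:

```python
def progress_signature(ranked_genes, estimated_fdr, target_fdr=0.05):
    """Gene-signature progression (sketch): grow the query signature from
    the top-ranked gene downward, stopping at the minimum length at which
    connections to the reference profiles reach the target FDR.

    estimated_fdr(signature) -> float is a stand-in for the
    connectivity-mapping scoring step (hypothetical interface).
    """
    signature = []
    for gene in ranked_genes:
        signature.append(gene)
        if estimated_fdr(signature) <= target_fdr:
            break
    return signature

# Toy FDR model: longer signatures give tighter FDR (purely illustrative).
genes = [f"G{i}" for i in range(1, 31)]
sig = progress_signature(genes, estimated_fdr=lambda s: 1.0 / len(s))
print(len(sig))  # → 20
```

This captures the paper's prioritization idea: when many genes pass the differential-expression filter, only the shortest prefix of the ranked list needed to establish connections is used as the query.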

  17. Statistical CT noise reduction with multiscale decomposition and penalized weighted least squares in the projection domain

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tang Shaojie; Tang Xiangyang; School of Automation, Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710121

    2012-09-15

Purpose: The suppression of noise in x-ray computed tomography (CT) imaging is of clinical relevance for diagnostic image quality and the potential for radiation dose saving. Toward this purpose, statistical noise reduction methods in either the image or projection domain have been proposed, which employ a multiscale decomposition to enhance the performance of noise suppression while maintaining image sharpness. Recognizing the advantages of noise suppression in the projection domain, the authors propose a projection domain multiscale penalized weighted least squares (PWLS) method, in which the angular sampling rate is explicitly taken into consideration to account for the possible variation of interview sampling rate in advanced clinical or preclinical applications. Methods: The projection domain multiscale PWLS method is derived by converting an isotropic diffusion partial differential equation in the image domain into the projection domain, wherein a multiscale decomposition is carried out. With adoption of the Markov random field or soft thresholding objective function, the projection domain multiscale PWLS method deals with noise at each scale. To compensate for the degradation in image sharpness caused by the projection domain multiscale PWLS method, an edge enhancement is carried out following the noise reduction. The performance of the proposed method is experimentally evaluated and verified using the projection data simulated by computer and acquired by a CT scanner. Results: The preliminary results show that the proposed projection domain multiscale PWLS method outperforms the projection domain single-scale PWLS method and the image domain multiscale anisotropic diffusion method in noise reduction. In addition, the proposed method can preserve image sharpness very well while the occurrence of 'salt-and-pepper' noise and mosaic artifacts can be avoided.
Conclusions: Since the interview sampling rate is taken into account in the projection domain multiscale decomposition, the proposed method is anticipated to be useful in advanced clinical and preclinical applications where the interview sampling rate varies.
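The PWLS estimate referred to above has the generic form (shown here in its standard textbook statement, not the authors' exact multiscale formulation):

```latex
\hat{q} \;=\; \arg\min_{q}\; (y - q)^{\mathsf{T}}\,\Sigma^{-1}\,(y - q) \;+\; \beta\, R(q)
```

where $y$ is the noisy projection data, $\Sigma$ is a diagonal matrix of projection-wise noise variances (the "weights"), $R(\cdot)$ is a roughness penalty such as a Markov random field prior, and $\beta$ sets the regularization strength; the multiscale variant applies such a penalty at each decomposition scale.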

  18. Effects of Cognitive Load on Trust

    DTIC Science & Technology

    2013-10-01

…that may be affected by load … Build a parsing tool to extract relevant features … Statistical analysis of results (by load components) Achieved…for a business application. Participants assessed potential job candidates and reviewed the applicants' virtual resume which included standard…substantially different from each other that would make any confounding problems or other issues. Some statistics of the Australian data collection are…

  19. Actitudes de Estudiantes Universitarios que Tomaron Cursos Introductorios de Estadistica y su Relacion con el Exito Academico en La Disciplina

    ERIC Educational Resources Information Center

    Colon-Rosa, Hector Wm.

    2012-01-01

    Considering the range of changes in the instruction and learning of statistics, several questions emerge regarding how those changes influence students' attitudes. Equally, other questions emerge to reflect that statistics is a fundamental course in the university academic programs because of its relevance to the professional development of the…

  20. Arab oil and gas directory 1985

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Not Available

    1985-01-01

    The directory provides detailed statistics and information on aspects of oil and gas production, exploration and developments in the 24 Arab countries of the Middle East and North Africa and in Iran. It includes the texts of relevant new laws and official documents, official surveys, current projects and developments, up-to-date statistics covering OPEC and OAPEC member countries, and has 26 maps.
