Application of the Bootstrap Statistical Method in Deriving Vibroacoustic Specifications
NASA Technical Reports Server (NTRS)
Hughes, William O.; Paez, Thomas L.
2006-01-01
This paper discusses the Bootstrap Method for deriving vibroacoustic test specifications. Vibroacoustic test specifications are necessary to properly accept or qualify a spacecraft and its components for the expected acoustic, random vibration and shock environments seen on an expendable launch vehicle. Traditionally, NASA and the U.S. Air Force have employed methods of Normal Tolerance Limits to derive these test levels based upon the amount of data available and the probability and confidence levels desired. The Normal Tolerance Limit method contains inherent assumptions about the distribution of the data. The Bootstrap is a distribution-free statistical subsampling method which uses the measured data themselves to establish estimates of statistical measures of random sources. This is achieved through the computation of large numbers of Bootstrap replicates of a data measure of interest and the use of these replicates to derive test levels consistent with the probability and confidence desired. The results of the two methods are compared via an example utilizing actual spacecraft vibroacoustic data.
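The replicate-based procedure described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the data measure of interest is the `prob`-quantile of the measured levels in one frequency band and that the test level is read off as the `conf`-quantile of the Bootstrap replicates (a percentile-type bound). The function name, the parameters, and the dB values are hypothetical.

```python
import numpy as np

def bootstrap_upper_limit(levels_db, prob=0.95, conf=0.50, n_boot=5000, seed=0):
    """Percentile-bootstrap sketch of a P(prob)/(conf) upper limit for one band."""
    rng = np.random.default_rng(seed)
    data = np.asarray(levels_db, dtype=float)
    # Each row is one Bootstrap resample drawn with replacement from the measurements.
    resamples = rng.choice(data, size=(n_boot, data.size), replace=True)
    # Data measure of interest (assumed here): the prob-quantile of each resample.
    replicates = np.quantile(resamples, prob, axis=1)
    # The conf-quantile of the replicates is the level that a fraction `conf`
    # of the replicate estimates do not exceed.
    return float(np.quantile(replicates, conf)), replicates

# Hypothetical measured levels (dB) in a single frequency band.
measured = [128.1, 130.4, 127.6, 131.2, 129.0, 132.5, 128.8, 130.9]
p95_50, reps = bootstrap_upper_limit(measured, conf=0.50)
p95_90, _ = bootstrap_upper_limit(measured, conf=0.90)
print(f"P95/50 = {p95_50:.1f} dB, P95/90 = {p95_90:.1f} dB")
```

Because each replicate is built only from the measured values, no replicate can exceed the largest measurement; with very few samples this limits how high a bootstrap level of this kind can reach, which is one reason a comparison against the Normal Tolerance Limit is informative.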
Development of a Predictive Corrosion Model Using Locality-Specific Corrosion Indices
2017-09-12
... 3.2.1 Statistical data analysis methods ... 3.2.2 Algorithm development method ... components, and method) were compiled into an executable program that uses mathematical models of materials degradation, and statistical calculations
Statistical and Economic Techniques for Site-specific Nematode Management.
Liu, Zheng; Griffin, Terry; Kirkpatrick, Terrence L
2014-03-01
Recent advances in precision agriculture technologies and spatial statistics allow realistic, site-specific estimation of nematode damage to field crops and provide a platform for the site-specific delivery of nematicides within individual fields. This paper reviews the spatial statistical techniques that model correlations among neighboring observations and develops a spatial economic analysis to determine the potential of site-specific nematicide application. The spatial econometric methodology applied in the context of site-specific crop yield response contributes to closing the gap between data analysis and realistic site-specific nematicide recommendations and helps to provide a practical method of site-specifically controlling nematodes.
Bruner, L H; Carr, G J; Harbell, J W; Curren, R D
2002-06-01
An approach commonly used to measure new toxicity test method (NTM) performance in validation studies is to divide toxicity results into positive and negative classifications, and then identify true positive (TP), true negative (TN), false positive (FP) and false negative (FN) results. After this step is completed, the contingent probability statistics (CPS), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) are calculated. Although these statistics are widely used and often the only statistics used to assess the performance of toxicity test methods, there is little specific guidance in the validation literature on what values for these statistics indicate adequate performance. The purpose of this study was to begin developing data-based answers to this question by characterizing the CPS obtained from an NTM whose data have a completely random association with a reference test method (RTM). Determining the CPS of this worst-case scenario is useful because it provides a lower baseline from which the performance of an NTM can be judged in future validation studies. It also provides an indication of relationships in the CPS that help identify random or near-random relationships in the data. The results from this study of randomly associated tests show that the values obtained for the statistics vary significantly depending on the cut-offs chosen, that high values can be obtained for individual statistics, and that the different measures cannot be considered independently when evaluating the performance of an NTM. When the association between results of an NTM and RTM is random, the sum of the complementary pairs of statistics (sensitivity + specificity, NPV + PPV) is approximately 1, and the prevalence (i.e., the proportion of toxic chemicals in the population of chemicals) and PPV are equal. Given that combinations of high sensitivity-low specificity or low sensitivity-high specificity (i.e., the sum of the sensitivity and specificity equal to approximately 1) indicate lack of predictive capacity, an NTM having these performance characteristics should be considered no better for predicting toxicity than chance alone.
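The relationships reported for randomly associated tests can be checked with a short simulation. The sketch below is illustrative only: the prevalence, the fraction of positive calls, and the sample size are hypothetical values, and the NTM calls are generated independently of the RTM results to mimic a completely random association.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
prevalence = 0.30      # hypothetical proportion of truly toxic chemicals
positive_rate = 0.80   # hypothetical fraction the NTM calls positive (set by its cut-off)

truth = rng.random(n) < prevalence     # reference test method (RTM) classification
call = rng.random(n) < positive_rate   # NTM classification, independent of the RTM

# Contingency table counts.
tp = np.sum(call & truth)
fp = np.sum(call & ~truth)
fn = np.sum(~call & truth)
tn = np.sum(~call & ~truth)

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)

print(f"sensitivity + specificity = {sensitivity + specificity:.3f}")
print(f"PPV + NPV = {ppv + npv:.3f}")
print(f"PPV = {ppv:.3f} vs prevalence = {prevalence:.3f}")
```

With these settings the sensitivity alone looks high even though the test carries no information, which is the point the abstract makes about not reading the statistics independently of one another.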
Statistical methods for nuclear material management
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bowen W.M.; Bennett, C.A.
1988-12-01
This book is intended as a reference manual of statistical methodology for nuclear material management practitioners. It describes statistical methods currently or potentially important in nuclear material management, explains the choice of methods for specific applications, and provides examples of practical applications to nuclear material management problems. Together with the accompanying training manual, which contains fully worked out problems keyed to each chapter, this book can also be used as a textbook for courses in statistical methods for nuclear material management. It should provide increased understanding and guidance to help improve the application of statistical methods to nuclear material management problems.
NASA Technical Reports Server (NTRS)
Keegan, W. B.
1974-01-01
To produce cost-effective environmental test programs, the test specifications must be realistic, and to be useful they must be available early in the life of a program. This paper describes a method for achieving such specifications for subsystems by utilizing the results of a statistical analysis of data acquired at subsystem mounting locations during system level environmental tests. The paper describes the details of this statistical analysis. The resultant recommended levels are a function of the subsystems' mounting location in the spacecraft. Methods of determining this mounting 'zone' are described. Recommendations are then made as to which of the various problem areas encountered should be pursued further.
ERIC Educational Resources Information Center
Thebaud, Schiller
This report examines four UNESCO pilot projects undertaken in 1972 in Brazil, Colombia, Peru, and Uruguay to study the methods used for national statistical surveys of science and technology. The projects specifically addressed the problems of comparing statistics gathered by different methods in different countries. Surveys carried out in Latin…
Raffelt, David A.; Smith, Robert E.; Ridgway, Gerard R.; Tournier, J-Donald; Vaughan, David N.; Rose, Stephen; Henderson, Robert; Connelly, Alan
2015-01-01
In brain regions containing crossing fibre bundles, voxel-average diffusion MRI measures such as fractional anisotropy (FA) are difficult to interpret, and lack within-voxel single fibre population specificity. Recent work has focused on the development of more interpretable quantitative measures that can be associated with a specific fibre population within a voxel containing crossing fibres (herein we use fixel to refer to a specific fibre population within a single voxel). Unfortunately, traditional 3D methods for smoothing and cluster-based statistical inference cannot be used for voxel-based analysis of these measures, since the local neighbourhood for smoothing and cluster formation can be ambiguous when adjacent voxels may have different numbers of fixels, or ill-defined when they belong to different tracts. Here we introduce a novel statistical method to perform whole-brain fixel-based analysis called connectivity-based fixel enhancement (CFE). CFE uses probabilistic tractography to identify structurally connected fixels that are likely to share underlying anatomy and pathology. Probabilistic connectivity information is then used for tract-specific smoothing (prior to the statistical analysis) and enhancement of the statistical map (using a threshold-free cluster enhancement-like approach). To investigate the characteristics of the CFE method, we assessed sensitivity and specificity using a large number of combinations of CFE enhancement parameters and smoothing extents, using simulated pathology generated with a range of test-statistic signal-to-noise ratios in five different white matter regions (chosen to cover a broad range of fibre bundle features). The results suggest that CFE input parameters are relatively insensitive to the characteristics of the simulated pathology. We therefore recommend a single set of CFE parameters that should give near optimal results in future studies where the group effect is unknown. 
We then demonstrate the proposed method by comparing apparent fibre density between motor neurone disease (MND) patients and control subjects. The MND results illustrate the benefit of fixel-specific statistical inference in white matter regions that contain crossing fibres. PMID:26004503
Mathes, Robert W; Lall, Ramona; Levin-Rector, Alison; Sell, Jessica; Paladini, Marc; Konty, Kevin J; Olson, Don; Weiss, Don
2017-01-01
The New York City Department of Health and Mental Hygiene has operated an emergency department syndromic surveillance system since 2001, using temporal and spatial scan statistics run on a daily basis for cluster detection. Since the system was originally implemented, a number of new methods have been proposed for use in cluster detection. We evaluated six temporal and four spatial/spatio-temporal detection methods using syndromic surveillance data spiked with simulated injections. The algorithms were compared on several metrics, including sensitivity, specificity, positive predictive value, coherence, and timeliness. We also evaluated each method’s implementation, programming time, run time, and the ease of use. Among the temporal methods, at a set specificity of 95%, a Holt-Winters exponential smoother performed the best, detecting 19% of the simulated injects across all shapes and sizes, followed by an autoregressive moving average model (16%), a generalized linear model (15%), a modified version of the Early Aberration Reporting System’s C2 algorithm (13%), a temporal scan statistic (11%), and a cumulative sum control chart (<2%). Of the spatial/spatio-temporal methods we tested, a spatial scan statistic detected 3% of all injects, a Bayes regression found 2%, and a generalized linear mixed model and a space-time permutation scan statistic detected none at a specificity of 95%. Positive predictive value was low (<7%) for all methods. Overall, the detection methods we tested did not perform well in identifying the temporal and spatial clusters of cases in the inject dataset. The spatial scan statistic, our current method for spatial cluster detection, performed slightly better than the other tested methods across different inject magnitudes and types. Furthermore, we found the scan statistics, as applied in the SaTScan software package, to be the easiest to program and implement for daily data analysis. PMID:28886112
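One of the temporal methods named above, the cumulative sum (CUSUM) control chart, can be sketched in a few lines. This is a generic one-sided CUSUM on standardized daily counts, not the configuration evaluated in the study: the fixed baseline window, the reference value `k`, the threshold `h`, and the simulated visit counts with an injected cluster are all assumptions made for illustration.

```python
import numpy as np

def cusum_alarms(counts, baseline_days=28, k=0.5, h=4.0):
    """One-sided CUSUM on counts standardized against a fixed baseline window."""
    counts = np.asarray(counts, dtype=float)
    mu = counts[:baseline_days].mean()
    sigma = counts[:baseline_days].std(ddof=1)
    s = 0.0
    alarms = []
    for day in range(baseline_days, counts.size):
        z = (counts[day] - mu) / sigma
        # Accumulate only excesses above the reference value k; never go below zero.
        s = max(0.0, s + z - k)
        if s > h:
            alarms.append(day)
            s = 0.0  # reset the statistic after signalling
    return alarms

rng = np.random.default_rng(7)
visits = rng.poisson(50, size=60).astype(float)  # hypothetical daily ED visit counts
visits[45:50] += 30                              # simulated inject on days 45-49
alarms = cusum_alarms(visits)
print("alarm days:", alarms)
```

Raising `h` trades sensitivity for specificity, which is how a chart of this kind would be tuned to a set specificity such as the 95% used in the evaluation.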
SEGMENTING CT PROSTATE IMAGES USING POPULATION AND PATIENT-SPECIFIC STATISTICS FOR RADIOTHERAPY.
Feng, Qianjin; Foskey, Mark; Tang, Songyuan; Chen, Wufan; Shen, Dinggang
2009-08-07
This paper presents a new deformable model using both population and patient-specific statistics to segment the prostate from CT images. There are two novelties in the proposed method. First, a modified scale invariant feature transform (SIFT) local descriptor, which is more distinctive than general intensity and gradient features, is used to characterize the image features. Second, an online training approach is used to build the shape statistics for accurately capturing intra-patient variation, which is more important than inter-patient variation for prostate segmentation in clinical radiotherapy. Experimental results show that the proposed method is robust and accurate, suitable for clinical application. PMID:21197416
A Unifying Framework for Teaching Nonparametric Statistical Tests
ERIC Educational Resources Information Center
Bargagliotti, Anna E.; Orrison, Michael E.
2014-01-01
Increased importance is being placed on statistics at both the K-12 and undergraduate level. Research divulging effective methods to teach specific statistical concepts is still widely sought after. In this paper, we focus on best practices for teaching topics in nonparametric statistics at the undergraduate level. To motivate the work, we…
An Applied Statistics Course for Systematics and Ecology PhD Students
ERIC Educational Resources Information Center
Ojeda, Mario Miguel; Sosa, Victoria
2002-01-01
Statistics education is under review at all educational levels. Statistical concepts, as well as the use of statistical methods and techniques, can be taught in at least two contrasting ways. Specifically, (1) teaching can be theoretically and mathematically oriented, or (2) it can be less mathematically oriented being focused, instead, on…
Statistical Process Control: Going to the Limit for Quality.
ERIC Educational Resources Information Center
Training, 1987
1987-01-01
Defines the concept of statistical process control, a quality control method used especially in manufacturing. Generally, concept users set specific standard levels that must be met. Makes the point that although employees work directly with the method, management is responsible for its success within the plant. (CH)
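As a small illustration of the idea, control limits can be computed from in-control measurements and later samples checked against them. This is a simplified sketch under stated assumptions: it uses the sample standard deviation and limits at three standard deviations from the mean (individuals charts in practice often estimate spread from moving ranges), and the part dimensions are hypothetical.

```python
import statistics

def control_limits(in_control, n_sigma=3.0):
    """Return (lower limit, center line, upper limit) from in-control data."""
    center = statistics.fmean(in_control)
    spread = statistics.stdev(in_control)
    return center - n_sigma * spread, center, center + n_sigma * spread

def out_of_control(samples, lcl, ucl):
    """Indices of samples falling outside the control limits."""
    return [i for i, x in enumerate(samples) if x < lcl or x > ucl]

# Hypothetical part dimensions in mm, measured while the process was stable.
baseline = [10.02, 9.98, 10.01, 9.97, 10.03, 10.00, 9.99, 10.02, 9.98, 10.00]
lcl, center, ucl = control_limits(baseline)
flagged = out_of_control([10.01, 9.99, 10.12, 10.00], lcl, ucl)
print(f"LCL={lcl:.2f} CL={center:.2f} UCL={ucl:.2f} flagged={flagged}")
```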
Statistical Design Model (SDM) of satellite thermal control subsystem
NASA Astrophysics Data System (ADS)
Mirshams, Mehran; Zabihian, Ehsan; Aarabi Chamalishahi, Mahdi
2016-07-01
Satellite thermal control is the satellite subsystem whose main task is to keep the satellite components within their survival and operating temperature ranges. The capability of the thermal control subsystem plays a key role in satisfying a satellite's operational requirements, and designing this subsystem is a part of satellite design. On the other hand, owing to the lack of information provided by companies and designers, this subsystem still does not have a specific design process, although it is one of the fundamental subsystems. The aim of this paper is to identify and extract statistical design models of the spacecraft thermal control subsystem by using the SDM design method. This method analyses statistical data with a particular procedure. To implement the SDM method, a complete database is required. Therefore, we first collect spacecraft data and create a database, and then we extract statistical graphs using Microsoft Excel, from which we further extract mathematical models. The input parameters of the method are the mass, mission, and lifetime of the satellite. For this purpose, the thermal control subsystem is first introduced, and the hardware used in this subsystem and its variants are investigated. Next, different statistical models are presented and briefly compared. Finally, the particular statistical model of this paper is extracted from the collected statistical data. The accuracy of the method is tested and verified with a case study: a comparison between the thermal control subsystem specifications of a fabricated satellite and the analysis results shows the methodology to be effective. Key Words: Thermal control subsystem design, Statistical design model (SDM), Satellite conceptual design, Thermal hardware
Assessing groundwater vulnerability to agrichemical contamination in the Midwest US
Burkart, M.R.; Kolpin, D.W.; James, D.E.
1999-01-01
Agrichemicals (herbicides and nitrate) are significant sources of diffuse pollution to groundwater. Indirect methods are needed to assess the potential for groundwater contamination by diffuse sources because groundwater monitoring is too costly to adequately define the geographic extent of contamination at a regional or national scale. This paper presents examples of the application of statistical, overlay and index, and process-based modeling methods for groundwater vulnerability assessments to a variety of data from the Midwest U.S. The principles for vulnerability assessment include both intrinsic (pedologic, climatologic, and hydrogeologic factors) and specific (contaminant and other anthropogenic factors) vulnerability of a location. Statistical methods use the frequency of contaminant occurrence, contaminant concentration, or contamination probability as a response variable. Statistical assessments are useful for defining the relations among explanatory and response variables whether they define intrinsic or specific vulnerability. Multivariate statistical analyses are useful for ranking variables critical to estimating water quality responses of interest. Overlay and index methods involve intersecting maps of intrinsic and specific vulnerability properties and indexing the variables by applying appropriate weights. Deterministic models use process-based equations to simulate contaminant transport and are distinguished from the other methods in their potential to predict contaminant transport in both space and time. An example of a one-dimensional leaching model linked to a geographic information system (GIS) to define a regional metamodel for contamination in the Midwest is included.
40 CFR 1065.12 - Approval of alternate procedures.
Code of Federal Regulations, 2010 CFR
2010-07-01
... engine meets all applicable emission standards according to specified procedures. (iii) Use statistical.... (e) We may give you specific directions regarding methods for statistical analysis, or we may approve... statistical tests. Perform the tests as follows: (1) Repeat measurements for all applicable duty cycles at...
The Statistical Power of Planned Comparisons.
ERIC Educational Resources Information Center
Benton, Roberta L.
Basic principles underlying statistical power are examined; and issues pertaining to effect size, sample size, error variance, and significance level are highlighted via the use of specific hypothetical examples. Analysis of variance (ANOVA) and related methods remain popular, although other procedures sometimes have more statistical power against…
General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies
Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong
2013-01-01
We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
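The aggregation of study-specific summary statistics can be illustrated with a simplified burden-type test. The sketch below assumes homogeneous genetic effects across studies, so the per-variant score vectors and their covariance matrices are simply summed; the weighted burden statistic is then compared with a chi-square distribution with one degree of freedom. The function name and the numbers are hypothetical, and this is not the authors' software.

```python
import math
import numpy as np

def meta_burden_test(scores, covariances, weights):
    """Burden-type meta-analysis from per-study score statistics (homogeneous effects)."""
    s = np.sum(scores, axis=0)          # pooled score statistic for each variant
    phi = np.sum(covariances, axis=0)   # pooled between-variant covariance matrix
    w = np.asarray(weights, dtype=float)
    # Squared weighted sum of scores, scaled by its variance under the null.
    q = float(w @ s) ** 2 / float(w @ phi @ w)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(q / 2.0))
    return q, p_value

# Hypothetical summary statistics: two studies, three rare variants in one gene.
scores = [np.array([1.2, 0.4, -0.1]), np.array([0.9, 0.7, 0.3])]
covs = [
    np.array([[1.0, 0.2, 0.0], [0.2, 0.8, 0.1], [0.0, 0.1, 0.5]]),
    np.array([[1.5, 0.3, 0.1], [0.3, 1.0, 0.0], [0.1, 0.0, 0.7]]),
]
q, p = meta_burden_test(scores, covs, weights=[1.0, 1.0, 1.0])
print(f"Q = {q:.3f}, p = {p:.3f}")
```

Only the null model has to be fitted in each study to obtain these inputs, which is the practical advantage the abstract points to; variance component tests and heterogeneous-effect versions need the same summary statistics but a different combination rule.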
Non-linear scaling of a musculoskeletal model of the lower limb using statistical shape models.
Nolte, Daniel; Tsang, Chui Kit; Zhang, Kai Yu; Ding, Ziyun; Kedgley, Angela E; Bull, Anthony M J
2016-10-03
Accurate muscle geometry for musculoskeletal models is important to enable accurate subject-specific simulations. Commonly, linear scaling is used to obtain individualised muscle geometry. More advanced methods include non-linear scaling using segmented bone surfaces and manual or semi-automatic digitisation of muscle paths from medical images. In this study, a new scaling method combining non-linear scaling with reconstructions of bone surfaces using statistical shape modelling is presented. Statistical Shape Models (SSMs) of femur and tibia/fibula were used to reconstruct bone surfaces of nine subjects. Reference models were created by morphing manually digitised muscle paths to mean shapes of the SSMs using non-linear transformations and inter-subject variability was calculated. Subject-specific models of muscle attachment and via points were created from three reference models. The accuracy was evaluated by calculating the differences between the scaled and manually digitised models. The points defining the muscle paths showed large inter-subject variability at the thigh and shank - up to 26 mm; this was found to limit the accuracy of all studied scaling methods. Errors for the subject-specific muscle point reconstructions of the thigh could be decreased by 9% to 20% by using the non-linear scaling compared to a typical linear scaling method. We conclude that the proposed non-linear scaling method is more accurate than linear scaling methods. Thus, when combined with the ability to reconstruct bone surfaces from incomplete or scattered geometry data using statistical shape models our proposed method is an alternative to linear scaling methods.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bao, Rong; Li, Yongdong; Liu, Chunliang
2016-07-15
The output power fluctuations caused by weights of macro particles used in particle-in-cell (PIC) simulations of a backward wave oscillator and a travelling wave tube are statistically analyzed. It is found that the velocities of electrons that have passed through a specific slow-wave structure form a specific electron velocity distribution. The electron velocity distribution obtained in a PIC simulation with a relatively small weight of macro particles is taken as an initial distribution. By analyzing this initial distribution with a statistical method, estimates of the output power fluctuations caused by different weights of macro particles are obtained. The statistical method is verified by comparing the estimates with the simulation results. The fluctuations become stronger with increasing weight of macro particles, which can also be determined reversely from estimates of the output power fluctuations. With the weights of macro particles optimized by the statistical method, the output power fluctuations in PIC simulations are relatively small and acceptable.
Le Strat, Yann
2017-01-01
The objective of this paper is to evaluate a panel of statistical algorithms for temporal outbreak detection. Based on a large dataset of simulated weekly surveillance time series, we performed a systematic assessment of 21 statistical algorithms, 19 implemented in the R package surveillance and two other methods. We estimated false positive rate (FPR), probability of detection (POD), probability of detection during the first week, sensitivity, specificity, negative and positive predictive values and F1-measure for each detection method. Then, to identify the factors associated with these performance measures, we ran multivariate Poisson regression models adjusted for the characteristics of the simulated time series (trend, seasonality, dispersion, outbreak sizes, etc.). The FPR ranged from 0.7% to 59.9% and the POD from 43.3% to 88.7%. Some methods had a very high specificity, up to 99.4%, but a low sensitivity. Methods with a high sensitivity (up to 79.5%) had a low specificity. All methods had a high negative predictive value, over 94%, while positive predictive values ranged from 6.5% to 68.4%. Multivariate Poisson regression models showed that performance measures were strongly influenced by the characteristics of time series. Past or current outbreak size and duration strongly influenced detection performances. PMID:28715489
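Most of the performance measures listed in this abstract follow directly from the four confusion-matrix counts. The helper below is a generic sketch rather than the study's code; the function name and the example counts are hypothetical, and the counts are chosen to show how a high negative predictive value can coexist with a modest positive predictive value when outbreaks are rare.

```python
def detection_metrics(tp, fp, fn, tn):
    """Standard performance measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return {
        "fpr": fp / (fp + tn),  # false positive rate = 1 - specificity
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }

# Hypothetical weekly results: 50 outbreak weeks among 1000 weeks in total.
metrics = detection_metrics(tp=40, fp=60, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```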
NASA Astrophysics Data System (ADS)
Denis, Vincent
2008-09-01
This paper presents a statistical method for determining the dimensions, tolerance and specifications of components for the Laser MegaJoule (LMJ). Numerous constraints inherent to a large facility require specific tolerances: the huge number of optical components; the interdependence of these components between the beams of the same bundle; angular multiplexing for the amplifier section; distinct operating modes between the alignment and firing phases; the definition and use of alignment software in place of classic optimization. This method provides greater flexibility to determine the positioning and manufacturing specifications of the optical components. Given the enormous power of the Laser MegaJoule (over 18 kJ in the infrared and 9 kJ in the ultraviolet), one of the major risks is damage to the optical mounts and pollution of the installation by mechanical ablation. This method enables estimation of the beam occultation probabilities and quantification of the risks for the facility. All the simulations were run using the ZEMAX-EE optical design software.
ERIC Educational Resources Information Center
Bliss, Leonard B.; Tashakkori, Abbas
This paper discusses the objectives that would be appropriate for statistics classes for students who are not majoring in statistics, evaluation, or quantitative research design. These "non-majors" should be able to choose appropriate analytical methods for specific sets of data based on the research question and the nature of the data, and they…
ERIC Educational Resources Information Center
Nevitt, Jonathan; Hancock, Gregory R.
2001-01-01
Evaluated the bootstrap method under varying conditions of nonnormality, sample size, model specification, and number of bootstrap samples drawn from the resampling space. Results for the bootstrap suggest the resampling-based method may be conservative in its control over model rejections, thus having an impact on the statistical power associated…
[Optimized application of nested PCR method for detection of malaria].
Yao-Guang, Z; Li, J; Zhen-Yu, W; Li, C
2017-04-28
Objective: To optimize the application of the nested PCR method for the detection of malaria according to working practice, so as to improve the efficiency of malaria detection. Methods: A PCR premix solution, internal primers for further amplification, and newly designed primers targeting two Plasmodium ovale subspecies were employed to optimize the reaction system, the reaction conditions, and the P. ovale-specific primers on the basis of routine nested PCR. The specificity and sensitivity of the optimized method were then analyzed. Positive blood samples and malaria examination samples were tested by the routine nested PCR and the optimized method simultaneously, and the detection results were compared and analyzed. Results: The optimized method showed good specificity, and its sensitivity reached the pg to fg level. When the two methods were used to test the same positive malarial blood samples simultaneously, the PCR products of the two methods showed no significant difference, but with the optimized method non-specific amplification was obviously reduced, the detection rates of the P. ovale subspecies improved, and the overall specificity increased. The actual detection results for 111 malarial blood samples showed that the sensitivity and specificity of the routine nested PCR were 94.57% and 86.96%, respectively, and those of the optimized method were both 93.48%; there was no statistically significant difference between the two methods in sensitivity (P > 0.05), but there was a statistically significant difference in specificity (P < 0.05). Conclusion: The optimized PCR can improve the specificity without reducing the sensitivity relative to routine nested PCR; it can also save cost and increase the efficiency of malaria detection because it involves fewer experimental steps.
Davalos, Angel D; Luben, Thomas J; Herring, Amy H; Sacks, Jason D
2017-02-01
Air pollution epidemiology traditionally focuses on the relationship between individual air pollutants and health outcomes (e.g., mortality). To account for potential copollutant confounding, individual pollutant associations are often estimated by adjusting or controlling for other pollutants in the mixture. Recently, the need to characterize the relationship between health outcomes and the larger multipollutant mixture has been emphasized in an attempt to better protect public health and inform more sustainable air quality management decisions. New and innovative statistical methods to examine multipollutant exposures were identified through a broad literature search, with a specific focus on those statistical approaches currently used in epidemiologic studies of short-term exposures to criteria air pollutants (i.e., particulate matter, carbon monoxide, sulfur dioxide, nitrogen dioxide, and ozone). Five broad classes of statistical approaches were identified for examining associations between short-term multipollutant exposures and health outcomes, specifically additive main effects, effect measure modification, unsupervised dimension reduction, supervised dimension reduction, and nonparametric methods. These approaches are characterized including advantages and limitations in different epidemiologic scenarios. By highlighting the characteristics of various studies in which multipollutant statistical methods have been used, this review provides epidemiologists and biostatisticians with a resource to aid in the selection of the most optimal statistical method to use when examining multipollutant exposures. Published by Elsevier Inc.
Gardenier, John S
2012-12-01
This paper recommends how authors of statistical studies can communicate to general audiences fully, clearly, and comfortably. The studies may use statistical methods to explore issues in science, engineering, and society or they may address issues in statistics specifically. In either case, readers without explicit statistical training should have no problem understanding the issues, the methods, or the results at a non-technical level. The arguments for those results should be clear, logical, and persuasive. This paper also provides advice for editors of general journals on selecting high quality statistical articles without the need for exceptional work or expense. Finally, readers are also advised to watch out for some common errors or misuses of statistics that can be detected without a technical statistical background.
A statistical method (cross-validation) for bone loss region detection after spaceflight
Zhao, Qian; Li, Wenjun; Li, Caixia; Chu, Philip W.; Kornak, John; Lang, Thomas F.
2010-01-01
Astronauts experience bone loss after long spaceflight missions. Identifying specific regions that undergo the greatest losses (e.g. the proximal femur) could reveal information about the processes of bone loss in disuse and disease. Detecting such regions, however, remains an open problem. This paper focuses on statistical methods to detect such regions. We perform statistical parametric mapping to get t-maps of changes in images, and propose a new cross-validation method to select an optimum suprathreshold for forming clusters of pixels. Once these candidate clusters are formed, we use permutation testing of longitudinal labels to derive significant changes. PMID:20632144
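The permutation step described above can be illustrated with a minimal sketch. It assumes that swapping the pre-flight and post-flight labels within a subject is equivalent to flipping the sign of that subject's paired difference, and that the test statistic is the mean difference within one candidate cluster. The function name and the data are hypothetical and are not taken from the paper.

```python
import random

def sign_flip_permutation_test(differences, n_permutations=10000, seed=0):
    """Two-sided p-value for the mean paired difference under label swapping."""
    rng = random.Random(seed)
    n = len(differences)
    observed = abs(sum(differences) / n)
    extreme = 0
    for _ in range(n_permutations):
        # Swapping the two time-point labels of a subject flips the sign
        # of that subject's post-minus-pre difference.
        permuted = sum(d if rng.random() < 0.5 else -d for d in differences) / n
        if abs(permuted) >= observed:
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Hypothetical post-minus-pre changes (arbitrary units) for one cluster.
changes = [-0.8, -1.1, -0.5, -0.9, -1.3, -0.2, -0.7, -1.0]
p_value = sign_flip_permutation_test(changes)
```

A small p-value here only says that a mean change this large is rare when the time-point labels carry no information; the cluster-forming threshold would still have to be chosen separately.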
Lee, Juneyoung; Kim, Kyung Won; Choi, Sang Hyun; Huh, Jimi
2015-01-01
Meta-analysis of diagnostic test accuracy studies differs from the usual meta-analysis of therapeutic/interventional studies in that it requires the simultaneous analysis of a pair of outcome measures, such as sensitivity and specificity, instead of a single outcome. Since sensitivity and specificity are generally inversely correlated and could be affected by a threshold effect, more sophisticated statistical methods are required for the meta-analysis of diagnostic test accuracy. Hierarchical models including the bivariate model and the hierarchical summary receiver operating characteristic model are increasingly being accepted as standard methods for meta-analysis of diagnostic test accuracy studies. We provide a conceptual review of statistical methods currently used and recommended for meta-analysis of diagnostic test accuracy studies. This article could serve as a methodological reference for those who perform systematic review and meta-analysis of diagnostic test accuracy studies. PMID:26576107
Simplified estimation of age-specific reference intervals for skewed data.
Wright, E M; Royston, P
1997-12-30
Age-specific reference intervals are commonly used in medical screening and clinical practice, where interest lies in the detection of extreme values. Many different statistical approaches have been published on this topic. The advantages of a parametric method are that it necessarily produces smooth centile curves, that the entire density is estimated, and that an explicit formula is available for the centiles. The method proposed here is a simplified version of a recent approach proposed by Royston and Wright. Basic transformations of the data and multiple regression techniques are combined to model the mean, standard deviation and skewness. Using these simple tools, which are implemented in almost all statistical computer packages, age-specific reference intervals may be obtained. The scope of the method is illustrated by fitting models to several real data sets and assessing each model using goodness-of-fit techniques.
Analysis of visual quality improvements provided by known tools for HDR content
NASA Astrophysics Data System (ADS)
Kim, Jaehwan; Alshina, Elena; Lee, JongSeok; Park, Youngo; Choi, Kwang Pyo
2016-09-01
In this paper, the visual quality of different solutions for high dynamic range (HDR) compression is analyzed using MPEG test content. We also simulate a method for efficient HDR compression that is based on the statistical properties of the signal. The method is compliant with the HEVC specification and is also easily compatible with alternative methods that might require HEVC specification changes. It was subjectively tested on commercial TVs and compared with alternative solutions for HDR coding. Subjective visual quality tests were performed on a SUHD TV (SAMSUNG JS9500) with a maximum luminance of up to 1000 nit. The solution based on statistical properties improves not only objective performance but also visual quality compared to other HDR solutions, while remaining compatible with the HEVC specification.
Analysis of Statistical Methods Currently used in Toxicology Journals
Na, Jihye; Yang, Hyeri
2014-01-01
Statistical methods are frequently used in toxicology, yet it is not clear whether the methods employed by the studies are used consistently and conducted on sound statistical grounds. The purpose of this paper is to describe statistical methods used in top toxicology journals. More specifically, we sampled 30 papers published in 2014 from Toxicology and Applied Pharmacology, Archives of Toxicology, and Toxicological Science and described the methodologies used to provide descriptive and inferential statistics. One hundred thirteen endpoints were observed in those 30 papers, and most studies had a sample size of less than 10, with the median and the mode being 6 and 3 & 6, respectively. The mean (105/113, 93%) was predominantly used to measure central tendency, and the standard error of the mean (64/113, 57%) and the standard deviation (39/113, 34%) were used to measure dispersion, while few studies provided justification for why the methods were selected. Inferential statistics were frequently conducted (93/113, 82%), with one-way ANOVA being the most popular (52/93, 56%), yet few studies conducted either a normality or an equal variance test. These results suggest that more consistent and appropriate use of statistical methods is necessary, which may enhance the role of toxicology in public health. PMID:25343012
Analysis of Statistical Methods Currently used in Toxicology Journals.
Na, Jihye; Yang, Hyeri; Bae, SeungJin; Lim, Kyung-Min
2014-09-01
Statistical methods are frequently used in toxicology, yet it is not clear whether the methods employed by the studies are used consistently and conducted on sound statistical grounds. The purpose of this paper is to describe statistical methods used in top toxicology journals. More specifically, we sampled 30 papers published in 2014 from Toxicology and Applied Pharmacology, Archives of Toxicology, and Toxicological Science and described the methodologies used to provide descriptive and inferential statistics. One hundred thirteen endpoints were observed in those 30 papers, and most studies had a sample size of less than 10, with the median and the mode being 6 and 3 & 6, respectively. The mean (105/113, 93%) was predominantly used to measure central tendency, and the standard error of the mean (64/113, 57%) and the standard deviation (39/113, 34%) were used to measure dispersion, while few studies provided justification for why the methods were selected. Inferential statistics were frequently conducted (93/113, 82%), with one-way ANOVA being the most popular (52/93, 56%), yet few studies conducted either a normality or an equal variance test. These results suggest that more consistent and appropriate use of statistical methods is necessary, which may enhance the role of toxicology in public health.
Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong
2015-01-01
Background Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. Objectives This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. Methods We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. Results There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. Conclusion The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent. 
PMID:26053876
SpeCond: a method to detect condition-specific gene expression
2011-01-01
Transcriptomic studies routinely measure expression levels across numerous conditions. These datasets allow identification of genes that are specifically expressed in a small number of conditions. However, there are currently no statistically robust methods for identifying such genes. Here we present SpeCond, a method to detect condition-specific genes that outperforms alternative approaches. We apply the method to a dataset of 32 human tissues to determine 2,673 specifically expressed genes. An implementation of SpeCond is freely available as a Bioconductor package at http://www.bioconductor.org/packages/release/bioc/html/SpeCond.html. PMID:22008066
Vomweg, T W; Buscema, M; Kauczor, H U; Teifke, A; Intraligi, M; Terzi, S; Heussel, C P; Achenbach, T; Rieker, O; Mayer, D; Thelen, M
2003-09-01
The aim of this study was to evaluate the capability of improved artificial neural networks (ANN) and additional novel training methods in distinguishing between benign and malignant breast lesions in contrast-enhanced magnetic resonance-mammography (MRM). A total of 604 histologically proven cases of contrast-enhanced lesions of the female breast at MRI were analyzed. Morphological, dynamic and clinical parameters were collected and stored in a database. The data set was divided into several groups using random or experimental methods [Training & Testing (T&T) algorithm] to train and test different ANNs. An additional novel computer program for input variable selection was applied. Sensitivity and specificity were calculated and compared with a statistical method and an expert radiologist. After optimization of the distribution of cases among the training and testing sets by the T & T algorithm and the reduction of input variables by the Input Selection procedure a highly sophisticated ANN achieved a sensitivity of 93.6% and a specificity of 91.9% in predicting malignancy of lesions within an independent prediction sample set. The best statistical method reached a sensitivity of 90.5% and a specificity of 68.9%. An expert radiologist performed better than the statistical method but worse than the ANN (sensitivity 92.1%, specificity 85.6%). Features extracted out of dynamic contrast-enhanced MRM and additional clinical data can be successfully analyzed by advanced ANNs. The quality of the resulting network strongly depends on the training methods, which are improved by the use of novel training tools. The best results of an improved ANN outperform expert radiologists.
Walsh, Daniel P.; Norton, Andrew S.; Storm, Daniel J.; Van Deelen, Timothy R.; Heisy, Dennis M.
2018-01-01
Implicit and explicit use of expert knowledge to inform ecological analyses is becoming increasingly common because it often represents the sole source of information in many circumstances. Thus, there is a need to develop statistical methods that explicitly incorporate expert knowledge, and can successfully leverage this information while properly accounting for associated uncertainty during analysis. Studies of cause-specific mortality provide an example of implicit use of expert knowledge when causes-of-death are uncertain and assigned based on the observer's knowledge of the most likely cause. To explicitly incorporate this use of expert knowledge and the associated uncertainty, we developed a statistical model for estimating cause-specific mortality using a data augmentation approach within a Bayesian hierarchical framework. Specifically, for each mortality event, we elicited the observer's belief of cause-of-death by having them specify the probability that the death was due to each potential cause. These probabilities were then used as prior predictive values within our framework. This hierarchical framework permitted a simple and rigorous estimation method that was easily modified to include covariate effects and regularizing terms. Although applied to survival analysis, this method can be extended to any event-time analysis with multiple event types, for which there is uncertainty regarding the true outcome. We conducted simulations to determine how our framework compared to traditional approaches that use expert knowledge implicitly and assume that cause-of-death is specified accurately. Simulation results supported the inclusion of observer uncertainty in cause-of-death assignment in modeling of cause-specific mortality to improve model performance and inference. Finally, we applied the statistical model we developed and a traditional method to cause-specific survival data for white-tailed deer, and compared results. 
We demonstrate that model selection results changed between the two approaches, and incorporating observer knowledge in cause-of-death increased the variability associated with parameter estimates when compared to the traditional approach. These differences between the two approaches can impact reported results, and therefore, it is critical to explicitly incorporate expert knowledge in statistical methods to ensure rigorous inference.
Shi, Y; Qi, F; Xue, Z; Chen, L; Ito, K; Matsuo, H; Shen, D
2008-04-01
This paper presents a new deformable model using both population-based and patient-specific shape statistics to segment lung fields from serial chest radiographs. There are two novelties in the proposed deformable model. First, a modified scale invariant feature transform (SIFT) local descriptor, which is more distinctive than the general intensity and gradient features, is used to characterize the image features in the vicinity of each pixel. Second, the deformable contour is constrained by both population-based and patient-specific shape statistics, which yields more robust and accurate segmentation of lung fields for serial chest radiographs. In particular, for segmenting the initial time-point images, the population-based shape statistics are used to constrain the deformable contour; as more subsequent images of the same patient are acquired, the patient-specific shape statistics, collected online from the previous segmentation results, gradually play a larger role. Thus, the patient-specific shape statistics are updated each time a new segmentation result is obtained, and they are further used to refine the segmentation results of all the available time-point images. Experimental results show that the proposed method is more robust and accurate than other active shape models in segmenting the lung fields from serial chest radiographs.
A Selective Overview of Variable Selection in High Dimensional Feature Space
Fan, Jianqing
2010-01-01
High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality. They have been widely applied for simultaneously selecting important variables and estimating their effects in high dimensional statistical inference. In this article, we present a brief account of the recent developments of theory, methods, and implementations for high dimensional variable selection. What limits of the dimensionality such methods can handle, what the role of penalty functions is, and what the statistical properties are rapidly drive the advances of the field. The properties of non-concave penalized likelihood and its roles in high dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high dimensional variable selection, with emphasis on independence screening and two-scale methods. PMID:21572976
Chi-squared and C statistic minimization for low count per bin data
NASA Astrophysics Data System (ADS)
Nousek, John A.; Shue, David R.
1989-07-01
Results are presented from a computer simulation comparing two statistical fitting techniques on data samples with large and small counts per bin; the results are then related specifically to X-ray astronomy. The Marquardt and Powell minimization techniques are compared by using both to minimize the chi-squared statistic. In addition, Cash's C statistic is applied, with Powell's method, and it is shown that the C statistic produces better fits in the low-count regime than chi-squared.
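The contrast between the two fit statistics can be sketched for the simplest possible model, a constant expected count per bin. The sketch below makes several assumptions that are not in the abstract: Cash's statistic is written as C = 2 * sum(m - n * ln m), the chi-squared variant uses the model value as the variance, the counts are hypothetical, and a plain golden-section search stands in for the Marquardt and Powell minimizers.

```python
import math

def cash_c(counts, model_rate):
    """Cash's C statistic for Poisson counts and a constant model rate."""
    return 2.0 * sum(model_rate - n * math.log(model_rate) for n in counts)

def pearson_chi2(counts, model_rate):
    """Chi-squared with the model value used as the variance of each bin."""
    return sum((n - model_rate) ** 2 / model_rate for n in counts)

def golden_section_minimize(func, low, high, tol=1e-8):
    """Minimize a unimodal one-dimensional function on [low, high]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = low, high
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if func(c) < func(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0

# Hypothetical low-count data: ten bins with a sample mean of 1.0.
counts = [0, 1, 0, 2, 1, 0, 3, 1, 0, 2]
rate_cash = golden_section_minimize(lambda m: cash_c(counts, m), 0.01, 10.0)
rate_chi2 = golden_section_minimize(lambda m: pearson_chi2(counts, m), 0.01, 10.0)
```

For a constant model, minimizing C recovers the sample mean, which is the Poisson maximum-likelihood estimate, whereas this chi-squared variant is minimized at the root mean square of the counts and therefore sits above the mean whenever the counts vary.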
Chi-squared and C statistic minimization for low count per bin data. [sampling in X ray astronomy]
NASA Technical Reports Server (NTRS)
Nousek, John A.; Shue, David R.
1989-01-01
Results are presented from a computer simulation comparing two statistical fitting techniques on data samples with large and small counts per bin; the results are then related specifically to X-ray astronomy. The Marquardt and Powell minimization techniques are compared by using both to minimize the chi-squared statistic. In addition, Cash's C statistic is applied, with Powell's method, and it is shown that the C statistic produces better fits in the low-count regime than chi-squared.
ERIC Educational Resources Information Center
van Krimpen-Stoop, Edith M. L. A.; Meijer, Rob R.
Person-fit research in the context of paper-and-pencil tests is reviewed, and some specific problems regarding person fit in the context of computerized adaptive testing (CAT) are discussed. Some new methods are proposed to investigate person fit in a CAT environment. These statistics are based on Statistical Process Control (SPC) theory. A…
Wu, Yazhou; Zhou, Liang; Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong
2015-01-01
Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent.
Normalization, bias correction, and peak calling for ChIP-seq
Diaz, Aaron; Park, Kiyoub; Lim, Daniel A.; Song, Jun S.
2012-01-01
Next-generation sequencing is rapidly transforming our ability to profile the transcriptional, genetic, and epigenetic states of a cell. In particular, sequencing DNA from the immunoprecipitation of protein-DNA complexes (ChIP-seq) and methylated DNA (MeDIP-seq) can reveal the locations of protein binding sites and epigenetic modifications. These approaches contain numerous biases which may significantly influence the interpretation of the resulting data. Rigorous computational methods for detecting and removing such biases are still lacking. Also, multi-sample normalization still remains an important open problem. This theoretical paper systematically characterizes the biases and properties of ChIP-seq data by comparing 62 separate publicly available datasets, using rigorous statistical models and signal processing techniques. Statistical methods for separating ChIP-seq signal from background noise, as well as correcting enrichment test statistics for sequence-dependent and sonication biases, are presented. Our method effectively separates reads into signal and background components prior to normalization, improving the signal-to-noise ratio. Moreover, most peak callers currently use a generic null model which suffers from low specificity at the sensitivity level requisite for detecting subtle, but true, ChIP enrichment. The proposed method of determining a cell type-specific null model, which accounts for cell type-specific biases, is shown to be capable of achieving a lower false discovery rate at a given significance threshold than current methods. PMID:22499706
Austin, Peter C.; van Klaveren, David; Vergouwe, Yvonne; Nieboer, Daan; Lee, Douglas S.; Steyerberg, Ewout W.
2017-01-01
Objective Validation of clinical prediction models traditionally refers to the assessment of model performance in new patients. We studied different approaches to geographic and temporal validation in the setting of multicenter data from two time periods. Study Design and Setting We illustrated different analytic methods for validation using a sample of 14,857 patients hospitalized with heart failure at 90 hospitals in two distinct time periods. Bootstrap resampling was used to assess internal validity. Meta-analytic methods were used to assess geographic transportability. Each hospital was used once as a validation sample, with the remaining hospitals used for model derivation. Hospital-specific estimates of discrimination (c-statistic) and calibration (calibration intercepts and slopes) were pooled using random effects meta-analysis methods. I2 statistics and prediction interval width quantified geographic transportability. Temporal transportability was assessed using patients from the earlier period for model derivation and patients from the later period for model validation. Results Estimates of reproducibility, pooled hospital-specific performance, and temporal transportability were on average very similar, with c-statistics of 0.75. Between-hospital variation was moderate according to I2 statistics and prediction intervals for c-statistics. Conclusion This study illustrates how performance of prediction models can be assessed in settings with multicenter data at different time periods. PMID:27262237
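Pooling hospital-specific estimates with a random-effects model can be sketched as follows. The abstract does not say which estimator was used, so the DerSimonian-Laird moment estimator below is an assumption, as are the function name and the example numbers; in practice c-statistics are often pooled on a transformed scale, which this sketch omits.

```python
import math

def dersimonian_laird(estimates, variances):
    """Random-effects pooling; returns (pooled, standard error, tau^2, I^2 in %)."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    # Cochran's Q measures dispersion around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)  # between-hospital variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, se, tau2, i2

# Hypothetical c-statistics from three hospitals with equal sampling variance.
pooled, se, tau2, i2 = dersimonian_laird([0.70, 0.75, 0.80], [0.0004, 0.0004, 0.0004])
```

The between-hospital variance tau2 is also the quantity that widens a prediction interval relative to the confidence interval of the pooled estimate.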
Statistical analysis of weigh-in-motion data for bridge design in Vermont.
DOT National Transportation Integrated Search
2014-10-01
This study investigates the suitability of the HL-93 live load model recommended by AASHTO LRFD Specifications for its use in the analysis and design of bridges in Vermont. The method of approach consists in performing a statistical analysis of w...
Zhang, Yun; Baheti, Saurabh; Sun, Zhifu
2018-05-01
High-throughput bisulfite methylation sequencing such as reduced representation bisulfite sequencing (RRBS), Agilent SureSelect Human Methyl-Seq (Methyl-seq) or whole-genome bisulfite sequencing is commonly used for base resolution methylome research. These data are represented either by the ratio of methylated cytosines to total coverage at a CpG site or by the numbers of methylated and unmethylated cytosines. Multiple statistical methods can be used to detect differentially methylated CpGs (DMCs) between conditions, and these methods are often the basis for the next step of differentially methylated region identification. The ratio data have the flexibility to be fitted with many linear models, whereas the raw count data take coverage information into account. There is an array of options for each data type for DMC detection; however, it is not clear which statistical method is optimal. In this study, we systematically evaluated four statistical methods on methylation ratio data and four methods on count-based data and compared their performance with regard to type I error control, sensitivity and specificity of DMC detection, and computational resource demands using real RRBS data along with simulation. Our results show that the ratio-based tests are generally more conservative (less sensitive) than the count-based tests. However, some count-based methods have high false-positive rates and should be avoided. The beta-binomial model gives a good balance between sensitivity and specificity and is the preferred method. Selection of methods in different settings, signal versus noise, and sample size estimation are also discussed.
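To make the count-based view concrete, here is a minimal sketch of a beta-binomial log-likelihood for one CpG site, where each sample contributes a methylated count and a total coverage. The (alpha, beta) parameterization, the function names, and the counts are illustrative assumptions; the abstract does not describe how the evaluated tools parameterize or fit the model.

```python
import math

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_logpmf(methylated, coverage, alpha, beta):
    """Log-probability of `methylated` reads out of `coverage` reads."""
    log_choose = (math.lgamma(coverage + 1) - math.lgamma(methylated + 1)
                  - math.lgamma(coverage - methylated + 1))
    return (log_choose
            + log_beta(methylated + alpha, coverage - methylated + beta)
            - log_beta(alpha, beta))

def site_log_likelihood(samples, alpha, beta):
    """Sum of log-probabilities over (methylated, coverage) pairs at one site."""
    return sum(beta_binomial_logpmf(m, n, alpha, beta) for m, n in samples)

# Hypothetical (methylated, coverage) counts for four samples at one CpG.
samples = [(8, 10), (15, 20), (3, 12), (9, 9)]
loglik = site_log_likelihood(samples, alpha=2.0, beta=1.5)
```

Unlike a plain ratio, this likelihood weights a sample with coverage 20 more heavily than one with coverage 9, and the extra dispersion parameter absorbs sample-to-sample variation in the underlying methylation level.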
A likelihood ratio test for evolutionary rate shifts and functional divergence among proteins
Knudsen, Bjarne; Miyamoto, Michael M.
2001-01-01
Changes in protein function can lead to changes in the selection acting on specific residues. This can often be detected as evolutionary rate changes at the sites in question. A maximum-likelihood method for detecting evolutionary rate shifts at specific protein positions is presented. The method determines significance values of the rate differences to give a sound statistical foundation for the conclusions drawn from the analyses. A statistical test for detecting slowly evolving sites is also described. The methods are applied to a set of Myc proteins for the identification of both conserved sites and those with changing evolutionary rates. Those positions with conserved and changing rates are related to the structures and functions of their proteins. The results are compared with an earlier Bayesian method, thereby highlighting the advantages of the new likelihood ratio tests. PMID:11734650
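The general shape of a likelihood ratio test can be sketched as below. This is a generic illustration, not the procedure from the paper: it assumes two nested models that differ by a single free parameter and a chi-squared reference distribution with one degree of freedom, and the log-likelihood values are hypothetical.

```python
import math

def likelihood_ratio_test(loglik_null, loglik_alt):
    """Return (statistic, p-value) assuming a chi-squared null with 1 d.f."""
    statistic = max(0.0, 2.0 * (loglik_alt - loglik_null))
    # For one degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical maximized log-likelihoods: one shared rate versus two rates.
statistic, p_value = likelihood_ratio_test(loglik_null=-100.0, loglik_alt=-98.0)
```

When the null hypothesis places a parameter on the boundary of its allowed range, the one-degree-of-freedom chi-squared reference no longer applies exactly, so the p-value from this sketch should be read as approximate.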
RAId_aPS: MS/MS Analysis with Multiple Scoring Functions and Spectrum-Specific Statistics
Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo
2010-01-01
Statistically meaningful comparison/combination of peptide identification results from various search methods is impeded by the lack of a universal statistical standard. Providing an E-value calibration protocol, we demonstrated earlier the feasibility of translating either the score or heuristic E-value reported by any method into the textbook-defined E-value, which may serve as the universal statistical standard. This protocol, although robust, may lose spectrum-specific statistics and might require a new calibration when changes in experimental setup occur. To mitigate these issues, we developed a new MS/MS search tool, RAId_aPS, that is able to provide spectrum-specific E-values for additive scoring functions. Given a selection of scoring functions out of RAId score, K-score, Hyperscore and XCorr, RAId_aPS generates the corresponding score histograms of all possible peptides using dynamic programming. Using these score histograms to assign E-values enables a calibration-free protocol for accurate significance assignment for each scoring function. RAId_aPS features four different modes: (i) compute the total number of possible peptides for a given molecular mass range, (ii) generate the score histogram given a MS/MS spectrum and a scoring function, (iii) reassign E-values for a list of candidate peptides given a MS/MS spectrum and the scoring functions chosen, and (iv) perform database searches using selected scoring functions. In modes (iii) and (iv), RAId_aPS is also capable of combining results from different scoring functions using spectrum-specific statistics. The web link is http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/raid_aps/index.html. Relevant binaries for Linux, Windows, and Mac OS X are available from the same page. PMID:21103371
Statistical evaluation of forecasts
NASA Astrophysics Data System (ADS)
Mader, Malenka; Mader, Wolfgang; Gluckman, Bruce J.; Timmer, Jens; Schelter, Björn
2014-08-01
Reliable forecasts of extreme but rare events, such as earthquakes, financial crashes, and epileptic seizures, would render interventions and precautions possible. Therefore, forecasting methods have been developed which intend to raise an alarm if an extreme event is about to occur. In order to statistically validate the performance of a prediction system, it must be compared to the performance of a random predictor, which raises alarms independent of the events. Such a random predictor can be obtained by bootstrapping or analytically. We propose an analytic statistical framework which, in contrast to conventional methods, allows for validating independently the sensitivity and specificity of a forecasting method. Moreover, our method accounts for the periods during which an event has to remain absent or occur after a respective forecast.
Tai, Patricia; Yu, Edward; Cserni, Gábor; Vlastos, Georges; Royce, Melanie; Kunkler, Ian; Vinh-Hung, Vincent
2005-01-01
Background The present commonly used five-year survival rates are not adequate to represent the statistical cure. In the present study, we established the minimum number of years required for follow-up to estimate statistical cure rate, by using a lognormal distribution of the survival time of those who died of their cancer. We introduced the term threshold year: the follow-up time by which the survival data of patients dying from the specific cancer are mostly covered, leaving less than 2.25% uncovered. This is close enough to cure from that specific cancer. Methods Data from the Surveillance, Epidemiology and End Results (SEER) database were tested to determine whether the survival times of cancer patients who died of their disease followed the lognormal distribution, using a minimum chi-square method. Patients diagnosed from 1973–1992 in the registries of Connecticut and Detroit were chosen so that a maximum of 27 years was allowed for follow-up to 1999. A total of 49 specific organ sites were tested. The parameters of those lognormal distributions were found for each cancer site. The cancer-specific survival rates at the threshold years were compared with the longest available Kaplan-Meier survival estimates. Results The characteristics of the cancer-specific survival times of cancer patients who died of their disease from 42 cancer sites out of 49 sites were verified to follow different lognormal distributions. The threshold years validated for statistical cure varied for different cancer sites, from 2.6 years for pancreas cancer to 25.2 years for cancer of salivary gland. At the threshold year, the statistical cure rates estimated for 40 cancer sites were found to match the actuarial long-term survival rates estimated by the Kaplan-Meier method within six percentage points. For two cancer sites, breast and thyroid, the threshold years were so long that the cancer-specific survival rates could not yet be obtained because the SEER data do not provide sufficiently long follow-up.
Conclusion The present study suggests a certain threshold year is required to wait before the statistical cure rate can be estimated for each cancer site. For some cancers, such as breast and thyroid, the 5- or 10-year survival rates inadequately reflect statistical cure rates, and highlight the need for long-term follow-up of these patients. PMID:15904508
We have previously developed a statistical method to identify gene sets enriched with condition-specific genetic dependencies. The method constructs gene dependency networks from bootstrapped samples in one condition and computes the divergence between distributions of network likelihood scores from different conditions. It was shown to be capable of sensitive and specific identification of pathways with phenotype-specific dysregulation, i.e., rewiring of dependencies between genes in different conditions.
NASA Astrophysics Data System (ADS)
Guillen, George; Rainey, Gail; Morin, Michelle
2004-04-01
Currently, the Minerals Management Service uses the Oil Spill Risk Analysis model (OSRAM) to predict the movement of potential oil spills greater than 1000 bbl originating from offshore oil and gas facilities. OSRAM generates oil spill trajectories using meteorological and hydrological data input from either actual physical measurements or estimates generated from other hydrological models. OSRAM and many other models produce output matrices of average, maximum and minimum contact probabilities to specific landfall or target segments (columns) from oil spills at specific points (rows). Analysts and managers are often interested in identifying geographic areas or groups of facilities that pose similar risks to specific targets or groups of targets if a spill occurred. Unfortunately, due to the potentially large matrix generated by many spill models, this question is difficult to answer without the use of data reduction and visualization methods. In our study we utilized a multivariate statistical method called cluster analysis to group areas of similar risk based on potential distribution of landfall target trajectory probabilities. We also utilized ArcView™ GIS to display spill launch point groupings. The combination of GIS and multivariate statistical techniques in the post-processing of trajectory model output is a powerful tool for identifying and delineating areas of similar risk from multiple spill sources. We strongly encourage modelers and statistical and GIS software programmers to collaborate closely to produce a more seamless integration of these technologies and approaches to analyzing data. They are complementary methods that strengthen the overall assessment of spill risks.
Wu, Zheyang; Yang, Chun; Tang, Dalin
2011-06-01
It has been hypothesized that mechanical risk factors may be used to predict future atherosclerotic plaque rupture. Truly predictive methods for plaque rupture and methods to identify the best predictor(s) from all the candidates are lacking in the literature. A novel combination of computational and statistical models based on serial magnetic resonance imaging (MRI) was introduced to quantify sensitivity and specificity of mechanical predictors to identify the best candidate for plaque rupture site prediction. Serial in vivo MRI data of carotid plaque from one patient was acquired with follow-up scan showing ulceration. 3D computational fluid-structure interaction (FSI) models using both baseline and follow-up data were constructed and plaque wall stress (PWS) and strain (PWSn) and flow maximum shear stress (FSS) were extracted from all 600 matched nodal points (100 points per matched slice, baseline matching follow-up) on the lumen surface for analysis. Each of the 600 points was marked "ulcer" or "nonulcer" using follow-up scan. Predictive statistical models for each of the seven combinations of PWS, PWSn, and FSS were trained using the follow-up data and applied to the baseline data to assess their sensitivity and specificity using the 600 data points for ulcer predictions. Sensitivity of prediction is defined as the proportion of the true positive outcomes that are predicted to be positive. Specificity of prediction is defined as the proportion of the true negative outcomes that are correctly predicted to be negative. Using probability 0.3 as a threshold to infer ulcer occurrence at the prediction stage, the combination of PWS and PWSn provided the best predictive accuracy with (sensitivity, specificity) = (0.97, 0.958). Sensitivity and specificity given by PWS, PWSn, and FSS individually were (0.788, 0.968), (0.515, 0.968), and (0.758, 0.928), respectively. 
The proposed computational-statistical process provides a novel method and a framework to assess the sensitivity and specificity of various risk indicators and offers the potential to identify the optimized predictor for plaque rupture using serial MRI with follow-up scan showing ulceration as the gold standard for method validation. While serial MRI data with actual rupture are hard to acquire, this single-case study suggests that combination of multiple predictors may provide potential improvement to existing plaque assessment schemes. With large-scale patient studies, this predictive modeling process may provide more solid ground for rupture predictor selection strategies and methods for image-based plaque vulnerability assessment.
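The definitions of sensitivity and specificity used in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name and the toy probabilities and labels are invented for the example, and treating a predicted probability equal to the threshold as a positive ("ulcer") call is an assumption, since the abstract does not say how ties are handled.

```python
def sensitivity_specificity(probabilities, labels, threshold=0.3):
    """Return (sensitivity, specificity) for thresholded predictions.

    probabilities: predicted probability of "ulcer" at each point.
    labels: True where the follow-up scan marked the point "ulcer".
    """
    tp = fn = tn = fp = 0
    for p, is_ulcer in zip(probabilities, labels):
        predicted_ulcer = p >= threshold  # tie handling is an assumption
        if is_ulcer and predicted_ulcer:
            tp += 1
        elif is_ulcer:
            fn += 1
        elif predicted_ulcer:
            fp += 1
        else:
            tn += 1
    # Sensitivity: share of true positives predicted positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of true negatives predicted negative.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data (invented for illustration; not the study's 600 nodal points).
toy_probs = [0.9, 0.4, 0.2, 0.1, 0.35, 0.05]
toy_labels = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(toy_probs, toy_labels, threshold=0.3)
```

Raising the threshold generally trades sensitivity for specificity, which is why a reported pair such as (0.97, 0.958) is meaningful only together with the threshold that produced it.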
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mayer, B. P.; Valdez, C. A.; DeHope, A. J.
Critical to many modern forensic investigations is the chemical attribution of the origin of an illegal drug. This process greatly relies on identification of compounds indicative of its clandestine or commercial production. The results of these studies can yield detailed information on method of manufacture, sophistication of the synthesis operation, starting material source, and final product. In the present work, chemical attribution signatures (CAS) associated with the synthesis of the analgesic 3-methylfentanyl, N-(3-methyl-1-phenethylpiperidin-4-yl)-N-phenylpropanamide, were investigated. Six synthesis methods were studied in an effort to identify and classify route-specific signatures. These methods were chosen to minimize the use of scheduled precursors, complicated laboratory equipment, number of overall steps, and demanding reaction conditions. Using gas and liquid chromatographies combined with mass spectrometric methods (GC-QTOF and LC-QTOF) in conjunction with inductively coupled plasma mass spectrometry (ICP-MS), over 240 distinct compounds and elements were monitored. As seen in our previous work with CAS of fentanyl synthesis, the complexity of the resultant data matrix necessitated the use of multivariate statistical analysis. Using partial least squares discriminant analysis (PLS-DA), 62 statistically significant, route-specific CAS were identified. Statistical classification models using a variety of machine learning techniques were then developed with the ability to predict the method of 3-methylfentanyl synthesis from three blind crude samples generated by synthetic chemists without prior experience with these methods.
Quality evaluation of no-reference MR images using multidirectional filters and image statistics.
Jang, Jinseong; Bang, Kihun; Jang, Hanbyol; Hwang, Dosik
2018-09-01
This study aimed to develop a fully automatic, no-reference image-quality assessment (IQA) method for MR images. New quality-aware features were obtained by applying multidirectional filters to MR images and examining the feature statistics. A histogram of these features was then fitted to a generalized Gaussian distribution function for which the shape parameters yielded different values depending on the type of distortion in the MR image. Standard feature statistics were established through a training process based on high-quality MR images without distortion. Subsequently, the feature statistics of a test MR image were calculated and compared with the standards. The quality score was calculated as the difference between the shape parameters of the test image and the undistorted standard images. The proposed IQA method showed a >0.99 correlation with the conventional full-reference assessment methods; accordingly, this proposed method yielded the best performance among no-reference IQA methods for images containing six types of synthetic, MR-specific distortions. In addition, for authentically distorted images, the proposed method yielded the highest correlation with subjective assessments by human observers, thus demonstrating its superior performance over other no-reference IQAs. Our proposed IQA was designed to consider MR-specific features and outperformed other no-reference IQAs designed mainly for photographic images. Magn Reson Med 80:914-924, 2018. © 2018 International Society for Magnetic Resonance in Medicine.
Estimating Janka hardness from specific gravity for tropical and temperate species
Michael C. Wiemann; David W. Green
2007-01-01
Using mean values for basic (green) specific gravity and Janka side hardness for individual species obtained from the world literature, regression equations were developed to predict side hardness from specific gravity. Statistical and graphical methods showed that the hardness–specific gravity relationship is the same for tropical and temperate hardwoods, but that the...
Transport Coefficients from Large Deviation Functions
NASA Astrophysics Data System (ADS)
Gao, Chloe; Limmer, David
2017-10-01
We describe a method for computing transport coefficients from the direct evaluation of large deviation functions. This method is general, relying only on equilibrium fluctuations, and is statistically efficient, employing trajectory-based importance sampling. Equilibrium fluctuations of molecular currents are characterized by their large deviation functions, which are scaled cumulant generating functions analogous to the free energy. A diffusion Monte Carlo algorithm is used to evaluate the large deviation functions, from which arbitrary transport coefficients are derivable. We find significant statistical improvement over traditional Green-Kubo based calculations. The systematic and statistical errors of this method are analyzed in the context of specific transport coefficient calculations, including the shear viscosity, interfacial friction coefficient, and thermal conductivity.
Statistical procedures for analyzing mental health services data.
Elhai, Jon D; Calhoun, Patrick S; Ford, Julian D
2008-08-15
In mental health services research, analyzing service utilization data often poses serious problems, given the presence of substantially skewed data distributions. This article presents a non-technical introduction to statistical methods specifically designed to handle the complexly distributed datasets that represent mental health service use, including Poisson, negative binomial, zero-inflated, and zero-truncated regression models. A flowchart is provided to assist the investigator in selecting the most appropriate method. Finally, a dataset of mental health service use reported by medical patients is described, and a comparison of results across several different statistical methods is presented. Implications of matching data analytic techniques appropriately with the often complexly distributed datasets of mental health services utilization variables are discussed.
Morris, Jeffrey S
2012-01-01
In recent years, developments in molecular biotechnology have led to the increased promise of detecting and validating biomarkers, or molecular markers that relate to various biological or medical outcomes. Proteomics, the direct study of proteins in biological samples, plays an important role in the biomarker discovery process. These technologies produce complex, high dimensional functional and image data that present many analytical challenges that must be addressed properly for effective comparative proteomics studies that can yield potential biomarkers. Specific challenges include experimental design, preprocessing, feature extraction, and statistical analysis accounting for the inherent multiple testing issues. This paper reviews various computational aspects of comparative proteomic studies, and summarizes contributions I along with numerous collaborators have made. First, there is an overview of comparative proteomics technologies, followed by a discussion of important experimental design and preprocessing issues that must be considered before statistical analysis can be done. Next, the two key approaches to analyzing proteomics data, feature extraction and functional modeling, are described. Feature extraction involves detection and quantification of discrete features like peaks or spots that theoretically correspond to different proteins in the sample. After an overview of the feature extraction approach, specific methods for mass spectrometry ( Cromwell ) and 2D gel electrophoresis ( Pinnacle ) are described. The functional modeling approach involves modeling the proteomic data in their entirety as functions or images. A general discussion of the approach is followed by the presentation of a specific method that can be applied, wavelet-based functional mixed models, and its extensions. All methods are illustrated by application to two example proteomic data sets, one from mass spectrometry and one from 2D gel electrophoresis. 
While the specific methods presented are applied to two specific proteomic technologies, MALDI-TOF and 2D gel electrophoresis, these methods and the other principles discussed in the paper apply much more broadly to other expression proteomics technologies.
2011-01-01
Background Monitoring the time course of mortality by cause is a key public health issue. However, several mortality data production changes may affect cause-specific time trends, thus altering the interpretation. This paper proposes a statistical method that detects abrupt changes ("jumps") and estimates correction factors that may be used for further analysis. Methods The method was applied to a subset of the AMIEHS (Avoidable Mortality in the European Union, toward better Indicators for the Effectiveness of Health Systems) project mortality database and considered for six European countries and 13 selected causes of deaths. For each country and cause of death, an automated jump detection method called Polydect was applied to the log mortality rate time series. The plausibility of a data production change associated with each detected jump was evaluated through literature search or feedback obtained from the national data producers. For each plausible jump position, the statistical significance of the between-age and between-gender jump amplitude heterogeneity was evaluated by means of a generalized additive regression model, and correction factors were deduced from the results. Results Forty-nine jumps were detected by the Polydect method from 1970 to 2005. Most of the detected jumps were found to be plausible. The age- and gender-specific amplitudes of the jumps were estimated when they were statistically heterogeneous, and they showed greater by-age heterogeneity than by-gender heterogeneity. Conclusion The method presented in this paper was successfully applied to a large set of causes of death and countries. The method appears to be an alternative to bridge coding methods when the latter are not systematically implemented because they are time- and resource-consuming. PMID:21929756
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of the acute myocardial infarction (AMI) incidence rate in Tianjin from 1999 to 2013 using the Cochran-Armitage trend (CAT) test and linear regression analysis, and compared the results. Based on actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trend (Cochran-Armitage trend P value
Wu, Hao
2018-05-01
In structural equation modelling (SEM), a robust adjustment to the test statistic or to its reference distribution is needed when its null distribution deviates from a χ² distribution, which usually arises when data do not follow a multivariate normal distribution. Unfortunately, existing studies on this issue typically focus on only a few methods and neglect the majority of alternative methods in statistics. Existing simulation studies typically consider only non-normal distributions of data that either satisfy asymptotic robustness or lead to an asymptotic scaled χ² distribution. In this work we conduct a comprehensive study that involves both typical methods in SEM and less well-known methods from the statistics literature. We also propose the use of several novel non-normal data distributions that are qualitatively different from the non-normal distributions widely used in existing studies. We found that several under-studied methods give the best performance under specific conditions, but the Satorra-Bentler method remains the most viable method for most situations. © 2017 The British Psychological Society.
Mixed-Methods Research in the Discipline of Nursing.
Beck, Cheryl Tatano; Harrison, Lisa
2016-01-01
In this review article, we examined the prevalence and characteristics of 294 mixed-methods studies in the discipline of nursing. Creswell and Plano Clark's typology was most frequently used along with concurrent timing. Bivariate statistics was most often the highest level of statistics reported in the results. As for qualitative data analysis, content analysis was most frequently used. The majority of nurse researchers did not specifically address the purpose, paradigm, typology, priority, timing, interaction, or integration of their mixed-methods studies. Strategies are suggested for improving the design, conduct, and reporting of mixed-methods studies in the discipline of nursing.
Smith, Joseph M.; Mather, Martha E.
2012-01-01
Ecological indicators are science-based tools used to assess how human activities have impacted environmental resources. For monitoring and environmental assessment, existing species assemblage data can be used to make these comparisons through time or across sites. An impediment to using assemblage data, however, is that these data are complex and need to be simplified in an ecologically meaningful way. Because multivariate statistics are mathematical relationships, statistical groupings may not make ecological sense and will not have utility as indicators. Our goal was to define a process to select defensible and ecologically interpretable statistical simplifications of assemblage data in which researchers and managers can have confidence. For this, we chose a suite of statistical methods, compared the groupings that resulted from these analyses, identified convergence among groupings, then we interpreted the groupings using species and ecological guilds. When we tested this approach using a statewide stream fish dataset, not all statistical methods worked equally well. For our dataset, logistic regression (Log), detrended correspondence analysis (DCA), cluster analysis (CL), and non-metric multidimensional scaling (NMDS) provided consistent, simplified output. Specifically, the Log, DCA, CL-1, and NMDS-1 groupings were ≥60% similar to each other, overlapped with the fluvial-specialist ecological guild, and contained a common subset of species. Groupings based on number of species (e.g., Log, DCA, CL and NMDS) outperformed groupings based on abundance [e.g., principal components analysis (PCA) and Poisson regression]. Although the specific methods that worked on our test dataset have generality, here we are advocating a process (e.g., identifying convergent groupings with redundant species composition that are ecologically interpretable) rather than the automatic use of any single statistical tool. 
We summarize this process in step-by-step guidance for the future use of these commonly available ecological and statistical methods in preparing assemblage data for use in ecological indicators.
Rollins, Derrick K; Teh, Ailing
2010-12-17
Microarray data sets provide relative expression levels for thousands of genes for a small number, in comparison, of different experimental conditions called assays. Data mining techniques are used to extract specific information of genes as they relate to the assays. The multivariate statistical technique of principal component analysis (PCA) has proven useful in providing effective data mining methods. This article extends the PCA approach of Rollins et al. to the development of ranking genes of microarray data sets that express most differently between two biologically different grouping of assays. This method is evaluated on real and simulated data and compared to a current approach on the basis of false discovery rate (FDR) and statistical power (SP) which is the ability to correctly identify important genes. This work developed and evaluated two new test statistics based on PCA and compared them to a popular method that is not PCA based. Both test statistics were found to be effective as evaluated in three case studies: (i) exposing E. coli cells to two different ethanol levels; (ii) application of myostatin to two groups of mice; and (iii) a simulated data study derived from the properties of (ii). The proposed method (PM) effectively identified critical genes in these studies based on comparison with the current method (CM). The simulation study supports higher identification accuracy for PM over CM for both proposed test statistics when the gene variance is constant and for one of the test statistics when the gene variance is non-constant. PM compares quite favorably to CM in terms of lower FDR and much higher SP. Thus, PM can be quite effective in producing accurate signatures from large microarray data sets for differential expression between assays groups identified in a preliminary step of the PCA procedure and is, therefore, recommended for use in these applications.
This editorial is the first of a series that each explains one practical aspect of statistics specifically tailored for biomarker data. Each editorial is focused on a very specific concept and gives the rationale, specific method, and a real-world example of a useful tool for da...
Gunsolus, Ian L; Jaffe, Allan S; Sexter, Anne; Schulz, Karen; Ler, Ranka; Lindgren, Brittany; Saenger, Amy K; Love, Sara A; Apple, Fred S
2017-12-01
Our purpose was to determine a) overall and sex-specific 99th percentile upper reference limits (URL) and b) influences of statistical methods and comorbidities on the URLs. Heparin plasma from 838 normal subjects (423 men, 415 women) was obtained from the AACC (Universal Sample Bank). The cobas e602 measured cTnT (Roche Gen 5 assay); limit of detection (LoD), 3 ng/L. Hemoglobin A1c (URL 6.5%), NT-proBNP (URL 125 ng/L) and eGFR (60 mL/min/1.73 m²) were measured, along with identification of statin use, to better define normality. 99th percentile URLs were determined by the non-parametric (NP), Harrell-Davis Estimator (HDE) and Robust (R) methods. 355 men and 339 women remained after exclusions. Overall, <50% of subjects had measurable concentrations ≥ LoD: 45.6% no exclusion, 43.5% after exclusion; compared to men: 68.1% no exclusion, 65.1% post exclusion; women: 22.7% no exclusion, 20.9% post exclusion. The statistical method used influenced URLs as follows: pre/post exclusion overall, NP 16/16 ng/L, HDE 17/17 ng/L, R not available; men NP 18/16 ng/L, HDE 21/19 ng/L, R 16/11 ng/L; women NP 13/10 ng/L, HDE 14/14 ng/L, R not available. We demonstrated that a) the Gen 5 cTnT assay does not meet the IFCC guideline for high-sensitivity assays, b) surrogate biomarkers significantly lower the URLs and c) the statistical methods used impact URLs. Our data suggest lower sex-specific cTnT 99th percentiles than reported in the FDA approved package insert. We emphasize the importance of detailing the criteria used to include and exclude subjects for defining a healthy population and the statistical method used to calculate 99th percentiles and identify outliers. Copyright © 2017 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
NASA Technical Reports Server (NTRS)
Brown, Andrew M.
2014-01-01
Numerical and Analytical methods developed to determine damage accumulation in specific engine components when speed variation included. Dither Life Ratio shown to be well over factor of 2 for specific example. Steady-State assumption shown to be accurate for most turbopump cases, allowing rapid calculation of DLR. If hot-fire speed data unknown, Monte Carlo method developed that uses speed statistics for similar engines. Application of techniques allow analyst to reduce both uncertainty and excess conservatism. High values of DLR could allow previously unacceptable part to pass HCF criteria without redesign. Given benefit and ease of implementation, recommend that any finite life turbomachine component analysis adopt these techniques. Probability Values calculated, compared, and evaluated for several industry-proposed methods for combining random and harmonic loads. Two new excel macros written to calculate combined load for any specific probability level. Closed form Curve fits generated for widely used 3(sigma) and 2(sigma) probability levels. For design of lightweight aerospace components, obtaining accurate, reproducible, statistically meaningful answer critical.
Methods and statistics for combining motif match scores.
Bailey, T L; Gribskov, M
1998-01-01
Position-specific scoring matrices are useful for representing and searching for protein sequence motifs. A sequence family can often be described by a group of one or more motifs, and an effective search must combine the scores for matching a sequence to each of the motifs in the group. We describe three methods for combining match scores and estimating the statistical significance of the combined scores and evaluate the search quality (classification accuracy) and the accuracy of the estimate of statistical significance of each. The three methods are: 1) sum of scores, 2) sum of reduced variates, 3) product of score p-values. We show that method 3) is superior to the other two methods in both regards, and that combining motif scores indeed gives better search accuracy. The MAST sequence homology search algorithm utilizing the product of p-values scoring method is available for interactive use and downloading at URL http://www.sdsc.edu/MEME.
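Method 3), the product of score p-values, can be illustrated with the standard closed form for the distribution of a product of independent uniform variates. The sketch below assumes the individual p-values are independent and uniformly distributed under the null hypothesis; the function name is hypothetical, and the code is not taken from MAST. For n p-values with product x, the combined p-value is x · Σ_{i=0}^{n−1} (−ln x)^i / i!.

```python
import math


def product_pvalue(pvalues):
    """Combined p-value for the product of independent p-values.

    Assumes each p-value is uniform on (0, 1] under the null and that
    the p-values are mutually independent (assumption of this sketch).
    """
    n = len(pvalues)
    product = math.prod(pvalues)
    if product <= 0.0:
        return 0.0  # a zero p-value makes the product maximally significant
    log_term = -math.log(product)
    term = 1.0   # holds (-ln x)^i / i!, starting at i = 0
    total = 0.0
    for i in range(n):
        total += term
        term *= log_term / (i + 1)
    return product * total


# Two motif matches, each with p = 0.1: the raw product 0.01 overstates
# significance; the corrected value is 0.01 * (1 + ln 100).
combined = product_pvalue([0.1, 0.1])
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the formula.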
Statistical scaling of geometric characteristics in stochastically generated pore microstructures
Hyman, Jeffrey D.; Guadagnini, Alberto; Winter, C. Larrabee
2015-05-21
In this study, we analyze the statistical scaling of structural attributes of virtual porous microstructures that are stochastically generated by thresholding Gaussian random fields. Characterization of the extent at which randomly generated pore spaces can be considered as representative of a particular rock sample depends on the metrics employed to compare the virtual sample against its physical counterpart. Typically, comparisons against features and/or patterns of geometric observables, e.g., porosity and specific surface area, flow-related macroscopic parameters, e.g., permeability, or autocorrelation functions are used to assess the representativeness of a virtual sample, and thereby the quality of the generation method. Here, we rely on manifestations of statistical scaling of geometric observables which were recently observed in real millimeter scale rock samples [13] as additional relevant metrics by which to characterize a virtual sample. We explore the statistical scaling of two geometric observables, namely porosity (Φ) and specific surface area (SSA), of porous microstructures generated using the method of Smolarkiewicz and Winter [42] and Hyman and Winter [22]. Our results suggest that the method can produce virtual pore space samples displaying the symptoms of statistical scaling observed in real rock samples. Order q sample structure functions (statistical moments of absolute increments) of Φ and SSA scale as a power of the separation distance (lag) over a range of lags, and extended self-similarity (linear relationship between log structure functions of successive orders) appears to be an intrinsic property of the generated media. The width of the range of lags where power-law scaling is observed and the Hurst coefficient associated with the variables we consider can be controlled by the generation parameters of the method.
Statistical Methodology for the Analysis of Repeated Duration Data in Behavioral Studies
ERIC Educational Resources Information Center
Letué, Frédérique; Martinez, Marie-José; Samson, Adeline; Vilain, Anne; Vilain, Coriandre
2018-01-01
Purpose: Repeated duration data are frequently used in behavioral studies. Classical linear or log-linear mixed models are often inadequate to analyze such data, because they usually consist of nonnegative and skew-distributed variables. Therefore, we recommend use of a statistical methodology specific to duration data. Method: We propose a…
NASA Astrophysics Data System (ADS)
Liu, Bilan; Qiu, Xing; Zhu, Tong; Tian, Wei; Hu, Rui; Ekholm, Sven; Schifitto, Giovanni; Zhong, Jianhui
2016-03-01
Subject-specific longitudinal DTI study is vital for investigation of pathological changes of lesions and disease evolution. Spatial Regression Analysis of Diffusion tensor imaging (SPREAD) is a non-parametric permutation-based statistical framework that combines spatial regression and resampling techniques to achieve effective detection of localized longitudinal diffusion changes within the whole brain at individual level without a priori hypotheses. However, boundary blurring and dislocation limit its sensitivity, especially towards detecting lesions of irregular shapes. In the present study, we propose an improved SPREAD (dubbed improved SPREAD, or iSPREAD) method by incorporating a three-dimensional (3D) nonlinear anisotropic diffusion filtering method, which provides edge-preserving image smoothing through a nonlinear scale space approach. The statistical inference based on iSPREAD was evaluated and compared with the original SPREAD method using both simulated and in vivo human brain data. Results demonstrated that the sensitivity and accuracy of the SPREAD method has been improved substantially by adapting nonlinear anisotropic filtering. iSPREAD identifies subject-specific longitudinal changes in the brain with improved sensitivity, accuracy, and enhanced statistical power, especially when the spatial correlation is heterogeneous among neighboring image pixels in DTI.
Lewis, Gregory F.; Furman, Senta A.; McCool, Martha F.; Porges, Stephen W.
2011-01-01
Three frequently used RSA metrics are investigated to document violations of assumptions for parametric analyses, moderation by respiration, influences of nonstationarity, and sensitivity to vagal blockade. Although all metrics are highly correlated, new findings illustrate that the metrics are noticeably different on the above dimensions. Only one method conforms to the assumptions for parametric analyses, is not moderated by respiration, is not influenced by nonstationarity, and reliably generates stronger effect sizes. Moreover, this method is also the most sensitive to vagal blockade. Specific features of this method may provide insights into improving the statistical characteristics of other commonly used RSA metrics. These data provide the evidence to question, based on statistical grounds, published reports using particular metrics of RSA. PMID:22138367
A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research
van de Schoot, Rens; Kaplan, David; Denissen, Jaap; Asendorpf, Jens B; Neyer, Franz J; van Aken, Marcel AG
2014-01-01
Bayesian statistical methods are becoming ever more popular in applied and fundamental research. In this study a gentle introduction to Bayesian analysis is provided. It is shown under what circumstances it is attractive to use Bayesian estimation, and how to interpret properly the results. First, the ingredients underlying Bayesian methods are introduced using a simplified example. Thereafter, the advantages and pitfalls of the specification of prior knowledge are discussed. To illustrate Bayesian methods explained in this study, in a second example a series of studies that examine the theoretical framework of dynamic interactionism are considered. In the Discussion the advantages and disadvantages of using Bayesian statistics are reviewed, and guidelines on how to report on Bayesian statistics are provided. PMID:24116396
A statistical method to estimate low-energy hadronic cross sections
NASA Astrophysics Data System (ADS)
Balassa, Gábor; Kovács, Péter; Wolf, György
2018-02-01
In this article we propose a model based on the Statistical Bootstrap approach to estimate the cross sections of different hadronic reactions up to a few GeV in c.m.s. energy. The method is based on the idea that when two particles collide, a so-called fireball is formed, which after a short time decays statistically into a specific final state. To calculate the probabilities we use a phase space description extended with quark combinatorial factors and the possibility of more than one fireball formation. In a few simple cases the probability of a specific final state can be calculated analytically, where we show that the model is able to reproduce the ratios of the considered cross sections. We also show that the model is able to describe proton-antiproton annihilation at rest. In the latter case we used a numerical method to calculate the more complicated final state probabilities. Additionally, we examined the formation of strange and charmed mesons as well, where we used existing data to fit the relevant model parameters.
Meta-analysis of diagnostic test data: a bivariate Bayesian modeling approach.
Verde, Pablo E
2010-12-30
In the last decades, the amount of published results on clinical diagnostic tests has expanded very rapidly. The counterpart to this development has been the formal evaluation and synthesis of diagnostic results. However, published results present substantial heterogeneity and they can be regarded as so far removed from the classical domain of meta-analysis, that they can provide a rather severe test of classical statistical methods. Recently, bivariate random effects meta-analytic methods, which model the pairs of sensitivities and specificities, have been presented from the classical point of view. In this work a bivariate Bayesian modeling approach is presented. This approach substantially extends the scope of classical bivariate methods by allowing the structural distribution of the random effects to depend on multiple sources of variability. Meta-analysis is summarized by the predictive posterior distributions for sensitivity and specificity. This new approach also allows substantial model checking, model diagnostics, and model selection. Statistical computations are implemented in public domain statistical software (WinBUGS and R) and illustrated with real data examples. Copyright © 2010 John Wiley & Sons, Ltd.
Statistical Modeling of Retinal Optical Coherence Tomography.
Amini, Zahra; Rabbani, Hossein
2016-06-01
In this paper, a new model for retinal Optical Coherence Tomography (OCT) images is proposed. This statistical model is based on introducing a nonlinear Gaussianization transform to convert the probability distribution function (pdf) of each OCT intra-retinal layer to a Gaussian distribution. The retina is a layered structure and in OCT each of these layers has a specific pdf which is corrupted by speckle noise, therefore a mixture model for statistical modeling of OCT images is proposed. A Normal-Laplace distribution, which is a convolution of a Laplace pdf and Gaussian noise, is proposed as the distribution of each component of this model. The reason for choosing Laplace pdf is the monotonically decaying behavior of OCT intensities in each layer for healthy cases. After fitting a mixture model to the data, each component is gaussianized and all of them are combined by Averaged Maximum A Posterior (AMAP) method. To demonstrate the ability of this method, a new contrast enhancement method based on this statistical model is proposed and tested on thirteen healthy 3D OCTs taken by the Topcon 3D OCT and five 3D OCTs from Age-related Macular Degeneration (AMD) patients, taken by Zeiss Cirrus HD-OCT. Comparing the results with two contending techniques, the prominence of the proposed method is demonstrated both visually and numerically. Furthermore, to prove the efficacy of the proposed method for a more direct and specific purpose, an improvement in the segmentation of intra-retinal layers using the proposed contrast enhancement method as a preprocessing step, is demonstrated.
A method for determining the weak statistical stationarity of a random process
NASA Technical Reports Server (NTRS)
Sadeh, W. Z.; Koper, C. A., Jr.
1978-01-01
A method for determining the weak statistical stationarity of a random process is presented. The core of this testing procedure consists of generating an equivalent ensemble which approximates a true ensemble. Formation of an equivalent ensemble is accomplished through segmenting a sufficiently long time history of a random process into equal, finite, and statistically independent sample records. The weak statistical stationarity is ascertained based on the time invariance of the equivalent-ensemble averages. Comparison of these averages with their corresponding time averages over a single sample record leads to a heuristic estimate of the ergodicity of a random process. Specific variance tests are introduced for evaluating the statistical independence of the sample records, the time invariance of the equivalent-ensemble autocorrelations, and the ergodicity. Examination and substantiation of these procedures were conducted utilizing turbulent velocity signals.
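The segmentation idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function names, the record count, the synthetic signals, and the use of the standard deviation of the ensemble mean as a time-invariance indicator are all assumptions made for illustration, and the paper's specific variance tests are not reproduced.

```python
import random
import statistics


def segment(history, n_records):
    """Split one long time history into n_records equal-length sample records."""
    length = len(history) // n_records  # any leftover tail samples are dropped
    return [history[i * length:(i + 1) * length] for i in range(n_records)]


def ensemble_mean(records):
    """Average across records at each time index (equivalent-ensemble mean)."""
    return [statistics.fmean(values) for values in zip(*records)]


def time_variation(series):
    """Spread of an ensemble average over time; small values suggest invariance."""
    return statistics.pstdev(series)


rng = random.Random(0)
n_samples, n_records = 20000, 20

# Hypothetical stationary signal: zero-mean Gaussian noise.
stationary = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
# Hypothetical nonstationary signal: the same kind of noise plus a linear drift.
drifting = [rng.gauss(0.0, 1.0) + 0.005 * t for t in range(n_samples)]

spread_stationary = time_variation(ensemble_mean(segment(stationary, n_records)))
spread_drifting = time_variation(ensemble_mean(segment(drifting, n_records)))
print(spread_stationary, spread_drifting)
```

The drifting signal should give a visibly larger spread, because its ensemble mean changes with the time index inside each record. A real test would also need a decision threshold and a check that the records are statistically independent.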
A generalized plate method for estimating total aerobic microbial count.
Ho, Kai Fai
2004-01-01
The plate method outlined in Chapter 61: Microbial Limit Tests of the U.S. Pharmacopeia (USP 61) provides very specific guidance for assessing total aerobic bioburden in pharmaceutical articles. This methodology, while comprehensive, lacks the flexibility to be useful in all situations. By studying the plate method as a special case within a more general family of assays, the effects of each parameter in the guidance can be understood. Using a mathematical model to describe the plate counting procedure, a statistical framework for making more definitive statements about total aerobic bioburden is developed. Such a framework allows the laboratory scientist to adjust the USP 61 methods to satisfy specific practical constraints. In particular, it is shown that the plate method can be conducted, albeit with stricter acceptance criteria, using a test specimen quantity that is smaller than the 10 g or 10 mL prescribed in the guidance. Finally, the interpretation of results proffered by the guidance is re-examined within this statistical framework and shown to be overly aggressive.
Directions for new developments on statistical design and analysis of small population group trials.
Hilgers, Ralf-Dieter; Roes, Kit; Stallard, Nigel
2016-06-14
Most statistical design and analysis methods for clinical trials have been developed and evaluated where at least several hundreds of patients could be recruited. These methods may not be suitable to evaluate therapies if the sample size is unavoidably small, a situation usually referred to as small populations. The specific sample size cut off, where the standard methods fail, needs to be investigated. In this paper, the authors present their view on new developments for design and analysis of clinical trials in small population groups, where conventional statistical methods may be inappropriate, e.g., because of lack of power or poor adherence to asymptotic approximations due to sample size restrictions. Following the EMA/CHMP guideline on clinical trials in small populations, we consider directions for new developments in the area of statistical methodology for design and analysis of small population clinical trials. We relate the findings to the research activities of three projects, Asterix, IDeAl, and InSPiRe, which have received funding since 2013 within the FP7-HEALTH-2013-INNOVATION-1 framework of the EU. As not all aspects of the wide research area of small population clinical trials can be addressed, we focus on areas where we feel advances are needed and feasible. The general framework of the EMA/CHMP guideline on small population clinical trials stimulates a number of research areas. These serve as the basis for the three projects, Asterix, IDeAl, and InSPiRe, which use various approaches to develop new statistical methodology for design and analysis of small population clinical trials. Small population clinical trials refer to trials with a limited number of patients. Small populations may result from rare diseases or specific subtypes of more common diseases. New statistical methodology needs to be tailored to these specific situations.
The main results from the three projects will constitute a useful toolbox for improved design and analysis of small population clinical trials. They address various challenges presented by the EMA/CHMP guideline as well as recent discussions about extrapolation. There is a need for involvement of the patients' perspective in the planning and conduct of small population clinical trials for a successful therapy evaluation.
Determination of polarimetric parameters of honey by near-infrared transflectance spectroscopy.
García-Alvarez, M; Ceresuela, S; Huidobro, J F; Hermida, M; Rodríguez-Otero, J L
2002-01-30
NIR transflectance spectroscopy was used to determine polarimetric parameters (direct polarization, polarization after inversion, specific rotation in dry matter, and polarization due to nonmonosaccharides) and sucrose in honey. In total, 156 honey samples were collected during 1992 (45 samples), 1995 (56 samples), and 1996 (55 samples). Samples were analyzed by NIR spectroscopy and polarimetric methods. Calibration (118 samples) and validation (38 samples) sets were made up; honeys from the three years were included in both sets. Calibrations were performed by modified partial least-squares regression and scatter correction by standard normal variation and detrend methods. For direct polarization, polarization after inversion, specific rotation in dry matter, and polarization due to nonmonosaccharides, good statistics (bias, SEV, and R(2)) were obtained for the validation set, and no statistically (p = 0.05) significant differences were found between instrumental and polarimetric methods for these parameters. Statistical data for sucrose were not as good as those of the other parameters. Therefore, NIR spectroscopy is not an effective method for quantitative analysis of sucrose in these honey samples. However, NIR spectroscopy may be an acceptable method for semiquantitative evaluation of sucrose for honeys, such as those in our study, containing up to 3% of sucrose. Further work is necessary to validate the uncertainty at higher levels.
[Study on commercial specification of atractylodes based on Delphi method].
Wang, Hao; Chen, Li-Xiao; Huang, Lu-Qi; Zhang, Tian-Tian; Li, Ying; Zheng, Yu-Guang
2016-03-01
This research adopts the Delphi method to evaluate the traditional traits of atractylodes and their rank correlation. Using methods of mathematical statistics, the relationship between the traditional identification indicators and the rank correlation of atractylodes goods was analyzed. The main characteristics affecting atractylodes commodity specifications and grades were found to be oil points of transaction, color of transaction, color of surface, grain of transaction, texture of transaction, and spoilage. The study points out that the original "seventy-six kinds of medicinal materials commodity specification standards of atractylodes differentiate commodity specification" is not in conformity with the actual market situation, so corresponding specifications and grades for atractylodes medicinal products need to be formulated. Combining the experimental results of the Delphi method with the actual market situation, this study proposes a new draft of atractylodes commodity specifications and grades as the new standard, providing a reference and theoretical basis. Copyright© by the Chinese Pharmaceutical Association.
Realistic finite temperature simulations of magnetic systems using quantum statistics
NASA Astrophysics Data System (ADS)
Bergqvist, Lars; Bergman, Anders
2018-01-01
We have performed realistic atomistic simulations at finite temperatures using Monte Carlo and atomistic spin dynamics simulations incorporating quantum (Bose-Einstein) statistics. The description is much improved at low temperatures compared to classical (Boltzmann) statistics normally used in these kinds of simulations, while at higher temperatures the classical statistics are recovered. This corrected low-temperature description is reflected in both magnetization and the magnetic specific heat, the latter allowing for improved modeling of the magnetic contribution to free energies. A central property in the method is the magnon density of states at finite temperatures, and we have compared several different implementations for obtaining it. The method has no restrictions regarding chemical and magnetic order of the considered materials. This is demonstrated by applying the method to elemental ferromagnetic systems, including Fe and Ni, as well as Fe-Co random alloys and the ferrimagnetic system GdFe3.
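The statement that classical statistics are recovered at higher temperatures can be checked with a small Python sketch that compares the textbook Bose-Einstein occupation of a single mode with its classical (equipartition) limit. This is not the authors' simulation method: the function names are hypothetical, and the energy and kT values are arbitrary numbers in the same (unspecified) energy unit.

```python
import math


def bose_einstein_occupation(energy, kt):
    """Mean occupation of a bosonic mode: 1 / (exp(E / kT) - 1)."""
    return 1.0 / math.expm1(energy / kt)


def classical_occupation(energy, kt):
    """Classical (equipartition) limit of the occupation: kT / E."""
    return kt / energy


energy = 1.0  # hypothetical mode energy
for kt in (0.1, 1.0, 10.0, 1000.0):
    ratio = bose_einstein_occupation(energy, kt) / classical_occupation(energy, kt)
    print(f"kT = {kt:7.1f}  quantum/classical occupation ratio = {ratio:.6f}")
```

The ratio tends to 1 when kT is much larger than the mode energy and falls toward 0 when kT is much smaller, which is the regime where the classical description overestimates thermal excitations.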
Lotfy, Hayam Mahmoud; Hegazy, Maha A; Rezk, Mamdouh R; Omran, Yasmin Rostom
2014-05-21
Two smart and novel spectrophotometric methods namely; absorbance subtraction (AS) and amplitude modulation (AM) were developed and validated for the determination of a binary mixture of timolol maleate (TIM) and dorzolamide hydrochloride (DOR) in presence of benzalkonium chloride without prior separation, using unified regression equation. Additionally, simple, specific, accurate and precise spectrophotometric methods manipulating ratio spectra were developed and validated for simultaneous determination of the binary mixture namely; simultaneous ratio subtraction (SRS), ratio difference (RD), ratio subtraction (RS) coupled with extended ratio subtraction (EXRS), constant multiplication method (CM) and mean centering of ratio spectra (MCR). The proposed spectrophotometric procedures do not require any separation steps. Accuracy, precision and linearity ranges of the proposed methods were determined and the specificity was assessed by analyzing synthetic mixtures of both drugs. They were applied to their pharmaceutical formulation and the results obtained were statistically compared to that of a reported spectrophotometric method. The statistical comparison showed that there is no significant difference between the proposed methods and the reported one regarding both accuracy and precision. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Ryan, Robert S.; Townsend, John S.
1993-01-01
The prospective improvement of probabilistic methods for space program analysis/design entails the further development of theories, codes, and tools which match specific areas of application, the drawing of lessons from previous uses of probability and statistics data bases, the enlargement of data bases (especially in the field of structural failures), and the education of engineers and managers on the advantages of these methods. An evaluation is presently made of the current limitations of probabilistic engineering methods. Recommendations are made for specific applications.
2011-01-01
Background Clinical researchers have often preferred to use a fixed effects model for the primary interpretation of a meta-analysis. Heterogeneity is usually assessed via the well known Q and I2 statistics, along with the random effects estimate they imply. In recent years, alternative methods for quantifying heterogeneity have been proposed, that are based on a 'generalised' Q statistic. Methods We review 18 IPD meta-analyses of RCTs into treatments for cancer, in order to quantify the amount of heterogeneity present and also to discuss practical methods for explaining heterogeneity. Results Differing results were obtained when the standard Q and I2 statistics were used to test for the presence of heterogeneity. The two meta-analyses with the largest amount of heterogeneity were investigated further, and on inspection the straightforward application of a random effects model was not deemed appropriate. Compared to the standard Q statistic, the generalised Q statistic provided a more accurate platform for estimating the amount of heterogeneity in the 18 meta-analyses. Conclusions Explaining heterogeneity via the pre-specification of trial subgroups, graphical diagnostic tools and sensitivity analyses produced a more desirable outcome than an automatic application of the random effects model. Generalised Q statistic methods for quantifying and adjusting for heterogeneity should be incorporated as standard into statistical software. Software is provided to help achieve this aim. PMID:21473747
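For readers unfamiliar with the standard heterogeneity statistics named in this abstract, the following Python sketch computes Cochran's Q, I², and the DerSimonian-Laird between-trial variance from per-trial effect estimates and their variances. It covers only the standard statistics, not the generalised Q approach the paper advocates; the function name and the example numbers are hypothetical.

```python
def heterogeneity(effects, variances):
    """Return (fixed-effect pooled estimate, Cochran's Q, I^2, DerSimonian-Laird tau^2)."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    # Method-of-moments (DerSimonian-Laird) estimate of between-trial variance.
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return pooled, q, i2, tau2


# Hypothetical log hazard ratios and their variances from four trials.
pooled, q, i2, tau2 = heterogeneity([-0.30, -0.10, 0.05, -0.45], [0.02, 0.03, 0.05, 0.04])
print(f"pooled = {pooled:.3f}, Q = {q:.3f}, I2 = {i2:.1%}, tau2 = {tau2:.4f}")
```

Q is usually compared with a chi-squared distribution on k - 1 degrees of freedom, where k is the number of trials.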
Length and Rate of Individual Participation in Various Activities on Recreation Sites and Areas
Gary L. Tyre; George A. James
1971-01-01
While statistically reliable methods exist for estimating recreation use on large areas, they often prove prohibitively expensive. Inexpensive alternatives involving the length and rate of individual participation in specific activities are presented, together with data and statistics on the recreational use of three large areas on the National Forests. This...
Shardell, Michelle; Harris, Anthony D; El-Kamary, Samer S; Furuno, Jon P; Miller, Ram R; Perencevich, Eli N
2007-10-01
Quasi-experimental study designs are frequently used to assess interventions that aim to limit the emergence of antimicrobial-resistant pathogens. However, previous studies using these designs have often used suboptimal statistical methods, which may result in researchers making spurious conclusions. Methods used to analyze quasi-experimental data include 2-group tests, regression analysis, and time-series analysis, and they all have specific assumptions, data requirements, strengths, and limitations. An example of a hospital-based intervention to reduce methicillin-resistant Staphylococcus aureus infection rates and reduce overall length of stay is used to explore these methods.
The Shock and Vibration Digest. Volume 15, Number 7
1983-07-01
systems noise -- for tant analytical tool, the statistical energy analysis example, from a specific metal, chain driven, con- method, has been the subject... "Experimental Determination of Vibration Parameters Required in the Statistical Energy Analysis Meth- 31. Dubowsky, S. and Morris, T.L., "An... "Coupling Loss Factors for 55. Upton, R., "Sound Intensity -. A Powerful New Statistical Energy Analysis of Sound Trans- Measurement Tool," S/V, Sound
Marateb, Hamid Reza; Mansourian, Marjan; Adibi, Peyman; Farina, Dario
2014-01-01
Background: selecting the correct statistical test and data mining method depends highly on the measurement scale of data, type of variables, and purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are studied using several medical examples. We have presented two ordinal-variable clustering examples, ordinal variables being the more challenging type to analyze, using Wisconsin Breast Cancer Data (WBCD). Ordinal-to-Interval scale conversion example: a breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed by two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold standard groups of malignant and benign cases that had been identified by clinical tests. Results: the sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively. Their specificity was comparable. Conclusion: by using an appropriate clustering algorithm based on the measurement scale of the variables in the study, high performance can be achieved. Moreover, descriptive and inferential statistics, in addition to the modeling approach, must be selected based on the scale of the variables. PMID:24672565
Statistical methods for change-point detection in surface temperature records
NASA Astrophysics Data System (ADS)
Pintar, A. L.; Possolo, A.; Zhang, N. F.
2013-09-01
We describe several statistical methods to detect possible change-points in a time series of values of surface temperature measured at a meteorological station, and to assess the statistical significance of such changes, taking into account the natural variability of the measured values, and the autocorrelations between them. These methods serve to determine whether the record may suffer from biases unrelated to the climate signal, hence whether there may be a need for adjustments as considered by M. J. Menne and C. N. Williams (2009) "Homogenization of Temperature Series via Pairwise Comparisons", Journal of Climate 22 (7), 1700-1717. We also review methods to characterize patterns of seasonality (seasonal decomposition using monthly medians or robust local regression), and explain the role they play in the imputation of missing values, and in enabling robust decompositions of the measured values into a seasonal component, a possible climate signal, and a station-specific remainder. The methods for change-point detection that we describe include statistical process control, wavelet multi-resolution analysis, adaptive weights smoothing, and a Bayesian procedure, all of which are applicable to single station records.
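As a rough illustration of the statistical process control family mentioned in this abstract, the Python sketch below locates a single mean shift with a cumulative-sum (CUSUM) statistic. It is a deliberately simple stand-in, not any of the authors' procedures: it assumes independent observations, ignores the autocorrelation and seasonality the abstract stresses, and provides no significance assessment. The function name and the synthetic series are hypothetical.

```python
def cusum_change_point(series):
    """Return the index of the first sample after the most likely single mean shift.

    The estimate is the position where the cumulative sum of deviations from
    the overall mean is largest in absolute value.
    """
    mean = sum(series) / len(series)
    running = 0.0
    best_abs, best_k = -1.0, 0
    for k, value in enumerate(series):
        running += value - mean
        if abs(running) > best_abs:
            best_abs, best_k = abs(running), k
    return best_k + 1


# Hypothetical monthly anomalies: 50 values at one level, then 50 shifted by 3 units.
record = [0.0] * 50 + [3.0] * 50
print(cusum_change_point(record))  # index where the later segment begins
```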
Strappini, Francesca; Gilboa, Elad; Pitzalis, Sabrina; Kay, Kendrick; McAvoy, Mark; Nehorai, Arye; Snyder, Abraham Z
2017-03-01
Temporal and spatial filtering of fMRI data is often used to improve statistical power. However, conventional methods, such as smoothing with fixed-width Gaussian filters, remove fine-scale structure in the data, necessitating a tradeoff between sensitivity and specificity. Specifically, smoothing may increase sensitivity (reduce noise and increase statistical power) but at the cost of specificity, in that fine-scale structure in neural activity patterns is lost. Here, we propose an alternative smoothing method based on Gaussian processes (GP) regression for single-subject fMRI experiments. This method adapts the level of smoothing on a voxel by voxel basis according to the characteristics of the local neural activity patterns. GP-based fMRI analysis has been heretofore impractical owing to computational demands. Here, we demonstrate a new implementation of GP that makes it possible to handle the massive data dimensionality of the typical fMRI experiment. We demonstrate how GP can be used as a drop-in replacement to conventional preprocessing steps for temporal and spatial smoothing in a standard fMRI pipeline. We present simulated and experimental results that show the increased sensitivity and specificity compared to conventional smoothing strategies. Hum Brain Mapp 38:1438-1459, 2017. © 2016 Wiley Periodicals, Inc.
Schäffer, Beat; Pieren, Reto; Mendolia, Franco; Basner, Mathias; Brink, Mark
2017-05-01
Noise exposure-response relationships are used to estimate the effects of noise on individuals or a population. Such relationships may be derived from independent or repeated binary observations, and modeled by different statistical methods. Depending on the method by which they were established, their application in population risk assessment or estimation of individual responses may yield different results, i.e., predict "weaker" or "stronger" effects. As far as the present body of literature on noise effect studies is concerned, however, the underlying statistical methodology to establish exposure-response relationships has not always been paid sufficient attention. This paper gives an overview on two statistical approaches (subject-specific and population-averaged logistic regression analysis) to establish noise exposure-response relationships from repeated binary observations, and their appropriate applications. The considerations are illustrated with data from three noise effect studies, estimating also the magnitude of differences in results when applying exposure-response relationships derived from the two statistical approaches. Depending on the underlying data set and the probability range of the binary variable it covers, the two approaches yield similar to very different results. The adequate choice of a specific statistical approach and its application in subsequent studies, both depending on the research question, are therefore crucial.
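The difference between the two approaches named in this abstract can be made concrete with a short Python sketch. It assumes a random-intercept logistic model, which is one common subject-specific formulation and not necessarily the one used in the paper, and averages the subject-level response probability over a normally distributed intercept by numerical integration. All names and parameter values are hypothetical.

```python
import math


def expit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    return math.exp(x) / (1.0 + math.exp(x))


def population_averaged_probability(eta, sigma, n_points=2001):
    """Average expit(eta + b) over b ~ Normal(0, sigma^2) with the trapezoidal rule."""
    if sigma == 0.0:
        return expit(eta)  # no between-subject variation: both views coincide
    lo, hi = -8.0 * sigma, 8.0 * sigma
    step = (hi - lo) / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        b = lo + i * step
        density = math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * expit(eta + b) * density * step
    return total


eta = 2.0    # hypothetical linear predictor for a "typical" subject (b = 0)
sigma = 2.0  # hypothetical standard deviation of the random intercept
print("subject-specific:   ", round(expit(eta), 3))
print("population-averaged:", round(population_averaged_probability(eta, sigma), 3))
```

The population-averaged probability lies closer to 0.5 than the probability for a typical subject, which is one reason the two kinds of exposure-response curves can predict "weaker" or "stronger" effects.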
Changing Patterns in Methods of Suicide by Race and Sex.
ERIC Educational Resources Information Center
McIntosh, John L.; Santos, John F.
1982-01-01
Examined annual official national statistics for specific methods of suicide by sex and racial group from 1923 to 1978. Shifts were found in suicide methods employed, most notably for women and Asian Americans. Generally, firearm use increased among nearly all ethnic/racial-sex groups while the use of poisons declined. (JAC)
Lambert, Nathaniel D.; Pankratz, V. Shane; Larrabee, Beth R.; Ogee-Nwankwo, Adaeze; Chen, Min-hsin; Icenogle, Joseph P.
2014-01-01
Rubella remains a social and economic burden due to the high incidence of congenital rubella syndrome (CRS) in some countries. For this reason, an accurate and efficient high-throughput measure of antibody response to vaccination is an important tool. In order to measure rubella-specific neutralizing antibodies in a large cohort of vaccinated individuals, a high-throughput immunocolorimetric system was developed. Statistical interpolation models were applied to the resulting titers to refine quantitative estimates of neutralizing antibody titers relative to the assayed neutralizing antibody dilutions. This assay, including the statistical methods developed, can be used to assess the neutralizing humoral immune response to rubella virus and may be adaptable for assessing the response to other viral vaccines and infectious agents. PMID:24391140
An analysis of secular trends in method-specific suicides in Japan, 1950-1975.
Yoshioka, Eiji; Saijo, Yasuaki; Kawachi, Ichiro
2017-04-05
In Japan, a dramatic rise in suicide rates was observed in the 1950s, especially among the younger population, and then the rate decreased rapidly again in the 1960s. The aim of this study was to assess secular trends in method-specific suicides by gender and age in Japan between 1950 and 1975. We paid special attention to suicides by poisoning (solid and liquid substances), and their contribution to dramatic swings in the overall suicide rate in Japan during the 1950s and 1960s. Mortality and population data were obtained from the Vital Statistics of Japan and Statistics Bureau, Ministry of Internal Affairs and Communications in Japan, respectively. We calculated method-specific age-standardized suicide rates by gender and age group (15-29, 30-49, or 50+ years). The change in the suicide rate during the research period was larger in males than females in all age groups, and was more marked among people aged 15-29 years compared to those aged 30-49 years and 50 years or over. Poisoning by solid and liquid substances overwhelmingly contributed to the dramatic change in the overall suicide rates in males and females aged 15-49 years in the 1950s and 1960s. For the peak years of the rise in poisoning suicides, bromide was the most frequently used substance. Our results for the 1950s and 1960s in Japan illustrated how assessing secular trends in method-specific suicides by gender and age could provide a deeper understanding of the dramatic swings in overall suicide rate. Although rapid increases or decreases in suicide rates have been also observed in some countries or regions recently, trends in method-specific suicides have not been analyzed because of a lack of data on method-specific suicide in many countries. Our study illustrates how the collection and analysis of method-specific data can contribute to an understanding of dramatic shifts in national suicide rates.
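Age-standardized rates of the kind described here are commonly obtained by direct standardization, in which each age band's rate is weighted by that band's share of a standard population. The Python sketch below shows the arithmetic only; the abstract does not state which standard population was used, and the function name and all counts are hypothetical.

```python
def age_standardized_rate(deaths, population, standard_population, per=100_000):
    """Directly standardized rate: age-specific rates weighted by a standard population."""
    total_standard = sum(standard_population)
    rate = 0.0
    for d, n, s in zip(deaths, population, standard_population):
        rate += (d / n) * (s / total_standard)
    return rate * per


# Hypothetical counts for two age bands within one method-specific category.
deaths = [10, 20]
population = [1000, 1000]
standard_population = [3000, 1000]
print(age_standardized_rate(deaths, population, standard_population))
```

When the standard population has the same age structure as the study population, the standardized rate equals the crude rate, which gives a quick sanity check.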
Hu, Xiangdong; Liu, Yujiang; Qian, Linxue
2017-01-01
Background: Real-time elastography (RTE) and shear wave elastography (SWE) are noninvasive and easily available imaging techniques that measure the tissue strain, and it has been reported that the sensitivity and the specificity of elastography were better in differentiating between benign and malignant thyroid nodules than conventional technologies. Methods: Relevant articles were searched in multiple databases; the comparison of elasticity index (EI) was conducted with the Review Manager 5.0. Forest plots of the sensitivity and specificity and SROC curve of RTE and SWE were performed with STATA 10.0 software. In addition, sensitivity analysis and bias analysis of the studies were conducted to examine the quality of articles; to estimate possible publication bias, a funnel plot was used and the Egger test was conducted. Results: In total, 22 articles that satisfied the inclusion criteria were included in this study. After eliminating the inefficient, benign and malignant nodules were 2106 and 613, respectively. The meta-analysis suggested that the difference of EI between benign and malignant nodules was statistically significant (SMD = 2.11, 95% CI [1.67, 2.55], P < .00001). The overall sensitivities of RTE and SWE were roughly comparable, whereas the difference of specificities between these 2 methods was statistically significant. In addition, a statistically significant difference in AUC was observed between RTE and SWE (P < .01). Conclusion: The specificity of RTE was statistically higher than that of SWE, which suggests that, compared with SWE, RTE may be more accurate in differentiating benign and malignant thyroid nodules. PMID:29068996
Efficient statistical tests to compare Youden index: accounting for contingency correlation.
Chen, Fangyao; Xue, Yuqiang; Tan, Ming T; Chen, Pingyan
2015-04-30
Youden index is widely utilized in studies evaluating accuracy of diagnostic tests and performance of predictive, prognostic, or risk models. However, both one and two independent sample tests on Youden index have been derived ignoring the dependence (association) between sensitivity and specificity, resulting in potentially misleading findings. Besides, paired sample test on Youden index is currently unavailable. This article develops efficient statistical inference procedures for one sample, independent, and paired sample tests on Youden index by accounting for contingency correlation, namely associations between sensitivity and specificity and paired samples typically represented in contingency tables. For one and two independent sample tests, the variances are estimated by Delta method, and the statistical inference is based on the central limit theory, which are then verified by bootstrap estimates. For paired samples test, we show that the estimated covariance of the two sensitivities and specificities can be represented as a function of kappa statistic so the test can be readily carried out. We then show the remarkable accuracy of the estimated variance using a constrained optimization approach. Simulation is performed to evaluate the statistical properties of the derived tests. The proposed approaches yield more stable type I errors at the nominal level and substantially higher power (efficiency) than does the original Youden's approach. Therefore, the simple explicit large sample solution performs very well. Because we can readily implement the asymptotic and exact bootstrap computation with common software like R, the method is broadly applicable to the evaluation of diagnostic tests and model performance. Copyright © 2015 John Wiley & Sons, Ltd.
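The quantity under study can be sketched in a few lines of Python: the Youden index is sensitivity plus specificity minus one, and a percentile bootstrap over subjects gives a rough interval. This is not the paper's Delta-method or constrained-optimization procedure; it is a generic resampling sketch, and the function names and the example data are hypothetical.

```python
import random


def youden_index(truth, predicted):
    """Youden index J = sensitivity + specificity - 1 from paired binary labels."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    return tp / (tp + fn) + tn / (tn + fp) - 1.0


def bootstrap_interval(truth, predicted, n_replicates=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for J, resampling subjects with replacement."""
    if not any(truth) or all(truth):
        raise ValueError("both diseased and non-diseased subjects are required")
    rng = random.Random(seed)
    pairs = list(zip(truth, predicted))
    replicates = []
    while len(replicates) < n_replicates:
        sample = [rng.choice(pairs) for _ in pairs]
        t, p = zip(*sample)
        if any(t) and not all(t):  # skip resamples that lack one of the classes
            replicates.append(youden_index(t, p))
    replicates.sort()
    lower = replicates[int((alpha / 2) * n_replicates)]
    upper = replicates[int((1 - alpha / 2) * n_replicates) - 1]
    return lower, upper


# Hypothetical study: 100 diseased (90 detected) and 100 non-diseased (80 cleared).
truth = [1] * 100 + [0] * 100
predicted = [1] * 90 + [0] * 10 + [0] * 80 + [1] * 20
print(youden_index(truth, predicted), bootstrap_interval(truth, predicted))
```

Resampling whole subjects keeps each subject's true status and test result together, so any association between them is carried into every replicate.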
Capture approximations beyond a statistical quantum mechanical method for atom-diatom reactions
NASA Astrophysics Data System (ADS)
Barrios, Lizandra; Rubayo-Soneira, Jesús; González-Lezana, Tomás
2016-03-01
Statistical techniques constitute useful approaches to investigate atom-diatom reactions mediated by insertion dynamics which involves complex-forming mechanisms. Different capture schemes based on energy considerations regarding the specific diatom rovibrational states are suggested to evaluate the corresponding probabilities of formation of such collision species between reactants and products in an attempt to test reliable alternatives for computationally demanding processes. These approximations are tested in combination with a statistical quantum mechanical method for the S + H2(v = 0, j = 1) → SH + H and Si + O2(v = 0, j = 1) → SiO + O reactions, where this dynamical mechanism plays a significant role, in order to probe their validity.
Alles, Susan; Peng, Linda X; Mozola, Mark A
2009-01-01
A modification to Performance-Tested Method (PTM) 070601, Reveal Listeria Test (Reveal), is described. The modified method uses a new media formulation, LESS enrichment broth, in single-step enrichment protocols for both foods and environmental sponge and swab samples. Food samples are enriched for 27-30 h at 30 degrees C and environmental samples for 24-48 h at 30 degrees C. Implementation of these abbreviated enrichment procedures allows test results to be obtained on a next-day basis. In testing of 14 food types in internal comparative studies with inoculated samples, there was a statistically significant difference in performance between the Reveal and reference culture [U.S. Food and Drug Administration's Bacteriological Analytical Manual (FDA/BAM) or U.S. Department of Agriculture-Food Safety and Inspection Service (USDA-FSIS)] methods for only a single food in one trial (pasteurized crab meat) at the 27 h enrichment time point, with more positive results obtained with the FDA/BAM reference method. No foods showed statistically significant differences in method performance at the 30 h time point. Independent laboratory testing of 3 foods again produced a statistically significant difference in results for crab meat at the 27 h time point; otherwise results of the Reveal and reference methods were statistically equivalent. Overall, considering both internal and independent laboratory trials, sensitivity of the Reveal method relative to the reference culture procedures in testing of foods was 85.9% at 27 h and 97.1% at 30 h. Results from 5 environmental surfaces inoculated with various strains of Listeria spp. showed that the Reveal method was more productive than the reference USDA-FSIS culture procedure for 3 surfaces (stainless steel, plastic, and cast iron), whereas results were statistically equivalent to the reference method for the other 2 surfaces (ceramic tile and sealed concrete). An independent laboratory trial with ceramic tile inoculated with L. monocytogenes confirmed the effectiveness of the Reveal method at the 24 h time point. Overall, sensitivity of the Reveal method at 24 h relative to that of the USDA-FSIS method was 153%. The Reveal method exhibited extremely high specificity, with only a single false-positive result in all trials combined for overall specificity of 99.5%.
Analysis strategies for longitudinal attachment loss data.
Beck, J D; Elter, J R
2000-02-01
The purpose of this invited review is to describe and discuss methods currently in use to quantify the progression of attachment loss in epidemiological studies of periodontal disease, and to make recommendations for specific analytic methods based upon the particular design of the study and structure of the data. The review concentrates on the definition of incident attachment loss (ALOSS) and its component parts; measurement issues including thresholds and regression to the mean; methods of accounting for longitudinal change, including changes in means, changes in proportions of affected sites, incidence density, the effect of tooth loss and reversals, and repeated events; statistical models of longitudinal change, including the incorporation of the time element, use of linear, logistic or Poisson regression or survival analysis, and statistical tests; site vs person level of analysis, including statistical adjustment for correlated data; the strengths and limitations of ALOSS data. Examples from the Piedmont 65+ Dental Study are used to illustrate specific concepts. We conclude that incidence density is the preferred methodology to use for periodontal studies with more than one period of follow-up and that the use of studies not employing methods for dealing with complex samples, correlated data, and repeated measures does not take advantage of our current understanding of the site- and person-level variables important in periodontal disease and may generate biased results.
Cancer survival: an overview of measures, uses, and interpretation.
Mariotto, Angela B; Noone, Anne-Michelle; Howlader, Nadia; Cho, Hyunsoon; Keel, Gretchen E; Garshell, Jessica; Woloshin, Steven; Schwartz, Lisa M
2014-11-01
Survival statistics are of great interest to patients, clinicians, researchers, and policy makers. Although seemingly simple, survival can be confusing: there are many different survival measures with a plethora of names and statistical methods developed to answer different questions. This paper aims to describe and disseminate different survival measures and their interpretation in less technical language. In addition, we introduce templates to summarize cancer survival statistic organized by their specific purpose: research and policy versus prognosis and clinical decision making. Published by Oxford University Press 2014.
Cancer Survival: An Overview of Measures, Uses, and Interpretation
Noone, Anne-Michelle; Howlader, Nadia; Cho, Hyunsoon; Keel, Gretchen E.; Garshell, Jessica; Woloshin, Steven; Schwartz, Lisa M.
2014-01-01
Survival statistics are of great interest to patients, clinicians, researchers, and policy makers. Although seemingly simple, survival can be confusing: there are many different survival measures with a plethora of names and statistical methods developed to answer different questions. This paper aims to describe and disseminate different survival measures and their interpretation in less technical language. In addition, we introduce templates to summarize cancer survival statistic organized by their specific purpose: research and policy versus prognosis and clinical decision making. PMID:25417231
Statistical Methods for Assessments in Simulations and Serious Games. Research Report. ETS RR-14-12
ERIC Educational Resources Information Center
Fu, Jianbin; Zapata, Diego; Mavronikolas, Elia
2014-01-01
Simulation or game-based assessments produce outcome data and process data. In this article, some statistical models that can potentially be used to analyze data from simulation or game-based assessments are introduced. Specifically, cognitive diagnostic models that can be used to estimate latent skills from outcome data so as to scale these…
ERIC Educational Resources Information Center
Jackson, Dan
2013-01-01
Statistical inference is problematic in the common situation in meta-analysis where the random effects model is fitted to just a handful of studies. In particular, the asymptotic theory of maximum likelihood provides a poor approximation, and Bayesian methods are sensitive to the prior specification. Hence, less efficient, but easily computed and…
Estimating the Proportion of True Null Hypotheses Using the Pattern of Observed p-values
Tong, Tiejun; Feng, Zeny; Hilton, Julia S.; Zhao, Hongyu
2013-01-01
Estimating the proportion of true null hypotheses, π0, has attracted much attention in the recent statistical literature. Besides its apparent relevance for a set of specific scientific hypotheses, an accurate estimate of this parameter is key for many multiple testing procedures. Most existing methods for estimating π0 in the literature are motivated from the independence assumption of test statistics, which is often not true in reality. Simulations indicate that most existing estimators in the presence of the dependence among test statistics can be poor, mainly due to the increase of variation in these estimators. In this paper, we propose several data-driven methods for estimating π0 by incorporating the distribution pattern of the observed p-values as a practical approach to address potential dependence among test statistics. Specifically, we use a linear fit to give a data-driven estimate for the proportion of true-null p-values in (λ, 1] over the whole range [0, 1] instead of using the expected proportion at 1 − λ. We find that the proposed estimators may substantially decrease the variance of the estimated true null proportion and thus improve the overall performance. PMID:24078762
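For orientation, the conventional threshold-based estimator that the abstract contrasts with ("the expected proportion at 1 − λ") can be sketched as below. This is the familiar baseline that counts p-values above λ and rescales by 1 − λ, not the linear-fit estimators proposed in the paper; the function name and the example p-values are hypothetical.

```python
def pi0_at_lambda(p_values, lam=0.5):
    """Baseline estimate of the proportion of true nulls.

    Under the null, p-values are uniform on [0, 1], so roughly
    pi0 * m * (1 - lam) of them are expected to fall in (lam, 1].
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    m = len(p_values)
    above = sum(1 for p in p_values if p > lam)
    # Cap at 1 because the raw ratio can exceed 1 by chance
    return min(1.0, above / (m * (1.0 - lam)))


# Hypothetical p-values: 12 small (likely signals) and 8 spread over (0.5, 1]
p_values = [0.001] * 12 + [0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.9, 0.95]
print(pi0_at_lambda(p_values, lam=0.5))
```

The estimate depends on the single tuning value λ, which is one source of the variability the abstract describes.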
Estimating the Proportion of True Null Hypotheses Using the Pattern of Observed p-values.
Tong, Tiejun; Feng, Zeny; Hilton, Julia S; Zhao, Hongyu
2013-01-01
Estimating the proportion of true null hypotheses, π0, has attracted much attention in the recent statistical literature. Besides its apparent relevance for a set of specific scientific hypotheses, an accurate estimate of this parameter is key for many multiple testing procedures. Most existing methods for estimating π0 in the literature are motivated from the independence assumption of test statistics, which is often not true in reality. Simulations indicate that most existing estimators in the presence of the dependence among test statistics can be poor, mainly due to the increase of variation in these estimators. In this paper, we propose several data-driven methods for estimating π0 by incorporating the distribution pattern of the observed p-values as a practical approach to address potential dependence among test statistics. Specifically, we use a linear fit to give a data-driven estimate for the proportion of true-null p-values in (λ, 1] over the whole range [0, 1] instead of using the expected proportion at 1 - λ. We find that the proposed estimators may substantially decrease the variance of the estimated true null proportion and thus improve the overall performance.
Renaudin, Isabelle; Poliakoff, Françoise
2017-01-01
A working group established in the framework of the EUPHRESCO European collaborative project aimed to compare and validate diagnostic protocols for the detection of “Flavescence dorée” (FD) phytoplasma in grapevines. Seven molecular protocols were compared in an interlaboratory test performance study where each laboratory had to analyze the same panel of samples consisting of DNA extracts prepared by the organizing laboratory. The tested molecular methods consisted of universal and group-specific real-time and end-point nested PCR tests. Different statistical approaches were applied to this collaborative study. Firstly, there was the standard statistical approach consisting in analyzing samples which are known to be positive and samples which are known to be negative and reporting the proportion of false-positive and false-negative results to respectively calculate diagnostic specificity and sensitivity. This approach was supplemented by the calculation of repeatability and reproducibility for qualitative methods based on the notions of accordance and concordance. Other new approaches were also implemented, based, on the one hand, on the probability of detection model, and, on the other hand, on Bayes’ theorem. These various statistical approaches are complementary and give consistent results. Their combination, and in particular, the introduction of new statistical approaches give overall information on the performance and limitations of the different methods, and are particularly useful for selecting the most appropriate detection scheme with regards to the prevalence of the pathogen. Three real-time PCR protocols (methods M4, M5 and M6 respectively developed by Hren (2007), Pelletier (2009) and under patent oligonucleotides) achieved the highest levels of performance for FD phytoplasma detection. This paper also addresses the issue of indeterminate results and the identification of outlier results. 
The statistical tools presented in this paper and their combination can be applied to many other studies concerning plant pathogens and other disciplines that use qualitative detection methods. PMID:28384335
Chabirand, Aude; Loiseau, Marianne; Renaudin, Isabelle; Poliakoff, Françoise
2017-01-01
A working group established in the framework of the EUPHRESCO European collaborative project aimed to compare and validate diagnostic protocols for the detection of "Flavescence dorée" (FD) phytoplasma in grapevines. Seven molecular protocols were compared in an interlaboratory test performance study where each laboratory had to analyze the same panel of samples consisting of DNA extracts prepared by the organizing laboratory. The tested molecular methods consisted of universal and group-specific real-time and end-point nested PCR tests. Different statistical approaches were applied to this collaborative study. Firstly, there was the standard statistical approach consisting in analyzing samples which are known to be positive and samples which are known to be negative and reporting the proportion of false-positive and false-negative results to respectively calculate diagnostic specificity and sensitivity. This approach was supplemented by the calculation of repeatability and reproducibility for qualitative methods based on the notions of accordance and concordance. Other new approaches were also implemented, based, on the one hand, on the probability of detection model, and, on the other hand, on Bayes' theorem. These various statistical approaches are complementary and give consistent results. Their combination, and in particular, the introduction of new statistical approaches give overall information on the performance and limitations of the different methods, and are particularly useful for selecting the most appropriate detection scheme with regards to the prevalence of the pathogen. Three real-time PCR protocols (methods M4, M5 and M6 respectively developed by Hren (2007), Pelletier (2009) and under patent oligonucleotides) achieved the highest levels of performance for FD phytoplasma detection. This paper also addresses the issue of indeterminate results and the identification of outlier results. 
The statistical tools presented in this paper and their combination can be applied to many other studies concerning plant pathogens and other disciplines that use qualitative detection methods.
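The role of prevalence mentioned in this abstract can be illustrated with the standard Bayes' theorem calculation of predictive values from diagnostic sensitivity and specificity. This is a generic sketch, not the study's own computation; the function name and the numeric inputs are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a qualitative test via Bayes' theorem.

    Assumes 0 < prevalence < 1 and a test that is not degenerate, so that
    both the positive and the negative result have non-zero probability.
    """
    # Overall probability of a positive result
    p_pos = sensitivity * prevalence + (1.0 - specificity) * (1.0 - prevalence)
    ppv = sensitivity * prevalence / p_pos
    npv = specificity * (1.0 - prevalence) / (1.0 - p_pos)
    return ppv, npv


# Hypothetical figures: the same test at two different prevalences
for prev in (0.10, 0.50):
    ppv, npv = predictive_values(sensitivity=0.90, specificity=0.95, prevalence=prev)
    print(f"prevalence={prev:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

The same sensitivity and specificity give a much lower positive predictive value when the pathogen is rare, which is why the choice of detection scheme can depend on prevalence.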
Does daily nurse staffing match ward workload variability? Three hospitals' experiences.
Gabbay, Uri; Bukchin, Michael
2009-01-01
Nurse shortage and rising healthcare resource burdens mean that appropriate workforce use is imperative. This paper aims to evaluate whether daily nursing staffing meets ward workload needs. Nurse attendance and daily nurses' workload capacity in three hospitals were evaluated. Statistical process control was used to evaluate intra-ward nurse workload capacity and day-to-day variations. Statistical process control is a statistics-based method for process monitoring that uses charts with predefined target measure and control limits. Standardization was performed for inter-ward analysis by converting ward-specific crude measures to ward-specific relative measures by dividing observed/expected. Two charts: acceptable and tolerable daily nurse workload intensity, were defined. Appropriate staffing indicators were defined as those exceeding predefined rates within acceptable and tolerable limits (50 percent and 80 percent respectively). A total of 42 percent of the overall days fell within acceptable control limits and 71 percent within tolerable control limits. Appropriate staffing indicators were met in only 33 percent of wards regarding acceptable nurse workload intensity and in only 45 percent of wards regarding tolerable workloads. The study work did not differentiate crude nurse attendance and it did not take into account patient severity since crude bed occupancy was used. Double statistical process control charts and certain staffing indicators were used, which is open to debate. Wards that met appropriate staffing indicators prove the method's feasibility. Wards that did not meet appropriate staffing indicators prove the importance and the need for process evaluations and monitoring. 
Methods presented for monitoring daily staffing appropriateness are simple to implement either for intra-ward day-to-day variation by using nurse workload capacity statistical process control charts or for inter-ward evaluation using standardized measure of nurse workload intensity. The real challenge will be to develop planning systems and implement corrective interventions such as dynamic and flexible daily staffing, which will face difficulties and barriers. The paper fulfils the need for workforce utilization evaluation. A simple method using available data for daily staffing appropriateness evaluation, which is easy to implement and operate, is presented. The statistical process control method enables intra-ward evaluation, while standardization by converting crude into relative measures enables inter-ward analysis. The staffing indicator definitions enable performance evaluation. This original study uses statistical process control to develop simple standardization methods and applies straightforward statistical tools. This method is not limited to crude measures, rather it uses weighted workload measures such as nursing acuity or weighted nurse level (i.e. grade/band).
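A minimal sketch of the two ideas described above, standardizing a crude measure as observed/expected and checking daily values against control limits, might look as follows. It assumes simple limits of the mean plus or minus three sample standard deviations from a baseline period; the study's actual chart construction, targets, and limits are not given in the abstract, and all names and numbers here are hypothetical.

```python
import statistics


def relative_measure(observed, expected):
    """Convert ward-specific crude values into observed/expected ratios."""
    return [o / expected for o in observed]


def control_limits(baseline, k=3.0):
    """Return (lower, center, upper) as mean +/- k sample standard deviations."""
    center = statistics.mean(baseline)
    sd = statistics.stdev(baseline)
    return center - k * sd, center, center + k * sd


def fraction_within(values, lower, upper):
    """Share of days whose value falls inside the control limits."""
    return sum(lower <= v <= upper for v in values) / len(values)


# Hypothetical baseline: daily workload capacity against an expected value of 20
baseline = relative_measure([18, 20, 22, 20, 20], expected=20)
lower, center, upper = control_limits(baseline)

# Hypothetical monitored days, already expressed as observed/expected
monitored = [1.0, 1.3, 0.95, 0.7]
print(f"limits: {lower:.3f} .. {upper:.3f}")
print(f"share of days within limits: {fraction_within(monitored, lower, upper):.0%}")
```

Because every ward is expressed as a ratio to its own expected value, the resulting shares can be compared across wards, which is the purpose of the standardization step.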
DOE Office of Scientific and Technical Information (OSTI.GOV)
Parker, S
2015-06-15
Purpose: To evaluate the ability of statistical process control methods to detect systematic errors when using a two dimensional (2D) detector array for routine electron beam energy verification. Methods: Electron beam energy constancy was measured using an aluminum wedge and a 2D diode array on four linear accelerators. Process control limits were established. Measurements were recorded in control charts and compared with both calculated process control limits and TG-142 recommended specification limits. The data was tested for normality, process capability and process acceptability. Additional measurements were recorded while systematic errors were intentionally introduced. Systematic errors included shifts in the alignment of the wedge, incorrect orientation of the wedge, and incorrect array calibration. Results: Control limits calculated for each beam were smaller than the recommended specification limits. Process capability and process acceptability ratios were greater than one in all cases. All data was normally distributed. Shifts in the alignment of the wedge were most apparent for low energies. The smallest shift (0.5 mm) was detectable using process control limits in some cases, while the largest shift (2 mm) was detectable using specification limits in only one case. The wedge orientation tested did not affect the measurements as this did not affect the thickness of aluminum over the detectors of interest. Array calibration dependence varied with energy and selected array calibration. 6 MeV was the least sensitive to array calibration selection while 16 MeV was the most sensitive. Conclusion: Statistical process control methods demonstrated that the data were normally distributed, the process was capable of meeting specifications, and that the process was centered within the specification limits. Though not all systematic errors were distinguishable from random errors, process control limits increased the ability to detect systematic errors using routine measurement of electron beam energy constancy.
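The capability and acceptability ratios referred to in this abstract are commonly computed as the indices usually written Cp and Cpk. The sketch below uses those standard definitions; mapping them onto the abstract's two terms is an assumption, and the function name, measurements, and specification limits are hypothetical.

```python
import statistics


def capability_indices(values, lsl, usl):
    """Return (Cp, Cpk) for measurements against lower/upper specification limits.

    Cp compares the specification width with the process spread (6 sigma);
    Cpk additionally penalizes a process mean that is off-center.
    """
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    cp = (usl - lsl) / (6.0 * sigma)
    cpk = min(usl - mu, mu - lsl) / (3.0 * sigma)
    return cp, cpk


# Hypothetical constancy readings (deviation from baseline) and spec limits
readings = [-0.1, 0.0, 0.1, 0.0, 0.0]
cp, cpk = capability_indices(readings, lsl=-1.0, usl=1.0)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

Values above one indicate that the process spread fits inside the specification limits, consistent with the abstract's observation that the calculated control limits were tighter than the recommended specification limits.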
Zhu, Zhaozhong; Anttila, Verneri; Smoller, Jordan W; Lee, Phil H
2018-01-01
Advances in recent genome wide association studies (GWAS) suggest that pleiotropic effects on human complex traits are widespread. A number of classic and recent meta-analysis methods have been used to identify genetic loci with pleiotropic effects, but the overall performance of these methods is not well understood. In this work, we use extensive simulations and case studies of GWAS datasets to investigate the power and type-I error rates of ten meta-analysis methods. We specifically focus on three conditions commonly encountered in the studies of multiple traits: (1) extensive heterogeneity of genetic effects; (2) characterization of trait-specific association; and (3) inflated correlation of GWAS due to overlapping samples. Although the statistical power is highly variable under distinct study conditions, we found the superior power of several methods under diverse heterogeneity. In particular, classic fixed-effects model showed surprisingly good performance when a variant is associated with more than a half of study traits. As the number of traits with null effects increases, ASSET performed the best along with competitive specificity and sensitivity. With opposite directional effects, CPASSOC featured the first-rate power. However, caution is advised when using CPASSOC for studying genetically correlated traits with overlapping samples. We conclude with a discussion of unresolved issues and directions for future research.
ERIC Educational Resources Information Center
Blanchette, Judith
2012-01-01
The purpose of this empirical study was to determine the extent to which three different objective analytical methods--sequence analysis, surface cohesion analysis, and lexical cohesion analysis--can most accurately identify specific characteristics of online interaction. Statistically significant differences were found in all points of…
Optimization of Statistical Methods Impact on Quantitative Proteomics Data.
Pursiheimo, Anna; Vehmas, Anni P; Afzal, Saira; Suomi, Tomi; Chand, Thaman; Strauss, Leena; Poutanen, Matti; Rokka, Anne; Corthals, Garry L; Elo, Laura L
2015-10-02
As tools for quantitative label-free mass spectrometry (MS) rapidly develop, a consensus about the best practices is not apparent. In the work described here we compared popular statistical methods for detecting differential protein expression from quantitative MS data using both controlled experiments with known quantitative differences for specific proteins used as standards as well as "real" experiments where differences in protein abundance are not known a priori. Our results suggest that data-driven reproducibility-optimization can consistently produce reliable differential expression rankings for label-free proteome tools and are straightforward in their application.
Methods of Suicide by Age: Sex and Race Differences among the Young and Old.
ERIC Educational Resources Information Center
McIntosh, John L.; Santos, John F.
1986-01-01
Annual official statistics for specific methods of suicide (firearms, hanging, poisons) by age for different sex and racial groups (Whites, Blacks, non-Whites excluding Black) were examined from 1960 to 1978. Comparisons among the age-sex-race groups, along with trends over time and differences in the methods employed, were noted. (Author/ABL)
VALUE - A Framework to Validate Downscaling Approaches for Climate Change Studies
NASA Astrophysics Data System (ADS)
Maraun, Douglas; Widmann, Martin; Gutiérrez, José M.; Kotlarski, Sven; Chandler, Richard E.; Hertig, Elke; Wibig, Joanna; Huth, Radan; Wilcke, Renate A. I.
2015-04-01
VALUE is an open European network to validate and compare downscaling methods for climate change research. VALUE aims to foster collaboration and knowledge exchange between climatologists, impact modellers, statisticians, and stakeholders to establish an interdisciplinary downscaling community. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. Here, we present the key ingredients of this framework. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur: what is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Do methods fail in representing regional climate change? How is the overall representation of regional climate, including errors inherited from global climate models? The framework will be the basis for a comprehensive community-open downscaling intercomparison study, but is intended also to provide general guidance for other validation studies.
VALUE: A framework to validate downscaling approaches for climate change studies
NASA Astrophysics Data System (ADS)
Maraun, Douglas; Widmann, Martin; Gutiérrez, José M.; Kotlarski, Sven; Chandler, Richard E.; Hertig, Elke; Wibig, Joanna; Huth, Radan; Wilcke, Renate A. I.
2015-01-01
VALUE is an open European network to validate and compare downscaling methods for climate change research. VALUE aims to foster collaboration and knowledge exchange between climatologists, impact modellers, statisticians, and stakeholders to establish an interdisciplinary downscaling community. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. In this paper, we present the key ingredients of this framework. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur: what is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Do methods fail in representing regional climate change? How is the overall representation of regional climate, including errors inherited from global climate models? The framework will be the basis for a comprehensive community-open downscaling intercomparison study, but is intended also to provide general guidance for other validation studies.
May, Michael R; Moore, Brian R
2016-11-01
Evolutionary biologists have long been fascinated by the extreme differences in species numbers across branches of the Tree of Life. This has motivated the development of statistical methods for detecting shifts in the rate of lineage diversification across the branches of phylogenic trees. One of the most frequently used methods, MEDUSA, explores a set of diversification-rate models, where each model assigns branches of the phylogeny to a set of diversification-rate categories. Each model is first fit to the data, and the Akaike information criterion (AIC) is then used to identify the optimal diversification model. Surprisingly, the statistical behavior of this popular method is uncharacterized, which is a concern in light of: (1) the poor performance of the AIC as a means of choosing among models in other phylogenetic contexts; (2) the ad hoc algorithm used to visit diversification models; and (3) errors that we reveal in the likelihood function used to fit diversification models to the phylogenetic data. Here, we perform an extensive simulation study demonstrating that MEDUSA (1) has a high false-discovery rate (on average, spurious diversification-rate shifts are identified [Formula: see text] of the time), and (2) provides biased estimates of diversification-rate parameters. Understanding the statistical behavior of MEDUSA is critical both to empirical researchers (in order to clarify whether these methods can make reliable inferences from empirical datasets) and to theoretical biologists (in order to clarify the specific problems that need to be solved in order to develop more reliable approaches for detecting shifts in the rate of lineage diversification). [Akaike information criterion; extinction; lineage-specific diversification rates; phylogenetic model selection; speciation.]. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
May, Michael R.; Moore, Brian R.
2016-01-01
Evolutionary biologists have long been fascinated by the extreme differences in species numbers across branches of the Tree of Life. This has motivated the development of statistical methods for detecting shifts in the rate of lineage diversification across the branches of phylogenic trees. One of the most frequently used methods, MEDUSA, explores a set of diversification-rate models, where each model assigns branches of the phylogeny to a set of diversification-rate categories. Each model is first fit to the data, and the Akaike information criterion (AIC) is then used to identify the optimal diversification model. Surprisingly, the statistical behavior of this popular method is uncharacterized, which is a concern in light of: (1) the poor performance of the AIC as a means of choosing among models in other phylogenetic contexts; (2) the ad hoc algorithm used to visit diversification models, and; (3) errors that we reveal in the likelihood function used to fit diversification models to the phylogenetic data. Here, we perform an extensive simulation study demonstrating that MEDUSA (1) has a high false-discovery rate (on average, spurious diversification-rate shifts are identified ≈30% of the time), and (2) provides biased estimates of diversification-rate parameters. Understanding the statistical behavior of MEDUSA is critical both to empirical researchers—in order to clarify whether these methods can make reliable inferences from empirical datasets—and to theoretical biologists—in order to clarify the specific problems that need to be solved in order to develop more reliable approaches for detecting shifts in the rate of lineage diversification. [Akaike information criterion; extinction; lineage-specific diversification rates; phylogenetic model selection; speciation.] PMID:27037081
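The model-selection step described in this abstract, fitting each candidate model and then choosing by the Akaike information criterion, can be sketched generically as below. This only illustrates AIC = 2k − 2 ln L applied to already-fitted models; it is not MEDUSA's search algorithm or likelihood, and the model names and log-likelihood values are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(models):
    """Pick the model with the smallest AIC.

    `models` maps a model name to (maximized log-likelihood, number of parameters).
    Returns the winning name and the full table of AIC scores.
    """
    scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fitted models with increasing numbers of rate categories
candidates = {
    "one_rate": (-120.0, 2),
    "two_rate": (-115.0, 5),
    "three_rate": (-114.5, 8),
}
best, scores = best_by_aic(candidates)
print(best, scores)
```

In this toy example the extra parameters of the largest model are not repaid by its small gain in likelihood, so the intermediate model is selected.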
Verification of Eulerian-Eulerian and Eulerian-Lagrangian simulations for fluid-particle flows
NASA Astrophysics Data System (ADS)
Kong, Bo; Patel, Ravi G.; Capecelatro, Jesse; Desjardins, Olivier; Fox, Rodney O.
2017-11-01
In this work, we study the performance of three simulation techniques for fluid-particle flows: (1) a volume-filtered Euler-Lagrange approach (EL), (2) a quadrature-based moment method using the anisotropic Gaussian closure (AG), and (3) a traditional two-fluid model. By simulating two problems: particles in frozen homogeneous isotropic turbulence (HIT), and cluster-induced turbulence (CIT), the convergence of the methods under grid refinement is found to depend on the simulation method and the specific problem, with CIT simulations facing fewer difficulties than HIT. Although EL converges under refinement for both HIT and CIT, its statistical results exhibit dependence on the techniques used to extract statistics for the particle phase. For HIT, converging both EE methods (TFM and AG) poses challenges, while for CIT, AG and EL produce similar results. Overall, all three methods face challenges when trying to extract converged, parameter-independent statistics due to the presence of shocks in the particle phase. National Science Foundation and National Energy Technology Laboratory.
Gai, Liping; Liu, Hui; Cui, Jing-Hui; Yu, Weijian; Ding, Xiao-Dong
2017-03-20
The purpose of this study was to examine the specific allele combinations of three loci associated with liver cancer, stomach cancer, hematencephalon, and chronic obstructive pulmonary disease (COPD), and to explore the feasibility of the research methods. We explored different mathematical methods for statistical analysis to assess the association between genotype and phenotype. We also analyzed the statistical results for allele combinations of three loci using the difference value method and the ratio method. DNA blood samples were collected from 50 patients with liver cancer, 75 with stomach cancer, 50 with hematencephalon, and 72 with COPD, and from 200 individuals from the normal population. All samples were from Chinese individuals. Alleles from short tandem repeat (STR) loci were determined using the STR Profiler plus PCR amplification kit (15 STR loci). Previous research was based on combinations of single-locus alleles and combinations of cross-loci (two loci) alleles. Allele combinations of three loci were obtained by computer counting, and a stronger genetic signal was obtained. The method of allele combinations of three loci can help to identify statistically significant differences in allele combinations between patients with liver cancer, stomach cancer, hematencephalon, or COPD and the normal population. The probability of illness followed different rules and had apparent specificity. This method can be extended to other diseases and may provide a reference for early clinical diagnosis. Copyright © 2016. Published by Elsevier B.V.
Partitioning heritability by functional annotation using genome-wide association summary statistics.
Finucane, Hilary K; Bulik-Sullivan, Brendan; Gusev, Alexander; Trynka, Gosia; Reshef, Yakir; Loh, Po-Ru; Anttila, Verneri; Xu, Han; Zang, Chongzhi; Farh, Kyle; Ripke, Stephan; Day, Felix R; Purcell, Shaun; Stahl, Eli; Lindstrom, Sara; Perry, John R B; Okada, Yukinori; Raychaudhuri, Soumya; Daly, Mark J; Patterson, Nick; Neale, Benjamin M; Price, Alkes L
2015-11-01
Recent work has demonstrated that some functional categories of the genome contribute disproportionately to the heritability of complex diseases. Here we analyze a broad set of functional elements, including cell type-specific elements, to estimate their polygenic contributions to heritability in genome-wide association studies (GWAS) of 17 complex diseases and traits with an average sample size of 73,599. To enable this analysis, we introduce a new method, stratified LD score regression, for partitioning heritability from GWAS summary statistics while accounting for linked markers. This new method is computationally tractable at very large sample sizes and leverages genome-wide information. Our findings include a large enrichment of heritability in conserved regions across many traits, a very large immunological disease-specific enrichment of heritability in FANTOM5 enhancers and many cell type-specific enrichments, including significant enrichment of central nervous system cell types in the heritability of body mass index, age at menarche, educational attainment and smoking behavior.
Switzer, P.; Harden, J.W.; Mark, R.K.
1988-01-01
A statistical method for estimating rates of soil development in a given region based on calibration from a series of dated soils is used to estimate ages of soils in the same region that are not dated directly. The method is designed specifically to account for sampling procedures and uncertainties that are inherent in soil studies. Soil variation and measurement error, uncertainties in calibration dates and their relation to the age of the soil, and the limited number of dated soils are all considered. Maximum likelihood (ML) is employed to estimate a parametric linear calibration curve, relating soil development to time or age on suitably transformed scales. Soil variation on a geomorphic surface of a certain age is characterized by replicate sampling of soils on each surface; such variation is assumed to have a Gaussian distribution. The age of a geomorphic surface is described by older and younger bounds. This technique allows age uncertainty to be characterized by either a Gaussian distribution or by a triangular distribution using minimum, best-estimate, and maximum ages. The calibration curve is taken to be linear after suitable (in certain cases logarithmic) transformations, if required, of the soil parameter and age variables. Soil variability, measurement error, and departures from linearity are described in a combined fashion using Gaussian distributions with variances particular to each sampled geomorphic surface and the number of sample replicates. Uncertainty in age of a geomorphic surface used for calibration is described using three parameters by one of two methods. In the first method, upper and lower ages are specified together with a coverage probability; this specification is converted to a Gaussian distribution with the appropriate mean and variance. 
In the second method, "absolute" older and younger ages are specified together with a most probable age; this specification is converted to an asymmetric triangular distribution with mode at the most probable age. The statistical variability of the ML-estimated calibration curve is assessed by a Monte Carlo method in which simulated data sets repeatedly are drawn from the distributional specification; calibration parameters are reestimated for each such simulation in order to assess their statistical variability. Several examples are used for illustration. The age of undated soils in a related setting may be estimated from the soil data using the fitted calibration curve. A second simulation to assess age estimate variability is described and applied to the examples. © 1988 International Association for Mathematical Geology.
Alternative Derivations of the Statistical Mechanical Distribution Laws
Wall, Frederick T.
1971-01-01
A new approach is presented for the derivation of statistical mechanical distribution laws. The derivations are accomplished by minimizing the Helmholtz free energy under constant temperature and volume, instead of maximizing the entropy under constant energy and volume. An alternative method involves stipulating equality of chemical potential, or equality of activity, for particles in different energy levels. This approach leads to a general statement of distribution laws applicable to all systems for which thermodynamic probabilities can be written. The methods also avoid use of the calculus of variations, Lagrangian multipliers, and Stirling's approximation for the factorial. The results are applied specifically to Boltzmann, Fermi-Dirac, and Bose-Einstein statistics. The special significance of chemical potential and activity is discussed for microscopic systems. PMID:16578712
Skelly, Daniel A.; Johansson, Marnie; Madeoy, Jennifer; Wakefield, Jon; Akey, Joshua M.
2011-01-01
Variation in gene expression is thought to make a significant contribution to phenotypic diversity among individuals within populations. Although high-throughput cDNA sequencing offers a unique opportunity to delineate the genome-wide architecture of regulatory variation, new statistical methods need to be developed to capitalize on the wealth of information contained in RNA-seq data sets. To this end, we developed a powerful and flexible hierarchical Bayesian model that combines information across loci to allow both global and locus-specific inferences about allele-specific expression (ASE). We applied our methodology to a large RNA-seq data set obtained in a diploid hybrid of two diverse Saccharomyces cerevisiae strains, as well as to RNA-seq data from an individual human genome. Our statistical framework accurately quantifies levels of ASE with specified false-discovery rates, achieving high reproducibility between independent sequencing platforms. We pinpoint loci that show unusual and biologically interesting patterns of ASE, including allele-specific alternative splicing and transcription termination sites. Our methodology provides a rigorous, quantitative, and high-resolution tool for profiling ASE across whole genomes. PMID:21873452
NASA Astrophysics Data System (ADS)
Kerr, Laura T.; Adams, Aine; O'Dea, Shirley; Domijan, Katarina; Cullen, Ivor; Hennelly, Bryan M.
2014-05-01
Raman microspectroscopy can be applied to the urinary bladder for highly accurate classification and diagnosis of bladder cancer. This technique can be applied in vitro to bladder epithelial cells obtained from urine cytology or in vivo as an "optical biopsy" to provide results in real-time with higher sensitivity and specificity than current clinical methods. However, there exists a high degree of variability across experimental parameters which need to be standardised before this technique can be utilized in an everyday clinical environment. In this study, we investigate different laser wavelengths (473 nm and 532 nm), sample substrates (glass, fused silica and calcium fluoride) and multivariate statistical methods in order to gain insight into how these various experimental parameters impact on the sensitivity and specificity of Raman cytology.
Chen, Yi-Ju; Lu, Cheng-Tsung; Huang, Kai-Yao; Wu, Hsin-Yi; Chen, Yu-Ju; Lee, Tzong-Yi
2015-01-01
S-glutathionylation, the covalent attachment of a glutathione (GSH) to the sulfur atom of cysteine, is a selective and reversible protein post-translational modification (PTM) that regulates protein activity, localization, and stability. Despite its implication in the regulation of protein functions and cell signaling, the substrate specificity of cysteine S-glutathionylation remains unknown. Based on a total of 1783 experimentally identified S-glutathionylation sites from mouse macrophages, this work presents an informatics investigation of S-glutathionylation sites, including structural factors such as the composition of flanking amino acids and the accessible surface area (ASA). TwoSampleLogo shows that positively charged amino acids flanking the S-glutathionylated cysteine may influence the formation of S-glutathionylation in a closed three-dimensional environment. A statistical method is further applied to iteratively detect the conserved substrate motifs with statistical significance. A support vector machine (SVM) is then applied to generate a predictive model that considers the substrate motifs. According to five-fold cross-validation, the SVMs trained with substrate motifs achieve enhanced sensitivity, specificity, and accuracy, and provide promising performance on an independent test set. The effectiveness of the proposed method is demonstrated by the correct identification of previously reported S-glutathionylation sites of mouse thioredoxin (TXN) and human protein tyrosine phosphatase 1b (PTP1B). Finally, the constructed models are adopted to implement an effective web-based tool, named GSHSite (http://csb.cse.yzu.edu.tw/GSHSite/), for identifying uncharacterized GSH substrate sites on protein sequences. PMID:25849935
Ensemble stacking mitigates biases in inference of synaptic connectivity.
Chambers, Brendan; Levy, Maayan; Dechery, Joseph B; MacLean, Jason N
2018-01-01
A promising alternative to directly measuring the anatomical connections in a neuronal population is inferring the connections from the activity. We employ simulated spiking neuronal networks to compare and contrast commonly used inference methods that identify likely excitatory synaptic connections using statistical regularities in spike timing. We find that simple adjustments to standard algorithms improve inference accuracy: A signing procedure improves the power of unsigned mutual-information-based approaches and a correction that accounts for differences in mean and variance of background timing relationships, such as those expected to be induced by heterogeneous firing rates, increases the sensitivity of frequency-based methods. We also find that different inference methods reveal distinct subsets of the synaptic network and each method exhibits different biases in the accurate detection of reciprocity and local clustering. To correct for errors and biases specific to single inference algorithms, we combine methods into an ensemble. Ensemble predictions, generated as a linear combination of multiple inference algorithms, are more sensitive than the best individual measures alone, and are more faithful to ground-truth statistics of connectivity, mitigating biases specific to single inference methods. These weightings generalize across simulated datasets, emphasizing the potential for the broad utility of ensemble-based approaches.
Bayesian methods in reliability
NASA Astrophysics Data System (ADS)
Sander, P.; Badoux, R.
1991-11-01
The present proceedings from a course on Bayesian methods in reliability encompasses Bayesian statistical methods and their computational implementation, models for analyzing censored data from nonrepairable systems, the traits of repairable systems and growth models, the use of expert judgment, and a review of the problem of forecasting software reliability. Specific issues addressed include the use of Bayesian methods to estimate the leak rate of a gas pipeline, approximate analyses under great prior uncertainty, reliability estimation techniques, and a nonhomogeneous Poisson process. Also addressed are the calibration sets and seed variables of expert judgment systems for risk assessment, experimental illustrations of the use of expert judgment for reliability testing, and analyses of the predictive quality of software-reliability growth models such as the Weibull order statistics.
TRAPR: R Package for Statistical Analysis and Visualization of RNA-Seq Data.
Lim, Jae Hyun; Lee, Soo Youn; Kim, Ju Han
2017-03-01
High-throughput transcriptome sequencing, also known as RNA sequencing (RNA-Seq), is a standard technology for measuring gene expression with unprecedented accuracy. Numerous bioconductor packages have been developed for the statistical analysis of RNA-Seq data. However, these tools focus on specific aspects of the data analysis pipeline, and are difficult to appropriately integrate with one another due to their disparate data structures and processing methods. They also lack visualization methods to confirm the integrity of the data and the process. In this paper, we propose an R-based RNA-Seq analysis pipeline called TRAPR, an integrated tool that facilitates the statistical analysis and visualization of RNA-Seq expression data. TRAPR provides various functions for data management, the filtering of low-quality data, normalization, transformation, statistical analysis, data visualization, and result visualization that allow researchers to build customized analysis pipelines.
Multiple Versus Single Set Validation of Multivariate Models to Avoid Mistakes.
Harrington, Peter de Boves
2018-01-02
Validation of multivariate models is of current importance for a wide range of chemical applications. Although important, it is neglected. The common practice is to use a single external validation set for evaluation. This approach is deficient and may mislead investigators with results that are specific to the single validation set of data. In addition, no statistics are available regarding the precision of a derived figure of merit (FOM). A statistical approach using bootstrapped Latin partitions is advocated. This validation method makes efficient use of the data because each object is used once for validation. It was reviewed a decade earlier, but primarily for the optimization of chemometric models; this review presents the reasons it should be used for generalized statistical validation. Average FOMs with confidence intervals are reported, and powerful, matched-sample statistics may be applied for comparing models and methods. Examples demonstrate the problems with single validation sets.
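The partitioning idea in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: the function names, the class-stratified dealing of objects into partitions, and the default counts are all hypothetical choices made for the example. What it does preserve from the abstract is that, within each bootstrap, every object is used exactly once for validation.

```python
import random
from collections import defaultdict


def latin_partitions(labels, n_partitions, rng):
    """Deal object indices into partitions class by class (assumed stratification)."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    partitions = [[] for _ in range(n_partitions)]
    for indices in by_class.values():
        rng.shuffle(indices)  # random order within each class
        for position, idx in enumerate(indices):
            partitions[position % n_partitions].append(idx)
    return partitions


def bootstrapped_latin_partitions(labels, n_partitions=4, n_bootstraps=10, seed=0):
    """Yield, for each bootstrap, a list of (train, validation) index splits."""
    rng = random.Random(seed)
    for _ in range(n_bootstraps):
        partitions = latin_partitions(labels, n_partitions, rng)
        splits = []
        for held_out in range(n_partitions):
            valid = sorted(partitions[held_out])
            train = sorted(
                idx
                for p, part in enumerate(partitions)
                if p != held_out
                for idx in part
            )
            splits.append((train, valid))
        yield splits
```

A figure of merit would then be computed once per bootstrap from the pooled validation predictions, and the spread across bootstraps gives the precision estimate that a single validation set cannot provide.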
Bryant, Fred B
2016-12-01
This paper introduces a special section of the current issue of the Journal of Evaluation in Clinical Practice that includes a set of 6 empirical articles showcasing a versatile, new machine-learning statistical method, known as optimal data (or discriminant) analysis (ODA), specifically designed to produce statistical models that maximize predictive accuracy. As this set of papers clearly illustrates, ODA offers numerous important advantages over traditional statistical methods: advantages that enhance the validity and reproducibility of statistical conclusions in empirical research. This issue of the journal also includes a review of a recently published book that provides a comprehensive introduction to the logic, theory, and application of ODA in empirical research. It is argued that researchers have much to gain by using ODA to analyze their data. © 2016 John Wiley & Sons, Ltd.
A high-fidelity weather time series generator using the Markov Chain process on a piecewise level
NASA Astrophysics Data System (ADS)
Hersvik, K.; Endrerud, O.-E. V.
2017-12-01
A method is developed for generating a set of unique weather time-series based on an existing weather series. The method allows statistically valid weather variations to take place within repeated simulations of offshore operations. The numerous generated time series need to share the same statistical qualities as the original time series. Statistical qualities here refer mainly to the distribution of weather windows available for work, including durations and frequencies of such weather windows, and seasonal characteristics. The method is based on the Markov chain process. The core new development lies in how the Markov Process is used, specifically by joining small pieces of random length time series together rather than joining individual weather states, each from a single time step, which is a common solution found in the literature. This new Markov model shows favorable characteristics with respect to the requirements set forth and all aspects of the validation performed.
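One possible reading of the piecewise idea described above is sketched below in Python. It is an assumption-laden illustration rather than the authors' algorithm: the weather series is taken to be a list of discrete states, the piece-length bounds are arbitrary, and the function name `generate_series` is hypothetical. Each new piece is copied from a point in the original series that directly follows an occurrence of the current last state, so every join reproduces a transition that was actually observed.

```python
import random


def generate_series(original, length, min_piece=2, max_piece=6, seed=0):
    """Build a synthetic series by chaining random-length pieces of the original."""
    rng = random.Random(seed)
    out = [original[0]]
    while len(out) < length:
        last = out[-1]
        # Candidate starts: positions that follow an occurrence of the last state.
        starts = [i + 1 for i in range(len(original) - 1) if original[i] == last]
        if not starts:
            # Assumed fallback when the state only appears at the very end.
            starts = [0]
        start = rng.choice(starts)
        piece_len = rng.randint(min_piece, max_piece)
        out.extend(original[start:start + piece_len])
    return out[:length]
```

Copying whole pieces keeps the within-piece persistence of the original record, which is the property the abstract ties to the duration of weather windows.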
A powerful score-based test statistic for detecting gene-gene co-association.
Xu, Jing; Yuan, Zhongshang; Ji, Jiadong; Zhang, Xiaoshuai; Li, Hongkai; Wu, Xuesen; Xue, Fuzhong; Liu, Yanxun
2016-01-29
The genetic variants identified by genome-wide association studies (GWAS) can only account for a small proportion of the total heritability of complex disease. The existence of gene-gene joint effects, which contain the main effects and their co-association, is one of the possible explanations for the "missing heritability" problem. Gene-gene co-association refers to the extent to which the joint effects of two genes differ from the main effects, due not only to the traditional interaction under a nearly independent condition but also to the correlation between genes. Generally, genes tend to work collaboratively within a specific pathway or network contributing to the disease, and the specific disease-associated loci will often be highly correlated (e.g. single nucleotide polymorphisms (SNPs) in linkage disequilibrium). Therefore, we proposed a novel score-based statistic (SBS) as a gene-based method for detecting gene-gene co-association. Various simulations illustrate that, under different sample sizes, marginal effects of causal SNPs and co-association levels, the proposed SBS performs better than other existing methods, including single SNP-based and principal component analysis (PCA)-based logistic regression models, the statistics based on canonical correlations (CCU), kernel canonical correlation analysis (KCCU), partial least squares path modeling (PLSPM) and the delta-square (δ²) statistic. The real data analysis of rheumatoid arthritis (RA) further confirmed its advantages in practice. SBS is a powerful and efficient gene-based method for detecting gene-gene co-association.
A close examination of double filtering with fold change and t test in microarray analysis
2009-01-01
Background: Many researchers use the double filtering procedure with fold change and t test to identify differentially expressed genes, in the hope that the double filtering will provide extra confidence in the results. Due to its simplicity, the double filtering procedure has been popular with applied researchers despite the development of more sophisticated methods. Results: This paper, for the first time to our knowledge, provides theoretical insight on the drawback of the double filtering procedure. We show that fold change assumes all genes to have a common variance while t statistic assumes gene-specific variances. The two statistics are based on contradicting assumptions. Under the assumption that gene variances arise from a mixture of a common variance and gene-specific variances, we develop the theoretically most powerful likelihood ratio test statistic. We further demonstrate that the posterior inference based on a Bayesian mixture model and the widely used significance analysis of microarrays (SAM) statistic are better approximations to the likelihood ratio test than the double filtering procedure. Conclusion: We demonstrate through hypothesis testing theory, simulation studies and real data examples, that well constructed shrinkage testing methods, which can be united under the mixture gene variance assumption, can considerably outperform the double filtering procedure. PMID:19995439
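The double filtering procedure criticised above is easy to state in code, which makes the contrast between the two statistics concrete. The following Python sketch is illustrative only: the cutoffs, the function names, and the use of an unequal-variance (Welch-type) two-sample t statistic on log2 expression values are assumptions for the example, not details taken from the paper.

```python
import math
from statistics import mean, variance


def fold_change_and_t(group_a, group_b):
    """Return (log2 fold change, t statistic) for one gene's log2 expression values."""
    diff = mean(group_b) - mean(group_a)  # ignores the gene's variance entirely
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    if se > 0:
        t_stat = diff / se  # scaled by this gene's own variance
    elif diff != 0:
        t_stat = math.copysign(math.inf, diff)
    else:
        t_stat = 0.0
    return diff, t_stat


def double_filter(genes, fc_cut=1.0, t_cut=2.0):
    """Keep genes that pass BOTH the fold-change and the t-statistic cutoffs."""
    selected = []
    for name, (group_a, group_b) in genes.items():
        diff, t_stat = fold_change_and_t(group_a, group_b)
        if abs(diff) >= fc_cut and abs(t_stat) >= t_cut:
            selected.append(name)
    return selected
```

A gene with a large but noisy difference passes only the fold-change cutoff, while a gene with a tiny but very consistent difference passes only the t cutoff; requiring both mixes a common-variance criterion with a gene-specific-variance one, which is the inconsistency the abstract points out.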
Grain-Boundary Resistance in Copper Interconnects: From an Atomistic Model to a Neural Network
NASA Astrophysics Data System (ADS)
Valencia, Daniel; Wilson, Evan; Jiang, Zhengping; Valencia-Zapata, Gustavo A.; Wang, Kuang-Chung; Klimeck, Gerhard; Povolotskyi, Michael
2018-04-01
Orientation effects on the specific resistance of copper grain boundaries are studied systematically with two different atomistic tight-binding methods. A methodology is developed to model the specific resistance of grain boundaries in the ballistic limit using the embedded atom model, tight-binding methods, and nonequilibrium Green's functions. The methodology is validated against first-principles calculations for thin films with a single coincident grain boundary, with 6.4% deviation in the specific resistance. A statistical ensemble of 600 large, random structures with grains is studied. For structures with three grains, it is found that the distribution of specific resistances is close to normal. Finally, a compact model for grain-boundary-specific resistance is constructed based on a neural network.
Identification of abnormal accident patterns at intersections
DOT National Transportation Integrated Search
1999-08-01
This report presents the findings and recommendations based on the Identification of Abnormal Accident Patterns at Intersections. This project used a statistically valid sampling method to determine whether a specific intersection has an abnormally h...
Resonance Raman of BCC and normal skin
NASA Astrophysics Data System (ADS)
Liu, Cheng-hui; Sriramoju, Vidyasagar; Boydston-White, Susie; Wu, Binlin; Zhang, Chunyuan; Pei, Zhe; Sordillo, Laura; Beckman, Hugh; Alfano, Robert R.
2017-02-01
The Resonance Raman (RR) spectra of basal cell carcinoma (BCC) and normal human skin tissues were analyzed using 532 nm laser excitation. RR spectral differences in the vibrational fingerprints distinguished normal from cancerous skin tissues. The standard diagnostic criteria for BCC tissues are based on native RR biomarkers and their changes in peak intensity. The diagnostic algorithms for the classification of BCC and normal tissue were generated based on an SVM classifier and the PCA statistical method. These statistical methods were used to analyze the RR spectral data collected from skin tissues, yielding a diagnostic sensitivity of 98.7% and specificity of 79% compared with pathological reports.
The effect of substrate composition and storage time on urine specific gravity in dogs.
Steinberg, E; Drobatz, K; Aronson, L
2009-10-01
The purpose of this study was to evaluate the effects of substrate composition and storage time on urine specific gravity in dogs. This was a descriptive cohort study of 15 dogs. The urine specific gravity of free catch urine samples was analysed during a 5-hour time period using three separate storage methods: a closed syringe, a diaper pad and non-absorbable cat litter. The urine specific gravity increased over time in all three substrates. The syringe sample had the least change from baseline and the diaper sample had the greatest change from baseline. The urine specific gravity for the litter and diaper samples had a statistically significant increase from the 1-hour to the 5-hour time point. The urine specific gravity from canine urine stored either on a diaper or in a non-absorbable litter increased over time. Although the change was found to be statistically significant over the 5-hour study period, it is unlikely to be clinically significant.
Statistical procedures for evaluating daily and monthly hydrologic model predictions
Coffey, M.E.; Workman, S.R.; Taraba, J.L.; Fogle, A.W.
2004-01-01
The overall study objective was to evaluate the applicability of different qualitative and quantitative methods for comparing daily and monthly SWAT computer model hydrologic streamflow predictions to observed data, and to recommend statistical methods for use in future model evaluations. Statistical methods were tested using daily streamflows and monthly equivalent runoff depths. The statistical techniques included linear regression, Nash-Sutcliffe efficiency, nonparametric tests, t-test, objective functions, autocorrelation, and cross-correlation. None of the methods specifically applied to the non-normal distribution and dependence between data points for the daily predicted and observed data. Of the tested methods, median objective functions, sign test, autocorrelation, and cross-correlation were most applicable for the daily data. The robust coefficient of determination (CD*) and robust modeling efficiency (EF*) objective functions were the preferred methods for daily model results due to the ease of comparing these values with a fixed ideal reference value of one. Predicted and observed monthly totals were more normally distributed, and there was less dependence between individual monthly totals than was observed for the corresponding predicted and observed daily values. More statistical methods were available for comparing SWAT model-predicted and observed monthly totals. The 1995 monthly SWAT model predictions and observed data had a regression Rr² of 0.70, a Nash-Sutcliffe efficiency of 0.41, and the t-test failed to reject the equal data means hypothesis. The Nash-Sutcliffe coefficient and the Rr² coefficient were the preferred methods for monthly results due to the ability to compare these coefficients to a set ideal value of one.
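For readers unfamiliar with the Nash-Sutcliffe efficiency mentioned above, a minimal Python sketch of its standard definition follows: one minus the ratio of the squared prediction error to the squared deviation of the observations from their mean. The function name and the input checks are choices made for this example; the study's own computation is not shown in the abstract.

```python
def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency of predictions against observations."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the model predictions.
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Squared deviation of the observations from their own mean.
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("efficiency is undefined when all observations are equal")
    return 1.0 - sse / sst
```

A perfect model scores one, which is the ideal reference value the abstract refers to, and a model that does no better than predicting the observed mean scores zero.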
Patel, Ravi G.; Desjardins, Olivier; Kong, Bo; ...
2017-09-01
Here, we present a verification study of three simulation techniques for fluid–particle flows, including an Euler–Lagrange approach (EL) inspired by Jackson's seminal work on fluidized particles, a quadrature-based moment method based on the anisotropic Gaussian closure (AG), and the traditional two-fluid model. We perform simulations of two problems: particles in frozen homogeneous isotropic turbulence (HIT) and cluster-induced turbulence (CIT). For verification, we evaluate various techniques for extracting statistics from EL and study the convergence properties of the three methods under grid refinement. The convergence is found to depend on the simulation method and on the problem, with CIT simulations posing fewer difficulties than HIT. Specifically, EL converges under refinement for both HIT and CIT, but statistics exhibit dependence on the postprocessing parameters. For CIT, AG produces similar results to EL. For HIT, converging both TFM and AG poses challenges. Overall, extracting converged, parameter-independent Eulerian statistics remains a challenge for all methods.
NASA Astrophysics Data System (ADS)
Ha, Vu Thi Thanh; Hung, Vu Van; Hanh, Pham Thi Minh; Tuyen, Nguyen Viet; Hai, Tran Thi; Hieu, Ho Khac
2018-03-01
The thermodynamic and mechanical properties of III-V zinc-blende AlP, InP semiconductors and their alloys have been studied in detail using the statistical moment method, taking into account the anharmonicity effects of the lattice vibrations. The nearest neighbor distance, thermal expansion coefficient, bulk moduli, and specific heats at constant volume and constant pressure of the zinc-blende AlP, InP and AlyIn1-yP alloys are calculated as functions of the temperature. The statistical moment method calculations are performed by using the many-body Stillinger-Weber potential. The concentration dependences of the thermodynamic quantities of zinc-blende AlyIn1-yP crystals have also been discussed and compared with the experimental results. Our results are in reasonable agreement with earlier density functional theory calculations and can provide useful qualitative information for future experiments. The moment method can then be developed further for studying the atomistic structure and thermodynamic properties of nanoscale materials as well.
Theoretical limitations of quantification for noncompetitive sandwich immunoassays.
Woolley, Christine F; Hayes, Mark A; Mahanti, Prasun; Douglass Gilman, S; Taylor, Tom
2015-11-01
Immunoassays exploit the highly selective interaction between antibodies and antigens to provide a vital method for biomolecule detection at low concentrations. Developers and practitioners of immunoassays have long known that non-specific binding often restricts immunoassay limits of quantification (LOQs). Aside from non-specific binding, most efforts by analytical chemists to reduce the LOQ for these techniques have focused on improving the signal amplification methods and minimizing the limitations of the detection system. However, with detection technology now capable of sensing single-fluorescence molecules, this approach is unlikely to lead to dramatic improvements in the future. Here, fundamental interactions based on the law of mass action are analytically connected to signal generation, replacing the four- and five-parameter fittings commercially used to approximate sigmoidal immunoassay curves and allowing quantitative consideration of non-specific binding and statistical limitations in order to understand the ultimate detection capabilities of immunoassays. The restrictions imposed on limits of quantification by instrumental noise, non-specific binding, and counting statistics are discussed based on equilibrium relations for a sandwich immunoassay. Understanding the maximal capabilities of immunoassays for each of these regimes can greatly assist in the development and evaluation of immunoassay platforms. While many studies suggest that single molecule detection is possible through immunoassay techniques, here, it is demonstrated that the fundamental limit of quantification (precision of 10 % or better) for an immunoassay is approximately 131 molecules and this limit is based on fundamental and unavoidable statistical limitations.
NASA Astrophysics Data System (ADS)
Xu, Xianjin; Yan, Chengfei; Zou, Xiaoqin
2017-08-01
The growing number of protein-ligand complex structures, particularly the structures of proteins co-bound with different ligands, in the Protein Data Bank helps us tackle two major challenges in molecular docking studies: protein flexibility and the scoring function. Here, we introduce a systematic strategy that uses the information embedded in the known protein-ligand complex structures to improve both binding mode and binding affinity predictions. Specifically, a ligand similarity calculation method was employed to search for a receptor structure with a bound ligand sharing high similarity with the query ligand for use in docking. The strategy was applied to the two datasets (HSP90 and MAP4K4) in the recent D3R Grand Challenge 2015. In addition, for the HSP90 dataset, a system-specific scoring function (ITScore2_hsp90) was generated by recalibrating our statistical potential-based scoring function (ITScore2) using the known protein-ligand complex structures and the statistical mechanics-based iterative method. For the HSP90 dataset, better performance was achieved for both binding mode and binding affinity predictions compared with the original ITScore2 and with ensemble docking. For the MAP4K4 dataset, although there were only eight known protein-ligand complex structures, our docking strategy achieved performance comparable to that of ensemble docking. Our method for receptor conformational selection and our iterative method for the development of system-specific statistical potential-based scoring functions can be easily applied to other protein targets that have a number of protein-ligand complex structures available to improve predictions on binding.
Vallée, Julie; Souris, Marc; Fournet, Florence; Bochaton, Audrey; Mobillion, Virginie; Peyronnie, Karine; Salem, Gérard
2007-01-01
Background Geographical objectives and probabilistic methods are difficult to reconcile in a unique health survey. Probabilistic methods focus on individuals to provide estimates of a variable's prevalence with a certain precision, while geographical approaches emphasise the selection of specific areas to study interactions between spatial characteristics and health outcomes. A sample selected from a small number of specific areas creates statistical challenges: the observations are not independent at the local level, and this results in poor statistical validity at the global level. Therefore, it is difficult to construct a sample that is appropriate for both geographical and probability methods. Methods We used a two-stage selection procedure with a first non-random stage of selection of clusters. Instead of randomly selecting clusters, we deliberately chose a group of clusters, which as a whole would contain all the variation in health measures in the population. As there was no health information available before the survey, we selected a priori determinants that can influence the spatial homogeneity of the health characteristics. This method yields a distribution of variables in the sample that closely resembles that in the overall population, something that cannot be guaranteed with randomly-selected clusters, especially if the number of selected clusters is small. In this way, we were able to survey specific areas while minimising design effects and maximising statistical precision. Application We applied this strategy in a health survey carried out in Vientiane, Lao People's Democratic Republic. We selected well-known health determinants with unequal spatial distribution within the city: nationality and literacy. We deliberately selected a combination of clusters whose distribution of nationality and literacy is similar to the distribution in the general population. 
Conclusion This paper describes the conceptual reasoning behind the construction of the survey sample and shows that it can be advantageous to choose clusters using reasoned hypotheses, based on both probability and geographical approaches, in contrast to a conventional, random cluster selection strategy. PMID:17543100
Estimation of true height: a study in population-specific methods among young South African adults.
Lahner, Christen Renée; Kassier, Susanna Maria; Veldman, Frederick Johannes
2017-02-01
To investigate the accuracy of arm-associated height estimation methods in the calculation of true height compared with stretch stature in a sample of young South African adults. A cross-sectional descriptive design was employed. Pietermaritzburg, Westville and Durban, KwaZulu-Natal, South Africa, 2015. Convenience sample (N = 900) aged 18-24 years, which included an equal number of participants from both genders (150 per gender) stratified across race (Caucasian, Black African and Indian). Continuous variables that were investigated included: (i) stretch stature; (ii) total armspan; (iii) half-armspan; (iv) half-armspan ×2; (v) demi-span; (vi) demi-span gender-specific equation; (vii) WHO equation; and (viii) WHO-adjusted equations; as well as categorization according to gender and race. Statistical analysis was conducted using IBM SPSS Statistics Version 21.0. Significant correlations were identified between gender and height estimation measurements, with males being anatomically larger than females (P<0·001). Significant differences were documented when study participants were stratified according to race and gender (P<0·001). Anatomical similarities were noted between Indians and Black Africans, whereas Caucasians were anatomically different from the other race groups. Arm-associated height estimation methods were able to estimate true height; however, each method was specific to each gender and race group. Height can be calculated by using arm-associated measurements. Although universal equations for estimating true height exist, for the enhancement of accuracy, the use of equations that are race-, gender- and population-specific should be considered.
Krystkowiak, Izabella; Manguy, Jean; Davey, Norman E
2018-06-05
There is a pressing need for in silico tools that can aid in the identification of the complete repertoire of protein binding (SLiMs, MoRFs, miniMotifs) and modification (moiety attachment/removal, isomerization, cleavage) motifs. We have created PSSMSearch, an interactive web-based tool for rapid statistical modeling, visualization, discovery and annotation of protein motif specificity determinants to discover novel motifs in a proteome-wide manner. PSSMSearch analyses proteomes for regions with significant similarity to a motif specificity determinant model built from a set of aligned motif-containing peptides. Multiple scoring methods are available to build a position-specific scoring matrix (PSSM) describing the motif specificity determinant model. This model can then be modified by a user to add prior knowledge of specificity determinants through an interactive PSSM heatmap. PSSMSearch includes a statistical framework to calculate the significance of specificity determinant model matches against a proteome of interest. PSSMSearch also includes the SLiMSearch framework's annotation, motif functional analysis and filtering tools to highlight relevant discriminatory information. Additional tools to annotate statistically significant shared keywords and GO terms, or experimental evidence of interaction with a motif-recognizing protein have been added. Finally, PSSM-based conservation metrics have been created for taxonomic range analyses. The PSSMSearch web server is available at http://slim.ucd.ie/pssmsearch/.
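The idea of building a position-specific scoring matrix from aligned peptides and scanning a sequence with it can be sketched as follows. This is a generic log-odds construction, assuming a uniform amino-acid background and a fixed pseudocount; it is not PSSMSearch's implementation, whose scoring methods are not detailed in the abstract. The function names and the example peptides are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(peptides, pseudocount=1.0):
    """Return one dict per alignment column with log2-odds scores.

    Assumes a uniform background frequency (1/20), which is a simplification.
    """
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("peptides must be aligned to the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = Counter(p[col] for p in peptides)
        pssm.append({
            aa: math.log2(((counts.get(aa, 0) + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def best_match(pssm, sequence):
    """Slide the PSSM along a sequence and return (best_score, start_index)."""
    width = len(pssm)
    scored = [
        (sum(pssm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)


# Hypothetical proline-rich peptides and a toy sequence.
model = build_pssm(["PPLP", "PPVP", "PALP"])
score, start = best_match(model, "GGGPPLPGGG")
```

A proteome-wide search would repeat the scan over every protein and then assess the significance of each score, a step this sketch leaves out.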
Bautista, Ami C; Zhou, Lei; Jawa, Vibha
2013-10-01
Immunogenicity support during nonclinical biotherapeutic development can be resource intensive if supported by conventional methodologies. A universal indirect species-specific immunoassay can eliminate the need for biotherapeutic-specific anti-drug antibody immunoassays without compromising quality. By implementing the R's of sustainability (reduce, reuse, rethink), conservation of resources and greener laboratory practices were achieved in this study. Statistical analysis across four biotherapeutics supported identification of consistent product performance standards (cut points, sensitivity and reference limits) and a streamlined universal anti-drug antibody immunoassay method implementation strategy. We propose an efficient, fit-for-purpose, scientifically and statistically supported nonclinical immunogenicity assessment strategy. Utilization of a universal method and streamlined validation, while retaining comparability to conventional immunoassays and meeting the industry recommended standards, provides environmental credits in the scientific laboratory. Collectively, individual reductions in critical material consumption, energy usage, waste and non-environment friendly consumables, such as plastic and paper, support a greener laboratory environment.
A Bayesian approach to the statistical analysis of device preference studies.
Fu, Haoda; Qu, Yongming; Zhu, Baojin; Huster, William
2012-01-01
Drug delivery devices are required to have excellent technical specifications to deliver drugs accurately, and in addition, the devices should provide a satisfactory experience to patients because this can have a direct effect on drug compliance. To compare patients' experience with two devices, cross-over studies with patient-reported outcomes (PRO) as response variables are often used. Because of the strength of cross-over designs, each subject can directly compare the two devices by using the PRO variables, and variables indicating preference (preferring A, preferring B, or no preference) can be easily derived. Traditionally, methods based on frequentist statistics can be used to analyze such preference data, but there are some limitations for the frequentist methods. Recently, Bayesian methods are considered an acceptable method by the US Food and Drug Administration to design and analyze device studies. In this paper, we propose a Bayesian statistical method to analyze the data from preference trials. We demonstrate that the new Bayesian estimator enjoys some optimal properties versus the frequentist estimator. Copyright © 2012 John Wiley & Sons, Ltd.
Fathiazar, Elham; Anemuller, Jorn; Kretzberg, Jutta
2016-08-01
Voltage-Sensitive Dye (VSD) imaging is an optical imaging method that allows measuring the graded voltage changes of multiple neurons simultaneously. In neuroscience, this method is used to reveal networks of neurons involved in certain tasks. However, the recorded relative dye fluorescence changes are usually low and signals are superimposed by noise and artifacts. Therefore, establishing a reliable method to identify which cells are activated by specific stimulus conditions is the first step to identify functional networks. In this paper, we present a statistical method to identify stimulus-activated network nodes as cells, whose activities during sensory network stimulation differ significantly from the un-stimulated control condition. This method is demonstrated based on voltage-sensitive dye recordings from up to 100 neurons in a ganglion of the medicinal leech responding to tactile skin stimulation. Without relying on any prior physiological knowledge, the network nodes identified by our statistical analysis were found to match well with published cell types involved in tactile stimulus processing and to be consistent across stimulus conditions and preparations.
NASA Astrophysics Data System (ADS)
Ohyanagi, S.; Dileonardo, C.
2013-12-01
As a natural phenomenon, earthquake occurrence is difficult to predict. Statistical analysis of earthquake data was performed using candlestick chart and Bollinger Band methods. These statistical methods, commonly used in the financial world to analyze market trends, were tested against earthquake data. Earthquakes above Mw 4.0 located offshore of Sanriku (37.75°N ~ 41.00°N, 143.00°E ~ 144.50°E) from February 1973 to May 2013 were selected for analysis. Two specific patterns in earthquake occurrence were recognized through the analysis. One is a spread of the candlestick prior to the occurrence of events greater than Mw 6.0. A second pattern shows convergence in the Bollinger Band, which implies a positive or negative change in the trend of earthquakes. Both patterns match general models for the buildup and release of strain through the earthquake cycle, and agree with the characteristics of both the candlestick chart and Bollinger Band analysis. These results show there is a high correlation between patterns in earthquake occurrence and trend analysis by these two statistical methods. The results of this study support the application of these financial analysis methods to the analysis of earthquake occurrence.
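A Bollinger Band is a moving average bracketed by a multiple of the moving standard deviation, and band convergence corresponds to a shrinking band width. The sketch below shows that computation on a generic numeric series. The window length, the multiplier of 2 and the sample values are assumptions for illustration; the abstract does not say which series (for example, counts or magnitudes per period) or which parameters the authors used.

```python
from statistics import mean, pstdev


def bollinger_bands(values, window=20, num_std=2.0):
    """Return a list of (middle, upper, lower) tuples, one per full window."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    bands = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        middle = mean(chunk)
        spread = num_std * pstdev(chunk)
        bands.append((middle, middle + spread, middle - spread))
    return bands


def band_widths(bands):
    """Upper minus lower band; a shrinking width indicates convergence."""
    return [upper - lower for _, upper, lower in bands]


# Hypothetical event counts per period, not data from the study.
counts = [3, 5, 4, 6, 5, 5, 5, 5]
bands = bollinger_bands(counts, window=4)
widths = band_widths(bands)
```

In this toy series the width falls to zero once the counts stop varying, which is the kind of convergence the abstract refers to.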
Development of hi-resolution regional climate scenarios in Japan by statistical downscaling
NASA Astrophysics Data System (ADS)
Dairaku, K.
2016-12-01
Climate information and services for Impacts, Adaptation and Vulnerability (IAV) Assessments are of great concern. To meet the needs of stakeholders such as local governments, a Japanese national project, the Social Implementation Program on Climate Change Adaptation Technology (SI-CAT), was launched in December 2015. It develops reliable technologies for near-term climate change predictions. Multi-model ensemble regional climate scenarios with 1 km horizontal grid spacing over Japan are developed by using CMIP5 GCMs and a statistical downscaling method to support various municipal adaptation measures appropriate for possible regional climate changes. A statistical downscaling method, Bias Correction Spatial Disaggregation (BCSD), is employed to develop regional climate scenarios based on five CMIP5 RCP8.5 GCMs (MIROC5, MRI-CGCM3, GFDL-CM3, CSIRO-Mk3-6-0, HadGEM2-ES) for the periods of historical climate (1970-2005) and near future climate (2020-2055). Downscaled variables are monthly/daily precipitation and temperature. The file format is NetCDF4 (conforming to CF1.6, HDF5 compression). The developed regional climate scenarios will be expanded to meet the needs of stakeholders, and interface applications to access and download the data are under development. A statistical downscaling method does not necessarily represent locally forced nonlinear phenomena and extreme events such as heavy rain and heavy snow well. To complement the statistical method, a dynamical downscaling approach is also combined and applied to some specific regions where stakeholders need it. The added value of the statistical/dynamical downscaling methods compared with the parent GCMs is investigated.
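The bias-correction half of BCSD is commonly implemented as empirical quantile mapping: a model value is located in the model's historical distribution and replaced by the observed value at the same quantile. The sketch below illustrates that idea only. It assumes linear interpolation between order statistics, uses hypothetical function names and toy numbers, and omits the spatial disaggregation step; the abstract does not describe the project's actual implementation.

```python
from bisect import bisect_right


def fractional_rank(sorted_sample, x):
    """Probability position of x in the sample, in [0, 1], by linear interpolation."""
    if x <= sorted_sample[0]:
        return 0.0
    if x >= sorted_sample[-1]:
        return 1.0
    hi = bisect_right(sorted_sample, x)  # first index whose value exceeds x
    lo = hi - 1
    frac = (x - sorted_sample[lo]) / (sorted_sample[hi] - sorted_sample[lo])
    return (lo + frac) / (len(sorted_sample) - 1)


def quantile(sorted_sample, p):
    """Linearly interpolated quantile for p in [0, 1]."""
    pos = p * (len(sorted_sample) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_sample) - 1)
    frac = pos - lo
    return sorted_sample[lo] * (1 - frac) + sorted_sample[hi] * frac


def quantile_map(model_hist, observed, model_value):
    """Map a model value onto the observed distribution at the same quantile."""
    p = fractional_rank(sorted(model_hist), model_value)
    return quantile(sorted(observed), p)


# Toy numbers: the model median (3) maps to the observed median (16).
corrected = quantile_map([1, 2, 3, 4, 5], [12, 14, 16, 18, 20], 3)
```

Values outside the historical model range are clamped to the observed extremes here, which is one of several possible choices for handling the tails.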
No-Reference Video Quality Assessment Based on Statistical Analysis in 3D-DCT Domain.
Li, Xuelong; Guo, Qun; Lu, Xiaoqiang
2016-05-13
It is an important task to design models for universal no-reference video quality assessment (NR-VQA) in multiple video processing and computer vision applications. However, most existing NR-VQA metrics are designed for specific distortion types, which are often unknown in practical applications. A further deficiency is that the spatial and temporal information of videos is hardly considered simultaneously. In this paper, we propose a new NR-VQA metric based on the spatiotemporal natural video statistics (NVS) in the 3D discrete cosine transform (3D-DCT) domain. In the proposed method, a set of features is first extracted based on the statistical analysis of 3D-DCT coefficients to characterize the spatiotemporal statistics of videos in different views. These features are then used to predict the perceived video quality via an efficient linear support vector regression (SVR) model. The contributions of this paper are: 1) we explore the spatiotemporal statistics of videos in the 3D-DCT domain, which has an inherent spatiotemporal encoding advantage over other widely used 2D transformations; 2) we extract a small set of simple but effective statistical features for video visual quality prediction; 3) the proposed method is universal for multiple types of distortions and robust to different databases. The proposed method is tested on four widely used video databases. Extensive experimental results demonstrate that the proposed method is competitive with the state-of-the-art NR-VQA metrics and the top-performing FR-VQA and RR-VQA metrics.
Quaglio, Pietro; Yegenoglu, Alper; Torre, Emiliano; Endres, Dominik M.; Grün, Sonja
2017-01-01
Repeated, precise sequences of spikes are largely considered a signature of activation of cell assemblies. These repeated sequences are commonly known under the name of spatio-temporal patterns (STPs). STPs are hypothesized to play a role in the communication of information in the computational process operated by the cerebral cortex. A variety of statistical methods for the detection of STPs have been developed and applied to electrophysiological recordings, but such methods scale poorly with the current size of available parallel spike train recordings (more than 100 neurons). In this work, we introduce a novel method capable of overcoming the computational and statistical limits of existing analysis techniques in detecting repeating STPs within massively parallel spike trains (MPST). We employ advanced data mining techniques to efficiently extract repeating sequences of spikes from the data. Then, we introduce and compare two alternative approaches to distinguish statistically significant patterns from chance sequences. The first approach uses a measure known as conceptual stability, of which we investigate a computationally cheap approximation for applications to such large data sets. The second approach is based on the evaluation of pattern statistical significance. In particular, we provide an extension to STPs of a method we recently introduced for the evaluation of statistical significance of synchronous spike patterns. The performance of the two approaches is evaluated in terms of computational load and statistical power on a variety of artificial data sets that replicate specific features of experimental data. Both methods provide an effective and robust procedure for detection of STPs in MPST data. The method based on significance evaluation shows the best overall performance, although at a higher computational cost. We name the novel procedure the spatio-temporal Spike PAttern Detection and Evaluation (SPADE) analysis. PMID:28596729
Testing statistical isotropy in cosmic microwave background polarization maps
NASA Astrophysics Data System (ADS)
Rath, Pranati K.; Samal, Pramoda Kumar; Panda, Srikanta; Mishra, Debesh D.; Aluri, Pavan K.
2018-04-01
We apply our symmetry based Power tensor technique to test conformity of PLANCK Polarization maps with statistical isotropy. On a wide range of angular scales (l = 40 - 150), our preliminary analysis detects many statistically anisotropic multipoles in foreground cleaned full sky PLANCK polarization maps viz., COMMANDER and NILC. We also study the effect of residual foregrounds that may still be present in the Galactic plane using both common UPB77 polarization mask, as well as the individual component separation method specific polarization masks. However, some of the statistically anisotropic modes still persist, albeit significantly in NILC map. We further probed the data for any coherent alignments across multipoles in several bins from the chosen multipole range.
Langholz, Bryan; Thomas, Duncan C.; Stovall, Marilyn; Smith, Susan A.; Boice, John D.; Shore, Roy E.; Bernstein, Leslie; Lynch, Charles F.; Zhang, Xinbo; Bernstein, Jonine L.
2009-01-01
Summary Methods for the analysis of individually matched case-control studies with location-specific radiation dose and tumor location information are described. These include likelihood methods for analyses that just use cases with precise location of tumor information and methods that also include cases with imprecise tumor location information. The theory establishes that each of these likelihood based methods estimates the same radiation rate ratio parameters, within the context of the appropriate model for location and subject level covariate effects. The underlying assumptions are characterized and the potential strengths and limitations of each method are described. The methods are illustrated and compared using the WECARE study of radiation and asynchronous contralateral breast cancer. PMID:18647297
White, Sarah A; van den Broek, Nynke R
2004-05-30
Before introducing a new measurement tool it is necessary to evaluate its performance. Several statistical methods have been developed, or used, to evaluate the reliability and validity of a new assessment method in such circumstances. In this paper we review some commonly used methods. Data from a study that was conducted to evaluate the usefulness of a specific measurement tool (the WHO Colour Scale) are then used to illustrate the application of these methods. The WHO Colour Scale was developed under the auspices of the WHO to provide a simple, portable and reliable method of detecting anaemia. This Colour Scale is a discrete interval scale, whereas the actual haemoglobin values it is used to estimate are on a continuous interval scale and can be measured accurately using electrical laboratory equipment. The methods we consider are: linear regression; correlation coefficients; paired t-tests; plotting differences against mean values and deriving limits of agreement; kappa and weighted kappa statistics; sensitivity and specificity; an intraclass correlation coefficient; and the repeatability coefficient. We note that although the definition and properties of each of these methods are well established, inappropriate methods continue to be used in medical literature for assessing reliability and validity, as evidenced in the context of the evaluation of the WHO Colour Scale. Copyright 2004 John Wiley & Sons, Ltd.
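Two of the methods listed above, limits of agreement and the (unweighted) kappa statistic, are short enough to sketch directly. The code assumes the usual textbook definitions: limits of agreement as the mean paired difference plus or minus 1.96 sample standard deviations, and Cohen's kappa as observed agreement corrected for chance agreement. The function names and the sample readings are hypothetical and are not taken from the WHO Colour Scale study.

```python
from collections import Counter
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


def cohen_kappa(ratings_1, ratings_2):
    """Unweighted Cohen's kappa for two raters on the same subjects."""
    n = len(ratings_1)
    observed = sum(a == b for a, b in zip(ratings_1, ratings_2)) / n
    counts_1, counts_2 = Counter(ratings_1), Counter(ratings_2)
    expected = sum(counts_1[k] * counts_2[k] for k in counts_1) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical haemoglobin readings (g/dl) from a scale and a reference method.
bias, lower, upper = limits_of_agreement([10, 12, 14], [9, 12, 15])

# Hypothetical anaemic (1) / not anaemic (0) classifications by two raters.
kappa = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

A weighted kappa would additionally give partial credit for near-misses on an ordered scale, which matters for a discrete interval scale such as the one described.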
A statistical approach to selecting and confirming validation targets in -omics experiments
2012-01-01
Background Genomic technologies are, by their very nature, designed for hypothesis generation. In some cases, the hypotheses that are generated require that genome scientists confirm findings about specific genes or proteins. But one major advantage of high-throughput technology is that global genetic, genomic, transcriptomic, and proteomic behaviors can be observed. Manual confirmation of every statistically significant genomic result is prohibitively expensive. This has led researchers in genomics to adopt the strategy of confirming only a handful of the most statistically significant results, a small subset chosen for biological interest, or a small random subset. But there is no standard approach for selecting and quantitatively evaluating validation targets. Results Here we present a new statistical method and approach for statistically validating lists of significant results based on confirming only a small random sample. We apply our statistical method to show that the usual practice of confirming only the most statistically significant results does not statistically validate result lists. We analyze an extensively validated RNA-sequencing experiment to show that confirming a random subset can statistically validate entire lists of significant results. Finally, we analyze multiple publicly available microarray experiments to show that statistically validating random samples can both (i) provide evidence to confirm long gene lists and (ii) save thousands of dollars and hundreds of hours of labor over manual validation of each significant result. Conclusions For high-throughput -omics studies, statistical validation is a cost-effective and statistically valid approach to confirming lists of significant results. PMID:22738145
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matzke, Brett D.; Wilson, John E.; Hathaway, J.
2008-02-12
Statistically defensible methods are presented for developing geophysical detector sampling plans and analyzing data for munitions response sites where unexploded ordnance (UXO) may exist. Detection methods for identifying areas of elevated anomaly density from background density are shown. Additionally, methods are described which aid in the choice of transect pattern and spacing to assure with a degree of confidence that a target area (TA) of specific size, shape, and anomaly density will be identified using the detection methods. Methods for evaluating the sensitivity of designs to variation in certain parameters are also discussed. Methods presented have been incorporated into the Visual Sample Plan (VSP) software (free at http://dqo.pnl.gov/vsp) and demonstrated at multiple sites in the United States. Application examples from actual transect designs and surveys from the previous two years are demonstrated.
State-of-the-art in asphalt pavement specifications
DOT National Transportation Integrated Search
1984-07-01
The great increase in highway construction beginning in the 1950s made evident the need for better control of materials and construction. A comprehensive research and development program was begun to use statistical methods for quality assurance in...
Nguyen, Van-Nui; Huang, Kai-Yao; Huang, Chien-Hsun; Chang, Tzu-Hao; Bretaña, Neil; Lai, K; Weng, Julia; Lee, Tzong-Yi
2015-01-01
In eukaryotes, ubiquitin-conjugation is an important mechanism underlying proteasome-mediated degradation of proteins, and as such, plays an essential role in the regulation of many cellular processes. In the ubiquitin-proteasome pathway, E3 ligases play important roles by recognizing a specific protein substrate and catalyzing the attachment of ubiquitin to a lysine (K) residue. As more and more experimental data on ubiquitin conjugation sites become available, it becomes possible to develop prediction models that can be scaled to big data. However, no existing work has focused on investigating ubiquitinated substrate specificities. Herein, we present an approach that exploits an iterative statistical method to identify ubiquitin conjugation sites with substrate site specificities. In this investigation, a total of 6259 experimentally validated ubiquitinated proteins were obtained from dbPTM. After having filtered out homologous fragments with 40% sequence identity, the training data set contained 2658 ubiquitination sites (positive data) and 5532 non-ubiquitinated sites (negative data). Due to the difficulty in characterizing the substrate site specificities of E3 ligases by conventional sequence logo analysis, a recursive statistical method has been applied to obtain significant conserved motifs. The profile hidden Markov model (profile HMM) was adopted to construct the predictive models learned from the identified substrate motifs. A five-fold cross validation was then used to evaluate the predictive model, achieving sensitivity, specificity, and accuracy of 73.07%, 65.46%, and 67.93%, respectively. Additionally, an independent testing set, completely blind to the training data of the predictive model, was used to demonstrate that the proposed method could provide a promising accuracy (76.13%) and outperform other ubiquitination site prediction tools. 
A case study demonstrated the effectiveness of the characterized substrate motifs for identifying ubiquitination sites. The proposed method presents a practical means of preliminary analysis and greatly diminishes the total number of potential targets required for further experimental confirmation. This method may help unravel their mechanisms and roles in E3 recognition and ubiquitin-mediated protein degradation.
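The three cross-validation figures quoted above are linked by the class sizes: accuracy is the prevalence-weighted combination of sensitivity and specificity. The short check below recomputes accuracy from the reported sensitivity and specificity, assuming the rates apply to the full training set of 2658 positive and 5532 negative sites; the function name is hypothetical.

```python
def accuracy_from_rates(sensitivity, specificity, n_pos, n_neg):
    """Overall accuracy implied by sensitivity, specificity and class sizes."""
    true_pos = sensitivity * n_pos
    true_neg = specificity * n_neg
    return (true_pos + true_neg) / (n_pos + n_neg)


acc = accuracy_from_rates(0.7307, 0.6546, 2658, 5532)
print(f"{acc:.2%}")  # prints 67.93%
```

The recomputed value agrees with the reported 67.93%, so the three figures are mutually consistent under this assumption.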
Ha Dinh, Thi Thuy; Bonner, Ann; Clark, Robyn; Ramsbotham, Joanne; Hines, Sonia
2016-01-01
Chronic diseases are increasing worldwide and have become a significant burden to those affected by those diseases. Disease-specific education programs have demonstrated improved outcomes, although people do forget information quickly or memorize it incorrectly. The teach-back method was introduced in an attempt to reinforce education to patients. To date, the evidence regarding the effectiveness of health education employing the teach-back method in improved care has not yet been reviewed systematically. This systematic review examined the evidence on using the teach-back method in health education programs for improving adherence and self-management of people with chronic disease. Adults aged 18 years and over with one or more than one chronic disease. All types of interventions which included the teach-back method in an education program for people with chronic diseases. The comparator was chronic disease education programs that did not involve the teach-back method. Randomized and non-randomized controlled trials, cohort studies, before-after studies and case-control studies. The outcomes of interest were adherence, self-management, disease-specific knowledge, readmission, knowledge retention, self-efficacy and quality of life. Searches were conducted in CINAHL, MEDLINE, EMBASE, Cochrane CENTRAL, Web of Science, ProQuest Nursing and Allied Health Source, and Google Scholar databases. Search terms were combined by AND or OR in search strings. Reference lists of included articles were also searched for further potential references. Two reviewers conducted quality appraisal of papers using the Joanna Briggs Institute Meta-Analysis of Statistics Assessment and Review Instrument. Data were extracted using the Joanna Briggs Institute Meta-Analysis of Statistics Assessment and Review Instrument data extraction instruments. There was significant heterogeneity in selected studies, hence a meta-analysis was not possible and the results were presented in narrative form. 
Of the 21 articles retrieved in full, 12 on the use of the teach-back method met the inclusion criteria and were selected for analysis. Four studies confirmed improved disease-specific knowledge in intervention participants. One study showed a statistically significant improvement in adherence to medication and diet among type 2 diabetic patients in the intervention group compared to the control group (p < 0.001). Two studies found statistically significant improvements in self-efficacy (p = 0.0026 and p < 0.001) in the intervention groups. One study examined quality of life in heart failure patients but the results did not improve from the intervention (p = 0.59). Five studies found a reduction in readmission rates and hospitalization but these were not always statistically significant. Two studies showed improvement in daily weighing among heart failure participants, and in adherence to diet, exercise and foot care among those with type 2 diabetes. Overall, the teach-back method showed positive effects in a wide range of health care outcomes although these were not always statistically significant. Studies in this systematic review revealed improved outcomes in disease-specific knowledge, adherence, self-efficacy and the inhaler technique. There was a positive but inconsistent trend also seen in improved self-care and reduction of hospital readmission rates. There was limited evidence on improvement in quality of life or disease related knowledge retention. Evidence from the systematic review supports the use of the teach-back method in educating people with chronic disease to maximize their disease understanding and promote knowledge, adherence, self-efficacy and self-care skills. Future studies are required to strengthen the evidence on effects of the teach-back method. Larger randomized controlled trials will be needed to determine the effectiveness of the teach-back method in quality of life, reduction of readmission, and hospitalizations.
Rand, R.S.; Clark, R.N.; Livo, K.E.
2011-01-01
The Deepwater Horizon oil spill covered a very large geographical area in the Gulf of Mexico creating potentially serious environmental impacts on both marine life and the coastal shorelines. Knowing the oil's areal extent and thickness as well as denoting different categories of the oil's physical state is important for assessing these impacts. High spectral resolution data in hyperspectral imagery (HSI) sensors such as Airborne Visible and Infrared Imaging Spectrometer (AVIRIS) provide a valuable source of information that can be used for analysis by semi-automatic methods for tracking an oil spill's areal extent, oil thickness, and oil categories. However, the spectral behavior of oil in water is inherently a highly non-linear and variable phenomenon that changes depending on oil thickness and oil/water ratios. For certain oil thicknesses there are well-defined absorption features, whereas for very thin films sometimes there are almost no observable features. Feature-based imaging spectroscopy methods are particularly effective at classifying materials that exhibit specific well-defined spectral absorption features. Statistical methods are effective at classifying materials with spectra that exhibit a considerable amount of variability and that do not necessarily exhibit well-defined spectral absorption features. This study investigates feature-based and statistical methods for analyzing oil spills using hyperspectral imagery. The appropriate use of each approach is investigated and a combined feature-based and statistical method is proposed.
Xue, Xiaonan; Kim, Mimi Y; Castle, Philip E; Strickler, Howard D
2014-03-01
Studies to evaluate clinical screening tests often face the problem that the "gold standard" diagnostic approach is costly and/or invasive. It is therefore common to verify only a subset of negative screening tests using the gold standard method. However, undersampling the screen negatives can lead to substantial overestimation of the sensitivity and underestimation of the specificity of the diagnostic test. Our objective was to develop a simple and accurate statistical method to address this "verification bias." We developed a weighted generalized estimating equation approach to estimate, in a single model, the accuracy (eg, sensitivity/specificity) of multiple assays and simultaneously compare results between assays while addressing verification bias. This approach can be implemented using standard statistical software. Simulations were conducted to assess the proposed method. An example is provided using a cervical cancer screening trial that compared the accuracy of human papillomavirus and Pap tests, with histologic data as the gold standard. The proposed approach performed well in estimating and comparing the accuracy of multiple assays in the presence of verification bias. The proposed approach is an easy to apply and accurate method for addressing verification bias in studies of multiple screening methods. Copyright © 2014 Elsevier Inc. All rights reserved.
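The correction idea in the abstract above can be sketched with plain inverse-probability weighting. This is a simplified stand-in for the authors' weighted generalized estimating equation model, not a reproduction of it. The function name `ipw_accuracy`, the counts, and the verification fractions are all hypothetical and chosen only for illustration.

```python
def ipw_accuracy(verified, verified_fraction):
    """Estimate (sensitivity, specificity) from verified subjects only.

    verified: counts of verified subjects keyed by (screen_result, diseased).
    verified_fraction: share of each screen result sent to the gold standard.
    Each verified subject is weighted by 1 / (probability of being verified).
    """
    weight = {s: 1.0 / f for s, f in verified_fraction.items()}
    tp = verified[("pos", True)] * weight["pos"]
    fp = verified[("pos", False)] * weight["pos"]
    fn = verified[("neg", True)] * weight["neg"]
    tn = verified[("neg", False)] * weight["neg"]
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical trial: every screen positive is verified, but only 10% of
# screen negatives are.
verified = {
    ("pos", True): 80,
    ("pos", False): 120,
    ("neg", True): 2,
    ("neg", False): 98,
}
fractions = {"pos": 1.0, "neg": 0.1}

# Naive estimates ignore the undersampling of screen negatives.
naive_sens = verified[("pos", True)] / (
    verified[("pos", True)] + verified[("neg", True)]
)
naive_spec = verified[("neg", False)] / (
    verified[("neg", False)] + verified[("pos", False)]
)

sens, spec = ipw_accuracy(verified, fractions)
print(f"naive:    sensitivity={naive_sens:.3f} specificity={naive_spec:.3f}")
print(f"weighted: sensitivity={sens:.3f} specificity={spec:.3f}")
```

With these made-up counts the naive sensitivity is higher and the naive specificity lower than the weighted estimates, which is the direction of bias the abstract describes.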
NASA Astrophysics Data System (ADS)
Boichuk, T. M.; Bachinskiy, V. T.; Vanchuliak, O. Ya.; Minzer, O. P.; Garazdiuk, M.; Motrich, A. V.
2014-08-01
This research presents the results of an investigation of laser polarization fluorescence of biological layers (histological sections of the myocardium). The polarized structure of autofluorescence images of biological tissue layers was detected and investigated. A model is proposed to describe the formation of polarization-inhomogeneous autofluorescence images of optically anisotropic biological layers. On this basis, the method of laser autofluorescence polarimetry is justified analytically and tested experimentally. The effectiveness of this method in the postmortem diagnosis of infarction is analyzed. Objective criteria (statistical moments) for differentiating autofluorescence images of histological sections of the myocardium were defined. The operational characteristics (sensitivity, specificity, accuracy) of this technique were determined.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mayer, B. P.; Mew, D. A.; DeHope, A.
Attribution of the origin of an illicit drug relies on identification of compounds indicative of its clandestine production and is a key component of many modern forensic investigations. The results of these studies can yield detailed information on method of manufacture, starting material source, and final product - all critical forensic evidence. In the present work, chemical attribution signatures (CAS) associated with the synthesis of the analgesic fentanyl, N-(1-phenylethylpiperidin-4-yl)-N-phenylpropanamide, were investigated. Six synthesis methods, all previously published fentanyl synthetic routes or hybrid versions thereof, were studied in an effort to identify and classify route-specific signatures. 160 distinct compounds and inorganic species were identified using gas and liquid chromatographies combined with mass spectrometric methods (GC-MS and LCMS/MS-TOF) in conjunction with inductively coupled plasma mass spectrometry (ICPMS). The complexity of the resultant data matrix urged the use of multivariate statistical analysis. Using partial least squares discriminant analysis (PLS-DA), 87 route-specific CAS were classified and a statistical model capable of predicting the method of fentanyl synthesis was validated and tested against CAS profiles from crude fentanyl products deposited and later extracted from two operationally relevant surfaces: stainless steel and vinyl tile. This work provides the most detailed fentanyl CAS investigation to date by using orthogonal mass spectral data to identify CAS of forensic significance for illicit drug detection, profiling, and attribution.
Vallée, Julie; Souris, Marc; Fournet, Florence; Bochaton, Audrey; Mobillion, Virginie; Peyronnie, Karine; Salem, Gérard
2007-06-01
Geographical objectives and probabilistic methods are difficult to reconcile in a unique health survey. Probabilistic methods focus on individuals to provide estimates of a variable's prevalence with a certain precision, while geographical approaches emphasise the selection of specific areas to study interactions between spatial characteristics and health outcomes. A sample selected from a small number of specific areas creates statistical challenges: the observations are not independent at the local level, and this results in poor statistical validity at the global level. Therefore, it is difficult to construct a sample that is appropriate for both geographical and probability methods. We used a two-stage selection procedure with a first non-random stage of selection of clusters. Instead of randomly selecting clusters, we deliberately chose a group of clusters, which as a whole would contain all the variation in health measures in the population. As there was no health information available before the survey, we selected a priori determinants that can influence the spatial homogeneity of the health characteristics. This method yields a distribution of variables in the sample that closely resembles that in the overall population, something that cannot be guaranteed with randomly-selected clusters, especially if the number of selected clusters is small. In this way, we were able to survey specific areas while minimising design effects and maximising statistical precision. We applied this strategy in a health survey carried out in Vientiane, Lao People's Democratic Republic. We selected well-known health determinants with unequal spatial distribution within the city: nationality and literacy. We deliberately selected a combination of clusters whose distribution of nationality and literacy is similar to the distribution in the general population. 
This paper describes the conceptual reasoning behind the construction of the survey sample and shows that it can be advantageous to choose clusters using reasoned hypotheses, based on both probability and geographical approaches, in contrast to a conventional, random cluster selection strategy.
The choice of statistical methods for comparisons of dosimetric data in radiotherapy.
Chaikh, Abdulhamid; Giraud, Jean-Yves; Perrin, Emmanuel; Bresciani, Jean-Pierre; Balosso, Jacques
2014-09-18
Novel irradiation techniques are continuously introduced in radiotherapy to optimize the accuracy, the security and the clinical outcome of treatments. These changes could raise the question of discontinuity in dosimetric presentation and the subsequent need for practice adjustments in case of significant modifications. This study proposes a comprehensive approach to compare different techniques and tests whether their respective dose calculation algorithms give rise to statistically significant differences in the treatment doses for the patient. Statistical investigation principles are presented in the framework of a clinical example based on 62 fields of radiotherapy for lung cancer. The delivered doses in monitor units were calculated using three different dose calculation methods: the reference method calculates the dose without tissue density corrections using the Pencil Beam Convolution (PBC) algorithm, whereas the new methods calculate the dose with 1D and 3D tissue density corrections using the Modified Batho (MB) method and the Equivalent Tissue Air Ratio (ETAR) method, respectively. The normality of the data and the homogeneity of variance between groups were tested using the Shapiro-Wilk and Levene tests, respectively; non-parametric statistical tests were then performed. Specifically, the dose means estimated by the different calculation methods were compared using Friedman's test and the Wilcoxon signed-rank test. In addition, the correlation between the doses calculated by the three methods was assessed using Spearman's rank and Kendall's rank tests. Friedman's test showed a significant effect of the calculation method on the delivered dose for lung cancer patients (p < 0.001). The density correction methods yielded lower doses than PBC, on average by (-5 ± 4.4 SD) for MB and (-4.7 ± 5 SD) for ETAR. 
Post-hoc Wilcoxon signed-rank test of paired comparisons indicated that the delivered dose was significantly reduced using density-corrected methods as compared to the reference method. Spearman's and Kendall's rank tests indicated a positive correlation between the doses calculated with the different methods. This paper illustrates and justifies the use of statistical tests and graphical representations for dosimetric comparisons in radiotherapy. The statistical analysis shows the significance of dose differences resulting from two or more techniques in radiotherapy.
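As an illustration of the omnibus comparison named in the abstract above, the Friedman statistic can be computed in a few lines. This is a minimal sketch on invented monitor-unit values, not the study's 62-field data set, and it omits the tie correction. The exact chi-square tail `exp(-x/2)` used for the p-value holds only for two degrees of freedom (three methods), and the chi-square approximation itself is rough for so few fields.

```python
import math


def rank_row(values):
    """Return 1-based ranks within one row, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """Friedman chi-square for n blocks (rows) and k treatments (columns)."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for column, rank in enumerate(rank_row(row)):
            rank_sums[column] += rank
    sum_of_squares = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_of_squares - 3.0 * n * (k + 1)


# Hypothetical monitor units per field for (PBC, MB, ETAR).
doses = [
    (100, 94, 96),
    (120, 115, 116),
    (90, 85, 87),
    (110, 104, 105),
    (105, 101, 102),
]

statistic = friedman_statistic(doses)
# Chi-square survival function with 2 degrees of freedom is exp(-x / 2).
p_value = math.exp(-statistic / 2)
print(f"Friedman chi-square = {statistic:.2f}, approximate p = {p_value:.4f}")
```

A significant omnibus result would then be followed by paired post-hoc comparisons such as the Wilcoxon signed-rank test, as the abstract describes.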
Kanda, Junya
2016-01-01
The Transplant Registry Unified Management Program (TRUMP) made it possible for members of the Japan Society for Hematopoietic Cell Transplantation (JSHCT) to analyze large sets of national registry data on autologous and allogeneic hematopoietic stem cell transplantation. However, as the processes used to collect transplantation information are complex and differed over time, the background of these processes should be understood when using TRUMP data. Previously, information on the HLA locus of patients and donors had been collected using a questionnaire-based free-description method, resulting in some input errors. To correct minor but significant errors and provide accurate HLA matching data, the use of a Stata or EZR/R script offered by the JSHCT is strongly recommended when analyzing HLA data in the TRUMP dataset. The HLA mismatch direction, mismatch counting method, and different impacts of HLA mismatches by stem cell source are other important factors in the analysis of HLA data. Additionally, researchers should understand the statistical analyses specific for hematopoietic stem cell transplantation, such as competing risk, landmark analysis, and time-dependent analysis, to correctly analyze transplant data. The data center of the JSHCT can be contacted if statistical assistance is required.
Conducting Simulation Studies in the R Programming Environment.
Hallgren, Kevin A
2013-10-12
Simulation studies allow researchers to answer specific questions about data analysis, statistical power, and best-practices for obtaining accurate results in empirical research. Despite the benefits that simulation research can provide, many researchers are unfamiliar with available tools for conducting their own simulation studies. The use of simulation studies need not be restricted to researchers with advanced skills in statistics and computer programming, and such methods can be implemented by researchers with a variety of abilities and interests. The present paper provides an introduction to methods used for running simulation studies using the R statistical programming environment and is written for individuals with minimal experience running simulation studies or using R. The paper describes the rationale and benefits of using simulations and introduces R functions relevant for many simulation studies. Three examples illustrate different applications for simulation studies, including (a) the use of simulations to answer a novel question about statistical analysis, (b) the use of simulations to estimate statistical power, and (c) the use of simulations to obtain confidence intervals of parameter estimates through bootstrapping. Results and fully annotated syntax from these examples are provided.
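The third application mentioned above, bootstrap confidence intervals, can be sketched as follows. The paper itself works in R. This Python version is only an illustrative analogue, and the function name, sample, seed, and replicate count are assumptions rather than anything taken from the paper.

```python
import random
import statistics


def bootstrap_ci_mean(data, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and record the mean of each replicate.
    means = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(n_boot)
    )
    lower = means[round((alpha / 2) * n_boot)]
    upper = means[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated sample: 50 draws from a normal distribution (mean 10, sd 2).
sample_rng = random.Random(42)
sample = [sample_rng.gauss(10.0, 2.0) for _ in range(50)]

ci_low, ci_high = bootstrap_ci_mean(sample)
sample_mean = statistics.fmean(sample)
print(f"mean = {sample_mean:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

Fixing the seed keeps the run reproducible. Wrapping the whole procedure in an outer loop over simulated samples would give the coverage or power estimates that simulation studies typically report.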
Direct statistical modeling and its implications for predictive mapping in mining exploration
NASA Astrophysics Data System (ADS)
Sterligov, Boris; Gumiaux, Charles; Barbanson, Luc; Chen, Yan; Cassard, Daniel; Cherkasov, Sergey; Zolotaya, Ludmila
2010-05-01
Recent advances in geosciences make more and more multidisciplinary data available for mining exploration. This has allowed the development of methodologies for computing forecast ore maps from the statistical combination of such different input parameters, all based on inverse problem theory. Numerous statistical methods (e.g. algebraic method, weight of evidence, Siris method, etc.), with varying degrees of complexity in their development and implementation, have been proposed and/or adapted for ore geology purposes. In the literature, such approaches are often presented through applications to natural examples, and the results obtained can present specificities due to local characteristics. Moreover, though crucial for statistical computations, the "minimum requirements" for input parameters (minimum number of data points, spatial distribution of objects, etc.) are often only poorly expressed. As a result, problems often arise when one has to choose between methods for a specific question. In this study, a direct statistical modeling approach is developed in order to i) evaluate the constraints on the input parameters and ii) test the validity of different existing inversion methods. The approach focuses particularly on the analysis of spatial relationships between the location of points and various objects (e.g. polygons and/or polylines), which is particularly well suited to constraining the influence of intrusive bodies - such as a granite - and faults or ductile shear zones on the spatial location of ore deposits (point objects). The method is designed to be dimensionless with respect to scale. In this approach, both the spatial distribution and the topology of objects (polygons and polylines) can be parametrized by the user (e.g. density of objects, length, surface, orientation, clustering). Then, the distance of points with respect to a given type of object (polygons or polylines) is given using a probability distribution. 
The location of points is computed assuming either independence or different degrees of dependency between the two probability distributions. The results show that i) the mean polygon surface area, the mean polyline length, the number of objects and their clustering are critical and ii) the validity of the different tested inversion methods strongly depends on the relative importance of, and the dependency between, the parameters used. In addition, this combined approach of direct and inverse modeling offers an opportunity to test the robustness of the inferred point distribution laws with respect to the quality of the input data set.
15 CFR 200.103 - Consulting and advisory services.
Code of Federal Regulations, 2013 CFR
2013-01-01
...., details of design and construction, operational aspects, unusual or extreme conditions, methods of statistical control of the measurement process, automated acquisition of laboratory data, and data reduction... group seminars on the precision measurement of specific types of physical quantities, offering the...
15 CFR 200.103 - Consulting and advisory services.
Code of Federal Regulations, 2011 CFR
2011-01-01
...., details of design and construction, operational aspects, unusual or extreme conditions, methods of statistical control of the measurement process, automated acquisition of laboratory data, and data reduction... group seminars on the precision measurement of specific types of physical quantities, offering the...
Lingner, Thomas; Kataya, Amr R. A.; Reumann, Sigrun
2012-01-01
We recently developed the first algorithms specifically for plants to predict proteins carrying peroxisome targeting signals type 1 (PTS1) from genome sequences. As validated experimentally, the prediction methods are able to correctly predict unknown peroxisomal Arabidopsis proteins and to infer novel PTS1 tripeptides. The high prediction performance is primarily determined by the large number and sequence diversity of the underlying positive example sequences, which mainly derived from EST databases. However, a few constructs remained cytosolic in experimental validation studies, indicating sequencing errors in some ESTs. To identify erroneous sequences, we validated subcellular targeting of additional positive example sequences in the present study. Moreover, we analyzed the distribution of prediction scores separately for each orthologous group of PTS1 proteins, which generally resembled normal distributions with group-specific mean values. The cytosolic sequences commonly represented outliers of low prediction scores and were located at the very tail of a fitted normal distribution. Three statistical methods for identifying outliers were compared in terms of sensitivity and specificity. Their combined application allows elimination of erroneous ESTs from positive example data sets. This new post-validation method will further improve the prediction accuracy of both PTS1 and PTS2 protein prediction models for plants, fungi, and mammals. PMID:22415050
Lingner, Thomas; Kataya, Amr R A; Reumann, Sigrun
2012-02-01
We recently developed the first algorithms specifically for plants to predict proteins carrying peroxisome targeting signals type 1 (PTS1) from genome sequences. As validated experimentally, the prediction methods are able to correctly predict unknown peroxisomal Arabidopsis proteins and to infer novel PTS1 tripeptides. The high prediction performance is primarily determined by the large number and sequence diversity of the underlying positive example sequences, which mainly derived from EST databases. However, a few constructs remained cytosolic in experimental validation studies, indicating sequencing errors in some ESTs. To identify erroneous sequences, we validated subcellular targeting of additional positive example sequences in the present study. Moreover, we analyzed the distribution of prediction scores separately for each orthologous group of PTS1 proteins, which generally resembled normal distributions with group-specific mean values. The cytosolic sequences commonly represented outliers of low prediction scores and were located at the very tail of a fitted normal distribution. Three statistical methods for identifying outliers were compared in terms of sensitivity and specificity. Their combined application allows elimination of erroneous ESTs from positive example data sets. This new post-validation method will further improve the prediction accuracy of both PTS1 and PTS2 protein prediction models for plants, fungi, and mammals.
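The abstract does not name the three outlier methods it compares. The sketch below therefore shows only one generic possibility: flagging scores that fall in the lower tail of a normal distribution fitted to one orthologous group. The scores, the `tail_prob` cut-off, and the function name are hypothetical.

```python
from statistics import NormalDist, fmean, stdev


def low_score_outliers(scores, tail_prob=0.01):
    """Return scores lying in the lower tail of a normal fitted to `scores`.

    The mean and standard deviation are estimated from the same data, so a
    very extreme value inflates the spread and can partly mask itself.
    """
    fitted = NormalDist(fmean(scores), stdev(scores))
    return [s for s in scores if fitted.cdf(s) < tail_prob]


# Hypothetical prediction scores for one orthologous group of PTS1 proteins.
group_scores = [0.82, 0.85, 0.88, 0.90, 0.91, 0.93, 0.95, 0.97, 0.20]

suspects = low_score_outliers(group_scores)
print("candidate erroneous sequences (by score):", suspects)
```

Fitting each group separately matters here because the abstract reports group-specific mean values. A single global threshold would treat low-scoring but genuine groups as outliers.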
NASA Astrophysics Data System (ADS)
Lotfy, Hayam Mahmoud; Hegazy, Maha Abdel Monem
2013-09-01
Four simple, specific, accurate and precise spectrophotometric methods manipulating ratio spectra were developed and validated for simultaneous determination of simvastatin (SM) and ezetimibe (EZ) namely; extended ratio subtraction (EXRSM), simultaneous ratio subtraction (SRSM), ratio difference (RDSM) and absorption factor (AFM). The proposed spectrophotometric procedures do not require any preliminary separation step. The accuracy, precision and linearity ranges of the proposed methods were determined, and the methods were validated and the specificity was assessed by analyzing synthetic mixtures containing the cited drugs. The four methods were applied for the determination of the cited drugs in tablets and the obtained results were statistically compared with each other and with those of a reported HPLC method. The comparison showed that there is no significant difference between the proposed methods and the reported method regarding both accuracy and precision.
Narayanan, Roshni; Nugent, Rebecca; Nugent, Kenneth
2015-10-01
Accreditation Council for Graduate Medical Education guidelines require internal medicine residents to develop skills in the interpretation of medical literature and to understand the principles of research. A necessary component is the ability to understand the statistical methods used and their results, material that is not an in-depth focus of most medical school curricula and residency programs. Given the breadth and depth of the current medical literature and an increasing emphasis on complex, sophisticated statistical analyses, the statistical foundation and education necessary for residents are uncertain. We reviewed the statistical methods and terms used in 49 articles discussed at the journal club in the Department of Internal Medicine residency program at Texas Tech University between January 1, 2013 and June 30, 2013. We collected information on the study type and on the statistical methods used for summarizing and comparing samples, determining the relations between independent variables and dependent variables, and estimating models. We then identified the typical statistics education level at which each term or method is learned. A total of 14 articles came from the Journal of the American Medical Association Internal Medicine, 11 from the New England Journal of Medicine, 6 from the Annals of Internal Medicine, 5 from the Journal of the American Medical Association, and 13 from other journals. Twenty reported randomized controlled trials. Summary statistics included mean values (39 articles), category counts (38), and medians (28). Group comparisons were based on t tests (14 articles), χ2 tests (21), and nonparametric ranking tests (10). The relations between dependent and independent variables were analyzed with simple regression (6 articles), multivariate regression (11), and logistic regression (8). Nine studies reported odds ratios with 95% confidence intervals, and seven analyzed test performance using sensitivity and specificity calculations. 
These papers used 128 statistical terms and context-defined concepts, including some from data analysis (56), epidemiology-biostatistics (31), modeling (24), data collection (12), and meta-analysis (5). Ten different software programs were used in these articles. Based on usual undergraduate and graduate statistics curricula, 64.3% of the concepts and methods used in these papers required at least a master's degree-level statistics education. The interpretation of the current medical literature can require an extensive background in statistical methods at an education level exceeding the material and resources provided to most medical students and residents. Given the complexity and time pressure of medical education, these deficiencies will be hard to correct, but this project can serve as a basis for developing a curriculum in study design and statistical methods needed by physicians-in-training.
NASA Astrophysics Data System (ADS)
Ortega-Martinez, Antonio; Padilla-Martinez, Juan Pablo; Franco, Walfre
2016-04-01
The skin contains several fluorescent molecules or fluorophores that serve as markers of structure, function and composition. UV fluorescence excitation photography is a simple and effective way to image specific intrinsic fluorophores, such as the one ascribed to tryptophan, which emits at a wavelength of 345 nm upon excitation at 295 nm and is a marker of cellular proliferation. Earlier, we built a clinical UV photography system to image cellular proliferation. In some samples, the naturally low intensity of the fluorescence can make it difficult to separate the fluorescence of cells in higher proliferation states from background fluorescence and other imaging artifacts, such as electronic noise. In this work, we describe a statistical image segmentation method to separate the fluorescence of interest. Statistical image segmentation is based on image averaging, background subtraction and pixel statistics. This method allows better quantification of the intensity and surface distributions of fluorescence, which in turn simplifies the detection of borders. Using this method we delineated the borders of highly proliferative skin conditions and diseases, in particular allergic contact dermatitis, psoriatic lesions and basal cell carcinoma. Segmented images clearly define lesion borders. UV fluorescence excitation photography along with statistical image segmentation may serve as a quick and simple diagnostic tool for clinicians.
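The three ingredients listed in the abstract (image averaging, background subtraction, pixel statistics) can be strung together in a toy pipeline. The abstract gives no algorithmic detail, so the threshold rule used here (mean plus k standard deviations of the corrected image) is an assumption, as are the function name and the one-dimensional "images".

```python
from statistics import fmean, pstdev


def segment(frames, background, k=1.0):
    """Return a boolean mask of pixels brighter than mean + k * sd.

    frames: repeated exposures of the same scene, each a flat list of pixels.
    background: a flat list of background intensities, one per pixel.
    """
    n_pixels = len(background)
    # 1. Average the repeated frames pixel by pixel to suppress noise.
    averaged = [fmean(frame[i] for frame in frames) for i in range(n_pixels)]
    # 2. Subtract the background estimate.
    corrected = [a - b for a, b in zip(averaged, background)]
    # 3. Threshold using the statistics of the corrected pixels.
    threshold = fmean(corrected) + k * pstdev(corrected)
    return [value > threshold for value in corrected]


# Two noisy exposures of a 10-pixel strip with a bright region at pixels 4-5.
frames = [
    [10, 11, 9, 10, 52, 48, 10, 9, 11, 10],
    [10, 9, 11, 10, 48, 52, 10, 11, 9, 10],
]
background = [10] * 10

mask = segment(frames, background)
print(mask)
```

In a real image the border of a lesion would correspond to the transitions between `False` and `True` in the mask. The choice of `k` trades missed fluorescence against false detections.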
La lexicometrie allemande: 1898-1970 (German Lexicometry from 1898 to 1970).
ERIC Educational Resources Information Center
Njock, Pierre Emmanuel
The role of lexicometry is to furnish statistical data on all measurable aspects of vocabulary. This study presents an inventory of works on the choice of elements of the German language and outlines the methods of compiling vocabulary with specific reference to the method of choosing words useful for the teaching of a language. It also attempts…
Signatures of criticality arise from random subsampling in simple population models.
Nonnenmacher, Marcel; Behrens, Christian; Berens, Philipp; Bethge, Matthias; Macke, Jakob H
2017-10-01
The rise of large-scale recordings of neuronal activity has fueled the hope to gain new insights into the collective activity of neural ensembles. How can one link the statistics of neural population activity to underlying principles and theories? One attempt to interpret such data builds upon analogies to the behaviour of collective systems in statistical physics. Divergence of the specific heat, a measure of population statistics derived from thermodynamics, has been used to suggest that neural populations are optimized to operate at a "critical point". However, these findings have been challenged by theoretical studies which have shown that common inputs can lead to diverging specific heat. Here, we connect "signatures of criticality", and in particular the divergence of specific heat, back to statistics of neural population activity commonly studied in neural coding: firing rates and pairwise correlations. We show that the specific heat diverges whenever the average correlation strength does not depend on population size. This is necessarily true when data with correlations is randomly subsampled during the analysis process, irrespective of the detailed structure or origin of correlations. We also show how the characteristic shape of specific heat capacity curves depends on firing rates and correlations, using both analytically tractable models and numerical simulations of a canonical feed-forward population model. To analyze these simulations, we develop efficient methods for characterizing large-scale neural population activity with maximum entropy models. We find that, consistent with experimental findings, increases in firing rates and correlation directly lead to more pronounced signatures. Thus, previous reports of thermodynamical criticality in neural populations based on the analysis of specific heat can be explained by average firing rates and correlations, and are not indicative of an optimized coding strategy. 
We conclude that a reliable interpretation of statistical tests for theories of neural coding is possible only in reference to relevant ground-truth models.
Ganger, Michael T; Dietz, Geoffrey D; Ewing, Sarah J
2017-12-01
qPCR has established itself as the technique of choice for the quantification of gene expression. Procedures for conducting qPCR have received significant attention; however, more rigorous approaches to the statistical analysis of qPCR data are needed. Here we develop a mathematical model, termed the Common Base Method, for analysis of qPCR data based on threshold cycle values (Cq) and efficiencies of reactions (E). The Common Base Method keeps all calculations in the log scale as long as possible by working with log10(E) ∙ Cq, which we call the efficiency-weighted Cq value; subsequent statistical analyses are then applied in the log scale. We show how efficiency-weighted Cq values may be analyzed using a simple paired or unpaired experimental design and develop blocking methods to help reduce unexplained variation. The Common Base Method has several advantages. It allows for the incorporation of well-specific efficiencies and multiple reference genes. The method does not necessitate the pairing of samples that must be performed using traditional analysis methods in order to calculate relative expression ratios. Our method is also simple enough to be implemented in any spreadsheet or statistical software without additional scripts or proprietary components.
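The efficiency-weighted Cq value defined above, log10(E) ∙ Cq, is simple to compute. The sketch below shows one plausible way to use it for a single target gene, a single reference gene, and one control and one treated sample. The abstract does not give the paper's full formulas, so the normalization step, the function names, and the numbers are assumptions. E is taken to be the per-cycle amplification factor, with 2 meaning perfect doubling.

```python
import math


def efficiency_weighted_cq(efficiency, cq):
    """Return log10(E) * Cq, the efficiency-weighted Cq value."""
    return math.log10(efficiency) * cq


def log10_normalized_expression(sample):
    """Reference minus target, so larger values mean more target transcript."""
    reference = efficiency_weighted_cq(sample["e_ref"], sample["cq_ref"])
    target = efficiency_weighted_cq(sample["e_target"], sample["cq_target"])
    return reference - target


# Hypothetical wells: the target crosses threshold 2 cycles earlier after
# treatment, while the reference gene is unchanged.
control = {"e_target": 2.0, "cq_target": 25.0, "e_ref": 2.0, "cq_ref": 20.0}
treated = {"e_target": 2.0, "cq_target": 23.0, "e_ref": 2.0, "cq_ref": 20.0}

# Stay in the log scale for the comparison; convert back only for reporting.
log10_fold_change = (
    log10_normalized_expression(treated) - log10_normalized_expression(control)
)
fold_change = 10 ** log10_fold_change
print(f"log10 fold change = {log10_fold_change:.3f}, fold change = {fold_change:.2f}")
```

With replicate samples, the per-sample log-scale values would be the inputs to a t test or similar analysis, and only the final estimate and its interval would be back-transformed.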
Sampling methods to the statistical control of the production of blood components.
Pereira, Paulo; Seghatchian, Jerard; Caldeira, Beatriz; Santos, Paula; Castro, Rosa; Fernandes, Teresa; Xavier, Sandra; de Sousa, Gracinda; de Almeida E Sousa, João Paulo
2017-12-01
The control of blood component specifications is a requirement generalized in Europe by the European Commission Directives and in the US by the AABB standards. The use of a statistical process control methodology is recommended in the related literature, including the EDQM guideline. The reliability of the control depends on the sampling. However, a correct sampling methodology does not seem to be systematically applied. Commonly, the sampling is intended solely to comply with the 1% specification for the produced blood components. Nevertheless, from a purely statistical viewpoint, it could be argued that this model is not related to a consistent sampling technique. This could be a severe limitation in detecting abnormal patterns and in assuring that the production has a non-significant probability of producing nonconforming components. This article discusses what is happening in blood establishments. Three statistical methodologies are proposed: simple random sampling, sampling based on the proportion of a finite population, and sampling based on the inspection level. The empirical results demonstrate that these models are practicable in blood establishments, contributing to the robustness of sampling and related statistical process control decisions for the purpose they are suggested for. Copyright © 2017 Elsevier Ltd. All rights reserved.
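For the second of the three methodologies, sampling based on the proportion of a finite population, a common textbook calculation is the sample size for estimating a proportion with a finite population correction. Whether the article uses exactly this formula is not stated in the abstract. The function name and all input values below are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(population, p, margin, confidence=0.95):
    """Sample size to estimate a proportion p within +/- margin.

    Uses n0 = z^2 * p * (1 - p) / margin^2 for an infinite population, then
    the finite population correction n = n0 / (1 + (n0 - 1) / population).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z * z * p * (1 - p) / (margin * margin)
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


# Hypothetical month: 1000 components produced, expected nonconforming
# proportion 1%, tolerated margin of error 1 percentage point.
n_required = sample_size_for_proportion(population=1000, p=0.01, margin=0.01)
print("components to sample:", n_required)
```

The correction matters most when the production lot is small relative to the uncorrected sample size. For very large lots the two values converge.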
BetaTPred: prediction of beta-TURNS in a protein using statistical algorithms.
Kaur, Harpreet; Raghava, G P S
2002-03-01
Beta-turns play an important role from a structural and functional point of view. Beta-turns are the most common type of non-repetitive structures in proteins and comprise, on average, 25% of the residues. In the past, numerous methods have been developed to predict beta-turns in a protein. Most of these prediction methods are based on statistical approaches. In order to utilize the full potential of these methods, there is a need to develop a web server. This paper describes a web server called BetaTPred, developed for predicting beta-turns in a protein from its amino acid sequence. BetaTPred allows the user to predict turns in a protein using existing statistical algorithms. It also allows the prediction of different types of beta-turns, e.g. type I, I', II, II', VI, VIII and non-specific. This server assists the users in predicting the consensus beta-turns in a protein. The server is accessible from http://imtech.res.in/raghava/betatpred/
Statistical Model Selection for TID Hardness Assurance
NASA Technical Reports Server (NTRS)
Ladbury, R.; Gorelick, J. L.; McClure, S.
2010-01-01
Radiation Hardness Assurance (RHA) methodologies against Total Ionizing Dose (TID) degradation impose rigorous statistical treatments for data from a part's Radiation Lot Acceptance Test (RLAT) and/or its historical performance. However, no similar methods exist for using "similarity" data - that is, data for similar parts fabricated in the same process as the part under qualification. This is despite the greater difficulty and potential risk in interpreting similarity data. In this work, we develop methods to disentangle part-to-part, lot-to-lot and part-type-to-part-type variation. The methods we develop apply not just to qualification decisions, but also to quality control and detection of process changes and other "out-of-family" behavior. We begin by discussing the data used in the study and the challenges of developing a statistic providing a meaningful measure of degradation across multiple part types, each with its own performance specifications. We then develop analysis techniques and apply them to the different data sets.
NASA Technical Reports Server (NTRS)
Sprowls, D. O.; Bucci, R. J.; Ponchel, B. M.; Brazill, R. L.; Bretz, P. E.
1984-01-01
A technique is demonstrated for accelerated stress corrosion testing of high strength aluminum alloys. The method offers better precision and shorter exposure times than traditional pass-fail procedures. The approach uses data from tension tests performed on replicate groups of smooth specimens after various lengths of exposure to static stress. The breaking strength measures degradation in the test specimen's load-carrying ability due to the environmental attack. Analysis of breaking load data by extreme value statistics enables the calculation of survival probabilities and a statistically defined threshold stress applicable to the specific test conditions. A fracture mechanics model is given which quantifies depth of attack in the stress corroded specimen by an effective flaw size calculated from the breaking stress and the material strength and fracture toughness properties. Comparisons are made with experimental results from three tempers of 7075 alloy plate tested by the breaking load method and by traditional tests of statically loaded smooth tension bars and conventional precracked specimens.
A Comparison of Methods for Estimating the Determinant of High-Dimensional Covariance Matrix.
Hu, Zongliang; Dong, Kai; Dai, Wenlin; Tong, Tiejun
2017-09-21
The determinant of the covariance matrix for high-dimensional data plays an important role in statistical inference and decision. It has many real applications including statistical tests and information theory. Due to the statistical and computational challenges with high dimensionality, little work has been proposed in the literature for estimating the determinant of a high-dimensional covariance matrix. In this paper, we estimate the determinant of the covariance matrix using some recent proposals for estimating a high-dimensional covariance matrix. Specifically, we consider a total of eight covariance matrix estimation methods for comparison. Through extensive simulation studies, we explore and summarize some interesting comparison results among all compared methods. We also provide practical guidelines based on the sample size, the dimension, and the correlation of the data set for estimating the determinant of a high-dimensional covariance matrix. Finally, from the perspective of the loss function, the comparison study in this paper may also serve as a proxy to assess the performance of the covariance matrix estimation.
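The difficulty described above can be illustrated with a short sketch. When the dimension exceeds the sample size, the sample covariance matrix is rank deficient, so its determinant is zero and its log-determinant is unusable; a regularized estimator restores a finite value. The abstract does not name the eight estimators compared, so the linear shrinkage toward a scaled identity used here, the function names, and the fixed weight `alpha` are assumptions chosen only for illustration.

```python
import numpy as np


def logdet_sample_cov(X):
    """Sign and log-determinant of the sample covariance of X (rows = samples)."""
    S = np.cov(X, rowvar=False)
    return np.linalg.slogdet(S)


def logdet_shrunk_cov(X, alpha=0.1):
    """Sign and log-determinant after linear shrinkage toward a scaled identity.

    alpha is a hypothetical fixed shrinkage weight in (0, 1]; data-driven
    choices exist but are outside the scope of this sketch.
    """
    S = np.cov(X, rowvar=False)
    p = S.shape[0]
    # Target keeps the average variance but removes all correlations.
    target = (np.trace(S) / p) * np.eye(p)
    S_shrunk = (1 - alpha) * S + alpha * target
    return np.linalg.slogdet(S_shrunk)


def demo(n=20, p=50, seed=0):
    """Simulate n samples in p > n dimensions and compare both estimates."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    rank = np.linalg.matrix_rank(np.cov(X, rowvar=False))
    return rank, logdet_sample_cov(X), logdet_shrunk_cov(X)
```

Working with `slogdet` rather than `det` avoids overflow and underflow, which matters because determinants of large matrices quickly leave the floating-point range.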
Erus, Guray; Zacharaki, Evangelia I; Davatzikos, Christos
2014-04-01
This paper presents a method for capturing statistical variation of normal imaging phenotypes, with emphasis on brain structure. The method aims to estimate the statistical variation of a normative set of images from healthy individuals, and identify abnormalities as deviations from normality. A direct estimation of the statistical variation of the entire volumetric image is challenged by the high-dimensionality of images relative to smaller sample sizes. To overcome this limitation, we iteratively sample a large number of lower dimensional subspaces that capture image characteristics ranging from fine and localized to coarser and more global. Within each subspace, a "target-specific" feature selection strategy is applied to further reduce the dimensionality, by considering only imaging characteristics present in a test subject's images. Marginal probability density functions of selected features are estimated through PCA models, in conjunction with an "estimability" criterion that limits the dimensionality of estimated probability densities according to available sample size and underlying anatomy variation. A test sample is iteratively projected to the subspaces of these marginals as determined by PCA models, and its trajectory delineates potential abnormalities. The method is applied to segmentation of various brain lesion types, and to simulated data on which superiority of the iterative method over straight PCA is demonstrated. Copyright © 2014 Elsevier B.V. All rights reserved.
2011-01-01
Background This study aims to identify the statistical software applications most commonly employed for data analysis in health services research (HSR) studies in the U.S. The study also examines the extent to which information describing the specific analytical software utilized is provided in published articles reporting on HSR studies. Methods Data were extracted from a sample of 1,139 articles (including 877 original research articles) published between 2007 and 2009 in three U.S. HSR journals, that were considered to be representative of the field based upon a set of selection criteria. Descriptive analyses were conducted to categorize patterns in statistical software usage in those articles. The data were stratified by calendar year to detect trends in software use over time. Results Only 61.0% of original research articles in prominent U.S. HSR journals identified the particular type of statistical software application used for data analysis. Stata and SAS were overwhelmingly the most commonly used software applications employed (in 46.0% and 42.6% of articles respectively). However, SAS use grew considerably during the study period compared to other applications. Stratification of the data revealed that the type of statistical software used varied considerably by whether authors were from the U.S. or from other countries. Conclusions The findings highlight a need for HSR investigators to identify more consistently the specific analytical software used in their studies. Knowing that information can be important, because different software packages might produce varying results, owing to differences in the software's underlying estimation methods. PMID:21977990
Scientific computations section monthly report, November 1993
DOE Office of Scientific and Technical Information (OSTI.GOV)
Buckner, M.R.
1993-12-30
This progress report from the Savannah River Technology Center contains abstracts from papers from the computational modeling, applied statistics, applied physics, experimental thermal hydraulics, and packaging and transportation groups. Specific topics covered include: engineering modeling and process simulation, criticality methods and analysis, plutonium disposition.
Zhou, Xiangrong; Xu, Rui; Hara, Takeshi; Hirano, Yasushi; Yokoyama, Ryujiro; Kanematsu, Masayuki; Hoshi, Hiroaki; Kido, Shoji; Fujita, Hiroshi
2014-07-01
The shapes of the inner organs are important information for medical image analysis. Statistical shape modeling provides a way of quantifying and measuring shape variations of the inner organs in different patients. In this study, we developed a universal scheme that can be used for building the statistical shape models for different inner organs efficiently. This scheme combines the traditional point distribution modeling with a group-wise optimization method based on a measure called minimum description length to provide a practical means for 3D organ shape modeling. In experiments, the proposed scheme was applied to the building of five statistical shape models for hearts, livers, spleens, and right and left kidneys by use of 50 cases of 3D torso CT images. The performance of these models was evaluated by three measures: model compactness, model generalization, and model specificity. The experimental results showed that the constructed shape models have good "compactness" and satisfied the "generalization" performance for different organ shape representations; however, the "specificity" of these models should be improved in the future.
Moore, Jason H; Amos, Ryan; Kiralis, Jeff; Andrews, Peter C
2015-01-01
Simulation plays an essential role in the development of new computational and statistical methods for the genetic analysis of complex traits. Most simulations start with a statistical model using methods such as linear or logistic regression that specify the relationship between genotype and phenotype. This is appealing due to its simplicity and because these statistical methods are commonly used in genetic analysis. It is our working hypothesis that simulations need to move beyond simple statistical models to more realistically represent the biological complexity of genetic architecture. The goal of the present study was to develop a prototype genotype–phenotype simulation method and software that are capable of simulating complex genetic effects within the context of a hierarchical biology-based framework. Specifically, our goal is to simulate multilocus epistasis or gene–gene interaction where the genetic variants are organized within the framework of one or more genes, their regulatory regions and other regulatory loci. We introduce here the Heuristic Identification of Biological Architectures for simulating Complex Hierarchical Interactions (HIBACHI) method and prototype software for simulating data in this manner. This approach combines a biological hierarchy, a flexible mathematical framework, a liability threshold model for defining disease endpoints, and a heuristic search strategy for identifying high-order epistatic models of disease susceptibility. We provide several simulation examples using genetic models exhibiting independent main effects and three-way epistatic effects. PMID:25395175
2011-01-01
Background Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but presently has limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning, such as Neural Networks, Support Vector Machines and Random Forests, can improve the accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and area under the ROC (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73), with high area under the ROC (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64). The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most of them sensitivity was around or even lower than a median value of 0.5. Conclusions When taking into account sensitivity, specificity and overall classification accuracy, Random Forests and Linear Discriminant Analysis rank first among all the classifiers tested in the prediction of dementia using several neuropsychological tests. These methods may be used to improve the accuracy, sensitivity and specificity of dementia predictions from neuropsychological testing. PMID:21849043
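The pattern reported above, where a classifier reaches good accuracy and perfect specificity but poor sensitivity, is easy to reproduce from a confusion matrix. The sketch below computes the three measures from label lists; the function name `classification_metrics` and the toy labels in the usage note are hypothetical and are not the study's data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of true positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of true negatives that were correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }
```

With ten converters and ten non-converters, a classifier that flags only three converters and no non-converters (`y_true = [1]*10 + [0]*10`, `y_pred = [1]*3 + [0]*17`) gets accuracy 0.65, specificity 1.0 and sensitivity 0.3, which is why the abstract weighs all three measures rather than accuracy alone.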
Optical diagnosis of cervical cancer by higher order spectra and boosting
NASA Astrophysics Data System (ADS)
Pratiher, Sawon; Mukhopadhyay, Sabyasachi; Barman, Ritwik; Pratiher, Souvik; Pradhan, Asima; Ghosh, Nirmalya; Panigrahi, Prasanta K.
2017-03-01
In this contribution, we report the application of higher order statistical moments, using decision tree and ensemble based learning methodology, to the development of diagnostic algorithms for optical diagnosis of cancer. The classification results were compared to those obtained with independent feature extractors such as linear discriminant analysis (LDA). The methodology, which uses higher order statistics with a boosting classifier, achieves higher specificity and sensitivity while being much faster than other time-frequency domain based methods.
Statistical science: a grammar for research.
Cox, David R
2017-06-01
I greatly appreciate the invitation to give this lecture with its century long history. The title is a warning that the lecture is rather discursive and not highly focused and technical. The theme is simple: statistical thinking provides a unifying set of general ideas and specific methods relevant whenever appreciable natural variation is present. To be most fruitful these ideas should merge seamlessly with subject-matter considerations. By contrast, there is sometimes a temptation to regard formal statistical analysis as a ritual to be added after the serious work has been done, a ritual to satisfy convention, referees, and regulatory agencies. I want implicitly to refute that idea.
Georges, Patrick
2017-01-01
This paper proposes a statistical analysis that captures similarities and differences between classical music composers with the eventual aim to understand why particular composers 'sound' different even if their 'lineages' (influences network) are similar or why they 'sound' alike if their 'lineages' are different. In order to do this we use statistical methods and measures of association or similarity (based on presence/absence of traits such as specific 'ecological' characteristics and personal musical influences) that have been developed in biosystematics, scientometrics, and bibliographic coupling. This paper also represents a first step towards a more ambitious goal of developing an evolutionary model of Western classical music.
NASA Astrophysics Data System (ADS)
Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying
2018-06-01
In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.
The Mantel-Haenszel Procedure Revisited: Models and Generalizations
Fidler, Vaclav; Nagelkerke, Nico
2013-01-01
Several statistical methods have been developed for adjusting the Odds Ratio of the relation between two dichotomous variables X and Y for some confounders Z. With the exception of the Mantel-Haenszel method, commonly used methods, notably binary logistic regression, are not symmetrical in X and Y. The classical Mantel-Haenszel method however only works for confounders with a limited number of discrete strata, which limits its utility, and appears to have no basis in statistical models. Here we revisit the Mantel-Haenszel method and propose an extension to continuous and vector valued Z. The idea is to replace the observed cell entries in strata of the Mantel-Haenszel procedure by subject specific classification probabilities for the four possible values of (X,Y) predicted by a suitable statistical model. For situations where X and Y can be treated symmetrically we propose and explore the multinomial logistic model. Under the homogeneity hypothesis, which states that the odds ratio does not depend on Z, the logarithm of the odds ratio estimator can be expressed as a simple linear combination of three parameters of this model. Methods for testing the homogeneity hypothesis are proposed. The relationship between this method and binary logistic regression is explored. A numerical example using survey data is presented. PMID:23516463
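For reference, the classical Mantel-Haenszel estimator that the paper sets out to generalize can be sketched as follows. This is the standard stratified estimator, not the authors' model-based extension; the function name and the counts in the usage note are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Classical Mantel-Haenszel pooled odds ratio.

    strata: iterable of (a, b, c, d) counts, one 2x2 table per level of Z, with
        a = count(X=1, Y=1), b = count(X=1, Y=0),
        c = count(X=0, Y=1), d = count(X=0, Y=0).
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: no discordant (b, c) counts")
    return numerator / denominator
```

For two hypothetical strata `[(10, 20, 5, 40), (30, 10, 15, 20)]`, each with a stratum-specific odds ratio of 4, the pooled estimate is also 4. Swapping the roles of X and Y exchanges `b` and `c` in every table and leaves the estimate unchanged, which is the symmetry property the abstract highlights.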
Texture analysis with statistical methods for wheat ear extraction
NASA Astrophysics Data System (ADS)
Bakhouche, M.; Cointault, F.; Gouton, P.
2007-01-01
In the agronomic domain, the simplification of crop counting, necessary for yield prediction and agronomic studies, is an important project for technical institutes such as Arvalis. Although the main objective of our overall project is to design a mobile robot for natural image acquisition directly in the field, Arvalis first asked us to detect wheat ears in images by image processing and then count them, which yields the first component of the yield. In this paper we compare different texture image segmentation techniques based on feature extraction by first and higher order statistical methods, which have been applied to our images. The extracted features are used for unsupervised pixel classification to obtain the different classes in the image. To this end, the K-means algorithm is applied before a threshold is chosen to highlight the ears. Three methods have been tested in this feasibility study, with an average error of 6%. Although the quality of the detection is currently evaluated visually, automatic evaluation algorithms are being implemented. Moreover, other higher order statistical methods will be implemented in the future, jointly with methods based on spatio-frequential transforms and specific filtering.
Nakae, Ken; Ikegaya, Yuji; Ishikawa, Tomoe; Oba, Shigeyuki; Urakubo, Hidetoshi; Koyama, Masanori; Ishii, Shin
2014-01-01
Crosstalk between neurons and glia may constitute a significant part of information processing in the brain. We present a novel method of statistically identifying interactions in a neuron–glia network. We attempted to identify neuron–glia interactions from neuronal and glial activities via maximum-a-posteriori (MAP)-based parameter estimation by developing a generalized linear model (GLM) of a neuron–glia network. The interactions of interest included functional connectivity and response functions. We evaluated the cross-validated likelihood of GLMs that resulted from the addition or removal of connections to confirm the existence of specific neuron-to-glia or glia-to-neuron connections. We only accepted an addition or removal when the modification improved the cross-validated likelihood. We applied the method to a high-throughput, multicellular in vitro Ca2+ imaging dataset obtained from the CA3 region of a rat hippocampus, and then evaluated the reliability of connectivity estimates using a statistical test based on a surrogate method. Our findings based on the estimated connectivity were in good agreement with currently available physiological knowledge, suggesting our method can elucidate undiscovered functions of neuron–glia systems. PMID:25393874
Flood type specific construction of synthetic design hydrographs
NASA Astrophysics Data System (ADS)
Brunner, Manuela I.; Viviroli, Daniel; Sikorska, Anna E.; Vannier, Olivier; Favre, Anne-Catherine; Seibert, Jan
2017-02-01
Accurate estimates of flood peaks, corresponding volumes, and hydrographs are required to design safe and cost-effective hydraulic structures. In this paper, we propose a statistical approach for the estimation of the design variables peak and volume by constructing synthetic design hydrographs for different flood types such as flash-floods, short-rain floods, long-rain floods, and rain-on-snow floods. Our approach relies on the fitting of probability density functions to observed flood hydrographs of a certain flood type and accounts for the dependence between peak discharge and flood volume. It makes use of the statistical information contained in the data and retains the process information of the flood type. The method was tested based on data from 39 mesoscale catchments in Switzerland and provides catchment specific and flood type specific synthetic design hydrographs for all of these catchments. We demonstrate that flood type specific synthetic design hydrographs are meaningful in flood-risk management when combined with knowledge on the seasonality and the frequency of different flood types.
An Automated Method for Landmark Identification and Finite-Element Modeling of the Lumbar Spine.
Campbell, Julius Quinn; Petrella, Anthony J
2015-11-01
The purpose of this study was to develop a method for the automated creation of finite-element models of the lumbar spine. Custom scripts were written to extract bone landmarks of lumbar vertebrae and assemble L1-L5 finite-element models. End-plate borders, ligament attachment points, and facet surfaces were identified. Landmarks were identified to maintain mesh correspondence between meshes for later use in statistical shape modeling. 90 lumbar vertebrae were processed creating 18 subject-specific finite-element models. Finite-element model surfaces and ligament attachment points were reproduced within 1e-5 mm of the bone surface, including the critical contact surfaces of the facets. Element quality exceeded specifications in 97% of elements for the 18 models created. The current method is capable of producing subject-specific finite-element models of the lumbar spine with good accuracy, quality, and robustness. The automated methods developed represent advancement in the state of the art of subject-specific lumbar spine modeling to a scale not possible with prior manual and semiautomated methods.
Garbade, Sven F; Greenberg, Cheryl R; Demirkol, Mübeccel; Gökçay, Gülden; Ribes, Antonia; Campistol, Jaume; Burlina, Alberto B; Burgard, Peter; Kölker, Stefan
2014-09-01
Glutaric aciduria type I (GA-I) is a cerebral organic aciduria caused by inherited deficiency of glutaryl-CoA dehydrogenase and is characterized biochemically by an accumulation of putatively neurotoxic dicarboxylic metabolites. The majority of untreated patients develop a complex movement disorder with predominant dystonia during age 3-36 months. Magnetic resonance imaging (MRI) studies have demonstrated striatal and extrastriatal abnormalities. The major aim of this study was to elucidate the complex neuroradiological pattern of patients with GA-I and to associate the MRI findings with the severity of predominant neurological symptoms. In 180 patients, detailed information about the neurological presentation and brain region-specific MRI abnormalities was obtained via a standardized questionnaire. Patients with a movement disorder more often had MRI abnormalities in putamen, caudate, cortex, ventricles and external CSF spaces than patients without or with minor neurological symptoms. Putaminal MRI changes and strongly dilated ventricles were identified as the most reliable predictors of a movement disorder. In contrast, abnormalities in globus pallidus were not clearly associated with a movement disorder. Caudate and putamen as well as cortex, ventricles and external CSF spaces clearly colocalized on a two-dimensional map, demonstrating statistical similarity and suggesting the same underlying pathomechanism. This study demonstrates that complex statistical methods are useful to decipher the age-dependent and region-specific MRI patterns of rare neurometabolic diseases and that these methods are helpful to elucidate the clinical relevance of specific MRI findings.
González, Juan R; Carrasco, Josep L; Armengol, Lluís; Villatoro, Sergi; Jover, Lluís; Yasui, Yutaka; Estivill, Xavier
2008-01-01
Background The MLPA method is a potentially useful semi-quantitative method to detect copy number alterations in targeted regions. In this paper, we propose a method for the normalization procedure based on a non-linear mixed-model, as well as a new approach for determining the statistical significance of altered probes based on a linear mixed-model. This method establishes a threshold by using different tolerance intervals that accommodate the specific random error variability observed in each test sample. Results Through simulation studies we have shown that our proposed method outperforms two existing methods that are based on simple threshold rules or iterative regression. We have illustrated the method using a controlled MLPA assay in which targeted regions are variable in copy number in individuals suffering from different disorders such as Prader-Willi, DiGeorge or Autism, where it showed the best performance. Conclusion Using the proposed mixed-model, we are able to determine thresholds to decide whether a region is altered. These thresholds are specific for each individual, incorporating experimental variability, resulting in improved sensitivity and specificity as the examples with real data have revealed. PMID:18522760
Akazawa, K; Nakamura, T; Moriguchi, S; Shimada, M; Nose, Y
1991-07-01
Small sample properties of the maximum partial likelihood estimates for Cox's proportional hazards model depend on the sample size, the true values of regression coefficients, covariate structure, censoring pattern and possibly baseline hazard functions. Therefore, it would be difficult to construct a formula or table to calculate the exact power of a statistical test for the treatment effect in any specific clinical trial. The simulation program, written in SAS/IML, described in this paper uses Monte-Carlo methods to provide estimates of the exact power for Cox's proportional hazards model. For illustrative purposes, the program was applied to real data obtained from a clinical trial performed in Japan. Since the program does not assume any specific function for the baseline hazard, it is, in principle, applicable to any censored survival data as long as they follow Cox's proportional hazards model.
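The Monte Carlo idea described above can be sketched in a few lines. The original program is written in SAS/IML and fits Cox's model; this Python sketch swaps in the log-rank test, which is the score test of the Cox model for a single binary treatment covariate, and assumes exponential survival and censoring times. All function names, parameters and default values are illustrative assumptions.

```python
import numpy as np
from statistics import NormalDist


def logrank_z(time, event, group):
    """Standardized log-rank statistic for two groups (group coded 0/1)."""
    obs_minus_exp = 0.0
    variance = 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        n = at_risk.sum()
        n1 = (at_risk & (group == 1)).sum()
        failed = (time == t) & (event == 1)
        d = failed.sum()
        d1 = (failed & (group == 1)).sum()
        # Observed minus expected events in group 1 at this event time.
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 given the risk set.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp / np.sqrt(variance)


def simulated_power(n_per_arm, hazard_ratio, censor_rate=0.5,
                    n_sims=500, alpha=0.05, seed=0):
    """Fraction of simulated trials in which the two-sided test rejects."""
    rng = np.random.default_rng(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    group = np.repeat([0, 1], n_per_arm)
    # Control arm has hazard 1; treated arm has hazard equal to hazard_ratio.
    hazard = np.where(group == 1, hazard_ratio, 1.0)
    rejections = 0
    for _ in range(n_sims):
        survival = rng.exponential(1.0 / hazard)
        censoring = rng.exponential(1.0 / censor_rate, size=group.size)
        time = np.minimum(survival, censoring)
        event = (survival <= censoring).astype(int)
        if abs(logrank_z(time, event, group)) > z_crit:
            rejections += 1
    return rejections / n_sims
```

Calling `simulated_power(100, 0.3)` estimates the power of a trial with 100 patients per arm and a strong treatment effect, while `simulated_power(100, 1.0)` should return a value close to the nominal significance level, which is a useful sanity check on any such simulation.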
Specification of ISS Plasma Environment Variability
NASA Technical Reports Server (NTRS)
Minow, Joseph I.; Neergaard, Linda F.; Bui, Them H.; Mikatarian, Ronald R.; Barsamian, H.; Koontz, Steven L.
2004-01-01
Quantifying spacecraft charging risks and associated hazards for the International Space Station (ISS) requires a plasma environment specification for the natural variability of ionospheric temperature (Te) and density (Ne). Empirical ionospheric specification and forecast models such as the International Reference Ionosphere (IRI) model typically only provide long term (seasonal) mean Te and Ne values for the low Earth orbit environment. This paper describes a statistical analysis of historical ionospheric low Earth orbit plasma measurements from the AE-C, AE-D, and DE-2 satellites used to derive a model of deviations of observed data values from IRI-2001 estimates of Ne, Te parameters for each data point to provide a statistical basis for modeling the deviations of the plasma environment from the IRI model output. Application of the deviation model with the IRI-2001 output yields a method for estimating extreme environments for the ISS spacecraft charging analysis.
Sampling and counting genome rearrangement scenarios
2015-01-01
Background Even for moderate size inputs, there are a tremendous number of optimal rearrangement scenarios, regardless of what the model is and which specific question is to be answered. Therefore giving one optimal solution might be misleading and cannot be used for statistical inference. Statistically well-founded methods are necessary to sample uniformly from the solution space, and then a small number of samples is sufficient for statistical inference. Contribution In this paper, we give a mini-review of the state of the art of sampling and counting rearrangement scenarios, focusing on the reversal, DCJ and SCJ models. In addition, we give a Gibbs sampler for sampling most parsimonious labelings of evolutionary trees under the SCJ model. The method has been implemented and tested on real-life data. The software package together with example data can be downloaded from http://www.renyi.hu/~miklosi/SCJ-Gibbs/ PMID:26452124
Statistical shape analysis using 3D Poisson equation--A quantitatively validated approach.
Gao, Yi; Bouix, Sylvain
2016-05-01
Statistical shape analysis has been an important area of research with applications in biology, anatomy, neuroscience, agriculture, paleontology, etc. Unfortunately, the proposed methods are rarely quantitatively evaluated, and as shown in recent studies, when they are evaluated, significant discrepancies exist in their outputs. In this work, we concentrate on the problem of finding the consistent location of deformation between two population of shapes. We propose a new shape analysis algorithm along with a framework to perform a quantitative evaluation of its performance. Specifically, the algorithm constructs a Signed Poisson Map (SPoM) by solving two Poisson equations on the volumetric shapes of arbitrary topology, and statistical analysis is then carried out on the SPoMs. The method is quantitatively evaluated on synthetic shapes and applied on real shape data sets in brain structures. Copyright © 2016 Elsevier B.V. All rights reserved.
Significance of noisy signals in periodograms
NASA Astrophysics Data System (ADS)
Süveges, Maria
2015-08-01
The detection of tiny periodic signals in noisy and irregularly sampled time series is a challenging task. Once a small peak is found in the periodogram, the next step is to see how probable it is that pure noise produced a peak so extreme - that is to say, compute its False Alarm Probability (FAP). This useful measure quantifies the statistical plausibility of the found signal among the noise. However, its derivation from statistical principles is very hard due to the specificities of astronomical periodograms, such as oversampling and the ensuing strong correlation among its values at different frequencies. I will present a method to compute the FAP based on extreme-value statistics (Süveges 2014), and compare it to two other methods proposed by Baluev (2008) and Paltani (2004) and Schwarzenberg-Czerny (2012) on signals with various signal shapes and at different signal-to-noise ratios.
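For orientation, the classical baseline that correlation-aware FAP methods improve on can be sketched in a few lines. The sketch assumes the periodogram is normalized so that pure-noise power at a single frequency follows a unit exponential distribution, and that the peak was the maximum over `m` statistically independent frequencies. This is exactly the independence assumption that the abstract says fails for oversampled periodograms, and the function name and parameters are illustrative only. It is not the extreme-value method of Süveges (2014).

```python
import math


def fap_independent(z: float, m: int) -> float:
    """Baseline False Alarm Probability of a periodogram peak of height z.

    Assumptions (illustrative, not taken from the abstract):
      * noise power at one frequency is Exp(1)-distributed, so
        P(power > z) = exp(-z);
      * the peak is the maximum over m independent frequencies.
    Under these assumptions FAP = 1 - (1 - exp(-z))**m.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    if z <= 0.0:
        return 1.0  # any non-positive threshold is always exceeded
    # log1p/expm1 keep the result accurate when the FAP is very small
    return -math.expm1(m * math.log1p(-math.exp(-z)))


if __name__ == "__main__":
    for peak in (5.0, 10.0, 15.0):
        print(peak, fap_independent(peak, m=1000))
```

Because strong correlation between neighbouring frequencies makes the effective number of independent frequencies unknown, `m` here is at best a tuning parameter, which is the gap the methods compared in the abstract aim to close.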
Kao, Hui-Ju; Weng, Shun-Long; Huang, Kai-Yao; Kaunang, Fergie Joanda; Hsu, Justin Bo-Kai; Huang, Chien-Hsun; Lee, Tzong-Yi
2017-12-21
Carbonylation, which takes place through oxidation of reactive oxygen species (ROS) on specific residues, is an irreversibly oxidative modification of proteins. It has been reported that the carbonylation is related to a number of metabolic or aging diseases including diabetes, chronic lung disease, Parkinson's disease, and Alzheimer's disease. Due to the lack of computational methods dedicated to exploring motif signatures of protein carbonylation sites, we were motivated to exploit an iterative statistical method to characterize and identify carbonylated sites with motif signatures. By manually curating experimental data from research articles, we obtained 332, 144, 135, and 140 verified substrate sites for K (lysine), R (arginine), T (threonine), and P (proline) residues, respectively, from 241 carbonylated proteins. In order to examine the informative attributes for classifying between carbonylated and non-carbonylated sites, multifarious features including composition of twenty amino acids (AAC), composition of amino acid pairs (AAPC), position-specific scoring matrix (PSSM), and positional weighted matrix (PWM) were investigated in this study. Additionally, in an attempt to explore the motif signatures of carbonylation sites, an iterative statistical method was adopted to detect statistically significant dependencies of amino acid compositions between specific positions around substrate sites. Profile hidden Markov model (HMM) was then utilized to train a predictive model from each motif signature. Moreover, based on the method of support vector machine (SVM), we adopted it to construct an integrative model by combining the values of bit scores obtained from profile HMMs. The combinatorial model could provide an enhanced performance with evenly predictive sensitivity and specificity in the evaluation of cross-validation and independent testing. 
This study provides a new scheme for exploring potential motif signatures at substrate sites of protein carbonylation. The usefulness of the revealed motifs in the identification of carbonylated sites is demonstrated by their effective performance in cross-validation and independent testing. Finally, these substrate motifs were adopted to build an available online resource (MDD-Carb, http://csb.cse.yzu.edu.tw/MDDCarb/ ) and are also anticipated to facilitate the study of large-scale carbonylated proteomes.
This paper describes the application and method performance parameters of a Luminex xMAP™ bead-based, multiplex immunoassay for measuring specific antibody responses in saliva samples (n=5438) to antigens of six common waterborne pathogens (Campylobacter jejuni, Helicobacter pylo...
Agro-ecoregionalization of Iowa using multivariate geographical clustering
Carol L. Williams; William W. Hargrove; Matt Leibman; David E. James
2008-01-01
Agro-ecoregionalization is categorization of landscapes for use in crop suitability analysis, strategic agroeconomic development, risk analysis, and other purposes. Past agro-ecoregionalizations have been subjective, expert opinion driven, crop specific, and unsuitable for statistical extrapolation. Use of quantitative analytical methods provides an opportunity for...
Kournetas, N; Spintzyk, S; Schweizer, E; Sawada, T; Said, F; Schmid, P; Geis-Gerstorfer, J; Eliades, G; Rupp, F
2017-08-01
Comparability of topographical data of implant surfaces in literature is low and their clinical relevance often equivocal. The aim of this study was to investigate the ability of scanning electron microscopy and optical interferometry to assess statistically similar 3-dimensional roughness parameter results and to evaluate these data based on predefined criteria regarded relevant for a favorable biological response. Four different commercial dental screw-type implants (NanoTite Certain Prevail, TiUnite Brånemark Mk III, XiVE S Plus and SLA Standard Plus) were analyzed by stereo scanning electron microscopy and white light interferometry. Surface height, spatial and hybrid roughness parameters (Sa, Sz, Ssk, Sku, Sal, Str, Sdr) were assessed from raw and filtered data (Gaussian 50μm and 5μm cut-off-filters), respectively. Data were statistically compared by one-way ANOVA and Tukey-Kramer post-hoc test. For a clinically relevant interpretation, a categorizing evaluation approach was used based on predefined threshold criteria for each roughness parameter. The two methods exhibited predominantly statistical differences. Dependent on roughness parameters and filter settings, both methods showed variations in rankings of the implant surfaces and differed in their ability to discriminate the different topographies. Overall, the analyses revealed scale-dependent roughness data. Compared to the pure statistical approach, the categorizing evaluation resulted in much more similarities between the two methods. This study suggests to reconsider current approaches for the topographical evaluation of implant surfaces and to further seek after proper experimental settings. Furthermore, the specific role of different roughness parameters for the bioresponse has to be studied in detail in order to better define clinically relevant, scale-dependent and parameter-specific thresholds and ranges. Copyright © 2017 The Academy of Dental Materials. Published by Elsevier Ltd. 
All rights reserved.
Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J
2008-01-01
ABSTRACT The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, Hotelling T-square test, and, finally out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps-sets of genes coordinately involved in key biological processes-with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. 
Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Piepel, Gregory F.; Matzke, Brett D.; Sego, Landon H.
2013-04-27
This report discusses the methodology, formulas, and inputs needed to make characterization and clearance decisions for Bacillus anthracis-contaminated and uncontaminated (or decontaminated) areas using a statistical sampling approach. Specifically, the report includes the methods and formulas for calculating (a) the number of samples required to achieve a specified confidence in characterization and clearance decisions and (b) the confidence in making characterization and clearance decisions for a specified number of samples, for two common statistically based environmental sampling approaches. In particular, the report addresses an issue raised by the Government Accountability Office by providing methods and formulas to calculate the confidence that a decision area is uncontaminated (or successfully decontaminated) if all samples collected according to a statistical sampling approach have negative results. Key to addressing this topic is the probability that an individual sample result is a false negative, which is commonly referred to as the false negative rate (FNR). The two statistical sampling approaches currently discussed in this report are 1) hotspot sampling to detect small isolated contaminated locations during the characterization phase, and 2) combined judgment and random (CJR) sampling during the clearance phase. Typically if contamination is widely distributed in a decision area, it will be detectable via judgment sampling during the characterization phase. Hotspot sampling is appropriate for characterization situations where contamination is not widely distributed and may not be detected by judgment sampling. CJR sampling is appropriate during the clearance phase when it is desired to augment judgment samples with statistical (random) samples. The hotspot and CJR statistical sampling approaches are discussed in the report for four situations: 1. qualitative data (detect and non-detect) when the FNR = 0 or when using statistical sampling methods that account for FNR > 0 2. qualitative data when the FNR > 0 but statistical sampling methods are used that assume the FNR = 0 3. quantitative data (e.g., contaminant concentrations expressed as CFU/cm²) when the FNR = 0 or when using statistical sampling methods that account for FNR > 0 4. quantitative data when the FNR > 0 but statistical sampling methods are used that assume the FNR = 0. For Situation 2, the hotspot sampling approach provides for stating with Z% confidence that a hotspot of specified shape and size with detectable contamination will be found. Also for Situation 2, the CJR approach provides for stating with X% confidence that at least Y% of the decision area does not contain detectable contamination. Forms of these statements for the other three situations are discussed in Section 2.2. Statistical methods that account for FNR > 0 currently only exist for the hotspot sampling approach with qualitative data (or quantitative data converted to qualitative data). This report documents the current status of methods and formulas for the hotspot and CJR sampling approaches. Limitations of these methods are identified. Extensions of the methods that are applicable when FNR = 0 to account for FNR > 0, or to address other limitations, will be documented in future revisions of this report if future funding supports the development of such extensions. For quantitative data, this report also presents statistical methods and formulas for 1. quantifying the uncertainty in measured sample results 2. estimating the true surface concentration corresponding to a surface sample 3. quantifying the uncertainty of the estimate of the true surface concentration. All of the methods and formulas discussed in the report were applied to example situations to illustrate application of the methods and interpretation of the results.
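The "X% confidence that at least Y% of the area is clean" statement can be made concrete with a textbook random-sampling bound. The sketch below is not the report's hotspot or CJR formulas, which are not reproduced in the abstract. It assumes purely random, independent samples from a large decision area, all of which came back negative, and a single false negative rate applied to every sample; the function names are hypothetical. If more than a fraction 1 - Y of the area were contaminated, each sample would be negative with probability at most Y + (1 - Y)·FNR, so n negatives give confidence 1 - (Y + (1 - Y)·FNR)^n.

```python
import math


def clearance_confidence(n: int, clean_fraction: float, fnr: float = 0.0) -> float:
    """Confidence that at least `clean_fraction` of the area is clean,
    given n independent random samples that were all negative.

    Simplified textbook bound (an assumption of this sketch, not the
    report's CJR method): large area, purely random samples, common FNR.
    """
    if n < 0 or not 0.0 < clean_fraction < 1.0 or not 0.0 <= fnr < 1.0:
        raise ValueError("invalid arguments")
    # Largest possible chance of one negative result if the claim is false
    p_negative = clean_fraction + (1.0 - clean_fraction) * fnr
    return 1.0 - p_negative ** n


def samples_needed(confidence: float, clean_fraction: float, fnr: float = 0.0) -> int:
    """Smallest n for which clearance_confidence(n, ...) >= confidence."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    if not 0.0 < clean_fraction < 1.0 or not 0.0 <= fnr < 1.0:
        raise ValueError("invalid arguments")
    p_negative = clean_fraction + (1.0 - clean_fraction) * fnr
    return math.ceil(math.log(1.0 - confidence) / math.log(p_negative))


if __name__ == "__main__":
    print(samples_needed(0.95, 0.95))           # FNR = 0
    print(samples_needed(0.95, 0.95, fnr=0.2))  # more samples once FNR > 0
```

The second call illustrates why assuming FNR = 0 when it is actually positive (Situations 2 and 4 in the abstract) overstates the confidence achieved by a given number of samples.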
Mayer, Brian P.; DeHope, Alan J.; Mew, Daniel A.; ...
2016-03-24
Attribution of the origin of an illicit drug relies on identification of compounds indicative of its clandestine production and is a key component of many modern forensic investigations. Here, the results of these studies can yield detailed information on method of manufacture, starting material source, and final product, all critical forensic evidence. In the present work, chemical attribution signatures (CAS) associated with the synthesis of the analgesic fentanyl, N-(1-phenylethylpiperidin-4-yl)-N-phenylpropanamide, were investigated. Six synthesis methods, all previously published fentanyl synthetic routes or hybrid versions thereof, were studied in an effort to identify and classify route-specific signatures. A total of 160 distinct compounds and inorganic species were identified using gas and liquid chromatographies combined with mass spectrometric methods (gas chromatography/mass spectrometry (GC/MS) and liquid chromatography–tandem mass spectrometry-time-of-flight (LC–MS/MS-TOF)) in conjunction with inductively coupled plasma mass spectrometry (ICPMS). The complexity of the resultant data matrix urged the use of multivariate statistical analysis. Using partial least-squares-discriminant analysis (PLS-DA), 87 route-specific CAS were classified and a statistical model capable of predicting the method of fentanyl synthesis was validated and tested against CAS profiles from crude fentanyl products deposited and later extracted from two operationally relevant surfaces: stainless steel and vinyl tile. Finally, this work provides the most detailed fentanyl CAS investigation to date by using orthogonal mass spectral data to identify CAS of forensic significance for illicit drug detection, profiling, and attribution.
A survey of design methods for failure detection in dynamic systems
NASA Technical Reports Server (NTRS)
Willsky, A. S.
1975-01-01
A number of methods for detecting abrupt changes (such as failures) in stochastic dynamical systems are surveyed. The class of linear systems is concentrated on but the basic concepts, if not the detailed analyses, carry over to other classes of systems. The methods surveyed range from the design of specific failure-sensitive filters, to the use of statistical tests on filter innovations, to the development of jump process formulations. Tradeoffs in complexity versus performance are discussed.
Naeger, D M; Chang, S D; Kolli, P; Shah, V; Huang, W; Thoeni, R F
2011-01-01
Objective The study compared the sensitivity, specificity, confidence and interpretation time of readers of differing experience in diagnosing acute appendicitis with contrast-enhanced CT using neutral vs positive oral contrast agents. Methods Contrast-enhanced CT for right lower quadrant or right flank pain was performed in 200 patients with neutral and 200 with positive oral contrast including 199 with proven acute appendicitis and 201 with other diagnoses. Test set disease prevalence was 50%. Two experienced gastrointestinal radiologists, one fellow and two first-year residents blindly assessed all studies for appendicitis (2000 readings) and assigned confidence scores (1=poor to 4=excellent). Receiver operating characteristic (ROC) curves were generated. Total interpretation time was recorded. Each reader's interpretation with the two agents was compared using standard statistical methods. Results Average reader sensitivity was found to be 96% (range 91–99%) with positive and 95% (89–98%) with neutral oral contrast; specificity was 96% (92–98%) and 94% (90–97%). For each reader, no statistically significant difference was found between the two agents (sensitivities p-values >0.6; specificities p-values>0.08), in the area under the ROC curve (range 0.95–0.99) or in average interpretation times. In cases without appendicitis, positive oral contrast demonstrated improved appendix identification (average 90% vs 78%) and higher confidence scores for three readers. Average interpretation times showed no statistically significant differences between the agents. Conclusion Neutral vs positive oral contrast does not affect the accuracy of contrast-enhanced CT for diagnosing acute appendicitis. Although positive oral contrast might help to identify normal appendices, we continue to use neutral oral contrast given its other potential benefits. PMID:20959365
De Los Ríos, F. A.; Paluszny, M.
2015-01-01
We consider some methods to extract information about the rotator cuff based on magnetic resonance images; the study aims to define an alternative method of display that might facilitate the detection of partial tears in the supraspinatus tendon. Specifically, we are going to use families of ellipsoidal triangular patches to cover the humerus head near the affected area. These patches are going to be textured and displayed with the information of the magnetic resonance images using the trilinear interpolation technique. For the generation of points to texture each patch, we propose a new method that guarantees the uniform distribution of its points using a random statistical method. Its computational cost, defined as the average computing time to generate a fixed number of points, is significantly lower as compared with deterministic and other standard statistical techniques. PMID:25650281
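As background for the point-generation step, a standard way to draw uniformly distributed random points on a flat triangle is the square-root barycentric trick. The sketch below is only that planar textbook technique, stated as an assumption for illustration; the authors' patches are ellipsoidal and their new method is not described in the abstract, so this should not be read as their algorithm. The function name is hypothetical.

```python
import math
import random


def sample_triangle(a, b, c, rng=random):
    """Return one point drawn uniformly from the planar triangle (a, b, c).

    a, b, c are coordinate tuples of equal length (2D or 3D).
    Uses P = (1 - sqrt(r1)) A + sqrt(r1) (1 - r2) B + sqrt(r1) r2 C,
    which is uniform in area for r1, r2 ~ U[0, 1).
    """
    r1, r2 = rng.random(), rng.random()
    s = math.sqrt(r1)  # the square root corrects for area growing with distance from A
    u, v, w = 1.0 - s, s * (1.0 - r2), s * r2  # barycentric weights, sum to 1
    return tuple(u * pa + v * pb + w * pc for pa, pb, pc in zip(a, b, c))


if __name__ == "__main__":
    gen = random.Random(0)  # seeded generator for reproducibility
    tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
    points = [sample_triangle(*tri, rng=gen) for _ in range(5)]
    print(points)
```

Points generated this way on a flat triangle are no longer uniform after projection onto a curved (for example ellipsoidal) surface, which is presumably part of what a dedicated method for such patches has to address.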
Fuzzy-logic based strategy for validation of multiplex methods: example with qualitative GMO assays.
Bellocchi, Gianni; Bertholet, Vincent; Hamels, Sandrine; Moens, W; Remacle, José; Van den Eede, Guy
2010-02-01
This paper illustrates the advantages that a fuzzy-based aggregation method could bring into the validation of a multiplex method for GMO detection (DualChip GMO kit, Eppendorf). Guidelines for validation of chemical, bio-chemical, pharmaceutical and genetic methods have been developed and ad hoc validation statistics are available and routinely used, for in-house and inter-laboratory testing, and decision-making. Fuzzy logic allows summarising the information obtained by independent validation statistics into one synthetic indicator of overall method performance. The microarray technology, introduced for simultaneous identification of multiple GMOs, poses specific validation issues (patterns of performance for a variety of GMOs at different concentrations). A fuzzy-based indicator for overall evaluation is illustrated in this paper, and applied to validation data for different genetically modified elements. Remarks were drawn on the analytical results. The fuzzy-logic based rules were shown to be applicable to improve interpretation of results and facilitate overall evaluation of the multiplex method.
Reusable Software Component Retrieval via Normalized Algebraic Specifications
1991-12-01
outputs. In fact, this method of query is simpler for matching since it relieves the system from the burden of generating a test set. Eichmann [Eich91...September 1991. [Eich91] Eichmann, David A., "Selecting Reusable Components Using Algebraic Specifications", Proceedings of the Second International...Technology Atlanta, Georgia 30332-0800 12. Dr. David Eichmann 1 Department of Statistics and Computer Science Knapp Hall West Virginia University Morgantown, West Virginia 26506 226
Sawmill simulation: concepts and computer use
Hugh W. Reynolds; Charles J. Gatchell
1969-01-01
Product specifications were fed into a computer so that the yield of products from the same sample of logs could be determined for simulated sawing methods. Since different sawing patterns were tested on the same sample, variation among log samples was eliminated; hence, the statistical conclusions are very precise.
Modulation of Molecular Markers by CLA.
1998-10-01
sequence information obtained for each gene fragment, a gene-specific primer was synthesized (Integrated DNA Technology, Inc, Coralville, IA) as the down...G.W. and Cochran, W.G. (1967) Statistical Methods, Ed. 6 Iowa University Press. 81. JK Beckman, T Yoshioka, SM Knobel, HL Green. Biphasic changes in
Statistical detection of patterns in unidimensional distributions by continuous wavelet transforms
NASA Astrophysics Data System (ADS)
Baluev, R. V.
2018-04-01
Objective detection of specific patterns in statistical distributions, like groupings or gaps or abrupt transitions between different subsets, is a task with a rich range of applications in astronomy: Milky Way stellar population analysis, investigations of the exoplanets diversity, Solar System minor bodies statistics, extragalactic studies, etc. We adapt the powerful technique of the wavelet transforms to this generalized task, making a strong emphasis on the assessment of the patterns detection significance. Among other things, our method also involves optimal minimum-noise wavelets and minimum-noise reconstruction of the distribution density function. Based on this development, we construct a self-closed algorithmic pipeline aimed to process statistical samples. It is currently applicable to single-dimensional distributions only, but it is flexible enough to undergo further generalizations and development.
Statistical analysis of arthroplasty data
2011-01-01
It is envisaged that guidelines for statistical analysis and presentation of results will improve the quality and value of research. The Nordic Arthroplasty Register Association (NARA) has therefore developed guidelines for the statistical analysis of arthroplasty register data. The guidelines are divided into two parts, one with an introduction and a discussion of the background to the guidelines (Ranstam et al. 2011a, see pages x-y in this issue), and this one with a more technical statistical discussion on how specific problems can be handled. This second part contains (1) recommendations for the interpretation of methods used to calculate survival, (2) recommendations on how to deal with bilateral observations, and (3) a discussion of problems and pitfalls associated with analysis of factors that influence survival or comparisons between outcomes extracted from different hospitals. PMID:21619500
Brown, Patrick O.
2013-01-01
Background High throughput molecular-interaction studies using immunoprecipitations (IP) or affinity purifications are powerful and widely used in biology research. One of many important applications of this method is to identify the set of RNAs that interact with a particular RNA-binding protein (RBP). Here, the unique statistical challenge presented is to delineate a specific set of RNAs that are enriched in one sample relative to another, typically a specific IP compared to a non-specific control to model background. The choice of normalization procedure critically impacts the number of RNAs that will be identified as interacting with an RBP at a given significance threshold – yet existing normalization methods make assumptions that are often fundamentally inaccurate when applied to IP enrichment data. Methods In this paper, we present a new normalization methodology that is specifically designed for identifying enriched RNA or DNA sequences in an IP. The normalization (called adaptive or AD normalization) uses a basic model of the IP experiment and is not a variant of mean, quantile, or other methodology previously proposed. The approach is evaluated statistically and tested with simulated and empirical data. Results and Conclusions The adaptive (AD) normalization method results in a greatly increased range in the number of enriched RNAs identified, fewer false positives, and overall better concordance with independent biological evidence, for the RBPs we analyzed, compared to median normalization. The approach is also applicable to the study of pairwise RNA, DNA and protein interactions such as the analysis of transcription factors via chromatin immunoprecipitation (ChIP) or any other experiments where samples from two conditions, one of which contains an enriched subset of the other, are studied. PMID:23349766
NASA Astrophysics Data System (ADS)
Spampinato, A.; Axinte, D. A.
2017-12-01
The mechanisms of interaction between bodies with statistically arranged features present characteristics common to different abrasive processes, such as dressing of abrasive tools. In contrast with the current empirical approach used to estimate the results of operations based on attritive interactions, the method we present in this paper allows us to predict the output forces and the topography of a simulated grinding wheel for a set of specific operational parameters (speed ratio and radial feed-rate), providing a thorough understanding of the complex mechanisms regulating these processes. In modelling the dressing mechanisms, the abrasive characteristics of both bodies (grain size, geometry, inter-space and protrusion) are first simulated; thus, their interaction is simulated in terms of grain collisions. Exploiting a specifically designed contact/impact evaluation algorithm, the model simulates the collisional effects of the dresser abrasives on the grinding wheel topography (grain fracture/break-out). The method has been tested for the case of a diamond rotary dresser, predicting output forces within less than 10% error and obtaining experimentally validated grinding wheel topographies. The study provides a fundamental understanding of the dressing operation, enabling the improvement of its performance in an industrial scenario, while being of general interest in modelling collision-based processes involving statistically distributed elements.
Causes of Death Data in the Global Burden of Disease Estimates for Ischemic and Hemorrhagic Stroke
Truelsen, Thomas; Krarup, Lars-Henrik; Iversen, Helle; Mensah, George A.; Feigin, Valery; Sposato, Luciano; Naghavi, Mohsen
2015-01-01
Background Stroke mortality estimates in the Global Burden of Disease (GBD) study are based on routine mortality statistics and redistribution of ill-defined codes that cannot be a cause of death, the so-called “garbage codes”. This study describes the contribution of these codes to stroke mortality estimates. Methods All available mortality data were compiled and non-specific cause codes were redistributed based on literature review and statistical methods. Ill-defined codes were redistributed to their specific cause of disease by age, sex, country, and year. The reassignment was done based on the international classification of diseases and the pathology behind each code by checking multiple causes of death and literature review. Results Unspecified stroke, and primary and secondary hypertension are leading contributing “garbage codes” to stroke mortality estimates for intracranial hemorrhagic stroke and ischemic stroke. There were marked differences in the fraction of death assigned to ischemic stroke and hemorrhagic stroke for unspecified stroke and hypertension between GBD regions and between age groups. Conclusions A large proportion of stroke fatalities is derived from the redistribution of “unspecified stroke” and “hypertension” with marked regional differences. Future advancements in stroke certification, data collections, and statistical analyses may improve the estimation of the global stroke burden. PMID:26505189
2013-01-01
Background The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, which is potentially more useful knowledge from a biological perspective and those methods currently implemented are often permutation-based. Results One property of some permutation-based tests is that their power varies as a function of whether significant markers are in regions of linkage disequilibrium (LD) or not, which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and outcome, both of whose power does not vary as a function of LD structure. One method uses dimension reduction to “filter” redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only their Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. We evaluate the versatility of the modification of the summary-statistic test since the specification of correlation structure between markers can be inaccurate. 
Conclusion We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach and a borderline significant association using the summary-statistic based approach. We also implement the summary-statistic test using Z-statistics from an already-published GWAS of Chronic Obstructive Pulmonary Disorder (COPD) and correlation structure obtained from HapMap. We experiment with the modification of this test because the correlation structure is assumed imperfectly known. PMID:24199751
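The idea of scaling marker Z-statistics by their correlation matrix can be illustrated with a minimal Python sketch. This is not the authors' summary-statistic test, whose exact form the abstract does not give. It uses a generic combined statistic: the sum of the Z-statistics divided by the square root of the sum of all entries of the marker correlation matrix, which is standard normal under the null when the Z-statistics have that correlation. The function names and the two-marker inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def combined_z(z, corr):
    """Sum of marker Z-statistics divided by its null standard deviation.

    Under the null, Var(sum of Z) equals the sum of all entries of the
    marker correlation matrix, so the ratio is N(0, 1).
    """
    m = len(z)
    if len(corr) != m or any(len(row) != m for row in corr):
        raise ValueError("corr must be an m x m matrix matching z")
    var_sum = sum(sum(row) for row in corr)
    return sum(z) / sqrt(var_sum)


def two_sided_p(stat):
    """Two-sided p-value of a standard normal statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(stat)))


# Hypothetical example: two markers with identical Z-statistics.
z = [2.0, 2.0]
independent = [[1.0, 0.0], [0.0, 1.0]]  # no LD assumed
in_ld = [[1.0, 0.8], [0.8, 1.0]]        # markers in strong LD

for label, corr in (("independent", independent), ("in LD", in_ld)):
    stat = combined_z(z, corr)
    print(label, round(stat, 3), round(two_sided_p(stat), 4))
```

Ignoring the correlation overstates the evidence here, because two markers in LD carry partly redundant information. This is the kind of dependence on LD structure that the abstract sets out to control.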
Swanson, David M; Blacker, Deborah; Alchawa, Taofik; Ludwig, Kerstin U; Mangold, Elisabeth; Lange, Christoph
2013-11-07
The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, which is potentially more useful knowledge from a biological perspective and those methods currently implemented are often permutation-based. One property of some permutation-based tests is that their power varies as a function of whether significant markers are in regions of linkage disequilibrium (LD) or not, which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and outcome, both of whose power does not vary as a function of LD structure. One method uses dimension reduction to "filter" redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only their Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. We evaluate the versatility of the modification of the summary-statistic test since the specification of correlation structure between markers can be inaccurate. 
We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach and a borderline significant association using the summary-statistic based approach. We also implement the summary-statistic test using Z-statistics from an already-published GWAS of Chronic Obstructive Pulmonary Disorder (COPD) and correlation structure obtained from HapMap. We experiment with the modification of this test because the correlation structure is assumed imperfectly known.
Lotfy, Hayam Mahmoud; Salem, Hesham; Abdelkawy, Mohammad; Samir, Ahmed
2015-04-05
Five spectrophotometric methods were successfully developed and validated for the determination of betamethasone valerate and fusidic acid in their binary mixture. Those methods are isoabsorptive point method combined with the first derivative (ISO Point--D1) and the recently developed and well established methods namely ratio difference (RD) and constant center coupled with spectrum subtraction (CC) methods, in addition to derivative ratio (1DD) and mean centering of ratio spectra (MCR). New enrichment technique called spectrum addition technique was used instead of traditional spiking technique. The proposed spectrophotometric procedures do not require any separation steps. Accuracy, precision and linearity ranges of the proposed methods were determined and the specificity was assessed by analyzing synthetic mixtures of both drugs. They were applied to their pharmaceutical formulation and the results obtained were statistically compared to that of official methods. The statistical comparison showed that there is no significant difference between the proposed methods and the official ones regarding both accuracy and precision. Copyright © 2015 Elsevier B.V. All rights reserved.
Baseline Estimation and Outlier Identification for Halocarbons
NASA Astrophysics Data System (ADS)
Wang, D.; Schuck, T.; Engel, A.; Gallman, F.
2017-12-01
The aim of this paper is to build a baseline model for halocarbons and to statistically identify outliers under specific conditions. Time series of regional CFC-11 and chloromethane measurements are discussed; they were taken over the last 4 years at two locations, a monitoring station northwest of Frankfurt am Main (Germany) and the Mace Head station (Ireland). In addition to analyzing these time series, the paper introduces a statistical approach to outlier identification in order to obtain a better estimate of the baseline. A second-order polynomial plus harmonics is fitted to the CFC-11 and chloromethane mixing ratio data. Measurements lying far from the fitted curve are regarded as outliers and flagged. Under a specified requirement, the routine is applied iteratively without the flagged measurements until no additional outliers are found. Both the model fitting and the proposed outlier identification method are implemented in Python. During the period, CFC-11 shows a gradual downward trend, while the mixing ratios of chloromethane show a slight upward trend. The concentration of chloromethane also has a strong seasonal variation, mostly due to the seasonal cycle of OH. Applying this statistical method has a considerable effect on the results: it efficiently identifies a series of outliers according to the standard deviation requirements, and after the outliers are removed, the fitting curves and trend estimates are more reliable.
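The fit-and-flag routine described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it assumes time in decimal years, a single annual harmonic, a threshold of three residual standard deviations, and NumPy for the least-squares fit. The function names and all parameter values are hypothetical.

```python
import numpy as np


def design_matrix(t, n_harmonics=1):
    """Columns 1, t, t^2, then sin/cos pairs with a one-year base period."""
    cols = [np.ones_like(t), t, t ** 2]
    for k in range(1, n_harmonics + 1):
        cols.append(np.sin(2 * np.pi * k * t))
        cols.append(np.cos(2 * np.pi * k * t))
    return np.column_stack(cols)


def fit_baseline(t, y, n_sigma=3.0, n_harmonics=1, max_iter=50):
    """Fit polynomial plus harmonics, flag distant points, refit until stable.

    Returns the fitted coefficients and a boolean mask of flagged outliers.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    X = design_matrix(t, n_harmonics)
    keep = np.ones(t.size, dtype=bool)
    for _ in range(max_iter):
        coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        resid = y - X @ coef
        # Residual spread of the points still in the fit.
        sigma = resid[keep].std(ddof=X.shape[1])
        newly_flagged = keep & (np.abs(resid) > n_sigma * sigma)
        if not newly_flagged.any():
            break
        keep &= ~newly_flagged
    # Final refit so the coefficients match the final set of kept points.
    coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
    return coef, ~keep


# Synthetic example: downward trend, seasonal cycle, three injected spikes.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0, 200)
y = 240.0 - 1.5 * t + 0.05 * t ** 2 + 0.8 * np.sin(2 * np.pi * t)
y = y + rng.normal(0.0, 0.1, t.size)
y[[20, 90, 150]] += 5.0
coef, flagged = fit_baseline(t, y)
print("linear trend coefficient:", coef[1])
print("flagged indices:", np.flatnonzero(flagged))
```

Refitting after each round of flagging matters because large outliers inflate the residual standard deviation, which can hide smaller outliers on the first pass.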
Fuangrod, Todsaporn; Greer, Peter B; Simpson, John; Zwan, Benjamin J; Middleton, Richard H
2017-03-13
Purpose Due to increasing complexity, modern radiotherapy techniques require comprehensive quality assurance (QA) programmes that to date generally focus on the pre-treatment stage. The purpose of this paper is to provide a method for an individual patient treatment QA evaluation and identification of a "quality gap" for continuous quality improvement. Design/methodology/approach Statistical process control (SPC) was applied to evaluate treatment delivery using in vivo electronic portal imaging device (EPID) dosimetry. A moving range control chart was constructed to monitor the individual patient treatment performance based on a control limit generated from initial data of 90 intensity-modulated radiotherapy (IMRT) and ten volumetric-modulated arc therapy (VMAT) patient deliveries. A process capability index was used to evaluate the continuing treatment quality based on three quality classes: treatment type-specific, treatment linac-specific, and body site-specific. Findings The determined control limits were 62.5 and 70.0 per cent of the χ pass-rate for IMRT and VMAT deliveries, respectively. In total, 14 patients were selected for a pilot study, the results of which showed that about 1 per cent of all treatments contained errors relating to unexpected anatomical changes between treatment fractions. Both rectum and pelvis cancer treatments had process capability indices less than 1, indicating potential for quality improvement; these treatments may benefit from further assessment. Research limitations/implications The study relied on the application of in vivo EPID dosimetry for patients treated at the specific centre. The patient sample used to generate the control limits was limited to 100 patients. Whilst the quantitative results are specific to the clinical techniques and equipment used, the described method is generally applicable to IMRT and VMAT treatment QA.
Whilst more work is required to determine the level of clinical significance, the authors have demonstrated the capability of the method for both treatment specific QA and continuing quality improvement. Practical implications The proposed method is a valuable tool for assessing the accuracy of treatment delivery whilst also improving treatment quality and patient safety. Originality/value Assessing in vivo EPID dosimetry with SPC can be used to improve the quality of radiation treatment for cancer patients.
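For readers unfamiliar with moving range charts, the calculation can be sketched as below. This is a generic individuals/moving-range chart with a one-sided lower capability index, offered as an assumption about the kind of computation involved rather than the authors' exact procedure. The constants 1.128 and 3.267 are the standard control-chart constants for moving ranges of two consecutive points. The pass-rate values and the lower specification are hypothetical.

```python
from statistics import mean

D2 = 1.128  # converts the mean moving range (span 2) to a sigma estimate
D4 = 3.267  # upper control limit factor for the moving range chart


def moving_range_limits(x):
    """Centre line and 3-sigma limits from an individuals/moving-range chart."""
    if len(x) < 2:
        raise ValueError("need at least two observations")
    moving_ranges = [abs(b - a) for a, b in zip(x, x[1:])]
    mr_bar = mean(moving_ranges)
    sigma = mr_bar / D2
    centre = mean(x)
    return {
        "centre": centre,
        "lcl": centre - 3 * sigma,
        "ucl": centre + 3 * sigma,
        "mr_ucl": D4 * mr_bar,
        "sigma": sigma,
    }


def lower_capability(x, lower_spec):
    """One-sided capability index: (mean - lower spec) / (3 sigma)."""
    limits = moving_range_limits(x)
    return (limits["centre"] - lower_spec) / (3 * limits["sigma"])


# Hypothetical per-fraction pass rates (per cent) for one patient.
pass_rates = [90.0, 92.0, 91.0, 93.0, 92.0]
limits = moving_range_limits(pass_rates)
print(limits)
print("capability vs. lower spec of 85:", lower_capability(pass_rates, 85.0))
```

An index below 1 means the lower specification sits within three estimated standard deviations of the process mean, which is how the abstract's "indices less than 1" can be read.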
An Exercise in Exploring Big Data for Producing Reliable Statistical Information.
Rey-Del-Castillo, Pilar; Cardeñosa, Jesús
2016-06-01
The availability of copious data about many human, social, and economic phenomena is considered an opportunity for the production of official statistics. National statistical organizations and other institutions are more and more involved in new projects for developing what is sometimes seen as a possible change of paradigm in the way statistical figures are produced. Nevertheless, there are hardly any systems in production using Big Data sources. Arguments of confidentiality, data ownership, representativeness, and others make it a difficult task to get results in the short term. Using Call Detail Records from Ivory Coast as an illustration, this article shows some of the issues that must be dealt with when producing statistical indicators from Big Data sources. A proposal of a graphical method to evaluate one specific aspect of the quality of the computed figures is also presented, demonstrating that the visual insight provided improves the results obtained using other traditional procedures.
Interactive semiautomatic contour delineation using statistical conditional random fields framework.
Hu, Yu-Chi; Grossberg, Michael D; Wu, Abraham; Riaz, Nadeem; Perez, Carmen; Mageras, Gig S
2012-07-01
Contouring a normal anatomical structure during radiation treatment planning requires significant time and effort. The authors present a fast and accurate semiautomatic contour delineation method to reduce the time and effort required of expert users. Following an initial segmentation on one CT slice, the user marks the target organ and nontarget pixels with a few simple brush strokes. The algorithm calculates statistics from this information that, in turn, determines the parameters of an energy function containing both boundary and regional components. The method uses a conditional random field graphical model to define the energy function to be minimized for obtaining an estimated optimal segmentation, and a graph partition algorithm to efficiently solve the energy function minimization. Organ boundary statistics are estimated from the segmentation and propagated to subsequent images; regional statistics are estimated from the simple brush strokes that are either propagated or redrawn as needed on subsequent images. This greatly reduces the user input needed and speeds up segmentations. The proposed method can be further accelerated with graph-based interpolation of alternating slices in place of user-guided segmentation. CT images from phantom and patients were used to evaluate this method. The authors determined the sensitivity and specificity of organ segmentations using physician-drawn contours as ground truth, as well as the predicted-to-ground truth surface distances. Finally, three physicians evaluated the contours for subjective acceptability. Interobserver and intraobserver analysis was also performed and Bland-Altman plots were used to evaluate agreement. Liver and kidney segmentations in patient volumetric CT images show that boundary samples provided on a single CT slice can be reused through the entire 3D stack of images to obtain accurate segmentation. 
In liver, our method has better sensitivity and specificity (0.925 and 0.995) than region growing (0.897 and 0.995) and level set methods (0.912 and 0.985) as well as shorter mean predicted-to-ground truth distance (2.13 mm) compared to regional growing (4.58 mm) and level set methods (8.55 mm and 4.74 mm). Similar results are observed in kidney segmentation. Physician evaluation of ten liver cases showed that 83% of contours did not need any modification, while 6% of contours needed modifications as assessed by two or more evaluators. In interobserver and intraobserver analysis, Bland-Altman plots showed our method to have better repeatability than the manual method while the delineation time was 15% faster on average. Our method achieves high accuracy in liver and kidney segmentation and considerably reduces the time and labor required for contour delineation. Since it extracts purely statistical information from the samples interactively specified by expert users, the method avoids heuristic assumptions commonly used by other methods. In addition, the method can be expanded to 3D directly without modification because the underlying graphical framework and graph partition optimization method fit naturally with the image grid structure.
Harrison, Thomas; Ruiz, Jaime; Sloan, Daniel B.; Ben-Hur, Asa; Boucher, Christina
2016-01-01
Pentatricopeptide repeat containing proteins (PPRs) bind to RNA transcripts originating from mitochondria and plastids. There are two classes of PPR proteins. The P class contains tandem P-type motif sequences, and the PLS class contains alternating P, L and S type sequences. In this paper, we describe a novel tool that predicts PPR-RNA interaction; specifically, our method, which we call aPPRove, determines where and how a PLS-class PPR protein will bind to RNA when given a PPR and one or more RNA transcripts by using a combinatorial binding code for site specificity proposed by Barkan et al. Our results demonstrate that aPPRove successfully locates how and where a PPR protein belonging to the PLS class can bind to RNA. For each binding event it outputs the binding site, the amino-acid-nucleotide interaction, and its statistical significance. Furthermore, we show that our method can be used to predict binding events for PLS-class proteins using a known edit site and the statistical significance of aligning the PPR protein to that site. In particular, we use our method to make a conjecture regarding an interaction between CLB19 and the second intronic region of ycf3. The aPPRove web server can be found at www.cs.colostate.edu/~approve. PMID:27560805
Ozaki, Vitor A.; Ghosh, Sujit K.; Goodwin, Barry K.; Shirota, Ricardo
2009-01-01
This article presents a statistical model of agricultural yield data based on a set of hierarchical Bayesian models that allows joint modeling of temporal and spatial autocorrelation. This method captures a comprehensive range of the various uncertainties involved in predicting crop insurance premium rates as opposed to the more traditional ad hoc, two-stage methods that are typically based on independent estimation and prediction. A panel data set of county-average yield data was analyzed for 290 counties in the State of Paraná (Brazil) for the period of 1990 through 2002. Posterior predictive criteria are used to evaluate different model specifications. This article provides substantial improvements in the statistical and actuarial methods often applied to the calculation of insurance premium rates. These improvements are especially relevant to situations where data are limited. PMID:19890450
Statistical Detection of Atypical Aircraft Flights
NASA Technical Reports Server (NTRS)
Statler, Irving; Chidester, Thomas; Shafto, Michael; Ferryman, Thomas; Amidan, Brett; Whitney, Paul; White, Amanda; Willse, Alan; Cooley, Scott; Jay, Joseph;
2006-01-01
A computational method and software to implement the method have been developed to sift through vast quantities of digital flight data to alert human analysts to aircraft flights that are statistically atypical in ways that signify that safety may be adversely affected. On a typical day, there are tens of thousands of flights in the United States and several times that number throughout the world. Depending on the specific aircraft design, the volume of data collected by sensors and flight recorders can range from a few dozen to several thousand parameters per second during a flight. Whereas these data have long been utilized in investigating crashes, the present method is oriented toward helping to prevent crashes by enabling routine monitoring of flight operations to identify portions of flights that may be of interest with respect to safety issues.
NASA Astrophysics Data System (ADS)
Kushnir, A. F.; Troitsky, E. V.; Haikin, L. M.; Dainty, A.
1999-06-01
A semi-automatic procedure has been developed to achieve statistically optimum discrimination between earthquakes and explosions at local or regional distances based on a learning set specific to a given region. The method is used for step-by-step testing of candidate discrimination features to find the optimum (combination) subset of features, with the decision taken on a rigorous statistical basis. Linear (LDF) and Quadratic (QDF) Discriminant Functions based on Gaussian distributions of the discrimination features are implemented and statistically grounded; the features may be transformed by the Box-Cox transformation $z = (1/\alpha)(y^{\alpha} - 1)$ to make them more Gaussian. Tests of the method were successfully conducted on seismograms from the Israel Seismic Network using features consisting of spectral ratios between and within phases. Results showed that the QDF was more effective than the LDF and required five features out of 18 candidates for the optimum set. It was found that discrimination improved with increasing distance within the local range, and that eliminating transformation of the features and failing to correct for noise led to degradation of discrimination.
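The Box-Cox transformation named in this abstract is simple to compute. The sketch below is illustrative only: the function name is hypothetical, and the logarithm used when alpha is zero is the standard limiting case of the formula, which the abstract itself does not mention.

```python
import math


def box_cox(y, alpha):
    """Box-Cox transform z = (y**alpha - 1) / alpha for positive y.

    For alpha == 0 the limiting form log(y) is used.
    """
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if abs(alpha) < 1e-12:
        return math.log(y)
    return (y ** alpha - 1.0) / alpha


# Hypothetical spectral-ratio values transformed with alpha = 0.5.
for value in (0.25, 1.0, 4.0):
    print(value, box_cox(value, 0.5))
```

With alpha = 1 the transform only shifts the data by one unit, so values of alpha away from 1 are the ones that change the shape of the distribution.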
A survey of design methods for failure detection in dynamic systems
NASA Technical Reports Server (NTRS)
Willsky, A. S.
1975-01-01
A number of methods for the detection of abrupt changes (such as failures) in stochastic dynamical systems were surveyed. The class of linear systems was emphasized, but the basic concepts, if not the detailed analyses, carry over to other classes of systems. The methods surveyed range from the design of specific failure-sensitive filters, to the use of statistical tests on filter innovations, to the development of jump process formulations. Tradeoffs in complexity versus performance are discussed.
[The research protocol III. Study population].
Arias-Gómez, Jesús; Villasís-Keever, Miguel Ángel; Miranda-Novales, María Guadalupe
2016-01-01
The study population is defined as a set of cases, determined, limited, and accessible, that will constitute the subjects for the selection of the sample, and must fulfill several characteristics and distinct criteria. The objectives of this manuscript are focused on specifying each one of the elements required to select the participants of a research project during the elaboration of the protocol, including the concepts of study population, sample, selection criteria and sampling methods. After delineating the study population, the researcher must specify the criteria that each participant has to meet. The criteria that describe these specific characteristics are called selection or eligibility criteria. These criteria are inclusion, exclusion and elimination, and they delineate the eligible population. The sampling methods are divided into two large groups: 1) probabilistic or random sampling and 2) non-probabilistic sampling. The difference lies in the use of statistical methods to select the subjects. In every research project, it is necessary to establish at the beginning the specific number of participants to be included to achieve the objectives of the study. This number is the sample size, and it can be calculated or estimated with mathematical formulas and statistical software.
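As one concrete example of the "mathematical formulas" mentioned above, the sketch below computes the sample size needed to estimate a proportion with a given margin of error, n = z^2 p(1 - p) / d^2. The abstract does not name this formula. It is a common textbook choice that assumes simple random sampling from a large population, and the function and parameter names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_proportion(p, margin, confidence=0.95):
    """Participants needed to estimate a proportion p within +/- margin."""
    if not 0 < p < 1:
        raise ValueError("p must be between 0 and 1")
    if margin <= 0 or not 0 < confidence < 1:
        raise ValueError("margin must be positive and confidence in (0, 1)")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil(z * z * p * (1 - p) / margin ** 2)


# Most conservative case (p = 0.5) with a 5-point margin at 95% confidence.
print(sample_size_proportion(0.5, 0.05))  # 385
```

Using p = 0.5 gives the largest sample size for a given margin, which is why it is a frequent default when no prior estimate of the proportion is available.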
Statistical Coupling Analysis-Guided Library Design for the Discovery of Mutant Luciferases.
Liu, Mira D; Warner, Elliot A; Morrissey, Charlotte E; Fick, Caitlyn W; Wu, Taia S; Ornelas, Marya Y; Ochoa, Gabriela V; Zhang, Brendan S; Rathbun, Colin M; Porterfield, William B; Prescher, Jennifer A; Leconte, Aaron M
2018-02-06
Directed evolution has proven to be an invaluable tool for protein engineering; however, there is still a need for developing new approaches to continue to improve the efficiency and efficacy of these methods. Here, we demonstrate a new method for library design that applies a previously developed bioinformatic method, Statistical Coupling Analysis (SCA). SCA uses homologous enzymes to identify amino acid positions that are mutable and functionally important and engage in synergistic interactions between amino acids. We use SCA to guide a library of the protein luciferase and demonstrate that, in a single round of selection, we can identify luciferase mutants with several valuable properties. Specifically, we identify luciferase mutants that possess both red-shifted emission spectra and improved stability relative to those of the wild-type enzyme. We also identify luciferase mutants that possess a >50-fold change in specificity for modified luciferins. To understand the mutational origin of these improved mutants, we demonstrate the role of mutations at N229, S239, and G246 in altered function. These studies show that SCA can be used to guide library design and rapidly identify synergistic amino acid mutations from a small library.
Bellomo-Brandao, Maria Angela; Andrade, Paula D; Costa, Sandra CB; Escanhoela, Cecilia AF; Vassallo, Jose; Porta, Gilda; De Tommaso, Adriana MA; Hessel, Gabriel
2009-01-01
AIM: To determine cytomegalovirus (CMV) frequency in neonatal intrahepatic cholestasis by serology, histological revision (searching for cytomegalic cells), immunohistochemistry, and polymerase chain reaction (PCR), and to verify the relationships among these methods. METHODS: The study comprised 101 non-consecutive infants submitted for hepatic biopsy between March 1982 and December 2005. Serological results were obtained from the patients' files and the other methods were performed on paraffin-embedded liver samples from hepatic biopsies. The following statistical measures were calculated: frequency, sensitivity, specificity, positive predictive value, negative predictive value, and accuracy. RESULTS: The frequencies of positive results were as follows: serology, 7/64 (11%); histological revision, 0/84; immunohistochemistry, 1/44 (2%), and PCR, 6/77 (8%). Only one patient had positive immunohistochemical findings and a positive PCR. The following statistical measures were calculated between PCR and serology: sensitivity, 33.3%; specificity, 88.89%; positive predictive value, 28.57%; negative predictive value, 90.91%; and accuracy, 82.35%. CONCLUSION: The frequency of positive CMV varied among the tests. Serology presented the highest positive frequency. When compared to PCR, the sensitivity and positive predictive value of serology were low. PMID:19610143
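The measures reported for serology against PCR all follow from a 2x2 table. The sketch below shows the arithmetic. The abstract does not report the cell counts, so the counts used here (2 true positives, 5 false positives, 4 false negatives, 40 true negatives) are an assumption, back-calculated because they reproduce the published percentages. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures of an index test against a reference test."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "positive predictive value": tp / (tp + fp),
        "negative predictive value": tn / (tn + fn),
        "accuracy": (tp + tn) / total,
    }


# Assumed counts (not stated in the abstract): serology vs. PCR as reference.
metrics = diagnostic_metrics(tp=2, fp=5, fn=4, tn=40)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```

With only six PCR-positive infants in such a table, each additional true positive would move the sensitivity by about 17 percentage points, so the low sensitivity estimate carries wide uncertainty.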
Statistical Selection of Biological Models for Genome-Wide Association Analyses.
Bi, Wenjian; Kang, Guolian; Pounds, Stanley B
2018-05-24
Genome-wide association studies have discovered many biologically important associations of genes with phenotypes. Typically, genome-wide association analyses formally test the association of each genetic feature (SNP, CNV, etc) with the phenotype of interest and summarize the results with multiplicity-adjusted p-values. However, very small p-values only provide evidence against the null hypothesis of no association without indicating which biological model best explains the observed data. Correctly identifying a specific biological model may improve the scientific interpretation and can be used to more effectively select and design a follow-up validation study. Thus, statistical methodology to identify the correct biological model for a particular genotype-phenotype association can be very useful to investigators. Here, we propose a general statistical method to summarize how accurately each of five biological models (null, additive, dominant, recessive, co-dominant) represents the data observed for each variant in a GWAS study. We show that the new method stringently controls the false discovery rate and asymptotically selects the correct biological model. Simulations of two-stage discovery-validation studies show that the new method has these properties and that its validation power is similar to or exceeds that of simple methods that use the same statistical model for all SNPs. Example analyses of three data sets also highlight these advantages of the new method. An R package is freely available at www.stjuderesearch.org/site/depts/biostats/maew. Copyright © 2018. Published by Elsevier Inc.
Statistical testing and power analysis for brain-wide association study.
Gong, Weikang; Wan, Lin; Lu, Wenlian; Ma, Liang; Cheng, Fan; Cheng, Wei; Grünewald, Stefan; Feng, Jianfeng
2018-04-05
The identification of connexel-wise associations, which involves examining functional connectivities between pairwise voxels across the whole brain, is both statistically and computationally challenging. Although such a connexel-wise methodology has recently been adopted by brain-wide association studies (BWAS) to identify connectivity changes in several mental disorders, such as schizophrenia, autism and depression, the multiple correction and power analysis methods designed specifically for connexel-wise analysis are still lacking. Therefore, we herein report the development of a rigorous statistical framework for connexel-wise significance testing based on the Gaussian random field theory. It includes controlling the family-wise error rate (FWER) of multiple hypothesis testings using topological inference methods, and calculating power and sample size for a connexel-wise study. Our theoretical framework can control the false-positive rate accurately, as validated empirically using two resting-state fMRI datasets. Compared with Bonferroni correction and false discovery rate (FDR), it can reduce false-positive rate and increase statistical power by appropriately utilizing the spatial information of fMRI data. Importantly, our method bypasses the need of non-parametric permutation to correct for multiple comparison, thus, it can efficiently tackle large datasets with high resolution fMRI images. The utility of our method is shown in a case-control study. Our approach can identify altered functional connectivities in a major depression disorder dataset, whereas existing methods fail. A software package is available at https://github.com/weikanggong/BWAS. Copyright © 2018 Elsevier B.V. All rights reserved.
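The two baseline corrections that this abstract compares against, Bonferroni and the false discovery rate, can be sketched in a few lines. This does not implement the Gaussian random field method itself. The FDR procedure shown is the Benjamini-Hochberg step-up rule, which is an assumption, since the abstract does not say which FDR procedure was used. The p-values are hypothetical.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject when p <= alpha / m; controls the family-wise error rate."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, q=0.05):
    """Step-up rule: reject the k smallest p-values, where k is the
    largest rank with p_(k) <= k * q / m."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m:
            k_max = rank
    reject = [False] * m
    for i in order[:k_max]:
        reject[i] = True
    return reject


# Hypothetical p-values for ten connexel-wise tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print("Bonferroni rejections:", sum(bonferroni(pvals)))
print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg(pvals)))
```

Neither rule uses the spatial arrangement of the tests, which is the information the abstract says its topological inference approach exploits.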
A novel data-driven learning method for radar target detection in nonstationary environments
Akcakaya, Murat; Nehorai, Arye; Sen, Satyabrata
2016-04-12
Most existing radar algorithms are developed under the assumption that the environment (clutter) is stationary. However, in practice, the characteristics of the clutter can vary enormously depending on the radar-operational scenarios. If unaccounted for, these nonstationary variabilities may drastically hinder the radar performance. Therefore, to overcome such shortcomings, we develop a data-driven method for target detection in nonstationary environments. In this method, the radar dynamically detects changes in the environment and adapts to these changes by learning the new statistical characteristics of the environment and by intelligibly updating its statistical detection algorithm. Specifically, we employ drift detection algorithms to detect changes in the environment; incremental learning, particularly learning under concept drift algorithms, to learn the new statistical characteristics of the environment from the new radar data that become available in batches over a period of time. The newly learned environment characteristics are then integrated in the detection algorithm. Furthermore, we use Monte Carlo simulations to demonstrate that the developed method provides a significant improvement in the detection performance compared with detection techniques that are not aware of the environmental changes.
Networking—a statistical physics perspective
NASA Astrophysics Data System (ADS)
Yeung, Chi Ho; Saad, David
2013-03-01
Networking encompasses a variety of tasks related to the communication of information on networks; it has a substantial economic and societal impact on a broad range of areas including transportation systems, wired and wireless communications and a range of Internet applications. As transportation and communication networks become increasingly more complex, the ever increasing demand for congestion control, higher traffic capacity, quality of service, robustness and reduced energy consumption requires new tools and methods to meet these conflicting requirements. The new methodology should serve for gaining better understanding of the properties of networking systems at the macroscopic level, as well as for the development of new principled optimization and management algorithms at the microscopic level. Methods of statistical physics seem best placed to provide new approaches as they have been developed specifically to deal with nonlinear large-scale systems. This review aims at presenting an overview of tools and methods that have been developed within the statistical physics community and that can be readily applied to address the emerging problems in networking. These include diffusion processes, methods from disordered systems and polymer physics, probabilistic inference, which have direct relevance to network routing, file and frequency distribution, the exploration of network structures and vulnerability, and various other practical networking applications.
Water Quality Sensing and Spatio-Temporal Monitoring Structure with Autocorrelation Kernel Methods.
Vizcaíno, Iván P; Carrera, Enrique V; Muñoz-Romero, Sergio; Cumbal, Luis H; Rojo-Álvarez, José Luis
2017-10-16
Pollution of water resources is usually analyzed with monitoring campaigns, which consist of programmed sampling, measurement, and recording of the most representative water quality parameters. These campaign measurements yield a non-uniformly sampled spatio-temporal data structure that characterizes complex dynamic phenomena. In this work, we propose an enhanced statistical interpolation method to provide water quality managers with statistically interpolated representations of spatial-temporal dynamics. Specifically, our proposal makes efficient use of the a priori available information of the quality parameter measurements through Support Vector Regression (SVR) based on Mercer's kernels. The methods are benchmarked against previously proposed methods in three segments of the Machángara River and one segment of the San Pedro River in Ecuador, and their different dynamics are shown by statistically interpolated spatial-temporal maps. The best interpolation performance in terms of mean absolute error was obtained by the SVR with Mercer's kernel given by either the Mahalanobis spatial-temporal covariance matrix or the bivariate estimated autocorrelation function. In particular, the autocorrelation kernel provides a significant improvement in estimation quality, consistently for all six water quality variables, which points to the relevance of including a priori knowledge of the problem.
Water Quality Sensing and Spatio-Temporal Monitoring Structure with Autocorrelation Kernel Methods
Vizcaíno, Iván P.; Muñoz-Romero, Sergio; Cumbal, Luis H.
2017-01-01
Pollution of water resources is usually analyzed with monitoring campaigns, which consist of programmed sampling, measurement, and recording of the most representative water quality parameters. These campaign measurements yield a non-uniformly sampled spatio-temporal data structure that characterizes complex dynamic phenomena. In this work, we propose an enhanced statistical interpolation method to provide water quality managers with statistically interpolated representations of spatial-temporal dynamics. Specifically, our proposal makes efficient use of the a priori available information of the quality parameter measurements through Support Vector Regression (SVR) based on Mercer’s kernels. The methods are benchmarked against previously proposed methods in three segments of the Machángara River and one segment of the San Pedro River in Ecuador, and their different dynamics are shown by statistically interpolated spatial-temporal maps. The best interpolation performance in terms of mean absolute error was obtained by the SVR with Mercer’s kernel given by either the Mahalanobis spatial-temporal covariance matrix or the bivariate estimated autocorrelation function. In particular, the autocorrelation kernel provides a significant improvement in estimation quality, consistently for all six water quality variables, which points to the relevance of including a priori knowledge of the problem. PMID:29035333
"Adultspan" Publication Patterns: Author and Article Characteristics from 1999 to 2009
ERIC Educational Resources Information Center
Erford, Bradley T.; Clark, Kelly H.; Erford, Breann M.
2011-01-01
Publication patterns of articles in "Adultspan" from 1999 to 2009 were reviewed. Author characteristics and article content were analyzed to determine trends over time. Research articles were analyzed specifically for type of research design, classification, sampling method, types of participants, sample size, types of statistics used, and…
Federal Register 2010, 2011, 2012, 2013, 2014
2010-09-27
... Coordinating Committee (CCC), require that the Councils' science and statistical committee (SSC) members... Council's Internet site, with alternative methods of retrieval for specific documents. The words ``to the... restrictions on lobbying; the procedures for Council member nomination, including timing for submission of...
How to Engage Medical Students in Chronobiology: An Example on Autorhythmometry
ERIC Educational Resources Information Center
Rol de Lama, M. A.; Lozano, J. P.; Ortiz, V.; Sanchez-Vazquez, F. J.; Madrid, J. A.
2005-01-01
This contribution describes a new laboratory experience that improves medical students' learning of chronobiology by introducing them to basic chronobiology concepts as well as to methods and statistical analysis tools specific for circadian rhythms. We designed an autorhythmometry laboratory session where students simultaneously played the role…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Akcakaya, Murat; Nehorai, Arye; Sen, Satyabrata
Most existing radar algorithms are developed under the assumption that the environment (clutter) is stationary. However, in practice, the characteristics of the clutter can vary enormously depending on the radar-operational scenarios. If unaccounted for, these nonstationary variabilities may drastically hinder the radar performance. Therefore, to overcome such shortcomings, we develop a data-driven method for target detection in nonstationary environments. In this method, the radar dynamically detects changes in the environment and adapts to these changes by learning the new statistical characteristics of the environment and by intelligibly updating its statistical detection algorithm. Specifically, we employ drift detection algorithms to detect changes in the environment; incremental learning, particularly learning under concept drift algorithms, to learn the new statistical characteristics of the environment from the new radar data that become available in batches over a period of time. The newly learned environment characteristics are then integrated in the detection algorithm. Furthermore, we use Monte Carlo simulations to demonstrate that the developed method provides a significant improvement in the detection performance compared with detection techniques that are not aware of the environmental changes.
Zhu, Wensheng; Yuan, Ying; Zhang, Jingwen; Zhou, Fan; Knickmeyer, Rebecca C; Zhu, Hongtu
2017-02-01
The aim of this paper is to systematically evaluate a biased sampling issue associated with genome-wide association analysis (GWAS) of imaging phenotypes for most imaging genetic studies, including the Alzheimer's Disease Neuroimaging Initiative (ADNI). Specifically, the original sampling scheme of these imaging genetic studies is primarily the retrospective case-control design, whereas most existing statistical analyses of these studies ignore this sampling scheme by directly correlating imaging phenotypes (called the secondary traits) with genotype. Although it has been well documented in genetic epidemiology that ignoring the case-control sampling scheme can produce highly biased estimates, and subsequently lead to misleading results and suspicious associations, such findings are not well documented in imaging genetics. We use extensive simulations and a large-scale imaging genetic data analysis of the ADNI data to evaluate the effects of the case-control sampling scheme on GWAS results based on some standard statistical methods, such as linear regression methods, while comparing them with several advanced statistical methods that appropriately adjust for the case-control sampling scheme. Copyright © 2016 Elsevier Inc. All rights reserved.
Defining window-boundaries for genomic analyses using smoothing spline techniques
Beissinger, Timothy M.; Rosa, Guilherme J.M.; Kaeppler, Shawn M.; ...
2015-04-17
High-density genomic data is often analyzed by combining information over windows of adjacent markers. Interpretation of data grouped in windows versus at individual locations may increase statistical power, simplify computation, reduce sampling noise, and reduce the total number of tests performed. However, use of adjacent marker information can result in over- or under-smoothing, undesirable window boundary specifications, or highly correlated test statistics. We introduce a method for defining windows based on statistically guided breakpoints in the data, as a foundation for the analysis of multiple adjacent data points. This method involves first fitting a cubic smoothing spline to the data and then identifying the inflection points of the fitted spline, which serve as the boundaries of adjacent windows. This technique does not require prior knowledge of linkage disequilibrium, and therefore can be applied to data collected from individual or pooled sequencing experiments. Moreover, in contrast to existing methods, an arbitrary choice of window size is not necessary, since these are determined empirically and allowed to vary along the genome.
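The two-step idea described above (smooth, then cut at inflection points) can be sketched as follows. This is only an illustration: a centered moving average stands in for the cubic smoothing spline, inflection points are approximated by sign changes of the discrete second difference, and all function names are hypothetical rather than taken from the authors' software.

```python
def moving_average(values, half_width):
    """Stand-in smoother (NOT a cubic smoothing spline): centered moving average."""
    n = len(values)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

def inflection_indices(smoothed):
    """Indices where the discrete second difference changes sign."""
    boundaries = []
    previous_sign = 0
    for i in range(1, len(smoothed) - 1):
        second = smoothed[i - 1] - 2 * smoothed[i] + smoothed[i + 1]
        sign = (second > 0) - (second < 0)
        if sign == 0:
            continue  # flat curvature: wait for the next non-zero sign
        if previous_sign != 0 and sign != previous_sign:
            boundaries.append(i)
        previous_sign = sign
    return boundaries

def windows_from_boundaries(n_markers, boundaries):
    """Turn boundary indices into half-open (start, end) marker windows."""
    edges = [0] + list(boundaries) + [n_markers]
    return [(edges[i], edges[i + 1]) for i in range(len(edges) - 1)]
```

Window sizes then follow from the data rather than from a fixed width, which is the property the abstract emphasizes.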
Campbell, J Q; Petrella, A J
2016-09-06
Population-based modeling of the lumbar spine has the potential to be a powerful clinical tool. However, developing a fully parameterized model of the lumbar spine with accurate geometry has remained a challenge. The current study used automated methods for landmark identification to create a statistical shape model of the lumbar spine. The shape model was evaluated using compactness, generalization ability, and specificity. The primary shape modes were analyzed visually, quantitatively, and biomechanically. The biomechanical analysis was performed by using the statistical shape model with an automated method for finite element model generation to create a fully parameterized finite element model of the lumbar spine. Functional finite element models of the mean shape and the extreme shapes (±3 standard deviations) of all 17 shape modes were created demonstrating the robust nature of the methods. This study represents an advancement in finite element modeling of the lumbar spine and will allow population-based modeling in the future. Copyright © 2016 Elsevier Ltd. All rights reserved.
Bootstrapping under constraint for the assessment of group behavior in human contact networks
NASA Astrophysics Data System (ADS)
Tremblay, Nicolas; Barrat, Alain; Forest, Cary; Nornberg, Mark; Pinton, Jean-François; Borgnat, Pierre
2013-11-01
The increasing availability of time- and space-resolved data describing human activities and interactions gives insights into both static and dynamic properties of human behavior. In practice, nevertheless, real-world data sets can often be considered as only one realization of a particular event. This highlights a key issue in social network analysis: the statistical significance of estimated properties. In this context, we focus here on the assessment of quantitative features of a specific subset of nodes in empirical networks. We present a method of statistical resampling based on bootstrapping groups of nodes under constraints within the empirical network. The method enables us to define acceptance intervals for various null hypotheses concerning relevant properties of the subset of nodes under consideration in order to characterize by a statistical test its behavior as “normal” or not. We apply this method to a high-resolution data set describing the face-to-face proximity of individuals during two colocated scientific conferences. As a case study, we show how to probe whether colocating the two conferences succeeded in bringing together the two corresponding groups of scientists.
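A minimal sketch of the resampling idea, under assumptions not stated in the abstract: the constraint is taken to be "same number of nodes per class as the group of interest", the statistic is the number of edges internal to the group, and replicate groups are drawn without replacement. The class labels and function names are hypothetical.

```python
import random

def internal_edges(group, edges):
    """Count edges whose two endpoints both belong to the group."""
    members = set(group)
    return sum(1 for a, b in edges if a in members and b in members)

def constrained_replicate(group, node_class, rng):
    """Draw a random node set with the same per-class counts as `group`."""
    by_class = {}
    for node, cls in node_class.items():
        by_class.setdefault(cls, []).append(node)
    needed = {}
    for node in group:
        needed[node_class[node]] = needed.get(node_class[node], 0) + 1
    replicate = []
    for cls, count in needed.items():
        replicate.extend(rng.sample(sorted(by_class[cls]), count))
    return replicate

def acceptance_interval(group, edges, node_class, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval of the statistic over constrained replicates."""
    rng = random.Random(seed)
    stats = sorted(
        internal_edges(constrained_replicate(group, node_class, rng), edges)
        for _ in range(n_boot)
    )
    lower = stats[int((alpha / 2) * (n_boot - 1))]
    upper = stats[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper
```

The observed value for the real group would be called atypical under this null if it falls outside the returned interval.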
Hu, Xiangdong; Liu, Yujiang; Qian, Linxue
2017-10-01
Real-time elastography (RTE) and shear wave elastography (SWE) are noninvasive and easily available imaging techniques that measure tissue strain, and it has been reported that the sensitivity and specificity of elastography are better than those of conventional technologies in differentiating between benign and malignant thyroid nodules. Relevant articles were searched in multiple databases; the comparison of elasticity index (EI) was conducted with Review Manager 5.0. Forest plots of the sensitivity and specificity and SROC curves of RTE and SWE were produced with STATA 10.0 software. In addition, sensitivity analysis and bias analysis of the studies were conducted to examine the quality of articles; to estimate possible publication bias, a funnel plot was used and the Egger test was conducted. Finally, 22 articles that satisfied the inclusion criteria were included in this study. After exclusions, there were 2106 benign and 613 malignant nodules. The meta-analysis suggested that the difference of EI between benign and malignant nodules was statistically significant (SMD = 2.11, 95% CI [1.67, 2.55], P < .00001). The overall sensitivities of RTE and SWE were roughly comparable, whereas the difference of specificities between these 2 methods was statistically significant. In addition, a statistically significant difference of AUC was observed between RTE and SWE (P < .01). The specificity of RTE was statistically higher than that of SWE, which suggests that, compared with SWE, RTE may be more accurate in differentiating benign and malignant thyroid nodules.
A user-targeted synthesis of the VALUE perfect predictor experiment
NASA Astrophysics Data System (ADS)
Maraun, Douglas; Widmann, Martin; Gutierrez, Jose; Kotlarski, Sven; Hertig, Elke; Wibig, Joanna; Rössler, Ole; Huth, Radan
2016-04-01
VALUE is an open European network to validate and compare downscaling methods for climate change research. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. We consider different aspects: (1) marginal aspects such as mean, variance and extremes; (2) temporal aspects such as spell length characteristics; (3) spatial aspects such as the de-correlation length of precipitation extremes; and (4) multi-variate aspects such as the interplay of temperature and precipitation or scale-interactions. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur. Experiment 1 (perfect predictors): what is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Experiment 2 (Global climate model predictors): how is the overall representation of regional climate, including errors inherited from global climate models? Experiment 3 (pseudo reality): do methods fail in representing regional climate change? Here, we present a user-targeted synthesis of the results of the first VALUE experiment. In this experiment, downscaling methods are driven with ERA-Interim reanalysis data to eliminate global climate model errors, over the period 1979-2008. As reference data we use, depending on the question addressed, (1) observations from 86 meteorological stations distributed across Europe; (2) gridded observations at the corresponding 86 locations or (3) gridded spatially extended observations for selected European regions. With more than 40 contributing methods, this study is the most comprehensive downscaling inter-comparison project so far.
The results clearly indicate that for several aspects, the downscaling skill varies considerably between different methods. For specific purposes, some methods can therefore clearly be excluded.
Statistical modeling of 4D respiratory lung motion using diffeomorphic image registration.
Ehrhardt, Jan; Werner, René; Schmidt-Richberg, Alexander; Handels, Heinz
2011-02-01
Modeling of respiratory motion has become increasingly important in various applications of medical imaging (e.g., radiation therapy of lung cancer). Current modeling approaches are usually confined to intra-patient registration of 3D image data representing the individual patient's anatomy at different breathing phases. We propose an approach to generate a mean motion model of the lung based on thoracic 4D computed tomography (CT) data of different patients to extend the motion modeling capabilities. Our modeling process consists of three steps: an intra-subject registration to generate subject-specific motion models, the generation of an average shape and intensity atlas of the lung as anatomical reference frame, and the registration of the subject-specific motion models to the atlas in order to build a statistical 4D mean motion model (4D-MMM). Furthermore, we present methods to adapt the 4D mean motion model to a patient-specific lung geometry. In all steps, a symmetric diffeomorphic nonlinear intensity-based registration method was employed. The Log-Euclidean framework was used to compute statistics on the diffeomorphic transformations. The presented methods are then used to build a mean motion model of respiratory lung motion using thoracic 4D CT data sets of 17 patients. We evaluate the model by applying it for estimating respiratory motion of ten lung cancer patients. The prediction is evaluated with respect to landmark and tumor motion, and the quantitative analysis results in a mean target registration error (TRE) of 3.3 ±1.6 mm if lung dynamics are not impaired by large lung tumors or other lung disorders (e.g., emphysema). With regard to lung tumor motion, we show that prediction accuracy is independent of tumor size and tumor motion amplitude in the considered data set. However, tumors adhering to non-lung structures degrade local lung dynamics significantly and the model-based prediction accuracy is lower in these cases. 
The statistical respiratory motion model is capable of providing valuable prior knowledge in many fields of applications. We present two examples of possible applications in radiation therapy and image guided diagnosis.
NASA Astrophysics Data System (ADS)
Shinzato, Takashi
2016-12-01
The portfolio optimization problem in which the variances of the return rates of assets are not identical is analyzed in this paper using the methodology of statistical mechanical informatics, specifically, replica analysis. We defined two characteristic quantities of an optimal portfolio, namely, minimal investment risk and investment concentration, in order to solve the portfolio optimization problem and analytically determined their asymptotical behaviors using replica analysis. Numerical experiments were also performed, and a comparison between the results of our simulation and those obtained via replica analysis validated our proposed method.
Assigning African elephant DNA to geographic region of origin: Applications to the ivory trade
Wasser, Samuel K.; Shedlock, Andrew M.; Comstock, Kenine; Ostrander, Elaine A.; Mutayoba, Benezeth; Stephens, Matthew
2004-01-01
Resurgence of illicit trade in African elephant ivory is placing the elephant at renewed risk. Regulation of this trade could be vastly improved by the ability to verify the geographic origin of tusks. We address this need by developing a combined genetic and statistical method to determine the origin of poached ivory. Our statistical approach exploits a smoothing method to estimate geographic-specific allele frequencies over the entire African elephants' range for 16 microsatellite loci, using 315 tissue and 84 scat samples from forest (Loxodonta africana cyclotis) and savannah (Loxodonta africana africana) elephants at 28 locations. These geographic-specific allele frequency estimates are used to infer the geographic origin of DNA samples, such as could be obtained from tusks of unknown origin. We demonstrate that our method alleviates several problems associated with standard assignment methods in this context, and the absolute accuracy of our method is high. Continent-wide, 50% of samples were located within 500 km, and 80% within 932 km of their actual place of origin. Accuracy varied by region (median accuracies: West Africa, 135 km; Central Savannah, 286 km; Central Forest, 411 km; South, 535 km; and East, 697 km). In some cases, allele frequencies vary considerably over small geographic regions, making much finer discriminations possible and suggesting that resolution could be further improved by collection of samples from locations not represented in our study. PMID:15459317
Statistical assessment of the learning curves of health technologies.
Ramsay, C R; Grant, A M; Wallace, S A; Garthwaite, P H; Monk, A F; Russell, I T
2001-01-01
(1) To describe systematically studies that directly assessed the learning curve effect of health technologies. (2) Systematically to identify 'novel' statistical techniques applied to learning curve data in other fields, such as psychology and manufacturing. (3) To test these statistical techniques in data sets from studies of varying designs to assess health technologies in which learning curve effects are known to exist. METHODS - STUDY SELECTION (HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW): For a study to be included, it had to include a formal analysis of the learning curve of a health technology using a graphical, tabular or statistical technique. METHODS - STUDY SELECTION (NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH): For a study to be included, it had to include a formal assessment of a learning curve using a statistical technique that had not been identified in the previous search. METHODS - DATA SOURCES: Six clinical and 16 non-clinical biomedical databases were searched. A limited amount of handsearching and scanning of reference lists was also undertaken. METHODS - DATA EXTRACTION (HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW): A number of study characteristics were abstracted from the papers such as study design, study size, number of operators and the statistical method used. METHODS - DATA EXTRACTION (NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH): The new statistical techniques identified were categorised into four subgroups of increasing complexity: exploratory data analysis; simple series data analysis; complex data structure analysis, generic techniques. METHODS - TESTING OF STATISTICAL METHODS: Some of the statistical methods identified in the systematic searches for single (simple) operator series data and for multiple (complex) operator series data were illustrated and explored using three data sets. 
The first was a case series of 190 consecutive laparoscopic fundoplication procedures performed by a single surgeon; the second was a case series of consecutive laparoscopic cholecystectomy procedures performed by ten surgeons; the third was randomised trial data derived from the laparoscopic procedure arm of a multicentre trial of groin hernia repair, supplemented by data from non-randomised operations performed during the trial. RESULTS - HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW: Of 4571 abstracts identified, 272 (6%) were later included in the study after review of the full paper. Some 51% of studies assessed a surgical minimal access technique and 95% were case series. The statistical method used most often (60%) was splitting the data into consecutive parts (such as halves or thirds), with only 14% attempting a more formal statistical analysis. The reporting of the studies was poor, with 31% giving no details of data collection methods. RESULTS - NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH: Of 9431 abstracts assessed, 115 (1%) were deemed appropriate for further investigation and, of these, 18 were included in the study. All of the methods for complex data sets were identified in the non-clinical literature. These were discriminant analysis, two-stage estimation of learning rates, generalised estimating equations, multilevel models, latent curve models, time series models and stochastic parameter models. In addition, eight new shapes of learning curves were identified. RESULTS - TESTING OF STATISTICAL METHODS: No one particular shape of learning curve performed significantly better than another. The performance of 'operation time' as a proxy for learning differed between the three procedures. Multilevel modelling using the laparoscopic cholecystectomy data demonstrated and measured surgeon-specific and confounding effects. 
The inclusion of non-randomised cases, despite the possible limitations of the method, enhanced the interpretation of learning effects. CONCLUSIONS - HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW: The statistical methods used for assessing learning effects in health technology assessment have been crude and the reporting of studies poor. CONCLUSIONS - NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH: A number of statistical methods for assessing learning effects were identified that had not hitherto been used in health technology assessment. There was a hierarchy of methods for the identification and measurement of learning, and the more sophisticated methods for both have had little if any use in health technology assessment. This demonstrated the value of considering fields outside clinical research when addressing methodological issues in health technology assessment. CONCLUSIONS - TESTING OF STATISTICAL METHODS: It has been demonstrated that the portfolio of techniques identified can enhance investigations of learning curve effects. (ABSTRACT TRUNCATED)
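The most commonly reported approach in the review above, splitting a consecutive case series into parts (halves or thirds) and comparing them, can be sketched as below. The operation times in the usage note are invented for illustration, and the function names are hypothetical.

```python
def split_consecutive(values, parts):
    """Split a case series, in its original order, into `parts` consecutive chunks."""
    n = len(values)
    if parts < 1 or parts > n:
        raise ValueError("parts must be between 1 and the number of cases")
    bounds = [(i * n) // parts for i in range(parts + 1)]
    return [values[bounds[i]:bounds[i + 1]] for i in range(parts)]

def part_means(values, parts):
    """Mean outcome (e.g. operation time) within each consecutive chunk."""
    return [sum(chunk) / len(chunk) for chunk in split_consecutive(values, parts)]
```

For a hypothetical series of operation times in minutes, `part_means([90, 80, 70, 60, 55, 50, 48, 47, 45], 3)` compares early, middle, and late cases. As the review notes, this is a crude description of learning compared with the more formal models it identifies.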
Simulation of financial market via nonlinear Ising model
NASA Astrophysics Data System (ADS)
Ko, Bonggyun; Song, Jae Wook; Chang, Woojin
2016-09-01
In this research, we propose a practical method for simulating financial return series whose distribution has a specific tail heaviness. We employ the Ising model to generate financial return series analogous to real series. The similarity between the real and simulated financial return series is statistically verified based on their stylized facts, including the power-law behavior of the tail distribution. We also suggest a scheme for setting the parameters in order to simulate financial return series with specific tail behavior. The simulation method introduced in this paper is expected to be applicable to other financial products whose price return distribution is fat-tailed.
Hickey, Graeme L; Dunning, Joel; Seifert, Burkhardt; Sodeck, Gottfried; Carr, Matthew J; Burger, Hans Ulrich; Beyersdorf, Friedhelm
2015-08-01
As part of the peer review process for the European Journal of Cardio-Thoracic Surgery (EJCTS) and the Interactive CardioVascular and Thoracic Surgery (ICVTS), a statistician reviews any manuscript that includes a statistical analysis. To assist authors considering submitting a manuscript and to clarify the expectations of the statistical reviewers, we present up-to-date guidelines for authors on statistical and data reporting specifically in these journals. The number of statistical methods used in the cardiothoracic literature is vast, as are the ways in which data are presented. Therefore, we narrow the scope of these guidelines to cover the most common applications submitted to the EJCTS and ICVTS, focusing in particular on those that the statistical reviewers most frequently comment on. © The Author 2015. Published by Oxford University Press on behalf of the European Association for Cardio-Thoracic Surgery. All rights reserved.
Kennedy, Richard; Pankratz, V. Shane; Swanson, Eric; Watson, David; Golding, Hana; Poland, Gregory A.
2009-01-01
Because of the bioterrorism threat posed by agents such as variola virus, considerable time, resources, and effort have been devoted to biodefense preparation. One avenue of this research has been the development of rapid, sensitive, high-throughput assays to validate immune responses to poxviruses. Here we describe the adaptation of a β-galactosidase reporter-based vaccinia virus neutralization assay to large-scale use in a study that included over 1,000 subjects. We also describe the statistical methods involved in analyzing the large quantity of data generated. The assay and its associated methods should prove useful tools in monitoring immune responses to next-generation smallpox vaccines, studying poxvirus immunity, and evaluating therapeutic agents such as vaccinia virus immune globulin. PMID:19535540
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moses, Alan M.; Chiang, Derek Y.; Pollard, Daniel A.
2004-10-28
We introduce a method (MONKEY) to identify conserved transcription-factor binding sites in multispecies alignments. MONKEY employs probabilistic models of factor specificity and binding site evolution, on which basis we compute the likelihood that putative sites are conserved and assign statistical significance to each hit. Using genomes from the genus Saccharomyces, we illustrate how the significance of real sites increases with evolutionary distance and explore the relationship between conservation and function.
Genome-Wide Specific Selection in Three Domestic Sheep Breeds.
Wang, Huihua; Zhang, Li; Cao, Jiaxve; Wu, Mingming; Ma, Xiaomeng; Liu, Zhen; Liu, Ruizao; Zhao, Fuping; Wei, Caihong; Du, Lixin
2015-01-01
Commercial sheep raised for mutton grow faster than traditional Chinese sheep breeds. Here, we aimed to evaluate genetic selection among three different types of sheep breed: two well-known commercial mutton breeds and one indigenous Chinese breed. We first combined locus-specific branch lengths and di statistical methods to detect candidate regions targeted by selection in the three different populations. The results showed that the genetic distances reached at least medium divergence for each pairwise combination. We found these two methods were highly correlated, and identified many growth-related candidate genes undergoing artificial selection. For production traits, APOBR and FTO are associated with body mass index. For meat traits, ALDOA, STK32B and FAM190A are related to marbling. For reproduction traits, CCNB2 and SLC8A3 affect oocyte development. We also found that two well-known genes, GHR (which affects meat production and quality) and EDAR (associated with hair thickness), were associated with German mutton merino sheep. Furthermore, four genes (POL, RPL7, MSL1 and SHISA9) were associated with pre-weaning gain in our previous genome-wide association study. Our results indicated that combining locus-specific branch lengths and di statistical approaches can narrow the search range for specific selection. We also obtained many credible candidate genes, which not only confirm the results of previous reports but also provide a suite of novel candidate genes in defined breeds to guide hybridization breeding.
Liver segmentation from CT images using a sparse priori statistical shape model (SP-SSM).
Wang, Xuehu; Zheng, Yongchang; Gan, Lan; Wang, Xuan; Sang, Xinting; Kong, Xiangfeng; Zhao, Jie
2017-01-01
This study proposes a new liver segmentation method based on a sparse a priori statistical shape model (SP-SSM). First, mark points are selected in the liver a priori model and the original image. Then, the a priori shape and its mark points are used to obtain a dictionary for the liver boundary information. Second, the sparse coefficient is calculated based on the correspondence between mark points in the original image and those in the a priori model, and then the sparse statistical model is established by combining the sparse coefficients and the dictionary. Finally, the intensity energy and boundary energy models are built based on the intensity information and the specific boundary information of the original image. Then, the sparse matching constraint model is established based on the sparse coding theory. These models jointly drive the iterative deformation of the sparse statistical model to approximate and accurately extract the liver boundaries. This method can solve the problems of deformation model initialization and a priori method accuracy using the sparse dictionary. The SP-SSM can achieve a mean overlap error of 4.8% and a mean volume difference of 1.8%, whereas the average symmetric surface distance and the root mean square symmetric surface distance can reach 0.8 mm and 1.4 mm, respectively.
Developing points-based risk-scoring systems in the presence of competing risks.
Austin, Peter C; Lee, Douglas S; D'Agostino, Ralph B; Fine, Jason P
2016-09-30
Predicting the occurrence of an adverse event over time is an important issue in clinical medicine. Clinical prediction models and associated points-based risk-scoring systems are popular statistical methods for summarizing the relationship between a multivariable set of patient risk factors and the risk of the occurrence of an adverse event. Points-based risk-scoring systems are popular amongst physicians as they permit a rapid assessment of patient risk without the use of computers or other electronic devices. The use of such points-based risk-scoring systems facilitates evidence-based clinical decision making. There is a growing interest in cause-specific mortality and in non-fatal outcomes. However, when considering these types of outcomes, one must account for competing risks whose occurrence precludes the occurrence of the event of interest. We describe how points-based risk-scoring systems can be developed in the presence of competing events. We illustrate the application of these methods by developing risk-scoring systems for predicting cardiovascular mortality in patients hospitalized with acute myocardial infarction. Code in the R statistical programming language is provided for the implementation of the described methods. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
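To make the notion of a points-based score concrete, the sketch below converts regression coefficients into integer points in the general style of Framingham-type scoring: each risk factor's contribution, relative to a reference value, is divided by a constant number of regression units per point and rounded. This is not the authors' R code; the coefficients, reference values, and names are hypothetical, and a competing-risks analysis would take the coefficients from a model that accounts for the competing events.

```python
def points_for(beta, value, reference, units_per_point):
    """Integer points for one risk factor relative to its reference value."""
    return round(beta * (value - reference) / units_per_point)

def total_points(patient, model, units_per_point):
    """Sum the points over all risk factors in a (hypothetical) model.

    `model` maps factor name -> (beta, reference value);
    `patient` maps factor name -> observed value.
    """
    return sum(
        points_for(beta, patient[name], reference, units_per_point)
        for name, (beta, reference) in model.items()
    )

# Hypothetical coefficients: one point is defined as the effect of 5 years of age.
example_model = {"age": (0.05, 50), "diabetes": (0.6, 0)}
example_units_per_point = 5 * 0.05
```

The total score would then be mapped to an estimated risk through a lookup table built from the fitted model, a step omitted here.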
Quantum theory of multiscale coarse-graining.
Han, Yining; Jin, Jaehyeok; Wagner, Jacob W; Voth, Gregory A
2018-03-14
Coarse-grained (CG) models serve as a powerful tool to simulate molecular systems at much longer temporal and spatial scales. Previously, CG models and methods have been built upon classical statistical mechanics. The present paper develops a theory and numerical methodology for coarse-graining in quantum statistical mechanics, by generalizing the multiscale coarse-graining (MS-CG) method to quantum Boltzmann statistics. A rigorous derivation of the sufficient thermodynamic consistency condition is first presented via imaginary time Feynman path integrals. It identifies the optimal choice of CG action functional and effective quantum CG (qCG) force field to generate a quantum MS-CG (qMS-CG) description of the equilibrium system that is consistent with the quantum fine-grained model projected onto the CG variables. A variational principle then provides a class of algorithms for optimally approximating the qMS-CG force fields. Specifically, a variational method based on force matching, which was also adopted in the classical MS-CG theory, is generalized to quantum Boltzmann statistics. The qMS-CG numerical algorithms and practical issues in implementing this variational minimization procedure are also discussed. Then, two numerical examples are presented to demonstrate the method. Finally, as an alternative strategy, a quasi-classical approximation for the thermal density matrix expressed in the CG variables is derived. This approach provides an interesting physical picture for coarse-graining in quantum Boltzmann statistical mechanics in which the consistency with the quantum particle delocalization is obviously manifest, and it opens up an avenue for using path integral centroid-based effective classical force fields in a coarse-graining methodology.
The score statistic of the LD-lod analysis: detecting linkage adaptive to linkage disequilibrium.
Huang, J; Jiang, Y
2001-01-01
We study the properties of a modified lod score method for testing linkage that incorporates linkage disequilibrium (LD-lod). By examination of its score statistic, we show that the LD-lod score method adaptively combines two sources of information: (a) the IBD sharing score which is informative for linkage regardless of the existence of LD and (b) the contrast between allele-specific IBD sharing scores which is informative for linkage only in the presence of LD. We also consider the connection between the LD-lod score method and the transmission-disequilibrium test (TDT) for triad data and the mean test for affected sib pair (ASP) data. We show that, for triad data, the recessive LD-lod test is asymptotically equivalent to the TDT; and for ASP data, it is an adaptive combination of the TDT and the ASP mean test. We demonstrate that the LD-lod score method has relatively good statistical efficiency in comparison with the ASP mean test and the TDT for a broad range of LD and the genetic models considered in this report. Therefore, the LD-lod score method is an interesting approach for detecting linkage when the extent of LD is unknown, such as in a genome-wide screen with a dense set of genetic markers. Copyright 2001 S. Karger AG, Basel
Pitfalls of national routine death statistics for maternal mortality study.
Saucedo, Monica; Bouvier-Colle, Marie-Hélène; Chantry, Anne A; Lamarche-Vadel, Agathe; Rey, Grégoire; Deneux-Tharaux, Catherine
2014-11-01
The lessons learned from the study of maternal deaths depend on the accuracy of data. Our objective was to assess time trends in the underestimation of maternal mortality (MM) in the national routine death statistics in France and to evaluate their current accuracy for the selection and causes of maternal deaths. National data obtained by enhanced methods in 1989, 1999, and 2007-09 were used as the gold standard to assess time trends in the underestimation of MM ratios (MMRs) in death statistics. Enhanced data and death statistics for 2007-09 were further compared by characterising false negatives (FNs) and false positives (FPs). The distribution of cause-specific MMRs, as assessed by each system, was described. Underestimation of MM in death statistics decreased from 55.6% in 1989 to 11.4% in 2007-09 (P < 0.001). In 2007-09, of 787 pregnancy-associated deaths, 254 were classified as maternal by the enhanced system and 211 by the death statistics; 34% of maternal deaths in the enhanced system were FNs in the death statistics, and 20% of maternal deaths in the death statistics were FPs. The hierarchy of causes of MM differed between the two systems. The discordances were mainly explained by the lack of precision in the drafting of death certificates by clinicians. Although the underestimation of MM in routine death statistics has decreased substantially over time, one third of maternal deaths remain unidentified, and the main causes of death are incorrectly identified in these data. Defining relevant priorities in maternal health requires the use of enhanced methods for MM study. © 2014 John Wiley & Sons Ltd.
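The false-negative and false-positive comparison described in this abstract can be illustrated with a small sketch. It assumes each death has an identifier and that each system yields the set of deaths it classifies as maternal; the function name, the identifiers and the example data are hypothetical and are not taken from the study.

```python
def compare_to_gold_standard(gold, routine):
    """Compare a routine classification against an enhanced (gold-standard) one.

    gold    -- set of death IDs classified as maternal by the enhanced system
    routine -- set of death IDs classified as maternal by routine statistics
    Returns the share of gold-standard deaths missed by the routine system
    (false negatives) and the share of routine-system deaths that the gold
    standard does not confirm (false positives).
    """
    if not gold or not routine:
        raise ValueError("both sets must be non-empty")
    false_negatives = gold - routine      # maternal deaths the routine data missed
    false_positives = routine - gold      # deaths wrongly counted as maternal
    return {
        "fn_share_of_gold": len(false_negatives) / len(gold),
        "fp_share_of_routine": len(false_positives) / len(routine),
    }


# Hypothetical identifiers, for illustration only.
gold_ids = {1, 2, 3, 4, 5}
routine_ids = {1, 2, 3, 6}
result = compare_to_gold_standard(gold_ids, routine_ids)
print(result)
```

Note that the two shares use different denominators, as in the abstract: false negatives are expressed relative to the enhanced system, false positives relative to the routine death statistics.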
Aerobic conditioning for team sport athletes.
Stone, Nicholas M; Kilding, Andrew E
2009-01-01
Team sport athletes require a high level of aerobic fitness in order to generate and maintain power output during repeated high-intensity efforts and to recover. Research to date suggests that these components can be increased by regularly performing aerobic conditioning. Traditional aerobic conditioning, with minimal changes of direction and no skill component, has been demonstrated to effectively increase aerobic function within a 4- to 10-week period in team sport players. More importantly, traditional aerobic conditioning methods have been shown to increase team sport performance substantially. Many team sports require the upkeep of both aerobic fitness and sport-specific skills during a lengthy competitive season. Classic team sport trainings have been shown to evoke marginal increases/decreases in aerobic fitness. In recent years, aerobic conditioning methods have been designed to allow adequate intensities to be achieved to induce improvements in aerobic fitness whilst incorporating movement-specific and skill-specific tasks, e.g. small-sided games and dribbling circuits. Such 'sport-specific' conditioning methods have been demonstrated to promote increases in aerobic fitness, though careful consideration of player skill levels, current fitness, player numbers, field dimensions, game rules and availability of player encouragement is required. Whilst different conditioning methods appear equivalent in their ability to improve fitness, whether sport-specific conditioning is superior to other methods at improving actual game performance statistics requires further research.
A decade of individual participant data meta-analyses: A review of current practice.
Simmonds, Mark; Stewart, Gavin; Stewart, Lesley
2015-11-01
Individual participant data (IPD) systematic reviews and meta-analyses are often considered to be the gold standard for meta-analysis. In the ten years since the first review into the methodology and reporting practice of IPD reviews was published much has changed in the field. This paper investigates current reporting and statistical practice in IPD systematic reviews. A systematic review was performed to identify systematic reviews that collected and analysed IPD. Data were extracted from each included publication on a variety of issues related to the reporting of IPD review process, and the statistical methods used. There has been considerable growth in the use of "one-stage" methods to perform IPD meta-analyses. The majority of reviews consider at least one covariate other than the primary intervention, either using subgroup analysis or including covariates in one-stage regression models. Random-effects analyses, however, are not often used. Reporting of review methods was often limited, with few reviews presenting a risk-of-bias assessment. Details on issues specific to the use of IPD were little reported, including how IPD were obtained; how data was managed and checked for consistency and errors; and for how many studies and participants IPD were sought and obtained. While the last ten years have seen substantial changes in how IPD meta-analyses are performed there remains considerable scope for improving the quality of reporting for both the process of IPD systematic reviews, and the statistical methods employed in them. It is to be hoped that the publication of the PRISMA-IPD guidelines specific to IPD reviews will improve reporting in this area. Copyright © 2015 Elsevier Inc. All rights reserved.
Jaiswara, Ranjana; Nandi, Diptarup; Balakrishnan, Rohini
2013-01-01
Traditional taxonomy based on morphology has often failed in accurate species identification owing to the occurrence of cryptic species, which are reproductively isolated but morphologically identical. Molecular data have thus been used to complement morphology in species identification. The sexual advertisement calls in several groups of acoustically communicating animals are species-specific and can thus complement molecular data as non-invasive tools for identification. Several statistical tools and automated identifier algorithms have been used to investigate the efficiency of acoustic signals in species identification. Despite a plethora of such methods, there is a general lack of knowledge regarding the appropriate usage of these methods in specific taxa. In this study, we investigated the performance of two commonly used statistical methods, discriminant function analysis (DFA) and cluster analysis, in identification and classification based on acoustic signals of field cricket species belonging to the subfamily Gryllinae. Using a comparative approach we evaluated the optimal number of species and calling song characteristics for both the methods that lead to most accurate classification and identification. The accuracy of classification using DFA was high and was not affected by the number of taxa used. However, a constraint in using discriminant function analysis is the need for a priori classification of songs. Accuracy of classification using cluster analysis, which does not require a priori knowledge, was maximum for 6-7 taxa and decreased significantly when more than ten taxa were analysed together. We also investigated the efficacy of two novel derived acoustic features in improving the accuracy of identification. Our results show that DFA is a reliable statistical tool for species identification using acoustic signals. 
Our results also show that cluster analysis of acoustic signals in crickets works effectively for species classification and identification.
MO-G-12A-01: Quantitative Imaging Metrology: What Should Be Assessed and How?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Giger, M; Petrick, N; Obuchowski, N
The first two symposia in the Quantitative Imaging Track focused on 1) the introduction of quantitative imaging (QI) challenges and opportunities, and QI efforts of agencies and organizations such as the RSNA, NCI, FDA, and NIST, and 2) the techniques, applications, and challenges of QI, with specific examples from CT, PET/CT, and MR. This third symposium in the QI Track will focus on metrology and its importance in successfully advancing the QI field. While the specific focus will be on QI, many of the concepts presented are more broadly applicable to many areas of medical physics research and applications. As such, the topics discussed should be of interest to medical physicists involved in imaging as well as therapy. The first talk of the session will focus on the introduction to metrology and why it is critically important in QI. The second talk will focus on appropriate methods for technical performance assessment. The third talk will address statistically valid methods for algorithm comparison, a common problem not only in QI but also in other areas of medical physics. The final talk in the session will address strategies for publication of results that will allow statistically valid meta-analyses, which is critical for combining results of individual studies with typically small sample sizes in a manner that can best inform decisions and advance the field. Learning Objectives: Understand the importance of metrology in the QI efforts. Understand appropriate methods for technical performance assessment. Understand methods for comparing algorithms with or without reference data (i.e., “ground truth”). Understand the challenges and importance of reporting results in a manner that allows for statistically valid meta-analyses.
Chatasingh, S; Tapaneya-Olarn, W
1989-01-01
The specific gravity values of 561 urine samples measured with a TS meter and with a reagent strip were compared. The data were divided into two groups: group 1, urine samples containing less than 2+ protein, and group 2, urine samples containing 2+ protein or more. The results revealed that the specific gravity values from the two methods were statistically different in both groups (p < 0.01) but were correlated, with r = 0.84 (p < 0.001) in group 1 and r = 0.73 (p < 0.001) in group 2. It was concluded that the reagent strip is suitable for use as a screening test but should not be relied on when precise measurement is necessary.
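That two methods can be strongly correlated and still differ systematically can be illustrated with a short sketch. The readings below are invented for illustration and are not the study data; `pearson_r` and `mean_difference` are hypothetical helper names.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def mean_difference(xs, ys):
    """Average paired difference (ys - xs), a simple measure of systematic offset."""
    return sum(y - x for x, y in zip(xs, ys)) / len(xs)


# Invented specific gravity readings: the second method reads 0.005 higher throughout.
meter = [1.005, 1.010, 1.015, 1.020, 1.025]
strip = [value + 0.005 for value in meter]
print(pearson_r(meter, strip), mean_difference(meter, strip))
```

In this constructed case the correlation is essentially perfect while every paired reading differs, which is why a high r alone does not establish that two methods are interchangeable.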
Effects of quantum coherence on work statistics
NASA Astrophysics Data System (ADS)
Xu, Bao-Ming; Zou, Jian; Guo, Li-Sha; Kong, Xiang-Mu
2018-05-01
In the conventional two-point measurement scheme of quantum thermodynamics, quantum coherence is destroyed by the first measurement. But as we know the coherence really plays an important role in the quantum thermodynamics process, and how to describe the work statistics for a quantum coherent process is still an open question. In this paper, we use the full counting statistics method to investigate the effects of quantum coherence on work statistics. First, we give a general discussion and show that for a quantum coherent process, work statistics is very different from that of the two-point measurement scheme, specifically the average work is increased or decreased and the work fluctuation can be decreased by quantum coherence, which strongly depends on the relative phase, the energy level structure, and the external protocol. Then, we concretely consider a quenched one-dimensional transverse Ising model and show that quantum coherence has a more significant influence on work statistics in the ferromagnetism regime compared with that in the paramagnetism regime, so that due to the presence of quantum coherence the work statistics can exhibit the critical phenomenon even at high temperature.
Violent crime in San Antonio, Texas: an application of spatial epidemiological methods.
Sparks, Corey S
2011-12-01
Violent crimes are rarely considered a public health problem or investigated using epidemiological methods. But patterns of violent crime and other health conditions are often affected by similar characteristics of the built environment. In this paper, methods and perspectives from spatial epidemiology are used in an analysis of violent crimes in San Antonio, TX. Bayesian statistical methods are used to examine the contextual influence of several aspects of the built environment. Additionally, spatial regression models using Bayesian model specifications are used to examine spatial patterns of violent crime risk. Results indicate that the determinants of violent crime depend on the model specification, but are primarily related to the built environment and neighborhood socioeconomic conditions. Results are discussed within the context of a rapidly growing urban area with a diverse population. Copyright © 2011 Elsevier Ltd. All rights reserved.
Inferring general relations between network characteristics from specific network ensembles.
Cardanobile, Stefano; Pernice, Volker; Deger, Moritz; Rotter, Stefan
2012-01-01
Different network models have been suggested for the topology underlying complex interactions in natural systems. These models are aimed at replicating specific statistical features encountered in real-world networks. However, it is rarely considered to which degree the results obtained for one particular network class can be extrapolated to real-world networks. We address this issue by comparing different classical and more recently developed network models with respect to their ability to generate networks with large structural variability. In particular, we consider the statistical constraints which the respective construction scheme imposes on the generated networks. After having identified the most variable networks, we address the issue of which constraints are common to all network classes and are thus suitable candidates for being generic statistical laws of complex networks. In fact, we find that generic, not model-related dependencies between different network characteristics do exist. This makes it possible to infer global features from local ones using regression models trained on networks with high generalization power. Our results confirm and extend previous findings regarding the synchronization properties of neural networks. Our method seems especially relevant for large networks, which are difficult to map completely, like the neural networks in the brain. The structure of such large networks cannot be fully sampled with the present technology. Our approach provides a method to estimate global properties of under-sampled networks in good approximation. Finally, we demonstrate on three different data sets (C. elegans neuronal network, R. prowazekii metabolic network, and a network of synonyms extracted from Roget's Thesaurus) that real-world networks have statistical relations compatible with those obtained using regression models.
An application of principal component analysis to the clavicle and clavicle fixation devices.
Daruwalla, Zubin J; Courtis, Patrick; Fitzpatrick, Clare; Fitzpatrick, David; Mullett, Hannan
2010-03-26
Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Twenty-one high-resolution computerized tomography scans of the clavicle were reconstructed and analyzed using a specifically developed statistical software package. After performing statistical shape analysis, PCA was applied to study the factors that account for anatomical variation. The first principal component representing size accounted for 70.5 percent of anatomical variation. The addition of a further three principal components accounted for almost 87 percent. Using statistical shape analysis, clavicles in males have a greater lateral depth and are longer, wider and thicker than in females. However, the sternal angle in females is larger than in males. PCA confirmed these differences between genders but also noted that men exhibit greater variance and classified clavicles into five morphological groups. This unique approach is the first that standardizes a clavicular orientation. It provides information that is useful to both, the biomedical engineer and clinician. Other applications include implant design with regard to modifying current or designing future clavicle fixation devices. Our findings support the need for further development of clavicle fixation devices and the questioning of whether gender-specific devices are necessary.
[The main directions of reforming the service of medical statistics in Ukraine].
Golubchykov, Mykhailo V; Orlova, Nataliia M; Bielikova, Inna V
2018-01-01
Introduction: Implementation of new methods of information support for managerial decision-making should ensure effective health system reform and create conditions for improving the quality of operational management, reasonable planning of medical care, and more efficient use of system resources. Reform of the Medical Statistics Service of Ukraine should be considered only in the context of the reform of the entire health system. The aim: This work analyzes the current situation and justifies the main directions of reform of the Medical Statistics Service of Ukraine. Material and methods: A range of methods was used: content analysis, the bibliosemantic method, and a systematic approach. The information base of the research comprised WHO strategic and program documents and data of the Medical Statistics Center of the Ministry of Health of Ukraine. Review: The Medical Statistics Service of Ukraine has a complete and effective structure, headed by the State Institution "Medical Statistics Center of the Ministry of Health of Ukraine." This institution reports on behalf of the Ministry of Health of Ukraine to the State Statistical Service of Ukraine, the WHO European Office and other international organizations.
An analysis of the current situation showed that to achieve this goal it is necessary to: improve the system of statistical indicators for an adequate assessment of the performance of health institutions, including in the economic aspect; create a developed medical and statistical base of administrative territories; change existing technologies for the formation of information resources; strengthen the material and technical base of the structural units of the Medical Statistics Service; improve the system of training and retraining of personnel for the service of medical statistics; develop international cooperation in the field of methodology and practice of medical statistics and implement internationally accepted methods for collecting, processing, analyzing and disseminating medical and statistical information; and create a medical and statistical service that is adapted to the specifics of market relations in health care and is flexible and sensitive to changes in international methodologies and standards. Conclusions: Medical statistics data are the basis for managerial decisions at all levels of health care. Reform of the Medical Statistics Service of Ukraine should be considered only in the context of the reform of the entire health system. The main directions of the reform of the medical statistics service in Ukraine are: the introduction of information technologies, the improvement of the training of personnel for the service, the improvement of material and technical equipment, and the maximum reuse of the data obtained, which requires the unification of primary data and of the system of indicators. The most difficult area is the formation of information funds and the introduction of modern information technologies.
Stucki, Sheldon Lee; Biss, David J.
2000-01-01
An analysis was performed using the National Automotive Sampling System Crashworthiness Data System (NASS-CDS) database to compare the injury/fatality rates of variously restrained driver occupants as compared to unrestrained driver occupants in the total database of drivers/frontals, and also by Delta-V. A structured search of the NASS-CDS was done using the SAS® statistical analysis software to extract the data for this analysis and the SUDAAN software package was used to arrive at statistical significance indicators. In addition, this paper goes on to investigate different methods for presenting results of accident database searches including significance results; a risk versus Delta-V format for specific exposures; and, a percent cumulative injury versus Delta-V format to characterize injury trends. These alternative analysis presentation methods are then discussed by example using the present study results. PMID:11558105
Near-equilibrium dumb-bell-shaped figures for cohesionless small bodies
NASA Astrophysics Data System (ADS)
Descamps, Pascal
2016-02-01
In a previous paper (Descamps, P. [2015]. Icarus 245, 64-79), we developed a specific method aimed at retrieving the main physical characteristics (shape, density, surface scattering properties) of highly elongated bodies from their rotational lightcurves through the use of dumb-bell-shaped equilibrium figures. The present work is a test of this method. For that purpose we introduce near-equilibrium dumb-bell-shaped figures, which are base dumb-bell equilibrium shapes modulated by lognormal statistics. Such synthetic irregular models are used to generate lightcurves to which our method is successfully applied. Shape statistical parameters of such near-equilibrium dumb-bell-shaped objects are in good agreement with those calculated, for example, for the Asteroid (216) Kleopatra from its dog-bone radar model. This may suggest that such bilobed and elongated asteroids can be approximated by equilibrium figures perturbed by the interplay with a substantial internal friction, modeled by a Gaussian random sphere.
Multiscale Structure of UXO Site Characterization: Spatial Estimation and Uncertainty Quantification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ostrouchov, George; Doll, William E.; Beard, Les P.
2009-01-01
Unexploded ordnance (UXO) site characterization must consider both how the contamination is generated and how we observe that contamination. Within the generation and observation processes, dependence structures can be exploited at multiple scales. We describe a conceptual site characterization process, the dependence structures available at several scales, and consider their statistical estimation aspects. It is evident that most of the statistical methods that are needed to address the estimation problems are known but their application-specific implementation may not be available. We demonstrate estimation at one scale and propose a representation for site contamination intensity that takes full account of uncertainty, is flexible enough to answer regulatory requirements, and is a practical tool for managing detailed spatial site characterization and remediation. The representation is based on point process spatial estimation methods that require modern computational resources for practical application. These methods have provisions for including prior and covariate information.
The mean time-limited crash rate of stock price
NASA Astrophysics Data System (ADS)
Li, Yun-Xian; Li, Jiang-Cheng; Yang, Ai-Jun; Tang, Nian-Sheng
2017-05-01
In this article we investigate the occurrence of stock market crashes in an economic cycle. A Bayesian approach, the Heston model and a statistical-physical method are considered. Specifically, the Heston model and an effective potential are employed to describe the dynamic changes of the stock price. The Bayesian approach is used to estimate the Heston model's unknown parameters. The statistical-physical method is used to investigate the occurrence of stock market crashes by calculating the mean time-limited crash rate. Real financial data from the Shanghai Composite Index are analyzed with the proposed methods. The mean time-limited crash rate of the stock price is used to describe the occurrence of stock market crashes in an economic cycle. Monotonic and nonmonotonic behaviors are observed in the mean time-limited crash rate versus stock volatility for various cross-correlation coefficients between volatility and price. A minimum occurrence of stock market crashes, matching an optimal volatility, is also found.
Hezel, Marcus; von Usslar, Kathrin; Kurzweg, Thiemo; Lörincz, Balazs B; Knecht, Rainald
2016-04-01
This article reviews the methodical and statistical basics of designing a trial, with a special focus on the process of defining and choosing endpoints and cutpoints as the foundations of clinical research, and ultimately that of evidence-based medicine. There has been a significant progress in the treatment of head and neck cancer in the past few decades. Currently available treatment options can have a variety of different goals, depending e.g. on tumor stage, among other factors. The outcome of a specific treatment in clinical trials is measured using endpoints. Besides classical endpoints, such as overall survival or organ preservation, other endpoints like quality of life are becoming increasingly important in designing and conducting a trial. The present work is based on electronic research and focuses on the solid methodical and statistical basics of a clinical trial, on the structure of study designs and on the presentation of various endpoints.
Powerful Inference with the D-Statistic on Low-Coverage Whole-Genome Data
Soraggi, Samuele; Wiuf, Carsten; Albrechtsen, Anders
2017-01-01
The detection of ancient gene flow between human populations is an important issue in population genetics. A common tool for detecting ancient admixture events is the D-statistic. The D-statistic is based on the hypothesis of a genetic relationship that involves four populations, whose correctness is assessed by evaluating specific coincidences of alleles between the groups. When working with high-throughput sequencing data, calling genotypes accurately is not always possible; therefore, the D-statistic currently samples a single base from the reads of one individual per population. This implies ignoring much of the information in the data, an issue especially striking in the case of ancient genomes. We provide a significant improvement to overcome the problems of the D-statistic by considering all reads from multiple individuals in each population. We also apply type-specific error correction to combat the problems of sequencing errors, and show a way to correct for introgression from an external population that is not part of the supposed genetic relationship, and how this leads to an estimate of the admixture rate. We prove that the D-statistic is approximated by a standard normal distribution. Furthermore, we show that our method outperforms the traditional D-statistic in detecting admixtures. The power gain is most pronounced for low and medium sequencing depth (1–10×), and performances are as good as with perfectly called genotypes at a sequencing depth of 2×. We show the reliability of error correction in scenarios with simulated errors and ancient data, and correct for introgression in known scenarios to estimate the admixture rates. PMID:29196497
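For orientation, the traditional single-base form of the D-statistic that this abstract refers to can be sketched as follows. This is the standard ABBA-BABA count ratio, not the extended multi-individual estimator proposed in the paper; the function names, the population ordering (H1, H2, H3, outgroup), the sign convention and the example sites are assumptions made for illustration.

```python
def count_patterns(sites):
    """Count ABBA and BABA site patterns.

    sites -- iterable of 4-tuples (h1, h2, h3, outgroup), each holding one
             sampled base per population (assumed ordering).
    ABBA: H1 matches the outgroup, H2 matches H3.
    BABA: H1 matches H3, H2 matches the outgroup.
    """
    abba = baba = 0
    for h1, h2, h3, out in sites:
        if h1 == h2:
            continue                      # site is uninformative for this test
        if h1 == out and h2 == h3:
            abba += 1
        elif h1 == h3 and h2 == out:
            baba += 1
    return abba, baba


def d_statistic(abba, baba):
    """D = (ABBA - BABA) / (ABBA + BABA); some authors use the opposite sign."""
    total = abba + baba
    if total == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / total


# Invented example sites, for illustration only.
example_sites = [
    ("A", "G", "G", "A"),   # ABBA
    ("G", "A", "G", "A"),   # BABA
    ("A", "G", "G", "A"),   # ABBA
    ("A", "A", "A", "A"),   # uninformative
]
n_abba, n_baba = count_patterns(example_sites)
print(n_abba, n_baba, d_statistic(n_abba, n_baba))
```

A value of D near zero is what the assumed four-population tree predicts without gene flow; assessing significance requires a variance estimate, which this sketch does not attempt.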
Voss, Andreas; Fischer, Claudia; Schroeder, Rico; Figulla, Hans R; Goernig, Matthias
2012-07-01
The objectives of this study were to introduce a new type of heart-rate variability analysis improving risk stratification in patients with idiopathic dilated cardiomyopathy (DCM) and to provide additional information about impaired heart beat generation in these patients. Beat-to-beat intervals (BBI) of 30-min ECGs recorded from 91 DCM patients and 21 healthy subjects were analyzed applying the lagged segmented Poincaré plot analysis (LSPPA) method. LSPPA includes the Poincaré plot reconstruction with lags of 1-100, rotating the cloud of points, its normalized segmentation adapted to their standard deviations, and finally, a frequency-dependent clustering. The lags were combined into eight different clusters representing specific frequency bands within 0.012-1.153 Hz. Statistical differences between low- and high-risk DCM could be found within the clusters II-VIII (e.g., cluster IV: 0.033-0.038 Hz; p = 0.0002; sensitivity = 85.7 %; specificity = 71.4 %). The multivariate statistics led to a sensitivity of 92.9 %, specificity of 85.7 % and an area under the curve of 92.1 % discriminating these patient groups. We introduced the LSPPA method to investigate time correlations in BBI time series. We found that LSPPA contributes considerably to risk stratification in DCM and yields the highest discriminant power in the low and very low-frequency bands.
Implementing Peer-Assisted Writing Support in German Secondary Schools
ERIC Educational Resources Information Center
Rensing, Julia; Vierbuchen, Marie-Christine; Hillenbrand, Clemens; Grünke, Matthias
2016-01-01
The alarming results of large studies such as the National Assessment of Educational Progress (NAEP; National Center for Education Statistics, 2012) point to an urgent need for writing support and call for specific and effective methods to foster writing competencies. The main purpose of this paper is to describe an innovative peer-assisted…
Demographic Accounting and Model-Building. Education and Development Technical Reports.
ERIC Educational Resources Information Center
Stone, Richard
This report describes and develops a model for coordinating a variety of demographic and social statistics within a single framework. The framework proposed, together with its associated methods of analysis, serves both general and specific functions. The general aim of these functions is to give numerical definition to the pattern of society and…
Use of the Analysis of the Volatile Faecal Metabolome in Screening for Colorectal Cancer
2015-01-01
Diagnosis of colorectal cancer is an invasive and expensive colonoscopy, which is usually carried out after a positive screening test. Unfortunately, existing screening tests lack specificity and sensitivity, hence many unnecessary colonoscopies are performed. Here we report on a potential new screening test for colorectal cancer based on the analysis of volatile organic compounds (VOCs) in the headspace of faecal samples. Faecal samples were obtained from subjects who had a positive faecal occult blood sample (FOBT). Subjects subsequently had colonoscopies performed to classify them into low risk (non-cancer) and high risk (colorectal cancer) groups. Volatile organic compounds were analysed by selected ion flow tube mass spectrometry (SIFT-MS) and then data were analysed using both univariate and multivariate statistical methods. Ions most likely from hydrogen sulphide, dimethyl sulphide and dimethyl disulphide are statistically significantly higher in samples from high risk rather than low risk subjects. Results using multivariate methods show that the test gives a correct classification of 75% with 78% specificity and 72% sensitivity on FOBT positive samples, offering a potentially effective alternative to FOBT. PMID:26086914
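The classification figures quoted in abstracts like this one (correct classification, specificity, sensitivity) all derive from a 2×2 confusion matrix. The sketch below shows the arithmetic; the function name `diagnostic_metrics` and the counts in the usage line are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # diseased subjects correctly flagged
    specificity = tn / (tn + fp)   # non-diseased subjects correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only
print(diagnostic_metrics(tp=40, fn=10, tn=45, fp=5))
```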
Ramilo, Andrea; Navas, J Ignacio; Villalba, Antonio; Abollo, Elvira
2013-05-27
Bonamia ostreae and B. exitiosa have caused mass mortalities of various oyster species around the world and co-occur in some European areas. The World Organisation for Animal Health (OIE) has included infections with both species in the list of notifiable diseases. However, official methods for species-specific diagnosis of either parasite have certain limitations. In this study, new species-specific conventional PCR (cPCR) and real-time PCR techniques were developed to diagnose each parasite species. Moreover, a multiplex PCR method was designed to detect both parasites in a single assay. The analytical sensitivity and specificity of each new method were evaluated. These new procedures were compared with 2 OIE-recommended methods, viz. standard histology and PCR-RFLP. The new procedures showed higher sensitivity than the OIE recommended ones for the diagnosis of both species. The sensitivity of tests with the new primers was higher using oyster gills and gonad tissue, rather than gills alone. The lack of a 'gold standard' prevented accurate estimation of sensitivity and specificity of the new methods. The implementation of statistical tools (maximum likelihood method) for the comparison of the diagnostic tests showed the possibility of false positives with the new procedures, although the absence of a gold standard precluded certainty. Nevertheless, all procedures showed negative results when used for the analysis of oysters from a Bonamia-free area.
Robust approaches to quantification of margin and uncertainty for sparse data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hund, Lauren; Schroeder, Benjamin B.; Rumsey, Kelin
Characterizing the tails of probability distributions plays a key role in quantification of margins and uncertainties (QMU), where the goal is characterization of low probability, high consequence events based on continuous measures of performance. When data are collected using physical experimentation, probability distributions are typically fit using statistical methods based on the collected data, and these parametric distributional assumptions are often used to extrapolate about the extreme tail behavior of the underlying probability distribution. In this project, we characterize the risk associated with such tail extrapolation. Specifically, we conducted a scaling study to demonstrate the large magnitude of the risk; then, we developed new methods for communicating risk associated with tail extrapolation from unvalidated statistical models; lastly, we proposed a Bayesian data-integration framework to mitigate tail extrapolation risk through integrating additional information. We conclude that decision-making using QMU is a complex process that cannot be achieved using statistical analyses alone.
Text grouping in patent analysis using adaptive K-means clustering algorithm
NASA Astrophysics Data System (ADS)
Shanie, Tiara; Suprijadi, Jadi; Zulhanif
2017-03-01
Patents are a form of intellectual property. Analyzing patents is one requirement for understanding the development of technology in each country and in the world today. This study uses patent documents about green tea retrieved from the Espacenet server. Patent documents related to tea technology are still widely scattered, which makes information retrieval (IR) difficult for users. It is therefore necessary to categorize the documents into specific groups according to the related terms they contain. This study applies Statistical Text Mining methods to the title text of green tea patents in two phases: data preparation and data analysis. The data preparation phase uses text mining methods, and the data analysis phase uses statistics. The statistical analysis in this study is a cluster analysis using the Adaptive K-Means Clustering Algorithm. Based on the maximum Silhouette value, the analysis produced 87 clusters associated with fifteen terms, which can be used in the information retrieval process.
Evaluation of airborne lidar data to predict vegetation Presence/Absence
Palaseanu-Lovejoy, M.; Nayegandhi, A.; Brock, J.; Woodman, R.; Wright, C.W.
2009-01-01
This study evaluates the capabilities of the Experimental Advanced Airborne Research Lidar (EAARL) in delineating vegetation assemblages in Jean Lafitte National Park, Louisiana. Five-meter-resolution grids of bare earth, canopy height, canopy-reflection ratio, and height of median energy were derived from EAARL data acquired in September 2006. Ground-truth data were collected along transects to assess species composition, canopy cover, and ground cover. To decide which model is more accurate, comparisons of general linear models and generalized additive models were conducted using conventional evaluation methods (i.e., sensitivity, specificity, Kappa statistics, and area under the curve) and two new indexes, net reclassification improvement and integrated discrimination improvement. Generalized additive models were superior to general linear models in modeling presence/absence in training vegetation categories, but no statistically significant differences between the two models were achieved in determining the classification accuracy at validation locations using conventional evaluation methods, although statistically significant improvements in net reclassifications were observed. © 2009 Coastal Education and Research Foundation.
Evaluating, Comparing, and Interpreting Protein Domain Hierarchies
2014-01-01
Arranging protein domain sequences hierarchically into evolutionarily divergent subgroups is important for investigating evolutionary history, for speeding up web-based similarity searches, for identifying sequence determinants of protein function, and for genome annotation. However, whether or not a particular hierarchy is optimal is often unclear, and independently constructed hierarchies for the same domain can often differ significantly. This article describes methods for statistically evaluating specific aspects of a hierarchy, for probing the criteria underlying its construction and for direct comparisons between hierarchies. Information theoretical notions are used to quantify the contributions of specific hierarchical features to the underlying statistical model. Such features include subhierarchies, sequence subgroups, individual sequences, and subgroup-associated signature patterns. Underlying properties are graphically displayed in plots of each specific feature's contributions, in heat maps of pattern residue conservation, in “contrast alignments,” and through cross-mapping of subgroups between hierarchies. Together, these approaches provide a deeper understanding of protein domain functional divergence, reveal uncertainties caused by inconsistent patterns of sequence conservation, and help resolve conflicts between competing hierarchies. PMID:24559108
Technology Development Risk Assessment for Space Transportation Systems
NASA Technical Reports Server (NTRS)
Mathias, Donovan L.; Godsell, Aga M.; Go, Susie
2006-01-01
A new approach for assessing development risk associated with technology development projects is presented. The method represents technology evolution in terms of sector-specific discrete development stages. A Monte Carlo simulation is used to generate development probability distributions based on statistical models of the discrete transitions. Development risk is derived from the resulting probability distributions and specific program requirements. Two sample cases are discussed to illustrate the approach, a single rocket engine development and a three-technology space transportation portfolio.
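The general idea of simulating discrete stage transitions to obtain a development-time distribution can be sketched as follows. The abstract does not give the transition model, so everything here is an assumption for illustration: each stage has a fixed duration and a fixed success probability, and a failed stage is simply repeated. The names `simulate_development_time` and `completion_probability` are hypothetical.

```python
import random


def simulate_development_time(stage_durations, success_probs, rng):
    """One Monte Carlo trial: total time to clear every stage in order.

    Assumed model: a failed attempt costs the full stage duration and the
    stage is retried until it succeeds.
    """
    total = 0.0
    for duration, p_success in zip(stage_durations, success_probs):
        while True:
            total += duration
            if rng.random() < p_success:
                break
    return total


def completion_probability(stage_durations, success_probs, deadline,
                           trials=10000, seed=0):
    """Estimate the probability of finishing all stages by the deadline."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if simulate_development_time(stage_durations, success_probs, rng) <= deadline:
            hits += 1
    return hits / trials


# Hypothetical three-stage program with a 12-unit schedule requirement
print(completion_probability([2.0, 3.0, 4.0], [0.9, 0.8, 0.7], deadline=12.0))
```

Development risk in this toy setting would be one minus the estimated completion probability for a given program requirement.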
Statistical Characterization and Classification of Edge-Localized Plasma Instabilities
NASA Astrophysics Data System (ADS)
Webster, A. J.; Dendy, R. O.
2013-04-01
The statistics of edge-localized plasma instabilities (ELMs) in toroidal magnetically confined fusion plasmas are considered. From first principles, standard experimentally motivated assumptions are shown to determine a specific probability distribution for the waiting times between ELMs: the Weibull distribution. This is confirmed empirically by a statistically rigorous comparison with a large data set from the Joint European Torus. The successful characterization of ELM waiting times enables future work to progress in various ways. Here we present a quantitative classification of ELM types, complementary to phenomenological approaches. It also informs us about the nature of ELM processes, such as whether they are random or deterministic. The methods are extremely general and can be applied to numerous other quasiperiodic intermittent phenomena.
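For readers who want to check whether a set of waiting times is compatible with a Weibull distribution, a simple Weibull-plot fit can be sketched as below. This is not the statistically rigorous comparison the abstract refers to; it is a least-squares fit on the linearized CDF, and the function name `fit_weibull` and the use of median-rank plotting positions are assumptions.

```python
import math


def fit_weibull(waiting_times):
    """Estimate (shape, scale) by least squares on a Weibull probability plot.

    Uses ln(-ln(1 - F)) = shape * ln(t) - shape * ln(scale), with median-rank
    plotting positions F_i = (i - 0.3) / (n + 0.4).
    """
    ts = sorted(waiting_times)
    n = len(ts)
    xs = [math.log(t) for t in ts]
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4)))
          for i in range(1, n + 1)]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    shape = sxy / sxx
    # Intercept is -shape * ln(scale), so ln(scale) = mean_x - mean_y / shape
    scale = math.exp(mean_x - mean_y / shape)
    return shape, scale
```

A shape parameter near 1 corresponds to exponential (memoryless) waiting times, while larger values indicate more regular, quasiperiodic behavior.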
Holmes, Susan; Alekseyenko, Alexander; Timme, Alden; Nelson, Tyrrell; Pasricha, Pankaj Jay; Spormann, Alfred
2011-01-01
This article explains the statistical and computational methodology used to analyze species abundances collected using the LNBL Phylochip in a study of Irritable Bowel Syndrome (IBS) in rats. Some tools already available for the analysis of ordinary microarray data are useful in this type of statistical analysis. For instance in correcting for multiple testing we use Family Wise Error rate control and step-down tests (available in the multtest package). Once the most significant species are chosen we use the hypergeometric tests familiar for testing GO categories to test specific phyla and families. We provide examples of normalization, multivariate projections, batch effect detection and integration of phylogenetic covariation, as well as tree equalization and robustification methods.
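One common step-down procedure for family-wise error rate control is Holm's method, sketched below in plain Python. Whether the study used Holm's procedure specifically is not stated in the abstract, and this sketch does not use or imitate the multtest package; the function name `holm_reject` is hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: return a list of reject/keep decisions.

    The smallest p-value is compared with alpha / m, the next with
    alpha / (m - 1), and so on; testing stops at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, index in enumerate(order):
        if p_values[index] <= alpha / (m - rank):
            reject[index] = True
        else:
            break  # step-down: all larger p-values are retained as well
    return reject


print(holm_reject([0.01, 0.04, 0.03, 0.005]))
```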
NASA Astrophysics Data System (ADS)
Ghanate, A. D.; Kothiwale, S.; Singh, S. P.; Bertrand, Dominique; Krishna, C. Murali
2011-02-01
Cancer is now recognized as one of the major causes of morbidity and mortality. Histopathological diagnosis, the gold standard, is shown to be subjective, time consuming, prone to interobserver disagreement, and often fails to predict prognosis. Optical spectroscopic methods are being contemplated as adjuncts or alternatives to conventional cancer diagnostics. The most important aspect of these approaches is their objectivity, and multivariate statistical tools play a major role in realizing it. However, rigorous evaluation of the robustness of spectral models is a prerequisite. The utility of Raman spectroscopy in the diagnosis of cancers has been well established. Until now, the specificity and applicability of spectral models have been evaluated for specific cancer types. In this study, we have evaluated the utility of spectroscopic models representing normal and malignant tissues of the breast, cervix, colon, larynx, and oral cavity in a broader perspective, using different multivariate tests. The limit test, which was used in our earlier study, gave high sensitivity but suffered from poor specificity. The performance of other methods such as factorial discriminant analysis and partial least square discriminant analysis are at par with more complex nonlinear methods such as decision trees, but they provide very little information about the classification model. This comparative study thus demonstrates not just the efficacy of Raman spectroscopic models but also the applicability and limitations of different multivariate tools for discrimination under complex conditions such as the multicancer scenario.
Mechanism-based Pharmacovigilance over the Life Sciences Linked Open Data Cloud.
Kamdar, Maulik R; Musen, Mark A
2017-01-01
Adverse drug reactions (ADR) result in significant morbidity and mortality in patients, and a substantial proportion of these ADRs are caused by drug-drug interactions (DDIs). Pharmacovigilance methods are used to detect unanticipated DDIs and ADRs by mining Spontaneous Reporting Systems, such as the US FDA Adverse Event Reporting System (FAERS). However, these methods do not provide mechanistic explanations for the discovered drug-ADR associations in a systematic manner. In this paper, we present a systems pharmacology-based approach to perform mechanism-based pharmacovigilance. We integrate data and knowledge from four different sources using Semantic Web Technologies and Linked Data principles to generate a systems network. We present a network-based Apriori algorithm for association mining in FAERS reports. We evaluate our method against existing pharmacovigilance methods for three different validation sets. Our method has AUROC statistics of 0.7-0.8, similar to current methods, and event-specific thresholds generate AUROC statistics greater than 0.75 for certain ADRs. Finally, we discuss the benefits of using Semantic Web technologies to attain the objectives for mechanism-based pharmacovigilance.
Fox, A S; Bonacci, J; McLean, S G; Saunders, N
2017-05-01
Screening methods sensitive to movement strategies that increase anterior cruciate ligament (ACL) loads are likely to be effective in identifying athletes at risk of ACL injury. Current ACL injury risk screening methods are yet to be evaluated for their ability to identify athletes who exhibit high-risk lower limb mechanics during sport-specific maneuvers associated with ACL injury occurrences. The purpose of this study was to examine the efficacy of two ACL injury risk screening methods in identifying high-risk lower limb mechanics during a sport-specific landing task. Thirty-two female athletes were screened using the Landing Error Scoring System (LESS) and Tuck Jump Assessment. Participants also completed a sport-specific landing task, during which three-dimensional kinematic and kinetic data were collected. One-dimensional statistical parametric mapping was used to examine the relationships between screening method scores, and the three-dimensional hip and knee joint rotation and moment data from the sport-specific landing. Higher LESS scores were associated with reduced knee flexion from 30 to 57 ms after initial contact (P = 0.003) during the sport-specific landing; however, no additional relationships were found. These findings suggest the LESS and Tuck Jump Assessment may have minimal applicability in identifying athletes who exhibit high-risk landing postures in the sport-specific task examined. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
MIDAS: Regionally linear multivariate discriminative statistical mapping.
Varol, Erdem; Sotiras, Aristeidis; Davatzikos, Christos
2018-07-01
Statistical parametric maps formed via voxel-wise mass-univariate tests, such as the general linear model, are commonly used to test hypotheses about regionally specific effects in neuroimaging cross-sectional studies where each subject is represented by a single image. Despite being informative, these techniques remain limited as they ignore multivariate relationships in the data. Most importantly, the commonly employed local Gaussian smoothing, which is important for accounting for registration errors and making the data follow Gaussian distributions, is usually chosen in an ad hoc fashion. Thus, it is often suboptimal for the task of detecting group differences and correlations with non-imaging variables. Information mapping techniques, such as searchlight, which use pattern classifiers to exploit multivariate information and obtain more powerful statistical maps, have become increasingly popular in recent years. However, existing methods may lead to important interpretation errors in practice (i.e., misidentifying a cluster as informative, or failing to detect truly informative voxels), while often being computationally expensive. To address these issues, we introduce a novel efficient multivariate statistical framework for cross-sectional studies, termed MIDAS, seeking highly sensitive and specific voxel-wise brain maps, while leveraging the power of regional discriminant analysis. In MIDAS, locally linear discriminative learning is applied to estimate the pattern that best discriminates between two groups, or predicts a variable of interest. This pattern is equivalent to local filtering by an optimal kernel whose coefficients are the weights of the linear discriminant. By composing information from all neighborhoods that contain a given voxel, MIDAS produces a statistic that collectively reflects the contribution of the voxel to the regional classifiers as well as the discriminative power of the classifiers. 
Critically, MIDAS efficiently assesses the statistical significance of the derived statistic by analytically approximating its null distribution without the need for computationally expensive permutation tests. The proposed framework was extensively validated using simulated atrophy in structural magnetic resonance imaging (MRI) and further tested using data from a task-based functional MRI study as well as a structural MRI study of cognitive performance. The performance of the proposed framework was evaluated against standard voxel-wise general linear models and other information mapping methods. The experimental results showed that MIDAS achieves relatively higher sensitivity and specificity in detecting group differences. Together, our results demonstrate the potential of the proposed approach to efficiently map effects of interest in both structural and functional data. Copyright © 2018. Published by Elsevier Inc.
Recchia, Gabriel L; Louwerse, Max M
2016-11-01
Computational techniques comparing co-occurrences of city names in texts allow the relative longitudes and latitudes of cities to be estimated algorithmically. However, these techniques have not been applied to estimate the provenance of artifacts with unknown origins. Here, we estimate the geographic origin of artifacts from the Indus Valley Civilization, applying methods commonly used in cognitive science to the Indus script. We show that these methods can accurately predict the relative locations of archeological sites on the basis of artifacts of known provenance, and we further apply these techniques to determine the most probable excavation sites of four sealings of unknown provenance. These findings suggest that inscription statistics reflect historical interactions among locations in the Indus Valley region, and they illustrate how computational methods can help localize inscribed archeological artifacts of unknown origin. The success of this method offers opportunities for the cognitive sciences in general and for computational anthropology specifically. Copyright © 2015 Cognitive Science Society, Inc.
The expectancy-value muddle in the theory of planned behaviour - and some proposed solutions.
French, David P; Hankins, Matthew
2003-02-01
The authors of the Theories of Reasoned Action and Planned Behaviour recommended a method for statistically analysing the relationships between beliefs and the Attitude, Subjective Norm, and Perceived Behavioural Control constructs. This method has been used in the overwhelming majority of studies using these theories. However, there is a growing awareness that this method yields statistically uninterpretable results (Evans, 1991). Despite this, the use of this method is continuing, as is uninformed interpretation of this problematic research literature. This is probably due to the lack of a simple account of where the problem lies, and the large number of alternatives available. This paper therefore summarizes the problem as simply as possible, gives consideration to the conclusions that can be validly drawn from studies that contain this problem, and critically reviews the many alternatives that have been proposed to address this problem. Different techniques are identified as being suitable, according to the purpose of the specific research project.
Multi-site precipitation downscaling using a stochastic weather generator
NASA Astrophysics Data System (ADS)
Chen, Jie; Chen, Hua; Guo, Shenglian
2018-03-01
Statistical downscaling is an efficient way to solve the spatiotemporal mismatch between climate model outputs and the data requirements of hydrological models. However, the most commonly-used downscaling method only produces climate change scenarios for a specific site or watershed average, which is unable to drive distributed hydrological models to study the spatial variability of climate change impacts. By coupling a single-site downscaling method and a multi-site weather generator, this study proposes a multi-site downscaling approach for hydrological climate change impact studies. Multi-site downscaling is done in two stages. The first stage involves spatially downscaling climate model-simulated monthly precipitation from grid scale to a specific site using a quantile mapping method, and the second stage involves the temporal disaggregating of monthly precipitation to daily values by adjusting the parameters of a multi-site weather generator. The inter-station correlation is specifically considered using a distribution-free approach along with an iterative algorithm. The performance of the downscaling approach is illustrated using a 10-station watershed as an example. The precipitation time series derived from the National Centers for Environmental Prediction (NCEP) reanalysis dataset is used as the climate model simulation. The precipitation time series of each station is divided into 30 odd years for calibration and 29 even years for validation. Several metrics, including the frequencies of wet and dry spells and statistics of the daily, monthly and annual precipitation are used as criteria to evaluate the multi-site downscaling approach. The results show that the frequencies of wet and dry spells are well reproduced for all stations. In addition, the multi-site downscaling approach performs well with respect to reproducing precipitation statistics, especially at monthly and annual timescales. 
The remaining biases mainly result from the non-stationarity of NCEP precipitation. Overall, the proposed approach is efficient for generating multi-site climate change scenarios that can be used to investigate the spatial variability of climate change impacts on hydrology.
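The first stage described above, quantile mapping from grid scale to a site, can be illustrated with a minimal empirical version. This is only a sketch under assumptions: the abstract does not say which variant of quantile mapping was used, and the function name `empirical_quantile_map` and the nearest-rank lookup are choices made here for illustration.

```python
import bisect
import math


def empirical_quantile_map(model_calib, obs_calib, value):
    """Map a model value to the observed value at the same empirical quantile.

    model_calib and obs_calib are calibration-period samples (for example,
    monthly precipitation from the model grid cell and from the station).
    """
    model_sorted = sorted(model_calib)
    obs_sorted = sorted(obs_calib)
    # Empirical non-exceedance probability of the value in the model sample
    prob = bisect.bisect_right(model_sorted, value) / len(model_sorted)
    # Nearest-rank lookup of that probability in the observed sample
    index = math.ceil(prob * len(obs_sorted)) - 1
    index = min(len(obs_sorted) - 1, max(0, index))
    return obs_sorted[index]


# Hypothetical calibration samples
print(empirical_quantile_map([1, 2, 3, 4], [10, 20, 30, 40], 3))
```

Values outside the calibration range are clamped to the smallest or largest observed value in this sketch; practical implementations usually need an explicit extrapolation rule.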
Fael, Hanan; Sakur, Amir Al-Haj
2015-11-01
A novel, simple and specific spectrofluorimetric method was developed and validated for the determination of perindopril erbumine (PDE). The method is based on the fluorescence quenching of Rhodamine B upon adding perindopril erbumine. The quenched fluorescence was monitored at 578 nm after excitation at 500 nm. The reaction conditions, such as the solvent, reagent concentration, and reaction time, were optimized. Under the optimum conditions, the fluorescence quenching was linear over a concentration range of 1.0-6.0 μg/mL. The proposed method was fully validated and successfully applied to the analysis of perindopril erbumine in pure form and tablets. Statistical comparison of the results obtained by the developed and reference methods revealed no significant differences between the methods compared in terms of accuracy and precision. The method was shown to be highly specific in the presence of indapamide, a diuretic that is commonly combined with perindopril erbumine. The mechanism of Rhodamine B quenching was also discussed.
Efficient Blockwise Permutation Tests Preserving Exchangeability
Zhou, Chunxiao; Zwilling, Chris E.; Calhoun, Vince D.; Wang, Michelle Y.
2014-01-01
In this paper, we present a new blockwise permutation test approach based on the moments of the test statistic. The method is of importance to neuroimaging studies. In order to preserve the exchangeability condition required in permutation tests, we divide the entire set of data into certain exchangeability blocks. In addition, computationally efficient moments-based permutation tests are performed by approximating the permutation distribution of the test statistic with the Pearson distribution series. This involves the calculation of the first four moments of the permutation distribution within each block and then over the entire set of data. The accuracy and efficiency of the proposed method are demonstrated through simulated experiment on the magnetic resonance imaging (MRI) brain data, specifically the multi-site voxel-based morphometry analysis from structural MRI (sMRI). PMID:25289113
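To make the exchangeability-block idea concrete, the sketch below runs a plain Monte Carlo permutation test in which group labels are shuffled only within each block. It deliberately swaps in random permutations for the paper's moments-based Pearson approximation, which is not reproduced here; the function name `blockwise_permutation_pvalue` and the mean-difference statistic are assumptions.

```python
import random


def blockwise_permutation_pvalue(values, labels, blocks, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    labels are 0/1 group indicators; blocks gives the exchangeability block
    of each observation. Labels are permuted only within a block.
    """
    rng = random.Random(seed)

    def mean_diff(current_labels):
        g1 = [v for v, lab in zip(values, current_labels) if lab == 1]
        g0 = [v for v, lab in zip(values, current_labels) if lab == 0]
        return sum(g1) / len(g1) - sum(g0) / len(g0)

    observed = abs(mean_diff(labels))

    # Group observation indices by block
    by_block = {}
    for i, block in enumerate(blocks):
        by_block.setdefault(block, []).append(i)

    count = 0
    for _ in range(n_perm):
        permuted = list(labels)
        for indices in by_block.values():
            shuffled = [labels[i] for i in indices]
            rng.shuffle(shuffled)
            for i, lab in zip(indices, shuffled):
                permuted[i] = lab
        if abs(mean_diff(permuted)) >= observed - 1e-12:
            count += 1
    # Add-one correction so the estimate is never exactly zero
    return (count + 1) / (n_perm + 1)
```

In a multi-site imaging study the blocks would typically be acquisition sites, so that site effects are never exchanged across sites.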
The application of the statistical theory of extreme values to gust-load problems
NASA Technical Reports Server (NTRS)
Press, Harry
1950-01-01
An analysis is presented which indicates that the statistical theory of extreme values is applicable to the problems of predicting the frequency of encountering the larger gust loads and gust velocities for both specific test conditions as well as commercial transport operations. The extreme-value theory provides an analytic form for the distributions of maximum values of gust load and velocity. Methods of fitting the distribution are given along with a method of estimating the reliability of the predictions. The theory of extreme values is applied to available load data from commercial transport operations. The results indicate that the estimates of the frequency of encountering the larger loads are more consistent with the data and more reliable than those obtained in previous analyses. (author)
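As an illustration of fitting an extreme-value distribution to maxima, the sketch below fits a Gumbel distribution by the method of moments and evaluates an exceedance probability. The abstract does not say which fitting method the report uses, so the method of moments is an assumption, as are the function names `fit_gumbel_moments` and `exceedance_probability` and the sample values.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329


def fit_gumbel_moments(maxima):
    """Method-of-moments Gumbel fit: returns (location, scale)."""
    scale = statistics.stdev(maxima) * math.sqrt(6.0) / math.pi
    location = statistics.mean(maxima) - EULER_GAMMA * scale
    return location, scale


def exceedance_probability(x, location, scale):
    """Probability that a maximum exceeds x under the fitted Gumbel model."""
    return 1.0 - math.exp(-math.exp(-(x - location) / scale))


# Hypothetical per-flight maximum load increments
loc, sc = fit_gumbel_moments([0.42, 0.55, 0.61, 0.48, 0.73, 0.66, 0.51])
print(exceedance_probability(0.9, loc, sc))
```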
Gandhi, Neha; Jain, Sandeep; Kumar, Manish; Rupakar, Pratik; Choyal, Kanaram; Prajapati, Seema
2015-01-01
Background: Age assessment may be a crucial step in postmortem profiling leading to confirmative identification. In children, Demirjian's method based on eight developmental stages was developed to determine maturity scores as a function of age and polynomial functions to determine age as a function of score. Aim: The aim of this study was to evaluate the reliability of age estimation using Demirjian's eight teeth method following the French maturity scores and Indian-specific formula from developmental stages of third molar with the help of orthopantomograms using the Demirjian method. Materials and Methods: Dental panoramic tomograms from 30 subjects each of known chronological age and sex were collected and were evaluated according to Demirjian's criteria. Age calculations were performed using Demirjian's formula and Indian formula. Statistical analysis used was Chi-square test and ANOVA test and the P values obtained were statistically significant. Results: There was an average underestimation of age with both Indian and Demirjian's formulas. The mean absolute error was lower using Indian formula hence it can be applied for age estimation in present Gujarati population. Also, females were ahead of males in achieving dental maturity, thus completion of dental development is attained earlier in females. Conclusion: Greater accuracy can be obtained if population-specific formulas considering the ethnic and environmental variation are derived performing the regression analysis. PMID:26005298
NASA Astrophysics Data System (ADS)
Karpov, A. V.; Yumagulov, E. Z.
2003-05-01
We have restored and ordered the archive of meteor observations carried out with a meteor radar complex "KGU-M5" since 1986. A relational database has been formed under the control of the Database Management System (DBMS) Oracle 8. We also improved and tested a statistical method for studying the fine spatial structure of meteor streams with allowance for the specific features of application of the DBMS. Statistical analysis of the results of observations made it possible to obtain information about the substance distribution in the Quadrantid, Geminid, and Perseid meteor streams.
Approximate Model Checking of PCTL Involving Unbounded Path Properties
NASA Astrophysics Data System (ADS)
Basu, Samik; Ghosh, Arka P.; He, Ru
We study the problem of applying statistical methods for approximate model checking of probabilistic systems against properties encoded as
Lin, Tin-Chi; Marucci-Wellman, Helen R; Willetts, Joanna L; Brennan, Melanye J; Verma, Santosh K
2016-12-01
A common issue in descriptive injury epidemiology is that in order to calculate injury rates that account for the time spent in an activity, both injury cases and exposure time of specific activities need to be collected. In reality, few national surveys have this capacity. To address this issue, we combined statistics from two different national complex surveys as inputs for the numerator and denominator to estimate injury rate, accounting for the time spent in specific activities and included a procedure to estimate variance using the combined surveys. The 2010 National Health Interview Survey (NHIS) was used to quantify injuries, and the 2010 American Time Use Survey (ATUS) was used to quantify time of exposure to specific activities. The injury rate was estimated by dividing the average number of injuries (from NHIS) by average exposure hours (from ATUS), both measured for specific activities. The variance was calculated using the 'delta method', a general method for variance estimation with complex surveys. Among the five types of injuries examined, 'sport and exercise' had the highest rate (12.64 injuries per 100 000 h), followed by 'working around house/yard' (6.14), driving/riding a motor vehicle (2.98), working (1.45) and sleeping/resting/eating/drinking (0.23). The results show a ranking of injury rate by activity quite different from estimates using population as the denominator. Our approach produces an estimate of injury risk which includes activity exposure time and may more reliably reflect the underlying injury risks, offering an alternative method for injury surveillance and research. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
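The ratio-of-two-surveys estimate and its delta-method variance can be sketched as follows. The abstract does not give the formula, so this uses the standard first-order approximation for a ratio and additionally assumes the numerator and denominator estimates are independent (plausible because they come from different surveys, but an assumption nonetheless). The function name `ratio_with_delta_se` and the input values are hypothetical.

```python
import math


def ratio_with_delta_se(numerator, var_numerator, denominator, var_denominator):
    """Ratio estimate and its delta-method standard error.

    First-order approximation for independent estimates X and Y:
    Var(X / Y) ~ (X / Y)^2 * (Var(X) / X^2 + Var(Y) / Y^2)
    """
    ratio = numerator / denominator
    variance = ratio ** 2 * (var_numerator / numerator ** 2
                             + var_denominator / denominator ** 2)
    return ratio, math.sqrt(variance)


# Hypothetical survey estimates: mean injuries and mean exposure hours
rate, se = ratio_with_delta_se(10.0, 1.0, 100.0, 25.0)
print(rate, se)
```

Multiplying the rate and its standard error by 100 000 would express them per 100 000 h, the unit used in the abstract.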
Lee, Mei-Ling Ting; Bulyk, Martha L; Whitmore, G A; Church, George M
2002-12-01
There is considerable scientific interest in knowing the probability that a site-specific transcription factor will bind to a given DNA sequence. Microarray methods provide an effective means for assessing the binding affinities of a large number of DNA sequences as demonstrated by Bulyk et al. (2001, Proceedings of the National Academy of Sciences, USA 98, 7158-7163) in their study of the DNA-binding specificities of Zif268 zinc fingers using microarray technology. In a follow-up investigation, Bulyk, Johnson, and Church (2002, Nucleic Acid Research 30, 1255-1261) studied the interdependence of nucleotides on the binding affinities of transcription proteins. Our article is motivated by this pair of studies. We present a general statistical methodology for analyzing microarray intensity measurements reflecting DNA-protein interactions. The log probability of a protein binding to a DNA sequence on an array is modeled using a linear ANOVA model. This model is convenient because it employs familiar statistical concepts and procedures and also because it is effective for investigating the probability structure of the binding mechanism.
Exact test-based approach for equivalence test with parameter margin.
Cassie Dong, Xiaoyu; Bian, Yuanyuan; Tsong, Yi; Wang, Tianhua
2017-01-01
The equivalence test has a wide range of applications in pharmaceutical statistics, where we need to test for the similarity between two groups. In recent years, the equivalence test has been used in assessing the analytical similarity between a proposed biosimilar product and a reference product. More specifically, the mean values of the two products for a given quality attribute are compared against an equivalence margin in the form of ±f × σ_R, where ±f × σ_R is a function of the reference variability. In practice, this margin is unknown and is estimated from the sample as ±f × S_R. If we use this estimated margin with the classic t-test statistic on the equivalence test for the means, both Type I and Type II error rates may inflate. To resolve this issue, we develop an exact-based test method and compare this method with other proposed methods, such as the Wald test, the constrained Wald test, and the Generalized Pivotal Quantity (GPQ) in terms of Type I error rate and power. Application of those methods on data analysis is also provided in this paper. This work focuses on the development and discussion of the general statistical methodology and is not limited to the application of analytical similarity.
Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes
Xia, Xiao-Lei; Xing, Huanlai; Liu, Xueqin
2013-01-01
One of the most important applications of microarray data is the class prediction of biological samples. For this purpose, statistical tests have often been applied to identify the differentially expressed genes (DEGs), followed by the employment of the state-of-the-art learning machines including the Support Vector Machines (SVM) in particular. The SVM is a typical sample-based classifier whose performance comes down to how discriminant samples are. However, DEGs identified by statistical tests are not guaranteed to result in a training dataset composed of discriminant samples. To tackle this problem, a novel gene ranking method namely the Kernel Matrix Gene Selection (KMGS) is proposed. The rationale of the method, which roots in the fundamental ideas of the SVM algorithm, is described. The notion of ''the separability of a sample'' which is estimated by performing -like statistics on each column of the kernel matrix, is first introduced. The separability of a classification problem is then measured, from which the significance of a specific gene is deduced. Also described is a method of Kernel Matrix Sequential Forward Selection (KMSFS) which shares the KMGS method's essential ideas but proceeds in a greedy manner. On three public microarray datasets, our proposed algorithms achieved noticeably competitive performance in terms of the B.632+ error rate. PMID:24349110
Interactive Web Graphs with Fewer Restrictions
NASA Technical Reports Server (NTRS)
Fiedler, James
2012-01-01
There is growing popularity for interactive, statistical web graphs and programs to generate them. However, it seems that these programs tend to be somewhat restricted in which web browsers and statistical software are supported. For example, the software might use SVG (e.g., Protovis, gridSVG) or HTML canvas, both of which exclude most versions of Internet Explorer, or the software might be made specifically for R (gridSVG, CRanvas), thus excluding users of other stats software. There are more general tools (d3, RaphaëlJS) which are compatible with most browsers, but using one of these to make statistical graphs requires more coding than is probably desired, and requires learning a new tool. This talk will present a method for making interactive web graphs, which, by design, attempts to support as many browsers and as many statistical programs as possible, while also aiming to be relatively easy to use and relatively easy to extend.
NIRS-SPM: statistical parametric mapping for near infrared spectroscopy
NASA Astrophysics Data System (ADS)
Tak, Sungho; Jang, Kwang Eun; Jung, Jinwook; Jang, Jaeduck; Jeong, Yong; Ye, Jong Chul
2008-02-01
Even though there exists a powerful statistical parametric mapping (SPM) tool for fMRI, similar public domain tools are not available for near infrared spectroscopy (NIRS). In this paper, we describe a new public domain statistical toolbox called NIRS-SPM for quantitative analysis of NIRS signals. Specifically, NIRS-SPM statistically analyzes the NIRS data using GLM and makes inferences using the excursion probability of the random field that is interpolated from the sparse measurements. In order to obtain correct inference, NIRS-SPM offers pre-coloring and pre-whitening methods for temporal correlation estimation. For simultaneous recording of NIRS signals with fMRI, the spatial mapping between the fMRI image and real coordinates from a 3-D digitizer is estimated using Horn's algorithm. These powerful tools allow super-resolution localization of brain activation, which is not possible using conventional NIRS analysis tools.
Decadal power in land air temperatures: Is it statistically significant?
NASA Astrophysics Data System (ADS)
Thejll, Peter A.
2001-12-01
The geographical distribution and properties of the well-known 10-11 year signal in terrestrial temperature records is investigated. By analyzing the Global Historical Climate Network data for surface air temperatures we verify that the signal is strongest in North America and is similar in nature to that reported earlier by R. G. Currie. The decadal signal is statistically significant for individual stations, but it is not possible to show that the signal is statistically significant globally, using strict tests. In North America, during the twentieth century, the decadal variability in the solar activity cycle is associated with the decadal part of the North Atlantic Oscillation index series in such a way that both of these signals correspond to the same spatial pattern of cooling and warming. A method for testing statistical results with Monte Carlo trials on data fields with specified temporal structure and specific spatial correlation retained is presented.
Quantitative assessment model for gastric cancer screening
Chen, Kun; Yu, Wei-Ping; Song, Liang; Zhu, Yi-Min
2005-01-01
AIM: To set up a mathematical model for gastric cancer screening and to evaluate its function in mass screening for gastric cancer. METHODS: A case control study was carried out in 66 patients and 198 normal people, then the risk and protective factors of gastric cancer were determined, including heavy manual work, foods such as small yellow-fin tuna, dried small shrimps, squills, crabs, mothers suffering from gastric diseases, spouse alive, use of refrigerators and hot food, etc. According to some principles and methods of probability and fuzzy mathematics, a quantitative assessment model was established as follows: first, we selected statistically significant factors and calculated a weight coefficient for each one by two different methods; second, population space was divided into a gastric cancer fuzzy subset and a non gastric cancer fuzzy subset, then a mathematical model for each subset was established, yielding a mathematical expression of attribute degree (AD). RESULTS: Based on the data of 63 patients and 693 normal people, AD of each subject was calculated. Considering the sensitivity and specificity, the thresholds of the calculated AD values were set at 0.20 and 0.17, respectively. According to these thresholds, the sensitivity and specificity of the quantitative model were about 69% and 63%. Moreover, statistical test showed that the identification outcomes of these two different calculation methods were identical (P>0.05). CONCLUSION: The validity of this method is satisfactory. It is convenient, feasible, economic and can be used to determine individual and population risks of gastric cancer. PMID:15655813
NASA Astrophysics Data System (ADS)
Lotfy, Hayam M.; Saleh, Sarah S.; Hassan, Nagiba Y.; Salem, Hesham
This work represents the application of the isosbestic points present in different absorption spectra. Three novel spectrophotometric methods were developed: the first method is the absorption subtraction method (AS) utilizing the isosbestic point in zero-order absorption spectra; the second method is the amplitude modulation method (AM) utilizing the isosbestic point in ratio spectra; and the third method is the amplitude summation method (A-Sum) utilizing the isosbestic point in derivative spectra. The three methods were applied for the analysis of the ternary mixture of chloramphenicol (CHL), dexamethasone sodium phosphate (DXM) and tetryzoline hydrochloride (TZH) in eye drops in the presence of benzalkonium chloride as a preservative. The components at the isosbestic point were determined using the corresponding unified regression equation at this point with no need for a complementary method. The obtained results were statistically compared to each other and to that of the developed PLS model. The specificity of the developed methods was investigated by analyzing laboratory prepared mixtures and the combined dosage form. The methods were validated as per ICH guidelines where accuracy, repeatability, inter-day precision and robustness were found to be within the acceptable limits. The results obtained from the proposed methods were statistically compared with official ones where no significant difference was observed.
Gandhi, Neha; Jain, Sandeep; Kumar, Manish; Rupakar, Pratik; Choyal, Kanaram; Prajapati, Seema
2015-01-01
Age assessment may be a crucial step in postmortem profiling leading to confirmative identification. In children, Demirjian's method based on eight developmental stages was developed to determine maturity scores as a function of age and polynomial functions to determine age as a function of score. The aim of this study was to evaluate the reliability of age estimation using Demirjian's eight teeth method following the French maturity scores and Indian-specific formula from developmental stages of third molar with the help of orthopantomograms using the Demirjian method. Dental panoramic tomograms from 30 subjects each of known chronological age and sex were collected and were evaluated according to Demirjian's criteria. Age calculations were performed using Demirjian's formula and Indian formula. The statistical analyses used were the Chi-square test and ANOVA, and the P values obtained were statistically significant. There was an average underestimation of age with both Indian and Demirjian's formulas. The mean absolute error was lower using the Indian formula; hence it can be applied for age estimation in the present Gujarati population. Also, females achieved dental maturity ahead of males; thus completion of dental development is attained earlier in females. Greater accuracy can be obtained if population-specific formulas considering the ethnic and environmental variation are derived by performing regression analysis.
Statistical Methods for Identifying Sequence Motifs Affecting Point Mutations
Zhu, Yicheng; Neeman, Teresa; Yap, Von Bing; Huttley, Gavin A.
2017-01-01
Mutation processes differ between types of point mutation, genomic locations, cells, and biological species. For some point mutations, specific neighboring bases are known to be mechanistically influential. Beyond these cases, numerous questions remain unresolved, including: what are the sequence motifs that affect point mutations? How large are the motifs? Are they strand symmetric? And, do they vary between samples? We present new log-linear models that allow explicit examination of these questions, along with sequence logo style visualization to enable identifying specific motifs. We demonstrate the performance of these methods by analyzing mutation processes in human germline and malignant melanoma. We recapitulate the known CpG effect, and identify novel motifs, including a highly significant motif associated with A→G mutations. We show that major effects of neighbors on germline mutation lie within ±2 of the mutating base. Models are also presented for contrasting the entire mutation spectra (the distribution of the different point mutations). We show the spectra vary significantly between autosomes and X-chromosome, with a difference in T→C transition dominating. Analyses of malignant melanoma confirmed reported characteristic features of this cancer, including statistically significant strand asymmetry, and markedly different neighboring influences. The methods we present are made freely available as a Python library https://bitbucket.org/pycogent3/mutationmotif. PMID:27974498
NASA Astrophysics Data System (ADS)
Lotfy, Hayam Mahmoud; Fayez, Yasmin Mohammed; Tawakkol, Shereen Mostafa; Fahmy, Nesma Mahmoud; Shehata, Mostafa Abd El-Atty
2017-09-01
Methods are presented for the simultaneous determination of miconazole (MIC), mometasone furoate (MF), and gentamicin (GEN) in their pharmaceutical combination. Gentamicin determination is based on derivatization with o-phthalaldehyde reagent (OPA) without any interference from the other cited drugs, while the spectra of MIC and MF are resolved using both successive and progressive resolution techniques. The first derivative spectrum of MF is measured using constant multiplication or spectrum subtraction, while its recovered zero order spectrum is obtained using derivative transformation, besides the application of the constant value method. The zero order spectrum of MIC is obtained by derivative transformation after getting its first derivative spectrum by the derivative subtraction method. The novel method, namely differential amplitude modulation, is used to get the concentration of MF and MIC, while the novel graphical method, namely concentration value, is used to get the concentration of MIC, MF, and GEN. Accuracy and precision testing of the developed methods show good results. Specificity of the methods is ensured, and they are successfully applied for the analysis of a pharmaceutical formulation of the three drugs in combination. ICH guidelines are used for validation of the proposed methods. Statistical data are calculated, and the results are satisfactory, revealing no significant difference regarding accuracy and precision.
Gerber, Madelyn M.; Hampel, Heather; Schulz, Nathan P.; Fernandez, Soledad; Wei, Lai; Zhou, Xiao-Ping; de la Chapelle, Albert; Toland, Amanda Ewart
2012-01-01
Background Tumors frequently exhibit loss of tumor suppressor genes or allelic gains of activated oncogenes. A significant proportion of cancer susceptibility loci in the mouse show somatic losses or gains consistent with the presence of a tumor susceptibility or resistance allele. Thus, allele-specific somatic gains or losses at loci may demarcate the presence of resistance or susceptibility alleles. The goal of this study was to determine if previously mapped susceptibility loci for colorectal cancer show evidence of allele-specific somatic events in colon tumors. Methods We performed quantitative genotyping of 16 single nucleotide polymorphisms (SNPs) showing statistically significant association with colorectal cancer in published genome-wide association studies (GWAS). We genotyped 194 paired normal and colorectal tumor DNA samples and 296 paired validation samples to investigate these SNPs for allele-specific somatic gains and losses. We combined analysis of our data with published data for seven of these SNPs. Results No statistically significant evidence for allele-specific somatic selection was observed for the tested polymorphisms in the discovery set. The rs6983267 variant, which has shown preferential loss of the non-risk T allele and relative gain of the risk G allele in previous studies, favored relative gain of the G allele in the combined discovery and validation samples (corrected p-value = 0.03). When we combined our data with published allele-specific imbalance data for this SNP, the G allele of rs6983267 showed statistically significant evidence of relative retention (p-value = 2.06×10−4). Conclusions Our results suggest that the majority of variants identified as colon cancer susceptibility alleles through GWAS do not exhibit somatic allele-specific imbalance in colon tumors. Our data confirm previously published results showing allele-specific imbalance for rs6983267. 
These results indicate that allele-specific imbalance of cancer susceptibility alleles may not be a common phenomenon in colon cancer. PMID:22629442
Statewide analysis of the drainage-area ratio method for 34 streamflow percentile ranges in Texas
Asquith, William H.; Roussel, Meghan C.; Vrabel, Joseph
2006-01-01
The drainage-area ratio method commonly is used to estimate streamflow for sites where no streamflow data are available using data from one or more nearby streamflow-gaging stations. The method is intuitive and straightforward to implement and is in widespread use by analysts and managers of surface-water resources. The method equates the ratio of streamflow at two stream locations to the ratio of the respective drainage areas. In practice, unity often is assumed as the exponent on the drainage-area ratio, and unity also is assumed as a multiplicative bias correction. These two assumptions are evaluated in this investigation through statewide analysis of daily mean streamflow in Texas. The investigation was made by the U.S. Geological Survey in cooperation with the Texas Commission on Environmental Quality. More than 7.8 million values of daily mean streamflow for 712 U.S. Geological Survey streamflow-gaging stations in Texas were analyzed. To account for the influence of streamflow probability on the drainage-area ratio method, 34 percentile ranges were considered. The 34 ranges are the 4 quartiles (0-25, 25-50, 50-75, and 75-100 percent), the 5 intervals of the lower tail of the streamflow distribution (0-1, 1-2, 2-3, 3-4, and 4-5 percent), the 20 quintiles of the 4 quartiles (0-5, 5-10, 10-15, 15-20, 20-25, 25-30, 30-35, 35-40, 40-45, 45-50, 50-55, 55-60, 60-65, 65-70, 70-75, 75-80, 80-85, 85-90, 90-95, and 95-100 percent), and the 5 intervals of the upper tail of the streamflow distribution (95-96, 96-97, 97-98, 98-99 and 99-100 percent). For each of the 253,116 (712X711/2) unique pairings of stations and for each of the 34 percentile ranges, the concurrent daily mean streamflow values available for the two stations provided for station-pair application of the drainage-area ratio method. 
For each station pair, specific statistical summarization (median, mean, and standard deviation) of both the exponent and bias-correction components of the drainage-area ratio method were computed. Statewide statistics (median, mean, and standard deviation) of the station-pair specific statistics subsequently were computed and are tabulated herein. A separate analysis considered conditioning station pairs to those stations within 100 miles of each other and with the absolute value of the logarithm (base-10) of the ratio of the drainage areas greater than or equal to 0.25. Statewide statistics of the conditional station-pair specific statistics were computed and are tabulated. The conditional analysis is preferable because of the anticipation that small separation distances reflect similar hydrologic conditions and the observation of large variation in exponent estimates for similar-sized drainage areas. The conditional analysis determined that the exponent is about 0.89 for streamflow percentiles from 0 to about 50 percent, is about 0.92 for percentiles from about 50 to about 65 percent, and is about 0.93 for percentiles from about 65 to about 85 percent. The exponent decreases rapidly to about 0.70 for percentiles nearing 100 percent. The computation of the bias-correction factor is sensitive to the range analysis interval (range of streamflow percentile); however, evidence suggests that in practice the drainage-area method can be considered unbiased. Finally, for general application, suggested values of the exponent are tabulated for 54 percentiles of daily mean streamflow in Texas; when these values are used, the bias correction is unity.
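The drainage-area ratio relation described in this abstract can be sketched as a one-line transfer function. This is a minimal illustration under stated assumptions: the function name, argument names, and the flow and area values are hypothetical, and the exponent of 0.89 is the rounded value the abstract reports for percentiles from 0 to about 50 percent, not an entry from the report's tables.

```python
def drainage_area_ratio_estimate(q_gaged, area_gaged, area_ungaged,
                                 exponent=1.0, bias=1.0):
    """Estimate streamflow at an ungaged site from a nearby gaged site.

    Implements Q_ungaged = bias * Q_gaged * (A_ungaged / A_gaged) ** exponent.
    The defaults reproduce the common practice of assuming unity for both
    the exponent and the multiplicative bias correction.
    """
    if area_gaged <= 0 or area_ungaged <= 0:
        raise ValueError("drainage areas must be positive")
    return bias * q_gaged * (area_ungaged / area_gaged) ** exponent


# Hypothetical station: 100 flow units at a gage draining 400 area units,
# transferred to an ungaged site draining 100 area units.
unity_estimate = drainage_area_ratio_estimate(100.0, 400.0, 100.0)
adjusted_estimate = drainage_area_ratio_estimate(100.0, 400.0, 100.0,
                                                 exponent=0.89)
```

Because the area ratio here is below one, an exponent smaller than unity raises the estimate relative to the unity assumption; for an area ratio above one the effect is reversed.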
A multi-analyte serum test for the detection of non-small cell lung cancer
Farlow, E C; Vercillo, M S; Coon, J S; Basu, S; Kim, A W; Faber, L P; Warren, W H; Bonomi, P; Liptay, M J; Borgia, J A
2010-01-01
Background: In this study, we appraised a wide assortment of biomarkers previously shown to have diagnostic or prognostic value for non-small cell lung cancer (NSCLC) with the intent of establishing a multi-analyte serum test capable of identifying patients with lung cancer. Methods: Circulating levels of 47 biomarkers were evaluated against patient cohorts consisting of 90 NSCLC and 43 non-cancer controls using commercial immunoassays. Multivariate statistical methods were used on all biomarkers achieving statistical relevance to define an optimised panel of diagnostic biomarkers for NSCLC. The resulting biomarkers were fashioned into a classification algorithm and validated against serum from a second patient cohort. Results: A total of 14 analytes achieved statistical relevance upon evaluation. Multivariate statistical methods then identified a panel of six biomarkers (tumour necrosis factor-α, CYFRA 21-1, interleukin-1ra, matrix metalloproteinase-2, monocyte chemotactic protein-1 and sE-selectin) as being the most efficacious for diagnosing early stage NSCLC. When tested against a second patient cohort, the panel successfully classified 75 of 88 patients. Conclusions: Here, we report the development of a serum algorithm with high specificity for classifying patients with NSCLC against cohorts of various ‘high-risk' individuals. A high rate of false positives was observed within the cohort in which patients had non-neoplastic lung nodules, possibly as a consequence of the inflammatory nature of these conditions. PMID:20859284
Karim, Mohammad Ehsanul; Platt, Robert W
2017-06-15
Correct specification of the inverse probability weighting (IPW) model is necessary for consistent inference from a marginal structural Cox model (MSCM). In practical applications, researchers are typically unaware of the true specification of the weight model. Nonetheless, IPWs are commonly estimated using parametric models, such as the main-effects logistic regression model. In practice, assumptions underlying such models may not hold and data-adaptive statistical learning methods may provide an alternative. Many candidate statistical learning approaches are available in the literature. However, the optimal approach for a given dataset is impossible to predict. Super learner (SL) has been proposed as a tool for selecting an optimal learner from a set of candidates using cross-validation. In this study, we evaluate the usefulness of a SL in estimating IPW in four different MSCM simulation scenarios, in which we varied the specification of the true weight model (linear and/or additive). Our simulations show that, in the presence of weight model misspecification, with a rich and diverse set of candidate algorithms, SL can generally offer a better alternative to the commonly used statistical learning approaches in terms of MSE as well as the coverage probabilities of the estimated effect in an MSCM. The findings from the simulation studies guided the application of the MSCM in a multiple sclerosis cohort from British Columbia, Canada (1995-2008), to estimate the impact of beta-interferon treatment in delaying disability progression. Copyright © 2017 John Wiley & Sons, Ltd.
Highly selective colorimetric bacteria sensing based on protein-capped nanoparticles.
Qiu, Suyan; Lin, Zhenyu; Zhou, Yaomin; Wang, Donggen; Yuan, Lijuan; Wei, Yihua; Dai, Tingcan; Luo, Linguang; Chen, Guonan
2015-02-21
A rapid and cost-effective colorimetric sensor has been developed for the detection of bacteria (Bacillus subtilis was selected as an example). The sensor was designed to rely on lysozyme-capped AuNPs with the advantages of effective amplification and high specificity. In the sensing system, lysozyme was able to bind strongly to Bacillus subtilis, which effectively induced a color change of the solution from light purple to purplish red. The lowest concentration of Bacillus subtilis detectable by the naked eye was 4.5 × 10^3 colony-forming units (CFU) mL^-1. Similar results were discernible from UV-Vis absorption measurements. A good specificity was observed through a statistical analysis method using the SPSS software (version 17.0). This simple colorimetric sensor may therefore be a rapid and specific method for a bacterial detection assay in complex samples.
Aryan, Arvin; Azizi, Zahra; Teimouri, Azam; Ebrahimi Daryani, Nasser; Aletaha, Najme; Jahanbakhsh, Ali; Nouritaromlou, Mohammad Kazem; Alborzi, Forough; Mami, Masoud; Basirat, Vahid; Javid Anbardan, Sanam
2016-04-01
BACKGROUND According to recent studies comparing magnetic resonance enterography (MRE) with ileocolonoscopy for assessing inflammation of small bowel and colonic segments in adults with active Crohn's disease (CD), we aimed to compare the accuracy of these two diagnostic methods in an Iranian population. METHODS During 2013-2014 a follow-up study was done on 30 patients with active CD in a gastroenterology clinic affiliated to Tehran University of Medical Sciences. MRE and ileocolonoscopy were performed for all the patients. All statistical analyses were performed using SPSS (version 18) and p-value<0.05 was considered as statistically significant. RESULTS Of the 30 patients with active CD, 11 (36.7%) were men and 19 (63.3%) were women with mean age of 37.30±13.66 years (range: 19-67 years). For localizing sigmoid lesions, MRE had sensitivity and specificity of 50% and 90%, with positive predictive value (PPV) and negative predictive value (NPV) of 71.43% and 78.26%, respectively; for ileal lesions, it had sensitivity and specificity of 84.21% and 45.45%, with PPV and NPV of 72.73% and 62.50%, respectively. CONCLUSION While moderate sensitivity and high specificity of MRE in localizing colonic lesions makes it an appropriate confirmatory test after colonoscopy, the reported high sensitivity and moderate specificity of MRE versus colonoscopy in detecting ileal lesions makes it a suitable screening test for ileal lesions. Finally we can conclude that MRE can be an important complementary test to colonoscopy in detecting active disease.
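The four accuracy measures reported in this abstract all follow from a 2×2 table of test result against reference standard. The sketch below shows the arithmetic; the abstract does not report the underlying counts, so the counts used here are an assumption, chosen only because they sum to 30 patients and reproduce the percentages reported for sigmoid lesions. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased correctly flagged
        "specificity": tn / (tn + fp),  # non-diseased correctly cleared
        "ppv": tp / (tp + fp),          # positive results that are correct
        "npv": tn / (tn + fn),          # negative results that are correct
    }


# Assumed counts (not given in the abstract) consistent with the reported
# sigmoid-lesion figures for 30 patients.
sigmoid = diagnostic_metrics(tp=5, fp=2, fn=5, tn=18)
sigmoid_percent = {name: round(100 * value, 2) for name, value in sigmoid.items()}
```

Other count combinations could give the same rounded percentages, so the table above should be read as one consistent possibility rather than the study data.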
Vlachopoulos, Lazaros; Lüthi, Marcel; Carrillo, Fabio; Gerber, Christian; Székely, Gábor; Fürnstahl, Philipp
2018-04-18
In computer-assisted reconstructive surgeries, the contralateral anatomy is established as the best available reconstruction template. However, existing intra-individual bilateral differences or a pathological, contralateral humerus may limit the applicability of the method. The aim of the study was to evaluate whether a statistical shape model (SSM) has the potential to predict accurately the pretraumatic anatomy of the humerus from the posttraumatic condition. Three-dimensional (3D) triangular surface models were extracted from the computed tomographic data of 100 paired cadaveric humeri without a pathological condition. An SSM was constructed, encoding the characteristic shape variations among the individuals. To predict the patient-specific anatomy of the proximal (or distal) part of the humerus with the SSM, we generated segments of the humerus of predefined length excluding the part to predict. The proximal and distal humeral prediction (p-HP and d-HP) errors, defined as the deviation of the predicted (bone) model from the original (bone) model, were evaluated. For comparison with the state-of-the-art technique, i.e., the contralateral registration method, we used the same segments of the humerus to evaluate whether the SSM or the contralateral anatomy yields a more accurate reconstruction template. The p-HP error (mean and standard deviation, 3.8° ± 1.9°) using 85% of the distal end of the humerus to predict the proximal humeral anatomy was significantly smaller (p = 0.001) compared with the contralateral registration method. The difference between the d-HP error (mean, 5.5° ± 2.9°), using 85% of the proximal part of the humerus to predict the distal humeral anatomy, and the contralateral registration method was not significant (p = 0.61). The restoration of the humeral length was not significantly different between the SSM and the contralateral registration method. 
SSMs accurately predict the patient-specific anatomy of the proximal and distal aspects of the humerus. The prediction errors of the SSM depend on the size of the healthy part of the humerus. The prediction of the patient-specific anatomy of the humerus is of fundamental importance for computer-assisted reconstructive surgeries.
Álvarez-Díaz, N; Amador-García, I; Fuentes-Hernández, M; Dorta-Guerra, R
2015-01-01
To compare the ability of lung ultrasound and a clinical method in the confirmation of a selective bronchial intubation by left double-lumen tube in elective thoracic surgery. A prospective and blind, observational study was conducted in the setting of a university hospital operating room assigned for thoracic surgery. A single group of 105 consecutive patients from a total of 130, were included. After blind intubation, the position of the tube was confirmed by clinical and ultrasound assessment. Finally, the fiberoptic bronchoscopy confirmation as a reference standard was used to confirm the position of the tube. Under manual ventilation, by sequentially clamping the tracheal and bronchial limbs of the tube, clinical confirmation was made by auscultation, capnography, visualizing the chest wall expansion, and perceiving the lung compliance in the reservoir bag. Ultrasound confirmation was obtained by visualizing lung sliding, diaphragmatic movements, and the appearance of lung pulse sign. The sensitivity of the clinical method was 84.5%, with a specificity of 41.1%. The positive and negative likelihood ratio was 1.44 and 0.38, respectively. The sensitivity of the ultrasound method was 98.6%, specificity was 52.9%, with a positive likelihood ratio of 2.10 and a negative likelihood ratio of 0.03. Comparisons between the diagnostic performance of the 2 methods were calculated with McNemar's test. There was a significant difference in sensitivity between the ultrasound method and the clinical method (P=.002). Nevertheless, there was no statistically significant difference in specificity between both methods (P=.34). A p value<.01 was considered statistically significant. Lung ultrasound was superior to the clinical method in confirming the adequate position of the left double-lumen tube. On the other hand, in confirming the misplacement of the tube, differences between both methods could not be ensured. 
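The likelihood ratios quoted in this abstract are functions of sensitivity and specificity alone, which the following sketch makes explicit. The function name is hypothetical, and because the inputs below are the rounded percentages from the abstract, the computed ratios may differ slightly in the last digit from the values the authors report.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary diagnostic test.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Rounded sensitivity and specificity as reported for each method.
clinical_lr_pos, clinical_lr_neg = likelihood_ratios(0.845, 0.411)
ultrasound_lr_pos, ultrasound_lr_neg = likelihood_ratios(0.986, 0.529)
```

A negative likelihood ratio close to zero, as for ultrasound here, means a negative result argues strongly against the condition being tested for.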
Copyright © 2014 Sociedad Española de Anestesiología, Reanimación y Terapéutica del Dolor. Publicado por Elsevier España, S.L.U. All rights reserved.
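The likelihood ratios quoted in this abstract follow from the reported sensitivity and specificity. A minimal sketch in Python, assuming the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity; the function name is illustrative and not taken from the study:

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test; both inputs are fractions in (0, 1)."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Sensitivity and specificity as reported in the abstract.
clinical = likelihood_ratios(0.845, 0.411)
ultrasound = likelihood_ratios(0.986, 0.529)

print("clinical   LR+ = %.2f, LR- = %.2f" % clinical)
print("ultrasound LR+ = %.2f, LR- = %.2f" % ultrasound)
```

Because the inputs are the rounded percentages, the results agree with the published ratios only to within about 0.01.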
Use of dichotomous choice nonmarket methods to value the whooping crane resource
J. Michael Bowker; John R. Stoll
1985-01-01
A dichotomous choice form of contingent valuation is applied to quantify individuals' economic surplus associated with preservation of the whooping crane resource. Specific issues and limitations of the empirical approach are discussed. The results of this case study reveal that models with similar statistical fits can lead to very disparate measures of economic...
USDA-ARS's Scientific Manuscript database
This paper provides a summary of results presented in a much more comprehensive article (Sampson et al. 2014). Specifics regarding methods and statistical procedures can be found in Sampson et al. 2014. Here, we summarize these results for popular cultivars of rabbiteye blueberry (V. virgatum syn. a...
ERIC Educational Resources Information Center
Burstein, Marcy; Georgiades, Katholiki; Lamers, Femke; Swanson, Sonja A.; Cui, Lihong; He, Jian-Ping; Avenevoli, Shelli; Merikangas, Kathleen R.
2012-01-01
Objective: The current study examined the sex- and age-specific structure and comorbidity of lifetime anxiety disorders among U.S. adolescents. Method: The sample consisted of 2,539 adolescents (1,505 females and 1,034 males) from the National Comorbidity Survey-Adolescent Supplement who met criteria for "Diagnostic and Statistical Manual of…
NASA Astrophysics Data System (ADS)
Gomo, M.; Vermeulen, D.
2015-03-01
An investigation was conducted to statistically compare the influence of non-purging and purging groundwater sampling methods on analysed inorganic chemistry parameters and calculated saturation indices. Groundwater samples were collected from 15 monitoring wells drilled in Karoo aquifers before and after purging for the comparative study. For the non-purging method, samples were collected from groundwater flow zones located in the wells using electrical conductivity (EC) profiling. The two data sets of non-purged and purged groundwater samples were analysed for inorganic chemistry parameters at the Institute of Groundwater Studies (IGS) laboratory of the Free University in South Africa. Saturation indices for mineral phases that were found in the database of the PHREEQC hydrogeochemical model were calculated for each data set. Four one-way ANOVA tests were conducted using Microsoft Excel 2007 to investigate whether there is any statistically significant difference between: (1) all inorganic chemistry parameters measured in the non-purged and purged groundwater samples for each specific well, (2) all mineral saturation indices calculated for the non-purged and purged groundwater samples for each specific well, (3) individual inorganic chemistry parameters measured in the non-purged and purged groundwater samples across all wells and (4) individual mineral saturation indices calculated for non-purged and purged groundwater samples across all wells. For all the ANOVA tests conducted, the calculated p values are greater than 0.05 (the significance level) and the test statistic (F) is less than the critical value (Fcrit) (F < Fcrit). The results imply that there was no statistically significant difference between the two data sets. With 95% confidence, it was therefore concluded that the variance between groups was due to random chance rather than to the influence of the sampling methods (tested factor).
It is therefore possible that in some hydrogeologic conditions, non-purged groundwater samples might be just as representative as the purged ones. The findings of this study can provide an important platform for future evidence-oriented research investigations to establish the necessity of purging prior to groundwater sampling in different aquifer systems.
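The F statistic that this abstract compares against Fcrit can be computed by hand. A minimal sketch of a one-way ANOVA in Python follows; the two samples are hypothetical placeholder values, not measurements from the study, and the function name is an assumption made for illustration:

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical concentrations for one parameter (illustrative only).
non_purged = [1.0, 2.0, 3.0]
purged = [2.0, 3.0, 4.0]

f_stat, df_b, df_w = one_way_anova_f([non_purged, purged])
print("F = %.3f with (%d, %d) degrees of freedom" % (f_stat, df_b, df_w))
```

The resulting F would then be compared with the tabulated critical value for the same degrees of freedom, which is the F < Fcrit check described in the abstract.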
Cannon, Edward O; Amini, Ata; Bender, Andreas; Sternberg, Michael J E; Muggleton, Stephen H; Glen, Robert C; Mitchell, John B O
2007-05-01
We investigate the classification performance of circular fingerprints in combination with the Naive Bayes Classifier (MP2D), Inductive Logic Programming (ILP) and Support Vector Inductive Logic Programming (SVILP) on a standard molecular benchmark dataset comprising 11 activity classes and about 102,000 structures. The Naive Bayes Classifier treats features independently while ILP combines structural fragments, and then creates new features with higher predictive power. SVILP is a very recently presented method which adds a support vector machine after common ILP procedures. The performance of the methods is evaluated via a number of statistical measures, namely recall, specificity, precision, F-measure, Matthews Correlation Coefficient, area under the Receiver Operating Characteristic (ROC) curve and enrichment factor (EF). According to the F-measure, which takes both recall and precision into account, SVILP is for seven out of the 11 classes the superior method. The results show that the Bayes Classifier gives the best recall performance for eight of the 11 targets, but has a much lower precision, specificity and F-measure. The SVILP model on the other hand has the highest recall for only three of the 11 classes, but generally far superior specificity and precision. To evaluate the statistical significance of the SVILP superiority, we employ McNemar's test which shows that SVILP performs significantly (p < 5%) better than both other methods for six out of 11 activity classes, while being superior with less significance for three of the remaining classes. While previously the Bayes Classifier was shown to perform very well in molecular classification studies, these results suggest that SVILP is able to extract additional knowledge from the data, thus improving classification results further.
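McNemar's test, used here to compare classifiers on the same structures, depends only on the discordant pairs: cases one method gets right and the other gets wrong. A minimal sketch of the exact (binomial) form in Python; the counts are hypothetical and the abstract does not say which variant of the test the authors used:

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts b and c.

    b: cases where only method A is correct; c: cases where only method B is.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts (illustrative only).
p_value = mcnemar_exact(1, 9)
print("p = %.4f" % p_value)
```

Concordant pairs do not enter the calculation, which is why the test suits paired comparisons of two methods on one benchmark.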
Ruiz-España, Silvia; Arana, Estanislao; Moratal, David
2015-07-01
Computer-aided diagnosis (CAD) methods for detecting and classifying lumbar spine disease in magnetic resonance imaging (MRI) can assist radiologists in their decision-making tasks. In this paper, CAD software has been developed that can classify and quantify spine disease (disc degeneration, herniation and spinal stenosis) in two-dimensional MRI. A set of 52 lumbar discs from 14 patients was used for training and 243 lumbar discs from 53 patients for testing in conventional two-dimensional MRI of the lumbar spine. To classify disc degeneration according to the gold standard, the Pfirrmann classification, a method based on the measurement of disc signal intensity and structure was developed. A Gradient Vector Flow algorithm was used to extract disc shape features and to detect contour abnormalities. Also, a signal intensity method was used for segmenting and detecting spinal stenosis. Novel algorithms have also been developed to quantify the severity of these pathologies. Variability was evaluated by kappa (k) and intra-class correlation (ICC) statistics. Segmentation inaccuracy was below 1%. Almost perfect agreement, as measured by the k and ICC statistics, was obtained for all the analyzed pathologies: disc degeneration (k=0.81 with 95% CI=[0.75..0.88]) with a sensitivity of 95.8% and a specificity of 92.6%, disc herniation (k=0.94 with 95% CI=[0.87..1]) with a sensitivity of 60% and a specificity of 87.1%, categorical stenosis (k=0.94 with 95% CI=[0.90..0.98]) and quantitative stenosis (ICC=0.98 with 95% CI=[0.97..0.98]) with a sensitivity of 70% and a specificity of 81.7%. The proposed methods are reproducible and should be considered as a possible alternative when compared to reference standards. Copyright © 2015 Elsevier Ltd. All rights reserved.
A Bifactor Approach to Model Multifaceted Constructs in Statistical Mediation Analysis.
Gonzalez, Oscar; MacKinnon, David P
Statistical mediation analysis allows researchers to identify the most important mediating constructs in the causal process studied. Identifying specific mediators is especially relevant when the hypothesized mediating construct consists of multiple related facets. The general definition of the construct and its facets might relate differently to an outcome. However, current methods do not allow researchers to study the relationships between general and specific aspects of a construct to an outcome simultaneously. This study proposes a bifactor measurement model for the mediating construct as a way to parse variance and represent the general aspect and specific facets of a construct simultaneously. Monte Carlo simulation results are presented to help determine the properties of mediated effect estimation when the mediator has a bifactor structure and a specific facet of a construct is the true mediator. This study also investigates the conditions when researchers can detect the mediated effect when the multidimensionality of the mediator is ignored and treated as unidimensional. Simulation results indicated that the mediation model with a bifactor mediator measurement model had unbiased and adequate power to detect the mediated effect with a sample size greater than 500 and medium a- and b-paths. Also, results indicate that parameter bias and detection of the mediated effect in both the data-generating model and the misspecified model varies as a function of the amount of facet variance represented in the mediation model. This study contributes to the largely unexplored area of measurement issues in statistical mediation analysis.
Subjective global assessment of nutritional status in children.
Mahdavi, Aida Malek; Ostadrahimi, Alireza; Safaiyan, Abdolrasool
2010-10-01
This study aimed to compare the subjective and objective nutritional assessments and to analyse the performance of subjective global assessment (SGA) of nutritional status in diagnosing undernutrition in paediatric patients. One hundred and forty children (aged 2-12 years) hospitalized consecutively in Tabriz Paediatric Hospital from June 2008 to August 2008 underwent subjective assessment using the SGA questionnaire and objective assessment, including anthropometric and biochemical measurements. Agreement between the two assessment methods was analysed by the kappa (κ) statistic. Statistical indicators (sensitivity, specificity, predictive values, error rates, accuracy, powers, likelihood ratios and odds ratio) of SGA relative to the objective assessment method were determined. The overall prevalence of undernutrition according to the SGA (70.7%) was higher than that by objective assessment of nutritional status (48.5%). Agreement between the two evaluation methods was only fair to moderate (κ = 0.336, P < 0.001). The sensitivity, specificity, positive and negative predictive value of the SGA method for screening undernutrition in this population were 88.235%, 45.833%, 60.606% and 80.487%, respectively. Accuracy, positive and negative power of the SGA method were 66.428%, 56.074% and 41.25%, respectively. Likelihood ratio positive, likelihood ratio negative and odds ratio of the SGA method were 1.628, 0.256 and 6.359, respectively. Our findings indicated that in assessing nutritional status of children, there is not a good level of agreement between SGA and objective nutritional assessment. In addition, SGA is a highly sensitive tool for assessing nutritional status and could identify children at risk of developing undernutrition. © 2009 Blackwell Publishing Ltd.
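The agreement and accuracy figures in this abstract can all be derived from one 2x2 table. The sketch below uses cell counts back-calculated from the reported percentages for 140 children (60 true positives, 8 false negatives, 39 false positives, 33 true negatives). These counts are an assumption: they are not stated in the abstract, although they reproduce its reported sensitivity, specificity, predictive values, accuracy and kappa to within rounding.

```python
def diagnostic_summary(tp, fn, fp, tn):
    """Accuracy measures and Cohen's kappa from a 2x2 table (test vs. reference)."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of the two methods.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": observed,
        "kappa": (observed - expected) / (1.0 - expected),
    }


# Back-calculated counts (assumption, see the note above).
summary = diagnostic_summary(tp=60, fn=8, fp=39, tn=33)
for name, value in summary.items():
    print("%-12s %.3f" % (name, value))
```

Seen this way, the combination of high sensitivity and low specificity explains why agreement is only fair to moderate even though few undernourished children are missed.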
First arrival time picking for microseismic data based on DWSW algorithm
NASA Astrophysics Data System (ADS)
Li, Yue; Wang, Yue; Lin, Hongbo; Zhong, Tie
2018-03-01
First arrival time picking is a crucial step in microseismic data processing. When the signal-to-noise ratio (SNR) is low, however, it is difficult to pick the first arrival time accurately with traditional methods. In this paper, we propose the double-sliding-window SW (DWSW) method based on the Shapiro-Wilk (SW) test. The DWSW method detects the first arrival time by making full use of the differences in statistical properties between background noise and effective signals. Specifically, the first arrival time is taken as the moment at which the statistic of our method reaches its maximum. Hence, our method does not require a threshold to be selected, which makes the algorithm easier to apply when the SNR of microseismic data is low. To verify the reliability of the proposed method, a series of experiments is performed on both synthetic and field microseismic data. Our method is compared with the traditional short-time and long-time average (STA/LTA) method, the Akaike information criterion, and the kurtosis method. Analysis results indicate that the accuracy rate of the proposed method is superior to that of the other three methods when the SNR is as low as -10 dB.
Powerful Inference with the D-Statistic on Low-Coverage Whole-Genome Data.
Soraggi, Samuele; Wiuf, Carsten; Albrechtsen, Anders
2018-02-02
The detection of ancient gene flow between human populations is an important issue in population genetics. A common tool for detecting ancient admixture events is the D-statistic. The D-statistic is based on the hypothesis of a genetic relationship that involves four populations, whose correctness is assessed by evaluating specific coincidences of alleles between the groups. When working with high-throughput sequencing data, calling genotypes accurately is not always possible; therefore, the D-statistic currently samples a single base from the reads of one individual per population. This implies ignoring much of the information in the data, an issue especially striking in the case of ancient genomes. We provide a significant improvement to overcome the problems of the D-statistic by considering all reads from multiple individuals in each population. We also apply type-specific error correction to combat the problems of sequencing errors, and show a way to correct for introgression from an external population that is not part of the supposed genetic relationship, and how this leads to an estimate of the admixture rate. We prove that the D-statistic is approximated by a standard normal distribution. Furthermore, we show that our method outperforms the traditional D-statistic in detecting admixtures. The power gain is most pronounced for low and medium sequencing depth (1-10×), and performances are as good as with perfectly called genotypes at a sequencing depth of 2×. We show the reliability of error correction in scenarios with simulated errors and ancient data, and correct for introgression in known scenarios to estimate the admixture rates. Copyright © 2018 Soraggi et al.
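For orientation, the classic D-statistic that this abstract extends compares counts of the two discordant allele patterns, usually called ABBA and BABA, across sites. A minimal sketch in Python, assuming per-site pattern counts are already available; the values are hypothetical, and this is the traditional estimator rather than the multi-read method the authors propose:

```python
def d_statistic(abba, baba):
    """Classic D = sum(ABBA - BABA) / sum(ABBA + BABA) over sites."""
    numerator = sum(a - b for a, b in zip(abba, baba))
    denominator = sum(a + b for a, b in zip(abba, baba))
    if denominator == 0:
        raise ValueError("no informative (ABBA or BABA) sites")
    return numerator / denominator


# Hypothetical per-site indicators: 1 if the site shows the pattern, else 0.
abba_sites = [1, 0, 1, 1]
baba_sites = [0, 1, 0, 0]

d = d_statistic(abba_sites, baba_sites)
print("D = %.2f" % d)
```

A value near zero is what the no-admixture hypothesis predicts; the sign convention depends on how the four populations are ordered, so the sign should be read with that ordering in mind.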
Application of a data-mining method based on Bayesian networks to lesion-deficit analysis
NASA Technical Reports Server (NTRS)
Herskovits, Edward H.; Gerring, Joan P.
2003-01-01
Although lesion-deficit analysis (LDA) has provided extensive information about structure-function associations in the human brain, LDA has suffered from the difficulties inherent to the analysis of spatial data, i.e., there are many more variables than subjects, and data may be difficult to model using standard distributions, such as the normal distribution. We herein describe a Bayesian method for LDA; this method is based on data-mining techniques that employ Bayesian networks to represent structure-function associations. These methods are computationally tractable, and can represent complex, nonlinear structure-function associations. When applied to the evaluation of data obtained from a study of the psychiatric sequelae of traumatic brain injury in children, this method generates a Bayesian network that demonstrates complex, nonlinear associations among lesions in the left caudate, right globus pallidus, right side of the corpus callosum, right caudate, and left thalamus, and subsequent development of attention-deficit hyperactivity disorder, confirming and extending our previous statistical analysis of these data. Furthermore, analysis of simulated data indicates that methods based on Bayesian networks may be more sensitive and specific for detecting associations among categorical variables than methods based on chi-square and Fisher exact statistics.
NASA Astrophysics Data System (ADS)
Lomakina, N. Ya.
2017-11-01
The work presents the results of an applied climatic division of the Siberian region into districts, based on a methodology for the objective classification of atmospheric boundary layer climates by the "temperature-moisture-wind" complex. The methodology is implemented using the method of principal components and special similarity criteria for the average profiles and the eigenvalues of correlation matrices. On the territory of Siberia, 14 homogeneous regions were identified for the winter season and 10 regions for summer. Local statistical models were constructed for each region. These include vertical profiles of mean values, mean square deviations, and matrices of interlevel correlation of temperature, specific humidity, and zonal and meridional wind velocity. The advantage of the obtained local statistical models over the regional models is shown.
Model Error Estimation for the CPTEC Eta Model
NASA Technical Reports Server (NTRS)
Tippett, Michael K.; daSilva, Arlindo
1999-01-01
Statistical data assimilation systems require the specification of forecast and observation error statistics. Forecast error is due to model imperfections and differences between the initial condition and the actual state of the atmosphere. Practical four-dimensional variational (4D-Var) methods try to fit the forecast state to the observations and assume that the model error is negligible. Here, with a number of simplifying assumptions, a framework is developed for isolating the model error given the forecast error at two lead-times. Two definitions are proposed for the Talagrand ratio tau, the fraction of the forecast error due to model error rather than initial condition error. Data from the CPTEC Eta Model running operationally over South America are used to calculate forecast error statistics and lower bounds for tau.
Kuhn-Tucker optimization based reliability analysis for probabilistic finite elements
NASA Technical Reports Server (NTRS)
Liu, W. K.; Besterfield, G.; Lawrence, M.; Belytschko, T.
1988-01-01
The fusion of the probabilistic finite element method (PFEM) and reliability analysis for fracture mechanics is considered. Reliability analysis with specific application to fracture mechanics is presented, and computational procedures are discussed. Explicit expressions for the optimization procedure with regard to fracture mechanics are given. The results show the PFEM is a very powerful tool in determining the second-moment statistics. The method can determine the probability of failure or fracture subject to randomness in load, material properties and crack length, orientation, and location.
Assessing criticality in seismicity by entropy
NASA Astrophysics Data System (ADS)
Goltz, C.
2003-04-01
There is an ongoing discussion whether the Earth's crust is in a critical state and whether this state is permanent or intermittent. Intermittent criticality would allow specification of time-dependent hazard in principle. Analysis of a spatio-temporally evolving synthetic critical point phenomenon and of real seismicity using configurational entropy shows that the method is a suitable approach for the characterisation of critical point dynamics. Results obtained rather support the notion of intermittent criticality in earthquakes. Statistical significance of the findings is assessed by the method of surrogate data.
[Studies on localized low-risk prostate cancer: Do we know enough?]
Weißbach, L; Roloff, C
2018-06-05
Treatment of localized low-risk prostate cancer (PCa) is undergoing a paradigm shift: Invasive treatments such as surgery and radiation therapy are being replaced by defensive strategies such as active surveillance (AS) and watchful waiting (WW). The aim of this work is to evaluate the significance of current studies regarding defensive strategies (AS and WW). The best-known AS studies are critically evaluated for their significance in terms of input criteria, follow-up criteria, and statistical significance. The difficulties faced by randomized studies in answering the question of the best treatment for low-risk cancer in two or even more study groups with known low tumor-specific mortality are clearly shown. Some studies fail because of the objective; others, like PIVOT, are underpowered. ProtecT, a renowned randomized, controlled trial (RCT), lists systematic and statistical shortcomings in detail. The time and effort required for RCTs to answer the question of which therapy is best for locally limited low-risk cancer is very large because the low specific mortality rate requires a large number of participants and a long study duration. In any case, RCTs create hand-picked cohorts for statistical evaluation that have little to do with care in daily clinical practice. The necessary randomization is also offset by the decision-making of the informed patient. If further studies of low-risk PCa are needed, they will need real-world conditions that an RCT cannot provide. To obtain clinically relevant results, we need to rethink things: When planning the study, biometricians and clinicians must understand that the statistical methods used in RCTs are of limited use and they must select a method (e.g. propensity scores) appropriate for health care research.
O'Leary, Neil; Chauhan, Balwantray C; Artes, Paul H
2012-10-01
To establish a method for estimating the overall statistical significance of visual field deterioration from an individual patient's data, and to compare its performance to pointwise linear regression. The Truncated Product Method was used to calculate a statistic S that combines evidence of deterioration from individual test locations in the visual field. The overall statistical significance (P value) of visual field deterioration was inferred by comparing S with its permutation distribution, derived from repeated reordering of the visual field series. Permutation of pointwise linear regression (PoPLR) and pointwise linear regression were evaluated in data from patients with glaucoma (944 eyes, median mean deviation -2.9 dB, interquartile range: -6.3, -1.2 dB) followed for more than 4 years (median 10 examinations over 8 years). False-positive rates were estimated from randomly reordered series of this dataset, and hit rates (proportion of eyes with significant deterioration) were estimated from the original series. The false-positive rates of PoPLR were indistinguishable from the corresponding nominal significance levels and were independent of baseline visual field damage and length of follow-up. At P < 0.05, the hit rates of PoPLR were 12, 29, and 42%, at the fifth, eighth, and final examinations, respectively, and at matching specificities they were consistently higher than those of pointwise linear regression. In contrast to population-based progression analyses, PoPLR provides a continuous estimate of statistical significance for visual field deterioration individualized to a particular patient's data. This allows close control over specificity, essential for monitoring patients in clinical practice and in clinical trials.
Kwon, Deukwoo; Reis, Isildinha M
2015-08-12
When conducting a meta-analysis of a continuous outcome, estimated means and standard deviations from the selected studies are required in order to obtain an overall estimate of the mean effect and its confidence interval. If these quantities are not directly reported in the publications, they must be estimated from other reported summary statistics, such as the median, the minimum, the maximum, and quartiles. We propose a simulation-based estimation approach using the Approximate Bayesian Computation (ABC) technique for estimating mean and standard deviation based on various sets of summary statistics found in published studies. We conduct a simulation study to compare the proposed ABC method with the existing methods of Hozo et al. (2005), Bland (2015), and Wan et al. (2014). In the estimation of the standard deviation, our ABC method performs better than the other methods when data are generated from skewed or heavy-tailed distributions. The corresponding average relative error (ARE) approaches zero as sample size increases. In data generated from the normal distribution, our ABC performs well. However, the Wan et al. method is best for estimating standard deviation under normal distribution. In the estimation of the mean, our ABC method is best regardless of assumed distribution. ABC is a flexible method for estimating the study-specific mean and standard deviation for meta-analysis, especially with underlying skewed or heavy-tailed distributions. The ABC method can be applied using other reported summary statistics such as the posterior mean and 95% credible interval when Bayesian analysis has been employed.
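The idea of recovering a mean and standard deviation from reported summary statistics by simulation can be illustrated with a simple ABC rejection sampler. The sketch below is an assumption-heavy illustration, not the authors' algorithm: it assumes a normal data model, uniform priors bounded by the reported minimum and maximum, an absolute-difference distance on (minimum, median, maximum), and hypothetical reported values.

```python
import random
import statistics


def abc_mean_sd(obs_min, obs_median, obs_max, n, draws=4000, keep=0.02, seed=0):
    """Estimate (mean, sd) from min/median/max by ABC rejection sampling."""
    rng = random.Random(seed)
    spread = obs_max - obs_min
    scored = []
    for _ in range(draws):
        # Draw candidate parameters from the (assumed) uniform priors.
        mu = rng.uniform(obs_min, obs_max)
        sigma = rng.uniform(1e-6, spread)
        # Simulate a study of the same size and summarise it the same way.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        distance = (abs(min(sample) - obs_min)
                    + abs(statistics.median(sample) - obs_median)
                    + abs(max(sample) - obs_max))
        scored.append((distance, mu, sigma))
    # Keep the candidates whose simulated summaries are closest to the report.
    scored.sort()
    kept = scored[:max(1, int(draws * keep))]
    est_mean = statistics.mean([mu for _, mu, _ in kept])
    est_sd = statistics.mean([sigma for _, _, sigma in kept])
    return est_mean, est_sd


# Hypothetical published summary: min 2, median 10, max 18, from 50 subjects.
est_mean, est_sd = abc_mean_sd(2.0, 10.0, 18.0, n=50)
print("estimated mean = %.2f, estimated sd = %.2f" % (est_mean, est_sd))
```

Swapping the normal generator for a skewed or heavy-tailed one is the kind of flexibility the abstract attributes to ABC; the priors, distance and acceptance fraction used here are illustrative choices only.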
Han, Kyunghwa; Jung, Inkyung
2018-05-01
This review article presents an assessment of trends in statistical methods and an evaluation of their appropriateness in articles published in the Archives of Plastic Surgery (APS) from 2012 to 2017. We reviewed 388 original articles published in APS between 2012 and 2017. We categorized the articles that used statistical methods according to the type of statistical method, the number of statistical methods, and the type of statistical software used. We checked whether there were errors in the description of statistical methods and results. A total of 230 articles (59.3%) published in APS between 2012 and 2017 used one or more statistical methods. Within these articles, there were 261 applications of statistical methods with continuous or ordinal outcomes, and 139 applications of statistical methods with categorical outcomes. The Pearson chi-square test (17.4%) and the Mann-Whitney U test (14.4%) were the most frequently used methods. Errors in describing statistical methods and results were found in 133 of the 230 articles (57.8%). Inadequate description of P-values was the most common error (39.1%). Among the 230 articles that used statistical methods, 71.7% provided details about the statistical software programs used for the analyses. SPSS was predominantly used in the articles that presented statistical analyses. We found that the use of statistical methods in APS has increased over the last 6 years. It seems that researchers have been paying more attention to the proper use of statistics in recent years. It is expected that these positive trends will continue in APS.
How to Perform a Systematic Review and Meta-analysis of Diagnostic Imaging Studies.
Cronin, Paul; Kelly, Aine Marie; Altaee, Duaa; Foerster, Bradley; Petrou, Myria; Dwamena, Ben A
2018-05-01
A systematic review is a comprehensive search, critical evaluation, and synthesis of all the relevant studies on a specific (clinical) topic that can be applied to the evaluation of diagnostic and screening imaging studies. It can be a qualitative or a quantitative (meta-analysis) review of available literature. A meta-analysis uses statistical methods to combine and summarize the results of several studies. In this review, a 12-step approach to performing a systematic review (and meta-analysis) is outlined under the four domains: (1) Problem Formulation and Data Acquisition, (2) Quality Appraisal of Eligible Studies, (3) Statistical Analysis of Quantitative Data, and (4) Clinical Interpretation of the Evidence. This review is specifically geared toward the performance of a systematic review and meta-analysis of diagnostic test accuracy (imaging) studies. Copyright © 2018 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
Advani, Aneel; Jones, Neil; Shahar, Yuval; Goldstein, Mary K; Musen, Mark A
2004-01-01
We develop a method and algorithm for deciding the optimal approach to creating quality-auditing protocols for guideline-based clinical performance measures. An important element of the audit protocol design problem is deciding which guideline elements to audit. Specifically, the problem is how and when to aggregate individual patient case-specific guideline elements into population-based quality measures. The key statistical issue involved is the trade-off between increased reliability with more general population-based quality measures versus increased validity from individually case-adjusted but more restricted measures done at a greater audit cost. Our intelligent algorithm for auditing protocol design is based on hierarchically modeling incrementally case-adjusted quality constraints. We select quality constraints to measure using an optimization criterion based on statistical generalizability coefficients. We present results of the approach from a deployed decision support system for a hypertension guideline.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Young, M; Craft, D
Purpose: To develop an efficient, pathway-based classification system using network biology statistics to assist in patient-specific response predictions to radiation and drug therapies across multiple cancer types. Methods: We developed PICS (Pathway Informed Classification System), a novel two-step cancer classification algorithm. In PICS, a matrix m of mRNA expression values for a patient cohort is collapsed into a matrix p of biological pathways. The entries of p, which we term pathway scores, are obtained from either principal component analysis (PCA), normal tissue centroid (NTC), or gene expression deviation (GED). The pathway score matrix is clustered using both k-means and hierarchical clustering, and a clustering is judged by how well it groups patients into distinct survival classes. The most effective pathway scoring/clustering combination, per clustering p-value, thus generates various 'signatures' for conventional and functional cancer classification. Results: PICS successfully regularized large dimension gene data, separated normal and cancerous tissues, and clustered a large patient cohort spanning six cancer types. Furthermore, PICS clustered patient cohorts into distinct, statistically-significant survival groups. For a suboptimally-debulked ovarian cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00127) showed significant improvement over that of a prior gene expression-classified study (p = .0179). For a pancreatic cancer set, the pathway-classified Kaplan-Meier survival curve (p = .00141) showed significant improvement over that of a prior gene expression-classified study (p = .04). Pathway-based classification confirmed biomarkers for the pyrimidine, WNT-signaling, glycerophosphoglycerol, beta-alanine, and pantothenic acid pathways for ovarian cancer. Despite its robust nature, PICS requires significantly less run time than current pathway scoring methods.
Conclusion: This work validates the PICS method to improve cancer classification using biological pathways. Patients are classified with greater specificity and physiological relevance as compared to current gene-specific approaches. Focus now moves to utilizing PICS for pan-cancer patient-specific treatment response prediction.
Takeyoshi, Masahiro; Sawaki, Masakuni; Yamasaki, Kanji; Kimber, Ian
2003-09-30
The murine local lymph node assay (LLNA) is used for the identification of chemicals that have the potential to cause skin sensitization. However, it requires specific facilities and handling procedures to accommodate a radioisotopic (RI) endpoint. We have developed a non-radioisotopic (non-RI) endpoint for the LLNA based on BrdU incorporation to avoid the use of RI. Although this alternative method appears viable in principle, it is somewhat less sensitive than the standard assay. In this study, we report investigations to determine whether statistical analysis can improve the sensitivity of a non-RI LLNA procedure with alpha-hexylcinnamic aldehyde (HCA) in two separate experiments. The alternative non-RI method required HCA concentrations of greater than 25% to elicit a positive response based on the criterion for classification as a skin sensitizer in the standard LLNA. Nevertheless, dose responses to HCA in the alternative method were consistent in both experiments and we examined whether the use of an endpoint based upon the statistical significance of induced changes in LNC turnover, rather than an SI of 3 or greater, might provide for additional sensitivity. The results reported here demonstrate that, with HCA at least, significant responses were recorded in each of two experiments following exposure of mice to 25% of HCA. These data suggest that this approach may be more satisfactory, at least when BrdU incorporation is measured. However, this modification of the LLNA is rather less sensitive than the standard method even when a statistical endpoint is employed. Taken together, the data reported here suggest that a modified LLNA in which BrdU is used in place of radioisotope incorporation shows some promise, but that in its present form, even with the use of a statistical endpoint, it lacks some of the sensitivity of the standard method. The challenge is to develop strategies for further refinement of this approach.
Statistical methods used in articles published by the Journal of Periodontal and Implant Science.
Choi, Eunsil; Lyu, Jiyoung; Park, Jinyoung; Kim, Hae-Young
2014-12-01
The purposes of this study were to assess the trend of use of statistical methods including parametric and nonparametric methods and to evaluate the use of complex statistical methodology in recent periodontal studies. This study analyzed 123 articles published in the Journal of Periodontal & Implant Science (JPIS) between 2010 and 2014. Frequencies and percentages were calculated according to the number of statistical methods used, the type of statistical method applied, and the type of statistical software used. Most of the published articles considered (64.4%) used statistical methods. Since 2011, the percentage of JPIS articles using statistics has increased. On the basis of multiple counting, we found that the percentage of studies in JPIS using parametric methods was 61.1%. Further, complex statistical methods were applied in only 6 of the published studies (5.0%), and nonparametric statistical methods were applied in 77 of the published studies (38.9% of a total of 198 studies considered). We found an increasing trend towards the application of statistical methods and nonparametric methods in recent periodontal studies and thus, concluded that increased use of complex statistical methodology might be preferred by the researchers in the fields of study covered by JPIS.
Allele-specific copy-number discovery from whole-genome and whole-exome sequencing
Wang, WeiBo; Wang, Wei; Sun, Wei; Crowley, James J.; Szatkiewicz, Jin P.
2015-01-01
Copy-number variants (CNVs) are a major form of genetic variation and a risk factor for various human diseases, so it is crucial to accurately detect and characterize them. It is conceivable that allele-specific reads from high-throughput sequencing data could be leveraged to both enhance CNV detection and produce allele-specific copy number (ASCN) calls. Although statistical methods have been developed to detect CNVs using whole-genome sequence (WGS) and/or whole-exome sequence (WES) data, information from allele-specific read counts has not yet been adequately exploited. In this paper, we develop an integrated method, called AS-GENSENG, which incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data. To evaluate the performance of AS-GENSENG, we conducted extensive simulations, generated empirical data using existing WGS and WES data sets and validated predicted CNVs using an independent methodology. We conclude that AS-GENSENG not only predicts accurate ASCN calls but also improves the accuracy of total copy number calls, owing to its unique ability to exploit information from both total and allele-specific read counts while accounting for various experimental biases in sequence data. Our novel, user-friendly and computationally efficient method and a complete analytic protocol are freely available at https://sourceforge.net/projects/asgenseng/. PMID:25883151
A feature refinement approach for statistical interior CT reconstruction
NASA Astrophysics Data System (ADS)
Hu, Zhanli; Zhang, Yunwan; Liu, Jianbo; Ma, Jianhua; Zheng, Hairong; Liang, Dong
2016-07-01
Interior tomography is clinically desired to reduce the radiation dose rendered to patients. In this work, a new statistical interior tomography approach for computed tomography is proposed. The developed design focuses on taking into account the statistical nature of local projection data and recovering fine structures which are lost in the conventional total-variation (TV)-minimization reconstruction. The proposed method falls within the compressed sensing framework of TV minimization, which only assumes that the interior ROI is piecewise constant or polynomial and does not need any additional prior knowledge. To integrate the statistical distribution property of projection data, the objective function is built under the criteria of penalized weighted least-squares (PWLS-TV). In the implementation of the proposed method, the interior projection extrapolation based FBP reconstruction is first used as the initial guess to mitigate truncation artifacts and also provide an extended field-of-view. Moreover, an interior feature refinement step, as an important processing operation, is performed after each iteration of PWLS-TV to recover the desired structure information which is lost during the TV minimization. Here, a feature descriptor is specifically designed and employed to distinguish structure from noise and noise-like artifacts. A modified steepest descent algorithm is adopted to minimize the associated objective function. The proposed method is applied to both digital phantom and in vivo Micro-CT datasets, and compared to FBP, ART-TV and PWLS-TV. The reconstruction results demonstrate that the proposed method performs better than other conventional methods in suppressing noise, reducing truncated and streak artifacts, and preserving features. The proposed approach demonstrates its potential usefulness for feature preservation of interior tomography under truncated projection measurements.
A feature refinement approach for statistical interior CT reconstruction.
Hu, Zhanli; Zhang, Yunwan; Liu, Jianbo; Ma, Jianhua; Zheng, Hairong; Liang, Dong
2016-07-21
Interior tomography is clinically desired to reduce the radiation dose rendered to patients. In this work, a new statistical interior tomography approach for computed tomography is proposed. The developed design focuses on taking into account the statistical nature of local projection data and recovering fine structures which are lost in the conventional total-variation (TV)-minimization reconstruction. The proposed method falls within the compressed sensing framework of TV minimization, which only assumes that the interior ROI is piecewise constant or polynomial and does not need any additional prior knowledge. To integrate the statistical distribution property of projection data, the objective function is built under the criteria of penalized weighted least-squares (PWLS-TV). In the implementation of the proposed method, the interior projection extrapolation based FBP reconstruction is first used as the initial guess to mitigate truncation artifacts and also provide an extended field-of-view. Moreover, an interior feature refinement step, as an important processing operation, is performed after each iteration of PWLS-TV to recover the desired structure information which is lost during the TV minimization. Here, a feature descriptor is specifically designed and employed to distinguish structure from noise and noise-like artifacts. A modified steepest descent algorithm is adopted to minimize the associated objective function. The proposed method is applied to both digital phantom and in vivo Micro-CT datasets, and compared to FBP, ART-TV and PWLS-TV. The reconstruction results demonstrate that the proposed method performs better than other conventional methods in suppressing noise, reducing truncated and streak artifacts, and preserving features. The proposed approach demonstrates its potential usefulness for feature preservation of interior tomography under truncated projection measurements.
Huttary, Rudolf; Goubergrits, Leonid; Schütte, Christof; Bernhard, Stefan
2017-08-01
It has not yet been possible to obtain modeling approaches suitable for covering a wide range of real world scenarios in cardiovascular physiology because many of the system parameters are uncertain or even unknown. Natural variability and statistical variation of cardiovascular system parameters in healthy and diseased conditions are characteristic features for understanding cardiovascular diseases in more detail. This paper presents SISCA, a novel software framework for cardiovascular system modeling and its MATLAB implementation. The framework defines a multi-model statistical ensemble approach for dimension reduced, multi-compartment models and focuses on statistical variation, system identification and patient-specific simulation based on clinical data. We also discuss a data-driven modeling scenario as a use case example. The regarded dataset originated from routine clinical examinations and comprised typical pre and post surgery clinical data from a patient diagnosed with coarctation of aorta. We conducted patient and disease specific pre/post surgery modeling by adapting a validated nominal multi-compartment model with respect to structure and parametrization using metadata and MRI geometry. In both models, the simulation reproduced measured pressures and flows fairly well with respect to stenosis and stent treatment and by pre-treatment cross stenosis phase shift of the pulse wave. However, with post-treatment data showing unrealistic phase shifts and other more obvious inconsistencies within the dataset, the methods and results we present suggest that conditioning and uncertainty management of routine clinical data sets needs significantly more attention to obtain reasonable results in patient-specific cardiovascular modeling. Copyright © 2017 Elsevier Ltd. All rights reserved.
Spatial and spatiotemporal pattern analysis of coconut lethal yellowing in Mozambique.
Bonnot, F; de Franqueville, H; Lourenço, E
2010-04-01
Coconut lethal yellowing (LY) is caused by a phytoplasma and is a major threat for coconut production throughout its growing area. Incidence of LY was monitored visually on every coconut tree in six fields in Mozambique for 34 months. Disease progress curves were plotted and average monthly disease incidence was estimated. Spatial patterns of disease incidence were analyzed at six assessment times. Aggregation was tested by the coefficient of spatial autocorrelation of the beta-binomial distribution of diseased trees in quadrats. The binary power law was used as an assessment of overdispersion across the six fields. Spatial autocorrelation between symptomatic trees was measured by the BB join count statistic based on the number of pairs of diseased trees separated by a specific distance and orientation, and tested using permutation methods. Aggregation of symptomatic trees was detected in every field in both cumulative and new cases. Spatiotemporal patterns were analyzed with two methods. The proximity of symptomatic trees at two assessment times was investigated using the spatiotemporal BB join count statistic based on the number of pairs of trees separated by a specific distance and orientation and exhibiting the first symptoms of LY at the two times. The semivariogram of times of appearance of LY was calculated to characterize how the lag between times of appearance of LY was related to the distance between symptomatic trees. Both statistics were tested using permutation methods. A tendency for new cases to appear in the proximity of previously diseased trees and a spatially structured pattern of times of appearance of LY within clusters of diseased trees were detected, suggesting secondary spread of the disease.
2011-01-01
Background The advent of ChIP-seq technology has made the investigation of epigenetic regulatory networks a computationally tractable problem. Several groups have applied statistical computing methods to ChIP-seq datasets to gain insight into the epigenetic regulation of transcription. However, methods for estimating enrichment levels in ChIP-seq data for these computational studies are understudied and variable. Since the conclusions drawn from these data mining and machine learning applications strongly depend on the enrichment level inputs, a comparison of estimation methods with respect to the performance of statistical models should be made. Results Various methods were used to estimate the gene-wise ChIP-seq enrichment levels for 20 histone methylations and the histone variant H2A.Z. The Multivariate Adaptive Regression Splines (MARS) algorithm was applied for each estimation method using the estimation of enrichment levels as predictors and gene expression levels as responses. The methods used to estimate enrichment levels included tag counting and model-based methods that were applied to whole genes and specific gene regions. These methods were also applied to various sizes of estimation windows. The MARS model performance was assessed with the Generalized Cross-Validation Score (GCV). We determined that model-based methods of enrichment estimation that spatially weight enrichment based on average patterns provided an improvement over tag counting methods. Also, methods that included information across the entire gene body provided improvement over methods that focus on a specific sub-region of the gene (e.g., the 5' or 3' region). Conclusion The performance of data mining and machine learning methods when applied to histone modification ChIP-seq data can be improved by using data across the entire gene body, and incorporating the spatial distribution of enrichment. Refinement of enrichment estimation ultimately improved accuracy of model predictions. 
PMID:21834981
Li, M H; Liu, Y; Liu, L S; Li, P X; Chen, Q
2016-05-24
To investigate the value of real-time tissue elastography and 3D contrast-enhanced ultrasonography (CEUS) in the differential diagnosis of breast lumps. A total of 126 patients (180 lumps) with breast masses, examined from December 2012 to December 2014 at the Tumor Hospital Affiliated to Xinjiang Medical University, were retrospectively analyzed. All patients were divided into three groups using a stratified random method. The groups were examined by real-time tissue elastography, by 3D CEUS, and by the two methods combined, respectively. Each group comprised 42 cases (60 lumps), with the pathological results serving as the gold standard. The diagnostic sensitivity, specificity and coincidence rate of the different methods were compared. On contrast-enhanced ultrasound, the benign masses showed punctate, linear and nodular enhancement, and the border of enhancement was smooth. The malignant tumors were mainly characterized by uneven, high enhancement. There was no statistically significant difference in sensitivity, specificity or coincidence rate between the elastography group and the 3D CEUS group (64.7% vs 73.5%, 69.2% vs 76.9%, 66.7% vs 75.0%, all P>0.05). The sensitivity, specificity and coincidence rate of the combined inspection group were higher than those of the elastography group and the 3D CEUS group, and the differences were statistically significant (97.1%, 92.3% and 98.3%, respectively; all P<0.05). 3D CEUS combined with real-time tissue elastography is of high value in the diagnosis of breast masses.
Bieber, Frederick R; Buckleton, John S; Budowle, Bruce; Butler, John M; Coble, Michael D
2016-08-31
The evaluation and interpretation of forensic DNA mixture evidence faces greater interpretational challenges due to increasingly complex mixture evidence. Such challenges include: casework involving low quantity or degraded evidence leading to allele and locus dropout; allele sharing of contributors leading to allele stacking; and differentiation of PCR stutter artifacts from true alleles. There is variation in statistical approaches used to evaluate the strength of the evidence when inclusion of a specific known individual(s) is determined, and the approaches used must be supportable. There are concerns that methods utilized for interpretation of complex forensic DNA mixtures may not be implemented properly in some casework. Similar questions are being raised in a number of U.S. jurisdictions, leading to some confusion about mixture interpretation for current and previous casework. Key elements necessary for the interpretation and statistical evaluation of forensic DNA mixtures are described. Given that the most common method for statistical evaluation of DNA mixtures in many parts of the world, including the USA, is the Combined Probability of Inclusion/Exclusion (CPI/CPE), exposition and elucidation of this method and a protocol for its use are the focus of this article. Formulae and other supporting materials are provided. Guidance and details of a DNA mixture interpretation protocol are provided for application of the CPI/CPE method in the analysis of more complex forensic DNA mixtures. This description, in turn, should help reduce the variability of interpretation with application of this methodology and thereby improve the quality of DNA mixture interpretation throughout the forensic community.
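As a rough illustration of the CPI/CPE statistic named in the abstract above, the sketch below uses the commonly cited textbook form: at each locus the frequencies of all alleles observed in the mixture are summed and squared, and the per-locus values are multiplied across loci. The function name and the allele frequencies are hypothetical, and the article's full protocol (for example, rules for excluding loci affected by dropout) is not reproduced here.

```python
from math import prod


def combined_probability_of_inclusion(loci):
    """Return the CPI for a mixture.

    loci: list of lists; each inner list holds the population frequencies
    of the alleles observed in the mixture at one locus (assumed input).
    """
    # Probability of inclusion at one locus: (sum of allele frequencies)^2
    per_locus = [sum(freqs) ** 2 for freqs in loci]
    # Loci are treated as independent, so per-locus values are multiplied
    return prod(per_locus)


# Hypothetical allele frequencies for two loci (illustration only)
example_loci = [[0.1, 0.2, 0.2], [0.25, 0.25]]
cpi = combined_probability_of_inclusion(example_loci)
cpe = 1.0 - cpi  # combined probability of exclusion
print(f"CPI = {cpi:.4f}, CPE = {cpe:.4f}")
```

A smaller CPI means that fewer random individuals would be included as possible contributors, so the evidence against an included person of interest is stronger.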
NASA Astrophysics Data System (ADS)
Clerc, F.; Njiki-Menga, G.-H.; Witschger, O.
2013-04-01
Most of the measurement strategies that are suggested at the international level to assess workplace exposure to nanomaterials rely on devices measuring, in real time, airborne particle concentrations (according to different metrics). Since none of the instruments used to measure aerosols can distinguish a particle of interest from the background aerosol, the statistical analysis of time-resolved data requires special attention. So far, very few approaches have been used for statistical analysis in the literature. These range from simple qualitative analysis of graphs to the implementation of more complex statistical models. To date, there is still no consensus on a particular approach, and the search for an appropriate and robust method is ongoing. In this context, this exploratory study investigates a statistical method to analyse time-resolved data based on a Bayesian probabilistic approach. To investigate and illustrate the use of this statistical method, particle number concentration data have been used from a workplace study that investigated the potential for exposure via inhalation from cleanout operations by sandpapering of a reactor producing nanocomposite thin films. In this workplace study, the background issue has been addressed through the near-field and far-field approaches, and several size-integrated and time-resolved devices have been used. The analysis of the results presented here focuses only on data obtained with two handheld condensation particle counters. While one was measuring at the source of the released particles, the other one was measuring in parallel in the far field. The Bayesian probabilistic approach allows a probabilistic modelling of data series, and the observed task is modelled in the form of probability distributions.
The probability distributions derived from time-resolved data obtained at the source can be compared with the probability distributions derived from the time-resolved data obtained in the far field, leading to a quantitative estimation of the airborne particles released at the source when the task is performed. Beyond the results obtained, this exploratory study indicates that the analysis of the results requires specific experience in statistics.
Wildfire cluster detection using space-time scan statistics
NASA Astrophysics Data System (ADS)
Tonini, M.; Tuia, D.; Ratle, F.; Kanevski, M.
2009-04-01
The aim of the present study is to identify spatio-temporal clusters of fire sequences using space-time scan statistics. These statistical methods are specifically designed to detect clusters and assess their significance. Basically, scan statistics work by comparing a set of events occurring inside a scanning window (or a space-time cylinder for spatio-temporal data) with those that lie outside. Windows of increasing size scan the zone across space and time: the likelihood ratio is calculated for each window (comparing the ratio "observed cases over expected" inside and outside), the window with the maximum value is assumed to be the most probable cluster, and so on. Under the null hypothesis of spatial and temporal randomness, these events are distributed according to a known discrete-state random process (Poisson or Bernoulli), whose parameters can be estimated. Given this assumption, it is possible to test whether or not the null hypothesis holds in a specific area. In order to deal with fire data, the space-time permutation scan statistic has been applied since it does not require the explicit specification of the population at risk in each cylinder. The case study is represented by Florida daily fire detection using the Moderate Resolution Imaging Spectroradiometer (MODIS) active fire product during the period 2003-2006. As a result, statistically significant clusters have been identified. Performing the analyses over the entire time frame, three out of the five most likely clusters have been identified in the forest areas, in the north of the state; the other two clusters cover a large zone in the south, corresponding to agricultural land and the prairies in the Everglades. Furthermore, the analyses have been performed separately for the four years to analyze whether the wildfires recur each year during the same period.
It emerges that clusters of forest fires are more frequent in hot seasons (spring and summer), while in the southern areas they are widely present throughout the whole year. The analysis of fire distribution, to evaluate whether fires are statistically more frequent in some areas and/or in some periods of the year, can be useful to support fire management and to focus prevention measures.
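The window comparison described above can be sketched in a few lines. The example below is a minimal illustration, assuming the usual space-time permutation formulation in which the expected count of a cylinder is built from the zone and day marginals, and the score is a Poisson-type log-likelihood ratio. The grid of counts, the function names, and the chosen cylinder are hypothetical; the Monte Carlo permutation step used to assess significance is not shown.

```python
from math import log


def expected_count(counts, zones, days):
    """Expected cases in a cylinder under the space-time permutation null.

    counts[z][d] is the number of events in zone z on day d.
    """
    total = sum(sum(row) for row in counts)
    mu = 0.0
    for z in zones:
        zone_total = sum(counts[z])
        for d in days:
            day_total = sum(row[d] for row in counts)
            # Product of marginals divided by the grand total
            mu += zone_total * day_total / total
    return mu


def log_likelihood_ratio(c, mu, total):
    """Poisson-type log-likelihood ratio; zero unless the cylinder has an excess."""
    if c <= mu:
        return 0.0
    llr = c * log(c / mu)
    if total > c:
        llr += (total - c) * log((total - c) / (total - mu))
    return llr


# Hypothetical counts: 3 zones x 4 days (illustration only)
counts = [[1, 1, 1, 1],
          [1, 8, 9, 1],
          [1, 1, 1, 1]]
zones, days = [1], [1, 2]  # candidate cylinder: zone 1 on days 1 and 2
total = sum(sum(row) for row in counts)
observed = sum(counts[z][d] for z in zones for d in days)
mu = expected_count(counts, zones, days)
print(observed, round(mu, 3), round(log_likelihood_ratio(observed, mu, total), 3))
```

In a full analysis this score would be computed for every candidate cylinder, and the maximum would be compared with the maxima obtained from data sets in which event times are randomly permuted among locations.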
NASA Astrophysics Data System (ADS)
Newman, Brent D.; Havenor, Kay C.; Longmire, Patrick
2016-06-01
Analysis of groundwater chemistry can yield important insights about subsurface conditions, and provide an alternative and complementary method for characterizing basin hydrogeology, especially in areas where hydraulic data are limited. More specifically, hydrochemical facies have been used for decades to help understand basin flow and transport, and a set of facies were developed for the Roswell Artesian Basin (RAB) in a semi-arid part of New Mexico, USA. The RAB is an important agricultural water source, and is an excellent example of a rechargeable artesian system. However, substantial uncertainties about the RAB hydrogeology and groundwater chemistry exist. The RAB was a great opportunity to explore hydrochemical facies definition. A set of facies, derived from fingerprint diagrams (graphical approach), existed as a basis for testing and for comparison to principal components, factor analysis, and cluster analyses (statistical approaches). Geochemical data from over 300 RAB wells in the central basin were examined. The statistical testing of fingerprint-diagram-based facies was useful in terms of quantitatively evaluating differences between facies, and for understanding potential controls on basin groundwater chemistry. This study suggests the presence of three hydrochemical facies in the shallower part of the RAB (mostly unconfined conditions) and three in the deeper artesian system of the RAB. These facies reflect significant spatial differences in chemistry in the basin that are associated with specific stratigraphic intervals as well as structural features. Substantial chemical variability across faults and within fault blocks was also observed.
Correcting evaluation bias of relational classifiers with network cross validation
Neville, Jennifer; Gallagher, Brian; Eliassi-Rad, Tina; ...
2011-01-04
Recently, a number of modeling techniques have been developed for data mining and machine learning in relational and network domains where the instances are not independent and identically distributed (i.i.d.). These methods specifically exploit the statistical dependencies among instances in order to improve classification accuracy. However, there has been little focus on how these same dependencies affect our ability to draw accurate conclusions about the performance of the models. More specifically, the complex link structure and attribute dependencies in relational data violate the assumptions of many conventional statistical tests and make it difficult to use these tests to assess the models in an unbiased manner. In this work, we examine the task of within-network classification and the question of whether two algorithms will learn models that will result in significantly different levels of performance. We show that the commonly used form of evaluation (paired t-test on overlapping network samples) can result in an unacceptable level of Type I error. Furthermore, we show that Type I error increases as (1) the correlation among instances increases and (2) the size of the evaluation set increases (i.e., the proportion of labeled nodes in the network decreases). Lastly, we propose a method for network cross-validation that combined with paired t-tests produces more acceptable levels of Type I error while still providing reasonable levels of statistical power (i.e., 1–Type II error).
Shariati, Laleh; Validi, Majid; Tabatabaiefar, Mohammad Amin; Karimi, Ali; Nafisi, Mohammad Reza
2010-12-01
Methicillin-resistant Staphylococcus aureus (MRSA) is a nosocomial pathogen. Our main objective was to compare oxacillin disk test, oxacillin E-test, and oxacillin agar screen for detection of methicillin resistance in S. aureus, using real-time PCR for mecA as the "gold standard" comparison assay. 196 S. aureus isolates were identified out of 284 Staphylococcus isolates. These isolates were screened for MRSA with several methods: disk diffusion, agar screen (6.0 μg/ml), oxacillin E-test, and real-time PCR for detection of mecA gene. Of the 196 S. aureus isolates tested, 96 isolates (49%) were mecA-positive and 100 isolates (51%) mecA-negative. All methods tested had a statistically significant agreement with real-time PCR. E-test was 100% sensitive and specific for mecA presence. The sensitivity and specificity of oxacillin agar screen method were 98 and 99%, respectively and sensitivity and specificity of oxacillin disk diffusion method were 95 and 93%, respectively. In the present study, oxacillin E-test is proposed as the best phenotypic method. For economic reasons, the oxacillin agar screen method (6.0 μg/ml), which is suitable for the detection of MRSA, is recommended due to its accuracy and low cost.
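The sensitivity and specificity figures quoted above come from comparing each phenotypic test against the reference assay. A minimal sketch of that calculation follows; the function name and the toy result vectors are hypothetical and are not the study's data.

```python
def sensitivity_specificity(reference, test):
    """Compare a test against a reference standard.

    reference, test: equal-length sequences of booleans, where True means
    "resistant" (e.g. mecA-positive for the reference assay).
    """
    pairs = list(zip(reference, test))
    tp = sum(1 for r, t in pairs if r and t)          # true positives
    fn = sum(1 for r, t in pairs if r and not t)      # false negatives
    tn = sum(1 for r, t in pairs if not r and not t)  # true negatives
    fp = sum(1 for r, t in pairs if not r and t)      # false positives
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Toy data for illustration only (8 isolates)
reference = [True, True, True, True, False, False, False, False]
phenotypic = [True, True, True, False, False, False, False, True]
sens, spec = sensitivity_specificity(reference, phenotypic)
print(sens, spec)
```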
Cardone, A.; Bornstein, A.; Pant, H. C.; Brady, M.; Sriram, R.; Hassan, S. A.
2015-01-01
A method is proposed to study protein-ligand binding in a system governed by specific and non-specific interactions. Strong associations lead to narrow distributions in the proteins configuration space; weak and ultra-weak associations lead instead to broader distributions, a manifestation of non-specific, sparsely-populated binding modes with multiple interfaces. The method is based on the notion that a discrete set of preferential first-encounter modes are metastable states from which stable (pre-relaxation) complexes at equilibrium evolve. The method can be used to explore alternative pathways of complexation with statistical significance and can be integrated into a general algorithm to study protein interaction networks. The method is applied to a peptide-protein complex. The peptide adopts several low-population conformers and binds in a variety of modes with a broad range of affinities. The system is thus well suited to analyze general features of binding, including conformational selection, multiplicity of binding modes, and nonspecific interactions, and to illustrate how the method can be applied to study these problems systematically. The equilibrium distributions can be used to generate biasing functions for simulations of multiprotein systems from which bulk thermodynamic quantities can be calculated. PMID:25782918
Messai, Habib; Farman, Muhammad; Sarraj-Laabidi, Abir; Hammami-Semmar, Asma; Semmar, Nabil
2016-11-17
Olive oils (OOs) show high chemical variability due to several factors of genetic, environmental and anthropic types. Genetic and environmental factors are responsible for natural compositions and polymorphic diversification resulting in different varietal patterns and phenotypes. Anthropic factors, however, are at the origin of different blends' preparation leading to normative, labelled or adulterated commercial products. Control of complex OO samples requires their (i) characterization by specific markers; (ii) authentication by fingerprint patterns; and (iii) monitoring by traceability analysis. These quality control and management aims require the use of several multivariate statistical tools: specificity highlighting requires ordination methods; authentication checking calls for classification and pattern recognition methods; traceability analysis implies the use of network-based approaches able to separate or extract mixed information and memorized signals from complex matrices. This chapter presents a review of different chemometrics methods applied for the control of OO variability from metabolic and physical-chemical measured characteristics. The different chemometrics methods are illustrated by different study cases on monovarietal and blended OO originated from different countries. Chemometrics tools offer multiple ways for quantitative evaluations and qualitative control of complex chemical variability of OO in relation to several intrinsic and extrinsic factors.
Spector, Paul E.
2016-01-01
Background Safety climate, violence prevention climate, and civility climate were independently developed and linked to domain-specific workplace hazards, although all three were designed to promote the physical and psychological safety of workers. Purpose To test domain specificity between conceptually related workplace climates and relevant workplace hazards. Methods Data were collected from 368 persons employed in various industries and descriptive statistics were calculated for all study variables. Correlational and relative weights analyses were used to test for domain specificity. Results The three climate domains were similarly predictive of most workplace hazards, regardless of domain specificity. Discussion This study suggests that the three climate domains share a common higher order construct that may predict relevant workplace hazards better than any of the scales alone. PMID:27110930
USDA-ARS's Scientific Manuscript database
The role that BMI plays in the association between dietary quality and CVD risk is not known. We aimed to better understand this relationship using statistical methods which correct for sex-specific underreporting of dietary intake. Overall, dietary quality was assessed using the Healthy Eating Inde...
ERIC Educational Resources Information Center
Grimm, Kevin; Marcoulides, Katerina
2016-01-01
Researchers are often interested in studying how the timing of a specific event affects concurrent and future development. When faced with such research questions there are multiple statistical models to consider and those models are the focus of this paper as well as their theoretical underpinnings and assumptions regarding the nature of the…
Estimation of critical behavior from the density of states in classical statistical models
NASA Astrophysics Data System (ADS)
Malakis, A.; Peratzakis, A.; Fytas, N. G.
2004-12-01
We present a simple and efficient approximation scheme which greatly facilitates the extension of Wang-Landau sampling (or similar techniques) in large systems for the estimation of critical behavior. The method, presented in an algorithmic approach, is based on a very simple idea, familiar in statistical mechanics from the notion of thermodynamic equivalence of ensembles and the central limit theorem. It is illustrated that we can predict with high accuracy the critical part of the energy space and by using this restricted part we can extend our simulations to larger systems and improve the accuracy of critical parameters. It is proposed that the extensions of the finite-size critical part of the energy space, determining the specific heat, satisfy a scaling law involving the thermal critical exponent. The method is applied successfully for the estimation of the scaling behavior of specific heat of both square and simple cubic Ising lattices. The proposed scaling law is verified by estimating the thermal critical exponent from the finite-size behavior of the critical part of the energy space. The density of states of the zero-field Ising model on these lattices is obtained via a multirange Wang-Landau sampling.
Gao, Bin; Li, Xiaoqing; Woo, Wai Lok; Tian, Gui Yun
2018-05-01
Thermographic inspection has been widely applied to non-destructive testing and evaluation because it offers rapid, contactless detection over large surface areas. Image segmentation is considered essential for identifying and sizing defects. To attain high performance, specific physics-based models that describe defect generation and enable precise extraction of the target region are of crucial importance. In this paper, an effective genetic first-order statistical image segmentation algorithm is proposed for quantitative crack detection. The proposed method automatically extracts valuable spatial-temporal patterns using an unsupervised feature extraction algorithm and avoids a range of issues associated with human intervention in the laborious manual selection of specific thermal video frames for processing. An internal genetic functionality is built into the proposed algorithm to automatically control the segmentation threshold and thereby improve accuracy in sizing the cracks. Eddy current pulsed thermography is used as a platform to demonstrate surface crack detection. Experimental tests and comparisons have been conducted to verify the efficacy of the proposed method. In addition, a global quantitative assessment index, the F-score, has been adopted to objectively evaluate the performance of different segmentation algorithms.
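For readers unfamiliar with the F-score used above to compare segmentation algorithms, the following is a minimal sketch of how it is computed from a predicted mask and a ground-truth mask. The function name `f_score`, the flattened pixel lists, and the `beta` parameter are illustrative assumptions; the abstract does not say which variant of the F-score the authors used.

```python
def f_score(predicted, truth, beta=1.0):
    """F-score of a binary segmentation against a reference mask.

    `predicted` and `truth` are equal-length sequences of 0/1 pixel labels
    (hypothetical flattened masks, not data from the paper).
    """
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)          # defect pixels found
    fp = sum(1 for p, t in pairs if p and not t)      # false alarms
    fn = sum(1 for p, t in pairs if not p and t)      # missed defect pixels
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


truth = [0, 1, 1, 1, 0, 0, 1, 0]
predicted = [0, 1, 1, 0, 1, 0, 1, 0]
score = f_score(predicted, truth)
print(f"F1 = {score:.2f}")
```

With `beta=1.0` precision and recall are weighted equally; a larger `beta` would favour recall, which may matter when missing a crack is costlier than a false alarm.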
Huang, Shuguang; Yeo, Adeline A; Li, Shuyu Dan
2007-10-01
The Kolmogorov-Smirnov (K-S) test is a statistical method often used for comparing two distributions. In high-throughput screening (HTS) studies, such distributions usually arise from the phenotype of independent cell populations. However, the K-S test has been criticized for being overly sensitive in applications, and it often detects a statistically significant difference that is not biologically meaningful. One major reason is that systematic drift among the distributions is common in HTS studies, owing to instrument variation, plate edge effects, accidental differences in sample handling, etc. In particular, in high-content cellular imaging experiments, the location shift can be dramatic because some compounds are themselves fluorescent. This oversensitivity of the K-S test is particularly pronounced in cellular assays, where the sample sizes are very large (usually several thousand). In this paper, a modified K-S test is proposed to deal with the nonspecific location-shift problem in HTS studies. Specifically, we propose that the distributions be "normalized" by density curve alignment before the K-S test is conducted. In applications to simulated data and real experimental data, the results show that the proposed method has improved specificity.
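The idea of removing a nonspecific location shift before testing can be illustrated with a small sketch. Note the assumption: the paper aligns density curves, whereas this example simply subtracts each sample's median, which is a simpler stand-in and not the authors' procedure. The helper names and the toy data are hypothetical.

```python
from bisect import bisect_right
from statistics import median


def ks_statistic(a, b):
    """Two-sample K-S statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        fa = bisect_right(a, x) / len(a)   # fraction of a that is <= x
        fb = bisect_right(b, x) / len(b)   # fraction of b that is <= x
        d = max(d, abs(fa - fb))
    return d


def centered(sample):
    """Remove the location of a sample by subtracting its median (assumed alignment)."""
    m = median(sample)
    return [x - m for x in sample]


# Toy data: identical shape, pure location shift (e.g. plate-to-plate drift).
control = list(range(100))
shifted = [x + 50 for x in control]

raw = ks_statistic(control, shifted)
aligned = ks_statistic(centered(control), centered(shifted))
print(raw, aligned)
```

On this toy data the raw statistic is large even though the two samples differ only by a shift, while the aligned statistic vanishes; differences in shape would still be detected after alignment.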
Aerodynamic method for obtaining the soil water retention curve
NASA Astrophysics Data System (ADS)
Alekseev, V. V.; Maksimov, I. I.
2013-07-01
A new method for the rapid plotting of the soil water retention curve (SWRC) has been proposed that considers the soil water as an environment limited by the soil solid phase on one side and by the soil air on the other side. Both contact surfaces have surface energies, which play the main role in water retention. The use of an idealized soil model with consideration for the nonequilibrium thermodynamic laws and the aerodynamic similarity principles allows us to estimate the volumetric specific surface areas of soils and, using the proposed pedotransfer function (PTF), to plot the SWRC. The volumetric specific surface area of the solid phase, the porosity, and the specific free surface energy at the water-air interface are used as the SWRC parameters. Devices for measuring the parameters are briefly described. The differences between the proposed PTF and the experimental data have been analyzed using the statistical processing of the data.
The Statistical Consulting Center for Astronomy (SCCA)
NASA Technical Reports Server (NTRS)
Akritas, Michael
2001-01-01
The process by which raw astronomical data acquisition is transformed into scientifically meaningful results and interpretation typically involves many statistical steps. Traditional astronomy limits itself to a narrow range of old and familiar statistical methods: means and standard deviations; least-squares methods like χ² minimization; and simple nonparametric procedures such as the Kolmogorov-Smirnov tests. These tools are often inadequate for the complex problems and datasets under investigation, and recent years have witnessed an increased usage of maximum-likelihood, survival analysis, multivariate analysis, wavelet and advanced time-series methods. The Statistical Consulting Center for Astronomy (SCCA) helped astronomers use sophisticated tools and match these tools to specific problems. The SCCA operated with two professors of statistics and a professor of astronomy working together. Questions were received by e-mail and were discussed in detail with the questioner. Summaries of those questions and answers leading to new approaches were posted on the Web (www.state.psu.edu/ mga/SCCA). In addition to serving individual astronomers, the SCCA established a Web site for general use that provides hypertext links to selected on-line public-domain statistical software and services. The StatCodes site (www.astro.psu.edu/statcodes) provides over 200 links in the areas of: Bayesian statistics; censored and truncated data; correlation and regression; density estimation and smoothing; general statistics packages and information; image analysis; interactive Web tools; multivariate analysis; multivariate clustering and classification; nonparametric analysis; software written by astronomers; spatial statistics; statistical distributions; time series analysis; and visualization tools. StatCodes has received a remarkably high and constant hit rate of 250 hits/week (over 10,000/year) since its inception in mid-1997.
It is of interest to scientists both within and outside of astronomy. The most popular sections are multivariate techniques, image analysis, and time series analysis. Hundreds of copies of the ASURV, SLOPES and CENS-TAU codes developed by SCCA scientists were also downloaded from the StatCodes site. In addition to formal SCCA duties, SCCA scientists continued a variety of related activities in astrostatistics, including refereeing of statistically oriented papers submitted to the Astrophysical Journal, talks in meetings including Feigelson's talk to science journalists entitled "The reemergence of astrostatistics" at the American Association for the Advancement of Science meeting, and published papers of astrostatistical content.
Quantitative knowledge acquisition for expert systems
NASA Technical Reports Server (NTRS)
Belkin, Brenda L.; Stengel, Robert F.
1991-01-01
A common problem in the design of expert systems is the definition of rules from data obtained in system operation or simulation. While it is relatively easy to collect data and to log the comments of human operators engaged in experiments, generalizing such information to a set of rules has not previously been a direct task. A statistical method is presented for generating rule bases from numerical data, motivated by an example based on aircraft navigation with multiple sensors. The specific objective is to design an expert system that selects a satisfactory suite of measurements from a dissimilar, redundant set, given an arbitrary navigation geometry and possible sensor failures. The systematic development of a Navigation Sensor Management (NSM) Expert System from Kalman Filter covariance data is described. The method invokes two statistical techniques: Analysis of Variance (ANOVA) and the ID3 Algorithm. The ANOVA technique indicates whether variations of problem parameters give statistically different covariance results, and the ID3 algorithm identifies the relationships between the problem parameters using probabilistic knowledge extracted from a simulation example set. Both are detailed.
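The core step of ID3 is choosing, at each node, the attribute with the largest information gain. The sketch below shows that step on a tiny table; the attribute names (`geometry`, `sensor_failed`, `accuracy`) and all rows are invented for illustration and are not taken from the NSM simulation set.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical simulation outcomes (assumed, for illustration only).
rows = [
    {"geometry": "good", "sensor_failed": "no", "accuracy": "ok"},
    {"geometry": "good", "sensor_failed": "yes", "accuracy": "ok"},
    {"geometry": "poor", "sensor_failed": "no", "accuracy": "bad"},
    {"geometry": "poor", "sensor_failed": "yes", "accuracy": "bad"},
]

gains = {a: information_gain(rows, a, "accuracy") for a in ("geometry", "sensor_failed")}
best = max(gains, key=gains.get)
print(best, gains)
```

ID3 would split on `best`, then repeat the same calculation inside each resulting subset until the subsets are pure or no attributes remain.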
Kantardjiev, Alexander A
2015-04-05
A cluster of strongly interacting ionization groups with irregular ionization behavior in a protein molecule is suggestive of a specific structure-function relationship. However, the computational treatment of such clusters is unconventional (e.g., a naive self-consistent iterative algorithm fails to converge). A rigorous treatment requires evaluation of Boltzmann-averaged statistical mechanics sums and electrostatic energy estimation for each microstate. irGPU (Irregular strong interactions in proteins: a GPU solver) is a novel solution to a versatile problem in protein biophysics: the atypical protonation behavior of coupled groups. The computational severity of the problem is alleviated by parallelization (via GPU kernels), which is applied to the electrostatic interaction evaluation (including explicit electrostatics via the fast multipole method) as well as to the estimation of statistical mechanics sums (the partition function). Special attention is given to ease of use of the service and to encapsulation of theoretical details without sacrificing the rigor of the computational procedures. irGPU is not just a solution-in-principle but a promising practical application with the potential to entice the community into a deeper understanding of the principles governing biomolecular mechanisms. © 2015 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Havens, Timothy C.; Cummings, Ian; Botts, Jonathan; Summers, Jason E.
2017-05-01
The linear ordered statistic (LOS) is a parameterized ordered statistic (OS) that is a weighted average of a rank-ordered sample. LOS operators are useful generalizations of aggregation as they can represent any linear aggregation, from minimum to maximum, including conventional aggregations, such as mean and median. In the fuzzy logic field, these aggregations are called ordered weighted averages (OWAs). Here, we present a method for learning LOS operators from training data, viz., data for which you know the output of the desired LOS. We then extend the learning process with regularization, such that a lower complexity or sparse LOS can be learned. Hence, we discuss what 'lower complexity' means in this context and how to represent that in the optimization procedure. Finally, we apply our learning methods to the well-known constant-false-alarm-rate (CFAR) detection problem, specifically for the case of background levels modeled by long-tailed distributions, such as the K-distribution. These backgrounds arise in several pertinent imaging problems, including the modeling of clutter in synthetic aperture radar and sonar (SAR and SAS) and in wireless communications.
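To make the definition concrete, here is a minimal sketch of an LOS (equivalently, OWA) operator as a weighted sum over the sorted sample, showing how particular weight vectors recover the minimum, maximum, mean and median. The function name `los` and the ascending sort order are assumptions made for this example; the learning and regularization steps described in the abstract are not shown.

```python
def los(sample, weights):
    """Linear ordered statistic: weights applied to the rank-ordered sample.

    The sample is sorted in ascending order (an assumed convention), so
    weights[0] multiplies the smallest value.
    """
    ordered = sorted(sample)
    if len(ordered) != len(weights):
        raise ValueError("need exactly one weight per sample value")
    return sum(w * x for w, x in zip(weights, ordered))


sample = [7.0, 1.0, 4.0, 10.0, 3.0]
n = len(sample)

minimum = los(sample, [1, 0, 0, 0, 0])
maximum = los(sample, [0, 0, 0, 0, 1])
mean = los(sample, [1 / n] * n)
median = los(sample, [0, 0, 1, 0, 0])
print(minimum, maximum, mean, median)
```

A sparse weight vector, such as the ones for minimum, maximum and median here, is one plausible reading of a "lower complexity" LOS, since only a few ranks contribute to the output.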
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rah, Jeong-Eun; Oh, Do Hoon; Shin, Dongho
Purpose: To evaluate and improve the reliability of proton quality assurance (QA) processes and to provide an optimal customized tolerance level using the statistical process control (SPC) methodology. Methods: The authors investigated the consistency check of dose per monitor unit (D/MU) and range in proton beams to see whether it was within the tolerance level of the daily QA process. This study analyzed the difference between the measured and calculated ranges along the central axis to improve the patient-specific QA process in proton beams by using process capability indices. Results: The authors established a customized tolerance level of ±2% for D/MU and ±0.5 mm for beam range in the daily proton QA process. In the authors' analysis of the process capability indices, the patient-specific range measurements were capable of a specification limit of ±2% in clinical plans. Conclusions: SPC methodology is a useful tool for customizing the optimal QA tolerance levels and improving the quality of proton machine maintenance, treatment delivery, and ultimately patient safety.
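Process capability indices compare the spread of a measured quantity with its specification limits. The sketch below computes the two most common ones, Cp and Cpk, for a set of daily output deviations against a ±2% tolerance. The measurement values are invented for illustration, and the abstract does not state which indices the authors actually used.

```python
from statistics import mean, stdev


def capability(samples, lsl, usl):
    """Return (Cp, Cpk) for samples against lower/upper specification limits."""
    mu = mean(samples)
    sigma = stdev(samples)                          # sample standard deviation
    cp = (usl - lsl) / (6 * sigma)                  # spread only
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)     # spread and centering
    return cp, cpk


# Hypothetical daily D/MU deviations from baseline, in percent (assumed data).
deviations = [0.3, -0.2, 0.5, 0.1, -0.4, 0.2, 0.0, 0.6, -0.1, 0.3]
cp, cpk = capability(deviations, lsl=-2.0, usl=2.0)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

Cpk never exceeds Cp; the gap between them grows as the process mean drifts away from the centre of the tolerance band, which is why both are usually reported together.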
Comparative effectiveness research methodology using secondary data: A starting user's guide.
Sun, Maxine; Lipsitz, Stuart R
2018-04-01
The use of secondary data, such as claims or administrative data, in comparative effectiveness research has grown tremendously in recent years. We believe that the current review can help investigators relying on secondary data to (1) gain insight into both the methodologies and statistical methods, (2) better understand the necessity of rigorous planning before initiating a comparative effectiveness investigation, and (3) optimize the quality of their investigations. Specifically, we review concepts of adjusted analyses and confounders, methods of propensity score analyses and instrumental variable analyses, risk prediction models (logistic and time-to-event), decision-curve analysis, as well as the interpretation of the P value and hypothesis testing. Overall, we hope that the current review article can help research investigators relying on secondary data to perform comparative effectiveness research better understand the necessity of rigorous planning before study start, and gain better insight into the choice of statistical methods so as to optimize the quality of the research study. Copyright © 2017 Elsevier Inc. All rights reserved.
Sikirzhytskaya, Aliaksandra; Sikirzhytski, Vitali; Lednev, Igor K
2014-01-01
Body fluids are a common and important type of forensic evidence. In particular, the identification of menstrual blood stains is often a key step during the investigation of rape cases. Here, we report on the application of near-infrared Raman microspectroscopy for differentiating menstrual blood from peripheral blood. We observed that the menstrual and peripheral blood samples have similar but distinct Raman spectra. Advanced statistical analysis of the multiple Raman spectra that were automatically acquired (by Raman mapping) from the 40 dried blood stains (20 donors for each group) allowed us to build a classification model with maximum (100%) sensitivity and specificity. We also demonstrated that despite certain common constituents, menstrual blood can be readily distinguished from vaginal fluid. All of the classification models were verified using cross-validation methods. The proposed method overcomes the problems associated with currently used biochemical methods, which are destructive, time consuming and expensive. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Miri, Raz; Graf, Iulia M; Dössel, Olaf
2009-11-01
Electrode positions and timing delays influence the efficacy of biventricular pacing (BVP). Accordingly, this study focuses on BVP optimization, using a detailed 3-D electrophysiological model of the human heart, which is adapted to patient-specific anatomy and pathophysiology. The research is carried out on ten heart models with left bundle branch block and myocardial infarction derived from magnetic resonance and computed tomography data. Cardiac electrical activity is simulated with the ten Tusscher cell model and an adaptive cellular automaton at physiological and pathological conduction levels. The optimization methods are based on a comparison between the electrical response of the healthy and diseased heart models, measured in terms of the root mean square error (E(RMS)) of the excitation front and the QRS duration error (E(QRS)). Intra- and intermethod associations of the pacing electrode and timing delay variables were analyzed with statistical methods, i.e., the t-test for dependent data, one-way analysis of variance for electrode pairs, and the Pearson model for equivalent parameters from the two optimization methods. The results indicate that the lateral left ventricle and the upper or middle septal area are frequently (60% of cases) the optimal positions of the left and right electrodes, respectively. Statistical analysis shows that the two optimization methods are in good agreement. In conclusion, a noninvasive preoperative BVP optimization strategy based on computer simulations can be used to identify the most beneficial patient-specific electrode configuration and timing delays.
Popescu, M D; Draghici, L; Secheli, I; Secheli, M; Codrescu, M; Draghici, I
2015-01-01
Infantile Hemangiomas (IH) are the most frequent tumors of vascular origin, and the differential diagnosis from vascular malformations is difficult to establish. Because of their location, dimensions and fast evolution, specific types of IH can cause important functional and esthetic sequelae. To avoid these unfortunate consequences it is necessary to establish the appropriate moment to begin treatment and to decide which therapeutic procedure is the most adequate. Based on clinical data collected by serial clinical observations, correlated with imaging data and processed by a computer-aided diagnosis system (CAD), the study intended to develop a treatment algorithm to accurately predict the best final results, from the esthetic and functional point of view, for a certain type of lesion. The preliminary database was composed of 75 patients divided into 4 groups according to the treatment management they received: medical therapy, sclerotherapy, surgical excision and no treatment. The serial clinical observation was performed each month and all the data were processed by using CAD. The project goal was to create software that incorporated advanced methods to accurately measure the specific IH lesions, and that integrated medical information, statistical methods and computational methods to correlate this information with that obtained from the processing of images. Based on these correlations, a mechanism for predicting the evolution of a hemangioma was established, which helped determine the best method of therapeutic intervention to minimize further complications.
Statistical method to compare massive parallel sequencing pipelines.
Elsensohn, M H; Leblay, N; Dimassi, S; Campan-Fournier, A; Labalme, A; Roucher-Boulez, F; Sanlaville, D; Lesca, G; Bardel, C; Roy, P
2017-03-01
Today, sequencing is frequently carried out by Massive Parallel Sequencing (MPS), which drastically cuts sequencing time and expenses. Nevertheless, Sanger sequencing remains the main validation method to confirm the presence of variants. The analysis of MPS data involves the development of several bioinformatic tools, academic or commercial. We present here a statistical method to compare MPS pipelines and test it in a comparison between an academic (BWA-GATK) and a commercial pipeline (TMAP-NextGENe®), with and without reference to a gold standard (here, Sanger sequencing), on a panel of 41 genes in 43 epileptic patients. This method used the number of variants to fit log-linear models for pairwise agreements between pipelines. To assess the heterogeneity of the margins and the odds ratios of agreement, four log-linear models were used: a full model, a homogeneous-margin model, a model with a single odds ratio for all patients, and a model with a single intercept. Then a log-linear mixed model was fitted considering the biological variability as a random effect. Among the 390,339 base-pairs sequenced, TMAP-NextGENe® and BWA-GATK found, on average, 2253.49 and 1857.14 variants (single nucleotide variants and indels), respectively. Against the gold standard, the pipelines had similar sensitivities (63.47% vs. 63.42%) and close but significantly different specificities (99.57% vs. 99.65%; p < 0.001). Same-trend results were obtained when only single nucleotide variants were considered (99.98% specificity and 76.81% sensitivity for both pipelines). The method thus allows pipeline comparison and selection. It is generalizable to all types of MPS data and all pipelines.
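The sensitivity and specificity figures quoted against the gold standard follow from a per-position comparison of called variants with confirmed variants. The sketch below shows that bookkeeping with Python sets; the position ranges are invented for illustration, and the log-linear modelling that forms the paper's actual contribution is not reproduced here.

```python
def sensitivity_specificity(called, truth, positions):
    """Compare a pipeline's variant calls with a gold standard.

    called    -- positions where the pipeline reports a variant
    truth     -- positions where the gold standard confirms a variant
    positions -- all positions examined
    """
    tp = len(called & truth)                # variants correctly called
    fn = len(truth - called)                # variants missed
    fp = len(called - truth)                # spurious calls
    tn = len(positions - called - truth)    # correctly left uncalled
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy panel of 1000 positions (assumed data).
positions = set(range(1000))
truth = set(range(0, 100))
called = set(range(20, 110))

sens, spec = sensitivity_specificity(called, truth, positions)
print(f"sensitivity = {sens:.2%}, specificity = {spec:.2%}")
```

Because true negatives vastly outnumber variants in real sequencing data, specificity values sit very close to 100%, which is why small differences such as those reported above can still be statistically significant.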
Hefron, Ryan; Borghetti, Brett; Schubert Kabban, Christine; Christensen, James; Estepp, Justin
2018-04-26
Applying deep learning methods to electroencephalograph (EEG) data for cognitive state assessment has yielded improvements over previous modeling methods. However, research focused on cross-participant cognitive workload modeling using these techniques is underrepresented. We study the problem of cross-participant state estimation in a non-stimulus-locked task environment, where a trained model is used to make workload estimates on a new participant who is not represented in the training set. Using experimental data from the Multi-Attribute Task Battery (MATB) environment, a variety of deep neural network models are evaluated in the trade-space of computational efficiency, model accuracy, variance and temporal specificity yielding three important contributions: (1) The performance of ensembles of individually-trained models is statistically indistinguishable from group-trained methods at most sequence lengths. These ensembles can be trained for a fraction of the computational cost compared to group-trained methods and enable simpler model updates. (2) While increasing temporal sequence length improves mean accuracy, it is not sufficient to overcome distributional dissimilarities between individuals’ EEG data, as it results in statistically significant increases in cross-participant variance. (3) Compared to all other networks evaluated, a novel convolutional-recurrent model using multi-path subnetworks and bi-directional, residual recurrent layers resulted in statistically significant increases in predictive accuracy and decreases in cross-participant variance.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chu, Tsong-Lun; Varuttamaseni, Athi; Baek, Joo-Seok
The U.S. Nuclear Regulatory Commission (NRC) encourages the use of probabilistic risk assessment (PRA) technology in all regulatory matters, to the extent supported by the state-of-the-art in PRA methods and data. Although much has been accomplished in the area of risk-informed regulation, risk assessment for digital systems has not been fully developed. The NRC established a plan for research on digital systems to identify and develop methods, analytical tools, and regulatory guidance for (1) including models of digital systems in the PRAs of nuclear power plants (NPPs), and (2) incorporating digital systems in the NRC's risk-informed licensing and oversight activities. Under NRC's sponsorship, Brookhaven National Laboratory (BNL) explored approaches for addressing the failures of digital instrumentation and control (I and C) systems in the current NPP PRA framework. Specific areas investigated included PRA modeling of digital hardware, development of a philosophical basis for defining software failure, and identification of desirable attributes of quantitative software reliability methods. Based on the earlier research, statistical testing is considered a promising method for quantifying software reliability. This paper describes a statistical software testing approach for quantifying software reliability and applies it to the loop-operating control system (LOCS) of an experimental loop of the Advanced Test Reactor (ATR) at Idaho National Laboratory (INL).
Jaiswara, Ranjana; Nandi, Diptarup; Balakrishnan, Rohini
2013-01-01
Traditional taxonomy based on morphology has often failed in accurate species identification owing to the occurrence of cryptic species, which are reproductively isolated but morphologically identical. Molecular data have thus been used to complement morphology in species identification. The sexual advertisement calls in several groups of acoustically communicating animals are species-specific and can thus complement molecular data as non-invasive tools for identification. Several statistical tools and automated identifier algorithms have been used to investigate the efficiency of acoustic signals in species identification. Despite a plethora of such methods, there is a general lack of knowledge regarding the appropriate usage of these methods in specific taxa. In this study, we investigated the performance of two commonly used statistical methods, discriminant function analysis (DFA) and cluster analysis, in identification and classification based on acoustic signals of field cricket species belonging to the subfamily Gryllinae. Using a comparative approach we evaluated the optimal number of species and calling song characteristics for both the methods that lead to most accurate classification and identification. The accuracy of classification using DFA was high and was not affected by the number of taxa used. However, a constraint in using discriminant function analysis is the need for a priori classification of songs. Accuracy of classification using cluster analysis, which does not require a priori knowledge, was maximum for 6–7 taxa and decreased significantly when more than ten taxa were analysed together. We also investigated the efficacy of two novel derived acoustic features in improving the accuracy of identification. Our results show that DFA is a reliable statistical tool for species identification using acoustic signals. 
Our results also show that cluster analysis of acoustic signals in crickets works effectively for species classification and identification. PMID:24086666
Petukh, Marharyta; Li, Minghui; Alexov, Emil
2015-07-01
A new methodology termed Single Amino Acid Mutation based change in Binding free Energy (SAAMBE) was developed to predict the changes of the binding free energy caused by mutations. The method utilizes 3D structures of the corresponding protein-protein complexes and takes advantage of both sequence- and structure-based approaches. The method has two components: a MM/PBSA-based component, and an additional set of statistical terms derived from a statistical investigation of the physico-chemical properties of protein complexes. Although the approach is a rigid-body approach and does not explicitly consider plausible conformational changes caused by the binding, the effect of conformational changes on electrostatics, including changes away from the binding interface, is mimicked with amino acid specific dielectric constants. This provides a significant improvement of SAAMBE predictions, as indicated by a better match against experimentally determined binding free energy changes for over 1300 mutations in 43 proteins. The final benchmarking resulted in very good agreement with experimental data (correlation coefficient 0.624), while the algorithm is fast enough to allow for large-scale calculations (the average time is less than a minute per mutation).
Coloc-stats: a unified web interface to perform colocalization analysis of genomic features.
Simovski, Boris; Kanduri, Chakravarthi; Gundersen, Sveinung; Titov, Dmytro; Domanska, Diana; Bock, Christoph; Bossini-Castillo, Lara; Chikina, Maria; Favorov, Alexander; Layer, Ryan M; Mironov, Andrey A; Quinlan, Aaron R; Sheffield, Nathan C; Trynka, Gosia; Sandve, Geir K
2018-06-05
Functional genomics assays produce sets of genomic regions as one of their main outputs. To biologically interpret such region-sets, researchers often use colocalization analysis, where the statistical significance of colocalization (overlap, spatial proximity) between two or more region-sets is tested. Existing colocalization analysis tools vary in the statistical methodology and analysis approaches, thus potentially providing different conclusions for the same research question. As the findings of colocalization analysis are often the basis for follow-up experiments, it is helpful to use several tools in parallel and to compare the results. We developed the Coloc-stats web service to facilitate such analyses. Coloc-stats provides a unified interface to perform colocalization analysis across various analytical methods and method-specific options (e.g. colocalization measures, resolution, null models). Coloc-stats helps the user to find a method that supports their experimental requirements and allows for a straightforward comparison across methods. Coloc-stats is implemented as a web server with a graphical user interface that assists users with configuring their colocalization analyses. Coloc-stats is freely available at https://hyperbrowser.uio.no/coloc-stats/.
Online Denoising Based on the Second-Order Adaptive Statistics Model.
Yi, Sheng-Lun; Jin, Xue-Bo; Su, Ting-Li; Tang, Zhen-Yun; Wang, Fa-Fa; Xiang, Na; Kong, Jian-Lei
2017-07-20
Online denoising is motivated by real-time applications in industrial processes, where the data must be usable soon after they are collected. Because the noise in practical processes is usually colored, it poses a considerable challenge for denoising techniques. In this paper, a novel online denoising method is proposed for processing practical measurement data with colored noise, and the characteristics of the colored noise are accounted for in the dynamic model via an adaptive parameter. The proposed method consists of two parts within a closed loop: the first estimates the system state based on the second-order adaptive statistics model, and the second updates the adaptive parameter in the model using the Yule-Walker algorithm. Specifically, the state estimation process is implemented recursively via the Kalman filter, which makes the method suitable for online use. Experimental data from a reinforced concrete structure test were used to verify the effectiveness of the proposed method. Results show that the proposed method not only dealt with signals with colored noise, but also achieved a tradeoff between efficiency and accuracy.
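The Yule-Walker step mentioned above estimates autoregressive coefficients from sample autocovariances. The sketch below does this for a second-order model on simulated colored noise. It is a generic AR(2) illustration under assumed coefficients, not the paper's adaptive statistics model, and the Kalman filter half of the loop is omitted.

```python
import random


def autocovariance(x, lag):
    """Biased sample autocovariance of x at the given lag."""
    n = len(x)
    m = sum(x) / n
    return sum((x[t] - m) * (x[t + lag] - m) for t in range(n - lag)) / n


def yule_walker_ar2(x):
    """Solve the 2x2 Yule-Walker equations for x[t] = a1*x[t-1] + a2*x[t-2] + e[t]."""
    r0, r1, r2 = (autocovariance(x, k) for k in range(3))
    det = r0 * r0 - r1 * r1
    a1 = r1 * (r0 - r2) / det
    a2 = (r0 * r2 - r1 * r1) / det
    return a1, a2


# Simulate colored noise with assumed coefficients, then recover them.
random.seed(0)
true_a1, true_a2 = 0.6, -0.3
noise = [0.0, 0.0]
for _ in range(20000):
    noise.append(true_a1 * noise[-1] + true_a2 * noise[-2] + random.gauss(0.0, 1.0))

a1, a2 = yule_walker_ar2(noise)
print(f"a1 = {a1:.3f}, a2 = {a2:.3f}")
```

In an online setting the autocovariances would be updated over a sliding or exponentially weighted window, so that the coefficients track changes in the noise; that windowing choice is outside what the abstract specifies.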
Statistical Evaluation of Biometric Evidence in Forensic Automatic Speaker Recognition
NASA Astrophysics Data System (ADS)
Drygajlo, Andrzej
Forensic speaker recognition is the process of determining if a specific individual (suspected speaker) is the source of a questioned voice recording (trace). This paper aims at presenting forensic automatic speaker recognition (FASR) methods that provide a coherent way of quantifying and presenting recorded voice as biometric evidence. In such methods, the biometric evidence consists of the quantified degree of similarity between speaker-dependent features extracted from the trace and speaker-dependent features extracted from recorded speech of a suspect. The interpretation of recorded voice as evidence in the forensic context presents particular challenges, including within-speaker (within-source) variability and between-speakers (between-sources) variability. Consequently, FASR methods must provide a statistical evaluation which gives the court an indication of the strength of the evidence given the estimated within-source and between-sources variabilities. This paper reports on the first ENFSI evaluation campaign through a fake case, organized by the Netherlands Forensic Institute (NFI), as an example, where an automatic method using the Gaussian mixture models (GMMs) and the Bayesian interpretation (BI) framework were implemented for the forensic speaker recognition task.
Meta-analysis and The Cochrane Collaboration: 20 years of the Cochrane Statistical Methods Group
2013-01-01
The Statistical Methods Group has played a pivotal role in The Cochrane Collaboration over the past 20 years. The Statistical Methods Group has determined the direction of statistical methods used within Cochrane reviews, developed guidance for these methods, provided training, and continued to discuss and consider new and controversial issues in meta-analysis. The contribution of Statistical Methods Group members to the meta-analysis literature has been extensive and has helped to shape the wider meta-analysis landscape. In this paper, marking the 20th anniversary of The Cochrane Collaboration, we reflect on the history of the Statistical Methods Group, beginning in 1993 with the identification of aspects of statistical synthesis for which consensus was lacking about the best approach. We highlight some landmark methodological developments that Statistical Methods Group members have contributed to in the field of meta-analysis. We discuss how the Group implements and disseminates statistical methods within The Cochrane Collaboration. Finally, we consider the importance of robust statistical methodology for Cochrane systematic reviews, note research gaps, and reflect on the challenges that the Statistical Methods Group faces in its future direction. PMID:24280020
DNA viewed as an out-of-equilibrium structure
NASA Astrophysics Data System (ADS)
Provata, A.; Nicolis, C.; Nicolis, G.
2014-05-01
The complexity of the primary structure of human DNA is explored using methods from nonequilibrium statistical mechanics, dynamical systems theory, and information theory. A collection of statistical analyses is performed on the DNA data and the results are compared with sequences derived from different stochastic processes. The use of χ² tests shows that DNA can not be described as a low order Markov chain of order up to r = 6. Although detailed balance seems to hold at the level of a binary alphabet, it fails when all four base pairs are considered, suggesting spatial asymmetry and irreversibility. Furthermore, the block entropy does not increase linearly with the block size, reflecting the long-range nature of the correlations in the human genomic sequences. To probe locally the spatial structure of the chain, we study the exit distances from a specific symbol, the distribution of recurrence distances, and the Hurst exponent, all of which show power law tails and long-range characteristics. These results suggest that human DNA can be viewed as a nonequilibrium structure maintained in its state through interactions with a constantly changing environment. Based solely on the exit distance distribution accounting for the nonequilibrium statistics and using the Monte Carlo rejection sampling method, we construct a model DNA sequence. This method allows us to keep both long- and short-range statistical characteristics of the native DNA data. The model sequence presents the same characteristic exponents as the natural DNA but fails to capture spatial correlations and point-to-point details.
Inference with viral quasispecies diversity indices: clonal and NGS approaches.
Gregori, Josep; Salicrú, Miquel; Domingo, Esteban; Sanchez, Alex; Esteban, Juan I; Rodríguez-Frías, Francisco; Quer, Josep
2014-04-15
Given the inherent dynamics of a viral quasispecies, we are often interested in the comparison of diversity indices of sequential samples of a patient, or in the comparison of diversity indices of virus in groups of patients in a treated versus control design. It is then important to make sure that the diversity measures from each sample may be compared with no bias and within a consistent statistical framework. In the present report, we review some indices often used as measures for viral quasispecies complexity and provide means for statistical inference, applying procedures taken from the ecology field. In particular, we examine the Shannon entropy and the mutation frequency, and we discuss the appropriateness of different normalization methods of the Shannon entropy found in the literature. By taking amplicons ultra-deep pyrosequencing (UDPS) raw data as a surrogate of a real hepatitis C virus viral population, we study through in-silico sampling the statistical properties of these indices under two methods of viral quasispecies sampling, classical cloning followed by Sanger sequencing (CCSS) and next-generation sequencing (NGS) such as UDPS. We propose solutions specific to each of the two sampling methods (CCSS and NGS) to guarantee statistically conforming conclusions as free of bias as possible. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved.
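As a rough illustration of the Shannon entropy and of why the choice of normalization matters, the sketch below computes the entropy of a haplotype count vector and divides it by one of two candidate denominators. Both denominators (the log of the number of observed haplotypes and the log of the total number of reads or clones) are assumptions about what the "normalization methods found in the literature" might be; the abstract does not list them. The function names are hypothetical.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a vector of haplotype counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in probs)


def normalized_entropy(counts, by="haplotypes"):
    """Entropy divided by ln(H) (observed haplotypes) or ln(N) (total reads).

    The two options are illustrative assumptions, not the paper's definitions.
    """
    n_haplotypes = sum(1 for c in counts if c > 0)
    size = n_haplotypes if by == "haplotypes" else sum(counts)
    if size <= 1:
        return 0.0  # a single haplotype (or read) carries no diversity
    return shannon_entropy(counts) / math.log(size)
```

With two equally abundant haplotypes of 5 reads each, normalizing by ln(H) gives 1, whereas normalizing by ln(N) gives ln 2 / ln 10, so the second form depends on sample size even when the composition is unchanged.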
NASA Astrophysics Data System (ADS)
Abid, Najmul; Mirkhalaf, Mohammad; Barthelat, Francois
2018-03-01
Natural materials such as nacre, collagen, and spider silk are composed of staggered stiff and strong inclusions in a softer matrix. This type of hybrid microstructure results in remarkable combinations of stiffness, strength, and toughness and it now inspires novel classes of high-performance composites. However, the analytical and numerical approaches used to predict and optimize the mechanics of staggered composites often neglect statistical variations and inhomogeneities, which may have significant impacts on modulus, strength, and toughness. Here we present an analysis of localization using small representative volume elements (RVEs) and large scale statistical volume elements (SVEs) based on the discrete element method (DEM). DEM is an efficient numerical method which enabled the evaluation of more than 10,000 microstructures in this study, each including about 5,000 inclusions. The models explore the combined effects of statistics, inclusion arrangement, and interface properties. We find that statistical variations have a negative effect on all properties, in particular on the ductility and energy absorption because randomness precipitates the localization of deformations. However, the results also show that the negative effects of random microstructures can be offset by interfaces with large strain at failure accompanied by strain hardening. More specifically, this quantitative study reveals an optimal range of interface properties where the interfaces are the most effective at delaying localization. These findings show how carefully designed interfaces in bioinspired staggered composites can offset the negative effects of microstructural randomness, which is inherent to most current fabrication methods.
DNA viewed as an out-of-equilibrium structure.
Provata, A; Nicolis, C; Nicolis, G
2014-05-01
The complexity of the primary structure of human DNA is explored using methods from nonequilibrium statistical mechanics, dynamical systems theory, and information theory. A collection of statistical analyses is performed on the DNA data and the results are compared with sequences derived from different stochastic processes. The use of χ^{2} tests shows that DNA can not be described as a low order Markov chain of order up to r=6. Although detailed balance seems to hold at the level of a binary alphabet, it fails when all four base pairs are considered, suggesting spatial asymmetry and irreversibility. Furthermore, the block entropy does not increase linearly with the block size, reflecting the long-range nature of the correlations in the human genomic sequences. To probe locally the spatial structure of the chain, we study the exit distances from a specific symbol, the distribution of recurrence distances, and the Hurst exponent, all of which show power law tails and long-range characteristics. These results suggest that human DNA can be viewed as a nonequilibrium structure maintained in its state through interactions with a constantly changing environment. Based solely on the exit distance distribution accounting for the nonequilibrium statistics and using the Monte Carlo rejection sampling method, we construct a model DNA sequence. This method allows us to keep both long- and short-range statistical characteristics of the native DNA data. The model sequence presents the same characteristic exponents as the natural DNA but fails to capture spatial correlations and point-to-point details.
NASA Astrophysics Data System (ADS)
Ryazanova, A. A.; Okladnikov, I. G.; Gordov, E. P.
2017-11-01
The frequency of occurrence and magnitude of precipitation and temperature extreme events show positive trends in several geographical regions. These events must be analyzed and studied in order to better understand their impact on the environment, predict their occurrences, and mitigate their effects. For this purpose, we augmented web-GIS called “CLIMATE” to include a dedicated statistical package developed in the R language. The web-GIS “CLIMATE” is a software platform for cloud storage processing and visualization of distributed archives of spatial datasets. It is based on a combined use of web and GIS technologies with reliable procedures for searching, extracting, processing, and visualizing the spatial data archives. The system provides a set of thematic online tools for the complex analysis of current and future climate changes and their effects on the environment. The package includes new powerful methods of time-dependent statistics of extremes, quantile regression and copula approach for the detailed analysis of various climate extreme events. Specifically, the very promising copula approach allows obtaining the structural connections between the extremes and the various environmental characteristics. The new statistical methods integrated into the web-GIS “CLIMATE” can significantly facilitate and accelerate the complex analysis of climate extremes using only a desktop PC connected to the Internet.
NASA Astrophysics Data System (ADS)
Xu, Liangfei; Reimer, Uwe; Li, Jianqiu; Huang, Haiyan; Hu, Zunyan; Jiang, Hongliang; Janßen, Holger; Ouyang, Minggao; Lehnert, Werner
2018-02-01
City buses using polymer electrolyte membrane (PEM) fuel cells are considered to be the most likely fuel cell vehicles to be commercialized in China. The technical specifications of the fuel cell systems (FCSs) these buses are equipped with will differ based on the powertrain configurations and vehicle control strategies, but can generally be classified into the power-follow and soft-run modes. Each mode imposes different levels of electrochemical stress on the fuel cells. Evaluating the aging behavior of fuel cell stacks under the conditions encountered in fuel cell buses requires new durability test protocols based on statistical results obtained during actual driving tests. In this study, we propose a systematic design method for fuel cell durability test protocols that correspond to the power-follow mode based on three parameters for different fuel cell load ranges. The powertrain configurations and control strategy are described herein, followed by a presentation of the statistical data for the duty cycles of FCSs in one city bus in the demonstration project. Assessment protocols are presented based on the statistical results using mathematical optimization methods, and are compared to existing protocols with respect to common factors, such as time at open circuit voltage and root-mean-square power.
Earth-Space Link Attenuation Estimation via Ground Radar Kdp
NASA Technical Reports Server (NTRS)
Bolen, Steven M.; Benjamin, Andrew L.; Chandrasekar, V.
2003-01-01
A method of predicting attenuation on microwave Earth/spacecraft communication links, over wide areas and under various atmospheric conditions, has been developed. In the area around the ground station locations, a nearly horizontally aimed polarimetric S-band ground radar measures the specific differential phase (Kdp) along the Earth-space path. The specific attenuation along a path of interest is then computed by use of a theoretical model of the relationship between the measured S-band specific differential phase and the specific attenuation at the frequency to be used on the communication link. The model includes effects of rain, wet ice, and other forms of precipitation. The attenuation on the path of interest is then computed by integrating the specific attenuation over the length of the path. This method can be used to determine statistics of signal degradation on Earth/spacecraft communication links. It can also be used to obtain real-time estimates of attenuation along multiple Earth/spacecraft links that are parts of a communication network operating within the radar coverage area, thereby enabling better management of the network through appropriate dynamic routing along the best combination of links.
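The two computational steps in this abstract (mapping Kdp to specific attenuation, then integrating along the path) can be sketched as follows. The power-law form `a * Kdp**b` and its coefficients are placeholders assumed for illustration; the abstract only states that a theoretical model relates the two quantities. The function names are hypothetical.

```python
def specific_attenuation(kdp, a, b):
    """Specific attenuation in dB/km from Kdp in deg/km.

    Assumed power law a * kdp**b; a and b are hypothetical coefficients that
    would depend on the link frequency. Negative Kdp is clamped to zero.
    """
    return a * max(kdp, 0.0) ** b


def path_attenuation(ranges_km, kdp_profile, a, b):
    """Integrate specific attenuation over the path (trapezoidal rule), in dB."""
    total = 0.0
    for i in range(len(ranges_km) - 1):
        step = ranges_km[i + 1] - ranges_km[i]
        left = specific_attenuation(kdp_profile[i], a, b)
        right = specific_attenuation(kdp_profile[i + 1], a, b)
        total += 0.5 * (left + right) * step
    return total
```

For example, a constant Kdp of 1 deg/km over a 2 km path with the placeholder values a = 0.5 and b = 1 integrates to 1 dB.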
Review of a statistical specification for pugmill mixed material.
DOT National Transportation Integrated Search
1974-01-01
Since the spring of 1964, the Virginia Highway Research Council has been developing and implementing statistical specifications for highway operations. One of these specifications is used for the acceptance of pugmill mixed materials. The purpose of ...
Gassner, Christoph; Rainer, Esther; Pircher, Elfriede; Markut, Lydia; Körmöczi, Günther F.; Jungbauer, Christof; Wessin, Dietmar; Klinghofer, Roswitha; Schennach, Harald; Schwind, Peter; Schönitzer, Diether
2009-01-01
Summary Background Validations of routinely used serological typing methods require intense performance evaluations typically including large numbers of samples before routine application. However, such evaluations could be improved by considering information about the frequency of standard blood groups and their variants. Methods Using RHD and ABO population genetic data, a Caucasian-specific donor panel was compiled for a performance comparison of the three RhD and ABO serological typing methods MDmulticard (Medion Diagnostics), ID-System (DiaMed) and ScanGel (Bio-Rad). The final test panel included standard and variant RHD and ABO genotypes, e.g. RhD categories, partial and weak RhDs, RhD DELs, and ABO samples, mainly to interpret weak serological reactivity for blood group A specificity. All samples were from individuals recorded in our local DNA blood group typing database. Results For ‘standard’ blood groups, results of performance were clearly interpretable for all three serological methods compared. However, when focusing on specific variant phenotypes, pronounced differences in reaction strengths and specificities were observed between them. Conclusions A genetically and ethnically predefined donor test panel consisting of only 93 individual samples delivered highly significant results for serological performance comparisons. Such small panels offer impressive representative power, higher than that of panels based only on statistical chance and large numbers. PMID:21113264
Wang, S H; Zheng, D W; Zhu, Y K; Ma, X G; Shi, J; Ou, X C; Li, H; Xing, J; Zhao, Y L
2018-02-12
Objective: To compare the efficacies of cross priming amplification (CPA) and RealAmp with Xpert MTB/RIF for the diagnosis of pulmonary tuberculosis (TB) at peripheral microscopy centers. Methods: From December 2014 to December 2015, 3 193 patients with suspected TB were enrolled consecutively at 3 county-level TB clinics in Zhongmu, Xinmi and Dengzhou, Henan province. A total of 3 193 collected sputum samples were tested by smear microscopy, L-J media culture, CPA, RealAmp and Xpert MTB/RIF. The culture-positive samples were tested by MPB64 for strain identification. The sensitivity and specificity of CPA, RealAmp and Xpert MTB/RIF were calculated against L-J solid culture results and clinical diagnosis results. Results: The sensitivities of CPA, RealAmp and Xpert MTB/RIF were 85.5% (413/483), 85.5% (413/483) and 87.9% (422/480), respectively, compared with L-J solid culture; the difference among the 3 methods was not significant (χ² = 1.6, P > 0.05). The specificities of CPA, RealAmp and Xpert MTB/RIF were 96.8% (2 624/2 170), 93.2% (2 527/2 170) and 95.3% (2 567/2 170) compared with culture, and there was a statistically significant difference among the 3 methods (χ² = 37.8, P < 0.001). The sensitivities of smear microscopy, culture, CPA, RealAmp and Xpert MTB/RIF were 21.7% (300/1 383), 34.9% (483/1 383), 34.6% (478/1 383), 39.2% (542/1 383) and 38.1% (526/1 381) compared with clinical diagnosis. The sensitivity of CPA, RealAmp and Xpert MTB/RIF was higher than that of smear (χ² = 31.9, P < 0.01), but there was no statistically significant difference between the 3 molecular methods (χ² = 2.9, P > 0.05). The specificities of smear microscopy, L-J solid culture, CPA, RealAmp and Xpert MTB/RIF were 100% (1 810/1 810), 100% (1 810/1 810), 98.8% (1 789/1 810), 98.8% (1 756/1 810) and 97.0% (1 788/1 810), and there was no statistically significant difference among the 3 molecular methods (χ² = 0.16, P > 0.05).
Conclusion: The capability of CPA and RealAmp for diagnosing pulmonary TB was similar to that of Xpert MTB/RIF. The former 2 methods were more suitable for the diagnosis of pulmonary TB in peripheral laboratories.
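The sensitivity and specificity percentages reported above follow from simple count ratios. The short sketch below reproduces two of them from the fractions given in the abstract; the function names are hypothetical, and the false-negative and false-positive counts are derived by subtraction from the stated denominators.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of reference-positive cases that the test also calls positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of reference-negative cases that the test also calls negative."""
    return true_neg / (true_neg + false_pos)


# CPA against L-J culture: 413 of 483 culture-positive samples detected.
cpa_sensitivity = sensitivity(413, 483 - 413)
# CPA against clinical diagnosis: 1 789 of 1 810 non-TB cases negative.
cpa_specificity = specificity(1789, 1810 - 1789)
```

Rounded to one decimal place, these give 85.5% and 98.8%, matching the values stated for CPA.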
Proposal for a recovery prediction method for patients affected by acute mediastinitis
2012-01-01
Background This study attempts to find a method for predicting the risk of death in patients affected by acute mediastinitis. No such tool for this serious disease is described in the available literature. Methods The study comprised 44 consecutive cases of acute mediastinitis. General anamnesis and biochemical data were included. Factor analysis was used to extract the risk characteristic for the patients. The most valuable results were obtained for 8 parameters, which were selected for further statistical analysis (all collected within a few hours after admission). Three factors reached Eigenvalue >1. Clinical explanations of these combined statistical factors are: Factor1 - proteinic status (serum total protein, albumin, and hemoglobin level), Factor2 - inflammatory status (white blood cells, CRP, procalcitonin), and Factor3 - general risk (age, number of coexisting diseases). Threshold values of prediction factors were estimated by means of statistical analysis (factor analysis, Statgraphics Centurion XVI). Results The final prediction result for the patients is constructed as simultaneous evaluation of all factor scores. High probability of death should be predicted if the factor 1 value decreases with a simultaneous increase of factors 2 and 3. The diagnostic power of the proposed method was found to be high [sensitivity = 90%, specificity = 64%]; for Factor1 [SNC = 87%, SPC = 79%], for Factor2 [SNC = 87%, SPC = 50%] and for Factor3 [SNC = 73%, SPC = 71%]. Conclusion The proposed prediction method seems to be a useful emergency signal during acute mediastinitis control in affected patients. PMID:22574625
Dental enamel defect diagnosis through different technology-based devices.
Kobayashi, Tatiana Yuriko; Vitor, Luciana Lourenço Ribeiro; Carrara, Cleide Felício Carvalho; Silva, Thiago Cruvinel; Rios, Daniela; Machado, Maria Aparecida Andrade Moreira; Oliveira, Thais Marchini
2018-06-01
Dental enamel defects (DEDs) are faulty or deficient enamel formations of primary and permanent teeth. Changes during tooth development result in hypoplasia (a quantitative defect) and/or hypomineralisation (a qualitative defect). To compare technology-based diagnostic methods for detecting DEDs. Two-hundred and nine dental surfaces of anterior permanent teeth were selected in patients, 6-11 years of age, with cleft lip with/without cleft palate. First, a conventional clinical examination was conducted according to the modified Developmental Defects of Enamel Index (DDE Index). Dental surfaces were evaluated using an operating microscope and a fluorescence-based device. Interexaminer reproducibility was determined using the kappa test. To compare groups, McNemar's test was used. Cramer's V test was used for comparing the distribution of index codes obtained after classification of all dental surfaces. Cramer's V test revealed statistically significant differences (P < .0001) in the distribution of index codes obtained using the different methods; the coefficients were 0.365 for conventional clinical examination versus fluorescence, 0.961 for conventional clinical examination versus operating microscope and 0.358 for operating microscope versus fluorescence. The sensitivity of the operating microscope and fluorescence method was statistically significant (P = .008 and P < .0001, respectively). Otherwise, the results did not show statistically significant differences in accuracy and specificity for either the operating microscope or the fluorescence methods. This study suggests that the operating microscope performed better than the fluorescence-based device and could be an auxiliary method for the detection of DEDs. © 2017 FDI World Dental Federation.
A novel statistical method for quantitative comparison of multiple ChIP-seq datasets.
Chen, Li; Wang, Chi; Qin, Zhaohui S; Wu, Hao
2015-06-15
ChIP-seq is a powerful technology to measure the protein binding or histone modification strength at the whole-genome scale. Although there are a number of methods available for single ChIP-seq data analysis (e.g. 'peak detection'), a rigorous statistical method for quantitative comparison of multiple ChIP-seq datasets that considers data from control experiments, signal to noise ratios, biological variations and multiple-factor experimental designs is under-developed. In this work, we develop a statistical method to perform quantitative comparison of multiple ChIP-seq datasets and detect genomic regions showing differential protein binding or histone modification. We first detect peaks from all datasets and then union them to form a single set of candidate regions. The read counts from the IP experiment at the candidate regions are assumed to follow a Poisson distribution. The underlying Poisson rates are modeled as an experiment-specific function of artifacts and biological signals. We then obtain the estimated biological signals and compare them through the hypothesis testing procedure in a linear model framework. Simulations and real data analyses demonstrate that the proposed method provides more accurate and robust results compared with existing ones. An R software package ChIPComp is freely available at http://web1.sph.emory.edu/users/hwu30/software/ChIPComp.html. © The Author 2015. Published by Oxford University Press. All rights reserved.
MyPMFs: a simple tool for creating statistical potentials to assess protein structural models.
Postic, Guillaume; Hamelryck, Thomas; Chomilier, Jacques; Stratmann, Dirk
2018-05-29
Evaluating the model quality of protein structures that evolve in environments with particular physicochemical properties requires scoring functions that are adapted to their specific residue compositions and/or structural characteristics. Thus, computational methods developed for structures from the cytosol cannot work properly on membrane or secreted proteins. Here, we present MyPMFs, an easy-to-use tool that allows users to train statistical potentials of mean force (PMFs) on the protein structures of their choice, with all parameters being adjustable. We demonstrate its use by creating an accurate statistical potential for transmembrane protein domains. We also show its usefulness to study the influence of the physical environment on residue interactions within protein structures. Our open-source software is freely available for download at https://github.com/bibip-impmc/mypmfs. Copyright © 2018. Published by Elsevier B.V.
Kong, Shibo; Tan, Xiaodong; Deng, Zhiqing; Xie, Yaofei; Yang, Fen; Zheng, Zengwang
2017-08-01
Snail control is a key link in schistosomiasis control, but no unified methods for eliminating snails have been produced to date. This study was conducted to explore an engineering method for eliminating Oncomelania hupensis applicable to urban areas. The engineering specifications were established using the Delphi method. An engineering project based on these specifications was conducted in Hankou marshland to eliminate snails, including the transformation of the beach surface and ditches. Molluscicide was used as a supplement. The snail control effect was evaluated by field investigation. The engineering results fulfilled the requirements of the design. The snail density decreased to 0/0.11 m², and the snail area dropped to 0 m² after the project. There was a statistically significant difference in the number of frames with snails before and after the project (P<0.05). Snails were completely eliminated through one year of continuous monitoring, and no new snails were found after a flood disaster. This study demonstrates that engineering specifications for environmental modification were successfully established. Environmental modification, mainly through beach and ditch remediation, can completely change the environment of Oncomelania breeding. This method of environmental modification combined with mollusciciding was highly effective at eliminating snails. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Chao, Zenas C.; Bakkum, Douglas J.; Potter, Steve M.
2007-09-01
Electrically interfaced cortical networks cultured in vitro can be used as a model for studying the network mechanisms of learning and memory. Lasting changes in functional connectivity have been difficult to detect with extracellular multi-electrode arrays using standard firing rate statistics. We used both simulated and living networks to compare the ability of various statistics to quantify functional plasticity at the network level. Using a simulated integrate-and-fire neural network, we compared five established statistical methods to one of our own design, called center of activity trajectory (CAT). CAT, which depicts dynamics of the location-weighted average of spatiotemporal patterns of action potentials across the physical space of the neuronal circuitry, was the most sensitive statistic for detecting tetanus-induced plasticity in both simulated and living networks. By reducing the dimensionality of multi-unit data while still including spatial information, CAT allows efficient real-time computation of spatiotemporal activity patterns. Thus, CAT will be useful for studies in vivo or in vitro in which the locations of recording sites on multi-electrode probes are important.
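The idea behind the center of activity trajectory (a location-weighted average of activity, tracked over time) can be sketched as below. This is an illustrative reading of the abstract's description, not the authors' exact definition: it assumes one firing rate per electrode per time frame and 2-D electrode coordinates, and the function names are hypothetical.

```python
def center_of_activity(rates, positions):
    """Activity-weighted mean electrode location for one time frame.

    rates: firing rate (or spike count) per electrode.
    positions: (x, y) coordinate per electrode, in the same order.
    Returns None when there is no activity in the frame.
    """
    total = sum(rates)
    if total == 0:
        return None
    x = sum(r * pos[0] for r, pos in zip(rates, positions)) / total
    y = sum(r * pos[1] for r, pos in zip(rates, positions)) / total
    return (x, y)


def activity_trajectory(frames, positions):
    """Sequence of centers of activity, one per time frame."""
    return [center_of_activity(frame, positions) for frame in frames]
```

Each frame is reduced from one value per electrode to a single 2-D point, which is the dimensionality reduction with retained spatial information that the abstract highlights.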
The statistics of identifying differentially expressed genes in Expresso and TM4: a comparison
Sioson, Allan A; Mane, Shrinivasrao P; Li, Pinghua; Sha, Wei; Heath, Lenwood S; Bohnert, Hans J; Grene, Ruth
2006-01-01
Background Analysis of DNA microarray data takes as input spot intensity measurements from scanner software and returns differential expression of genes between two conditions, together with a statistical significance assessment. This process typically consists of two steps: data normalization and identification of differentially expressed genes through statistical analysis. The Expresso microarray experiment management system implements these steps with a two-stage, log-linear ANOVA mixed model technique, tailored to individual experimental designs. The complement of tools in TM4, on the other hand, is based on a number of preset design choices that limit its flexibility. In the TM4 microarray analysis suite, normalization, filter, and analysis methods form an analysis pipeline. TM4 computes integrated intensity values (IIV) from the average intensities and spot pixel counts returned by the scanner software as input to its normalization steps. By contrast, Expresso can use either IIV data or median intensity values (MIV). Here, we compare Expresso and TM4 analysis of two experiments and assess the results against qRT-PCR data. Results The Expresso analysis using MIV data consistently identifies more genes as differentially expressed, when compared to Expresso analysis with IIV data. The typical TM4 normalization and filtering pipeline corrects systematic intensity-specific bias on a per microarray basis. Subsequent statistical analysis with Expresso or a TM4 t-test can effectively identify differentially expressed genes. The best agreement with qRT-PCR data is obtained through the use of Expresso analysis and MIV data. Conclusion The results of this research are of practical value to biologists who analyze microarray data sets. The TM4 normalization and filtering pipeline corrects microarray-specific systematic bias and complements the normalization stage in Expresso analysis. The results of Expresso using MIV data have the best agreement with qRT-PCR results. 
In one experiment, MIV is a better choice than IIV as input to data normalization and statistical analysis methods, as it yields a greater number of statistically significant differentially expressed genes; TM4 does not support the choice of MIV input data. Overall, the more flexible and extensive statistical models of Expresso achieve more accurate analytical results, when judged by the yardstick of qRT-PCR data, in the context of an experimental design of modest complexity. PMID:16626497
Limited-information goodness-of-fit testing of diagnostic classification item response models.
Hansen, Mark; Cai, Li; Monroe, Scott; Li, Zhen
2016-11-01
Despite the growing popularity of diagnostic classification models (e.g., Rupp et al., 2010, Diagnostic measurement: theory, methods, and applications, Guilford Press, New York, NY) in educational and psychological measurement, methods for testing their absolute goodness of fit to real data remain relatively underdeveloped. For tests of reasonable length and for realistic sample size, full-information test statistics such as Pearson's X² and the likelihood ratio statistic G² suffer from sparseness in the underlying contingency table from which they are computed. Recently, limited-information fit statistics such as Maydeu-Olivares and Joe's (2006, Psychometrika, 71, 713) M₂ have been found to be quite useful in testing the overall goodness of fit of item response theory models. In this study, we applied Maydeu-Olivares and Joe's (2006, Psychometrika, 71, 713) M₂ statistic to diagnostic classification models. Through a series of simulation studies, we found that M₂ is well calibrated across a wide range of diagnostic model structures and was sensitive to certain misspecifications of the item model (e.g., fitting disjunctive models to data generated according to a conjunctive model), errors in the Q-matrix (adding or omitting paths, omitting a latent variable), and violations of local item independence due to unmodelled testlet effects. On the other hand, M₂ was largely insensitive to misspecifications in the distribution of higher-order latent dimensions and to the specification of an extraneous attribute. To complement the analyses of the overall model goodness of fit using M₂, we investigated the utility of the Chen and Thissen (1997, J. Educ. Behav. Stat., 22, 265) local dependence statistic X²_LD for characterizing sources of misfit, an important aspect of model appraisal often overlooked in favour of overall statements.
The X²_LD statistic was found to be slightly conservative (with Type I error rates consistently below the nominal level) but still useful in pinpointing the sources of misfit. Patterns of local dependence arising due to specific model misspecifications are illustrated. Finally, we used the M₂ and X²_LD statistics to evaluate a diagnostic model fit to data from the Trends in Mathematics and Science Study, drawing upon analyses previously conducted by Lee et al., (2011, IJT, 11, 144). © 2016 The British Psychological Society.
R package MVR for Joint Adaptive Mean-Variance Regularization and Variance Stabilization
Dazard, Jean-Eudes; Xu, Hua; Rao, J. Sunil
2015-01-01
We present an implementation in the R language for statistical computing of our recent non-parametric joint adaptive mean-variance regularization and variance stabilization procedure. The method is specifically suited for handling difficult problems posed by high-dimensional multivariate datasets (p ≫ n paradigm), such as in ‘omics’-type data, among which are that the variance is often a function of the mean, variable-specific estimators of variances are not reliable, and test statistics have low power due to a lack of degrees of freedom. The implementation offers a complete set of features including: (i) normalization and/or variance stabilization function, (ii) computation of mean-variance-regularized t and F statistics, (iii) generation of diverse diagnostic plots, (iv) synthetic and real ‘omics’ test datasets, (v) computationally efficient implementation, using C interfacing, and an option for parallel computing, (vi) manual and documentation on how to set up a cluster. To make each feature as user-friendly as possible, only one subroutine per functionality is to be handled by the end-user. It is available as an R package, called MVR (‘Mean-Variance Regularization’), downloadable from CRAN. PMID:26819572
León, Larry F; Cai, Tianxi
2012-04-01
In this paper we develop model checking techniques for assessing functional form specifications of covariates in censored linear regression models. These procedures are based on a censored data analog to taking cumulative sums of "robust" residuals over the space of the covariate under investigation. These cumulative sums are formed by integrating certain Kaplan-Meier estimators and may be viewed as "robust" censored data analogs to the processes considered by Lin, Wei & Ying (2002). The null distributions of these stochastic processes can be approximated by the distributions of certain zero-mean Gaussian processes whose realizations can be generated by computer simulation. Each observed process can then be graphically compared with a few realizations from the Gaussian process. We also develop formal test statistics for numerical comparison. Such comparisons enable one to assess objectively whether an apparent trend seen in a residual plot reflects model misspecification or natural variation. We illustrate the methods with a well-known dataset. In addition, we examine the finite sample performance of the proposed test statistics in simulation experiments. In our simulation experiments, the proposed test statistics have good power of detecting misspecification while at the same time controlling the size of the test.
Developing Topic-Specific Search Filters for PubMed with Click-Through Data
Li, Jiao; Lu, Zhiyong
2013-01-01
Summary Objectives Search filters have been developed and demonstrated for better information access to the immense and ever-growing body of publications in the biomedical domain. However, to date the number of filters remains quite limited because the current filter development methods require significant human efforts in manual document review and filter term selection. In this regard, we aim to investigate automatic methods for generating search filters. Methods We present an automated method to develop topic-specific filters on the basis of users’ search logs in PubMed. Specifically, for a given topic, we first detect its relevant user queries and then include their corresponding clicked articles to serve as the topic-relevant document set accordingly. Next, we statistically identify informative terms that best represent the topic-relevant document set using a background set composed of topic irrelevant articles. Lastly, the selected representative terms are combined with Boolean operators and evaluated on benchmark datasets to derive the final filter with the best performance. Results We applied our method to develop filters for four clinical topics: nephrology, diabetes, pregnancy, and depression. For the nephrology filter, our method obtained performance comparable to the state of the art (sensitivity of 91.3%, specificity of 98.7%, precision of 94.6%, and accuracy of 97.2%). Similarly, high-performing results (over 90% in all measures) were obtained for the other three search filters. Conclusion Based on PubMed click-through data, we successfully developed a high-performance method for generating topic-specific search filters that is significantly more efficient than existing manual methods. All data sets (topic-relevant and irrelevant document sets) used in this study and a demonstration system are publicly available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/downloads/CQ_filter/ PMID:23666447
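The last step described above, combining selected terms with Boolean operators and scoring the result on a benchmark, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: documents are modelled as sets of terms, the filter is a plain OR over its terms, and the function name `evaluate_or_filter` and the toy documents are hypothetical.

```python
def evaluate_or_filter(filter_terms, relevant_docs, irrelevant_docs):
    """Score a Boolean OR filter against labelled documents.

    Each document is a set of terms; a document is retrieved when it
    shares at least one term with the filter (hypothetical simplification).
    """
    terms = set(filter_terms)
    tp = sum(1 for d in relevant_docs if terms & d)    # relevant and retrieved
    fn = len(relevant_docs) - tp                       # relevant but missed
    fp = sum(1 for d in irrelevant_docs if terms & d)  # irrelevant but retrieved
    tn = len(irrelevant_docs) - fp                     # irrelevant and skipped
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Toy benchmark (made-up data, for illustration only).
relevant = [{"kidney", "dialysis"}, {"renal", "failure"}, {"nephron"}]
irrelevant = [{"heart"}, {"liver"}, {"kidney", "stone"}, {"lung"}]
metrics = evaluate_or_filter({"kidney", "renal"}, relevant, irrelevant)
```

Repeating the evaluation over candidate term combinations and keeping the best-scoring one would mirror the selection of the "final filter with the best performance" that the abstract mentions.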
Mallett, Susan; Halligan, Steve; Collins, Gary S.; Altman, Doug G.
2014-01-01
Background Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. Methods In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. Results Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. Conclusions The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests. PMID:25353643
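To make the contrast between the two approaches concrete, the sketch below computes sensitivity and specificity from binary reports and an empirical ROC AUC from confidence scores for the same cases. It is an illustration only: the data are invented, the function names are hypothetical, and the AUC shown is the simple rank-based (Mann-Whitney) estimate rather than any of the fitted-curve methods the study compared.

```python
def sensitivity_specificity(truth, calls):
    """truth and calls are sequences of 0/1 (1 = polyp present / reported)."""
    tp = sum(1 for t, c in zip(truth, calls) if t == 1 and c == 1)
    fn = sum(1 for t, c in zip(truth, calls) if t == 1 and c == 0)
    tn = sum(1 for t, c in zip(truth, calls) if t == 0 and c == 0)
    fp = sum(1 for t, c in zip(truth, calls) if t == 0 and c == 1)
    return tp / (tp + fn), tn / (tn + fp)


def empirical_auc(truth, scores):
    """Rank-based AUC: P(positive score > negative score) + 0.5 * P(tie)."""
    pos = [s for t, s in zip(truth, scores) if t == 1]
    neg = [s for t, s in zip(truth, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented reader data: many zero scores, as the abstract describes.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
calls = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
scores = [90, 80, 0, 0, 0, 0, 0, 0, 0, 70]

sens, spec = sensitivity_specificity(truth, calls)
auc = empirical_auc(truth, scores)
```

With so many tied zero scores, a single confident false positive shifts the AUC noticeably, which is one of the problems the abstract lists.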
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tratnyek, Paul G.; Bylaska, Eric J.; Weber, Eric J.
2017-01-01
Quantitative structure–activity relationships (QSARs) have long been used in the environmental sciences. More recently, molecular modeling and chemoinformatic methods have become widespread. These methods have the potential to expand and accelerate advances in environmental chemistry because they complement observational and experimental data with “in silico” results and analysis. The opportunities and challenges that arise at the intersection between statistical and theoretical in silico methods are most apparent in the context of properties that determine the environmental fate and effects of chemical contaminants (degradation rate constants, partition coefficients, toxicities, etc.). The main example of this is the calibration of QSARs using descriptor variable data calculated from molecular modeling, which can make QSARs more useful for predicting property data that are unavailable, but also can make them more powerful tools for diagnosis of fate determining pathways and mechanisms. Emerging opportunities for “in silico environmental chemical science” are to move beyond the calculation of specific chemical properties using statistical models and toward more fully in silico models, prediction of transformation pathways and products, incorporation of environmental factors into model predictions, integration of databases and predictive models into more comprehensive and efficient tools for exposure assessment, and extending the applicability of all the above from chemicals to biologicals and materials.
Silver, Matt; Montana, Giovanni
2012-01-01
Where causal SNPs (single nucleotide polymorphisms) tend to accumulate within biological pathways, the incorporation of prior pathways information into a statistical model is expected to increase the power to detect true associations in a genetic association study. Most existing pathways-based methods rely on marginal SNP statistics and do not fully exploit the dependence patterns among SNPs within pathways. We use a sparse regression model, with SNPs grouped into pathways, to identify causal pathways associated with a quantitative trait. Notable features of our “pathways group lasso with adaptive weights” (P-GLAW) algorithm include the incorporation of all pathways in a single regression model, an adaptive pathway weighting procedure that accounts for factors biasing pathway selection, and the use of a bootstrap sampling procedure for the ranking of important pathways. P-GLAW takes account of the presence of overlapping pathways and uses a novel combination of techniques to optimise model estimation, making it fast to run, even on whole genome datasets. In a comparison study with an alternative pathways method based on univariate SNP statistics, our method demonstrates high sensitivity and specificity for the detection of important pathways, showing the greatest relative gains in performance where marginal SNP effect sizes are small. PMID:22499682
Reentry survivability modeling
NASA Astrophysics Data System (ADS)
Fudge, Michael L.; Maher, Robert L.
1997-10-01
Statistical methods for expressing the impact risk posed to space systems in general [and the International Space Station (ISS) in particular] by other resident space objects have been examined. One of the findings of this investigation is that there are legitimate physical modeling reasons for the common statistical expression of the collision risk. A combination of statistical methods and physical modeling is also used to express the impact risk posed by re-entering space systems to objects of interest (e.g., people and property) on Earth. One of the largest uncertainties in expressing this risk is the estimate of how much material survives reentry to impact Earth's surface. This point was recently demonstrated in dramatic fashion by the impact of an intact expendable launch vehicle (ELV) upper stage near a private residence in the continental United States. Since approximately half of the missions supporting ISS will utilize ELVs, it is appropriate to examine the methods used to estimate the amount and physical characteristics of ELV debris surviving reentry to impact Earth's surface. This paper examines reentry survivability estimation methodology, including the specific methodology used by Caiman Sciences' 'Survive' model. Comparisons between empirical results (observations of objects which have been recovered on Earth after surviving reentry) and Survive estimates are presented for selected upper stage or spacecraft components and a Delta launch vehicle second stage.
AAS and spectrophotometric methods for the determination of metoprolol tartrate in tablets
NASA Astrophysics Data System (ADS)
Alpdoğan, Güzin; Sungur, Sidika
1999-11-01
Sensitive and specific atomic absorption spectroscopy (AAS) and spectrophotometric methods have been developed for the determination of the beta-adrenergic blocking drug metoprolol tartrate. The methods are based on the formation of a Cu(II) dithiocarbamate complex by derivatization of the secondary amino group of metoprolol with CS₂ and CuCl₂ in the presence of ammonia. The copper-bis(dithiocarbamate) complex was extracted into chloroform, and the concentration of metoprolol tartrate was determined directly by spectrophotometry and indirectly by AAS measurement of copper. The two methods developed were applied to the assay of metoprolol tartrate in commercial tablet formulations. The methods were compared statistically with each other and with the high performance liquid chromatography (HPLC) method of USP XXII using t- and F-tests.
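A statistical comparison of two assay methods by t- and F-tests, as mentioned above, can be sketched as follows. This is a minimal illustration with invented recovery values; the function names are hypothetical, the t statistic assumes equal variances (pooled form), and the resulting statistics would still have to be compared against tabulated critical values for the chosen significance level.

```python
from math import sqrt
from statistics import mean, variance


def f_ratio(a, b):
    """Ratio of sample variances, larger over smaller, for an F-test of precision."""
    va, vb = variance(a), variance(b)
    return max(va, vb) / min(va, vb)


def pooled_t(a, b):
    """Two-sample Student t statistic with pooled variance, for a test of means."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))


# Invented percent-of-label-claim results for two methods (illustration only).
method_a = [99.2, 100.1, 99.8, 100.4, 99.6]
method_b = [99.9, 100.3, 99.5, 100.0, 100.2]

f_stat = f_ratio(method_a, method_b)
t_stat = pooled_t(method_a, method_b)
```

The F ratio addresses whether the two methods differ in precision, and the t statistic whether they differ in mean result.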
Robust Strategy for Rocket Engine Health Monitoring
NASA Technical Reports Server (NTRS)
Santi, L. Michael
2001-01-01
Monitoring the health of rocket engine systems is essentially a two-phase process. The acquisition phase involves sensing physical conditions at selected locations, converting physical inputs to electrical signals, conditioning the signals as appropriate to establish scale or filter interference, and recording results in a form that is easy to interpret. The inference phase involves analysis of results from the acquisition phase, comparison of analysis results to established health measures, and assessment of health indications. A variety of analytical tools may be employed in the inference phase of health monitoring. These tools can be separated into three broad categories: statistical, rule based, and model based. Statistical methods can provide excellent comparative measures of engine operating health. They require well-characterized data from an ensemble of "typical" engines, or "golden" data from a specific test assumed to define the operating norm in order to establish reliable comparative measures. Statistical methods are generally suitable for real-time health monitoring because they do not deal with the physical complexities of engine operation. The utility of statistical methods in rocket engine health monitoring is hindered by practical limits on the quantity and quality of available data. This is due to the difficulty and high cost of data acquisition, the limited number of available test engines, and the problem of simulating flight conditions in ground test facilities. In addition, statistical methods incur a penalty for disregarding flow complexity and are therefore limited in their ability to define performance shift causality. Rule based methods infer the health state of the engine system based on comparison of individual measurements or combinations of measurements with defined health norms or rules. This does not mean that rule based methods are necessarily simple. 
Although binary yes-no health assessment can sometimes be established by relatively simple rules, the causality assignment needed for refined health monitoring often requires an exceptionally complex rule base involving complicated logical maps. Structuring the rule system to be clear and unambiguous can be difficult, and the expert input required to maintain a large logic network and associated rule base can be prohibitive.
Messai, Habib; Farman, Muhammad; Sarraj-Laabidi, Abir; Hammami-Semmar, Asma; Semmar, Nabil
2016-01-01
Background. Olive oils (OOs) show high chemical variability due to several factors of genetic, environmental and anthropic types. Genetic and environmental factors are responsible for natural compositions and polymorphic diversification resulting in different varietal patterns and phenotypes. Anthropic factors, however, are at the origin of different blends’ preparation leading to normative, labelled or adulterated commercial products. Control of complex OO samples requires their (i) characterization by specific markers; (ii) authentication by fingerprint patterns; and (iii) monitoring by traceability analysis. Methods. These quality control and management aims require the use of several multivariate statistical tools: specificity highlighting requires ordination methods; authentication checking calls for classification and pattern recognition methods; traceability analysis implies the use of network-based approaches able to separate or extract mixed information and memorized signals from complex matrices. Results. This chapter presents a review of different chemometrics methods applied for the control of OO variability from metabolic and physical-chemical measured characteristics. The different chemometrics methods are illustrated by different study cases on monovarietal and blended OO originated from different countries. Conclusion. Chemometrics tools offer multiple ways for quantitative evaluations and qualitative control of complex chemical variability of OO in relation to several intrinsic and extrinsic factors. PMID:28231172
Statistical context shapes stimulus-specific adaptation in human auditory cortex
Henry, Molly J.; Fromboluti, Elisa Kim; McAuley, J. Devin
2015-01-01
Stimulus-specific adaptation is the phenomenon whereby neural response magnitude decreases with repeated stimulation. Inconsistencies between recent nonhuman animal recordings and computational modeling suggest dynamic influences on stimulus-specific adaptation. The present human electroencephalography (EEG) study investigates the potential role of statistical context in dynamically modulating stimulus-specific adaptation by examining the auditory cortex-generated N1 and P2 components. As in previous studies of stimulus-specific adaptation, listeners were presented with oddball sequences in which the presentation of a repeated tone was infrequently interrupted by rare spectral changes taking on three different magnitudes. Critically, the statistical context varied with respect to the probability of small versus large spectral changes within oddball sequences (half of the time a small change was most probable; in the other half a large change was most probable). We observed larger N1 and P2 amplitudes (i.e., release from adaptation) for all spectral changes in the small-change compared with the large-change statistical context. The increase in response magnitude also held for responses to tones presented with high probability, indicating that statistical adaptation can overrule stimulus probability per se in its influence on neural responses. Computational modeling showed that the degree of coadaptation in auditory cortex changed depending on the statistical context, which in turn affected stimulus-specific adaptation. Thus the present data demonstrate that stimulus-specific adaptation in human auditory cortex critically depends on statistical context. Finally, the present results challenge the implicit assumption of stationarity of neural response magnitudes that governs the practice of isolating established deviant-detection responses such as the mismatch negativity. PMID:25652920
Variable system: An alternative approach for the analysis of mediated moderation.
Kwan, Joyce Lok Yin; Chan, Wai
2018-06-01
Mediated moderation (meMO) occurs when the moderation effect of the moderator (W) on the relationship between the independent variable (X) and the dependent variable (Y) is transmitted through a mediator (M). To examine this process empirically, 2 different model specifications (Type I meMO and Type II meMO) have been proposed in the literature. However, both specifications are found to be problematic, either conceptually or statistically. For example, it can be shown that each type of meMO model is statistically equivalent to a particular form of moderated mediation (moME), another process that examines the condition when the indirect effect from X to Y through M varies as a function of W. Consequently, it is difficult for one to differentiate these 2 processes mathematically. This study therefore has 2 objectives. First, we attempt to differentiate moME and meMO by proposing an alternative specification for meMO. Conceptually, this alternative specification is intuitively meaningful and interpretable, and, statistically, it offers meMO a unique representation that is no longer identical to its moME counterpart. Second, using structural equation modeling, we propose an integrated approach for the analysis of meMO as well as for other general types of conditional path models. VS, a computer software program that implements the proposed approach, has been developed to facilitate the analysis of conditional path models for applied researchers. Real examples are considered to illustrate how the proposed approach works in practice and to compare its performance against the traditional methods. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Heilskov-Hansen, Thomas; Wulff Svendsen, Susanne; Frølund Thomsen, Jane; Mikkelsen, Sigurd; Hansson, Gert-Åke
2014-01-01
Objectives Sex differences in occupational biomechanical exposures may be part of the explanation why musculoskeletal complaints and disorders tend to be more common among women than among men. We aimed to determine possible sex differences in task distribution and task-specific postures and movements of the upper extremities among Danish house painters, and to establish sex-specific task exposure matrices. Methods To obtain task distributions, we sent out a questionnaire to all members of the Painters' Union in Denmark (N = 9364), of whom 53% responded. Respondents reported their task distributions in a typical week. To obtain task exposures, postures and movements were measured in 25 male and 25 female house painters for one whole working day per person. We used goniometers on the wrists, and inclinometers on the forehead and the upper arms. Participants filled in a logbook allowing task-specific exposures to be identified. Percentiles and % time with non-neutral postures were used to characterise postures. Velocity, range of motion, repetitiveness, and variation were used as measures of movement. Cochran-Mantel-Haenszel statistics and unpaired double-sided t-tests with post-hoc Bonferroni correction were used to evaluate sex differences. Results Statistically significant (p<0.05) sex differences were revealed in task proportions, but the proportions differed by less than 4%. For task exposures, no statistically significant sex differences were found. Conclusions Only minor sex differences were found in task distribution and task exposures regarding postures and movements among Danish house painters. Sex-specific task exposure matrices were established. PMID:25365301
Methodological choices affect cancer incidence rates: a cohort study.
Brooke, Hannah L; Talbäck, Mats; Feychting, Maria; Ljung, Rickard
2017-01-19
Incidence rates are fundamental to epidemiology, but their magnitude and interpretation depend on methodological choices. We aimed to examine the extent to which the definition of the study population affects cancer incidence rates. All primary cancer diagnoses in Sweden between 1958 and 2010 were identified from the national Cancer Register. Age-standardized and age-specific incidence rates of 29 cancer subtypes between 2000 and 2010 were calculated using four definitions of the study population: persons resident in Sweden 1) based on general population statistics; 2) with no previous subtype-specific cancer diagnosis; 3) with no previous cancer diagnosis except non-melanoma skin cancer; and 4) with no previous cancer diagnosis of any type. We calculated absolute and relative differences between methods. Age-standardized incidence rates calculated using general population statistics ranged from 6% lower (prostate cancer, incidence rate difference: -13.5/100,000 person-years) to 8% higher (breast cancer in women, incidence rate difference: 10.5/100,000 person-years) than incidence rates based on individuals with no previous subtype-specific cancer diagnosis. Age-standardized incidence rates in persons with no previous cancer of any type were up to 10% lower (bladder cancer in women) than rates in those with no previous subtype-specific cancer diagnosis; however, absolute differences were <5/100,000 person-years for all cancer subtypes. For some cancer subtypes incidence rates vary depending on the definition of the study population. For these subtypes, standardized incidence ratios calculated using general population statistics could be misleading. Moreover, etiological arguments should be used to inform methodological choices during study design.
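The dependence of an age-standardized rate on the chosen denominator can be illustrated with direct standardization. The sketch below uses invented counts, person-years, and weights, and the function name is hypothetical. It changes only the person-years at risk between two population definitions, whereas in the study the case counts can also differ between definitions.

```python
def age_standardized_rate(cases, person_years, std_weights, per=100_000):
    """Directly standardized rate: weighted mean of age-specific rates."""
    rates = [c / py for c, py in zip(cases, person_years)]
    weighted = sum(w * r for w, r in zip(std_weights, rates))
    return per * weighted / sum(std_weights)


# Invented data for three age bands (illustration only).
cases = [12, 80, 150]
py_general = [400_000, 300_000, 100_000]      # all residents
py_cancer_free = [399_000, 290_000, 90_000]   # excludes previously diagnosed persons
weights = [0.5, 0.3, 0.2]                     # hypothetical standard population

asr_general = age_standardized_rate(cases, py_general, weights)
asr_cancer_free = age_standardized_rate(cases, py_cancer_free, weights)
relative_difference = (asr_general - asr_cancer_free) / asr_cancer_free
```

Because removing previously diagnosed persons shrinks the denominator most in older age bands, the size of the difference depends on how common prior diagnoses are at the ages where the cancer occurs.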
Kaye, T.N.; Pyke, David A.
2003-01-01
Population viability analysis is an important tool for conservation biologists, and matrix models that incorporate stochasticity are commonly used for this purpose. However, stochastic simulations may require assumptions about the distribution of matrix parameters, and modelers often select a statistical distribution that seems reasonable without sufficient data to test its fit. We used data from long-term (5–10 year) studies with 27 populations of five perennial plant species to compare seven methods of incorporating environmental stochasticity. We estimated stochastic population growth rate (a measure of viability) using a matrix-selection method, in which whole observed matrices were selected at random at each time step of the model. In addition, we drew matrix elements (transition probabilities) at random using various statistical distributions: beta, truncated-gamma, truncated-normal, triangular, uniform, or discontinuous/observed. Recruitment rates were held constant at their observed mean values. Two methods of constraining stage-specific survival to ≤100% were also compared. Different methods of incorporating stochasticity and constraining matrix column sums interacted in their effects and resulted in different estimates of stochastic growth rate (differing by up to 16%). Modelers should be aware that when constraining stage-specific survival to 100%, different methods may introduce different levels of bias in transition element means, and when this happens, different distributions for generating random transition elements may result in different viability estimates. There was no species effect on the results and the growth rates derived from all methods were highly correlated with one another. We conclude that the absolute value of population viability estimates is sensitive to model assumptions, but the relative ranking of populations (and management treatments) is robust. 
Furthermore, these results are applicable to a range of perennial plants and possibly other life histories.
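The matrix-selection method described above, in which a whole observed matrix is drawn at random at each time step, can be sketched as a short simulation. This is a minimal illustration, not the authors' code: the two-stage matrices are invented, the function names are hypothetical, and the stochastic growth rate is estimated as the long-run average of the log of the one-step change in total population size.

```python
import math
import random


def mat_vec(matrix, vector):
    """Multiply a square projection matrix (list of rows) by a stage vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def stochastic_log_growth(matrices, n0, steps=5000, seed=0):
    """Estimate log stochastic growth rate by random matrix selection."""
    rng = random.Random(seed)
    total0 = sum(n0)
    n = [x / total0 for x in n0]       # work with stage proportions
    log_sum = 0.0
    for _ in range(steps):
        matrix = rng.choice(matrices)  # one whole observed matrix per time step
        n = mat_vec(matrix, n)
        total = sum(n)
        log_sum += math.log(total)     # one-step growth of total population
        n = [x / total for x in n]     # renormalise to avoid overflow or underflow
    return log_sum / steps


# Invented annual matrices for a two-stage (juvenile, adult) plant population.
observed = [
    [[0.2, 1.5], [0.3, 0.8]],   # a good year
    [[0.1, 0.6], [0.2, 0.7]],   # a poor year
    [[0.2, 1.0], [0.25, 0.75]], # an average year
]
log_lambda_s = stochastic_log_growth(observed, [10.0, 5.0])
```

Drawing individual matrix elements from fitted distributions, the alternative the abstract compares, would replace the `rng.choice` line with element-wise sampling.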
Weighted regularized statistical shape space projection for breast 3D model reconstruction.
Ruiz, Guillermo; Ramon, Eduard; García, Jaime; Sukno, Federico M; Ballester, Miguel A González
2018-07-01
The use of 3D imaging has increased as a practical and useful tool for plastic and aesthetic surgery planning. Specifically, the possibility of representing the patient's breast anatomy as a 3D shape and simulating aesthetic or plastic procedures is a great tool for communication between surgeon and patient during surgery planning. For the purpose of obtaining the specific 3D model of the breast of a patient, model-based reconstruction methods can be used. In particular, 3D morphable models (3DMM) are a robust and widely used method to perform 3D reconstruction. However, if additional prior information (i.e., known landmarks) is combined with the 3DMM statistical model, shape constraints can be imposed to improve the 3DMM fitting accuracy. In this paper, we present a framework to fit a 3DMM of the breast to two possible inputs: 2D photos and 3D point clouds (scans). Our method consists of a Weighted Regularized (WR) projection into the shape space. The contribution of each point in the 3DMM shape is weighted, so that more relevance can be assigned to those points that we want to impose as constraints. Our method is applied at multiple stages of the 3D reconstruction process. Firstly, it can be used to obtain a 3DMM initialization from a sparse set of 3D points. Additionally, we embed our method in the 3DMM fitting process, in which more reliable or already known 3D points, or regions of points, can be weighted in order to preserve their shape information. The proposed method has been tested in two different input settings, scans and 2D pictures, assessing both reconstruction frameworks with very positive results. Copyright © 2018 Elsevier B.V. All rights reserved.
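One common way to write a weighted, regularized projection onto a linear shape space is as a weighted ridge regression on the shape coefficients. The sketch below shows that generic formulation only. The objective, the symbols (`P` for the mode matrix, `sigma` for per-mode standard deviations, `lam` for the regularization strength), and the function names are assumptions made here for illustration, and the abstract does not confirm that the paper uses this exact form.

```python
def solve(A, y):
    """Solve A b = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    b = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * b[j] for j in range(i + 1, n))
        b[i] = s / M[i][i]
    return b


def wr_project(x, mean_shape, P, sigma, w, lam):
    """Minimise sum_i w_i (x_i - mean_i - (P b)_i)^2 + lam * sum_k (b_k / sigma_k)^2.

    Assumed objective; solved through its normal equations
    (P^T W P + lam * diag(1 / sigma^2)) b = P^T W (x - mean).
    """
    m, k = len(x), len(sigma)
    d = [xi - mi for xi, mi in zip(x, mean_shape)]
    A = [[sum(w[i] * P[i][a] * P[i][c] for i in range(m))
          + (lam / sigma[a] ** 2 if a == c else 0.0)
          for c in range(k)] for a in range(k)]
    y = [sum(w[i] * P[i][a] * d[i] for i in range(m)) for a in range(k)]
    return solve(A, y)


# Tiny made-up example: 3 coordinates, 2 shape modes, first coordinate trusted most.
coeffs = wr_project(
    x=[1.2, 0.9, 2.1], mean_shape=[1.0, 1.0, 2.0],
    P=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    sigma=[1.0, 0.5], w=[10.0, 1.0, 1.0], lam=0.1,
)
```

Raising an entry of `w` pulls the fitted shape toward the corresponding observed coordinate, which is how known landmarks could act as soft constraints in this formulation.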
A new method to search for high-redshift clusters using photometric redshifts
DOE Office of Scientific and Technical Information (OSTI.GOV)
Castignani, G.; Celotti, A.; Chiaberge, M.
2014-09-10
We describe a new method (Poisson probability method, PPM) to search for high-redshift galaxy clusters and groups by using photometric redshift information and galaxy number counts. The method relies on Poisson statistics and is primarily introduced to search for megaparsec-scale environments around a specific beacon. The PPM is tailored to both the properties of the FR I radio galaxies in the Chiaberge et al. sample, which are selected within the COSMOS survey, and to the specific data set used. We test the efficiency of our method of searching for cluster candidates against simulations. Two different approaches are adopted. (1) We use two z ∼ 1 X-ray detected cluster candidates found in the COSMOS survey and we shift them to higher redshift up to z = 2. We find that the PPM detects the cluster candidates up to z = 1.5, and it correctly estimates both the redshift and size of the two clusters. (2) We simulate spherically symmetric clusters of different size and richness, and we locate them at different redshifts (i.e., z = 1.0, 1.5, and 2.0) in the COSMOS field. We find that the PPM detects the simulated clusters within the considered redshift range with a statistical 1σ redshift accuracy of ∼0.05. The PPM is an efficient alternative method for high-redshift cluster searches that may also be applied to both present and future wide field surveys such as SDSS Stripe 82, LSST, and Euclid. Accurate photometric redshifts and a survey depth similar or better than that of COSMOS (e.g., I < 25) are required.
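The abstract states only that the method relies on Poisson statistics applied to galaxy number counts. As a generic illustration of that idea, and not a reconstruction of the PPM itself, the sketch below computes the probability of observing at least a given number of galaxies in a region when a known mean background count is expected. The function name and the example counts are hypothetical.

```python
import math


def poisson_tail(n_obs, mu):
    """P(N >= n_obs) for a Poisson variable with mean mu."""
    cdf_below = sum(math.exp(-mu) * mu ** k / math.factorial(k)
                    for k in range(n_obs))
    return 1.0 - cdf_below


# Hypothetical example: 12 galaxies counted where the field average predicts 4.
p_chance = poisson_tail(12, 4.0)
```

A small tail probability indicates that the count is unlikely to be a background fluctuation, which is the kind of evidence an overdensity search around a beacon would accumulate across redshift bins.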
NASA Technical Reports Server (NTRS)
Bull, William B. (Compiler); Pinoli, Pat C. (Compiler); Upton, Cindy G. (Compiler); Day, Tony (Compiler); Hill, Keith (Compiler); Stone, Frank (Compiler); Hall, William B.
1994-01-01
This report is a compendium of the presentations of the 12th biannual meeting of the Industry Advisory Committee under the Solid Propulsion Integrity Program. A complete transcript of the welcoming talks is provided. Presentation outlines and overheads are included for the other sessions: SPIP Overview, Past, Current and Future Activity; Test Methods Manual and Video Tape Library; Air Force Developed Computer Aided Cure Program and SPC/TQM Experience; Magneto-Optical mapper (MOM), Joint Army/NASA program to assess composite integrity; Permeability Testing; Moisture Effusion Testing by Karl Fischer Analysis; Statistical Analysis of Acceptance Test Data; NMR Phenolic Resin Advancement; Constituent Testing Highlights on the LDC Optimization Program; Carbon Sulfur Study, Performance Related Testing; Current Rayon Specifications and Future Availability; RSRM/SPC Implementation; SRM Test Methods, Delta/Titan/FBM/RSRM; and Open Forum on Performance Based Acceptance Testing -- Industry Experience.
Gu, Jinghua; Xuan, Jianhua; Riggins, Rebecca B; Chen, Li; Wang, Yue; Clarke, Robert
2012-08-01
Identification of transcriptional regulatory networks (TRNs) is of significant importance in computational biology for cancer research, providing a critical building block to unravel disease pathways. However, existing methods for TRN identification suffer from the inclusion of excessive 'noise' in microarray data and false-positives in binding data, especially when applied to human tumor-derived cell line studies. More robust methods that can counteract the imperfection of data sources are therefore needed for reliable identification of TRNs in this context. In this article, we propose to establish a link between the quality of one target gene to represent its regulator and the uncertainty of its expression to represent other target genes. Specifically, an outlier sum statistic was used to measure the aggregated evidence for regulation events between target genes and their corresponding transcription factors. A Gibbs sampling method was then developed to estimate the marginal distribution of the outlier sum statistic, hence, to uncover underlying regulatory relationships. To evaluate the effectiveness of our proposed method, we compared its performance with that of an existing sampling-based method using both simulation data and yeast cell cycle data. The experimental results show that our method consistently outperforms the competing method in different settings of signal-to-noise ratio and network topology, indicating its robustness for biological applications. Finally, we applied our method to breast cancer cell line data and demonstrated its ability to extract biologically meaningful regulatory modules related to estrogen signaling and action in breast cancer. The Gibbs sampler MATLAB package is freely available at http://www.cbil.ece.vt.edu/software.htm. Contact: xuan@vt.edu. Supplementary data are available at Bioinformatics online.
Gu, Jinghua; Xuan, Jianhua; Riggins, Rebecca B.; Chen, Li; Wang, Yue; Clarke, Robert
2012-01-01
Motivation: Identification of transcriptional regulatory networks (TRNs) is of significant importance in computational biology for cancer research, providing a critical building block to unravel disease pathways. However, existing methods for TRN identification suffer from the inclusion of excessive ‘noise’ in microarray data and false-positives in binding data, especially when applied to human tumor-derived cell line studies. More robust methods that can counteract the imperfection of data sources are therefore needed for reliable identification of TRNs in this context. Results: In this article, we propose to establish a link between the quality of one target gene to represent its regulator and the uncertainty of its expression to represent other target genes. Specifically, an outlier sum statistic was used to measure the aggregated evidence for regulation events between target genes and their corresponding transcription factors. A Gibbs sampling method was then developed to estimate the marginal distribution of the outlier sum statistic, hence, to uncover underlying regulatory relationships. To evaluate the effectiveness of our proposed method, we compared its performance with that of an existing sampling-based method using both simulation data and yeast cell cycle data. The experimental results show that our method consistently outperforms the competing method in different settings of signal-to-noise ratio and network topology, indicating its robustness for biological applications. Finally, we applied our method to breast cancer cell line data and demonstrated its ability to extract biologically meaningful regulatory modules related to estrogen signaling and action in breast cancer. Availability and implementation: The Gibbs sampler MATLAB package is freely available at http://www.cbil.ece.vt.edu/software.htm. Contact: xuan@vt.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22595208
NASA Astrophysics Data System (ADS)
Abdel-Ghany, Maha F.; Abdel-Aziz, Omar; Ayad, Miriam F.; Tadros, Mariam M.
New, simple, specific, accurate, precise and reproducible spectrophotometric methods have been developed and subsequently validated for determination of vildagliptin (VLG) and metformin (MET) in binary mixture. Zero order spectrophotometric method was the first method used for determination of MET in the range of 2-12 μg mL-1 by measuring the absorbance at 237.6 nm. The second method was derivative spectrophotometric technique; utilized for determination of MET at 247.4 nm, in the range of 1-12 μg mL-1. Derivative ratio spectrophotometric method was the third technique; used for determination of VLG in the range of 4-24 μg mL-1 at 265.8 nm. Fourth and fifth methods adopted for determination of VLG in the range of 4-24 μg mL-1; were ratio subtraction and mean centering spectrophotometric methods, respectively. All the results were statistically compared with the reported methods, using one-way analysis of variance (ANOVA). The developed methods were satisfactorily applied to analysis of the investigated drugs and proved to be specific and accurate for quality control of them in pharmaceutical dosage forms.
NASA Technical Reports Server (NTRS)
Grotjahn, Richard; Black, Robert; Leung, Ruby; Wehner, Michael F.; Barlow, Mathew; Bosilovich, Michael G.; Gershunov, Alexander; Gutowski, William J., Jr.; Gyakum, John R.; Katz, Richard W.;
2015-01-01
The objective of this paper is to review statistical methods, dynamics, modeling efforts, and trends related to temperature extremes, with a focus upon extreme events of short duration that affect parts of North America. These events are associated with large scale meteorological patterns (LSMPs). The statistics, dynamics, and modeling sections of this paper are written to be autonomous and so can be read separately. Methods to define extreme events statistics and to identify and connect LSMPs to extreme temperature events are presented. Recent advances in statistical techniques connect LSMPs to extreme temperatures through appropriately defined covariates that supplement more straightforward analyses. Various LSMPs, ranging from synoptic to planetary scale structures, are associated with extreme temperature events. Current knowledge about the synoptics and the dynamical mechanisms leading to the associated LSMPs is incomplete. Systematic studies of: the physics of LSMP life cycles, comprehensive model assessment of LSMP-extreme temperature event linkages, and LSMP properties are needed. Generally, climate models capture observed properties of heat waves and cold air outbreaks with some fidelity. However, they overestimate warm wave frequency and underestimate cold air outbreak frequency, and underestimate the collective influence of low-frequency modes on temperature extremes. Modeling studies have identified the impact of large-scale circulation anomalies and land-atmosphere interactions on changes in extreme temperatures. However, few studies have examined changes in LSMPs to more specifically understand the role of LSMPs on past and future extreme temperature changes. Even though LSMPs are resolvable by global and regional climate models, they are not necessarily well simulated. The paper concludes with unresolved issues and research questions.
Direct Statistical Simulation of Astrophysical and Geophysical Flows
NASA Astrophysics Data System (ADS)
Marston, B.; Tobias, S.
2011-12-01
Astrophysical and geophysical flows are amenable to direct statistical simulation (DSS), the calculation of statistical properties that does not rely upon accumulation by direct numerical simulation (DNS) (Tobias and Marston, 2011). Anisotropic and inhomogeneous flows, such as those found in the atmospheres of planets, in rotating stars, and in disks, provide the starting point for an expansion in fluctuations about the mean flow, leading to a hierarchy of equations of motion for the equal-time cumulants. The method is described for a general set of evolution equations, and then illustrated for two specific cases: (i) A barotropic jet on a rotating sphere (Marston, Conover, and Schneider, 2008); and (ii) A model of a stellar tachocline driven by relaxation to an underlying flow with shear (Cally 2001) for which a joint instability arises from the combination of shearing forces and magnetic stress. The reliability of DSS is assessed by comparing statistics so obtained against those accumulated from DNS, the traditional approach. The simplest non-trivial closure, CE2, sets the third and higher cumulants to zero yet yields qualitatively accurate low-order statistics for both systems. Physically CE2 retains only the eddy-mean flow interaction, and drops the eddy-eddy interaction. Quantitatively accurate zonal means are found for barotropic jet for long and short (but not intermediate) relaxation times, and for Cally problem in the case of strong shearing and large magnetic fields. Deficiencies in CE2 can be repaired at the CE3 level, that is by retaining the third cumulant (Marston 2011). We conclude by discussing possible extensions of the method both in terms of computational methods and the range of astrophysical and geophysical problems that are of interest.
Computer-aided auditing of prescription drug claims.
Iyengar, Vijay S; Hermiz, Keith B; Natarajan, Ramesh
2014-09-01
We describe a methodology for identifying and ranking candidate audit targets from a database of prescription drug claims. The relevant audit targets may include various entities such as prescribers, patients and pharmacies, who exhibit certain statistical behavior indicative of potential fraud and abuse over the prescription claims during a specified period of interest. Our overall approach is consistent with related work in statistical methods for detection of fraud and abuse, but has a relative emphasis on three specific aspects: first, based on the assessment of domain experts, certain focus areas are selected and data elements pertinent to the audit analysis in each focus area are identified; second, specialized statistical models are developed to characterize the normalized baseline behavior in each focus area; and third, statistical hypothesis testing is used to identify entities that diverge significantly from their expected behavior according to the relevant baseline model. The application of this overall methodology to a prescription claims database from a large health plan is considered in detail.
An ANOVA approach for statistical comparisons of brain networks.
Fraiman, Daniel; Fraiman, Ricardo
2018-03-16
The study of brain networks has developed extensively over the last couple of decades. By contrast, techniques for the statistical analysis of these networks are less developed. In this paper, we focus on the statistical comparison of brain networks in a nonparametric framework and discuss the associated detection and identification problems. We tested network differences between groups with an analysis of variance (ANOVA) test we developed specifically for networks. We also propose and analyse the behaviour of a new statistical procedure designed to identify different subnetworks. As an example, we show the application of this tool in resting-state fMRI data obtained from the Human Connectome Project. We identify, among other variables, that the amount of sleep in the days before the scan is a relevant variable that must be controlled. Finally, we discuss the potential bias in neuroimaging findings that is generated by some behavioural and brain structure variables. Our method can also be applied to other kinds of networks such as protein interaction networks, gene networks or social networks.
Statistical Analyses of Raw Material Data for MTM45-1/CF7442A-36% RW: CMH Cure Cycle
NASA Technical Reports Server (NTRS)
Coroneos, Rula; Pai, Shantaram S.; Murthy, Pappu
2013-01-01
This report describes statistical characterization of physical properties of the composite material system MTM45-1/CF7442A, which has been tested and is currently being considered for use on spacecraft structures. This composite system is made of 6K plain weave graphite fibers in a highly toughened resin system. This report summarizes the distribution types and statistical details of the tests and the conditions for the experimental data generated. These distributions will be used in multivariate regression analyses to help determine material and design allowables for similar material systems and to establish a procedure for other material systems. Additionally, these distributions will be used in future probabilistic analyses of spacecraft structures. The specific properties that are characterized are the ultimate strength, modulus, and Poisson's ratio by using a commercially available statistical package. Results are displayed using graphical and semigraphical methods and are included in the accompanying appendixes.
Pincus, Steven M; Schmidt, Peter J; Palladino-Negro, Paula; Rubinow, David R
2008-04-01
Enhanced statistical characterization of mood-rating data holds the potential to more precisely classify and sub-classify recurrent mood disorders like premenstrual dysphoric disorder (PMDD) and recurrent brief depressive disorder (RBD). We applied several complementary statistical methods to differentiate mood rating dynamics among women with PMDD, RBD, and normal controls (NC). We compared three subgroups of women: NC (n=8); PMDD (n=15); and RBD (n=9) on the basis of daily self-ratings of sadness, with study lengths between 50 and 120 days. We analyzed mean levels; overall variability, SD; sequential irregularity, approximate entropy (ApEn); and a quantification of the extent of brief and staccato dynamics, denoted 'Spikiness'. For each of SD, irregularity (ApEn), and Spikiness, we showed highly significant subgroup differences, ANOVA p < 0.001 for each statistic; additionally, many paired subgroup comparisons showed highly significant differences. In contrast, mean levels were indistinct among the subgroups. For SD, normal controls had much smaller levels than the other subgroups, with RBD intermediate. ApEn showed PMDD to be significantly more regular than the other subgroups. Spikiness showed NC and RBD data sets to be much more staccato than their PMDD counterparts, and appears to suitably characterize the defining feature of RBD dynamics. Compound criteria based on these statistical measures discriminated diagnostic subgroups with high sensitivity and specificity. Taken together, the statistical suite provides well-defined specifications of each subgroup. This can facilitate accurate diagnosis, and augment the prediction and evaluation of response to treatment. The statistical methodologies have broad and direct applicability to behavioral studies for many psychiatric disorders, and indeed to similar analyses of associated biological signals across multiple axes.
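As a rough illustration of the sequential-irregularity statistic named in the abstract above, the following is a minimal sketch of approximate entropy, ApEn(m, r), in its usual textbook form (self-matches counted). The function name, the default m = 2, and the tolerance r = 0.2 × SD are assumptions chosen for illustration; the abstract does not state which parameters the study used.

```python
import math


def approximate_entropy(series, m=2, r=None):
    """Approximate entropy ApEn(m, r) of a 1-D sequence, self-matches included."""
    n = len(series)
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")
    if r is None:
        # Assumed convention: tolerance = 0.2 * population standard deviation.
        mean = sum(series) / n
        r = 0.2 * math.sqrt(sum((x - mean) ** 2 for x in series) / n)

    def phi(k):
        # All overlapping templates of length k.
        templates = [series[i:i + k] for i in range(n - k + 1)]
        total = 0.0
        for a in templates:
            # Count templates within Chebyshev distance r of a (a matches itself).
            matches = sum(
                1 for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)
```

Lower values indicate a more regular sequence, so a strictly alternating series scores close to zero while an irregular one scores higher. The double loop is quadratic in the series length, which is acceptable for daily ratings spanning 50 to 120 days.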
Python package for model STructure ANalysis (pySTAN)
NASA Astrophysics Data System (ADS)
Van Hoey, Stijn; van der Kwast, Johannes; Nopens, Ingmar; Seuntjens, Piet
2013-04-01
The selection and identification of a suitable hydrological model structure is more than fitting parameters of a model structure to reproduce a measured hydrograph. The procedure is highly dependent on various criteria, i.e. the modelling objective, the characteristics and the scale of the system under investigation as well as the available data. Rigorous analysis of the candidate model structures is needed to support and objectify the selection of the most appropriate structure for a specific case (or eventually justify the use of a proposed ensemble of structures). This holds both in the situation of choosing between a limited set of different structures as well as in the framework of flexible model structures with interchangeable components. Many different methods to evaluate and analyse model structures exist. This leads to a sprawl of available methods, all characterized by different assumptions, changing conditions of application and various code implementations. Methods typically focus on optimization, sensitivity analysis or uncertainty analysis, with backgrounds from optimization, machine-learning or statistics amongst others. These methods also need an evaluation metric (objective function) to compare the model outcome with some observed data. However, for current methods described in literature, implementations are not always transparent and reproducible (if available at all). No standard procedures exist to share code and the popularity (and amount of applications) of the methods is sometimes more dependent on the availability than the merits of the method. Moreover, new implementations of existing methods are difficult to verify and the different theoretical backgrounds make it difficult for environmental scientists to decide about the usefulness of a specific method. A common and open framework with a large set of methods can support users in deciding about the most appropriate method. 
Hence, it enables users to apply and compare different methods simultaneously on a fair basis. We developed and present pySTAN (python framework for STructure Analysis), a python package containing a set of functions for model structure evaluation to support the analysis of (hydrological) model structures. A selected set of algorithms for optimization, uncertainty and sensitivity analysis is currently available, together with a set of evaluation (objective) functions and input distributions to sample from. The methods are implemented in a model-independent way, and the python language provides the wrapper functions to apply and administer external model codes. Different objective functions can be considered simultaneously, with both statistical metrics and more hydrology-specific metrics. By using so-called reStructuredText (sphinx documentation generator) and Python documentation strings (docstrings), the generation of manual pages is semi-automated and a specific environment is available to enhance both the readability and transparency of the code. It thereby enables a larger group of users to apply and compare these methods and to extend the functionalities.
Laserson, K F; Petralanda, I; Hamlin, D M; Almera, R; Fuentes, M; Carrasquel, A; Barker, R H
1994-02-01
We have examined the reproducibility, sensitivity, and specificity of detecting Plasmodium falciparum using the polymerase chain reaction (PCR) and the species-specific probe pPF14 under field conditions in the Venezuelan Amazon. Up to eight samples were field collected from each of 48 consenting Amerindians presenting with symptoms of malaria. Sample processing and analysis were performed at the Centro Amazonico para la Investigacion y Control de Enfermedades Tropicales Simon Bolivar. A total of 229 samples from 48 patients were analyzed by PCR methods using four different P. falciparum-specific probes and one P. vivax-specific probe, and by conventional microscopy. Samples in which results from PCR and microscopy differed were reanalyzed at a higher sensitivity by microscopy. Results suggest that microscopy-negative, PCR-positive samples are true positives, and that microscopy-positive and PCR-negative samples are true negatives. The sensitivity of the DNA probe/PCR method was 78% and its specificity was 97%. The positive predictive value of the PCR method was 88%, and the negative predictive value was 95%. Through the analysis of multiple blood samples from each individual, the DNA probe/PCR methodology was found to have an inherent reproducibility that was highly statistically significant.
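The four performance figures quoted in the abstract above (sensitivity, specificity, positive and negative predictive value) all derive from a 2x2 table of test result against reference result. The sketch below shows the standard definitions; the function name and the counts in the usage note are hypothetical and are not the study's data, which the abstract does not give.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic-test metrics from true/false positive/negative counts."""
    return {
        # Fraction of reference-positive samples the test detects.
        "sensitivity": tp / (tp + fn),
        # Fraction of reference-negative samples the test clears.
        "specificity": tn / (tn + fp),
        # Fraction of test-positive samples that are reference-positive.
        "ppv": tp / (tp + fp),
        # Fraction of test-negative samples that are reference-negative.
        "npv": tn / (tn + fn),
    }
```

For example, with purely illustrative counts, `diagnostic_metrics(tp=40, fp=5, fn=10, tn=95)` gives a sensitivity of 0.80 and a specificity of 0.95. Unlike sensitivity and specificity, the two predictive values depend on how common the condition is in the sampled population.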
Analysis of the sleep quality of elderly people using biomedical signals.
Moreno-Alsasua, L; Garcia-Zapirain, B; Mendez-Zorrilla, A
2015-01-01
This paper presents a technical solution that analyses sleep signals captured by biomedical sensors to find possible disorders during rest. Specifically, the method evaluates electrooculogram (EOG) signals, skin conductance (GSR), air flow (AS), and body temperature. Next, a quantitative sleep quality analysis determines significant changes in the biological signals, and any similarities between them in a given time period. Filtering techniques such as the Fourier transform method and IIR filters process the signal and identify significant variations. Once these changes have been identified, all significant data are compared and a quantitative and statistical analysis is carried out to determine the level of a person's rest. To evaluate the correlation and significant differences, a statistical analysis was performed, showing correlation between EOG and AS signals (p=0.005), EOG and GSR signals (p=0.037) and, finally, EOG and body temperature (p=0.04). Doctors could use this information to monitor changes within a patient.
Levine, Judah
2016-01-01
A method is presented for synchronizing the time of a clock to a remote time standard when the channel connecting the two has significant delay variation that can be described only statistically. The method compares the Allan deviation of the channel fluctuations to the free-running stability of the local clock, and computes the optimum interval between requests based on one of three selectable requirements: (1) choosing the highest possible accuracy, (2) choosing the best tradeoff of cost vs. accuracy, or (3) minimizing the number of requests to realize a specific accuracy. Once the interval between requests is chosen, the final step is to steer the local clock based on the received data. A typical adjustment algorithm, which supports both the statistical considerations based on the Allan deviation comparison and the timely detection of errors is included as an example. PMID:26529759
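The comparison described in the abstract above rests on the Allan deviation of the time-difference data. The following is a minimal sketch of the standard overlapping Allan deviation estimator for equally spaced phase (time-error) samples; it is a generic textbook estimator, not the paper's adjustment algorithm, and the function and parameter names are assumptions made for illustration.

```python
import math


def overlapping_allan_deviation(phase, tau0, m):
    """Overlapping Allan deviation at averaging time tau = m * tau0.

    phase: time-error samples x_i in seconds, spaced tau0 seconds apart.
    """
    n = len(phase)
    if m < 1 or n < 2 * m + 1:
        raise ValueError("need at least 2*m + 1 phase samples")
    tau = m * tau0
    acc = 0.0
    for i in range(n - 2 * m):
        # Second difference of the phase over a span of m samples.
        d = phase[i + 2 * m] - 2.0 * phase[i + m] + phase[i]
        acc += d * d
    # Allan variance, then its square root.
    avar = acc / (2.0 * (n - 2 * m) * tau * tau)
    return math.sqrt(avar)
```

Evaluating this over a range of m for both the channel delay data and the free-running local clock gives the two stability curves that the abstract compares. A constant frequency offset contributes nothing to the result, while a linear frequency drift D contributes D·tau/√2.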
Vargas-Rodriguez, Everardo; Guzman-Chavez, Ana Dinora; Baeza-Serrato, Roberto
2018-06-04
In this work, a novel tailored algorithm to enhance the overall sensitivity of gas concentration sensors based on the Direct Absorption Tunable Laser Absorption Spectroscopy (DA-ATLAS) method is presented. By using this algorithm, the sensor sensitivity can be custom-designed to be quasi constant over a much larger dynamic range compared with that obtained by typical methods based on a single statistics feature of the sensor signal output (peak amplitude, area under the curve, mean or RMS). Additionally, it is shown that with our algorithm, an optimal function can be tailored to get a quasi linear relationship between the concentration and some specific statistics features over a wider dynamic range. In order to test the viability of our algorithm, a basic C2H2 sensor based on DA-ATLAS was implemented, and its experimental measurements support the simulated results provided by our algorithm.
Chen, Chih-Hao; Hsu, Chueh-Lin; Huang, Shih-Hao; Chen, Shih-Yuan; Hung, Yi-Lin; Chen, Hsiao-Rong; Wu, Yu-Chung
2015-01-01
Although genome-wide expression analysis has become a routine tool for gaining insight into molecular mechanisms, extraction of information remains a major challenge. It has been unclear why standard statistical methods, such as the t-test and ANOVA, often lead to low levels of reproducibility, how likely applying fold-change cutoffs to enhance reproducibility is to miss key signals, and how adversely using such methods has affected data interpretations. We broadly examined expression data to investigate the reproducibility problem and discovered that molecular heterogeneity, a biological property of genetically different samples, has been improperly handled by the statistical methods. Here we give a mathematical description of the discovery and report the development of a statistical method, named HTA, for better handling molecular heterogeneity. We broadly demonstrate the improved sensitivity and specificity of HTA over the conventional methods and show that using fold-change cutoffs has lost much information. We illustrate the especial usefulness of HTA for heterogeneous diseases, by applying it to existing data sets of schizophrenia, bipolar disorder and Parkinson’s disease, and show it can abundantly and reproducibly uncover disease signatures not previously detectable. Based on 156 biological data sets, we estimate that the methodological issue has affected over 96% of expression studies and that HTA can profoundly correct 86% of the affected data interpretations. The methodological advancement can better facilitate systems understandings of biological processes, render biological inferences that are more reliable than they have hitherto been and engender translational medical applications, such as identifying diagnostic biomarkers and drug prediction, which are more robust. PMID:25793610
Alles, Susan; Peng, Linda X; Mozola, Mark A
2009-01-01
A modification to Performance-Tested Method 010403, GeneQuence Listeria Test (DNAH method), is described. The modified method uses a new media formulation, LESS enrichment broth, in single-step enrichment protocols for both foods and environmental sponge and swab samples. Food samples are enriched for 27-30 h at 30 degrees C, and environmental samples for 24-48 h at 30 degrees C. Implementation of these abbreviated enrichment procedures allows test results to be obtained on a next-day basis. In testing of 14 food types in internal comparative studies with inoculated samples, there were statistically significant differences in method performance between the DNAH method and reference culture procedures for only 2 foods (pasteurized crab meat and lettuce) at the 27 h enrichment time point and for only a single food (pasteurized crab meat) in one trial at the 30 h enrichment time point. Independent laboratory testing with 3 foods showed statistical equivalence between the methods for all foods, and results support the findings of the internal trials. Overall, considering both internal and independent laboratory trials, sensitivity of the DNAH method relative to the reference culture procedures was 90.5%. Results of testing 5 environmental surfaces inoculated with various strains of Listeria spp. showed that the DNAH method was more productive than the reference U.S. Department of Agriculture-Food Safety and Inspection Service (USDA-FSIS) culture procedure for 3 surfaces (stainless steel, plastic, and cast iron), whereas results were statistically equivalent to the reference method for the other 2 surfaces (ceramic tile and sealed concrete). An independent laboratory trial with ceramic tile inoculated with L. monocytogenes confirmed the effectiveness of the DNAH method at the 24 h time point. Overall, sensitivity of the DNAH method at 24 h relative to that of the USDA-FSIS method was 152%. 
The DNAH method exhibited extremely high specificity, with only 1% false-positive reactions overall.
ERIC Educational Resources Information Center
Zhong, Hua; Schwartz, Jennifer
2010-01-01
Underage drinking is among the most serious of public health problems facing adolescents in the United States. Recent concerns have centered on young women, reflected in media reports and arrest statistics on their increasing problematic alcohol use. This study rigorously examined whether girls' alcohol use rose by applying time series methods to…
Operational Planning of Channel Airlift Missions Using Forecasted Demand
2013-03-01
tailored to the specific problem (Metaheuristics, 2005). As seen in the section Cargo Loading Algorithm, heuristic methods are often iterative...that are equivalent to the forecasted cargo amount. The simulated pallets are then used in a heuristic cargo loading algorithm. The loading...algorithm places cargo onto available aircraft (based on real schedules) given the date and the destination and outputs statistics based on the aircraft ton
The application of automatic recognition techniques in the Apollo 9 SO-65 experiment
NASA Technical Reports Server (NTRS)
Macdonald, R. B.
1970-01-01
A synoptic feature analysis is reported on Apollo 9 remote earth surface photographs that uses the methods of statistical pattern recognition to classify density points and clusterings in digital conversion of optical data. A computer derived geological map of a geological test site indicates that geological features of the range are separable, but that specific rock types are not identifiable.
ERIC Educational Resources Information Center
Van Houten, Ron; Malenfant, J. E. Louis; Zhao, Nan; Ko, Byungkon; Van Houten, Jonathan
2005-01-01
The Florida Department of Transportation used a series of changeable-message signs that functioned as freeway guide signs to divert traffic to Universal Theme Park via one of two eastbound exits based on traffic congestion at the first of the two exits. An examination of crashes along the entire route indicated a statistically significant increase…
Method for Establishing Direction of Arrival by Use of Signals of Opportunity
2017-08-29
March 2018 The below identified patent application is available for licensing. Requests for information should be addressed to: TECHNOLOGY...without the payment of any royalties thereon or therefor. CROSS REFERENCE TO OTHER PATENT APPLICATIONS [0002] None. BACKGROUND OF THE INVENTION (1...based on a statistical model of a partitioned aperture communications receiving system and specifically a receiving system to converge on a best
A Survey of ChalleNGe Program Teachers: Their Characteristics and Pedagogical Approaches
2015-08-01
aloud, and reading books of their choice during class. As we have done with other pedagogical methods, we estimated the relationship between the...significant relationships between pedagogical practices and average cadet outcomes. When considering the impact of specific math subjects and the extent...ChalleNGe). Although we did find some statistically significant relationships between pedagogical approaches and cadets’ average outcomes, we
ERIC Educational Resources Information Center
Bradford, Jennifer; Mowder, Denise; Bohte, Joy
2016-01-01
The current project conducted an assessment of specific, directed use of student-centered teaching techniques in a criminal justice and criminology research methods and statistics class. The project sought to ascertain to what extent these techniques improved or impacted student learning and engagement in this traditionally difficult course.…
Comparison analysis for classification algorithm in data mining and the study of model use
NASA Astrophysics Data System (ADS)
Chen, Junde; Zhang, Defu
2018-04-01
As a key technique in data mining, classification algorithms have received extensive attention. Through an experiment with classification algorithms on the UCI data set, we present a comparative analysis method for the different algorithms, using a statistical test. In addition, an adaptive diagnosis model for preventing electricity theft and leakage is given as a specific case in the paper.
1986-02-01
especially true for the topics of sampling and analytical methods, statistical considerations, and the design of general water quality monitoring networks. For...and to the establishment and habitat differentiation of biological populations within reservoirs. Reservoir operation, especially the timing...properties of bottom sediments, as well as specific habitat associations of biological populations of reservoirs. Thus, such heterogeneities
A data recipient centered de-identification method to retain statistical attributes.
Gal, Tamas S; Tucker, Thomas C; Gangopadhyay, Aryya; Chen, Zhiyuan
2014-08-01
Privacy has always been a great concern of patients and medical service providers. As a result of the recent advances in information technology and the government's push for the use of Electronic Health Record (EHR) systems, a large amount of medical data is collected and stored electronically. This data needs to be made available for analysis but at the same time patient privacy has to be protected through de-identification. Although biomedical researchers often describe their research plans when they request anonymized data, most existing anonymization methods do not use this information when de-identifying the data. As a result, the anonymized data may not be useful for the planned research project. This paper proposes a data recipient centered approach to tailor the de-identification method based on input from the recipient of the data. We demonstrate our approach through an anonymization project for biomedical researchers with specific goals to improve the utility of the anonymized data for statistical models used for their research project. The selected algorithm improves a privacy protection method called Condensation by Aggarwal et al. Our methods were tested and validated on real cancer surveillance data provided by the Kentucky Cancer Registry.
Görgen, Kai; Hebart, Martin N; Allefeld, Carsten; Haynes, John-Dylan
2017-12-27
Standard neuroimaging data analysis based on traditional principles of experimental design, modelling, and statistical inference is increasingly complemented by novel analysis methods, driven e.g. by machine learning methods. While these novel approaches provide new insights into neuroimaging data, they often have unexpected properties, generating a growing literature on possible pitfalls. We propose to meet this challenge by adopting a habit of systematic testing of experimental design, analysis procedures, and statistical inference. Specifically, we suggest to apply the analysis method used for experimental data also to aspects of the experimental design, simulated confounds, simulated null data, and control data. We stress the importance of keeping the analysis method the same in main and test analyses, because only this way possible confounds and unexpected properties can be reliably detected and avoided. We describe and discuss this Same Analysis Approach in detail, and demonstrate it in two worked examples using multivariate decoding. With these examples, we reveal two sources of error: A mismatch between counterbalancing (crossover designs) and cross-validation which leads to systematic below-chance accuracies, and linear decoding of a nonlinear effect, a difference in variance.
Gao, Yang; Bian, Zhaoying; Huang, Jing; Zhang, Yunwan; Niu, Shanzhou; Feng, Qianjin; Chen, Wufan; Liang, Zhengrong; Ma, Jianhua
2014-06-16
To realize low-dose imaging in X-ray computed tomography (CT) examination, lowering milliampere-seconds (low-mAs) or reducing the required number of projection views (sparse-view) per rotation around the body has been widely studied as an easy and effective approach. In this study, we are focusing on low-dose CT image reconstruction from the sinograms acquired with a combined low-mAs and sparse-view protocol and propose a two-step image reconstruction strategy. Specifically, to suppress significant statistical noise in the noisy and insufficient sinograms, an adaptive sinogram restoration (ASR) method is first proposed with consideration of the statistical property of sinogram data, and then to further acquire a high-quality image, a total variation based projection onto convex sets (TV-POCS) method is adopted with a slight modification. For simplicity, the present reconstruction strategy was termed as "ASR-TV-POCS." To evaluate the present ASR-TV-POCS method, both qualitative and quantitative studies were performed on a physical phantom. Experimental results have demonstrated that the present ASR-TV-POCS method can achieve promising gains over other existing methods in terms of the noise reduction, contrast-to-noise ratio, and edge detail preservation.
NASA Astrophysics Data System (ADS)
Pedretti, Daniele; Beckie, Roger Daniel
2014-05-01
Missing data in hydrological time-series databases are ubiquitous in practical applications, yet it is of fundamental importance to make educated decisions in problems involving exhaustive time-series knowledge. This includes precipitation datasets, since recording or human failures can produce gaps in these time series. For some applications, directly involving the ratio between precipitation and some other quantity, lack of complete information can result in poor understanding of basic physical and chemical dynamics involving precipitated water. For instance, the ratio between precipitation (recharge) and outflow rates at a discharge point of an aquifer (e.g. rivers, pumping wells, lysimeters) can be used to obtain aquifer parameters and thus to constrain model-based predictions. We tested a suite of methodologies to reconstruct missing information in rainfall datasets. The goal was to obtain a suitable and versatile method to reduce the errors given by the lack of data in specific time windows. Our analyses included both a classical chronologically-pairing approach between rainfall stations and a probability-based approach, which accounted for the probability of exceedance of rain depths measured at two or multiple stations. Our analyses showed that it is not clear a priori which method performs best. Rather, this selection should be based on the specific statistical properties of the rainfall dataset. In this presentation, our emphasis is to discuss the effects of a few typical parametric distributions used to model the behavior of rainfall. Specifically, we analyzed the role of distributional "tails", which have an important control on the occurrence of extreme rainfall events. The latter strongly affect several hydrological applications, including recharge-discharge relationships. The heavy-tailed distributions we considered were parametric Log-Normal, Generalized Pareto, Generalized Extreme and Gamma distributions.
The methods were first tested on synthetic examples, to have complete control of the impact of several variables such as the minimum amount of data required to obtain reliable statistical distributions from the selected parametric functions. Then, we applied the methodology to precipitation datasets collected in the Vancouver area and on a mining site in Peru.
NASA Astrophysics Data System (ADS)
Mohamed, Heba M.
2015-02-01
Itopride hydrochloride (IT) and Rabeprazole sodium (RB) are co-formulated for the treatment of gastro-esophageal reflux disease. Three simple, specific and accurate spectrophotometric methods were applied and validated for the simultaneous determination of IT and RB, namely the constant center (CC), ratio difference (RD) and mean centering of ratio spectra (MCR) methods. Linear correlations were obtained in the range of 10-110 μg/μL for Itopride hydrochloride and 4-44 μg/mL for Rabeprazole sodium. No preliminary separation steps were required prior to the analysis of the two drugs using the proposed methods. Specificity was investigated by analyzing synthetic mixtures containing the two cited drugs and their capsule dosage form. The obtained results were statistically compared with those obtained by the reported method; no significant difference was found with respect to accuracy and precision. The three methods were validated in accordance with ICH guidelines and can be used in quality control laboratories for IT and RB.
Mohamed, Heba M
2015-02-05
Itopride hydrochloride (IT) and Rabeprazole sodium (RB) are co-formulated for the treatment of gastro-esophageal reflux disease. Three simple, specific and accurate spectrophotometric methods were applied and validated for the simultaneous determination of IT and RB, namely the constant center (CC), ratio difference (RD) and mean centering of ratio spectra (MCR) methods. Linear correlations were obtained in the range of 10-110 μg/μL for Itopride hydrochloride and 4-44 μg/mL for Rabeprazole sodium. No preliminary separation steps were required prior to the analysis of the two drugs using the proposed methods. Specificity was investigated by analyzing synthetic mixtures containing the two cited drugs and their capsule dosage form. The obtained results were statistically compared with those obtained by the reported method; no significant difference was found with respect to accuracy and precision. The three methods were validated in accordance with ICH guidelines and can be used in quality control laboratories for IT and RB. Copyright © 2014 Elsevier B.V. All rights reserved.
2013-01-01
Background: Intraoperative detection of 18F-FDG-avid tissue sites during 18F-FDG-directed surgery can be very challenging when utilizing gamma detection probes that rely on a fixed target-to-background (T/B) ratio (ratiometric threshold) for determination of probe positivity. The purpose of our study was to evaluate the counting efficiency and the success rate of in situ intraoperative detection of 18F-FDG-avid tissue sites (using the three-sigma statistical threshold criteria method and the ratiometric threshold criteria method) for three different gamma detection probe systems. Methods: Of 58 patients undergoing 18F-FDG-directed surgery for known or suspected malignancy using gamma detection probes, we identified nine 18F-FDG-avid tissue sites (from amongst seven patients) that were seen on same-day preoperative diagnostic PET/CT imaging, and for which each 18F-FDG-avid tissue site underwent attempted in situ intraoperative detection concurrently using three gamma detection probe systems (K-alpha probe, and two commercially-available PET-probe systems), and was subsequently surgically excised. Results: The mean relative probe counting efficiency ratio was 6.9 (± 4.4, range 2.2–15.4) for the K-alpha probe, as compared to 1.5 (± 0.3, range 1.0–2.1) and 1.0 (± 0, range 1.0–1.0), respectively, for two commercially-available PET-probe systems (P < 0.001). Successful in situ intraoperative detection of 18F-FDG-avid tissue sites was more frequently accomplished with each of the three gamma detection probes tested by using the three-sigma statistical threshold criteria method than by using the ratiometric threshold criteria method, specifically with the three-sigma statistical threshold criteria method being significantly better than the ratiometric threshold criteria method for determining probe positivity for the K-alpha probe (P = 0.05).
Conclusions: Our results suggest that the improved probe counting efficiency of the K-alpha probe design used in conjunction with the three-sigma statistical threshold criteria method can allow for improved detection of 18F-FDG-avid tissue sites when a low in situ T/B ratio is encountered. PMID:23496877
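The two probe-positivity criteria contrasted in this abstract can be illustrated with a minimal sketch. It assumes that the three-sigma criterion means "target count exceeds the background mean by more than three background standard deviations"; the function names, the count values, and the fixed ratio of 1.5 are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev

def three_sigma_positive(target_counts, background_counts):
    """Positive if the mean target count exceeds background mean + 3 SD."""
    threshold = mean(background_counts) + 3 * stdev(background_counts)
    return mean(target_counts) > threshold

def ratiometric_positive(target_counts, background_counts, min_ratio=1.5):
    """Positive if the target-to-background ratio reaches a fixed value.

    min_ratio=1.5 is an illustrative assumption, not a value from the abstract.
    """
    return mean(target_counts) / mean(background_counts) >= min_ratio

# Hypothetical probe readings (counts per second) with a low T/B ratio of 1.2.
background = [100, 104, 96, 102, 98]
target = [120, 118, 122]

sigma_result = three_sigma_positive(target, background)
ratio_result = ratiometric_positive(target, background)
```

With these made-up readings the background is stable, so a site only 20% above background passes the three-sigma criterion while failing the fixed-ratio criterion, which is the low T/B situation the conclusions refer to.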
A method for identifying color vision deficiency malingering.
Pouw, Andrew; Karanjia, Rustum; Sadun, Alfredo
2017-03-01
To propose a new test to identify color vision deficiency malingering. An online survey was distributed to 130 truly color vision deficient participants and 160 participants willing to simulate color vision deficiency. The survey contained three sets of six color-adjusted versions of the standard Ishihara color plates each, as well as one set of six control plates. The plates that best discriminated both participant groups were selected for a "balanced" test emphasizing both sensitivity and specificity. A "specific" test that prioritized high specificity was also created by selecting from these plates. Statistical measures of the test (sensitivity, specificity, and Youden index) were assessed at each possible cut-off threshold, and a receiver operating characteristic (ROC) function with its area under the curve (AUC) charted. The redshift plate set was identified as having the highest difference of means between groups (-58%, CI: -64 to -52%), as well as the widest gap between group modes. Statistical measures of the "balanced" test show an optimal cut-off of at least two incorrectly identified plates to suggest malingering (Youden index: 0.773, sensitivity: 83.3%, specificity: 94.0%, AUC of ROC 0.918). The "specific" test was able to identify color vision deficiency simulators with a specificity of 100% when using a cut-off of at least two incorrectly identified plates (Youden index 0.599, sensitivity 59.9%, specificity 100%, AUC of ROC 0.881). Our proposed test for identifying color vision deficiency malingering demonstrates a high degree of reliability with AUCs of 0.918 and 0.881 for the "balanced" and "specific" tests, respectively. A cut-off threshold of at least two missed plates on the "specific" test was able to identify color vision deficiency simulators with 100% specificity.
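The Youden index values quoted above follow directly from the reported sensitivity and specificity, using the standard definition J = sensitivity + specificity - 1. A short check, with a hypothetical helper name:

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0

# Values reported in the abstract for a cut-off of at least two missed plates.
balanced_j = youden_index(0.833, 0.940)  # "balanced" test
specific_j = youden_index(0.599, 1.000)  # "specific" test

print(round(balanced_j, 3), round(specific_j, 3))
```

Both results agree with the indices reported in the abstract (0.773 and 0.599).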
Voss, Frank D.
2003-01-01
In a joint effort by the Washington State Department of Agriculture, the Washington Department of Ecology, and the U.S. Geological Survey, the Environmental Protection Agency's Pesticide Root Zone Model and a Geographic Information System were used to develop and test a method for screening and mapping the susceptibility of ground water in agricultural areas to pesticide contamination. The objective was to produce a map that would be used by the Washington State Department of Agriculture to allocate resources for monitoring pesticide levels in ground water. The method was tested by producing a map showing susceptibility to leaching of the pesticide atrazine for the Columbia Basin Irrigation Project, which encompasses an area of intensive agriculture in eastern Washington. The reliability of the atrazine map was assessed by using statistical procedures to determine whether the median of the percentage of atrazine simulated to leach below the root zone in wells where atrazine was detected was statistically greater than the median percentage at wells where atrazine was not detected (at or above 0.001 microgram per liter) in 134 wells sampled by the U.S. Geological Survey. A statistical difference in medians was not found when all 134 wells were compared. However, a statistical difference was found in medians for two subsets of the 134 wells that were used in land-use studies (studies examining the quality of ground water beneath specific crops). The statistical results from wells from the land-use studies indicate that the model potentially can be used to map the relative susceptibility of agricultural areas to atrazine leaching. However, the distinction between areas of high and low susceptibility may not yet be sufficient to use the method for allocating resources to monitor water quality. Several options are offered for improving the reliability of future simulations.
Cross-Validation of Survival Bump Hunting by Recursive Peeling Methods.
Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J Sunil
2014-08-01
We introduce a survival/risk bump hunting framework to build a bump hunting model with a possibly censored time-to-event type of response and to validate model estimates. First, we describe the use of adequate survival peeling criteria to build a survival/risk bump hunting model based on recursive peeling methods. Our method called "Patient Recursive Survival Peeling" is a rule-induction method that makes use of specific peeling criteria such as hazard ratio or log-rank statistics. Second, to validate our model estimates and improve survival prediction accuracy, we describe a resampling-based validation technique specifically designed for the joint task of decision rule making by recursive peeling (i.e. decision-box) and survival estimation. This alternative technique, called "combined" cross-validation is done by combining test samples over the cross-validation loops, a design allowing for bump hunting by recursive peeling in a survival setting. We provide empirical results showing the importance of cross-validation and replication.
Cross-Validation of Survival Bump Hunting by Recursive Peeling Methods
Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J. Sunil
2015-01-01
We introduce a survival/risk bump hunting framework to build a bump hunting model with a possibly censored time-to-event type of response and to validate model estimates. First, we describe the use of adequate survival peeling criteria to build a survival/risk bump hunting model based on recursive peeling methods. Our method called “Patient Recursive Survival Peeling” is a rule-induction method that makes use of specific peeling criteria such as hazard ratio or log-rank statistics. Second, to validate our model estimates and improve survival prediction accuracy, we describe a resampling-based validation technique specifically designed for the joint task of decision rule making by recursive peeling (i.e. decision-box) and survival estimation. This alternative technique, called “combined” cross-validation is done by combining test samples over the cross-validation loops, a design allowing for bump hunting by recursive peeling in a survival setting. We provide empirical results showing the importance of cross-validation and replication. PMID:26997922
A note on evaluating VAN earthquake predictions
NASA Astrophysics Data System (ADS)
Tselentis, G.-Akis; Melis, Nicos S.
The evaluation of the success level of an earthquake prediction method should not be based on approaches that apply generalized strict statistical laws and avoid the specific nature of the earthquake phenomenon. Fault rupture processes cannot be compared to gambling processes. The outcome of the present note is that even an ideal earthquake prediction method is still shown to be a matter of a “chancy” association between precursors and earthquakes if we apply the same procedure proposed by Mulargia and Gasperini [1992] in evaluating VAN earthquake predictions. Each individual VAN prediction has to be evaluated separately, taking always into account the specific circumstances and information available. The success level of epicenter prediction should depend on the earthquake magnitude, and magnitude and time predictions may depend on earthquake clustering and the tectonic regime respectively.
Mallett, Susan; Halligan, Steve; Collins, Gary S; Altman, Doug G
2014-01-01
Different methods of evaluating diagnostic performance when comparing diagnostic tests may lead to different results. We compared two such approaches, sensitivity and specificity with area under the Receiver Operating Characteristic Curve (ROC AUC) for the evaluation of CT colonography for the detection of polyps, either with or without computer assisted detection. In a multireader multicase study of 10 readers and 107 cases we compared sensitivity and specificity, using radiological reporting of the presence or absence of polyps, to ROC AUC calculated from confidence scores concerning the presence of polyps. Both methods were assessed against a reference standard. Here we focus on five readers, selected to illustrate issues in design and analysis. We compared diagnostic measures within readers, showing that differences in results are due to statistical methods. Reader performance varied widely depending on whether sensitivity and specificity or ROC AUC was used. There were problems using confidence scores; in assigning scores to all cases; in use of zero scores when no polyps were identified; the bimodal non-normal distribution of scores; fitting ROC curves due to extrapolation beyond the study data; and the undue influence of a few false positive results. Variation due to use of different ROC methods exceeded differences between test results for ROC AUC. The confidence scores recorded in our study violated many assumptions of ROC AUC methods, rendering these methods inappropriate. The problems we identified will apply to other detection studies using confidence scores. We found sensitivity and specificity were a more reliable and clinically appropriate method to compare diagnostic tests.
Dahabreh, Issa J; Trikalinos, Thomas A; Lau, Joseph; Schmid, Christopher H
2017-03-01
To compare statistical methods for meta-analysis of sensitivity and specificity of medical tests (e.g., diagnostic or screening tests). We constructed a database of PubMed-indexed meta-analyses of test performance from which 2 × 2 tables for each included study could be extracted. We reanalyzed the data using univariate and bivariate random effects models fit with inverse variance and maximum likelihood methods. Analyses were performed using both normal and binomial likelihoods to describe within-study variability. The bivariate model using the binomial likelihood was also fit using a fully Bayesian approach. We use two worked examples (thoracic computerized tomography to detect aortic injury and rapid prescreening of Papanicolaou smears to detect cytological abnormalities) to highlight that different meta-analysis approaches can produce different results. We also present results from reanalysis of 308 meta-analyses of sensitivity and specificity. Models using the normal approximation produced sensitivity and specificity estimates closer to 50% and smaller standard errors compared to models using the binomial likelihood; absolute differences of 5% or greater were observed in 12% and 5% of meta-analyses for sensitivity and specificity, respectively. Results from univariate and bivariate random effects models were similar, regardless of estimation method. Maximum likelihood and Bayesian methods produced almost identical summary estimates under the bivariate model; however, Bayesian analyses indicated greater uncertainty around those estimates. Bivariate models produced imprecise estimates of the between-study correlation of sensitivity and specificity. Differences between methods were larger with increasing proportion of studies that were small or required a continuity correction. The binomial likelihood should be used to model within-study variability. Univariate and bivariate models give similar estimates of the marginal distributions for sensitivity and specificity.
Bayesian methods fully quantify uncertainty and their ability to incorporate external evidence may be useful for imprecisely estimated parameters. Copyright © 2017 Elsevier Inc. All rights reserved.
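One way to see why normal-approximation models can pull estimates toward 50% is the continuity correction needed when a 2 × 2 table contains a zero cell. The sketch below assumes the common convention of adding 0.5 to both cells when either is zero; the abstract does not state which correction was used, and the function name and study counts are hypothetical.

```python
import math

def corrected_sensitivity(tp, fn, correction=0.5):
    """Return (sensitivity, logit sensitivity) for one study.

    If either cell is zero the logit is undefined, so `correction` is added
    to both cells first (0.5 is an assumed, conventional choice).
    """
    if tp == 0 or fn == 0:
        tp, fn = tp + correction, fn + correction
    sensitivity = tp / (tp + fn)
    return sensitivity, math.log(tp / fn)

# Hypothetical small study: 10 true positives, 0 false negatives.
raw_sensitivity = 10 / (10 + 0)
sens, logit_sens = corrected_sensitivity(10, 0)
```

The observed sensitivity of 1.0 becomes 10.5/11 after correction, that is, it moves toward 50%. A binomial likelihood models the counts directly and needs no such adjustment.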
Kim, Yun Hak; Jeong, Dae Cheon; Pak, Kyoungjune; Goh, Tae Sik; Lee, Chi-Seung; Han, Myoung-Eun; Kim, Ji-Young; Liangwen, Liu; Kim, Chi Dae; Jang, Jeon Yeob; Cha, Wonjae; Oh, Sae-Ock
2017-09-29
Accurate prediction of prognosis is critical for therapeutic decisions regarding cancer patients. Many previously developed prognostic scoring systems have limitations in reflecting recent progress in the field of cancer biology such as microarray, next-generation sequencing, and signaling pathways. To develop a new prognostic scoring system for cancer patients, we used mRNA expression and clinical data in various independent breast cancer cohorts (n=1214) from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) and Gene Expression Omnibus (GEO). A new prognostic score that reflects gene network inherent in genomic big data was calculated using Network-Regularized high-dimensional Cox-regression (Net-score). We compared its discriminatory power with those of two previously used statistical methods: stepwise variable selection via univariate Cox regression (Uni-score) and Cox regression via Elastic net (Enet-score). The Net scoring system showed better discriminatory power in prediction of disease-specific survival (DSS) than other statistical methods (p=0 in METABRIC training cohort, p=0.000331, 4.58e-06 in two METABRIC validation cohorts) when accuracy was examined by log-rank test. Notably, comparison of C-index and AUC values in receiver operating characteristic analysis at 5 years showed fewer differences between training and validation cohorts with the Net scoring system than other statistical methods, suggesting minimal overfitting. The Net-based scoring system also successfully predicted prognosis in various independent GEO cohorts with high discriminatory power. In conclusion, the Net-based scoring system showed better discriminative power than previous statistical methods in prognostic prediction for breast cancer patients. This new system will mark a new era in prognosis prediction for cancer patients.
Theoretical approaches to the steady-state statistical physics of interacting dissipative units
NASA Astrophysics Data System (ADS)
Bertin, Eric
2017-02-01
The aim of this review is to provide a concise overview of some of the generic approaches that have been developed to deal with the statistical description of large systems of interacting dissipative ‘units’. The latter notion includes, e.g. inelastic grains, active or self-propelled particles, bubbles in a foam, low-dimensional dynamical systems like driven oscillators, or even spatially extended modes like Fourier modes of the velocity field in a fluid. We first review methods based on the statistical properties of a single unit, starting with elementary mean-field approximations, either static or dynamic, that describe a unit embedded in a ‘self-consistent’ environment. We then discuss how this basic mean-field approach can be extended to account for spatial dependences, in the form of space-dependent mean-field Fokker-Planck equations, for example. We also briefly review the use of kinetic theory in the framework of the Boltzmann equation, which is an appropriate description for dilute systems. We then turn to descriptions in terms of the full N-body distribution, starting from exact solutions of one-dimensional models, using a matrix-product ansatz method when correlations are present. Since exactly solvable models are scarce, we also present some approximation methods which can be used to determine the N-body distribution in a large system of dissipative units. These methods include the Edwards approach for dense granular matter and the approximate treatment of multiparticle Langevin equations with colored noise, which models systems of self-propelled particles. Throughout this review, emphasis is put on methodological aspects of the statistical modeling and on formal similarities between different physical problems, rather than on the specific behavior of a given system.
Jackson, Dan; Bowden, Jack
2016-09-07
Confidence intervals for the between study variance are useful in random-effects meta-analyses because they quantify the uncertainty in the corresponding point estimates. Methods for calculating these confidence intervals have been developed that are based on inverting hypothesis tests using generalised heterogeneity statistics. Whilst, under the random effects model, these new methods furnish confidence intervals with the correct coverage, the resulting intervals are usually very wide, making them uninformative. We discuss a simple strategy for obtaining 95 % confidence intervals for the between-study variance with a markedly reduced width, whilst retaining the nominal coverage probability. Specifically, we consider the possibility of using methods based on generalised heterogeneity statistics with unequal tail probabilities, where the tail probability used to compute the upper bound is greater than 2.5 %. This idea is assessed using four real examples and a variety of simulation studies. Supporting analytical results are also obtained. Our results provide evidence that using unequal tail probabilities can result in shorter 95 % confidence intervals for the between-study variance. We also show some further results for a real example that illustrates how shorter confidence intervals for the between-study variance can be useful when performing sensitivity analyses for the average effect, which is usually the parameter of primary interest. We conclude that using unequal tail probabilities when computing 95 % confidence intervals for the between-study variance, when using methods based on generalised heterogeneity statistics, can result in shorter confidence intervals. We suggest that those who find the case for using unequal tail probabilities convincing should use the '1-4 % split', where greater tail probability is allocated to the upper confidence bound. The 'width-optimal' interval that we present deserves further investigation.
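The unequal-tail idea can be written out for the Q-profile statistic, one member of the family of generalised heterogeneity statistics. This is a sketch under assumptions: the symbols $y_i$ (study estimates), $\sigma_i^2$ (within-study variances), $k$ (number of studies), $\alpha_L$ and $\alpha_U$ are introduced here for illustration and do not appear in the abstract.

```latex
\[
Q(\tau^2) = \sum_{i=1}^{k} \frac{\bigl(y_i - \hat{\mu}(\tau^2)\bigr)^2}{\sigma_i^2 + \tau^2},
\qquad
\hat{\mu}(\tau^2) = \frac{\sum_{i=1}^{k} y_i / (\sigma_i^2 + \tau^2)}{\sum_{i=1}^{k} 1 / (\sigma_i^2 + \tau^2)}
\]
\[
Q\bigl(\hat{\tau}^2_{L}\bigr) = \chi^2_{k-1;\,1-\alpha_L},
\qquad
Q\bigl(\hat{\tau}^2_{U}\bigr) = \chi^2_{k-1;\,\alpha_U},
\qquad
\alpha_L + \alpha_U = 0.05
\]
```

Here $\chi^2_{k-1;\,p}$ denotes the $p$-quantile of the chi-squared distribution with $k-1$ degrees of freedom, and a bound is set to zero when its equation has no non-negative solution. The conventional interval takes $\alpha_L = \alpha_U = 0.025$; the '1-4 % split' corresponds to $\alpha_L = 0.01$ and $\alpha_U = 0.04$. Because $Q(\tau^2)$ decreases in $\tau^2$, a larger $\alpha_U$ moves the upper bound down, which is where most of the reduction in width reported in the abstract would come from.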
Kim, Yun Hak; Jeong, Dae Cheon; Pak, Kyoungjune; Goh, Tae Sik; Lee, Chi-Seung; Han, Myoung-Eun; Kim, Ji-Young; Liangwen, Liu; Kim, Chi Dae; Jang, Jeon Yeob; Cha, Wonjae; Oh, Sae-Ock
2017-01-01
Accurate prediction of prognosis is critical for therapeutic decisions regarding cancer patients. Many previously developed prognostic scoring systems have limitations in reflecting recent progress in the field of cancer biology such as microarray, next-generation sequencing, and signaling pathways. To develop a new prognostic scoring system for cancer patients, we used mRNA expression and clinical data in various independent breast cancer cohorts (n=1214) from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) and Gene Expression Omnibus (GEO). A new prognostic score that reflects gene network inherent in genomic big data was calculated using Network-Regularized high-dimensional Cox-regression (Net-score). We compared its discriminatory power with those of two previously used statistical methods: stepwise variable selection via univariate Cox regression (Uni-score) and Cox regression via Elastic net (Enet-score). The Net scoring system showed better discriminatory power in prediction of disease-specific survival (DSS) than other statistical methods (p=0 in METABRIC training cohort, p=0.000331, 4.58e-06 in two METABRIC validation cohorts) when accuracy was examined by log-rank test. Notably, comparison of C-index and AUC values in receiver operating characteristic analysis at 5 years showed fewer differences between training and validation cohorts with the Net scoring system than other statistical methods, suggesting minimal overfitting. The Net-based scoring system also successfully predicted prognosis in various independent GEO cohorts with high discriminatory power. In conclusion, the Net-based scoring system showed better discriminative power than previous statistical methods in prognostic prediction for breast cancer patients. This new system will mark a new era in prognosis prediction for cancer patients. PMID:29100405
Bej, A K; McCarty, S C; Atlas, R M
1991-01-01
Multiplex polymerase chain reaction (PCR) and gene probe detection of target lacZ and uidA genes were used to detect total coliform bacteria and Escherichia coli, respectively, for determining water quality. In tests of environmental water samples, the lacZ PCR method gave results statistically equivalent to those of the plate count and defined substrate methods accepted by the U.S. Environmental Protection Agency for water quality monitoring and the uidA PCR method was more sensitive than 4-methylumbelliferyl-beta-D-glucuronide-based defined substrate tests for specific detection of E. coli. PMID:1768116
Allele-specific copy-number discovery from whole-genome and whole-exome sequencing.
Wang, WeiBo; Wang, Wei; Sun, Wei; Crowley, James J; Szatkiewicz, Jin P
2015-08-18
Copy-number variants (CNVs) are a major form of genetic variation and a risk factor for various human diseases, so it is crucial to accurately detect and characterize them. It is conceivable that allele-specific reads from high-throughput sequencing data could be leveraged to both enhance CNV detection and produce allele-specific copy number (ASCN) calls. Although statistical methods have been developed to detect CNVs using whole-genome sequence (WGS) and/or whole-exome sequence (WES) data, information from allele-specific read counts has not yet been adequately exploited. In this paper, we develop an integrated method, called AS-GENSENG, which incorporates allele-specific read counts in CNV detection and estimates ASCN using either WGS or WES data. To evaluate the performance of AS-GENSENG, we conducted extensive simulations, generated empirical data using existing WGS and WES data sets and validated predicted CNVs using an independent methodology. We conclude that AS-GENSENG not only predicts accurate ASCN calls but also improves the accuracy of total copy number calls, owing to its unique ability to exploit information from both total and allele-specific read counts while accounting for various experimental biases in sequence data. Our novel, user-friendly and computationally efficient method and a complete analytic protocol is freely available at https://sourceforge.net/projects/asgenseng/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Jansma, J Martijn; de Zwart, Jacco A; van Gelderen, Peter; Duyn, Jeff H; Drevets, Wayne C; Furey, Maura L
2013-01-01
Technical developments in MRI have improved signal to noise, allowing use of analysis methods such as finite impulse response (FIR) analysis of rapid event-related functional MRI (er-fMRI). FIR is one of the most informative analysis methods as it determines onset and full shape of the hemodynamic response function (HRF) without any a priori assumptions. FIR is however vulnerable to multicollinearity, which is directly related to the distribution of stimuli over time. Efficiency can be optimized by simplifying a design, and restricting stimuli distribution to specific sequences, while more design flexibility necessarily reduces efficiency. However, the actual effect of efficiency on fMRI results has never been tested in vivo. Thus, it is currently difficult to make an informed choice between protocol flexibility and statistical efficiency. The main goal of this study was to assign concrete fMRI signal to noise values to the abstract scale of FIR statistical efficiency. Ten subjects repeated a perception task with five random and m-sequence based protocols, with varying but, according to literature, acceptable levels of multicollinearity. Results indicated substantial differences in signal standard deviation, while the level was a function of multicollinearity. Experiment protocols varied up to 55.4% in standard deviation. Results confirm that quality of fMRI in an FIR analysis can significantly and substantially vary with statistical efficiency. Our in vivo measurements can be used to aid in making an informed decision between freedom in protocol design and statistical efficiency. PMID:23473798
Backenroth, Daniel; He, Zihuai; Kiryluk, Krzysztof; Boeva, Valentina; Pethukova, Lynn; Khurana, Ekta; Christiano, Angela; Buxbaum, Joseph D; Ionita-Laza, Iuliana
2018-05-03
We describe a method based on a latent Dirichlet allocation model for predicting functional effects of noncoding genetic variants in a cell-type- and/or tissue-specific way (FUN-LDA). Using this unsupervised approach, we predict tissue-specific functional effects for every position in the human genome in 127 different tissues and cell types. We demonstrate the usefulness of our predictions by using several validation experiments. Using eQTL data from several sources, including the GTEx project, Geuvadis project, and TwinsUK cohort, we show that eQTLs in specific tissues tend to be most enriched among the predicted functional variants in relevant tissues in Roadmap. We further show how these integrated functional scores can be used for (1) deriving the most likely cell or tissue type causally implicated for a complex trait by using summary statistics from genome-wide association studies and (2) estimating a tissue-based correlation matrix of various complex traits. We found large enrichment of heritability in functional components of relevant tissues for various complex traits, and FUN-LDA yielded higher enrichment estimates than existing methods. Finally, using experimentally validated functional variants from the literature and variants possibly implicated in disease by previous studies, we rigorously compare FUN-LDA with state-of-the-art functional annotation methods and show that FUN-LDA has better prediction accuracy and higher resolution than these methods. In particular, our results suggest that tissue- and cell-type-specific functional prediction methods tend to have substantially better prediction accuracy than organism-level prediction methods. Scores for each position in the human genome and for each ENCODE and Roadmap tissue are available online (see Web Resources). Copyright © 2018 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Developing topic-specific search filters for PubMed with click-through data.
Li, J; Lu, Z
2013-01-01
Search filters have been developed and demonstrated for better information access to the immense and ever-growing body of publications in the biomedical domain. However, to date the number of filters remains quite limited because current filter development methods require significant human effort in manual document review and filter term selection. In this regard, we aim to investigate automatic methods for generating search filters. We present an automated method to develop topic-specific filters on the basis of users' search logs in PubMed. Specifically, for a given topic, we first detect its relevant user queries and then include their corresponding clicked articles to serve as the topic-relevant document set. Next, we statistically identify informative terms that best represent the topic-relevant document set using a background set composed of topic-irrelevant articles. Lastly, the selected representative terms are combined with Boolean operators and evaluated on benchmark datasets to derive the final filter with the best performance. We applied our method to develop filters for four clinical topics: nephrology, diabetes, pregnancy, and depression. For the nephrology filter, our method obtained performance comparable to the state of the art (sensitivity of 91.3%, specificity of 98.7%, precision of 94.6%, and accuracy of 97.2%). Similarly, high-performing results (over 90% in all measures) were obtained for the other three search filters. Based on PubMed click-through data, we successfully developed a high-performance method for generating topic-specific search filters that is significantly more efficient than existing manual methods. All data sets (topic-relevant and irrelevant document sets) used in this study and a demonstration system are publicly available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/downloads/CQ_filter/
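The four measures reported for the filters (sensitivity, specificity, precision, accuracy) can be illustrated with a small sketch. It assumes a filter is a plain Boolean OR over single-word terms and that a benchmark is a list of (text, is_relevant) pairs. The function names, the terms and the six toy documents are hypothetical and are not taken from the study.

```python
def matches(filter_terms, text):
    """True if any filter term occurs as a word in the text (Boolean OR)."""
    words = set(text.lower().split())
    return any(term in words for term in filter_terms)


def evaluate_filter(filter_terms, labeled_docs):
    """Score a filter against (text, is_relevant) pairs."""
    tp = fp = tn = fn = 0
    for text, is_relevant in labeled_docs:
        hit = matches(filter_terms, text)
        if hit and is_relevant:
            tp += 1
        elif hit:
            fp += 1
        elif is_relevant:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Avoid ZeroDivisionError when a class is absent from the benchmark
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical toy benchmark, for illustration only
docs = [
    ("chronic kidney disease progression", True),
    ("renal dialysis outcomes", True),
    ("glomerular filtration in diabetes", True),
    ("insulin therapy in diabetes", False),
    ("prenatal care guidelines", False),
    ("depression screening in primary care", False),
]
scores = evaluate_filter({"kidney", "renal"}, docs)
```

On this toy set the filter misses one relevant document, so sensitivity is 2/3 while specificity and precision are both 1.0.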
Evaluating attention in delirium: A comparison of bedside tests of attention.
Adamis, Dimitrios; Meagher, David; Murray, Orla; O'Neill, Donagh; O'Mahony, Edmond; Mulligan, Owen; McCarthy, Geraldine
2016-09-01
Impaired attention is a core diagnostic feature for delirium. The present study examined the discriminating properties for patients with delirium versus those with dementia and/or no neurocognitive disorder of four objective tests of attention: digit span, vigilance "A" test, serial 7s subtraction and months of the year backwards, together with a global clinical subjective rating of attention. This was a prospective study of older patients admitted consecutively to a general hospital. Participants were assessed using the Confusion Assessment Method, Delirium Rating Scale-98 Revised and Montreal Cognitive Assessment scales, and months of the year backwards. Pre-existing dementia was diagnosed according to the Diagnostic and Statistical Manual of Mental Disorders fourth edition criteria. The sample consisted of 200 participants (mean age 81.1 ± 6.5 years; 50% women; pre-existing cognitive impairment in 126 [63%]). A total of 34 (17%) were identified with delirium (Confusion Assessment Method +). The five approaches to assessing attention had statistically significant correlations (P < 0.05). Discriminant analysis showed that clinical subjective rating of attention in conjunction with the months of the year backwards had the best discriminatory ability to identify Confusion Assessment Method-defined delirium, and to discriminate patients with delirium from those with dementia and/or normal cognition. Both of these approaches had high sensitivity, but modest specificity. Objective tests are useful for prediction of non-delirium, but lack specificity for a delirium diagnosis. Global attentional deficits were more indicative of delirium than deficits of specific domains of attention. Geriatr Gerontol Int 2016; 16: 1028-1035. © 2015 The Authors. Geriatrics & Gerontology International published by Wiley Publishing Asia Pty Ltd on behalf of Japanese Geriatrics Society.
Research design and statistical methods in Pakistan Journal of Medical Sciences (PJMS).
Akhtar, Sohail; Shah, Syed Wadood Ali; Rafiq, M; Khan, Ajmal
2016-01-01
This article compares the study design and statistical methods used in the 2005, 2010 and 2015 volumes of Pakistan Journal of Medical Sciences (PJMS). Only original articles of PJMS were considered for the analysis. The articles were carefully reviewed for statistical methods and designs, and then recorded accordingly. The frequency of each statistical method and research design was estimated and compared with previous years. A total of 429 articles were evaluated (n=74 in 2005, n=179 in 2010, n=176 in 2015), of which 171 (40%) were cross-sectional and 116 (27%) were prospective study designs. A variety of statistical methods were found in the analysis. The most frequent methods include: descriptive statistics (n=315, 73.4%), chi-square/Fisher's exact tests (n=205, 47.8%) and Student's t-test (n=186, 43.4%). There was a significant increase in the use of statistical methods over the time period: t-test, chi-square/Fisher's exact test, logistic regression, epidemiological statistics, and non-parametric tests. This study shows that a diverse variety of statistical methods have been used in the research articles of PJMS and that their frequency increased from 2005 to 2015. However, descriptive statistics was the most frequent method of statistical analysis in the published articles, while cross-sectional was the most common study design.
Impact of Requirements Quality on Project Success or Failure
NASA Astrophysics Data System (ADS)
Tamai, Tetsuo; Kamata, Mayumi Itakura
We are interested in the relationship between the quality of the requirements specifications for software projects and the subsequent outcome of the projects. To examine this relationship, we investigated 32 projects started and completed between 2003 and 2005 by the software development division of a large company in Tokyo. The company has collected reliable data on requirements specification quality, as evaluated by software quality assurance teams, and overall project performance data relating to cost and time overruns. The data for requirements specification quality were first converted into a multiple-dimensional space, with each dimension corresponding to an item of the recommended structure for software requirements specifications (SRS) defined in IEEE Std. 830-1998. We applied various statistical analysis methods to the SRS quality data and project outcomes.
Evaluation of peak-picking algorithms for protein mass spectrometry.
Bauer, Chris; Cramer, Rainer; Schuchhardt, Johannes
2011-01-01
Peak picking is an early key step in MS data analysis. We compare three commonly used approaches to peak picking and discuss their merits by means of statistical analysis. Methods investigated encompass signal-to-noise ratio, continuous wavelet transform, and a correlation-based approach using a Gaussian template. Functionality of the three methods is illustrated and discussed in a practical context using a mass spectral data set created with MALDI-TOF technology. Sensitivity and specificity are investigated using a manually defined reference set of peaks. As an additional criterion, the robustness of the three methods is assessed by a perturbation analysis and illustrated using ROC curves.
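Of the three approaches compared, signal-to-noise thresholding is the simplest to sketch. The example below is a minimal illustration, not the implementation evaluated in the paper. It assumes the baseline is the median intensity and the noise level is the median absolute deviation, and it keeps local maxima whose signal-to-noise ratio reaches a threshold. The function name, the threshold of 3.0 and the toy spectrum are all hypothetical.

```python
import statistics


def pick_peaks_snr(intensities, snr_threshold=3.0):
    """Return indices of local maxima whose signal-to-noise ratio meets the threshold."""
    baseline = statistics.median(intensities)
    # Median absolute deviation as a robust noise estimate (an assumption of this sketch)
    noise = statistics.median([abs(x - baseline) for x in intensities])
    if noise == 0:
        return []  # flat spectrum: no usable noise estimate
    peaks = []
    for i in range(1, len(intensities) - 1):
        is_local_max = (intensities[i] > intensities[i - 1]
                        and intensities[i] >= intensities[i + 1])
        if is_local_max and (intensities[i] - baseline) / noise >= snr_threshold:
            peaks.append(i)
    return peaks


# Hypothetical toy spectrum with two clear peaks over a noisy baseline
spectrum = [1.0, 1.2, 0.9, 1.1, 8.0, 1.0, 0.8, 1.3, 1.1, 5.5, 1.2, 1.0]
peak_indices = pick_peaks_snr(spectrum)
```

Small local maxima in the baseline (indices 1 and 7 here) are rejected because their ratio to the noise estimate falls below the threshold.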
Mean-Reverting Portfolio With Budget Constraint
NASA Astrophysics Data System (ADS)
Zhao, Ziping; Palomar, Daniel P.
2018-05-01
This paper considers the mean-reverting portfolio design problem arising from statistical arbitrage in the financial markets. We first propose a general problem formulation aimed at finding a portfolio of underlying component assets by optimizing a mean-reversion criterion characterizing the mean-reversion strength, taking into consideration the variance of the portfolio and an investment budget constraint. Then several specific problems are considered based on the general formulation, and efficient algorithms are proposed. Numerical results on both synthetic and market data show that our proposed mean-reverting portfolio design methods can generate consistent profits and outperform the traditional design methods and the benchmark methods in the literature.
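The ingredients named in the abstract (a mean-reversion criterion, the portfolio variance and a budget constraint) can be made concrete with a toy calculation. The sketch below is not the paper's formulation or algorithm. It assumes lag-1 autocorrelation as a simple stand-in for mean-reversion strength (lower values suggest stronger reversion) and a budget expressed as the sum of absolute weights. All function names and data are hypothetical.

```python
def scale_to_budget(weights, budget):
    """Rescale weights so that the sum of their absolute values equals the budget."""
    gross = sum(abs(w) for w in weights)
    return [w * budget / gross for w in weights]


def portfolio_series(asset_series, weights):
    """Weighted sum of the asset series at each time step."""
    n_steps = len(asset_series[0])
    return [sum(w * s[t] for w, s in zip(weights, asset_series))
            for t in range(n_steps)]


def variance(x):
    """Population variance of a series."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)


def lag1_autocorrelation(x):
    """Lag-1 autocorrelation, used here as a rough proxy for mean reversion."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    denom = sum(c * c for c in centered)
    num = sum(centered[t] * centered[t + 1] for t in range(len(x) - 1))
    return num / denom


# Hypothetical data: one oscillating series and one trending series
oscillating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
trending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
assets = [oscillating, trending]

w_osc = scale_to_budget([1.0, 0.0], budget=2.0)
w_trend = scale_to_budget([0.0, 1.0], budget=2.0)
rho_osc = lag1_autocorrelation(portfolio_series(assets, w_osc))
rho_trend = lag1_autocorrelation(portfolio_series(assets, w_trend))
```

Under this proxy the oscillating portfolio scores about -0.83 and the trending one 0.5, so a design procedure that favors mean reversion would prefer the first set of weights.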
Statistical context shapes stimulus-specific adaptation in human auditory cortex.
Herrmann, Björn; Henry, Molly J; Fromboluti, Elisa Kim; McAuley, J Devin; Obleser, Jonas
2015-04-01
Stimulus-specific adaptation is the phenomenon whereby neural response magnitude decreases with repeated stimulation. Inconsistencies between recent nonhuman animal recordings and computational modeling suggest dynamic influences on stimulus-specific adaptation. The present human electroencephalography (EEG) study investigates the potential role of statistical context in dynamically modulating stimulus-specific adaptation by examining the auditory cortex-generated N1 and P2 components. As in previous studies of stimulus-specific adaptation, listeners were presented with oddball sequences in which the presentation of a repeated tone was infrequently interrupted by rare spectral changes taking on three different magnitudes. Critically, the statistical context varied with respect to the probability of small versus large spectral changes within oddball sequences (half of the time a small change was most probable; in the other half a large change was most probable). We observed larger N1 and P2 amplitudes (i.e., release from adaptation) for all spectral changes in the small-change compared with the large-change statistical context. The increase in response magnitude also held for responses to tones presented with high probability, indicating that statistical adaptation can overrule stimulus probability per se in its influence on neural responses. Computational modeling showed that the degree of coadaptation in auditory cortex changed depending on the statistical context, which in turn affected stimulus-specific adaptation. Thus the present data demonstrate that stimulus-specific adaptation in human auditory cortex critically depends on statistical context. Finally, the present results challenge the implicit assumption of stationarity of neural response magnitudes that governs the practice of isolating established deviant-detection responses such as the mismatch negativity. Copyright © 2015 the American Physiological Society.
Kazemian, Majid; Zhu, Qiyun; Halfon, Marc S.; Sinha, Saurabh
2011-01-01
Despite recent advances in experimental approaches for identifying transcriptional cis-regulatory modules (CRMs, ‘enhancers’), direct empirical discovery of CRMs for all genes in all cell types and environmental conditions is likely to remain an elusive goal. Effective methods for computational CRM discovery are thus a critically needed complement to empirical approaches. However, existing computational methods that search for clusters of putative binding sites are ineffective if the relevant TFs and/or their binding specificities are unknown. Here, we provide a significantly improved method for ‘motif-blind’ CRM discovery that does not depend on knowledge or accurate prediction of TF-binding motifs and is effective when limited knowledge of functional CRMs is available to ‘supervise’ the search. We propose a new statistical method, based on ‘Interpolated Markov Models’, for motif-blind, genome-wide CRM discovery. It captures the statistical profile of variable length words in known CRMs of a regulatory network and finds candidate CRMs that match this profile. The method also uses orthologs of the known CRMs from closely related genomes. We perform in silico evaluation of predicted CRMs by assessing whether their neighboring genes are enriched for the expected expression patterns. This assessment uses a novel statistical test that extends the widely used Hypergeometric test of gene set enrichment to account for variability in intergenic lengths. We find that the new CRM prediction method is superior to existing methods. Finally, we experimentally validate 12 new CRM predictions by examining their regulatory activity in vivo in Drosophila; 10 of the tested CRMs were found to be functional, while 6 of the top 7 predictions showed the expected activity patterns. We make our program available as downloadable source code, and as a plugin for a genome browser installed on our servers. PMID:21821659
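For reference, the standard hypergeometric enrichment test that the authors extend can be written in a few lines. The sketch below covers only the classical test and does not include the adjustment for variable intergenic lengths described above. The function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, successes, draws, observed):
    """P(X >= observed) for X ~ Hypergeometric(population, successes, draws).

    population: total number of genes
    successes:  genes annotated with the expression pattern of interest
    draws:      genes neighboring predicted CRMs
    observed:   neighboring genes that carry the annotation
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes, 5 annotated, 4 selected, 2 of them annotated
p_value = hypergeom_enrichment_pvalue(population=20, successes=5, draws=4, observed=2)
```

A small p-value indicates that the selected genes carry the annotation more often than random sampling without replacement would be expected to produce.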
Research design and statistical methods in Pakistan Journal of Medical Sciences (PJMS)
Akhtar, Sohail; Shah, Syed Wadood Ali; Rafiq, M.; Khan, Ajmal
2016-01-01
Objective: This article compares the study design and statistical methods used in the 2005, 2010 and 2015 volumes of Pakistan Journal of Medical Sciences (PJMS). Methods: Only original articles of PJMS were considered for the analysis. The articles were carefully reviewed for statistical methods and designs, and then recorded accordingly. The frequency of each statistical method and research design was estimated and compared with previous years. Results: A total of 429 articles were evaluated (n=74 in 2005, n=179 in 2010, n=176 in 2015), of which 171 (40%) were cross-sectional and 116 (27%) were prospective study designs. A variety of statistical methods were found in the analysis. The most frequent methods include: descriptive statistics (n=315, 73.4%), chi-square/Fisher’s exact tests (n=205, 47.8%) and Student’s t-test (n=186, 43.4%). There was a significant increase in the use of statistical methods over the time period: t-test, chi-square/Fisher’s exact test, logistic regression, epidemiological statistics, and non-parametric tests. Conclusion: This study shows that a diverse variety of statistical methods have been used in the research articles of PJMS and that their frequency increased from 2005 to 2015. However, descriptive statistics was the most frequent method of statistical analysis in the published articles, while cross-sectional was the most common study design. PMID:27022365