On detection and assessment of statistical significance of Genomic Islands
Chatterjee, Raghunath; Chaudhuri, Keya; Chaudhuri, Probal
2008-01-01
Background Many of the available methods for detecting Genomic Islands (GIs) in prokaryotic genomes use markers such as transposons, proximal tRNAs, flanking repeats etc., or they use other supervised techniques requiring training datasets. Most of these methods are primarily based on the biases in GC content or codon and amino acid usage of the islands. However, these methods either do not use any formal statistical test of significance or use statistical tests for which the critical values and the P-values are not adequately justified. We propose a method, which is unsupervised in nature and uses Monte-Carlo statistical tests based on randomly selected segments of a chromosome. Such tests are supported by precise statistical distribution theory, and consequently, the resulting P-values are quite reliable for making the decision. Results Our algorithm (named Design-Island, an acronym for Detection of Statistically Significant Genomic Island) runs in two phases. Some 'putative GIs' are identified in the first phase, and those are refined into smaller segments containing horizontally acquired genes in the refinement phase. This method is applied to the Salmonella typhi CT18 genome, leading to the discovery of several new pathogenicity, antibiotic resistance and metabolic islands that were missed by earlier methods. Many of these islands contain mobile genetic elements such as phage-mediated genes, transposons, integrases and IS elements, confirming their horizontal acquisition. Conclusion The proposed method is based on statistical tests supported by precise distribution theory and reliable P-values, along with a technique for visualizing statistically significant islands. Our method outperforms many other well-known methods in terms of sensitivity and accuracy, and is comparable to them in terms of specificity. PMID:18380895
ERIC Educational Resources Information Center
Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza
2014-01-01
This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…
Statistical Significance Testing.
ERIC Educational Resources Information Center
McLean, James E., Ed.; Kaufman, Alan S., Ed.
1998-01-01
The controversy about the use or misuse of statistical significance testing has become the major methodological issue in educational research. This special issue contains three articles that explore the controversy, three commentaries on these articles, an overall response, and three rejoinders by the first three authors. They are: (1)…
Statistical or biological significance?
Saxon, Emma
2015-01-01
Oat plants grown at an agricultural research facility produce higher yields in Field 1 than in Field 2, under well fertilised conditions and with similar weather exposure; all oat plants in both fields are healthy and show no sign of disease. In this study, the authors hypothesised that the soil microbial community might be different in each field, and these differences might explain the difference in oat plant growth. They carried out a metagenomic analysis of the 16S ribosomal 'signature' sequences from bacteria in 50 randomly located soil samples in each field to determine the composition of the bacterial community. The study identified >1000 species, most of which were present in both fields. The authors identified two plant growth-promoting species that were significantly reduced in soil from Field 2 (Student's t-test P < 0.05), and concluded that these species might have contributed to reduced yield. PMID:26541972
Statistically significant relational data mining :
Berry, Jonathan W.; Leung, Vitus Joseph; Phillips, Cynthia Ann; Pinar, Ali; Robinson, David Gerald; Berger-Wolf, Tanya; Bhowmick, Sanjukta; Casleton, Emily; Kaiser, Mark; Nordman, Daniel J.; Wilson, Alyson G.
2014-02-01
This report summarizes the work performed under the project "Statistically significant relational data mining." The goal of the project was to add more statistical rigor to the fairly ad hoc area of data mining on graphs. Our goal was to develop better algorithms and better ways to evaluate algorithm quality. We concentrated on algorithms for community detection, approximate pattern matching, and graph similarity measures. Approximate pattern matching involves finding an instance of a relatively small pattern, expressed with tolerance, in a large graph of data observed with uncertainty. This report gathers the abstracts and references for the eight refereed publications that have appeared as part of this work. We then archive three pieces of research that have not yet been published. The first is theoretical and experimental evidence that a popular statistical measure for comparison of community assignments favors over-resolved communities over approximations to a ground truth. The second is a set of statistically motivated methods for measuring the quality of an approximate match of a small pattern in a large graph. The third is a new probabilistic random graph model. Statisticians favor such models for graph analysis. The new local structure graph model overcomes some of the issues with popular models such as exponential random graph models and latent variable models.
NASA Astrophysics Data System (ADS)
Baluev, Roman V.
2013-11-01
We consider the `multifrequency' periodogram, in which the putative signal is modelled as a sum of two or more sinusoidal harmonics with independent frequencies. It is useful in cases when the data may contain several periodic components, especially when their interaction with each other and with the data sampling patterns might produce misleading results. Although the multifrequency statistic itself was constructed earlier, for example by G. Foster in his CLEANest algorithm, its probabilistic properties (the detection significance levels) are still poorly known and much of what is deemed known is not rigorous. These detection levels are nonetheless important for data analysis. We argue that to prove the simultaneous existence of all n components revealed in a multiperiodic variation, it is mandatory to apply at least 2n - 1 significance tests, among which most involve various multifrequency statistics, and only n tests are single-frequency ones. The main result of this paper is an analytic estimation of the statistical significance of the frequency tuples that the multifrequency periodogram can reveal. Using the theory of extreme values of random fields (the generalized Rice method), we find a useful approximation to the relevant false alarm probability. For the double-frequency periodogram, this approximation is given by the elementary formula (π/16)W2e- zz2, where W denotes the normalized width of the settled frequency range, and z is the observed periodogram maximum. We carried out intensive Monte Carlo simulations to show that the practical quality of this approximation is satisfactory. A similar analytic expression for the general multifrequency periodogram is also given, although with less numerical verification.
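The closed-form approximation quoted above is simple enough to evaluate directly. A minimal sketch; the function name and the example values of z and W are mine, not the paper's:

```python
import math

def double_freq_fap(z, W):
    """Approximate false alarm probability for the double-frequency
    periodogram: FAP ~ (pi/16) * W**2 * exp(-z) * z**2, where z is the
    observed periodogram maximum and W is the normalized width of the
    settled frequency range."""
    return min((math.pi / 16.0) * W ** 2 * math.exp(-z) * z ** 2, 1.0)

# A peak of z = 20 over a normalized width W = 100 (illustrative values)
print(double_freq_fap(20.0, 100.0))  # a FAP of roughly 1.6e-3
```

Because the estimate decays as e^(-z), modest increases in the observed peak height rapidly drive the false alarm probability down even for wide frequency ranges.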
Statistical significance of the gallium anomaly
Giunti, Carlo; Laveder, Marco
2011-06-15
We calculate the statistical significance of the anomalous deficit of electron neutrinos measured in the radioactive source experiments of the GALLEX and SAGE solar neutrino detectors, taking into account the uncertainty of the detection cross section. We found that the statistical significance of the anomaly is ~3.0σ. A fit of the data in terms of neutrino oscillations favors at ~2.7σ short-baseline electron neutrino disappearance with respect to the null hypothesis of no oscillations.
Social significance of community structure: Statistical view
NASA Astrophysics Data System (ADS)
Li, Hui-Jia; Daniels, Jasmine J.
2015-01-01
Community structure analysis is a powerful tool for social networks that can simplify their topological and functional analysis considerably. However, since community detection methods have random factors and real social networks obtained from complex systems always contain error edges, evaluating the significance of a partitioned community structure is an urgent and important question. In this paper, integrating the specific characteristics of real society, we present a framework to analyze the significance of a social community. The dynamics of social interactions are modeled by identifying social leaders and corresponding hierarchical structures. Instead of a direct comparison with the average outcome of a random model, we compute the similarity of a given node with the leader by the number of common neighbors. To determine the membership vector, an efficient community detection algorithm is proposed based on the position of the nodes and their corresponding leaders. Then, using a log-likelihood score, the tightness of the community can be derived. Based on the distribution of community tightness, we establish a connection between p-value theory and network analysis, and then we obtain a significance measure of statistical form. Finally, the framework is applied to both benchmark networks and real social networks. Experimental results show that our work can be used in many fields, such as determining the optimal number of communities, analyzing the social significance of a given community, comparing the performance among various algorithms, etc.
The insignificance of statistical significance testing
Johnson, Douglas H.
1999-01-01
Despite their use in scientific journals such as The Journal of Wildlife Management, statistical hypothesis tests add very little value to the products of research. Indeed, they frequently confuse the interpretation of data. This paper describes how statistical hypothesis tests are often viewed, and then contrasts that interpretation with the correct one. I discuss the arbitrariness of P-values, conclusions that the null hypothesis is true, power analysis, and distinctions between statistical and biological significance. Statistical hypothesis testing, in which the null hypothesis about the properties of a population is almost always known a priori to be false, is contrasted with scientific hypothesis testing, which examines a credible null hypothesis about phenomena in nature. More meaningful alternatives are briefly outlined, including estimation and confidence intervals for determining the importance of factors, decision theory for guiding actions in the face of uncertainty, and Bayesian approaches to hypothesis testing and other statistical practices.
Statistical Significance vs. Practical Significance: An Exploration through Health Education
ERIC Educational Resources Information Center
Rosen, Brittany L.; DeMaria, Andrea L.
2012-01-01
The purpose of this paper is to examine the differences between statistical and practical significance, including strengths and criticisms of both methods, as well as provide information surrounding the application of various effect sizes and confidence intervals within health education research. Provided are recommendations, explanations and…
Understanding Statistical Significance: A Conceptual History.
ERIC Educational Resources Information Center
Little, Joseph
2001-01-01
Considers how if literacy is envisioned as a sort of competence in a set of social and intellectual practices, then scientific literacy must encompass the realization that "statistical significance," the cardinal arbiter of social scientific knowledge, was not born out of an immanent logic of mathematics but socially constructed and reconstructed…
Determining the Statistical Significance of Relative Weights
ERIC Educational Resources Information Center
Tonidandel, Scott; LeBreton, James M.; Johnson, Jeff W.
2009-01-01
Relative weight analysis is a procedure for estimating the relative importance of correlated predictors in a regression equation. Because the sampling distribution of relative weights is unknown, researchers using relative weight analysis are unable to make judgments regarding the statistical significance of the relative weights. J. W. Johnson…
Comments on the Statistical Significance Testing Articles.
ERIC Educational Resources Information Center
Knapp, Thomas R.
1998-01-01
Expresses a "middle-of-the-road" position on statistical significance testing, suggesting that it has its place but that confidence intervals are generally more useful. Identifies 10 errors of omission or commission in the papers reviewed that weaken the positions taken in their discussions. (SLD)
Statistical Significance of Clustering using Soft Thresholding
Huang, Hanwen; Liu, Yufeng; Yuan, Ming; Marron, J. S.
2015-01-01
Clustering methods have led to a number of important discoveries in bioinformatics and beyond. A major challenge in their use is determining which clusters represent important underlying structure, as opposed to spurious sampling artifacts. This challenge is especially serious, and very few methods are available, when the data are very high in dimension. Statistical Significance of Clustering (SigClust) is a recently developed cluster evaluation tool for high dimensional low sample size data. An important component of the SigClust approach is the very definition of a single cluster as a subset of data sampled from a multivariate Gaussian distribution. The implementation of SigClust requires the estimation of the eigenvalues of the covariance matrix for the null multivariate Gaussian distribution. We show that the original eigenvalue estimation can lead to a test that suffers from severe inflation of type-I error, in the important case where there are a few very large eigenvalues. This paper addresses this critical challenge using a novel likelihood based soft thresholding approach to estimate these eigenvalues, which leads to a much improved SigClust. Major improvements in SigClust performance are shown by both mathematical analysis, based on the new notion of Theoretical Cluster Index, and extensive simulation studies. Applications to some cancer genomic data further demonstrate the usefulness of these improvements. PMID:26755893
Statistical methodology for pathogen detection.
Ogliari, Paulo José; de Andrade, Dalton Francisco; Pacheco, Juliano Anderson; Franchin, Paulo Rogério; Batista, Cleide Rosana Vieira
2007-08-01
The main goal of the present study was to discuss the application of the McNemar test to the comparison of proportions in dependent samples. Data were analyzed from studies conducted to verify the suitability of replacing a conventional method with a new one for identifying the presence of Salmonella. It is shown that, in most situations, the McNemar test does not provide all the elements required by the microbiologist to make a final decision and that appropriate functions of the proportions need to be considered. Sample sizes suitable to guarantee a test with a high power in the detection of significant differences regarding the problem studied are obtained by simulation. Examples of functions that are of great value to the microbiologist are presented. PMID:17803152
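For context, the McNemar test discussed above compares paired proportions through the discordant cells of a 2×2 table. A stdlib sketch of the exact (binomial) version, with invented counts; the cited study's point is precisely that this test alone may not give the microbiologist everything needed for a replacement decision:

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact two-sided McNemar test for paired proportions.
    b and c are the discordant cells of the paired 2x2 table
    (e.g. samples positive by the conventional method only vs.
    positive by the new method only).  Under H0 the discordant
    counts follow Binomial(b + c, 1/2)."""
    n, k = b + c, min(b, c)
    p = 2 * sum(comb(n, i) for i in range(k + 1)) * 0.5 ** n
    return min(p, 1.0)  # the doubling can exceed 1 when b == c

# Hypothetical audit: 10 samples Salmonella-positive only by the
# conventional method, 2 positive only by the new method
print(mcnemar_exact(10, 2))  # p ≈ 0.039
```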
The Use of Meta-Analytic Statistical Significance Testing
ERIC Educational Resources Information Center
Polanin, Joshua R.; Pigott, Terri D.
2015-01-01
Meta-analysis multiplicity, the concept of conducting multiple tests of statistical significance within one review, is an underdeveloped literature. We address this issue by considering how Type I errors can impact meta-analytic results, suggest how statistical power may be affected through the use of multiplicity corrections, and propose how…
Testing the Difference of Correlated Agreement Coefficients for Statistical Significance
ERIC Educational Resources Information Center
Gwet, Kilem L.
2016-01-01
This article addresses the problem of testing the difference between two correlated agreement coefficients for statistical significance. A number of authors have proposed methods for testing the difference between two correlated kappa coefficients, which require either the use of resampling methods or the use of advanced statistical modeling…
Advances in Testing the Statistical Significance of Mediation Effects
ERIC Educational Resources Information Center
Mallinckrodt, Brent; Abraham, W. Todd; Wei, Meifen; Russell, Daniel W.
2006-01-01
P. A. Frazier, A. P. Tix, and K. E. Barron (2004) highlighted a normal theory method popularized by R. M. Baron and D. A. Kenny (1986) for testing the statistical significance of indirect effects (i.e., mediator variables) in multiple regression contexts. However, simulation studies suggest that this method lacks statistical power relative to some…
A Tutorial on Hunting Statistical Significance by Chasing N
Szucs, Denes
2016-01-01
There is increasing concern about the replicability of studies in psychology and cognitive neuroscience. Hidden data dredging (also called p-hacking) is a major contributor to this crisis because it substantially increases Type I error, resulting in a much larger proportion of false positive findings than the usually expected 5%. In order to build better intuition to avoid, detect and criticize some typical problems, here I systematically illustrate the large impact of some easy-to-implement, and therefore perhaps frequent, data dredging techniques on boosting false positive findings. I illustrate several forms of two special cases of data dredging. First, researchers may violate the data collection stopping rules of null hypothesis significance testing by repeatedly checking for statistical significance with various numbers of participants. Second, researchers may group participants post hoc along potential but unplanned independent grouping variables. The first approach ‘hacks’ the number of participants in studies; the second approach ‘hacks’ the number of variables in the analysis. I demonstrate the large number of false positive findings generated by these techniques with data from true null distributions. I also illustrate that it is extremely easy to introduce strong bias into data by very mild selection and re-testing. Similar, usually undocumented data dredging steps can easily lead to 20–50% or more false positives. PMID:27713723
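The first technique described above (peeking at the data after every batch of participants and stopping at the first significant result) is easy to simulate under a true null. A stdlib sketch, using a z cutoff of 1.96 as a stand-in for the exact t-test; all parameter values are illustrative:

```python
import random
import statistics

def peeking_false_positive_rate(n_experiments=2000, batch=10,
                                max_n=50, z_crit=1.96, seed=1):
    """Simulate experiments with NO true effect in which the researcher
    runs a test after every `batch` participants and stops at the first
    'significant' result.  Returns the fraction of experiments that ever
    reach significance; a single fixed-n test would give about 5%."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        data = []
        while len(data) < max_n:
            data.extend(rng.gauss(0.0, 1.0) for _ in range(batch))
            se = statistics.stdev(data) / len(data) ** 0.5
            if abs(statistics.mean(data) / se) > z_crit:  # a peek
                hits += 1
                break
    return hits / n_experiments

print(peeking_false_positive_rate())  # well above the nominal 0.05
```

Each peek gives the null another chance to cross the threshold, so the family-wise false positive rate grows with the number of interim checks.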
Shukla, R.; Yu Daohai; Fulk, F.
1995-12-31
Short-term toxicity tests with aquatic organisms are a valuable measurement tool in the assessment of the toxicity of effluents, environmental samples and single chemicals. Currently toxicity tests are utilized in a wide range of US EPA regulatory activities including effluent discharge compliance. In the current approach for determining the No Observed Effect Concentration, an effluent concentration is presumed safe if there is no statistically significant difference in toxicant response versus control response. The conclusion of a safe concentration may be due to the fact that it truly is safe, or alternatively, that the ability of the statistical test to detect an effect, given its existence, is inadequate. Results of research of a new statistical approach, the basis of which is to move away from a demonstration of no difference to a demonstration of equivalence, will be discussed. The concept of observed confidence distributions, first suggested by Cox, is proposed as a measure of the strength of evidence for practically equivalent responses between a given effluent concentration and the control. The research included determination of intervals of practically equivalent responses as a function of the variability of control response. The approach is illustrated using reproductive data from tests with Ceriodaphnia dubia and survival and growth data from tests with fathead minnow. The data are from the US EPA's National Reference Toxicant Database.
The questioned p value: clinical, practical and statistical significance.
Jiménez-Paneque, Rosa
2016-01-01
The use of the p-value and statistical significance has been questioned since the early 1980s. Much has been discussed about it in the field of statistics and its applications, especially in epidemiology and public health. As a matter of fact, the p-value and its equivalent, statistical significance, are difficult concepts to grasp for the many health professionals involved in some way in research applied to their work areas. However, its meaning should be clear in intuitive terms even though it is based on theoretical concepts from the field of statistics. This paper attempts to present the p-value as a concept that applies to everyday life and is therefore intuitively simple, but whose proper use cannot be separated from theoretical and methodological elements of inherent complexity. The reasons behind the criticism received by the p-value and its isolated use are explained intuitively, mainly the need to demarcate statistical significance from clinical significance, and some of the recommended remedies for these problems are discussed as well. It finally refers to the current trend to vindicate the p-value, appealing to the convenience of its use in certain situations, and to the recent statement of the American Statistical Association in this regard. PMID:27636600
Has Testing for Statistical Significance Outlived Its Usefulness?
ERIC Educational Resources Information Center
McLean, James E.; Ernest, James M.
The research methodology literature in recent years has included a full frontal assault on statistical significance testing. An entire edition of "Experimental Education" explored this controversy. The purpose of this paper is to promote the position that while significance testing by itself may be flawed, it has not outlived its usefulness.…
A Comparison of Statistical Significance Tests for Selecting Equating Functions
ERIC Educational Resources Information Center
Moses, Tim
2009-01-01
This study compared the accuracies of nine previously proposed statistical significance tests for selecting identity, linear, and equipercentile equating functions in an equivalent groups equating design. The strategies included likelihood ratio tests for the loglinear models of tests' frequency distributions, regression tests, Kolmogorov-Smirnov…
Statistical interpretation of “femtomolar” detection
Go, Jonghyun; Alam, Muhammad A.
2009-01-01
We calculate the statistics of diffusion-limited arrival-time distribution by a Monte Carlo method to suggest a simple statistical resolution of the enduring puzzle of nanobiosensors: a persistent gap between reports of analyte detection at approximately femtomolar concentration and theory suggesting the impossibility of approximately subpicomolar detection at the corresponding incubation time. The incubation time used in the theory is actually the mean incubation time, while experimental conditions suggest that device stability limited the minimum incubation time. The difference in incubation times—both described by characteristic power laws—provides an intuitive explanation of different detection limits anticipated by theory and experiments. PMID:19690630
Assigning statistical significance to proteotypic peptides via database searches
Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo
2011-01-01
Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating statistically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present in the proteotypic peptide libraries searched, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs). To address both issues simultaneously, we have extended RAId’s knowledge database to include proteotypic information, utilized RAId’s statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId’s programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those occurring within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless of whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches. PMID:21055489
Estimation of the geochemical threshold and its statistical significance
Miesch, A.T.
1981-01-01
A statistic is proposed for estimating the geochemical threshold and its statistical significance, or it may be used to identify a group of extreme values that can be tested for significance by other means. The statistic is the maximum gap between adjacent values in an ordered array after each gap has been adjusted for the expected frequency. The values in the ordered array are geochemical values transformed by either ln(x − α) or ln(β − x) and then standardized so that the mean is zero and the variance is unity. The expected frequency is taken from a fitted normal curve with unit area. The midpoint of an adjusted gap that exceeds the corresponding critical value may be taken as an estimate of the geochemical threshold, and the associated probability indicates the likelihood that the threshold separates two geochemical populations. The adjusted gap test may fail to identify threshold values if the variation tends to be continuous from background values to the higher values that reflect mineralized ground. However, the test will serve to identify other anomalies that may be too subtle to have been noted by other means.
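The adjusted-gap idea can be sketched as follows. The density weighting below is a simplification of the frequency adjustment described in the abstract, not the paper's exact statistic, and the example data are invented:

```python
import math
import statistics

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def estimate_threshold(values):
    """Return the midpoint of the largest density-adjusted gap.
    The values (assumed already log-transformed) are standardized;
    each gap between adjacent ordered values is weighted by the
    standard-normal density at its midpoint, so that gaps out in the
    sparse tails, which are expected to be wide anyway, are
    down-weighted relative to gaps in the body of the distribution."""
    xs = sorted(values)
    mu, sd = statistics.mean(xs), statistics.stdev(xs)
    zs = [(x - mu) / sd for x in xs]
    best = max(range(len(xs) - 1),
               key=lambda i: (zs[i + 1] - zs[i])
                             * normal_pdf((zs[i] + zs[i + 1]) / 2.0))
    return (xs[best] + xs[best + 1]) / 2.0  # threshold in data units

# Invented data: a tight background population plus two high outliers
background = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]
anomalous = [5.0, 5.2]
print(estimate_threshold(background + anomalous))  # 3.5
```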
Advances in Significance Testing for Cluster Detection
NASA Astrophysics Data System (ADS)
Coleman, Deidra Andrea
Over the past two decades, much attention has been given to data-driven project goals such as the Human Genome Project and the development of syndromic surveillance systems. A major component of these types of projects is analyzing the abundance of data. Detecting clusters within the data can be beneficial as it can lead to the identification of specified sequences of DNA nucleotides that are related to important biological functions or the locations of epidemics such as disease outbreaks or bioterrorism attacks. Cluster detection techniques require efficient and accurate hypothesis testing procedures. In this dissertation, we improve upon the hypothesis testing procedures for cluster detection by enhancing distributional theory and providing an alternative method for spatial cluster detection using syndromic surveillance data. In Chapter 2, we provide an efficient method to compute the exact distribution of the number and coverage of h-clumps of a collection of words. This method involves defining a Markov chain using a minimal deterministic automaton to reduce the number of states needed for computation. We allow words of the collection to contain other words of the collection, making the method more general. We use our method to compute the distributions of the number and coverage of h-clumps in the Chi motif of H. influenzae. In Chapter 3, we provide an efficient algorithm to compute the exact distribution of multiple window discrete scan statistics for higher-order, multi-state Markovian sequences. This algorithm involves defining a Markov chain to efficiently keep track of probabilities needed to compute p-values of the statistic. We use our algorithm to identify cases where the available approximation does not perform well. We also use our algorithm to detect unusual clusters of made free throw shots by National Basketball Association players during the 2009-2010 regular season. In Chapter 4, we give a procedure to detect outbreaks using syndromic
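The scan statistics of Chapter 3 can be illustrated in simplified form: a single window on an i.i.d. Bernoulli sequence, with the exact Markov-chain computation replaced by Monte Carlo. The free-throw numbers below are invented:

```python
import random

def scan_statistic(seq, w):
    """Largest number of successes in any window of w consecutive trials."""
    return max(sum(seq[i:i + w]) for i in range(len(seq) - w + 1))

def scan_pvalue(observed, n, w, p, n_sim=2000, seed=7):
    """Monte Carlo p-value for the scan statistic on an i.i.d.
    Bernoulli(p) sequence of length n: the fraction of simulated
    sequences whose scan statistic reaches the observed value."""
    rng = random.Random(seed)
    hits = sum(
        scan_statistic([1 if rng.random() < p else 0 for _ in range(n)],
                       w) >= observed
        for _ in range(n_sim))
    return hits / n_sim

# Is making 9 of some 10 consecutive free throws unusual for a 50%
# shooter over 100 attempts?  (Numbers invented for illustration.)
print(scan_pvalue(observed=9, n=100, w=10, p=0.5))
```

Because the windows overlap, the scan p-value is much larger than the binomial probability for any single pre-chosen window, which is exactly why careful distribution theory or simulation is needed.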
Statistical Fault Detection & Diagnosis Expert System
1996-12-18
STATMON is an expert system that performs real-time fault detection and diagnosis of redundant sensors in any industrial process requiring high reliability. After a training period performed during normal operation, the expert system monitors the statistical properties of the incoming signals using a pattern recognition test. If the test determines that statistical properties of the signals have changed, the expert system performs a sequence of logical steps to determine which sensor or machine component has degraded.
Statistical keyword detection in literary corpora
NASA Astrophysics Data System (ADS)
Herrera, J. P.; Pury, P. A.
2008-05-01
Understanding the complexity of human language requires an appropriate analysis of the statistical distribution of words in texts. We consider the information retrieval problem of detecting and ranking the relevant words of a text by means of statistical information referring to the spatial use of the words. Shannon's entropy of information is used as a tool for automatic keyword extraction. By using The Origin of Species by Charles Darwin as a representative text sample, we show the performance of our detector and compare it with other proposals in the literature. The randomly shuffled text receives special attention as a tool for calibrating the ranking indices.
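The entropy-based ranking can be sketched in simplified form: partition the text into equal chunks and score each word by the normalized Shannon entropy of its occurrence counts across chunks. This is a stand-in for the paper's indices, not a reimplementation, and the toy text is mine:

```python
import math

def word_entropies(text, n_parts=4):
    """Score each word by the normalized Shannon entropy of its counts
    across n_parts equal chunks of the text.  Evenly used words score
    near 1; words clustered in a few chunks (keyword candidates) score
    lower.  Words past the last full chunk are ignored for simplicity."""
    words = text.lower().split()
    size = max(1, len(words) // n_parts)
    chunks = [words[i * size:(i + 1) * size] for i in range(n_parts)]
    scores = {}
    for w in set(words):
        counts = [chunk.count(w) for chunk in chunks]
        total = sum(counts)
        if total == 0:
            continue  # word fell entirely in the ignored tail
        h = -sum((c / total) * math.log(c / total) for c in counts if c)
        scores[w] = h / math.log(n_parts)  # normalized to [0, 1]
    return scores

text = ("the cat sat on mat the dog ran the hill "
        "finch beak finch finch beak the beak finch nest beak")
s = word_entropies(text)
print(s['the'], s['finch'])  # 'the' is spread out; 'finch' is clustered
```

On real corpora the scores would be compared against a shuffled version of the same text, as the abstract describes, to calibrate what counts as unusually low entropy.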
Sibling Competition & Growth Tradeoffs. Biological vs. Statistical Significance
Kramer, Karen L.; Veile, Amanda; Otárola-Castillo, Erik
2016-01-01
Early childhood growth has many downstream effects on future health and reproduction and is an important measure of offspring quality. While a tradeoff between family size and child growth outcomes is theoretically predicted in high-fertility societies, empirical evidence is mixed. This is often attributed to phenotypic variation in parental condition. However, inconsistent study results may also arise because family size confounds the potentially differential effects that older and younger siblings can have on young children’s growth. Additionally, inconsistent results might reflect that the biological significance associated with different growth trajectories is poorly understood. This paper addresses these concerns by tracking children’s monthly gains in height and weight from weaning to age five in a high fertility Maya community. We predict that: 1) as an aggregate measure family size will not have a major impact on child growth during the post weaning period; 2) competition from young siblings will negatively impact child growth during the post weaning period; 3) however because of their economic value, older siblings will have a negligible effect on young children’s growth. Accounting for parental condition, we use linear mixed models to evaluate the effects that family size, younger and older siblings have on children’s growth. Congruent with our expectations, it is younger siblings who have the most detrimental effect on children’s growth. While we find statistical evidence of a quantity/quality tradeoff effect, the biological significance of these results is negligible in early childhood. Our findings help to resolve why quantity/quality studies have had inconsistent results by showing that sibling competition varies with sibling age composition, not just family size, and that biological significance is distinct from statistical significance. PMID:26938742
Statistical modeling approach for detecting generalized synchronization
NASA Astrophysics Data System (ADS)
Schumacher, Johannes; Haslinger, Robert; Pipa, Gordon
2012-05-01
Detecting nonlinear correlations between time series presents a hard problem for data analysis. We present a generative statistical modeling method for detecting nonlinear generalized synchronization. Truncated Volterra series are used to approximate functional interactions. The Volterra kernels are modeled as linear combinations of basis splines, whose coefficients are estimated via l1 and l2 regularized maximum likelihood regression. The regularization manages the high number of kernel coefficients and allows feature selection strategies yielding sparse models. The method's performance is evaluated on different coupled chaotic systems in various synchronization regimes and analytical results for detecting m:n phase synchrony are presented. Experimental applicability is demonstrated by detecting nonlinear interactions between neuronal local field potentials recorded in different parts of macaque visual cortex.
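The regularized Volterra-kernel regression described above can be sketched in a few lines. This is a minimal numpy toy, not the paper's method: it uses a second-order truncated Volterra feature expansion on lagged inputs and a closed-form l2 (ridge) fit, whereas the paper models kernels with basis splines and also uses an l1 penalty (which requires an iterative solver for sparsity). The synthetic coupling `y = 0.8*x + 0.4*x(t-1)^2` is invented for illustration.

```python
import numpy as np

def volterra_design(x, memory):
    """Truncated 2nd-order Volterra features: a constant, the lagged
    inputs x[t], ..., x[t-memory+1], and all their pairwise products."""
    n = len(x)
    rows = []
    for t in range(memory - 1, n):
        lags = x[t - memory + 1 : t + 1][::-1]   # x[t], x[t-1], ...
        quad = [lags[i] * lags[j] for i in range(memory) for j in range(i, memory)]
        rows.append(np.concatenate(([1.0], lags, quad)))
    return np.array(rows)

def fit_l2(X, y, lam=1e-6):
    """l2-regularized least squares in closed form; an l1 penalty would
    instead be solved iteratively to obtain sparse kernel estimates."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
# toy "generalized synchronization": y is a fixed nonlinear function of x
y_full = 0.8 * x + 0.4 * np.roll(x, 1) ** 2
memory = 3
X = volterra_design(x, memory)
y = y_full[memory - 1:]
coef = fit_l2(X, y)
r2 = 1.0 - np.sum((y - X @ coef) ** 2) / np.sum((y - y.mean()) ** 2)
```

Because the toy coupling lies exactly in the span of the second-order features, the fit recovers it almost perfectly; a high cross-fitted R^2 between two measured time series is the kind of evidence of nonlinear interdependence the method looks for.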
Fostering Students' Statistical Literacy through Significant Learning Experience
ERIC Educational Resources Information Center
Krishnan, Saras
2015-01-01
A major objective of statistics education is to develop students' statistical literacy that enables them to be educated users of data in context. Teaching statistics in today's educational settings is not an easy feat because teachers have a huge task in keeping up with the demands of the new generation of learners. The present day students have…
Chládek, J; Brázdil, M; Halámek, J; Plešinger, F; Jurák, P
2013-01-01
We present an off-line analysis procedure for exploring brain activity recorded from intra-cerebral electroencephalographic data (SEEG). The objective is to determine the statistical differences between different types of stimulations in the time-frequency domain. The procedure is based on computing relative signal power change and subsequent statistical analysis. An example of characteristic statistically significant event-related de/synchronization (ERD/ERS) detected across different frequency bands following different oddball stimuli is presented. The method is used for off-line functional classification of different brain areas. PMID:24109865
Statistical controversies in clinical research: statistical significance-too much of a good thing ….
Buyse, M; Hurvitz, S A; Andre, F; Jiang, Z; Burris, H A; Toi, M; Eiermann, W; Lindsay, M-A; Slamon, D
2016-05-01
The use and interpretation of P values is a matter of debate in applied research. We argue that P values are useful as a pragmatic guide to interpret the results of a clinical trial, not as a strict binary boundary that separates real treatment effects from lack thereof. We illustrate our point using the result of BOLERO-1, a randomized, double-blind trial evaluating the efficacy and safety of adding everolimus to trastuzumab and paclitaxel as first-line therapy for HER2+ advanced breast cancer. In this trial, the benefit of everolimus was seen only in the predefined subset of patients with hormone receptor-negative breast cancer at baseline (progression-free survival hazard ratio = 0.66, P = 0.0049). A strict interpretation of this finding, based on complex 'alpha splitting' rules to assess statistical significance, led to the conclusion that the benefit of everolimus was not statistically significant either overall or in the subset. We contend that this interpretation does not do justice to the data, and we argue that the benefit of everolimus in hormone receptor-negative breast cancer is both statistically compelling and clinically relevant. PMID:26861602
Statistical fingerprinting for malware detection and classification
Prowell, Stacy J.; Rathgeb, Christopher T.
2015-09-15
A system detects malware in a computing architecture with an unknown pedigree. The system includes a first computing device having a known pedigree and operating free of malware. The first computing device executes a series of instrumented functions that, when executed, provide a statistical baseline representative of the time a known software application takes to run on a computing device of known pedigree. A second computing device executes a second series of instrumented functions that, when executed, provide an actual time representative of the time the known software application takes to run on the second computing device. The system detects malware when there is a difference in execution times between the first and the second computing devices.
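The timing-baseline idea above can be sketched as a simple deviation test. This is a hedged toy model, not the patented system: the run times, the 4-sigma decision rule, and the malware overhead are all invented for illustration.

```python
import numpy as np

def timing_baseline(times):
    """Statistical baseline from the clean, known-pedigree machine:
    mean and sample standard deviation of instrumented run times."""
    t = np.asarray(times, float)
    return t.mean(), t.std(ddof=1)

def flag_anomaly(baseline, observed, k=4.0):
    """Flag the unknown-pedigree machine if its mean run time deviates
    from the baseline mean by more than k baseline standard deviations
    (k is an illustrative choice, not from the patent)."""
    mu, sigma = baseline
    return bool(abs(np.mean(observed) - mu) > k * sigma)

rng = np.random.default_rng(1)
clean = rng.normal(100.0, 2.0, 50)   # ms per instrumented function, clean box
same  = rng.normal(100.0, 2.0, 50)   # second box, no malware
slow  = rng.normal(130.0, 2.0, 50)   # second box with hypothetical overhead
base = timing_baseline(clean)
```

Here `flag_anomaly(base, same)` stays quiet while `flag_anomaly(base, slow)` fires, mirroring the claim that malware reveals itself as an execution-time difference.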
Infants with Williams syndrome detect statistical regularities in continuous speech.
Cashon, Cara H; Ha, Oh-Ryeong; Graf Estes, Katharine; Saffran, Jenny R; Mervis, Carolyn B
2016-09-01
Williams syndrome (WS) is a rare genetic disorder associated with delays in language and cognitive development. The reasons for the language delay are unknown. Statistical learning is a domain-general mechanism recruited for early language acquisition. In the present study, we investigated whether infants with WS were able to detect the statistical structure in continuous speech. Eighteen 8- to 20-month-olds with WS were familiarized with 2 min of a continuous stream of synthesized nonsense words; the statistical structure of the speech was the only cue to word boundaries. They were tested on their ability to discriminate statistically-defined "words" and "part-words" (which crossed word boundaries) in the artificial language. Despite significant cognitive and language delays, infants with WS were able to detect the statistical regularities in the speech stream. These findings suggest that an inability to track the statistical properties of speech is unlikely to be the primary basis for the delays in the onset of language observed in infants with WS. These results provide the first evidence of statistical learning by infants with developmental delays. PMID:27299804
Statistical detection of systematic election irregularities
Klimek, Peter; Yegorov, Yuri; Hanel, Rudolf; Thurner, Stefan
2012-01-01
Democratic societies are built around the principle that elections should be free and fair and that each citizen’s vote should count equally. National elections can be regarded as large-scale social experiments, where people are grouped into usually large numbers of electoral districts and vote according to their preferences. The large number of samples implies statistical consequences for the polling results, which can be used to identify election irregularities. Using a suitable data representation, we find that vote distributions of elections with alleged fraud show a kurtosis substantially exceeding the kurtosis of normal elections, depending on the level of data aggregation. As an example, we show that reported irregularities in recent Russian elections are, indeed, well explained by systematic ballot stuffing. We develop a parametric model quantifying the extent to which fraudulent mechanisms are present. We formulate a parametric test detecting these statistical properties in election results. Remarkably, this technique produces robust outcomes with respect to the resolution of the data and therefore allows for cross-country comparisons. PMID:23010929
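The kurtosis fingerprint described above is easy to demonstrate on synthetic data. This is a toy illustration only; the district counts, vote shares, and stuffing fraction are invented, and the actual paper additionally studies how the signal depends on the level of data aggregation.

```python
import numpy as np

def excess_kurtosis(v):
    """Sample excess kurtosis: fourth standardized moment minus 3
    (zero for a normal distribution)."""
    v = np.asarray(v, float)
    z = (v - v.mean()) / v.std()
    return float(np.mean(z ** 4) - 3.0)

rng = np.random.default_rng(2)
# per-district winner vote shares: a "normal" election, and one where
# 10% of districts report near-100% stuffed results (heavy right tail)
normal  = rng.normal(0.55, 0.05, 10_000)
stuffed = np.concatenate([rng.normal(0.55, 0.05, 9_000),
                          rng.normal(0.98, 0.005, 1_000)])
```

The stuffed mixture produces a strongly positive excess kurtosis, while the clean election stays near zero, which is the qualitative signature the test exploits.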
Statistical downscaling rainfall using artificial neural network: significantly wetter Bangkok?
NASA Astrophysics Data System (ADS)
Vu, Minh Tue; Aribarg, Thannob; Supratid, Siriporn; Raghavan, Srivatsan V.; Liong, Shie-Yui
2015-08-01
Artificial neural network (ANN) is an established technique with a flexible mathematical structure that is capable of identifying complex nonlinear relationships between input and output data. The present study utilizes ANN as a method of statistically downscaling global climate models (GCMs) during the rainy season at meteorological site locations in Bangkok, Thailand. The study illustrates the application of a feed-forward back-propagation network using large-scale predictor variables derived from both ERA-Interim reanalysis data and present-day/future GCM data. The predictors are first selected over different grid boxes surrounding the Bangkok region and then screened using principal component analysis (PCA) to retain the best-correlated predictors for ANN training. The reanalysis-downscaled results for the present-day climate show good agreement with station precipitation, with a correlation coefficient of 0.8 and a Nash-Sutcliffe efficiency of 0.65. The final downscaled results for four GCMs show an increasing trend of precipitation for the rainy season over Bangkok by the end of the twenty-first century. The extreme values of precipitation determined using statistical indices show strong increases in wetness. These findings will be useful for policy makers in considering flood-adaptation measures, such as whether the current drainage network system is sufficient to meet the changing climate, and in planning a range of related adaptation/mitigation measures.
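The PCA screening step above can be sketched with an SVD on standardized predictors. This is a generic, hedged sketch: the 12 "grid-box predictors" and their rank-3 structure are synthetic stand-ins, and the study's actual predictor selection and ANN training are not reproduced here.

```python
import numpy as np

def pca_screen(X, n_components):
    """Standardize predictors, then project onto the leading principal
    components (via SVD) to obtain compact inputs for a downscaling model."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    scores = Xs @ Vt[:n_components].T         # PC scores per time step
    explained = (s ** 2) / np.sum(s ** 2)     # variance fraction per PC
    return scores, explained

rng = np.random.default_rng(3)
# 200 days x 12 correlated large-scale predictors (toy stand-ins for
# grid-box fields around the study region): rank-3 signal + weak noise
base = rng.standard_normal((200, 3))
X = base @ rng.standard_normal((3, 12)) + 0.1 * rng.standard_normal((200, 12))
scores, explained = pca_screen(X, 3)
```

Because the synthetic predictors share three underlying modes, the first three components capture almost all the variance, which is exactly the redundancy PCA screening removes before training.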
Statistically significant database of rock properties for geothermal use
NASA Astrophysics Data System (ADS)
Koch, A.; Jorand, R.; Clauser, C.
2009-04-01
The high risk of failure due to the unknown properties of the target rocks at depth is a major obstacle for the exploration of geothermal energy. In general, the ranges of thermal and hydraulic properties given in compilations of rock properties are too large to be useful to constrain properties at a specific site. To overcome this problem, we study the thermal and hydraulic rock properties of the main rock types in Germany in a statistical approach. An important aspect is the use of data from exploration wells that are largely untapped for the purpose of geothermal exploration. In the current project stage, we have been analyzing mostly Devonian and Carboniferous drill cores from 20 deep boreholes in the region of the Lower Rhine Embayment and the Ruhr area (western North Rhine Westphalia). In total, we selected 230 core samples with a length of up to 30 cm from the core archive of the State Geological Survey. The use of core scanning technology allowed the rapid measurement of thermal conductivity, sonic velocity, and gamma density under dry and water saturated conditions with high resolution for a large number of samples. In addition, we measured porosity, bulk density, and matrix density based on Archimedes' principle and pycnometer analysis. As first results we present arithmetic means, medians and standard deviations characterizing the petrophysical properties and their variability for specific lithostratigraphic units. Bi- and multimodal frequency distributions correspond to the occurrence of different lithologies such as shale, limestone, dolomite, sandstone, siltstone, marlstone, and quartz-schist. In a next step, the data set will be combined with logging data and complementary mineralogical analyses to derive the variation of thermal conductivity with depth. As a final result, this may be used to infer thermal conductivity for boreholes without appropriate core data which were drilled in similar geological settings.
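The per-lithology summary statistics described above (arithmetic means, medians, standard deviations) amount to a small grouped computation. The values below are illustrative placeholders, not measurements from the core archive.

```python
import numpy as np

# toy thermal-conductivity readings (W m^-1 K^-1) grouped by lithology;
# numbers are invented for illustration
samples = {
    "sandstone": [2.5, 3.1, 2.8, 3.4, 2.9],
    "shale":     [1.1, 1.4, 1.2, 1.6, 1.3],
    "limestone": [2.0, 2.4, 2.2, 2.6, 2.1],
}

def summarize(groups):
    """Arithmetic mean, median, and sample standard deviation per
    lithostratigraphic unit."""
    return {name: {"mean": float(np.mean(v)),
                   "median": float(np.median(v)),
                   "std": float(np.std(v, ddof=1))}
            for name, v in groups.items()}

stats = summarize(samples)
```

Multimodal units (e.g. interbedded shale and limestone) would show up here as standard deviations large relative to the group mean, motivating the bi-/multimodal frequency analysis the abstract mentions.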
Assessing statistical significance in multivariable genome wide association analysis
Buzdugan, Laura; Kalisch, Markus; Navarro, Arcadi; Schunk, Daniel; Fehr, Ernst; Bühlmann, Peter
2016-01-01
Motivation: Although Genome Wide Association Studies (GWAS) genotype a very large number of single nucleotide polymorphisms (SNPs), the data are often analyzed one SNP at a time. The low predictive power of single SNPs, coupled with the high significance threshold needed to correct for multiple testing, greatly decreases the power of GWAS. Results: We propose a procedure in which all the SNPs are analyzed in a multiple generalized linear model, and we show its use for extremely high-dimensional datasets. Our method yields P-values for assessing significance of single SNPs or groups of SNPs while controlling for all other SNPs and the family wise error rate (FWER). Thus, our method tests whether or not a SNP carries any additional information about the phenotype beyond that available by all the other SNPs. This rules out spurious correlations between phenotypes and SNPs that can arise from marginal methods because the ‘spuriously correlated’ SNP merely happens to be correlated with the ‘truly causal’ SNP. In addition, the method offers a data driven approach to identifying and refining groups of SNPs that jointly contain informative signals about the phenotype. We demonstrate the value of our method by applying it to the seven diseases analyzed by the Wellcome Trust Case Control Consortium (WTCCC). We show, in particular, that our method is also capable of finding significant SNPs that were not identified in the original WTCCC study, but were replicated in other independent studies. Availability and implementation: Reproducibility of our research is supported by the open-source Bioconductor package hierGWAS. Contact: peter.buehlmann@stat.math.ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153677
Statistical limitations in functional neuroimaging. II. Signal detection and statistical inference.
Petersson, K M; Nichols, T E; Poline, J B; Holmes, A P
1999-01-01
The field of functional neuroimaging (FNI) methodology has developed into a mature but evolving area of knowledge and its applications have been extensive. A general problem in the analysis of FNI data is finding a signal embedded in noise. This is sometimes called signal detection. Signal detection theory focuses in general on issues relating to the optimization of conditions for separating the signal from noise. When methods from probability theory and mathematical statistics are directly applied in this procedure it is also called statistical inference. In this paper we briefly discuss some aspects of signal detection theory relevant to FNI and, in addition, some common approaches to statistical inference used in FNI. Low-pass filtering in relation to functional-anatomical variability and some effects of filtering on signal detection of interest to FNI are discussed. Also, some general aspects of hypothesis testing and statistical inference are discussed. This includes the need for characterizing the signal in data when the null hypothesis is rejected, the problem of multiple comparisons that is central to FNI data analysis, omnibus tests and some issues related to statistical power in the context of FNI. In turn, random field, scale space, non-parametric and Monte Carlo approaches are reviewed, representing the most common approaches to statistical inference used in FNI. Complementary to these issues an overview and discussion of non-inferential descriptive methods, common statistical models and the problem of model selection is given in a companion paper. In general, model selection is an important prelude to subsequent statistical inference. The emphasis in both papers is on the assumptions and inherent limitations of the methods presented. Most of the methods described here generally serve their purposes well when the inherent assumptions and limitations are taken into account. Significant differences in results between different methods are most apparent in
Timescales for detecting a significant acceleration in sea level rise.
Haigh, Ivan D; Wahl, Thomas; Rohling, Eelco J; Price, René M; Pattiaratchi, Charitha B; Calafat, Francisco M; Dangendorf, Sönke
2014-04-14
There is observational evidence that global sea level is rising and there is concern that the rate of rise will increase, significantly threatening coastal communities. However, considerable debate remains as to whether the rate of sea level rise is currently increasing and, if so, by how much. Here we provide new insights into sea level accelerations by applying the main methods that have been used previously to search for accelerations in historical data, to identify the timings (with uncertainties) at which accelerations might first be recognized in a statistically significant manner (if not apparent already) in sea level records that we have artificially extended to 2100. We find that the most important approach to earliest possible detection of a significant sea level acceleration lies in improved understanding (and subsequent removal) of interannual to multidecadal variability in sea level records. PMID:24728012
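A common way to search for an acceleration in a sea-level record, alluded to above, is to fit a quadratic and read off twice the second-order coefficient. This is a generic sketch on a synthetic record (trend, acceleration, and noise levels invented for illustration), not the paper's full analysis, which also handles the interannual-to-multidecadal variability that dominates detection timescales.

```python
import numpy as np

def acceleration(years, msl):
    """Fit msl = a + b*t + (c/2)*t^2 and return the acceleration c
    (mm/yr^2). Centering t improves the conditioning of the fit."""
    t = np.asarray(years, float) - np.mean(years)
    coefs = np.polyfit(t, msl, 2)   # [c/2, b, a], highest degree first
    return 2.0 * float(coefs[0])

rng = np.random.default_rng(4)
years = np.arange(1900, 2101)
t = years - years[0]
# synthetic record: 2 mm/yr trend + 0.01 mm/yr^2 acceleration + noise
msl = 2.0 * t + 0.5 * 0.01 * t ** 2 + rng.normal(0.0, 5.0, len(t))
acc = acceleration(years, msl)
```

With two centuries of data the planted 0.01 mm/yr^2 acceleration is recovered cleanly; on short or noisy records the same estimate becomes indistinguishable from zero, which is the detection-timescale problem the paper quantifies.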
Damage detection in mechanical structures using extreme value statistics.
Worden, K.; Allen, D. W.; Sohn, H.; Farrar, C. R.
2002-01-01
The first and most important objective of any damage identification algorithm is to ascertain with confidence whether damage is present or not. Many methods have been proposed for damage detection based on ideas of novelty detection founded in pattern recognition and multivariate statistics. The philosophy of novelty detection is simple. Features are first extracted from a baseline system to be monitored, and subsequent data are then compared to see if the new features are outliers that significantly depart from the rest of the population. In damage diagnosis problems, the assumption is that outliers are generated from a damaged condition of the monitored system. This damage classification necessitates the establishment of a decision boundary. Choosing this threshold value is often based on the assumption that the parent distribution of the data is Gaussian. While novelty detection focuses attention on the outlier or extreme values of the data, i.e., those points in the tails of the distribution, threshold selection under the normality assumption is weighted by the central population of the data. This normality assumption can therefore impose misleading behavior on damage classification and is likely to lead damage diagnosis astray. In this paper, extreme value statistics is integrated with novelty detection to specifically model the tails of the distribution of interest. Finally, the proposed technique is demonstrated on simulated numerical data and time series data measured from an eight degree-of-freedom spring-mass system.
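The tail-focused thresholding idea above can be illustrated with block maxima. This is a hedged toy sketch: instead of fitting a parametric extreme value distribution as the paper does, it simply takes a high quantile of the empirical block-maxima distribution of a baseline feature; the feature, block size, and quantile are all invented for illustration.

```python
import numpy as np

def ev_threshold(baseline_features, block=100, quantile=0.99):
    """Novelty threshold built from the tail: collect block maxima of
    the healthy-state feature and take a high quantile of their
    empirical distribution, rather than a Gaussian mean + k*sigma rule."""
    f = np.asarray(baseline_features, float)
    n = (len(f) // block) * block
    maxima = f[:n].reshape(-1, block).max(axis=1)
    return float(np.quantile(maxima, quantile))

rng = np.random.default_rng(5)
healthy = np.abs(rng.standard_normal(100_000))   # baseline damage feature
thr = ev_threshold(healthy)
damaged_feature = 8.0                            # hypothetical damaged reading
```

The threshold sits well above the bulk of the healthy population but is calibrated by its tail, so a genuinely extreme damaged reading is flagged without the central-mass bias of a normality-based rule.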
NASA Astrophysics Data System (ADS)
Williams, Arnold C.; Pachowicz, Peter W.
2004-09-01
Current mine detection research indicates that no single sensor or single look from a sensor will detect mines/minefields in a real-time manner at a performance level suitable for a forward maneuver unit. Hence, the integrated development of detectors and fusion algorithms is of primary importance. A problem in this development process has been the evaluation of these algorithms with relatively small data sets, leading to anecdotal and frequently overtrained results. These anecdotal results are often unreliable and conflicting among various sensors and algorithms. Consequently, the physical phenomena that ought to be exploited and the performance benefits of this exploitation are often ambiguous. The Army RDECOM CERDEC Night Vision and Electronic Sensors Directorate has collected large amounts of multisensor data such that statistically significant evaluations of detection and fusion algorithms can be obtained. Even with these large data sets, care must be taken in algorithm design and data processing to achieve statistically significant performance results for combined detectors and fusion algorithms. This paper discusses statistically significant detection and combined multilook fusion results for the Ellipse Detector (ED) and the Piecewise Level Fusion Algorithm (PLFA). These statistically significant performance results are characterized by ROC curves that have been obtained by processing this multilook data for the high-resolution SAR data of the Veridian X-Band radar. We discuss the implications of these results for mine detection and the importance of statistical significance, sample size, ground truth, and algorithm design in performance evaluation.
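The ROC curves used above to characterize detector performance are computed by sweeping a decision threshold over the detector scores. This is a generic sketch on synthetic scores (the score distributions for "mine" and "clutter" are invented), not the ED/PLFA pipeline itself.

```python
import numpy as np

def roc_curve(scores, labels):
    """Empirical ROC: sort by descending score and accumulate true and
    false positives, giving one (FPR, TPR) point per threshold."""
    scores = np.asarray(scores, float)
    labels = np.asarray(labels, bool)
    order = np.argsort(-scores)
    tp = np.cumsum(labels[order])
    fp = np.cumsum(~labels[order])
    return fp / (~labels).sum(), tp / labels.sum()

def auc(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

rng = np.random.default_rng(6)
mines   = rng.normal(2.0, 1.0, 500)   # detector scores on target chips
clutter = rng.normal(0.0, 1.0, 500)   # scores on clutter chips
scores = np.concatenate([mines, clutter])
labels = np.concatenate([np.ones(500, bool), np.zeros(500, bool)])
fpr, tpr = roc_curve(scores, labels)
```

With 500 samples per class the curve is reasonably smooth; the paper's point is that with small data sets the same curve would carry wide confidence bands, making reported gains anecdotal.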
Understanding the Sampling Distribution and Its Use in Testing Statistical Significance.
ERIC Educational Resources Information Center
Breunig, Nancy A.
Despite the increasing criticism of statistical significance testing by researchers, particularly in the publication of the 1994 American Psychological Association's style manual, statistical significance test results are still popular in journal articles. For this reason, it remains important to understand the logic of inferential statistics. A…
Dark census: Statistically detecting the satellite populations of distant galaxies
NASA Astrophysics Data System (ADS)
Cyr-Racine, Francis-Yan; Moustakas, Leonidas A.; Keeton, Charles R.; Sigurdson, Kris; Gilman, Daniel A.
2016-08-01
In the standard structure formation scenario based on the cold dark matter paradigm, galactic halos are predicted to contain a large population of dark matter subhalos. While the most massive members of the subhalo population can appear as luminous satellites and be detected in optical surveys, establishing the existence of the low mass and mostly dark subhalos has proven to be a daunting task. Galaxy-scale strong gravitational lenses have been successfully used to study mass substructures lying close to lensed images of bright background sources. However, in typical galaxy-scale lenses, the strong lensing region only covers a small projected area of the lens's dark matter halo, implying that the vast majority of subhalos cannot be directly detected in lensing observations. In this paper, we point out that this large population of dark satellites can collectively affect gravitational lensing observables, hence possibly allowing their statistical detection. Focusing on the region of the galactic halo outside the strong lensing area, we compute from first principles the statistical properties of perturbations to the gravitational time delay and position of lensed images in the presence of a mass substructure population. We find that in the standard cosmological scenario, the statistics of these lensing observables are well approximated by Gaussian distributions. The formalism developed as part of this calculation is very general and can be applied to any halo geometry and choice of subhalo mass function. Our results significantly reduce the computational cost of including a large substructure population in lens models and enable the use of Bayesian inference techniques to detect and characterize the distributed satellite population of distant lens galaxies.
Statistics and Machine Learning based Outlier Detection Techniques for Exoplanets
NASA Astrophysics Data System (ADS)
Goel, Amit; Montgomery, Michele
2015-08-01
Architectures of planetary systems are observable snapshots in time that can indicate formation and dynamic evolution of planets. The observable key parameters that we consider are planetary mass and orbital period. If planet masses are significantly less than their host star masses, then Keplerian motion gives P^2 = a^3, where P is the orbital period in years and a is the semi-major axis in Astronomical Units (AU). Keplerian motion holds on small scales such as the size of the Solar System but not on large scales such as the size of the Milky Way Galaxy. In this work, for confirmed exoplanets of known stellar mass, planetary mass, orbital period, and stellar age, we analyze Keplerian motion of systems based on stellar age to seek whether Keplerian motion has an age dependency and to identify outliers. For detecting outliers, we apply several techniques based on statistical and machine learning methods such as probabilistic, linear, and proximity-based models. In probabilistic and statistical models of outliers, the parameters of closed-form probability distributions are learned in order to detect the outliers. Linear models use regression-analysis-based techniques for detecting outliers. Proximity-based models use distance-based algorithms such as k-nearest neighbour, clustering algorithms such as k-means, or density-based algorithms such as kernel density estimation. In this work, we use unsupervised learning algorithms with only the proximity-based models. In addition, we explore the relative strengths and weaknesses of the various techniques by validating the outliers. The validation criterion for the outliers is whether the ratio of planetary mass to stellar mass is less than 0.001. In this work, we present our statistical analysis of the outliers thus detected.
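The distance-based proximity model mentioned above (k-nearest-neighbour outlier scoring) can be sketched directly. This is a generic toy on synthetic 2-D points (the "log period, log mass" cloud and the planted outlier are invented), not the study's actual catalogue analysis.

```python
import numpy as np

def knn_outlier_scores(X, k=5):
    """Proximity-based outlier score: distance to the k-th nearest
    neighbour (larger = more isolated = more likely an outlier)."""
    X = np.asarray(X, float)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d_sorted = np.sort(d, axis=1)   # column 0 is each point's self-distance, 0
    return d_sorted[:, k]

rng = np.random.default_rng(7)
# toy (log period, log mass) cloud with one planted far-away outlier
cloud = rng.normal(0.0, 1.0, (100, 2))
X = np.vstack([cloud, [[8.0, 8.0]]])
scores = knn_outlier_scores(X)
```

The planted point receives by far the largest score; in practice one would rank all systems by score and validate the top ranks against a physical criterion, as the abstract's mass-ratio check does.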
A decision surface-based taxonomy of detection statistics
NASA Astrophysics Data System (ADS)
Bouffard, François
2012-09-01
Current and past literature on the topic of detection statistics - in particular those used in hyperspectral target detection - can be intimidating for newcomers, especially given the huge number of detection tests described in the literature. Detection tests for hyperspectral measurements, such as those generated by dispersive or Fourier transform spectrometers used in remote sensing of atmospheric contaminants, are of paramount importance if any level of analysis automation is to be achieved. The detection statistics used in hyperspectral target detection are generally borrowed and adapted from other fields such as radar signal processing or acoustics. Consequently, although remarkable efforts have been made to clarify and categorize the vast number of available detection tests, understanding their differences, similarities, limits and other intricacies is still an exacting journey. Reasons for this state of affairs include heterogeneous nomenclature and mathematical notation, probably due to the multiple origins of hyperspectral target detection formalisms. Attempts at sorting out detection statistics using ambiguously defined properties may also cause more harm than good. Ultimately, a detection statistic is entirely characterized by its decision boundary. Thus, we propose to catalogue detection statistics according to the shape of their decision surfaces, which greatly simplifies this taxonomy exercise. We make a distinction between the topology resulting from the mathematical formulation of the statistic and mere parameters that adjust the boundary's precise shape, position and orientation. Using this simple approach, similarities between various common detection statistics are found, limit cases are reduced to simpler statistics, and a general understanding of the available detection tests and their properties becomes much easier to achieve.
Statistical Detection of Atypical Aircraft Flights
NASA Technical Reports Server (NTRS)
Statler, Irving; Chidester, Thomas; Shafto, Michael; Ferryman, Thomas; Amidan, Brett; Whitney, Paul; White, Amanda; Willse, Alan; Cooley, Scott; Jay, Joseph; Rosenthal, Loren; Swickard, Andrea; Bates, Derrick; Scherrer, Chad; Webb, Bobbie-Jo; Lawrence, Robert; Mosbrucker, Chris; Prothero, Gary; Andrei, Adi; Romanowski, Tim; Robin, Daniel; Prothero, Jason; Lynch, Robert; Lowe, Michael
2006-01-01
A computational method and software to implement the method have been developed to sift through vast quantities of digital flight data to alert human analysts to aircraft flights that are statistically atypical in ways that signify that safety may be adversely affected. On a typical day, there are tens of thousands of flights in the United States and several times that number throughout the world. Depending on the specific aircraft design, the volume of data collected by sensors and flight recorders can range from a few dozen to several thousand parameters per second during a flight. Whereas these data have long been utilized in investigating crashes, the present method is oriented toward helping to prevent crashes by enabling routine monitoring of flight operations to identify portions of flights that may be of interest with respect to safety issues.
ERIC Educational Resources Information Center
Monterde-i-Bort, Hector; Frias-Navarro, Dolores; Pascual-Llobell, Juan
2010-01-01
The empirical study we present here deals with a pedagogical issue that has not been thoroughly explored up until now in our field. Previous empirical studies in other sectors have identified the opinions of researchers about this topic, showing that completely unacceptable interpretations have been made of significance tests and other statistical…
Configurational Statistics of Magnetic Bead Detection with Magnetoresistive Sensors
Henriksen, Anders Dahl; Ley, Mikkel Wennemoes Hvitfeld; Flyvbjerg, Henrik; Hansen, Mikkel Fougt
2015-01-01
Magnetic biosensors detect magnetic beads that, mediated by a target, have bound to a functionalized area. This area is often larger than the area of the sensor. Both the sign and magnitude of the average magnetic field experienced by the sensor from a magnetic bead depends on the location of the bead relative to the sensor. Consequently, the signal from multiple beads also depends on their locations. Thus, a given coverage of the functionalized area with magnetic beads does not result in a given detector response, except on the average, over many realizations of the same coverage. We present a systematic theoretical analysis of how this location-dependence affects the sensor response. The analysis is done for beads magnetized by a homogeneous in-plane magnetic field. We determine the expected value and standard deviation of the sensor response for a given coverage, as well as the accuracy and precision with which the coverage can be determined from a single sensor measurement. We show that statistical fluctuations between samples may reduce the sensitivity and dynamic range of a sensor significantly when the functionalized area is larger than the sensor area. Hence, the statistics of sampling is essential to sensor design. For illustration, we analyze three important published cases for which statistical fluctuations are dominant, significant, and insignificant, respectively. PMID:26496495
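The coverage-versus-response fluctuation analysis above can be caricatured with a Monte Carlo over random bead positions. This is a deliberately crude 1-D toy under invented assumptions (a bead over the sensor contributes +1, a bead elsewhere on the strip a weak opposite-sign -0.1); the paper's actual model uses the real dipole field of beads magnetized in-plane.

```python
import numpy as np

def sensor_response(n_beads, rng, sensor_half=0.2):
    """Toy 1-D model: beads land uniformly on a functionalized strip
    [-1, 1]; beads over the sensor (|x| < sensor_half) contribute +1,
    beads outside contribute an opposite-sign -0.1 (both invented)."""
    x = rng.uniform(-1.0, 1.0, n_beads)
    return float(np.where(np.abs(x) < sensor_half, 1.0, -0.1).sum())

rng = np.random.default_rng(10)
# many realizations of the SAME coverage (200 beads) -> a distribution
# of responses, not a single value
responses = np.array([sensor_response(200, rng) for _ in range(2000)])
mean_r, std_r = float(responses.mean()), float(responses.std())
```

Even at fixed coverage the response spreads over many units around its mean, illustrating the paper's point that sampling statistics limit the precision of a single-shot coverage estimate when the functionalized area exceeds the sensor area.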
Detection of bearing damage by statistic vibration analysis
NASA Astrophysics Data System (ADS)
Sikora, E. A.
2016-04-01
The condition of bearings, which are essential components in mechanisms, is crucial to safety. A bearing's vibration signal, which is always contaminated by certain types of noise, is a very important indicator for diagnosing its mechanical condition and failure phenomena. In this paper, a method of rolling-bearing fault detection by statistical analysis of vibration is proposed to filter out Gaussian noise contained in the raw vibration signal. Experimental results show that the vibration signal can be significantly enhanced by application of the proposed method. The method is also used to analyse real acoustic signals of bearings with inner-race and outer-race faults, respectively. The values of the attributes are determined according to the degree of the fault. The results confirm that the periods between the transients, which represent bearing fault characteristics, can be successfully detected.
Steganography forensics method for detecting least significant bit replacement attack
NASA Astrophysics Data System (ADS)
Wang, Xiaofeng; Wei, Chengcheng; Han, Xiao
2015-01-01
We present an image forensics method to detect least significant bit replacement steganography attack. The proposed method provides fine-grained forensics features by using the hierarchical structure that combines pixels correlation and bit-planes correlation. This is achieved via bit-plane decomposition and difference matrices between the least significant bit-plane and each one of the others. Generated forensics features provide the susceptibility (changeability) that will be drastically altered when the cover image is embedded with data to form a stego image. We developed a statistical model based on the forensics features and used least square support vector machine as a classifier to distinguish stego images from cover images. Experimental results show that the proposed method provides the following advantages. (1) The detection rate is noticeably higher than that of some existing methods. (2) It has the expected stability. (3) It is robust for content-preserving manipulations, such as JPEG compression, adding noise, filtering, etc. (4) The proposed method provides satisfactory generalization capability.
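The bit-plane decomposition step described above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's implementation: the exact difference-matrix features and the susceptibility measure are assumptions here, reduced to a simple disagreement fraction between the least significant bit-plane and each higher plane.

```python
# Illustrative sketch of bit-plane decomposition for LSB steganalysis
# features; the paper's exact difference-matrix features are not reproduced.

def bit_plane(pixels, k):
    """Extract bit-plane k (0 = least significant) from 8-bit pixel values."""
    return [(p >> k) & 1 for p in pixels]

def plane_difference(pixels, k):
    """Fraction of positions where the LSB plane disagrees with plane k,
    a simple stand-in for the paper's difference matrices."""
    lsb = bit_plane(pixels, 0)
    other = bit_plane(pixels, k)
    return sum(a != b for a, b in zip(lsb, other)) / len(pixels)

# One feature per higher bit-plane; LSB-replacement embedding perturbs
# only plane 0, which shifts these disagreement fractions.
pixels = [0, 1, 2, 3, 128, 129, 254, 255]
features = [plane_difference(pixels, k) for k in range(1, 8)]
```

In the full method, features of this kind would feed the least-squares support vector machine classifier trained to separate cover images from stego images.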
ERIC Educational Resources Information Center
Huston, Holly L.
This paper begins with a general discussion of statistical significance, effect size, and power analysis; and concludes by extending the discussion to the multivariate case (MANOVA). Historically, traditional statistical significance testing has guided researchers' thinking about the meaningfulness of their data. The use of significance testing…
Deriving statistical significance maps for SVM based image classification and group comparisons.
Gaonkar, Bilwaj; Davatzikos, Christos
2012-01-01
Population based pattern analysis and classification for quantifying structural and functional differences between diverse groups has been shown to be a powerful tool for the study of a number of diseases, and is quite commonly used especially in neuroimaging. The alternative to these pattern analysis methods, namely mass univariate methods such as voxel based analysis and all related methods, cannot detect multivariate patterns associated with group differences, and are not particularly suitable for developing individual-based diagnostic and prognostic biomarkers. A commonly used pattern analysis tool is the support vector machine (SVM). Unlike univariate statistical frameworks for morphometry, analytical tools for statistical inference are unavailable for the SVM. In this paper, we show that null distributions ordinarily obtained by permutation tests using SVMs can be analytically approximated from the data. The analytical computation takes a small fraction of the time it takes to do an actual permutation test, thereby rendering it possible to quickly create statistical significance maps derived from SVMs. Such maps are critical for understanding imaging patterns of group differences and interpreting which anatomical regions are important in determining the classifier's decision.
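For contrast with the analytical approximation the paper proposes, the costly baseline it replaces, a plain permutation null, can be sketched as follows. A generic group-difference statistic stands in for an actual SVM here (an assumption for brevity); the structure of the loop is the same either way.

```python
import random

def perm_test_mean_diff(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.
    Stands in for the label-permutation null that, for SVM weight maps,
    the paper approximates analytically instead of simulating."""
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # permute the group labels
        px, py = pooled[:len(x)], pooled[len(x):]
        if abs(sum(px) / len(px) - sum(py) / len(py)) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)  # add-one correction avoids p = 0
```

Repeating this loop for every voxel, with an SVM retrained per permutation, is what makes the permutation approach expensive and the analytical approximation attractive.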
Why Are People Bad at Detecting Randomness? A Statistical Argument
ERIC Educational Resources Information Center
Williams, Joseph J.; Griffiths, Thomas L.
2013-01-01
Errors in detecting randomness are often explained in terms of biases and misconceptions. We propose and provide evidence for an account that characterizes the contribution of the inherent statistical difficulty of the task. Our account is based on a Bayesian statistical analysis, focusing on the fact that a random process is a special case of…
Assessing Genome-Wide Statistical Significance for Large p Small n Problems
Diao, Guoqing; Vidyashankar, Anand N.
2013-01-01
Assessing genome-wide statistical significance is an important issue in genetic studies. We describe a new resampling approach for determining the appropriate thresholds for statistical significance. Our simulation results demonstrate that the proposed approach accurately controls the genome-wide type I error rate even under the large p small n situations. PMID:23666935
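One standard resampling recipe for such thresholds approximates the null distribution of the maximum test statistic over all p tests, which controls the family-wise error rate. A minimal Monte-Carlo sketch follows; the independent N(0, 1) nulls are an assumption for illustration, whereas the paper's resampling scheme preserves the dependence in the data.

```python
import random

def maxstat_threshold(p, n_resample=1000, alpha=0.05, seed=1):
    """Genome-wide threshold: the (1 - alpha) quantile of the maximum of
    p absolute null z-statistics, here simulated as independent N(0, 1)."""
    rng = random.Random(seed)
    maxima = sorted(
        max(abs(rng.gauss(0.0, 1.0)) for _ in range(p))
        for _ in range(n_resample)
    )
    return maxima[int((1 - alpha) * n_resample) - 1]
```

With p = 1 this recovers the familiar single-test cutoff near 1.96; the threshold grows with p, which is how the genome-wide type I error rate stays controlled.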
ERIC Educational Resources Information Center
Norris, John M.
2015-01-01
Traditions of statistical significance testing in second language (L2) quantitative research are strongly entrenched in how researchers design studies, select analyses, and interpret results. However, statistical significance tests using "p" values are commonly misinterpreted by researchers, reviewers, readers, and others, leading to…
A Review of Post-1994 Literature on Whether Statistical Significance Tests Should Be Banned.
ERIC Educational Resources Information Center
Sullivan, Jeremy R.
This paper summarizes the literature regarding statistical significance testing with an emphasis on: (1) the post-1994 literature in various disciplines; (2) alternatives to statistical significance testing; and (3) literature exploring why researchers have demonstrably failed to be influenced by the 1994 American Psychological Association…
"What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"
ERIC Educational Resources Information Center
Ozturk, Elif
2012-01-01
The present paper aims to review two motivations to conduct "what if" analyses using Excel and "R" to understand the statistical significance tests through the sample size context. "What if" analyses can be used to teach students what statistical significance tests really do and in applied research either prospectively to estimate what sample size…
2013-01-01
Background Relative validity (RV), a ratio of ANOVA F-statistics, is often used to compare the validity of patient-reported outcome (PRO) measures. We used the bootstrap to establish the statistical significance of the RV and to identify key factors affecting its significance. Methods Based on responses from 453 chronic kidney disease (CKD) patients to 16 CKD-specific and generic PRO measures, RVs were computed to determine how well each measure discriminated across clinically-defined groups of patients compared to the most discriminating (reference) measure. Statistical significance of RV was quantified by the 95% bootstrap confidence interval. Simulations examined the effects of sample size, denominator F-statistic, correlation between comparator and reference measures, and number of bootstrap replicates. Results The statistical significance of the RV increased as the magnitude of denominator F-statistic increased or as the correlation between comparator and reference measures increased. A denominator F-statistic of 57 conveyed sufficient power (80%) to detect an RV of 0.6 for two measures correlated at r = 0.7. Larger denominator F-statistics or higher correlations provided greater power. Larger sample size with a fixed denominator F-statistic or more bootstrap replicates (beyond 500) had minimal impact. Conclusions The bootstrap is valuable for establishing the statistical significance of RV estimates. A reasonably large denominator F-statistic (F > 57) is required for adequate power when using the RV to compare the validity of measures with small or moderate correlations (r < 0.7). Substantially greater power can be achieved when comparing measures of a very high correlation (r > 0.9). PMID:23721463
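The RV bootstrap can be sketched directly: compute a one-way ANOVA F-statistic for each measure across the clinical groups, and bootstrap patients jointly (so the comparator and reference scores stay paired) to obtain a confidence interval for the ratio. The data layout below is an assumption for illustration, not the study's code.

```python
import random

def f_stat(groups):
    """One-way ANOVA F-statistic for a list of groups of scores."""
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ssb = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    if ssw == 0:  # guard against a degenerate bootstrap resample
        return float("inf")
    return (ssb / (len(groups) - 1)) / (ssw / (n - len(groups)))

def rv_bootstrap_ci(groups, n_boot=500, seed=2):
    """95% percentile bootstrap CI for RV = F(comparator) / F(reference).

    `groups` is a list of clinical groups, each a list of
    (comparator_score, reference_score) pairs; patients are resampled
    within their group so the two measures remain paired."""
    rng = random.Random(seed)
    rvs = []
    for _ in range(n_boot):
        res = [[rng.choice(g) for _ in g] for g in groups]
        comp = [[pair[0] for pair in g] for g in res]
        ref = [[pair[1] for pair in g] for g in res]
        rvs.append(f_stat(comp) / f_stat(ref))
    rvs.sort()
    return rvs[int(0.025 * n_boot)], rvs[int(0.975 * n_boot) - 1]
```

A comparator is declared significantly less valid than the reference when the upper confidence limit for RV falls below 1.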
Using statistical methods and genotyping to detect tuberculosis outbreaks
2013-01-01
Background Early identification of outbreaks remains a key component in continuing to reduce the burden of infectious disease in the United States. Previous studies have applied statistical methods to detect unexpected cases of disease in space or time. The objectives of our study were to assess the ability and timeliness of three spatio-temporal methods to detect known outbreaks of tuberculosis. Methods We used routinely available molecular and surveillance data to retrospectively assess the effectiveness of three statistical methods in detecting tuberculosis outbreaks: county-based log-likelihood ratio, cumulative sums, and a spatial scan statistic. Results Our methods identified 8 of the 9 outbreaks, and 6 outbreaks would have been identified 1–52 months (median = 10 months) before local public health authorities identified them. Assuming no delays in data availability, 46 (59.7%) of the 77 patients in the 9 outbreaks were identified after our statistical methods would have detected the outbreak but before local public health authorities became aware of the problem. Conclusions Statistical methods, when applied retrospectively to routinely collected tuberculosis data, can successfully detect known outbreaks, potentially months before local public health authorities become aware of the problem. The three methods showed similar results; no single method was clearly superior to the other two. Further study to elucidate the performance of these methods in detecting tuberculosis outbreaks will be done in a prospective analysis. PMID:23497235
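Of the three methods, cumulative sums (CUSUM) is the simplest to sketch: it accumulates deviations of observed case counts above a baseline and raises an alarm once the running sum crosses a threshold. The reference value k and threshold h below are conventional placeholder settings, not the study's calibrated values.

```python
def cusum_alarm(counts, baseline, k=0.5, h=4.0):
    """One-sided CUSUM over a series of case counts.

    Returns the index of the first count at which the cumulative excess
    over (baseline + k) crosses threshold h, or None if no alarm fires."""
    s = 0.0
    for i, c in enumerate(counts):
        # Reset at zero so only sustained excesses accumulate.
        s = max(0.0, s + (c - baseline - k))
        if s > h:
            return i
    return None
```

For example, `cusum_alarm([1, 1, 1, 5, 5, 5], baseline=1.0)` flags the outbreak at index 4, one interval after the counts jump.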
Crow, C.J.
1985-01-01
Middle Ordovician age Chickamauga Group carbonates crop out along the Birmingham and Murphrees Valley anticlines in central Alabama. The macrofossil contents on exposed surfaces of seven bioherms have been counted to determine their various paleontologic characteristics. Twelve groups of organisms are present in these bioherms. Dominant organisms include bryozoans, algae, brachiopods, sponges, pelmatozoans, stromatoporoids and corals. Minor accessory fauna include predators, scavengers and grazers such as gastropods, ostracods, trilobites, cephalopods and pelecypods. Vertical and horizontal niche zonation has been detected for some of the bioherm dwelling fauna. No one bioherm of those studied exhibits all 12 groups of organisms; rather, individual bioherms display various subsets of the total diversity. Statistical treatment (G-test) of the diversity data indicates a lack of statistical homogeneity of the bioherms, both within and between localities. Between-locality population heterogeneity can be ascribed to differences in biologic responses to such gross environmental factors as water depth and clarity, and energy levels. At any one locality, gross aspects of the paleoenvironments are assumed to have been more uniform. Significant differences among bioherms at any one locality may have resulted from patchy distribution of species populations, differential preservation and other factors.
Brazilian Amazonia Deforestation Detection Using Spatio-Temporal Scan Statistics
NASA Astrophysics Data System (ADS)
Vieira, C. A. O.; Santos, N. T.; Carneiro, A. P. S.; Balieiro, A. A. S.
2012-07-01
The spatio-temporal models developed for analyses of diseases can also be used in other fields of study, including forests and deforestation. The aim of this paper is to quantitatively identify priority areas for combating deforestation in the Amazon forest, using the space-time scan statistic. The study area is located in the south of Amazonas State and covers around 297,183 square kilometres, including the municipalities of Boca do Acre, Lábrea, Canutama, Humaitá, Manicoré, Novo Aripuanã and Apuí in the north region of Brazil. This area has shown significant land-cover change, which has increased the number of deforestation alerts. The situation therefore warrants further investigation into the factors that increase the number of cases in the area. The data comprise the location and year of each deforestation alert. These alerts are mapped by DETER (Detection System of Deforestation in Real Time in Amazonia), which is carried out by the Brazilian Space Agency (INPE). The software SaTScan v7.0 was used to apply the space-time permutation scan statistic for detection of deforestation cases. The outcome of this experiment shows an efficient model for detecting space-time clusters of deforestation alerts. The model efficiently detected the location, size, order and characteristics of the activities by the end of the experiments. Two clusters were considered active and remained active until the end of the study; these clusters are located in Canutama and Lábrea. This quantitative spatial modelling of deforestation warnings allowed: firstly, identifying active clusters of deforestation, on which environmental government officials can concentrate their actions; secondly, identifying historic clusters of deforestation, which officials can monitor to prevent them from becoming active again; and finally
Statistical power for detecting trends with applications to seabird monitoring
Hatch, Shyla A.
2003-01-01
Power analysis is helpful in defining goals for ecological monitoring and evaluating the performance of ongoing efforts. I examined detection standards proposed for population monitoring of seabirds using two programs (MONITOR and TRENDS) specially designed for power analysis of trend data. Neither program models within- and among-years components of variance explicitly and independently, thus an error term that incorporates both components is an essential input. Residual variation in seabird counts consisted of day-to-day variation within years and unexplained variation among years in approximately equal parts. The appropriate measure of error for power analysis is the standard error of estimation (S.E.est) from a regression of annual means against year. Replicate counts within years are helpful in minimizing S.E.est but should not be treated as independent samples for estimating power to detect trends. Other issues include a choice of assumptions about variance structure and selection of an exponential or linear model of population change. Seabird count data are characterized by strong correlations between S.D. and mean, thus a constant CV model is appropriate for power calculations. Time series were fit about equally well with exponential or linear models, but log transformation ensures equal variances over time, a basic assumption of regression analysis. Using sample data from seabird monitoring in Alaska, I computed the number of years required (with annual censusing) to detect trends of -1.4% per year (50% decline in 50 years) and -2.7% per year (50% decline in 25 years). At α = 0.05 and a desired power of 0.9, estimated study intervals ranged from 11 to 69 years depending on species, trend, software, and study design. Power to detect a negative trend of 6.7% per year (50% decline in 10 years) is suggested as an alternative standard for seabird monitoring that achieves a reasonable match between statistical and biological significance.
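The kind of calculation described, years required to detect a given exponential trend under a constant-CV error model, can be approximated by simulation. The sketch below regresses log counts on year and uses a fixed t cutoff of 2.0 (roughly alpha = 0.05 for moderate degrees of freedom; an approximation). It is not the MONITOR or TRENDS implementation, and the starting count of 100 is a placeholder.

```python
import math
import random

def trend_power(years, slope_per_year, cv, n_sim=1000, t_crit=2.0, seed=3):
    """Monte-Carlo power to detect a log-linear trend: simulate counts with
    constant CV (lognormal noise), regress log counts on year, and count
    the simulations where the slope's |t| exceeds t_crit."""
    rng = random.Random(seed)
    xs = list(range(years))
    xbar = sum(xs) / years
    sxx = sum((x - xbar) ** 2 for x in xs)
    sigma = math.sqrt(math.log(1 + cv ** 2))  # lognormal sd on the log scale
    hits = 0
    for _ in range(n_sim):
        ys = [math.log(100.0) + math.log(1 + slope_per_year) * x
              + rng.gauss(0.0, sigma) for x in xs]
        ybar = sum(ys) / years
        b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
        resid = [(y - ybar) - b * (x - xbar) for x, y in zip(xs, ys)]
        se = math.sqrt(sum(r * r for r in resid) / (years - 2) / sxx)
        if se > 0 and abs(b) / se > t_crit:
            hits += 1
    return hits / n_sim
```

Calling this for increasing `years` until the returned power reaches 0.9 reproduces the study-interval calculation in spirit: a 20-year annual census with CV = 0.2 detects a -2.7% per year trend with high power, while shorter series do not.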
NASA Technical Reports Server (NTRS)
Xu, Kuan-Man
2006-01-01
A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance. The data used in testing the bootstrap method are satellite measurements of cloud systems called cloud objects. Each cloud object is defined as a contiguous region/patch composed of individual footprints or fields of view. A histogram of measured values over footprints is generated for each parameter of each cloud object and then summary histograms are accumulated over all individual histograms in a given cloud-object size category. The results of statistical hypothesis tests using all three distances as test statistics are generally similar, indicating the validity of the proposed method. The Euclidean distance is determined to be most suitable after comparing the statistical tests of several parameters with distinct probability distributions among three cloud-object size categories. Impacts on the statistical significance levels resulting from differences in the total lengths of satellite footprint data between two size categories are also discussed.
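The bootstrap test with the Euclidean distance can be sketched as follows: pool the two summary histograms, resample histogram pairs of the original sizes under the null that both come from the pooled distribution, and report how often the resampled distance reaches the observed one. This is a simplified reading of the procedure, not the authors' code.

```python
import math
import random

def euclid(h1, h2):
    """Euclidean distance between two histograms after normalization."""
    n1, n2 = sum(h1), sum(h2)
    return math.sqrt(sum((a / n1 - b / n2) ** 2 for a, b in zip(h1, h2)))

def bootstrap_pvalue(h1, h2, n_boot=500, seed=4):
    """Bootstrap p-value for the null that two summary histograms are
    drawn from the same underlying distribution."""
    rng = random.Random(seed)
    observed = euclid(h1, h2)
    pooled = [a + b for a, b in zip(h1, h2)]
    bins = range(len(pooled))
    count = 0
    for _ in range(n_boot):
        # Draw two histograms of the original sizes from the pooled counts.
        b1 = [0] * len(pooled)
        for d in rng.choices(bins, weights=pooled, k=sum(h1)):
            b1[d] += 1
        b2 = [0] * len(pooled)
        for d in rng.choices(bins, weights=pooled, k=sum(h2)):
            b2[d] += 1
        if euclid(b1, b2) >= observed:
            count += 1
    return (count + 1) / (n_boot + 1)
```

Swapping `euclid` for a Jeffries-Matusita or Kuiper distance changes only the test statistic, which is the comparison the paper carries out.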
A Network-Based Method to Assess the Statistical Significance of Mild Co-Regulation Effects
Horvát, Emőke-Ágnes; Zhang, Jitao David; Uhlmann, Stefan; Sahin, Özgür; Zweig, Katharina Anna
2013-01-01
Recent development of high-throughput, multiplexing technology has initiated projects that systematically investigate interactions between two types of components in biological networks, for instance transcription factors and promoter sequences, or microRNAs (miRNAs) and mRNAs. In terms of network biology, such screening approaches primarily attempt to elucidate relations between biological components of two distinct types, which can be represented as edges between nodes in a bipartite graph. However, it is often desirable not only to determine regulatory relationships between nodes of different types, but also to understand the connection patterns of nodes of the same type. Especially interesting is the co-occurrence of two nodes of the same type, i.e., the number of their common neighbours, which current high-throughput screening analysis fails to address. The co-occurrence gives the number of circumstances under which both of the biological components are influenced in the same way. Here we present SICORE, a novel network-based method to detect pairs of nodes with a statistically significant co-occurrence. We first show the stability of the proposed method on artificial data sets: when randomly adding and deleting observations we obtain reliable results even with noise exceeding the expected level in large-scale experiments. Subsequently, we illustrate the viability of the method based on the analysis of a proteomic screening data set to reveal regulatory patterns of human microRNAs targeting proteins in the EGFR-driven cell cycle signalling system. Since statistically significant co-occurrence may indicate functional synergy and the mechanisms underlying canalization, and thus hold promise in drug target identification and therapeutic development, we provide a platform-independent implementation of SICORE with a graphical user interface as a novel tool in the arsenal of high-throughput screening analysis. PMID:24039936
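The co-occurrence quantity at the heart of this analysis, the number of common neighbours of two same-type nodes in a bipartite graph, is straightforward to compute; the paper's contribution, assessing its statistical significance against a random-graph null model, is not reproduced here. The edge data below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def co_occurrence(edges):
    """Common-neighbour counts for every pair of left-side nodes of a
    bipartite graph given as (left, right) edges, e.g. (miRNA, protein)."""
    neigh = defaultdict(set)
    for u, v in edges:
        neigh[u].add(v)
    return {(a, b): len(neigh[a] & neigh[b])
            for a, b in combinations(sorted(neigh), 2)}

# Hypothetical miRNA -> protein regulation edges.
edges = [("m1", "p1"), ("m1", "p2"), ("m2", "p2"),
         ("m2", "p3"), ("m3", "p1"), ("m3", "p2")]
counts = co_occurrence(edges)
```

A method like SICORE would then ask whether, say, the two shared targets of m1 and m3 are more than expected for nodes of their degrees.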
Statistical detection of EEG synchrony using empirical bayesian inference.
Singh, Archana K; Asoh, Hideki; Takeda, Yuji; Phillips, Steven
2015-01-01
There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit the complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that locFDR can effectively control false positives without compromising the power of PLV synchrony inference. Our results from applying locFDR to the experimental data detected more significant discoveries than our previously proposed methods, whereas the standard FDR method failed to detect any significant discoveries. PMID:25822617
Statistical Studies on Sequential Probability Ratio Test for Radiation Detection
Warnick Kernan, Ding Yuan, et al.
2007-07-01
A Sequential Probability Ratio Test (SPRT) algorithm helps to increase the reliability and speed of radiation detection. The algorithm is further improved here to reduce spatial gaps and false alarms. SPRT with a Last-in-First-Elected-Last-Out (LIFELO) technique reduces the error between the measured radiation and the resulting alarm. Statistical analysis quantifies the reduction of spatial error and false alarms.
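A minimal Wald SPRT for Poisson counts (background rate versus background-plus-source) illustrates the sequential decision the abstract describes; the LIFELO refinement is not included, and the rates and error levels below are placeholders.

```python
import math

def sprt_decision(counts, bg_rate, src_rate, alpha=0.01, beta=0.01):
    """Wald's SPRT on a stream of Poisson counts: H0 background-only rate
    vs H1 background-plus-source rate. Returns a decision string
    ('source' | 'background' | 'continue') and the samples consumed."""
    a = math.log((1 - beta) / alpha)   # upper threshold -> declare source
    b = math.log(beta / (1 - alpha))   # lower threshold -> declare background
    llr = 0.0
    for i, c in enumerate(counts, 1):
        # Poisson log-likelihood ratio contribution of one observation.
        llr += c * math.log(src_rate / bg_rate) - (src_rate - bg_rate)
        if llr >= a:
            return "source", i
        if llr <= b:
            return "background", i
    return "continue", len(counts)
```

The sequential form is what buys speed: strong evidence in either direction terminates the test after only a few counting intervals instead of a fixed-length measurement.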
A Computer Program for Detection of Statistical Outliers
ERIC Educational Resources Information Center
Pascale, Pietro J.; Lovas, Charles M.
1976-01-01
Presents a Fortran program which computes the rejection criteria of ten procedures for detecting outlying observations. These criteria are defined on comment cards. Journal sources for the statistical equations are listed. After applying rejection rules, the program calculates the mean and standard deviation of the censored sample. (Author/RC)
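Since the Fortran source is not reproduced here, the sketch below shows one representative rejection rule in the same spirit, followed by the censored mean and standard deviation. The rule (an iterative extreme-studentized-deviate cutoff with a fixed z threshold) is an illustrative assumption, not one of the program's ten documented procedures; proper critical values, e.g. Grubbs', depend on the sample size.

```python
import statistics

def censor_outliers(sample, z_crit=2.0):
    """Iteratively drop the value farthest from the mean while its z-score
    exceeds z_crit, then return (mean, sd, censored sample). A fixed cutoff
    is crude for small n, where the maximum z is bounded by (n-1)/sqrt(n)."""
    data = sorted(sample)
    while len(data) > 2:
        m = statistics.mean(data)
        s = statistics.stdev(data)
        far = max(data, key=lambda x: abs(x - m))
        if s == 0 or abs(far - m) / s <= z_crit:
            break
        data.remove(far)  # reject the most extreme observation
    return statistics.mean(data), statistics.stdev(data), data
```

As in the program described, the summary statistics are computed only after the rejection rule has censored the sample.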
Statistical methods for detecting periodic fragments in DNA sequence data
2011-01-01
Background Period 10 dinucleotides are structurally and functionally validated factors that influence the ability of DNA to form nucleosomes, histone core octamers. Robust identification of periodic signals in DNA sequences is therefore required to understand nucleosome organisation in genomes. While various techniques for identifying periodic components in genomic sequences have been proposed or adopted, the requirements for such techniques have not been considered in detail and confirmatory testing for a priori specified periods has not been developed. Results We compared the estimation accuracy and suitability for confirmatory testing of autocorrelation, discrete Fourier transform (DFT), integer period discrete Fourier transform (IPDFT) and a previously proposed Hybrid measure. A number of different statistical significance procedures were evaluated but a blockwise bootstrap proved superior. When applied to synthetic data whose period-10 signal had been eroded, or for which the signal was approximately period-10, the Hybrid technique exhibited superior properties during exploratory period estimation. In contrast, confirmatory testing using the blockwise bootstrap procedure identified IPDFT as having the greatest statistical power. These properties were validated on yeast sequences defined from a ChIP-chip study where the Hybrid metric confirmed the expected dominance of period-10 in nucleosome associated DNA but IPDFT identified more significant occurrences of period-10. Application to the whole genomes of yeast and mouse identified ~ 21% and ~ 19% respectively of these genomes as spanned by period-10 nucleosome positioning sequences (NPS). Conclusions For estimating the dominant period, we find the Hybrid period estimation method empirically to be the most effective for both eroded and approximate periodicity. The blockwise bootstrap was found to be effective as a significance measure, performing particularly well in the problem of period detection in the
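The IPDFT measure singled out above for confirmatory testing amounts to projecting a (dinucleotide-indicator) sequence onto sine and cosine components of an integer period. A minimal version, without the blockwise-bootstrap significance step, might look as follows; the normalization is an assumption for illustration.

```python
import math

def ipdft_power(x, period):
    """Integer-period DFT power: squared projection of the mean-centred
    signal onto cos/sin of the given integer period, divided by length."""
    n = len(x)
    m = sum(x) / n
    c = sum((x[i] - m) * math.cos(2 * math.pi * i / period) for i in range(n))
    s = sum((x[i] - m) * math.sin(2 * math.pi * i / period) for i in range(n))
    return (c * c + s * s) / n
```

Applied to a 0/1 indicator of a dinucleotide of interest, a large value at `period=10` relative to neighbouring periods is the kind of signal the confirmatory test then assesses via the blockwise bootstrap.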
Statistical Significance of Periodicity and Log-Periodicity with Heavy-Tailed Correlated Noise
NASA Astrophysics Data System (ADS)
Zhou, Wei-Xing; Sornette, Didier
We estimate the probability that random noise, of several plausible standard distributions, creates a false alarm that a periodicity (or log-periodicity) is found in a time series. The solution of this problem is already known for independent Gaussian distributed noise. We investigate more general situations with non-Gaussian correlated noises and present synthetic tests on the detectability and statistical significance of periodic components. A periodic component of a time series is usually detected by some sort of Fourier analysis. Here, we use the Lomb periodogram analysis, which is suitable and outperforms Fourier transforms for unevenly sampled time series. We examine the false-alarm probability of the largest spectral peak of the Lomb periodogram in the presence of power-law distributed noises, of short-range and of long-range fractional-Gaussian noises. Increasing heavy-tailness (respectively correlations describing persistence) tends to decrease (respectively increase) the false-alarm probability of finding a large spurious Lomb peak. Increasing anti-persistence tends to decrease the false-alarm probability. We also study the interplay between heavy-tailness and long-range correlations. In order to fully determine if a Lomb peak signals a genuine rather than a spurious periodicity, one should in principle characterize the Lomb peak height, its width and its relations to other peaks in the complete spectrum. As a step towards this full characterization, we construct the joint-distribution of the frequency position (relative to other peaks) and of the height of the highest peak of the power spectrum. We also provide the distributions of the ratio of the highest Lomb peak to the second highest one. Using the insight obtained by the present statistical study, we re-examine previously reported claims of ``log-periodicity'' and find that the credibility for log-periodicity in 2D-freely decaying turbulence is weakened while it is strengthened for fracture, for the
Coulson, Melissa; Healey, Michelle; Fidler, Fiona; Cumming, Geoff
2010-01-01
A statistically significant result and a non-significant result may differ little, although significance status may tempt an interpretation of difference. Two studies are reported that compared interpretation of such results presented using null hypothesis significance testing (NHST) or confidence intervals (CIs). Authors of articles published in psychology, behavioral neuroscience, and medical journals were asked, via email, to interpret two fictitious studies that found similar results, one statistically significant and the other non-significant. Responses from 330 authors varied greatly, but interpretation was generally poor, whether results were presented as CIs or using NHST. However, when interpreting CIs, respondents who mentioned NHST were 60% likely to conclude, unjustifiably, that the two results conflicted, whereas those who interpreted CIs without reference to NHST were 95% likely to conclude, justifiably, that the two results were consistent. Findings were generally similar for all three disciplines. An email survey of academic psychologists confirmed that CIs elicit better interpretations if NHST is not invoked. Improved statistical inference can result from encouragement of meta-analytic thinking and use of CIs, but for full benefit such highly desirable statistical reform requires that researchers also interpret CIs without recourse to NHST. PMID:21607077
A Generative Statistical Algorithm for Automatic Detection of Complex Postures
Amit, Yali; Biron, David
2015-01-01
This paper presents a method for automated detection of complex (non-self-avoiding) postures of the nematode Caenorhabditis elegans and its application to analyses of locomotion defects. Our approach is based on progressively detailed statistical models that enable detection of the head and the body even in cases of severe coilers, where data from traditional trackers is limited. We restrict the input available to the algorithm to a single digitized frame, such that manual initialization is not required and the detection problem becomes embarrassingly parallel. Consequently, the proposed algorithm does not propagate detection errors and naturally integrates in a “big data” workflow used for large-scale analyses. Using this framework, we analyzed the dynamics of postures and locomotion of wild-type animals and mutants that exhibit severe coiling phenotypes. Our approach can readily be extended to additional automated tracking tasks such as tracking pairs of animals (e.g., for mating assays) or different species. PMID:26439258
A statistical modeling approach for detecting generalized synchronization
Schumacher, Johannes; Haslinger, Robert; Pipa, Gordon
2012-01-01
Detecting nonlinear correlations between time series presents a hard problem for data analysis. We present a generative statistical modeling method for detecting nonlinear generalized synchronization. Truncated Volterra series are used to approximate functional interactions. The Volterra kernels are modeled as linear combinations of basis splines, whose coefficients are estimated via l1 and l2 regularized maximum likelihood regression. The regularization manages the high number of kernel coefficients and allows feature selection strategies yielding sparse models. The method's performance is evaluated on different coupled chaotic systems in various synchronization regimes and analytical results for detecting m:n phase synchrony are presented. Experimental applicability is demonstrated by detecting nonlinear interactions between neuronal local field potentials recorded in different parts of macaque visual cortex. PMID:23004851
Statistical feature selection for enhanced detection of brain tumor
NASA Astrophysics Data System (ADS)
Chaddad, Ahmad; Colen, Rivka R.
2014-09-01
Feature-based methods are widely used in brain tumor recognition systems, and robust early cancer detection is one of the most powerful applications of image processing. Specifically, statistical features such as geometric mean, harmonic mean, mean excluding outliers, median, percentiles, skewness and kurtosis have been extracted from glioma brain tumors to aid in discriminating two levels, namely Level I and Level II, using the fluid attenuated inversion recovery (FLAIR) sequence in the diagnosis of brain tumor. Statistical features describe the major characteristics of each glioma level, an important step in evaluating the heterogeneity of cancer-area pixels. In this paper, we address the task of feature selection to identify the relevant subset of features in the statistical domain, while discarding those that are either redundant or confusing, thereby improving the performance of the feature-based scheme to distinguish between Level I and Level II. We apply a Decision Structure algorithm to find the optimal combination of nonhomogeneity-based statistical features for the problem at hand. We employ a Naïve Bayes classifier to evaluate the performance of the optimal statistical-feature scheme in terms of its glioma Level I versus Level II discrimination capability, using real data collected from 17 patients with glioblastoma multiforme (GBM). The dataset was provided by the MD Anderson Cancer Center from a 3 Tesla MR imaging system. For the specific data analyzed, it is shown that the identified dominant features yield higher classification accuracy, with fewer false alarms and missed detections, compared to the full statistical feature set. This work analyzed specific GBM types, Level I and Level II, and the automatically selected dominant features were considered as aids to prognostic indicators, improving the ability to determine prognosis from classical imaging studies.
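The first-order statistical features listed in the abstract (means, median, percentiles, skewness, kurtosis) can be computed over a region's pixel values with the standard library. This is a generic illustration of the feature set, not the study's pipeline; population moments and quartile conventions are assumptions.

```python
import statistics

def stat_features(pixels):
    """First-order statistical features over a region of interest.
    Population moments are used; pixel values must be positive for the
    geometric and harmonic means."""
    n = len(pixels)
    m = statistics.mean(pixels)
    s = statistics.pstdev(pixels)
    skew = sum((x - m) ** 3 for x in pixels) / (n * s ** 3) if s else 0.0
    kurt = sum((x - m) ** 4 for x in pixels) / (n * s ** 4) - 3 if s else 0.0
    q = statistics.quantiles(pixels, n=4)
    return {
        "mean": m,
        "geometric_mean": statistics.geometric_mean(pixels),
        "harmonic_mean": statistics.harmonic_mean(pixels),
        "median": statistics.median(pixels),
        "p25": q[0], "p75": q[2],
        "skewness": skew,
        "kurtosis": kurt,  # excess kurtosis: 0 for a normal distribution
    }
```

Feature selection would then search over subsets of such a dictionary's keys, scoring each subset with the classifier.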
NASA Astrophysics Data System (ADS)
Wilks, Daniel S.
1996-04-01
A simple approach to long-range forecasting of monthly or seasonal quantities is the average of observations over some number of the most recent years. Finding this `optimal climate normal' (OCN) involves examining the relationships between the observed variable and averages of its values over the previous one to 30 years, and selecting the averaging period yielding the best results. This procedure involves a multiplicity of comparisons, which will lead to misleadingly positive results on the developmental data. The statistical significance of these OCNs is assessed here using a resampling procedure, in which time series of U.S. Climate Division data are repeatedly shuffled to produce statistical distributions of forecast performance measures under the null hypothesis that the OCNs exhibit no predictive skill. Substantial areas in the United States are found for which forecast performance appears to be significantly better than would occur by chance. Another complication in the assessment of the statistical significance of the OCNs derives from the spatial correlation exhibited by the data. Because of this correlation, instances of Type I errors (false rejections of local null hypotheses) will tend to occur with spatial coherency and accordingly have the potential to be confused with regions for which there may be real predictability. The `field significance' of the collections of local tests is also assessed here by simultaneously and coherently shuffling the time series for the Climate Divisions. Areas exhibiting significant local tests are large enough to conclude that seasonal OCN temperature forecasts exhibit significant skill over parts of the United States for all seasons except SON, OND, and NDJ, and that seasonal OCN precipitation forecasts are significantly skillful only in the fall. Statistical significance is weaker for monthly than for seasonal OCN temperature forecasts, and the monthly OCN precipitation forecasts do not exhibit significant predictive skill.
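The shuffling procedure can be sketched as follows: the best averaging-period skill is computed on the original series and on many shuffled copies, so the null distribution inherits the same multiplicity of comparisons as the OCN selection itself. Series length, the skill measure (correlation), and the number of shuffles are illustrative assumptions:

```python
import numpy as np

def ocn_skill(series, max_avg=30):
    """Best correlation skill over averaging periods k = 1..max_avg:
    each value is 'forecast' by the mean of the previous k values."""
    best = -np.inf
    for k in range(1, max_avg + 1):
        preds = np.array([series[t - k:t].mean() for t in range(max_avg, len(series))])
        obs = series[max_avg:]
        best = max(best, np.corrcoef(preds, obs)[0, 1])
    return best

rng = np.random.default_rng(1)
series = rng.normal(size=80)          # stand-in for one Climate Division record
actual = ocn_skill(series)

# Null distribution: shuffling destroys any real predictability, but the
# best-of-30 selection (and its optimistic bias) is repeated each time.
null = np.array([ocn_skill(rng.permutation(series)) for _ in range(200)])
p_value = np.mean(null >= actual)
```

Field significance would repeat this with all Division series shuffled coherently (same permutation of years for every series).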
Detection of a diffusive cloak via second-order statistics
NASA Astrophysics Data System (ADS)
Koirala, Milan; Yamilov, Alexey
2016-08-01
We propose a scheme to detect the diffusive cloak proposed by Schittny et al [Science 345, 427 (2014)]. We exploit the fact that diffusion of light is an approximation that disregards wave interference. The long-range contribution to intensity correlation is sensitive to the locations of path crossings and to the interference inside the medium, allowing one to detect the size and position, including the depth, of the diffusive cloak. Our results also suggest that it is possible to separately manipulate the first- and the second-order statistics of wave propagation in turbid media.
Statistically normalized coherent change detection for synthetic aperture sonar imagery
NASA Astrophysics Data System (ADS)
G-Michael, Tesfaye; Tucker, J. D.; Roberts, Rodney G.
2016-05-01
Coherent Change Detection (CCD) is a process of highlighting areas of activity in scenes (seafloor) under survey, generated from pairs of synthetic aperture sonar (SAS) images of approximately the same location observed at two different time instances. The problem of CCD and subsequent anomaly feature extraction/detection is complicated by several factors, such as the presence of random speckle patterns in the images, changing environmental conditions, and platform instabilities. These complications make the detection of weak target activities even more difficult. Typically, the degree of similarity between the two images measured at each pixel location is the coherence between the complex pixel values in the two images. Higher coherence indicates little change in the scene represented by the pixel, and lower coherence indicates change activity in the scene. Such a coherence estimation scheme based on pixel intensity correlation is an ad hoc procedure in which the effectiveness of the change detection is determined by the choice of threshold, which can lead to high false alarm rates. In this paper, we propose a novel approach for anomalous change pattern detection using statistically normalized coherence and multi-pass coherent processing. This method may be used to mitigate shadows by reducing the false alarms arising in the coherence map due to speckle and shadows. Test results of the proposed methods on a data set of SAS images will be presented, illustrating the effectiveness of the normalized coherence in terms of statistics from multi-pass surveys of the same scene.
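As a concrete illustration, the conventional sample-coherence map that the paper improves upon can be sketched with a small sliding window. This is the standard estimator, not the paper's statistically normalized variant; the window size and the synthetic "changed patch" are assumptions:

```python
import numpy as np

def coherence_map(img1, img2, win=5):
    """Sample coherence between two co-registered complex images, estimated
    over a sliding win x win neighborhood. Values lie in [0, 1]."""
    pad = win // 2
    num = img1 * np.conj(img2)
    p1 = np.abs(img1) ** 2
    p2 = np.abs(img2) ** 2

    def boxsum(a):
        # simple (unoptimized) sliding-window sum, clipped at the borders
        out = np.zeros_like(a)
        for i in range(a.shape[0]):
            for j in range(a.shape[1]):
                out[i, j] = a[max(0, i - pad):i + pad + 1,
                              max(0, j - pad):j + pad + 1].sum()
        return out

    return np.abs(boxsum(num)) / np.sqrt(boxsum(p1) * boxsum(p2) + 1e-12)

rng = np.random.default_rng(2)
scene = rng.normal(size=(40, 40)) + 1j * rng.normal(size=(40, 40))
repeat = scene + 0.05 * (rng.normal(size=(40, 40)) + 1j * rng.normal(size=(40, 40)))
changed = repeat.copy()
changed[15:25, 15:25] = rng.normal(size=(10, 10)) + 1j * rng.normal(size=(10, 10))

gamma = coherence_map(scene, changed)   # low gamma marks the changed patch
```

Thresholding `gamma` directly is exactly the ad hoc step whose false-alarm behavior the statistically normalized coherence is designed to control.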
Weighing the costs of different errors when determining statistical significance during monitoring
Technology Transfer Automated Retrieval System (TEKTRAN)
Selecting appropriate significance levels when constructing confidence intervals and performing statistical analyses with rangeland monitoring data is not a straightforward process. This process is burdened by the conventional selection of “95% confidence” (i.e., Type I error rate, α = 0.05) as the d...
ERIC Educational Resources Information Center
Linting, Marielle; van Os, Bart Jan; Meulman, Jacqueline J.
2011-01-01
In this paper, the statistical significance of the contribution of variables to the principal components in principal components analysis (PCA) is assessed nonparametrically by the use of permutation tests. We compare a new strategy to a strategy used in previous research consisting of permuting the columns (variables) of a data matrix…
ERIC Educational Resources Information Center
Spinella, Sarah
2011-01-01
As result replicability is essential to science and difficult to achieve through external replicability, the present paper notes the insufficiency of null hypothesis statistical significance testing (NHSST) and explains the bootstrap as a plausible alternative, with a heuristic example to illustrate the bootstrap method. The bootstrap relies on…
Interpreting Statistical Significance Test Results: A Proposed New "What If" Method.
ERIC Educational Resources Information Center
Kieffer, Kevin M.; Thompson, Bruce
As the 1994 publication manual of the American Psychological Association emphasized, "p" values are affected by sample size. As a result, it can be helpful to interpret the results of statistical significance tests in a sample-size context by conducting so-called "what if" analyses. However, these methods can be inaccurate unless "corrected" effect…
Recent Literature on Whether Statistical Significance Tests Should or Should Not Be Banned.
ERIC Educational Resources Information Center
Deegear, James
This paper summarizes the literature regarding statistical significance testing, with an emphasis on recent literature in various disciplines and on literature exploring why researchers have demonstrably failed to be influenced by the American Psychological Association publication manual's encouragement to report effect sizes. Also considered are…
Statistical Significance of the Trends in Monthly Heavy Precipitation Over the US
Mahajan, Salil; North, Dr. Gerald R.; Saravanan, Dr. R.; Genton, Dr. Marc G.
2012-01-01
Trends in monthly heavy precipitation, defined by a return period of one year, are assessed for statistical significance in observations and Global Climate Model (GCM) simulations over the contiguous United States using Monte Carlo non-parametric and parametric bootstrapping techniques. The results from the two Monte Carlo approaches are found to be similar to each other, and also to the traditional non-parametric Kendall's {tau} test, implying the robustness of the approach. Two different observational data-sets are employed to test for trends in monthly heavy precipitation and are found to exhibit consistent results. Both data-sets demonstrate upward trends, one of which is found to be statistically significant at the 95% confidence level. Upward trends similar to observations are observed in some climate model simulations of the twentieth century, but their statistical significance is marginal. For projections of the twenty-first century, a statistically significant upward trend is observed in most of the climate models analyzed. The change in the simulated precipitation variance appears to be more important in the twenty-first century projections than changes in the mean precipitation. Stochastic fluctuations of the climate system are found to dominate monthly heavy precipitation, as some GCM simulations show a downward trend even in the twenty-first century projections when the greenhouse gas forcings are strong.
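The non-parametric Monte Carlo flavor of such a trend test can be sketched in a few lines: the time axis is shuffled to destroy any trend, and the observed least-squares slope is compared with the null distribution of shuffled slopes (the synthetic series and shuffle count are assumptions; the paper's heavy-precipitation statistic itself is not reproduced):

```python
import numpy as np

def trend_pvalue(y, n_boot=2000, seed=0):
    """Two-sided permutation test of a linear trend in the series y."""
    rng = np.random.default_rng(seed)
    t = np.arange(len(y))
    slope = np.polyfit(t, y, 1)[0]
    # shuffling y removes any time ordering, giving the no-trend null
    null = np.array([np.polyfit(t, rng.permutation(y), 1)[0] for _ in range(n_boot)])
    return slope, np.mean(np.abs(null) >= abs(slope))

rng = np.random.default_rng(4)
years = np.arange(60)
heavy = 10 + 0.08 * years + rng.normal(scale=1.0, size=60)   # synthetic upward trend
slope, p = trend_pvalue(heavy)
```

A parametric bootstrap variant would instead resample residuals from a fitted distribution rather than permuting the observed values.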
ERIC Educational Resources Information Center
Snyder, Patricia; Lawson, Stephen
Magnitude of effect measures (MEMs), when adequately understood and correctly used, are important aids for researchers who do not want to rely solely on tests of statistical significance in substantive result interpretation. The MEM tells how much of the dependent variable can be controlled, predicted, or explained by the independent variables.…
Reflections on Statistical and Substantive Significance, with a Slice of Replication.
ERIC Educational Resources Information Center
Robinson, Daniel H.; Levin, Joel R.
1997-01-01
Proposes modifications to the recent suggestions by B. Thompson (1996) for an American Educational Research Association editorial policy on statistical significance testing. Points out that, although it is useful to include effect sizes, they can be misinterpreted, and argues, as does Thompson, for greater attention to replication in educational…
Simulated performance of an order statistic threshold strategy for detection of narrowband signals
NASA Technical Reports Server (NTRS)
Satorius, E.; Brady, R.; Deich, W.; Gulkis, S.; Olsen, E.
1988-01-01
The application of order statistics to signal detection is becoming an increasingly active area of research. This is due to the inherent robustness of rank estimators in the presence of large outliers that would significantly degrade more conventional mean-level-based detection systems. A detection strategy is presented in which the threshold estimate is obtained using order statistics. The performance of this algorithm in the presence of simulated interference and broadband noise is evaluated. In this way, the robustness of the proposed strategy in the presence of the interference can be fully assessed as a function of the interference, noise, and detector parameters.
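The robustness argument can be made concrete: a threshold built from an order statistic (here the median) of the spectral bins barely moves when strong narrowband interferers are present, whereas a mean-level threshold is dragged upward by them. The threshold multiplier and the synthetic spectrum are assumptions, not the paper's parameters:

```python
import numpy as np

def order_statistic_threshold(spectrum, rank_frac=0.5, scale=5.0):
    """Detection threshold from an order statistic of the power bins:
    rank_frac=0.5 uses the median as the noise-floor estimate, and
    `scale` is an assumed threshold multiplier."""
    noise_floor = np.quantile(spectrum, rank_frac)
    return scale * noise_floor

rng = np.random.default_rng(5)
spectrum = rng.exponential(scale=1.0, size=1024)   # broadband noise power bins
spectrum[100] = 50.0                               # narrowband signal of interest
spectrum[200:230] = 40.0                           # strong interference outliers

thr_rank = order_statistic_threshold(spectrum)
thr_mean = 5.0 * spectrum.mean()                   # mean-level baseline for contrast
detections = np.flatnonzero(spectrum > thr_rank)
```

Because only ~3% of the bins are outliers, the median-based floor stays near the true noise level while the mean-based threshold is inflated by the interference.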
Clinical Significance of Microcalcifications Detection in Invasive Breast Carcinoma
Hashimoto, Yuki; Murata, Aya; Miyamoto, Naoki; Takamori, Toshihiro; Hosoda, Yuta; Endo, Yukari; Kodani, Yuka; Sato, Kengo; Hosoya, Keiko; Ishiguro, Kiyosuke; Hirooka, Yasuaki
2015-01-01
Background Recently, many cases of breast microcalcifications have been identified on mammography (MG) images, as breast screening using MG has become common. Although MG is the gold-standard modality for detecting microcalcifications, ultrasonography (US) images can now also detect microcalcifications thanks to recent improvements in ultrasound diagnostic devices. In this report, we analyzed the clinical significance of microcalcifications detected on US images in invasive breast carcinoma. Methods Eighty-eight patients with invasive breast carcinoma who underwent MG and US before surgery at the Division of Breast and Endocrine Surgery of Tottori University Hospital between January 2012 and August 2013 were included. After reviewing the US images, the association between the presence of echogenic spots indicating microcalcifications and MG images or pathological findings was assessed. Results Patients without microcalcifications on US images were significantly more likely to have the Luminal A subtype and a lower nuclear grade. Conversely, patients with microcalcifications on US images were significantly more likely to have a higher MIB-1 index, lymphovascular invasion, comedonecrosis and lymph node metastasis. The rate of detecting microcalcifications on US images was relatively good, with a sensitivity of 81.8%, a specificity of 94.5% and a diagnostic accuracy of 89.8%. Among the calcifications detected on MG images, the detection rate on US images was higher for the necrotic type (92.6%) than for the secretory type (33.3%). Conclusion This study suggests that microcalcifications of tumors detected on US images could serve as a useful predictor for evaluating the degree of malignancy in patients with invasive breast carcinoma. PMID:26306060
Hulshizer, Randall; Blalock, Eric M
2007-01-01
Background Researchers using RNA expression microarrays in experimental designs with more than two treatment groups often identify statistically significant genes with ANOVA approaches. However, the ANOVA test does not discriminate which of the multiple treatment groups differ from one another. Thus, post hoc tests, such as linear contrasts, template correlations, and pairwise comparisons, are used. Linear contrasts and template correlations work extremely well, especially when the researcher has a priori information pointing to a particular pattern/template among the different treatment groups. Further, all pairwise comparisons can be used to identify particular, treatment group-dependent patterns of gene expression. However, these approaches are biased by the researcher's assumptions, and some treatment-based patterns may fail to be detected using these approaches. Finally, different patterns may have different probabilities of occurring by chance, importantly influencing researchers' conclusions about a pattern and its constituent genes. Results We developed a four-step, post hoc pattern matching (PPM) algorithm to automate single-channel gene expression pattern identification/significance. First, 1-Way Analysis of Variance (ANOVA), coupled with post hoc 'all pairwise' comparisons, is calculated for all genes. Second, for each ANOVA-significant gene, all pairwise contrast results are encoded to create unique pattern ID numbers. The number of genes found in each pattern in the data is identified as that pattern's 'actual' frequency. Third, using Monte Carlo simulations, those patterns' frequencies are estimated in random data (the 'random' gene pattern frequency). Fourth, a Z-score for overrepresentation of the pattern is calculated ('actual' against 'random' gene pattern frequencies). We wrote a Visual Basic program (StatiGen) that automates the PPM procedure, constructs an Excel workbook with standardized graphs of overrepresented patterns, and lists of the genes comprising
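Steps two through four can be sketched as follows, with a fixed mean-difference threshold standing in for the ANOVA-plus-post-hoc machinery (the simulated data, the threshold, and the number of Monte Carlo draws are all illustrative assumptions):

```python
import numpy as np
from itertools import combinations

def pattern_id(group_means, n_groups, thresh=1.0):
    """Step 2: encode every pairwise comparison of a gene's group means as
    '>', '<' or '=' and join the codes into a unique pattern ID."""
    code = []
    for a, b in combinations(range(n_groups), 2):
        d = group_means[a] - group_means[b]
        code.append(">" if d > thresh else "<" if d < -thresh else "=")
    return "".join(code)

rng = np.random.default_rng(6)
n_genes, n_groups, reps = 300, 3, 4
data = rng.normal(size=(n_genes, n_groups, reps))
data[:60, 2, :] += 3.0        # 60 genes: group 2 elevated -> pattern "=<<"

means = data.mean(axis=2)
actual = {}                   # each pattern's 'actual' frequency in the data
for g in range(n_genes):
    pid = pattern_id(means[g], n_groups)
    actual[pid] = actual.get(pid, 0) + 1

# Step 3: Monte Carlo estimate of the pattern's chance ('random') frequency.
null_counts = [sum(pattern_id(m, n_groups) == "=<<"
                   for m in rng.normal(size=(n_genes, n_groups, reps)).mean(axis=2))
               for _ in range(100)]

# Step 4: Z-score for overrepresentation of the "=<<" pattern.
z = (actual.get("=<<", 0) - np.mean(null_counts)) / (np.std(null_counts) + 1e-9)
```

A large Z-score flags the pattern as occurring far more often than chance alone would produce.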
Algorithm for Detecting Significant Locations from Raw GPS Data
NASA Astrophysics Data System (ADS)
Kami, Nobuharu; Enomoto, Nobuyuki; Baba, Teruyuki; Yoshikawa, Takashi
We present a fast algorithm for probabilistically extracting significant locations from raw GPS data based on data point density. Extracting significant locations from raw GPS data is the first essential step of algorithms designed for location-aware applications. Assuming that a location is significant if users spend a certain time around that area, most current algorithms compare spatial/temporal variables, such as stay duration and a roaming diameter, with given fixed thresholds to extract significant locations. However, the appropriate threshold values are not clearly known a priori, and algorithms with fixed thresholds are inherently error-prone, especially under high noise levels. Moreover, for N data points, they are generally O(N²) algorithms since distance computation is required. We developed a fast algorithm for selective data point sampling around significant locations based on density information by constructing random histograms using locality sensitive hashing. Evaluations show competitive performance in detecting significant locations even under high noise levels.
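The density idea can be illustrated with a single hashed 2-D histogram: one O(N) pass over the points replaces pairwise distance computation. A plain grid hash stands in here for the paper's locality-sensitive random histograms, and the cell size, density threshold, and synthetic track are all assumptions:

```python
import numpy as np

def significant_locations(points, cell=0.01, min_frac=0.05):
    """Return (center, count) for every grid cell holding at least
    min_frac of all points; hashing each point to a cell is O(N)."""
    keys = np.floor(points / cell).astype(int)
    counts = {}
    for k in map(tuple, keys):
        counts[k] = counts.get(k, 0) + 1
    thr = min_frac * len(points)
    return [(np.array(k) * cell + cell / 2, c)
            for k, c in counts.items() if c >= thr]

rng = np.random.default_rng(7)
home = rng.normal([35.68, 139.76], 0.001, size=(400, 2))       # dwell cluster
transit = np.column_stack([np.linspace(35.6, 35.8, 200),
                           np.linspace(139.7, 139.8, 200)])    # travel points
points = np.vstack([home, transit])
locs = significant_locations(points)   # only cells near the dwell cluster survive
```

Randomizing the grid offsets and averaging several such histograms would reduce the sensitivity to where cell boundaries fall, which is the role the locality-sensitive hashing plays in the paper.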
Li, Qingbo; Roxas, Bryan AP
2009-01-01
Background Many studies have provided algorithms or methods to assess statistical significance in quantitative proteomics when multiple replicates of a protein sample and an LC/MS analysis are available. However, confidence is still lacking in using datasets for a biological interpretation without protein sample replicates. Although a fold-change is a conventional threshold that can be used when there are no sample replicates, it does not provide an assessment of statistical significance such as a false discovery rate (FDR), which is an important indicator of the reliability with which differentially expressed proteins can be identified. In this work, we investigate whether differentially expressed proteins can be detected with statistical significance from a pair of unlabeled protein samples without replicates and with only duplicate LC/MS injections per sample. An FDR is used to gauge the statistical significance of the differentially expressed proteins. Results We experimented with several parameters to control the FDR, including a fold-change, a statistical test, and a minimum number of permuted significant pairings. Although none of these parameters alone gives satisfactory control of the FDR, we find that a combination of them provides a very effective means to control the FDR without compromising sensitivity. The results suggest that it is possible to perform a significance analysis without protein sample replicates; only duplicate LC/MS injections per sample are needed. We illustrate that differentially expressed proteins can be detected with an FDR between 0 and 15% at a positive rate of 4–16%. The method is evaluated for its sensitivity and specificity by a ROC analysis, and is further validated with a [15N]-labeled internal-standard protein sample and additional unlabeled protein sample replicates. Conclusion We demonstrate that statistical significance can be inferred without protein sample replicates in label-free quantitative proteomics. The
Robust Statistical Detection of Power-Law Cross-Correlation
NASA Astrophysics Data System (ADS)
Blythe, Duncan A. J.; Nikulin, Vadim V.; Müller, Klaus-Robert
2016-06-01
We show that widely used approaches in statistical physics incorrectly indicate the existence of power-law cross-correlations between financial stock market fluctuations measured over several years and the neuronal activity of the human brain lasting for only a few minutes. While such cross-correlations are nonsensical, no current methodology allows them to be reliably discarded, leaving researchers at greater risk when the spurious nature of cross-correlations is not clear from the unrelated origin of the time series and rather requires careful statistical estimation. Here we propose a theory and method (PLCC-test) which allows us to rigorously and robustly test for power-law cross-correlations, correctly detecting genuine and discarding spurious cross-correlations, thus establishing meaningful relationships between processes in complex physical systems. Our method reveals for the first time the presence of power-law cross-correlations between amplitudes of the alpha and beta frequency ranges of the human electroencephalogram.
Statistical detection of nanoparticles in cells by darkfield microscopy.
Gnerucci, Alessio; Romano, Giovanni; Ratto, Fulvio; Centi, Sonia; Baccini, Michela; Santosuosso, Ugo; Pini, Roberto; Fusi, Franco
2016-07-01
In the fields of nanomedicine, biophotonics and radiation therapy, nanoparticle (NP) detection in cell models often represents a fundamental step for many in vivo studies. One common question is whether NPs have or have not interacted with cells. In this context, we propose an imaging based technique to detect the presence of NPs in eukaryotic cells. Darkfield images of cell cultures at low magnification (10×) are acquired in different spectral ranges and recombined so as to enhance the contrast due to the presence of NPs. Image analysis is applied to extract cell-based parameters (i.e. mean intensity), which are further analyzed by statistical tests (Student's t-test, permutation test) in order to obtain a robust detection method. By means of a statistical sample size analysis, the sensitivity of the whole methodology is quantified in terms of the minimum cell number that is needed to identify the presence of NPs. The method is presented in the case of HeLa cells incubated with gold nanorods labeled with anti-CA125 antibodies, which exploits the overexpression of CA125 in ovarian cancers. Control cases are considered as well, including PEG-coated NPs and HeLa cells without NPs. PMID:27381231
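The permutation test on cell-based mean intensities can be sketched directly; the intensity values and group sizes below are synthetic stand-ins, not the paper's measurements:

```python
import numpy as np

def perm_mean_test(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of group means:
    labels are reshuffled to build the no-difference null distribution."""
    rng = np.random.default_rng(seed)
    obs = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(pooled[:len(a)].mean() - pooled[len(a):].mean()) >= obs:
            count += 1
    return (count + 1) / (n_perm + 1)

rng = np.random.default_rng(8)
control = rng.normal(100.0, 10.0, size=60)   # per-cell mean intensities, no NPs
treated = rng.normal(112.0, 10.0, size=60)   # NP scattering raises darkfield intensity
p = perm_mean_test(control, treated)
```

Repeating such a test over subsamples of increasing size is one way to estimate the minimum number of cells needed for reliable detection, in the spirit of the sample-size analysis described above.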
A new statistical approach to climate change detection and attribution
NASA Astrophysics Data System (ADS)
Ribes, Aurélien; Zwiers, Francis W.; Azaïs, Jean-Marc; Naveau, Philippe
2016-04-01
We propose here a new statistical approach to climate change detection and attribution that is based on additive decomposition and simple hypothesis testing. Most current statistical methods for detection and attribution rely on linear regression models where the observations are regressed onto expected response patterns to different external forcings. These methods do not use physical information provided by climate models regarding the expected response magnitudes to constrain the estimated responses to the forcings. Climate modelling uncertainty is difficult to take into account with regression based methods and is almost never treated explicitly. As an alternative to this approach, our statistical model is only based on the additivity assumption; the proposed method does not regress observations onto expected response patterns. We introduce estimation and testing procedures based on likelihood maximization, and show that climate modelling uncertainty can easily be accounted for. Some discussion is provided on how to practically estimate the climate modelling uncertainty based on an ensemble of opportunity. Our approach is based on the "models are statistically indistinguishable from the truth" paradigm, where the difference between any given model and the truth has the same distribution as the difference between any pair of models, but other choices might also be considered. The properties of this approach are illustrated and discussed based on synthetic data. Lastly, the method is applied to the linear trend in global mean temperature over the period 1951-2010. Consistent with the last IPCC assessment report, we find that most of the observed warming over this period (+0.65 K) is attributable to anthropogenic forcings (+0.67 ± 0.12 K, 90 % confidence range), with a very limited contribution from natural forcings (-0.01 ± 0.02 K).
Alves, Gelio; Wang, Guanghui; Ogurtsov, Aleksey Y; Drake, Steven K; Gucek, Marjan; Suffredini, Anthony F; Sacks, David B; Yu, Yi-Kuo
2016-02-01
Correct and rapid identification of microorganisms is the key to the success of many important applications in health and safety, including, but not limited to, infection treatment, food safety, and biodefense. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of microbes sequenced is challenging correct microbial identification because of the large number of choices present. To properly disentangle candidate microbes, one needs to go beyond apparent morphology or simple 'fingerprinting'; to correctly prioritize the candidate microbes, one needs to have accurate statistical significance in microbial identification. We meet these challenges by using peptidome profiles of microbes to better separate them and by designing an analysis method that yields accurate statistical significance. Here, we present an analysis pipeline that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using MS/MS data of 81 samples, each composed of a single known microorganism, that the proposed pipeline can correctly identify microorganisms at least at the genus and species levels. We have also shown that the proposed pipeline computes accurate statistical significances, i.e., E-values for identified peptides and unified E-values for identified microorganisms. The proposed analysis pipeline has been implemented in MiCId, a freely available software for Microorganism Classification and Identification. MiCId is available for download at http://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.
Statistical detection of the hidden distortions in diffusive spectra
NASA Astrophysics Data System (ADS)
Nigmatullin, R. R.; Toboev, V. A.; Smith, G.; Butler, P.
2003-04-01
The detection of an unknown substance in small concentration represents an important problem in spectroscopy. Usually this detection is based on the recognition of specific `labels' i.e. the visual appearance of new resonance lines that appear in the spectrograms analysed. But if the concentration of the unknown substance is small and visual indications (e.g. resonance peaks in diffusive spectra) are absent then the detection of the unknown substance constitutes a problem. We suggest a new methodology for the statistical detection of an unknown substance, based on the transformation of fluctuations obtained from initial spectrograms into ordered quantized histograms (QHs). The QHs obtained help to detect, statistically, the presence of unknown substances using the characteristics of conventional quantum spectra adopted from quantum mechanics. The averaging of the QHs helps to calculate the ordered `fluctuation fork' (FF), which provides a specific `noise ruler' for the detection and quantification of the trace substance. The quantitative parameters of the FF are the length (D) (defined as the maximal value of the FF along OX axis), maximum value of the width (W) (which coincides with the maximal value of standard deviation along OY axis) and the total area (A) of the FF occupied on OXY plane. The sensitivity of these parameters to the concentration of trace substance forms the basis on which it is possible to analyse the concentration of the unknown substance, despite the fact that visual indications in diffusive spectra are absent. This methodology can provide the foundations for a new fluctuations treatment spectroscopy, which can be effective in the analysis of fluctuations (noise) accompanying the basic spectrogram. Application of the new methodology to near infrared (NIR) spectra obtained for petrol of octane ratings 95 and 98, and their mixtures, confirms the effectiveness and sensitivity of new methodology. In another experiment on the detection of protein
Rudd, James; Moore, Jason H; Urbanowicz, Ryan J
2013-11-01
Permutation-based statistics for evaluating the significance of class prediction, predictive attributes, and patterns of association have only appeared within the learning classifier system (LCS) literature since 2012. While still not widely utilized by the LCS research community, formal evaluations of test statistic confidence are imperative for large and complex real-world applications such as genetic epidemiology, where it is standard practice to quantify the likelihood that a seemingly meaningful statistic could have been obtained purely by chance. LCS algorithms are relatively computationally expensive on their own. The compounding requirements for generating permutation-based statistics may be a limiting factor for some researchers interested in applying LCS algorithms to real-world problems. Technology has made LCS parallelization strategies more accessible and thus more popular in recent years. In the present study we examine the benefits of externally parallelizing a series of independent LCS runs such that permutation testing with cross validation becomes more feasible to complete on a single multi-core workstation. We test our Python implementation of this strategy in the context of a simulated complex genetic epidemiological data mining problem. Our evaluations indicate that as long as the number of concurrent processes does not exceed the number of CPU cores, the speedup achieved is approximately linear. PMID:24358057
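The structure of permutation testing with cross validation can be sketched as below. A nearest-centroid rule stands in for the LCS algorithm, and the loop over permuted-label runs is exactly the set of independent jobs one would farm out to separate processes (e.g., one `multiprocessing.Pool.map` task per core), which is the external parallelization the study evaluates; all data and counts are illustrative:

```python
import numpy as np

def cv_accuracy(X, y, folds=5):
    """Cross-validated accuracy of a stand-in classifier (nearest class
    centroid); the LCS algorithm itself is not reproduced here."""
    n = len(y)
    idx = np.arange(n)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]
        train = np.setdiff1d(idx, test)
        centroids = {c: X[train][y[train] == c].mean(axis=0)
                     for c in np.unique(y[train])}
        for i in test:
            pred = min(centroids, key=lambda c: np.linalg.norm(X[i] - centroids[c]))
            correct += pred == y[i]
    return correct / n

def permutation_pvalue(X, y, n_perm=100, seed=0):
    """Each permuted-label run below is independent, so this loop is the
    unit of work that external parallelization distributes across cores."""
    rng = np.random.default_rng(seed)
    obs = cv_accuracy(X, y)
    null = [cv_accuracy(X, rng.permutation(y)) for _ in range(n_perm)]
    return obs, (1 + sum(a >= obs for a in null)) / (n_perm + 1)

rng = np.random.default_rng(9)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 5)) + y[:, None] * 1.5   # class-separated features
obs_acc, p = permutation_pvalue(X, y)
```

Because the permuted runs share nothing but the data, the speedup stays close to linear until the process count exceeds the core count, matching the study's observation.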
Krumbholz, Aniko; Anielski, Patricia; Gfrerer, Lena; Graw, Matthias; Geyer, Hans; Schänzer, Wilhelm; Dvorak, Jiri; Thieme, Detlef
2014-01-01
Clenbuterol is a well-established β2-agonist, which is prohibited in sports and strictly regulated for use in the livestock industry. During the last few years, clenbuterol-positive results in doping controls and in samples from residents of or travellers from a high-risk country were suspected to be related to the illegal use of clenbuterol for fattening. A sensitive liquid chromatography-tandem mass spectrometry (LC-MS/MS) method was developed to detect low clenbuterol residues in hair with a detection limit of 0.02 pg/mg. A sub-therapeutic application study and a field study with volunteers, who have a high risk of contamination, were performed. For the application study, a total dosage of 30 µg clenbuterol was applied to 20 healthy volunteers on 5 subsequent days. One month after the beginning of the application, clenbuterol was detected in the proximal hair segment (0-1 cm) in concentrations between 0.43 and 4.76 pg/mg. For the second part, samples of 66 Mexican soccer players were analyzed. In 89% of these volunteers, clenbuterol was detectable in their hair at concentrations between 0.02 and 1.90 pg/mg. A comparison of both parts showed no statistical difference between sub-therapeutic application and contamination. In contrast, discrimination from a typical abuse of clenbuterol is apparently possible. Based on these findings, results of real doping control samples can be evaluated. PMID:25388545
Statistics over features for internal carotid arterial disorders detection.
Ubeyli, Elif Derya
2008-03-01
The objective of the present study is to extract representative features of internal carotid arterial (ICA) Doppler ultrasound signals and to present an accurate classification model. This paper presents the use of statistics over a set of extracted features (Lyapunov exponents and the power levels of the power spectral density estimates obtained by the eigenvector methods) in order to reduce the dimensionality of the extracted feature vectors. Since classification is more accurate when the pattern is simplified through representation by important features, feature extraction and selection play an important role in classifying systems such as neural networks. Mixture of experts (ME) and modified mixture of experts (MME) architectures were formulated and used as the basis for detection of arterial disorders. Three types of ICA Doppler signals (Doppler signals recorded from healthy subjects, subjects having stenosis, and subjects having occlusion) were classified. The classification results confirmed that the proposed ME and MME have potential in detecting arterial disorders. PMID:18179791
Statistical damage detection method for frame structures using a confidence interval
NASA Astrophysics Data System (ADS)
Li, Weiming; Zhu, Hongping; Luo, Hanbin; Xia, Yong
2010-03-01
A novel damage detection method is applied to a 3-story frame structure to obtain statistical quantification criteria for the existence, location, and identification of damage. The mean, standard deviation, and exponentially weighted moving average (EWMA) are applied to detect damage information according to statistical process control (SPC) theory. It is concluded that detection is insignificant with the mean and EWMA because the structural response is neither independent nor normally distributed. On the other hand, the damage information is detected well with the standard deviation because the influence of the data distribution is not pronounced with this parameter. A suitable moderate confidence level is explored for more significant damage location and quantification detection, and the impact of noise is investigated to illustrate the robustness of the method.
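The SPC logic described above — estimating control limits from an undamaged baseline and flagging excursions of a variance-sensitive feature — can be sketched as follows. The data, window sizes, and 3-sigma limits here are illustrative assumptions, not the paper's frame-structure setup.

```python
import numpy as np

def std_control_chart(feature, n_baseline, k=3.0):
    """Flag samples whose feature value falls outside k-sigma control
    limits estimated from an undamaged baseline period (SPC logic)."""
    baseline = feature[:n_baseline]
    center, sigma = baseline.mean(), baseline.std(ddof=1)
    return np.abs(feature - center) > k * sigma

# Synthetic demo: the per-window standard deviation of a response signal,
# with "damage" that inflates the variance in the second half.
rng = np.random.default_rng(0)
healthy = rng.normal(0.0, 1.0, size=(50, 200))   # 50 healthy windows
damaged = rng.normal(0.0, 1.6, size=(50, 200))   # 50 damaged windows
feature = np.vstack([healthy, damaged]).std(axis=1, ddof=1)

flags = std_control_chart(feature, n_baseline=50)
print(flags[:50].sum(), flags[50:].sum())
```

Because the per-window standard deviation is far less sensitive to the response's non-normality than the raw mean, it serves as the charted feature here, mirroring the abstract's conclusion.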
Martinez-Val, Ana; Garcia, Fernando; Ximénez-Embún, Pilar; Ibarz, Nuria; Zarzuela, Eduardo; Ruppen, Isabel; Mohammed, Shabaz; Munoz, Javier
2016-09-01
Isobaric labeling is gaining popularity in proteomics due to its multiplexing capacity. However, co-isolation and co-fragmentation of peptides introduce a bias that undermines its accuracy. Several strategies have been shown to partially and, in some cases, completely solve this issue. However, it is still not clear how ratio compression affects the ability to identify a protein's change of abundance as statistically significant. Here, by using the "two proteomes" approach (E. coli lysates with fixed 2.5 ratios in the presence or absence of human lysates acting as the background interference) and manipulating isolation width values, we were able to model isobaric data with different levels of accuracy and precision in three types of mass spectrometers: LTQ Orbitrap Velos, Impact, and Q Exactive. We determined the influence of these variables on the statistical significance of the distorted ratios and compared them to the ratios measured without impurities. Our results confirm previous findings regarding the importance of optimizing acquisition parameters in each instrument in order to minimize interference without compromising precision and identification. We also show that, under these experimental conditions, the inclusion of a second replicate increases statistical sensitivity 2-3-fold and counterbalances to a large extent the issue of ratio compression.
Fisher, Aaron; Anderson, G Brooke; Peng, Roger; Leek, Jeff
2014-01-01
Scatterplots are the most common way for statisticians, scientists, and the public to visually detect relationships between measured variables. At the same time, and despite widely publicized controversy, P-values remain the most commonly used measure to statistically justify relationships identified between variables. Here we measure the ability to detect statistically significant relationships from scatterplots in a randomized trial of 2,039 students in a statistics massive open online course (MOOC). Each subject was shown a random set of scatterplots and asked to visually determine if the underlying relationships were statistically significant at the P < 0.05 level. Subjects correctly classified only 47.4% (95% CI [45.1%-49.7%]) of statistically significant relationships, and 74.6% (95% CI [72.5%-76.6%]) of non-significant relationships. Adding visual aids such as a best fit line or scatterplot smooth increased the probability a relationship was called significant, regardless of whether the relationship was actually significant. Classification of statistically significant relationships improved on repeat attempts of the survey, although classification of non-significant relationships did not. Our results suggest: (1) that evidence-based data analysis can be used to identify weaknesses in theoretical procedures in the hands of average users, (2) data analysts can be trained to improve detection of statistically significant results with practice, but (3) data analysts have incorrect intuition about what statistically significant relationships look like, particularly for small effects. We have built a web tool for people to compare scatterplots with their corresponding p-values which is available here: http://glimmer.rstudio.com/afisher/EDA/. PMID:25337457
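The judgment asked of the students — is the relationship in this scatterplot significant at P < 0.05? — has a simple computational counterpart. The sketch below uses a Monte-Carlo permutation test on the Pearson correlation with made-up data; it is not drawn from the MOOC's materials.

```python
import numpy as np

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Monte-Carlo permutation p-value for the Pearson correlation,
    the quantity a viewer implicitly judges in a scatterplot."""
    rng = np.random.default_rng(seed)
    r_obs = abs(np.corrcoef(x, y)[0, 1])
    r_null = np.array([abs(np.corrcoef(x, rng.permutation(y))[0, 1])
                       for _ in range(n_perm)])
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (1 + (r_null >= r_obs).sum()) / (1 + n_perm)

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y_weak = 0.5 * x + rng.normal(size=200)   # a real but modest relationship
y_none = rng.normal(size=200)             # no relationship

p_weak = permutation_pvalue(x, y_weak)
p_none = permutation_pvalue(x, y_none)
print(p_weak, p_none)
```

The study's finding that only 47.4% of truly significant relationships were spotted visually underlines why such an explicit test, rather than eyeballing, is the reliable arbiter for small effects.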
Auditory cortical detection and discrimination correlates with communicative significance.
Liu, Robert C; Schreiner, Christoph E
2007-07-01
Plasticity studies suggest that behavioral relevance can change the cortical processing of trained or conditioned sensory stimuli. However, whether this occurs in the context of natural communication, where stimulus significance is acquired through social interaction, has not been well investigated, perhaps because neural responses to species-specific vocalizations can be difficult to interpret within a systematic framework. The ultrasonic communication system between isolated mouse pups and adult females that either do or do not recognize the calls' significance provides an opportunity to explore this issue. We applied an information-based analysis to multi- and single unit data collected from anesthetized mothers and pup-naïve females to quantify how the communicative significance of pup calls affects their encoding in the auditory cortex. The timing and magnitude of information that cortical responses convey (at a 2-ms resolution) for pup call detection and discrimination was significantly improved in mothers compared to naïve females, most likely because of changes in call frequency encoding. This was not the case for a non-natural sound ensemble outside the mouse vocalization repertoire. The results demonstrate that a sensory cortical change in the timing code for communication sounds is correlated with the vocalizations' behavioral relevance, potentially enhancing functional processing by improving its signal to noise ratio. PMID:17564499
Jefferson, L; Cooper, E; Hewitt, C; Torgerson, T; Cook, L; Tharmanathan, P; Cockayne, S; Torgerson, D
2016-01-01
Objective Time-lag from study completion to publication is a potential source of publication bias in randomised controlled trials. This study sought to update the evidence base by identifying the effect of the statistical significance of research findings on time to publication of trial results. Design Literature searches were carried out in four general medical journals from June 2013 to June 2014 inclusive (BMJ, JAMA, the Lancet and the New England Journal of Medicine). Setting Methodological review of four general medical journals. Participants Original research articles presenting the primary analyses from phase 2, 3 and 4 parallel-group randomised controlled trials were included. Main outcome measures Time from trial completion to publication. Results The median time from trial completion to publication was 431 days (n = 208, interquartile range 278–618). A multivariable adjusted Cox model found no statistically significant difference in time to publication for trials reporting positive or negative results (hazard ratio: 0.86, 95% CI 0.64 to 1.16, p = 0.32). Conclusion In contrast to previous studies, this review did not demonstrate the presence of time-lag bias in time to publication. This may be a result of these articles being published in four high-impact general medical journals that may be more inclined to publish rapidly, whatever the findings. Further research is needed to explore the presence of time-lag bias in lower quality studies and lower impact journals. PMID:27757242
Robust Statistical Detection of Power-Law Cross-Correlation
Blythe, Duncan A. J.; Nikulin, Vadim V.; Müller, Klaus-Robert
2016-01-01
We show that widely used approaches in statistical physics incorrectly indicate the existence of power-law cross-correlations between financial stock market fluctuations measured over several years and the neuronal activity of the human brain lasting for only a few minutes. While such cross-correlations are nonsensical, no current methodology allows them to be reliably discarded, leaving researchers at greater risk when the spurious nature of cross-correlations is not clear from the unrelated origin of the time series and rather requires careful statistical estimation. Here we propose a theory and method (PLCC-test) which allows us to rigorously and robustly test for power-law cross-correlations, correctly detecting genuine and discarding spurious cross-correlations, thus establishing meaningful relationships between processes in complex physical systems. Our method reveals for the first time the presence of power-law cross-correlations between amplitudes of the alpha and beta frequency ranges of the human electroencephalogram. PMID:27250630
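A minimal illustration of the problem motivating the PLCC-test: correlating two nonstationary series that have nothing to do with each other can yield a large coefficient purely by chance. The toy random walks below are illustrative and this is not the PLCC-test itself.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
walk_a = np.cumsum(rng.normal(size=n))   # e.g. a multi-year price index
walk_b = np.cumsum(rng.normal(size=n))   # e.g. an unrelated recording

# Correlating the raw, nonstationary series can produce a large,
# apparently meaningful coefficient purely by chance...
r_raw = np.corrcoef(walk_a, walk_b)[0, 1]
# ...while the increments, which carry the actual dynamics, show none.
r_inc = np.corrcoef(np.diff(walk_a), np.diff(walk_b))[0, 1]
print(abs(r_raw), abs(r_inc))
```

Differencing is only the crudest safeguard; the abstract's point is that a rigorous test is needed when the spurious nature of a cross-correlation is not obvious from the series' unrelated origins.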
Statistical methods for the detection and analysis of radioactive sources
NASA Astrophysics Data System (ADS)
Klumpp, John
We consider four topics in the statistical analysis of radioactive sources in the present study: Bayesian methods for the analysis of count rate data, analysis of energy data, a model for non-constant background count rate distributions, and a zero-inflated model of the sample count rate. The study begins with a review of Bayesian statistics and techniques for analyzing count rate data. Next, we consider a novel system for incorporating energy information into count rate measurements which searches for elevated count rates in multiple energy regions simultaneously. The system analyzes time-interval data in real time to sequentially update a probability distribution for the sample count rate. We then consider a "moving target" model of background radiation in which the instantaneous background count rate is a function of time, rather than being fixed. Unlike the sequential update system, this model assumes a large body of pre-existing data which can be analyzed retrospectively. Finally, we propose a novel Bayesian technique which allows for simultaneous source detection and count rate analysis. This technique is fully compatible with, but independent of, the sequential update system and moving target model.
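In its simplest conjugate form, the sequential Bayesian update of a sample count rate is a Gamma-Poisson update: a Gamma prior on the rate, updated by observed counts over a known live time. The prior and measurement values below are hypothetical, and this sketch omits the energy-region and time-interval machinery the study describes.

```python
import numpy as np
from scipy import stats

def update_rate_posterior(alpha, beta, counts, live_time):
    """Conjugate Bayesian update for a Poisson count rate:
    Gamma(alpha, beta) prior -> Gamma(alpha + counts, beta + live_time)."""
    return alpha + counts, beta + live_time

# Hypothetical survey measurement: vague Gamma(1, 0.1) prior,
# then 240 counts observed in 60 s of live time.
alpha, beta = update_rate_posterior(1.0, 0.1, counts=240, live_time=60.0)
posterior = stats.gamma(a=alpha, scale=1.0 / beta)

mean_cps = posterior.mean()       # posterior mean rate, ~4 counts/s
p_above_bkg = posterior.sf(3.0)   # P(rate > an assumed 3 cps background)
print(mean_cps, p_above_bkg)
```

Because the update is just two additions, it can run in real time as counts arrive, which is the appeal of the sequential formulation.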
Performance optimization for pedestrian detection on degraded video using natural scene statistics
NASA Astrophysics Data System (ADS)
Winterlich, Anthony; Denny, Patrick; Kilmartin, Liam; Glavin, Martin; Jones, Edward
2014-11-01
We evaluate the effects of transmission artifacts such as JPEG compression and additive white Gaussian noise on the performance of a state-of-the-art pedestrian detection algorithm, which is based on integral channel features. Integral channel features combine the diversity of information obtained from multiple image channels with the computational efficiency of the Viola and Jones detection framework. We utilize "quality aware" spatial image statistics to blindly categorize distorted video frames by distortion type and level without the use of an explicit reference. We combine quality statistics with a multiclassifier detection framework for optimal pedestrian detection performance across varying image quality. Our detection method provides statistically significant improvements over current approaches based on single classifiers, on two large pedestrian databases containing a wide variety of artificially added distortion. The improvement in detection performance is further demonstrated on real video data captured from multiple cameras containing varying levels of sensor noise and compression. The results of our research have the potential to be used in real-time in-vehicle networks to improve pedestrian detection performance across a wide range of image and video quality.
Algorithms for Detecting Significantly Mutated Pathways in Cancer
NASA Astrophysics Data System (ADS)
Vandin, Fabio; Upfal, Eli; Raphael, Benjamin J.
Recent genome sequencing studies have shown that the somatic mutations that drive cancer development are distributed across a large number of genes. This mutational heterogeneity complicates efforts to distinguish functional mutations from sporadic, passenger mutations. Since cancer mutations are hypothesized to target a relatively small number of cellular signaling and regulatory pathways, a common approach is to assess whether known pathways are enriched for mutated genes. However, restricting attention to known pathways will not reveal novel cancer genes or pathways. An alternative strategy is to examine mutated genes in the context of genome-scale interaction networks that include both well characterized pathways and additional gene interactions measured through various approaches. We introduce a computational framework for de novo identification of subnetworks in a large gene interaction network that are mutated in a significant number of patients. This framework includes two major features. First, we introduce a diffusion process on the interaction network to define a local neighborhood of "influence" for each mutated gene in the network. Second, we derive a two-stage multiple hypothesis test to bound the false discovery rate (FDR) associated with the identified subnetworks. We test these algorithms on a large human protein-protein interaction network using mutation data from two recent studies: glioblastoma samples from The Cancer Genome Atlas and lung adenocarcinoma samples from the Tumor Sequencing Project. We successfully recover pathways that are known to be important in these cancers, such as the p53 pathway. We also identify additional pathways, such as the Notch signaling pathway, that have been implicated in other cancers but not previously reported as mutated in these samples. Our approach is the first, to our knowledge, to demonstrate a computationally efficient strategy for de novo identification of statistically significant mutated subnetworks.
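The diffusion step can be illustrated with an insulated heat-diffusion kernel on a toy network. The kernel form and restart parameter beta below are a common choice in this family of methods (HotNet-style influence matrices), not necessarily the exact process used by the authors.

```python
import numpy as np

def influence_matrix(adj, beta=0.3):
    """Insulated heat-diffusion influence on a network:
    F = beta * (I - (1 - beta) * W)^-1, with W column-normalized.
    F[i, j] is the steady-state influence of gene j on gene i."""
    W = adj / adj.sum(axis=0, keepdims=True)
    n = adj.shape[0]
    return beta * np.linalg.inv(np.eye(n) - (1 - beta) * W)

# Toy 5-gene path network 0-1-2-3-4.
adj = np.zeros((5, 5))
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1.0
F = influence_matrix(adj)

# Influence of gene 0 decays with network distance, defining its
# local "neighborhood of influence".
print(F[1, 0], F[2, 0], F[3, 0], F[4, 0])
```

Thresholding such an influence matrix yields the local neighborhoods from which candidate mutated subnetworks are assembled before significance testing.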
McKay, J Lucas; Welch, Torrence D J; Vidakovic, Brani; Ting, Lena H
2013-01-01
We developed wavelet-based functional ANOVA (wfANOVA) as a novel approach for comparing neurophysiological signals that are functions of time. Temporal resolution is often sacrificed by analyzing such data in large time bins, increasing statistical power by reducing the number of comparisons. We performed ANOVA in the wavelet domain because differences between curves tend to be represented by a few temporally localized wavelets, which we transformed back to the time domain for visualization. We compared wfANOVA and ANOVA performed in the time domain (tANOVA) on both experimental electromyographic (EMG) signals from responses to perturbation during standing balance across changes in peak perturbation acceleration (3 levels) and velocity (4 levels) and on simulated data with known contrasts. In experimental EMG data, wfANOVA revealed the continuous shape and magnitude of significant differences over time without a priori selection of time bins. However, tANOVA revealed only the largest differences at discontinuous time points, resulting in features with later onsets and shorter durations than those identified using wfANOVA (P < 0.02). Furthermore, wfANOVA required significantly fewer (~1/4×; P < 0.015) significant F tests than tANOVA, resulting in post hoc tests with increased power. In simulated EMG data, wfANOVA identified known contrast curves with a high level of precision (r² = 0.94 ± 0.08) and performed better than tANOVA across noise levels (P < 0.01). Therefore, wfANOVA may be useful for revealing differences in the shape and magnitude of neurophysiological signals (e.g., EMG, firing rates) across multiple conditions with both high temporal resolution and high statistical power. PMID:23100136
Statistical method for detecting structural change in the growth process.
Ninomiya, Yoshiyuki; Yoshimoto, Atsushi
2008-03-01
Due to competition among individual trees and other exogenous factors that change the growth environment, each tree grows following its own growth trend with some structural changes in growth over time. In the present article, a new method is proposed to detect a structural change in the growth process. We formulate the method as a simple statistical test for signal detection without constructing any specific model for the structural change. To evaluate the p-value of the test, the tube method is developed because the regular distribution theory is insufficient. Using two sets of tree diameter growth data sampled from planted forest stands of Cryptomeria japonica in Japan, we conduct an analysis of identifying the effect of thinning on the growth process as a structural change. Our results demonstrate that the proposed method is useful to identify the structural change caused by thinning. We also provide the properties of the method in terms of the size and power of the test. PMID:17608782
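For comparison, a generic change-point test can be sketched with a max-CUSUM statistic calibrated by a Monte-Carlo permutation p-value. This deliberately replaces the authors' tube-method p-value with a simpler permutation calibration, and the growth-increment data are synthetic.

```python
import numpy as np

def cusum_changepoint(x, n_perm=1000, seed=0):
    """Locate a mean shift via the max absolute CUSUM of centered values;
    calibrate a p-value by permuting the sequence (a simple stand-in for
    the paper's tube-method calibration)."""
    rng = np.random.default_rng(seed)
    def stat(v):
        return np.abs(np.cumsum(v - v.mean())).max()
    s_obs = stat(x)
    s_null = np.array([stat(rng.permutation(x)) for _ in range(n_perm)])
    k_hat = int(np.abs(np.cumsum(x - x.mean())).argmax()) + 1
    p = (1 + (s_null >= s_obs).sum()) / (1 + n_perm)
    return k_hat, p

# Synthetic annual diameter increments with a thinning-like growth
# boost beginning at year 30.
rng = np.random.default_rng(3)
incr = np.concatenate([rng.normal(1.0, 0.2, 30), rng.normal(1.6, 0.2, 30)])
k_hat, p = cusum_changepoint(incr)
print(k_hat, p)
```

A permutation calibration is valid here only under exchangeability of the increments; the tube method in the paper is designed precisely for settings where such simple null distributions are insufficient.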
2014-01-01
Background Most work on the topic of activity landscapes has focused on their quantitative description and visual representation, with the aim of aiding navigation of SAR. Recent developments have addressed applications such as quantifying the proportion of activity cliffs, investigating the predictive abilities of activity landscape methods and so on. However, all these publications have worked under the assumption that the activity landscape models are “real” (i.e., statistically significant). Results The current study addresses for the first time, in a quantitative manner, the significance of a landscape or individual cliffs in the landscape. In particular, we question whether the activity landscape derived from observed (experimental) activity data is different from a randomly generated landscape. To address this we used the SALI measure with six different data sets tested against one or more molecular targets. We also assessed the significance of the landscapes for single and multiple representations. Conclusions We find that non-random landscapes are data set and molecular representation dependent. For the data sets and representations used in this work, our results suggest that not all representations lead to non-random landscapes. This indicates that not all molecular representations should be used to a) interpret the SAR and b) combined to generate consensus models. Our results suggest that significance testing of activity landscape models and in particular, activity cliffs, is key, prior to the use of such models. PMID:24694189
NASA Astrophysics Data System (ADS)
Santer, B. D.; Wigley, T. M. L.; Boyle, J. S.; Gaffen, D. J.; Hnilo, J. J.; Nychka, D.; Parker, D. E.; Taylor, K. E.
2000-03-01
This paper examines trend uncertainties in layer-average free atmosphere temperatures arising from the use of different trend estimation methods. It also considers statistical issues that arise in assessing the significance of individual trends and of trend differences between data sets. Possible causes of these trends are not addressed. We use data from satellite and radiosonde measurements and from two reanalysis projects. To facilitate intercomparison, we compute from reanalyses and radiosonde data temperatures equivalent to those from the satellite-based Microwave Sounding Unit (MSU). We compare linear trends based on minimization of absolute deviations (LA) and minimization of squared deviations (LS). Differences are generally less than 0.05°C/decade over 1959-1996. Over 1979-1993, they exceed 0.10°C/decade for lower tropospheric time series and 0.15°C/decade for the lower stratosphere. Trend fitting by the LA method can degrade the lower-tropospheric trend agreement of 0.03°C/decade (over 1979-1996) previously reported for the MSU and radiosonde data. In assessing trend significance we employ two methods to account for temporal autocorrelation effects. With our preferred method, virtually none of the individual 1979-1993 trends in deep-layer temperatures are significantly different from zero. To examine trend differences between data sets we compute 95% confidence intervals for individual trends and show that these overlap for almost all data sets considered. Confidence intervals for lower-tropospheric trends encompass both zero and the model-projected trends due to anthropogenic effects. We also test the significance of a trend in d(t), the time series of differences between a pair of data sets. Use of d(t) removes variability common to both time series and facilitates identification of small trend differences. This more discerning test reveals that roughly 30% of the data set comparisons have significant differences in lower-tropospheric trends
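The temporal autocorrelation adjustment commonly used in such trend tests reduces the sample size by the lag-1 autocorrelation of the regression residuals before computing the trend's standard error. The sketch below applies a generic version of this adjustment to a synthetic monthly series; it is not a reproduction of the paper's analysis or data.

```python
import numpy as np

def trend_significance(y, dt=1.0):
    """OLS trend with an AR(1)-adjusted standard error: the sample size
    is deflated to n_eff = n * (1 - r1) / (1 + r1), where r1 is the
    lag-1 autocorrelation of the regression residuals."""
    n = len(y)
    t = np.arange(n) * dt
    b, a = np.polyfit(t, y, 1)            # slope, intercept
    resid = y - (a + b * t)
    r1 = np.corrcoef(resid[:-1], resid[1:])[0, 1]
    n_eff = n * (1 - r1) / (1 + r1)
    se2 = (resid @ resid) / (n_eff - 2) / ((t - t.mean()) ** 2).sum()
    return b, np.sqrt(se2)

# Synthetic monthly anomalies: AR(1) noise plus a weak trend, 15 years.
rng = np.random.default_rng(4)
n = 180
noise = np.zeros(n)
for i in range(1, n):
    noise[i] = 0.6 * noise[i - 1] + rng.normal(0, 0.1)
y = 0.002 * np.arange(n) + noise
b, se = trend_significance(y)
print(b, se)
```

Ignoring the autocorrelation (using n instead of n_eff) understates the standard error and overstates trend significance, which is why almost none of the short 1979-1993 trends survive the adjusted test.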
Mass detection on real and synthetic mammograms: human observer templates and local statistics
NASA Astrophysics Data System (ADS)
Castella, Cyril; Kinkel, Karen; Verdun, Francis R.; Eckstein, Miguel P.; Abbey, Craig K.; Bochud, François O.
2007-03-01
In this study we estimated human observer templates associated with the detection of a realistic mass signal superimposed on real and simulated but realistic synthetic mammographic backgrounds. Five trained naïve observers participated in two-alternative forced-choice (2-AFC) experiments in which they were asked to detect a spherical mass signal extracted from a mammographic phantom. This signal was superimposed on statistically stationary clustered lumpy backgrounds (CLB) in one instance, and on nonstationary real mammographic backgrounds in another. Human observer linear templates were estimated using a genetic algorithm. An additional 2-AFC experiment was conducted with twin noise in order to determine which local statistical properties of the real backgrounds influenced the ability of the human observers to detect the signal. Results show that the estimated linear templates are not significantly different for stationary and nonstationary backgrounds. The estimated performance of the linear template compared with the human observer is within 5% in terms of percent correct (Pc) for the 2-AFC task. Detection efficiency is significantly higher on nonstationary real backgrounds than on globally stationary synthetic CLB. Using the twin-noise experiment and a new method to relate image features to observers' trial-to-trial decisions, we found that the local statistical properties that hindered or facilitated detection were the standard deviation and three features derived from the neighborhood gray-tone difference matrix: coarseness, contrast and strength. These statistical features showed a dependency with human performance only when they are estimated within a sufficiently small area around the searched location. These findings emphasize that nonstationary backgrounds need to be described by their local statistics and not by global ones like the noise Wiener spectrum.
Statistics, Probability, Significance, Likelihood: Words Mean What We Define Them to Mean
ERIC Educational Resources Information Center
Drummond, Gordon B.; Tom, Brian D. M.
2011-01-01
Statisticians use words deliberately and specifically, but not necessarily in the way they are used colloquially. For example, in general parlance "statistics" can mean numerical information, usually data. In contrast, one large statistics textbook defines the term "statistic" to denote "a characteristic of a "sample", such as the average score",…
NASA Astrophysics Data System (ADS)
Kellerer-Pirklbauer, Andreas
2016-04-01
Longer data series (e.g. >10 a) of ground temperatures in alpine regions are helpful to improve the understanding regarding the effects of present climate change on distribution and thermal characteristics of seasonal frost- and permafrost-affected areas. Beginning in 2004 - and more intensively since 2006 - a permafrost and seasonal frost monitoring network was established in Central and Eastern Austria by the University of Graz. This network consists of c. 60 ground temperature (surface and near-surface) monitoring sites which are located at 1922-3002 m a.s.l., at latitude 46°55'-47°22'N and at longitude 12°44'-14°41'E. These data allow conclusions about general ground thermal conditions, potential permafrost occurrence, trend during the observation period, and regional pattern of changes. Calculations and analyses of several different temperature-related parameters were accomplished. At an annual scale, region-wide statistically significant warming during the observation period was revealed by e.g. an increase in mean annual temperature values (mean, maximum) or the significant lowering of the surface frost number (F+). At a seasonal scale no significant trend of any temperature-related parameter was in most cases revealed for spring (MAM) and autumn (SON). Winter (DJF) shows only a weak warming. In contrast, the summer (JJA) season reveals in general a significant warming as confirmed by several different temperature-related parameters such as e.g. mean seasonal temperature, number of thawing degree days, number of freezing degree days, or days without night frost. On a monthly basis August shows the statistically most robust and strongest warming of all months, although regional differences occur. Despite the fact that the general ground temperature warming during the last decade is confirmed by the field data in the study region, complications in trend analyses arise by temperature anomalies (e.g. warm winter 2006/07) or substantial variations in the winter
Statistical Analysis of Data with Non-Detectable Values
Frome, E.L.
2004-08-26
Environmental exposure measurements are, in general, positive and may be subject to left censoring, i.e. the measured value is less than a "limit of detection". In occupational monitoring, strategies for assessing workplace exposures typically focus on the mean exposure level or the probability that any measurement exceeds a limit. A basic problem of interest in environmental risk assessment is to determine if the mean concentration of an analyte is less than a prescribed action level. Parametric methods, used to determine acceptable levels of exposure, are often based on a two parameter lognormal distribution. The mean exposure level and/or an upper percentile (e.g. the 95th percentile) are used to characterize exposure levels, and upper confidence limits are needed to describe the uncertainty in these estimates. In certain situations it is of interest to estimate the probability of observing a future (or "missed") value of a lognormal variable. Statistical methods for random samples (without non-detects) from the lognormal distribution are well known for each of these situations. In this report, methods for estimating these quantities based on the maximum likelihood method for randomly left censored lognormal data are described and graphical methods are used to evaluate the lognormal assumption. If the lognormal model is in doubt and an alternative distribution for the exposure profile of a similar exposure group is not available, then nonparametric methods for left censored data are used. The mean exposure level, along with the upper confidence limit, is obtained using the product limit estimate, and the upper confidence limit on the 95th percentile (i.e. the upper tolerance limit) is obtained using a nonparametric approach. All of these methods are well known but computational complexity has limited their use in routine data analysis with left censored data. The recent development of the R environment for statistical data analysis and graphics has greatly
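The maximum likelihood fit for left-censored lognormal data described above can be sketched directly: detected values contribute density terms, and each non-detect contributes the probability mass below the detection limit. The data are synthetic and this is a minimal version of the approach, not the report's full analysis.

```python
import numpy as np
from scipy import stats, optimize

def lognormal_mle_censored(detects, n_nondetect, lod):
    """ML fit of a lognormal to left-censored data. The constant
    Jacobian term log(x) of the lognormal density is omitted since it
    does not depend on the parameters."""
    def negloglik(theta):
        mu, log_sigma = theta
        sigma = np.exp(log_sigma)  # optimize log(sigma) to keep sigma > 0
        ll = stats.norm.logpdf(np.log(detects), mu, sigma).sum()
        ll += n_nondetect * stats.norm.logcdf((np.log(lod) - mu) / sigma)
        return -ll
    res = optimize.minimize(negloglik, x0=[0.0, 0.0], method="Nelder-Mead")
    mu, sigma = res.x[0], np.exp(res.x[1])
    return mu, sigma, np.exp(mu + 0.5 * sigma**2)  # mean exposure level

# Synthetic exposures: lognormal(mu=0.5, sigma=0.8) with the low end
# censored at a detection limit of 1.0.
rng = np.random.default_rng(5)
x = rng.lognormal(0.5, 0.8, size=500)
lod = 1.0
mu, sigma, mean_exposure = lognormal_mle_censored(
    x[x >= lod], int((x < lod).sum()), lod)
print(mu, sigma, mean_exposure)
```

Substituting ad hoc values such as LOD/2 for the non-detects biases both the mean and the upper percentiles; the censored likelihood uses exactly the information the non-detects carry.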
The Detection and Statistics of Giant Arcs behind CLASH Clusters
NASA Astrophysics Data System (ADS)
Xu, Bingxiao; Postman, Marc; Meneghetti, Massimo; Seitz, Stella; Zitrin, Adi; Merten, Julian; Maoz, Dani; Frye, Brenda; Umetsu, Keiichi; Zheng, Wei; Bradley, Larry; Vega, Jesus; Koekemoer, Anton
2016-02-01
We developed an algorithm to find and characterize gravitationally lensed galaxies (arcs) to perform a comparison of the observed and simulated arc abundance. Observations are from the Cluster Lensing And Supernova survey with Hubble (CLASH). Simulated CLASH images are created using the MOKA package and also clusters selected from the high-resolution, hydrodynamical simulations, MUSIC, over the same mass and redshift range as the CLASH sample. The algorithm's arc elongation accuracy, completeness, and false positive rate are determined and used to compute an estimate of the true arc abundance. We derive a lensing efficiency of 4 ± 1 arcs (with length ≥6″ and length-to-width ratio ≥7) per cluster for the X-ray-selected CLASH sample, 4 ± 1 arcs per cluster for the MOKA-simulated sample, and 3 ± 1 arcs per cluster for the MUSIC-simulated sample. The observed and simulated arc statistics are in full agreement. We measure the photometric redshifts of all detected arcs and find a median redshift zs = 1.9 with 33% of the detected arcs having zs > 3. We find that the arc abundance does not depend strongly on the source redshift distribution but is sensitive to the mass distribution of the dark matter halos (e.g., the c-M relation). Our results show that consistency between the observed and simulated distributions of lensed arc sizes and axial ratios can be achieved by using cluster-lensing simulations that are carefully matched to the selection criteria used in the observations.
WISCOD: a statistical web-enabled tool for the identification of significant protein coding regions.
Vilardell, Mireia; Parra, Genis; Civit, Sergi
2014-01-01
Classically, gene prediction programs are based on detecting signals such as boundary sites (splice sites, starts, and stops) and coding regions in the DNA sequence in order to build potential exons and join them into a gene structure. Although nowadays it is possible to improve their performance with additional information from related species and/or cDNA databases, further improvement at any step could help to obtain better predictions. Here, we present WISCOD, a web-enabled tool for the identification of significant protein coding regions, a novel software tool that tackles the exon prediction problem in eukaryotic genomes. WISCOD has the capacity to detect real exons from large lists of potential exons, and it provides an easy-to-use global P value, called the expected probability of being a false exon (EPFE), that is useful for ranking potential exons in a probabilistic framework, without additional computational costs. The advantage of our approach is that it significantly increases the specificity and sensitivity (both between 80% and 90%) in comparison to other ab initio methods (where they are in the range of 70-75%). WISCOD is written in Java and R and is available to download and run in local mode on Linux and Windows platforms. PMID:25313355
Carr, J.R.; Roberts, K.P.
1989-02-01
Universal kriging is compared with ordinary kriging for estimation of earthquake ground motion. Ordinary kriging is based on a stationary random function model; universal kriging is based on a nonstationary random function model representing first-order drift. The accuracy of universal kriging is compared with that of ordinary kriging, using cross-validation as the basis for comparison. Hypothesis testing on these results shows that the accuracy obtained using universal kriging is not significantly different from that obtained using ordinary kriging. Tests based on normal distribution assumptions are applied to errors measured in the cross-validation procedure; t and F tests reveal no evidence to suggest that universal and ordinary kriging differ for estimation of earthquake ground motion. Nonparametric hypothesis tests applied to these errors and jackknife statistics yield the same conclusion: universal and ordinary kriging are not significantly different for this application as determined by a cross-validation procedure. These results are based on application to four independent data sets (four different seismic events).
Statistical language analysis for automatic exfiltration event detection.
Robinson, David Gerald
2010-04-01
This paper discusses the recent development of a statistical approach for the automatic identification of anomalous network activity that is characteristic of exfiltration events. The approach is based on the language processing method referred to as latent Dirichlet allocation (LDA). Cyber security experts currently depend heavily on a rule-based framework for initial detection of suspect network events. The application of the rule set typically results in an extensive list of suspect network events that are then further explored manually for suspicious activity. The ability to identify anomalous network events is heavily dependent on the experience of the security personnel wading through the network log. Limitations of this approach are clear: rule-based systems only apply to exfiltration behavior that has previously been observed, and experienced cyber security personnel are rare commodities. Since the new methodology is not a discrete rule-based approach, it is more difficult for an insider to disguise exfiltration events. A further benefit is that the methodology provides a risk-based approach that can be implemented in a continuous, dynamic or evolutionary fashion. This permits suspect network activity to be identified early, with a quantifiable risk associated with decision making when responding to suspicious activity.
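As a rough illustration of the LDA idea (not the report's pipeline; the session strings, token vocabulary, and the likelihood-based anomaly score below are all invented for this sketch), one could treat each session's event tokens as a "document", fit topics on normal traffic, and flag sessions that the topic model finds unlikely:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical sessions: 40 routine ones plus one exfiltration-like outlier
normal = ["login read read logout"] * 40
sessions = normal + ["scp scp scp upload upload"]

vec = CountVectorizer().fit(sessions)           # vocabulary over all sessions
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(vec.transform(normal))                  # topics learned from normal traffic only

X = vec.transform(sessions).toarray()
theta = lda.transform(vec.transform(sessions))  # per-session topic mixtures
beta = lda.components_ / lda.components_.sum(axis=1, keepdims=True)
word_probs = theta @ beta                       # expected token distribution per session

# Average per-token log-likelihood; tokens rare in normal traffic drag it down
ll = (X * np.log(word_probs + 1e-12)).sum(axis=1) / X.sum(axis=1)
suspect = int(np.argmin(ll))                    # index of the least likely session
```

The point mirrors the abstract's argument: no discrete rule matches "scp/upload", yet the session is ranked suspicious purely because it is improbable under the learned language model.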
Alexandrov, N. N.; Go, N.
1994-01-01
We have completed an exhaustive search for common spatial arrangements of backbone fragments (SARFs) in nonhomologous proteins. This type of local structural similarity, incorporating short fragments of backbone atoms not necessarily arranged in the same order along the polypeptide chain, appears to be important for protein function and stability. To estimate the statistical significance of the similarities, we have introduced a similarity score. We present several locally similar structures with large similarity scores that have not yet been reported. On the basis of the results of pairwise comparison, we have performed hierarchical cluster analysis of protein structures. Our analysis is not limited to comparison of single chains but also includes complex molecules consisting of several subunits. SARFs with backbone fragments from different polypeptide chains provide a stable interaction between subunits in protein molecules. In many cases the active site of the enzyme is located at the same position relative to the common SARFs, implying that certain SARFs function as a universal interface for protein-substrate interactions. PMID:8069217
Lee, L.; Helsel, D.
2005-01-01
Trace contaminants in water, including metals and organics, often are measured at sufficiently low concentrations to be reported only as values below the instrument detection limit. Interpretation of these "less thans" is complicated when multiple detection limits occur. Statistical methods for multiply censored, or multiple-detection limit, datasets have been developed for medical and industrial statistics, and can be employed to estimate summary statistics or model the distributions of trace-level environmental data. We describe S-language-based software tools that perform robust linear regression on order statistics (ROS). The ROS method has been evaluated as one of the most reliable procedures for developing summary statistics of multiply censored data. It is applicable to any dataset that has 0 to 80% of its values censored. These tools are a part of a software library, or add-on package, for the R environment for statistical computing. This library can be used to generate ROS models and associated summary statistics, plot modeled distributions, and predict exceedance probabilities of water-quality standards. © 2005 Elsevier Ltd. All rights reserved.
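A minimal single-detection-limit version of ROS can be sketched as follows. The authors' library handles multiple detection limits and more careful plotting-position bookkeeping, so treat `ros_summary` and its details (Blom positions, log-scale regression) as illustrative assumptions rather than their API:

```python
import numpy as np
from scipy import stats

def ros_summary(values, detected):
    """Regression-on-order-statistics mean/sd for one detection limit."""
    x = np.asarray(values, dtype=float)
    d = np.asarray(detected, dtype=bool)
    n = x.size
    detects = np.sort(x[d])
    k = n - detects.size                             # number of non-detects
    pp = (np.arange(1, n + 1) - 0.375) / (n + 0.25)  # Blom plotting positions
    q = stats.norm.ppf(pp)                           # normal quantiles
    # Non-detects occupy the lowest ranks; fit the detects on a log scale
    slope, intercept, *_ = stats.linregress(q[k:], np.log(detects))
    imputed = np.exp(intercept + slope * q[:k])      # model-based fill-in for censored tail
    full = np.concatenate([imputed, detects])
    return full.mean(), full.std(ddof=1)

rng = np.random.default_rng(1)
true = rng.lognormal(0.0, 1.0, 300)                  # true mean = exp(0.5) ~ 1.65
dl = 0.5                                             # hypothetical detection limit
mean_hat, sd_hat = ros_summary(np.where(true < dl, dl, true), true >= dl)
```

Because the censored tail is filled in from the fitted lognormal line rather than a constant substitute, the resulting summary statistics avoid the bias of DL/2-style substitution.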
Detecting missing signals in multichannel recordings by using higher order statistics.
Halabi, R; Diab, M O; Moslem, B; Khalil, M; Marque, C
2012-01-01
In real-world applications, a multichannel acquisition system is susceptible to having one or many of its sensors displaced or detached, leading to the loss or corruption of the recorded signals. In this paper, we present a technique for detecting missing or corrupted signals in multichannel recordings. Our approach is based on Higher Order Statistics (HOS) analysis and is tested on real uterine electromyogram (EMG) signals recorded by a 4×4 electrode grid. Results have shown that HOS descriptors can discriminate between the two classes of signals (missing vs. non-missing). These results are supported by statistical analysis using the t-test, which indicated statistical significance at the 95% confidence level.
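The discriminating power of an HOS descriptor such as kurtosis can be illustrated on synthetic data. The burst waveform, channel counts, and noise levels below are invented stand-ins for the paper's EMG recordings: a channel carrying intermittent bursts is heavy-tailed (high kurtosis), while a detached sensor records near-Gaussian noise (kurtosis near zero).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 2000)
# A hypothetical EMG-like burst; real data would replace this
burst = np.exp(-((t - 0.5) / 0.05) ** 2) * np.sin(2 * np.pi * 50 * t)

good = [burst * rng.uniform(0.8, 1.2) + 0.05 * rng.standard_normal(t.size)
        for _ in range(8)]                          # channels carrying the signal
missing = [0.05 * rng.standard_normal(t.size)
           for _ in range(8)]                       # detached sensors: noise only

k_good = [stats.kurtosis(ch) for ch in good]        # bursty -> heavy-tailed -> high kurtosis
k_missing = [stats.kurtosis(ch) for ch in missing]  # Gaussian noise -> excess kurtosis ~ 0

t_stat, p_value = stats.ttest_ind(k_good, k_missing)
```

The t-test on the two groups of descriptors parallels the statistical analysis reported in the abstract.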
Petykhov, A B; Maev, I V; Deriabin, V E
2012-01-01
Anthropometry is a technique that provides the features needed to characterize changes in the human body in health and disease. A statistical analysis of anthropometric parameters -- body mass, height, waist, hip, shoulder and wrist circumferences, and skinfold thickness over the triceps, below the scapula, on the chest, on the abdomen and over the biceps -- with calculation of indices and an assessment of possible age effects was carried out for the first time in domestic medicine. Complexes of interrelated anthropometric characteristics were detected. Correlation coefficients (r) were computed, and factor analysis (principal components with subsequent varimax rotation), covariance analysis and discriminant analysis (using the Kaiser and Wilks criteria and the F-test) were applied. Intergroup variability of body composition was studied for individual characteristics in groups of healthy individuals (135 subjects aged 45.6 +/- 1.2 years; 56.3% men and 43.7% women) and in internal pathology: patients after gastrectomy (121; 57.7 +/- 1.2 years; 52% men and 48% women); after Billroth operation (214; 56.1 +/- 1.0 years; 53% men and 47% women); after enterectomy (103; 44.5 +/- 1.8 years; 53% men and 47% women); and with protein-energy wasting of mixed genesis (206; 29.04 +/- 1.6 years; 79% men and 21% women). The analysis defined a group of interrelated characteristics comprising the anthropometric parameters of subcutaneous fat deposition (skinfold thickness over the triceps and biceps, below the scapula, and on the abdomen) and fat body mass. These characteristics are related to age and height and show a more pronounced dependence in women, reflecting the development of the fatty component of the body when assessing body mass index in women (unlike men). The waist-hip ratio differs irrespective of body composition indicators, which does not allow characterizing it in terms of truncal or
NASA Astrophysics Data System (ADS)
Casati, Michele
2014-05-01
The assertion that solar activity may play a significant role in triggering large volcanic eruptions has long been discussed by geophysicists. Numerous scientific papers have established a possible correlation between these events and the electromagnetic coupling between the Earth and the Sun, but none has been able to highlight a statistically significant relationship between large volcanic eruptions and any of the series such as geomagnetic activity, solar wind, or sunspot number. In our research, we compare the 148 volcanic eruptions with index VEI4 and the 37 major historical volcanic eruptions of index VEI5 or greater, recorded from 1610 to 2012, with the monthly sunspot number (SSN). Setting as the threshold value a monthly sunspot number of 46 (recorded during the great Krakatoa eruption of August 1883, historical index VEI6), we note some possible relationships and conduct a statistical test. • Of the 31 historical large volcanic eruptions with index VEI5+ recorded between 1610 and 1955, 29 occurred when SSN < 46. The remaining 2 eruptions were recorded not at SSN < 46 but during the solar maxima of the solar cycle of 1739 and of solar cycle No. 14 (the Shikotsu eruption of 1739 and Ksudach in 1907). • Of the 8 historical large volcanic eruptions with index VEI6+ recorded from 1610 to the present, 7 occurred at SSN < 46 and, more specifically, within the three well-known grand solar minima: the Maunder minimum (1645-1710), the Dalton minimum (1790-1830), and the solar minima that occurred between 1880 and 1920. The only exception is the Pinatubo eruption of June 1991, recorded at the solar maximum of cycle 22. • Of the 6 historical major volcanic eruptions with index VEI5+ recorded after 1955, 5 were recorded not during periods of low solar activity but during the solar maxima of cycles 19, 21 and 22. The significance tests, conducted with the chi-square statistic χ² = 7.782, detect a
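For illustration, the counts quoted in the abstract can be arranged into a 2×2 contingency table and tested with a chi-square test. Note that this table layout is our own reconstruction from the abstract's figures, not necessarily the grouping behind the reported χ² = 7.782:

```python
from scipy.stats import chi2_contingency

# VEI5+ eruption counts by period and sunspot-number regime (from the abstract)
#                  SSN < 46   SSN >= 46
table = [[29, 2],   # eruptions 1610-1955
         [1, 5]]    # eruptions after 1955 (5 during solar maxima)
chi2, p, dof, expected = chi2_contingency(table)
```

With one degree of freedom, a small p-value would indicate that the association between eruption period and sunspot-number regime is unlikely to be due to chance, which is the style of argument the abstract pursues.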
Avalanche Photodiode Statistics in Triggered-avalanche Detection Mode
NASA Technical Reports Server (NTRS)
Tan, H. H.
1984-01-01
The output of a triggered avalanche mode avalanche photodiode is modeled as Poisson distributed primary avalanche events plus conditionally Poisson distributed trapped carrier induced secondary events. The moment generating function as well as the mean and variance of the diode output statistics are derived. The dispersion of the output statistics is shown to always exceed that of the Poisson distribution. Several examples are considered in detail.
NASA Astrophysics Data System (ADS)
Woodruff, J. D.; Donnelly, J. P.; Emanuel, K.
2007-12-01
Coastal overwash deposits preserved within backbarrier sediments extend the documented record of tropical cyclone strikes back several millennia, providing valuable new data that help to elucidate links between tropical cyclone activity and climate variability. Certain caveats should be considered, however, when assessing trends observed within these paleo-storm records. For instance, gaps in overwash activity at a particular site could simply be artifacts produced by the random nature of these episodic events. Recently, a 5000 year record of intense hurricane strikes has been developed using coarse-grained overwash deposits from Laguna Playa Grande (LPG), a coastal lagoon located on the island of Vieques, Puerto Rico. The LPG record exhibits periods of frequent and infrequent hurricane-induced overwash activity spanning many centuries. These trends are consistent with overwash reconstructions from western Long Island, NY, and have been linked in part to variability in the El Niño/Southern Oscillation and the West African monsoon. Here we assess the statistical significance for active and inactive periods at LPG by creating thousands of synthetic overwash records for the site using storm tracks generated by a coupled ocean-atmosphere hurricane model set to mimic modern climatology. Results show that periods of infrequent overwash activity at the LPG site between 3600 and 1500 yrs BP and 1000 and 250 yrs BP are extremely unlikely to occur under modern climate conditions (above 99 percent confidence). This suggests that the variability observed in the Vieques record is consistent with changing climatic boundary conditions. Overwash frequency is greatest over the last 300 years, with 2 to 3 deposits/century compared to 0.6 deposits/century for earlier active regimes from 2500 to 1000 yrs BP and 5000 to 3600 yrs BP. While this may reflect an unprecedented level of activity over the last 5000 years, it may also in part be due to an undercounting of events in older
Statistical Anomaly Detection for Monitoring of Human Dynamics
NASA Astrophysics Data System (ADS)
Kamiya, K.; Fuse, T.
2015-05-01
Understanding of human dynamics has drawn attention in various areas. Due to the widespread adoption of positioning technologies that use GPS or public Wi-Fi, location information can be obtained with high spatio-temporal resolution and at low cost. By collecting sets of individual location information in real time, monitoring of human dynamics has recently become feasible and is expected to lead to dynamic traffic control in the future. Although this monitoring focuses on detecting anomalous states of human dynamics, anomaly detection methods have been developed ad hoc and are not fully systematized. This research aims to define an anomaly detection problem for human dynamics monitoring with gridded population data and to develop an anomaly detection method based on that definition. Following a comprehensive review, we discuss the characteristics of anomaly detection for human dynamics monitoring and categorize our problem as a semi-supervised anomaly detection problem of detecting contextual anomalies behind time-series data. We developed an anomaly detection method based on a sticky HDP-HMM, which is able to estimate the number of hidden states according to the input data. Results of an experiment with synthetic data showed that our proposed method has good fundamental performance with respect to detection rate. In an experiment with real gridded population data, an anomaly was detected when and where an actual social event had occurred.
Probable detection of climatically significant change of the solar constant
NASA Technical Reports Server (NTRS)
Sofia, S.; Endal, A. S.
1980-01-01
It is suggested that the decrease in the solar radius inferred from solar eclipse observations made from 1715 to 1979 reflects a variation of the solar constant that may be of considerable climatic significance. A general, time-averaged relationship between changes in the solar constant and changes in the solar radius is derived based on a model of the contraction and expansion of the convective zone. A preliminary numerical calculation of radius changes due to changes in the mixing length of the solar envelope is presented which indicates that a decrease in solar radius of 0.5 arcsec, as observed in the last 264 years, would correspond to a decrease of 0.7% in the solar constant, a value of large climatic significance. Limitations of the observational method and the numerical approach are pointed out, and required additional theoretical and observational efforts are indicated.
Surface Electromyographic Onset Detection Based On Statistics and Information Content
NASA Astrophysics Data System (ADS)
López, Natalia M.; Orosco, Eugenio; di Sciascio, Fernando
2011-12-01
The correct detection of the onset of muscular contraction is a diagnostic tool to neuromuscular diseases and an action trigger to control myoelectric devices. In this work, entropy and information content concepts were applied in algorithmic methods to automatic detection in surface electromyographic signals.
NASA Astrophysics Data System (ADS)
Alves, Gelio
After the sequencing of many complete genomes, we are in a post-genomic era in which the most important task has changed from gathering genetic information to organizing the mass of data and understanding how components interact with each other. The former is usually undertaken using bioinformatics methods, while the latter task is generally termed proteomics. Success in both demands correct statistical significance assignments for the results found. In my dissertation, I study two concrete examples: global sequence alignment statistics and peptide sequencing/identification using mass spectrometry. High-performance liquid chromatography coupled to a mass spectrometer (HPLC/MS/MS), enabling peptide identifications and thus protein identifications, has become the tool of choice in large-scale proteomics experiments. Peptide identification is usually done by database search methods. The lack of robust statistical significance assignment among current methods motivated the development of a novel de novo algorithm, RAId, whose score statistics then provide statistical significance for high-scoring peptides found in our custom, enzyme-digested peptide library. The ease of incorporating post-translational modifications is another important feature of RAId. To organize the massive protein/DNA data accumulated, biologists often cluster proteins according to their similarity via tools such as sequence alignment. Homologous proteins share similar domains. Assessing the similarity of two domains usually requires alignment from head to toe, i.e. a global alignment. Good alignment score statistics with an appropriate null model enable us to distinguish biologically meaningful similarity from chance similarity. There has been much progress in local alignment statistics, which characterize score statistics when alignments tend to appear as a short segment of the whole sequence. For global alignment, which is useful in domain alignment, there is still much room for
Extrasolar planets detections and statistics through gravitational microlensing
NASA Astrophysics Data System (ADS)
Cassan, A.
2014-10-01
Gravitational microlensing was proposed thirty years ago as a promising method to probe the existence and properties of compact objects in the Galaxy and its surroundings. The particular strength of the technique is that detection does not rely on the photon emission of the object itself, but on the way its mass affects the path of light from a background, almost aligned source. Detections thus include not only bright but also dark objects. Today, the many successes of gravitational microlensing have largely exceeded the original promises. Microlensing contributed important results and breakthroughs in several astrophysical fields, as it was used as a powerful tool to probe Galactic structure (proper motions, extinction maps), to search for dark and compact massive objects in the halo and disk of the Milky Way, to probe the atmospheres of bulge red giant stars, to search for low-mass stars and brown dwarfs, and to hunt for extrasolar planets. As an extrasolar planet detection method, microlensing nowadays stands in the top five of the successful observational techniques. Compared to other (complementary) detection methods, microlensing provides unique information on the population of exoplanets, because it allows the detection of very low-mass planets (down to the mass of the Earth) at large orbital distances from their star (0.5 to 10 AU). It is also the only technique that allows the discovery of planets at distances from Earth greater than a few kiloparsecs, up to the bulge of the Galaxy. Microlensing discoveries include the first ever detection of a cool super-Earth around an M-dwarf star, the detection of several cool Neptunes, Jupiters and super-Jupiters, as well as multi-planetary systems and brown dwarfs. So far, the least massive planet detected by microlensing has only three times the mass of the Earth and orbits a very low mass star at the edge of the brown dwarf regime. Several free-floating planetary
Detection of Significant Groups in Hierarchical Clustering by Resampling
Sebastiani, Paola; Perls, Thomas T.
2016-01-01
Hierarchical clustering is a simple and reproducible technique to rearrange data of multiple variables and sample units and visualize possible groups in the data. Despite the name, hierarchical clustering does not provide clusters automatically, and “tree-cutting” procedures are often used to identify subgroups in the data by cutting the dendrogram that represents the similarities among groups used in the agglomerative procedure. We introduce a resampling-based technique that can be used to identify cut-points of a dendrogram with a significance level based on a reference distribution for the heights of the branch points. The evaluation on synthetic data shows that the technique is robust in a variety of situations. An example with real biomarker data from the Long Life Family Study shows the usefulness of the method. PMID:27551289
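The resampling idea can be sketched as follows, with assumed details that need not match the authors' implementation: here the reference distribution for merge heights is built by permuting each variable independently (which preserves marginals but destroys any grouping of the sample units), and the statistic is the height of the top dendrogram merge under average linkage.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

rng = np.random.default_rng(3)
# Two clearly separated synthetic groups of 20 units x 5 variables
data = np.vstack([rng.normal(0.0, 1.0, (20, 5)),
                  rng.normal(4.0, 1.0, (20, 5))])

obs_height = linkage(data, method="average")[-1, 2]  # height of the top merge

# Reference distribution: shuffle each column independently
ref_heights = []
for _ in range(200):
    perm = np.column_stack([rng.permutation(col) for col in data.T])
    ref_heights.append(linkage(perm, method="average")[-1, 2])

p_value = float(np.mean(np.array(ref_heights) >= obs_height))
```

A small p-value says the top branch point is taller than expected under "no structure", justifying a cut there; the same comparison could be repeated down the dendrogram for lower branch points.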
Statistical approach for detecting cancer lesions from prostate ultrasound images
NASA Astrophysics Data System (ADS)
Houston, A. G.; Premkumar, Saganti B.; Babaian, Richard J.; Pitts, David E.
1993-07-01
Sequential digitized cross-sectional ultrasound image planes of several prostates have been studied at the pixel level during the past year. The statistical distribution of gray scale values in terms of simple statistics, sample means and sample standard deviations, have been considered for estimating the differences between cross-sectional image planes of the gland due to the presence of cancer lesions. Based on a variability measure, the results for identifying the presence of cancer lesions in the peripheral zone of the gland for 25 blind test cases were found to be 64% accurate. This accuracy is higher than that obtained by visual photo interpretation of the image data, though not as high as our earlier results were indicating. Axial-view ultrasound image planes of prostate glands were obtained from the apex to the base of the gland at 2 mm intervals. Results for the 25 different prostate glands, which include pathologically confirmed benign and cancer cases, are presented.
A Non-Parametric Surrogate-based Test of Significance for T-Wave Alternans Detection
Nemati, Shamim; Abdala, Omar; Bazán, Violeta; Yim-Yeh, Susie; Malhotra, Atul; Clifford, Gari
2010-01-01
We present a non-parametric adaptive surrogate test that allows for the differentiation of statistically significant T-Wave Alternans (TWA) from alternating patterns that can be solely explained by the statistics of noise. The proposed test is based on estimating the distribution of noise-induced alternating patterns in a beat sequence from a set of surrogate data derived from repeated reshuffling of the original beat sequence. Thus, in assessing the significance of the observed alternating patterns in the data, no assumptions are made about the underlying noise distribution. In addition, since the distribution of noise-induced alternans magnitudes is calculated separately for each sequence of beats within the analysis window, the method is robust to data non-stationarities in both noise and TWA. The proposed surrogate method for rejecting noise was compared to the standard noise rejection methods used with the Spectral Method (SM) and the Modified Moving Average (MMA) techniques. Using a previously described realistic multi-lead model of TWA and real physiological noise, we demonstrate that the proposed approach reduces false TWA detections while maintaining a lower missed-TWA-detection rate than all the other methods tested. A simple averaging-based TWA estimation algorithm was coupled with the surrogate significance testing and evaluated on three public databases: the Normal Sinus Rhythm Database (NRSDB), the Chronic Heart Failure Database (CHFDB) and the Sudden Cardiac Death Database (SCDDB). Differences in TWA amplitudes between the databases were evaluated at matched heart rate (HR) intervals from 40 to 120 beats per minute (BPM). Using the two-sample Kolmogorov-Smirnov test, we found that significant differences in TWA levels exist between each patient group at all decades of heart rates. The most marked difference was generally found at higher heart rates, and the new technique resulted in a larger margin of separability between patient populations than
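The surrogate-reshuffling logic can be sketched in a few lines. The alternans estimator and the synthetic beat series below are simplified stand-ins, not the paper's multi-lead TWA pipeline: reshuffling destroys real beat-to-beat alternation, so the surrogate magnitudes reflect noise alone.

```python
import numpy as np

def alternans_mag(beats):
    """Magnitude of the beat-to-beat alternating component."""
    signs = (-1.0) ** np.arange(beats.size)
    return abs(np.mean(signs * (beats - beats.mean())))

rng = np.random.default_rng(4)
n = 128
# Synthetic beat amplitudes: baseline 100, true alternans of 5, noise sd 2
beats = 100.0 + 5.0 * (-1.0) ** np.arange(n) + rng.normal(0.0, 2.0, n)

obs = alternans_mag(beats)
# Surrogates: reshuffled beat order keeps the noise statistics only
surr = np.array([alternans_mag(rng.permutation(beats)) for _ in range(500)])
p_value = float(np.mean(surr >= obs))
```

Because the reference distribution is built from the data themselves, no parametric assumption about the noise is needed, which is the central claim of the abstract.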
NASA Astrophysics Data System (ADS)
Krzyżak, A. T.; Jasiński, A.; Adamek, D.
2006-07-01
Qualification of the most statistically "sensitive" diffusion parameters using Magnetic Resonance (MR) Diffusion Tensor Imaging (DTI) of the control and injured spinal cord of a rat, in vivo and in vitro after the trauma, is reported. Injury was induced at the TH12/TH13 level by a controlled "weight-drop". In vitro experiments were performed in a home-built MR microscope with a 6.4 T magnet; in vivo samples were measured in a 9.4 T/21 horizontal magnet. The aim of this work was to find the most effective diffusion parameters for the statistically significant detection of spinal cord tissue damage. Apparent diffusion tensor (ADT) weighted data, measured in vivo and in vitro on control and injured rat spinal cord (RSC) in transverse planes, and an analysis of the diffusion anisotropy as a function of many parameters, which allows the existence of the damage to be exposed statistically, are reported.
Chen, Shuo; Kang, Jian; Xing, Yishi; Wang, Guoqing
2015-12-01
Group-level functional connectivity analyses often aim to detect the altered connectivity patterns between subgroups with different clinical or psychological experimental conditions, for example, comparing cases and healthy controls. We present a new statistical method to detect differentially expressed connectivity networks with significantly improved power and lower false-positive rates. The goal of our method was to capture most differentially expressed connections within networks of constrained numbers of brain regions (by the rule of parsimony). By virtue of parsimony, the false-positive individual connectivity edges within a network are effectively reduced, whereas the informative (differentially expressed) edges are allowed to borrow strength from each other to increase the overall power of the network. We develop a test statistic for each network in light of combinatorics graph theory, and provide p-values for the networks (in the weak sense) by using permutation test with multiple-testing adjustment. We validate and compare this new approach with existing methods, including false discovery rate and network-based statistic, via simulation studies and a resting-state functional magnetic resonance imaging case-control study. The results indicate that our method can identify differentially expressed connectivity networks, whereas existing methods are limited.
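A stripped-down version of the permutation-testing component might look like the following. The subnetwork score used here (the sum of squared edge-wise two-sample t statistics) and all data are illustrative; the authors' actual statistic is built from combinatorial graph theory and parsimony constraints.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n_sub, n_edges = 20, 30
cases = rng.normal(0.0, 1.0, (n_sub, n_edges))
cases[:, :5] += 1.0                       # 5 differentially expressed edges
controls = rng.normal(0.0, 1.0, (n_sub, n_edges))

def network_score(a, b, edges):
    """Sum of squared edge-wise two-sample t statistics over a subnetwork."""
    t, _ = stats.ttest_ind(a[:, edges], b[:, edges])
    return float(np.sum(t ** 2))

edges = np.arange(5)                      # candidate subnetwork of edges
obs = network_score(cases, controls, edges)

pooled = np.vstack([cases, controls])
perm = []
for _ in range(500):
    idx = rng.permutation(2 * n_sub)      # relabel subjects at random
    perm.append(network_score(pooled[idx[:n_sub]], pooled[idx[n_sub:]], edges))
p_value = (1 + np.sum(np.array(perm) >= obs)) / (1 + len(perm))
```

Pooling the edges of a subnetwork into one score lets informative edges borrow strength from each other, which is the power argument made in the abstract; the permutation of group labels supplies the network-level p-value.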
Detecting modules in biological networks by edge weight clustering and entropy significance
Lecca, Paola; Re, Angela
2015-01-01
Detection of the modular structure of biological networks is of interest to researchers adopting a systems perspective for the analysis of omics data. Computational systems biology has provided a rich array of methods for network clustering. To date, the majority of approaches address this task through a network node classification based on topological or external quantifiable properties of network nodes. Conversely, numerical properties of network edges are underused, even though the information content that can be associated with network edges has grown owing to steady advances in molecular biology technology over the last decade. Properly accounting for network edges in the development of clustering approaches can become crucial to improve quantitative interpretation of omics data, ultimately resulting in more biologically plausible models. In this study, we present a novel technique for network module detection, named WG-Cluster (Weighted Graph CLUSTERing). WG-Cluster's notable features, compared to current approaches, lie in: (1) the simultaneous exploitation of network node and edge weights to improve the biological interpretability of the connected components detected, (2) the assessment of their statistical significance, and (3) the identification of emerging topological properties in the detected connected components. WG-Cluster utilizes three major steps: (i) an unsupervised version of the k-means edge-based algorithm detects sub-graphs with similar edge weights, (ii) a fast-greedy algorithm detects connected components which are then scored and selected according to the statistical significance of their scores, and (iii) an analysis of the convolution between sub-graph mean edge weight and connected component score provides a summarizing view of the connected components. WG-Cluster can be applied to directed and undirected networks of different types of interacting entities and scales up to large omics data sets. Here, we show that WG-Cluster can be
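Step (i) of the WG-Cluster pipeline, clustering edges by weight, amounts to k-means on scalars. A minimal sketch, without the graph traversal or the entropy-based significance scoring that follow it:

```python
def kmeans_edge_weights(weights, k=2, iters=100):
    """Plain k-means on scalar edge weights: the unsupervised edge-clustering
    step of a WG-Cluster-style pipeline, sketched without the graph machinery."""
    srt = sorted(weights)
    # spread the initial centres across the sorted weights
    centers = [srt[(len(srt) - 1) * i // (k - 1)] for i in range(k)] if k > 1 else [srt[0]]
    labels = [0] * len(weights)
    for _ in range(iters):
        # assignment step: nearest centre
        labels = [min(range(k), key=lambda c: abs(w - centers[c])) for w in weights]
        # update step: mean of each cluster
        new_centers = []
        for c in range(k):
            members = [w for w, l in zip(weights, labels) if l == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers
```

Edges in the same cluster then induce the sub-graphs on which connected components are sought.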
Ultrabroadband direct detection of nonclassical photon statistics at telecom wavelength.
Wakui, Kentaro; Eto, Yujiro; Benichi, Hugo; Izumi, Shuro; Yanagida, Tetsufumi; Ema, Kazuhiro; Numata, Takayuki; Fukuda, Daiji; Takeoka, Masahiro; Sasaki, Masahide
2014-01-01
Broadband light sources play essential roles in diverse fields, such as high-capacity optical communications, optical coherence tomography, optical spectroscopy, and spectrograph calibration. Although a nonclassical state from spontaneous parametric down-conversion may serve as a quantum counterpart, its detection and characterization have been a challenging task. Here we demonstrate the direct detection of photon numbers of an ultrabroadband (110 nm FWHM) squeezed state in the telecom band centred at 1535 nm wavelength, using a superconducting transition-edge sensor. The observed photon-number distributions violate Klyshko's criterion for the nonclassicality. From the observed photon-number distribution, we evaluate the second- and third-order correlation functions, and characterize a multimode structure, which implies that several tens of orthonormal modes of squeezing exist in the single optical pulse. Our results and techniques open up a new possibility to generate and characterize frequency-multiplexed nonclassical light sources for quantum info-communications technology. PMID:24694515
Reliable detection of directional couplings using rank statistics.
Chicharro, Daniel; Andrzejak, Ralph G
2009-08-01
To detect directional couplings from time series, various measures based on distances in reconstructed state spaces have been introduced. These measures can, however, be biased by asymmetries in the dynamics' structure, noise color, or noise level, all of which are ubiquitous in experimental signals. Using theoretical reasoning and results from model systems, we identify the various sources of bias and show that most of them can be eliminated by an appropriate normalization. We furthermore diminish the remaining biases by introducing a measure based on ranks of distances. This rank-based measure outperforms existing distance-based measures in both sensitivity and specificity for directional couplings. Our findings are therefore relevant for the reliable detection of directional couplings from experimental signals.
Statistical detection of the mid-Pleistocene transition
Maasch, K.A.
1988-01-01
Statistical methods have been used to show quantitatively that the transition in mean and variance observed in δ18O records during the middle of the Pleistocene was abrupt. By applying these methods to all of the available records spanning the entire Pleistocene, it appears that this jump was global and primarily represents an increase in ice mass. At roughly the same time an abrupt decrease in sea surface temperature also occurred, indicative of sudden global cooling. This kind of evidence suggests a possible bifurcation of the climate system that must be accounted for in a complete explanation of the ice ages. Theoretical models including internal dynamics are capable of exhibiting this kind of rapid transition. 50 refs.
Two New Statistics To Detect Answer Copying. Research Report.
ERIC Educational Resources Information Center
Sotaridona, Leonardo S.; Meijer, Rob R.
Two new indices to detect answer copying on a multiple-choice test, S(1) and S(2) (subscripts), are proposed. The S(1) index is similar to the K-index (P. Holland, 1996) and the K̄2 (K2) index (L. Sotaridona and R. Meijer, in press), but the distribution of the number of matching incorrect answers of the source (examinee s) and the…
A powerful weighted statistic for detecting group differences of directed biological networks
Yuan, Zhongshang; Ji, Jiadong; Zhang, Xiaoshuai; Xu, Jing; Ma, Daoxin; Xue, Fuzhong
2016-01-01
Complex disease is largely determined by a number of biomolecules interwoven into networks, rather than by a single biomolecule. Different physiological conditions such as cases and controls may manifest as different networks. Statistical comparison between biological networks can provide not only new insight into the disease mechanism but also statistical guidance for drug development. However, the methods developed in previous studies are inadequate to capture the changes in both the nodes and edges, and often ignore the network structure. In this study, we present a powerful weighted statistical test for group differences of directed biological networks, which is independent of the network attributes and can capture the changes in both the nodes and edges, while simultaneously accounting for the network structure by putting more weight on differences at nodes occupying relatively more important positions. Simulation studies illustrate that this method performs better than previous ones under various sample sizes and network structures. One application to a GWAS of leprosy successfully identifies the specific gene interaction network contributing to leprosy. Another real data analysis significantly identifies a new biological network related to acute myeloid leukemia. One potential network responsible for lung cancer has also been significantly detected. The source R code is available on our website. PMID:27686331
Parmenter, C. A.; Yates, T. L.; Parmenter, R. R.; Dunnum, J. L.
1999-01-01
A long-term monitoring program begun 1 year after the epidemic of hantavirus pulmonary syndrome in the U.S. Southwest tracked rodent density changes through time and among sites and related these changes to hantavirus infection rates in various small-mammal reservoir species and human disease outbreaks. We assessed the statistical sensitivity of the program's field design and tested for potential biases in population estimates due to unintended deaths of rodents. Analyzing data from two sites in New Mexico from 1994 to 1998, we found that for many species of Peromyscus, Reithrodontomys, Neotoma, Dipodomys, and Perognathus, the monitoring program detected species-specific spatial and temporal differences in rodent densities; trap-related deaths did not significantly affect long-term population estimates. The program also detected a short-term increase in rodent densities in the winter of 1997-98, demonstrating its usefulness in identifying conditions conducive to increased risk for human disease. PMID:10081679
Statistical Fault Detection for Parallel Applications with AutomaDeD
Bronevetsky, G; Laguna, I; Bagchi, S; de Supinski, B R; Ahn, D; Schulz, M
2010-03-23
Today's largest systems have over 100,000 cores, with million-core systems expected over the next few years. The large component count means that these systems fail frequently and often in very complex ways, making them difficult to use and maintain. While prior work on fault detection and diagnosis has focused on faults that significantly reduce system functionality, the wide variety of failure modes in modern systems makes them likely to fail in complex ways that impair system performance but are difficult to detect and diagnose. This paper presents AutomaDeD, a statistical tool that models the timing behavior of each application task and tracks its behavior to identify any abnormalities. If any are observed, AutomaDeD can immediately detect them and report to the system administrator the task where the problem began. This identification of the fault's initial manifestation can provide administrators with valuable insight into the fault's root causes, making it significantly easier and cheaper for them to understand and repair it. Our experimental evaluation shows that AutomaDeD detects a wide range of faults immediately after they occur 80% of the time, with a low false-positive rate. Further, it identifies weaknesses of the current approach that motivate future research.
SAR target detection by fusion of CFAR, variance, and fractal statistics
NASA Astrophysics Data System (ADS)
Kaplan, Lance M.; Murenzi, Romain; Namuduri, Kameswara R.
1998-07-01
Two texture-based features and one amplitude-based feature are evaluated as detection statistics for synthetic aperture radar (SAR) imagery. The statistics include a local variance, an extended fractal, and a two-parameter CFAR feature. The paper compares the effectiveness of focus-of-attention (FOA) algorithms that consist of any number of combinations of the three statistics. The public MSTAR database is used to derive receiver-operator-characteristic (ROC) curves for the different detectors at various signal-to-clutter ratios (SCR). The database contains one-foot-resolution X-band SAR imagery. The results in the paper indicate that the extended fractal statistic provides the best target/clutter discrimination, and the variance statistic is the most robust against SCR. In fact, the extended fractal statistic combines the intensity-difference information also used by the CFAR feature with the spatial extent of the higher-intensity pixels to generate an attractive detection statistic.
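A one-dimensional sketch of the two-parameter CFAR idea: each test cell is normalised by the mean and standard deviation estimated from surrounding training cells, with guard cells next to the test cell excluded. Window sizes and threshold here are illustrative, not those of the paper:

```python
import math

def cfar_detect(signal, guard=2, train=8, threshold=3.0):
    """Two-parameter CFAR in one dimension: flag cell i when
    (x_i - mu) / sigma exceeds the threshold, with mu and sigma estimated
    from the training cells around i (guard cells adjacent to i excluded)."""
    detections = []
    n = len(signal)
    for i in range(n):
        cells = [signal[j]
                 for j in range(max(0, i - guard - train), min(n, i + guard + train + 1))
                 if abs(j - i) > guard]
        if len(cells) < 2:
            continue
        mu = sum(cells) / len(cells)
        var = sum((c - mu) ** 2 for c in cells) / (len(cells) - 1)
        sigma = math.sqrt(var) or 1e-12  # avoid division by zero on flat clutter
        if (signal[i] - mu) / sigma > threshold:
            detections.append(i)
    return detections
```

Because both the mean and the spread of the local clutter are estimated, the false alarm rate stays roughly constant as the clutter level changes, which is the point of the two-parameter form.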
ERIC Educational Resources Information Center
Hojat, Mohammadreza; Xu, Gang
2004-01-01
Effect Sizes (ES) are an increasingly important index used to quantify the degree of practical significance of study results. This paper gives an introduction to the computation and interpretation of effect sizes from the perspective of the consumer of the research literature. The key points made are: (1) "ES" is a useful indicator of the…
A High-Order Statistical Tensor Based Algorithm for Anomaly Detection in Hyperspectral Imagery
NASA Astrophysics Data System (ADS)
Geng, Xiurui; Sun, Kang; Ji, Luyan; Zhao, Yongchao
2014-11-01
Recently, high-order statistics have received more and more interest in the field of hyperspectral anomaly detection. However, most of the existing high-order-statistics-based anomaly detection methods require stepwise iterations since they are direct applications of blind source separation. Moreover, these methods usually produce multiple detection maps rather than a single anomaly distribution image. In this study, we exploit the concept of the coskewness tensor and propose a new anomaly detection method, called COSD (coskewness detector). COSD needs no iteration and produces a single detection map. The experiments based on both simulated and real hyperspectral data sets verify the effectiveness of our algorithm. PMID:25366706
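The coskewness tensor underlying a COSD-style detector collects the third-order cross moments of the standardised bands. A small pure-Python sketch of the tensor itself (the full detector, which projects pixels onto directions maximising coskewness, is not reproduced here):

```python
import math

def coskewness_tensor(data):
    """Coskewness tensor S[i][j][k] = E[z_i z_j z_k] over standardised
    variables z (population mean and standard deviation). Rows of `data`
    are samples (e.g. pixels), columns are variables (e.g. bands)."""
    n = len(data)
    d = len(data[0])
    means = [sum(row[v] for row in data) / n for v in range(d)]
    stds = [math.sqrt(sum((row[v] - means[v]) ** 2 for row in data) / n)
            for v in range(d)]
    z = [[(row[v] - means[v]) / stds[v] for v in range(d)] for row in data]
    return [[[sum(zr[i] * zr[j] * zr[k] for zr in z) / n
              for k in range(d)] for j in range(d)] for i in range(d)]
```

For a single variable the diagonal entry S[0][0][0] is the ordinary skewness; rare anomalous pixels pull it away from zero, which is the signal such a detector exploits.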
Statistically qualified neuro-analytic failure detection method and system
Vilim, Richard B.; Garcia, Humberto E.; Chen, Frederick W.
2002-03-02
An apparatus and method for monitoring a process involve development and application of a statistically qualified neuro-analytic (SQNA) model to accurately and reliably identify process change. The development of the SQNA model is accomplished in two stages: deterministic model adaptation and stochastic modification of the adapted model. Deterministic model adaptation involves formulating an analytic model of the process representing known process characteristics, augmenting the analytic model with a neural network that captures unknown process characteristics, and training the resulting neuro-analytic model by adjusting the neural network weights according to a unique scaled equation-error minimization technique. Stochastic modification involves qualifying any remaining uncertainty in the trained neuro-analytic model by formulating a likelihood function, given an error propagation equation, for computing the probability that the neuro-analytic model generates the measured process output. Preferably, the developed SQNA model is validated using known sequential probability ratio tests and applied to the process as an on-line monitoring system. Illustrative of the method and apparatus, the method is applied to a peristaltic pump system.
Key statistics related to CO2 emissions: Significant contributing countries
Kellogg, M.A.; Edmonds, J.A.; Scott, M.J.; Pomykala, J.S.
1987-07-01
This country selection task report describes and applies a methodology for identifying a set of countries responsible for significant present and anticipated future emissions of CO2 and other radiatively important gases (RIGs). The identification of countries responsible for CO2 and other RIG emissions will help determine to what extent a select number of countries might be capable of influencing future emissions. Once identified, those countries could potentially exercise cooperative collective control of global emissions and thus mitigate the associated adverse effects of those emissions. The methodology developed consists of two approaches: the resource approach and the emissions approach. While conceptually very different, both approaches yield the same fundamental conclusion. The core of any international initiative to control global emissions must include three key countries: the US, USSR, and the People's Republic of China. It was also determined that broader control can be achieved through the inclusion of sixteen additional countries with significant contributions to worldwide emissions.
Wille, Anja; Hoh, Josephine; Ott, Jurg
2003-12-01
In complex traits, multiple disease loci presumably interact to produce the disease. For this reason, even with high-resolution single nucleotide polymorphism (SNP) marker maps, it has been difficult to map susceptibility loci by conventional locus-by-locus methods. Fine mapping strategies are needed that allow for the simultaneous detection of interacting disease loci while handling large numbers of densely spaced markers. For this purpose, sum statistics were recently proposed as a first-stage analysis method for case-control association studies with SNPs. Via sums of single-marker statistics, information over multiple disease-associated markers is combined and, with a global significance value alpha, a small set of "interesting" markers is selected for further analysis. Here, the statistical properties of such approaches are examined by computer simulation. It is shown that sum statistics can often be successfully applied when marker-by-marker approaches fail to detect association. Compared with Bonferroni or False Discovery Rate (FDR) procedures, sum statistics have greater power, and more disease loci can be detected. However, in studies with tightly linked markers, simple sum statistics can be suboptimal, since the intermarker correlation is ignored. A method is presented that takes the correlation structure among marker loci into account when marker statistics are combined.
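The sum-statistic idea, combining single-marker scores and judging the sum against a permutation null, can be sketched as follows. Squared mean-difference scores over the top-ranked markers stand in for the chi-square marker statistics of the paper:

```python
import random

def sum_statistic_test(cases, controls, n_top=5, n_perm=1000, seed=0):
    """Sum statistic over the n_top largest per-marker scores, with a
    permutation p-value. Each row is a subject, each column a marker;
    squared differences of group means stand in for single-marker
    association statistics (an illustrative simplification)."""
    rng = random.Random(seed)
    m = len(cases[0])

    def score(a, b):
        per_marker = []
        for j in range(m):
            d = (sum(r[j] for r in a) / len(a)) - (sum(r[j] for r in b) / len(b))
            per_marker.append(d * d)
        return sum(sorted(per_marker, reverse=True)[:n_top])

    observed = score(cases, controls)
    pooled = cases + controls
    n_case = len(cases)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # permute case/control labels
        if score(pooled[:n_case], pooled[n_case:]) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)
```

Because several moderately associated markers contribute jointly to the sum, the test can reject even when no single marker survives a Bonferroni-style correction, which is the power advantage the abstract describes.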
ERIC Educational Resources Information Center
Oshima, T. C.; Raju, Nambury S.; Nanda, Alice O.
2006-01-01
A new item parameter replication method is proposed for assessing the statistical significance of the noncompensatory differential item functioning (NCDIF) index associated with the differential functioning of items and tests framework. In this new method, a cutoff score for each item is determined by obtaining a (1-alpha ) percentile rank score…
NASA Technical Reports Server (NTRS)
Staubert, R.
1985-01-01
Methods for calculating the statistical significance of excess events and the interpretation of the formally derived values are discussed. It is argued that a simple formula for a conservative estimate should generally be used in order to provide a common understanding of quoted values.
Tables of square-law signal detection statistics for Hann spectra with 50 percent overlap
NASA Technical Reports Server (NTRS)
Deans, Stanley R.; Cullers, D. Kent
1991-01-01
The Search for Extraterrestrial Intelligence, currently being planned by NASA, will require that an enormous amount of data be analyzed in real time by special purpose hardware. It is expected that overlapped Hann data windows will play an important role in this analysis. In order to understand the statistical implication of this approach, it has been necessary to compute detection statistics for overlapped Hann spectra. Tables of signal detection statistics are given for false alarm rates from 10^-14 to 10^-1 and signal detection probabilities from 0.50 to 0.99; the number of computed spectra ranges from 4 to 2000.
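For a single square-law statistic the noise-only output is exponentially distributed, so the normalised threshold follows directly from the false alarm rate. The tables in the report additionally account for the correlation introduced by 50 percent overlapped Hann windows, which this sketch ignores:

```python
import math

def squarelaw_threshold(p_fa):
    """Normalised threshold T for a single square-law detection statistic:
    the noise-only statistic is exponential with unit mean, so P_fa = exp(-T)
    and T = -ln(P_fa). Overlap-induced correlation between spectra is ignored."""
    return -math.log(p_fa)

# Thresholds across the range of false-alarm rates tabulated in the report
for p_fa in (1e-1, 1e-7, 1e-14):
    print(p_fa, round(squarelaw_threshold(p_fa), 2))
```

At the stringent end of the tabulated range (P_fa = 10^-14) the threshold is about 32 times the mean noise power, which shows why extremely low false alarm rates demand strong signals or long integration.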
Kavvoura, Fotini K.; McQueen, Matthew B.; Khoury, Muin J.; Tanzi, Rudolph E.; Bertram, Lars
2008-01-01
The authors evaluated whether there is an excess of statistically significant results in studies of genetic associations with Alzheimer's disease reflecting either between-study heterogeneity or bias. Among published articles on genetic associations entered into the comprehensive AlzGene database (www.alzgene.org) through January 31, 2007, 1,348 studies included in 175 meta-analyses with 3 or more studies each were analyzed. The number of observed studies (O) with statistically significant results (P = 0.05 threshold) was compared with the expected number (E) under different assumptions for the magnitude of the effect size. In the main analysis, the plausible effect size of each association was the summary effect presented in the respective meta-analysis. Overall, 19 meta-analyses (all with eventually nonsignificant summary effects) had a documented excess of O over E: typically, single studies had significant effects pointing in opposite directions, and early summary effects were dissipated over time. Across the whole domain, O was 235 (17.4%), while E was 164.8 (12.2%) (P < 10^-6). The excess showed a predilection for meta-analyses with nonsignificant summary effects and between-study heterogeneity. The excess was seen for all levels of statistical significance and also for studies with borderline P values (P = 0.05-0.10). The excess of significant findings may represent significance-chasing biases in a setting of massive testing. PMID:18779388
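The O-versus-E comparison can be approximated by a binomial tail probability, treating each study as significant with the pooled expected rate. This is a simplification of the authors' procedure, which conditions on study-specific power:

```python
from math import lgamma, log, exp

def excess_significance_p(n_studies, observed_sig, expected_rate):
    """Binomial tail P(X >= observed_sig) for X ~ Bin(n_studies, expected_rate),
    computed in log space so large study counts do not overflow."""
    def log_pmf(k):
        return (lgamma(n_studies + 1) - lgamma(k + 1) - lgamma(n_studies - k + 1)
                + k * log(expected_rate) + (n_studies - k) * log(1 - expected_rate))
    return sum(exp(log_pmf(k)) for k in range(observed_sig, n_studies + 1))

# Figures reported above: O = 235 significant of 1,348 studies, E = 164.8
print(excess_significance_p(1348, 235, 164.8 / 1348))  # well below 1e-6
```

With the reported O and E the tail probability lands far below the paper's P < 10^-6 bound, consistent with an excess of significant findings beyond chance.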
Anomaly detection based on the statistics of hyperspectral imagery
NASA Astrophysics Data System (ADS)
Catterall, Stephen P.
2004-10-01
The purpose of this paper is to introduce a new anomaly detection algorithm for application to hyperspectral imaging (HSI) data. The algorithm uses characterisations of the joint (among wavebands) probability density function (pdf) of HSI data. Traditionally, the pdf has been assumed to be multivariate Gaussian or a mixture of multivariate Gaussians. Other distributions have been considered by previous authors, in particular Elliptically Contoured Distributions (ECDs). In this paper we focus on another distribution, which has only recently been defined and studied. This distribution has a more flexible and extensive set of parameters than the multivariate Gaussian does, yet the pdf takes on a relatively simple mathematical form. The result of all this is a model for the pdf of a hyperspectral image, consisting of a mixture of these distributions. Once a model for the pdf of a hyperspectral image has been obtained, it can be incorporated into an anomaly detector. The new anomaly detector is implemented and applied to some medium wave infra-red (MWIR) hyperspectral imagery. Comparison is made with a well-known anomaly detector, and it will be seen that the results are promising.
NASA Astrophysics Data System (ADS)
Wang, Ping; Dai, Xin-Gang
2016-09-01
The term "APEC Blue" was created to describe the clear-sky days during the Asia-Pacific Economic Cooperation (APEC) summit held in Beijing during November 5-11, 2014. The duration of the APEC Blue is detected from November 1 to November 14 (hereafter the Blue Window) by a statistical moving t test. Observations show that APEC Blue corresponds to low air pollution with respect to PM2.5, PM10, SO2, and NO2 under the strict emission-control measures (ECMs) implemented in Beijing and surrounding areas. Quantitative assessment shows that the ECMs were more effective at reducing aerosols than the gaseous chemical constituents. Statistical investigation reveals that the window also resulted from intensified wind variability as well as weakened static stability of the atmosphere (SSA). Wind and the ECMs played key roles in reducing air pollution during November 1-7 and 11-13, while strict ECMs and weak SSA became dominant during November 7-10 under a weak-wind environment. Moving correlation shows that the emission reduction for aerosols can amplify the apparent wind cleanup effect, leading to significant negative correlations between them, and the period-wise changes in emission rate can be well identified by multi-scale correlations based on wavelet decomposition. In short, this case study demonstrates statistically how human interference modified air quality in the mega-city through controlling local and surrounding emissions in association with meteorological conditions.
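A moving t test of the kind used to delimit the Blue Window slides a split point along the series and computes a two-sample t statistic between the windows on either side; a peak in |t| marks an abrupt change. A minimal sketch with an equal-variance pooled estimate (window length and series are illustrative):

```python
import math

def moving_t(series, window):
    """Moving t test: at each split point, compare the `window` samples
    before and after with an equal-variance two-sample t statistic.
    Returns (split_index, t) pairs."""
    stats = []
    for i in range(window, len(series) - window + 1):
        a = series[i - window:i]
        b = series[i:i + window]
        ma, mb = sum(a) / window, sum(b) / window
        va = sum((x - ma) ** 2 for x in a) / (window - 1)
        vb = sum((x - mb) ** 2 for x in b) / (window - 1)
        sp = math.sqrt((va + vb) / 2) or 1e-12
        stats.append((i, (mb - ma) / (sp * math.sqrt(2.0 / window))))
    return stats

# Step change at index 30 buried in a deterministic wobble
series = ([0.3 * math.sin(7 * i) for i in range(30)]
          + [5 + 0.3 * math.sin(7 * i) for i in range(30, 60)])
best = max(moving_t(series, 10), key=lambda s: abs(s[1]))
print(best[0])  # -> 30
```

Comparing the peak t value against the critical value for 2*window - 2 degrees of freedom gives the significance of the detected change point.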
Potts, T.T.; Hylko, J.M.; Almond, D.
2007-07-01
A company's overall safety program becomes an important consideration to continue performing work and for procuring future contract awards. When injuries or accidents occur, the employer ultimately loses on two counts - increased medical costs and employee absences. This paper summarizes the human and organizational components that contributed to successful safety programs implemented by WESKEM, LLC's Environmental, Safety, and Health Departments located in Paducah, Kentucky, and Oak Ridge, Tennessee. The philosophy of 'safety, compliance, and then production' and programmatic components implemented at the start of the contracts were qualitatively identified as contributing factors resulting in a significant accumulation of safe work hours and an Experience Modification Rate (EMR) of <1.0. Furthermore, a study by the Associated General Contractors of America quantitatively validated components, already found in the WESKEM, LLC programs, as contributing factors to prevent employee accidents and injuries. Therefore, an investment in the human and organizational components now can pay dividends later by reducing the EMR, which is the key to reducing Workers' Compensation premiums. Also, knowing your employees' demographics and taking an active approach to evaluate and prevent fatigue may help employees balance work and non-work responsibilities. In turn, this approach can assist employers in maintaining a healthy and productive workforce. For these reasons, it is essential that safety needs be considered as the starting point when performing work. (authors)
Wang, Q.; Denton, D.L.; Shukla, R.
2000-01-01
As a follow-up to the recommendations of the September 1995 SETAC Pellston Workshop on Whole Effluent Toxicity (WET) regarding test methods and appropriate endpoints, this paper discusses the applications and statistical properties of a statistical criterion based on the minimum significant difference (MSD). The authors examined upper limits on acceptable MSDs as an acceptance criterion in the case of normally distributed data. The implications of this approach are examined in terms of the false negative rate as well as the false positive rate. Results indicate that the proposed approach has reasonable statistical properties. Reproductive data from short-term chronic WET tests with Ceriodaphnia dubia were used to demonstrate the applications of the proposed approach. The data were collected by the North Carolina Department of Environment, Health, and Natural Resources (Raleigh, NC, USA) as part of their National Pollutant Discharge Elimination System program.
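For normally distributed data, the MSD has a simple closed form; the sketch below uses a generic two-sample pooled-variance MSD with a user-supplied critical value. The replicate counts and critical value are illustrative, and the paper's exact formulation is not reproduced here.

```python
import math
from statistics import mean, variance

def msd(control, treatment, t_crit):
    """Minimum significant difference: the smallest difference in means that
    a two-sample t-type comparison could declare significant at t_crit."""
    n1, n2 = len(control), len(treatment)
    pooled = ((n1 - 1) * variance(control) +
              (n2 - 1) * variance(treatment)) / (n1 + n2 - 2)
    return t_crit * math.sqrt(pooled * (1 / n1 + 1 / n2))

control = [10, 12, 11, 13]   # e.g. neonates per control replicate (toy data)
treatment = [8, 9, 10, 9]
m = msd(control, treatment, t_crit=2.447)  # two-sided 5% critical value, 6 df
pct_msd = 100 * m / mean(control)          # MSD as a percentage of control mean
```

Expressing the MSD as a percentage of the control response is what allows an upper acceptance limit to be imposed across tests of different sensitivity.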
Fast and accurate border detection in dermoscopy images using statistical region merging
NASA Astrophysics Data System (ADS)
Celebi, M. Emre; Kingravi, Hassan A.; Iyatomi, Hitoshi; Lee, JeongKyu; Aslandogan, Y. Alp; Van Stoecker, William; Moss, Randy; Malters, Joseph M.; Marghoob, Ashfaq A.
2007-03-01
As a result of advances in skin imaging technology and the development of suitable image processing techniques during the last decade, there has been a significant increase of interest in the computer-aided diagnosis of melanoma. Automated border detection is one of the most important steps in this procedure, since the accuracy of the subsequent steps crucially depends on it. In this paper, a fast and unsupervised approach to border detection in dermoscopy images of pigmented skin lesions based on the Statistical Region Merging algorithm is presented. The method is tested on a set of 90 dermoscopy images. The border detection error is quantified by a metric in which a set of dermatologist-determined borders is used as the ground-truth. The proposed method is compared to six state-of-the-art automated methods (optimized histogram thresholding, orientation-sensitive fuzzy c-means, gradient vector flow snakes, dermatologist-like tumor extraction algorithm, meanshift clustering, and the modified JSEG method) and borders determined by a second dermatologist. The results demonstrate that the presented method achieves both fast and accurate border detection in dermoscopy images.
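The Statistical Region Merging algorithm at the core of this approach can be sketched compactly for grayscale images: sort neighbouring-pixel pairs by intensity difference, then merge regions whose mean difference stays within a statistical predicate. The constants below (q and the δ term) follow the usual SRM formulation but are illustrative choices, not the paper's tuned values.

```python
import math
import numpy as np

def srm_segment(img, q=32.0, g=256.0):
    """Minimal grayscale Statistical Region Merging via union-find."""
    h, w = img.shape
    n = h * w
    parent, size = list(range(n)), [1] * n
    mean = img.astype(float).ravel().tolist()

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def b(sz):  # predicate half-width; delta = 1/(6 n^2)
        return g * math.sqrt(math.log(6.0 * n * n) / (2.0 * q * sz))

    edges = []  # 4-connected pixel pairs, keyed by intensity difference
    for y in range(h):
        for x in range(w):
            i = y * w + x
            if x + 1 < w:
                edges.append((abs(float(img[y, x]) - float(img[y, x + 1])), i, i + 1))
            if y + 1 < h:
                edges.append((abs(float(img[y, x]) - float(img[y + 1, x])), i, i + w))
    edges.sort()
    for _, a, c in edges:
        ra, rc = find(a), find(c)
        if ra != rc and abs(mean[ra] - mean[rc]) <= b(size[ra]) + b(size[rc]):
            parent[rc] = ra
            mean[ra] = (mean[ra] * size[ra] + mean[rc] * size[rc]) / (size[ra] + size[rc])
            size[ra] += size[rc]
    return np.array([find(i) for i in range(n)]).reshape(h, w)
```

For border detection, the lesion border is then taken from the boundary of the merged region containing the lesion.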
NASA Technical Reports Server (NTRS)
Gofford, Jason; Reeves, James N.; Tombesi, Francesco; Braito, Valentina; Turner, T. Jane; Miller, Lance; Cappi, Massimo
2013-01-01
We present the results of a new spectroscopic study of Fe K-band absorption in active galactic nuclei (AGN). Using data obtained from the Suzaku public archive we have performed a statistically driven blind search for Fe XXV Heα and/or Fe XXVI Lyα absorption lines in a large sample of 51 Type 1.0-1.9 AGN. Through extensive Monte Carlo simulations we find that statistically significant absorption is detected at E ≳ 6.7 keV in 20/51 sources at the P_MC ≥ 95 per cent level, which corresponds to approximately 40 per cent of the total sample. In all cases, individual absorption lines are detected independently and simultaneously amongst the two (or three) available X-ray imaging spectrometer detectors, which confirms the robustness of the line detections. The most frequently observed outflow phenomenology consists of two discrete absorption troughs corresponding to Fe XXV Heα and Fe XXVI Lyα at a common velocity shift. From xstar fitting, the mean column density and ionization parameter for the Fe K absorption components are log(N_H/cm^-2) ≈ 23 and log(ξ/erg cm s^-1) ≈ 4.5, respectively. Measured outflow velocities span a continuous range from <1500 km/s up to ~100,000 km/s, with mean and median values of ~0.1c and ~0.056c, respectively. The results of this work are consistent with those recently obtained using XMM-Newton and independently provide strong evidence for the existence of very highly ionized circumnuclear material in a significant fraction of both radio-quiet and radio-loud AGN in the local universe.
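The Monte Carlo significance test (P_MC) follows a generic recipe: simulate the null model many times and count how often it produces a detection statistic at least as extreme as the observed one. A minimal, domain-agnostic sketch; the toy null statistic below stands in for the spectral fitting, which is not reproduced.

```python
import numpy as np

def monte_carlo_p(observed, simulate_null, n_sims=1000, seed=42):
    """P_MC = fraction of null simulations whose statistic reaches the
    observed value; the +1 terms avoid reporting an exact zero p-value."""
    rng = np.random.default_rng(seed)
    null = np.array([simulate_null(rng) for _ in range(n_sims)])
    return (1 + (null >= observed).sum()) / (n_sims + 1)

# toy null: best chance improvement over 100 trial line energies
sim = lambda rng: rng.chisquare(2, size=100).max()
p_weak = monte_carlo_p(9.0, sim)     # modest improvement
p_strong = monte_carlo_p(40.0, sim)  # large improvement
```

Because the null is simulated rather than assumed chi-squared, the look-elsewhere effect of scanning many trial energies is handled automatically.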
Parameter-space correlations of the optimal statistic for continuous gravitational-wave detection
Pletsch, Holger J.
2008-11-15
The phase parameters of matched-filtering searches for continuous gravitational-wave signals are sky position, frequency, and frequency time-derivatives. The space of these parameters features strong global correlations in the optimal detection statistic. For observation times shorter than 1 yr, the orbital motion of the Earth leads to a family of global-correlation equations which describes the 'global maximum structure' of the detection statistic. The solution to each of these equations is a different hypersurface in parameter space. The expected detection statistic is maximal at the intersection of these hypersurfaces. The global maximum structure of the detection statistic from stationary instrumental-noise artifacts is also described by the global-correlation equations. This permits the construction of a veto method which excludes false candidate events.
A flexible spatial scan statistic with a restricted likelihood ratio for detecting disease clusters.
Tango, Toshiro; Takahashi, Kunihiko
2012-12-30
Spatial scan statistics are widely used tools for the detection of disease clusters. In particular, the circular spatial scan statistic proposed by Kulldorff (1997) has been utilized in a wide variety of epidemiological studies and disease surveillance. However, as it cannot detect noncircular, irregularly shaped clusters, many authors have proposed alternative spatial scan statistics, including the elliptic version of Kulldorff's scan statistic. The flexible spatial scan statistic proposed by Tango and Takahashi (2005) has also been used for detecting irregularly shaped clusters. However, because of its heavy computational load, that method restricts the search for candidate clusters to a maximum of 30 nearest neighbors. In this paper, we present a flexible spatial scan statistic implemented with the restricted likelihood ratio proposed by Tango (2008) that (1) eliminates the 30-nearest-neighbor limitation and (2) requires far less computational time than the original flexible spatial scan statistic. Monte Carlo simulation also shows that, as a side effect, it detects clusters of essentially any shape reasonably well as the relative risk of the cluster becomes large. We illustrate the proposed spatial scan statistic with data on mortality from cerebrovascular disease in the Tokyo Metropolitan area, Japan.
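Kulldorff's Poisson scan statistic, which the restricted likelihood ratio modifies, scores a candidate cluster by the log-likelihood ratio below; significance is then assessed by Monte Carlo replication. A minimal sketch (the restriction term of Tango (2008) is omitted, and the case counts are illustrative):

```python
import math

def poisson_llr(c, e, C, E):
    """Kulldorff log-likelihood ratio for a candidate cluster with c cases
    observed and e expected, out of C total cases and E total expected;
    zero unless the cluster's rate exceeds the outside rate."""
    if c == 0 or c / e <= (C - c) / (E - e):
        return 0.0
    return c * math.log(c / e) + (C - c) * math.log((C - c) / (E - e))

# hot candidate: 10 cases where 5 were expected, 100 cases / 100 expected overall
llr = poisson_llr(10, 5.0, 100, 100.0)
```

The scan maximizes this quantity over all candidate clusters; the restricted likelihood ratio additionally requires each constituent region to be individually elevated, which is what removes the need to cap the neighborhood size.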
A Statistical Analysis of Automated and Manually Detected Fires Using Environmental Satellites
NASA Astrophysics Data System (ADS)
Ruminski, M. G.; McNamara, D.
2003-12-01
The National Environmental Satellite and Data Information Service (NESDIS) of the National Oceanic and Atmospheric Administration (NOAA) has been producing an analysis of fires and smoke over the US since 1998. This product underwent significant enhancement in June 2002 with the introduction of the Hazard Mapping System (HMS), an interactive workstation based system that displays environmental satellite imagery (NOAA Geostationary Operational Environmental Satellite (GOES), NOAA Polar Operational Environmental Satellite (POES) and National Aeronautics and Space Administration (NASA) MODIS data) and fire detects from the automated algorithms for each of the satellite sensors. The focus of this presentation is to present statistics compiled on the fire detects since November 2002. The Automated Biomass Burning Algorithm (ABBA) detects fires using GOES East and GOES West imagery. The Fire Identification, Mapping and Monitoring Algorithm (FIMMA) utilizes NOAA POES 15/16/17 imagery and the MODIS algorithm uses imagery from the MODIS instrument on the Terra and Aqua spacecraft. The HMS allows satellite analysts to inspect and interrogate the automated fire detects and the input satellite imagery. The analyst can then delete those detects that are felt to be false alarms and/or add fire points that the automated algorithms have not selected. Statistics are compiled for the number of automated detects from each of the algorithms, the number of automated detects that are deleted and the number of fire points added by the analyst for the contiguous US and immediately adjacent areas of Mexico and Canada. There is no attempt to distinguish between wildfires and control or agricultural fires. A detailed explanation of the automated algorithms is beyond the scope of this presentation. However, interested readers can find a more thorough description by going to www.ssd.noaa.gov/PS/FIRE/hms.html and scrolling down to Individual Fire Layers. For the period November 2002 thru August
A Fast Framework for Abrupt Change Detection Based on Binary Search Trees and Kolmogorov Statistic.
Qi, Jin-Peng; Qi, Jie; Zhang, Qing
2016-01-01
Change-point (CP) detection has attracted considerable attention in data mining and statistics; it is of great practical importance to detect abrupt changes quickly and efficiently in large-scale bioelectric signals. Currently, most existing methods, such as the Kolmogorov-Smirnov (KS) statistic, are time-consuming, especially for large-scale datasets. In this paper, we propose a fast framework for abrupt change detection based on binary search trees (BSTs) and a modified KS statistic, named BSTKS (binary search trees and Kolmogorov statistic). In this method, first, two binary search trees, termed BSTcA and BSTcD, are constructed by a multilevel Haar wavelet transform (HWT); second, three search criteria are introduced in terms of the statistic and variance fluctuations in the diagnosed time series; last, an optimal search path is detected from the root to the leaf nodes of the two BSTs. Studies on both synthetic time series samples and real electroencephalograph (EEG) recordings indicate that the proposed BSTKS can detect abrupt change more quickly and efficiently than the KS, t-statistic (t), and singular-spectrum analysis (SSA) methods, with the shortest computation time, the highest hit rate, the smallest error, and the highest accuracy of the four methods. This study suggests that the proposed BSTKS is very helpful for inspecting useful information in all kinds of bioelectric time series signals. PMID:27413364
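The two-sample KS statistic that BSTKS accelerates can be used directly, if slowly, for change-point detection: scan every split point and keep the one with the largest KS distance. A brute-force reference sketch (this is the baseline the tree search speeds up, not the BSTKS algorithm itself):

```python
import numpy as np

def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov distance: max gap between empirical CDFs."""
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

def ks_change_point(x, min_seg=20):
    """Exhaustive scan over split points; BSTKS reaches the same goal much
    faster by searching a wavelet-built binary tree instead."""
    splits = range(min_seg, len(x) - min_seg)
    d = [ks_distance(x[:i], x[i:]) for i in splits]
    return min_seg + int(np.argmax(d))

rng = np.random.default_rng(1)
signal = np.concatenate([rng.normal(0, 1, 100), rng.normal(3, 1, 100)])
cp = ks_change_point(signal)
```

The exhaustive scan costs O(n^2 log n); replacing it with a root-to-leaf search over a Haar-wavelet tree is what yields the reported speedup.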
Franco, Ana; Gaillard, Vinciane; Cleeremans, Axel; Destrebecqz, Arnaud
2015-12-01
Statistical learning can be used to extract the words from continuous speech. Gómez, Bion, and Mehler (Language and Cognitive Processes, 26, 212-223, 2011) proposed an online measure of statistical learning: They superimposed auditory clicks on a continuous artificial speech stream made up of a random succession of trisyllabic nonwords. Participants were instructed to detect these clicks, which could be located either within or between words. The results showed that, over the length of exposure, reaction times (RTs) increased more for within-word than for between-word clicks. This result has been accounted for by means of statistical learning of the between-word boundaries. However, even though statistical learning occurs without an intention to learn, it nevertheless requires attentional resources. Therefore, this process could be affected by a concurrent task such as click detection. In the present study, we evaluated the extent to which the click detection task indeed reflects successful statistical learning. Our results suggest that the emergence of RT differences between within- and between-word click detection is neither systematic nor related to the successful segmentation of the artificial language. Therefore, instead of being an online measure of learning, the click detection task seems to interfere with the extraction of statistical regularities.
Willems, Sander; Fraiture, Marie-Alice; Deforce, Dieter; De Keersmaecker, Sigrid C J; De Loose, Marc; Ruttink, Tom; Herman, Philippe; Van Nieuwerburgh, Filip; Roosens, Nancy
2016-02-01
Because the number and diversity of genetically modified (GM) crops have significantly increased, their analysis based on real-time PCR (qPCR) methods is becoming increasingly complex and laborious. While several pioneers already investigated Next Generation Sequencing (NGS) as an alternative to qPCR, its practical use has not been assessed for routine analysis. In this study a statistical framework was developed to predict the number of NGS reads needed to detect transgene sequences, to prove their integration into the host genome and to identify the specific transgene event in a sample with known composition. This framework was validated by applying it to experimental data from food matrices composed of pure GM rice, processed GM rice (noodles) or a 10% GM/non-GM rice mixture, revealing some influential factors. Finally, feasibility of NGS for routine analysis of GM crops was investigated by applying the framework to samples commonly encountered in routine analysis of GM crops. PMID:26304412
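The core of such a framework is a coverage calculation: how many reads guarantee, with a given confidence, at least one read from the transgene target. A simplified geometric sketch under a uniform-sampling assumption; the paper's actual model accounts for more factors, and the fractions below are toy values.

```python
import math

def reads_needed(target_frac, gm_frac, confidence=0.99):
    """Smallest N with P(at least one read hits the target) >= confidence,
    where each read independently hits with probability target_frac * gm_frac."""
    p = target_frac * gm_frac
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))

# a target making up half the sequenced material needs only a handful of reads
n_pure = reads_needed(0.5, 1.0)
# a rare target in a 10% GM/non-GM mixture needs proportionally deeper sequencing
n_mix = reads_needed(1e-5, 0.1)
```

Diluting the GM fraction tenfold raises the required read count roughly tenfold, which is why mixtures and processed matrices dominate the sequencing budget.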
Detection and implication of significant temporal b-value variation during earthquake sequences
NASA Astrophysics Data System (ADS)
Gulia, Laura; Tormann, Thessa; Schorlemmer, Danijel; Wiemer, Stefan
2016-04-01
Earthquakes tend to cluster in space and time, and periods of increased seismic activity are also periods of increased seismic hazard. Forecasting models currently used in statistical seismology and in Operational Earthquake Forecasting (e.g. ETAS) consider spatial and temporal changes in activity rates, whilst spatio-temporal changes in the earthquake size distribution, the b-value, are not included. Laboratory experiments on rock samples show an increasing relative proportion of larger events as the system approaches failure, and a sudden reversal of this trend after the main event. The increasing fraction of larger events during the stress-increase period can be represented mathematically by a systematic b-value decrease, while the b-value increases immediately following the stress release. We investigate whether these lab-scale observations also apply to natural earthquake sequences and can help to improve our understanding of the physical processes generating damaging earthquakes. A number of large events nucleated in low b-value regions, and spatial b-value variations have been extensively documented in the past. Detecting temporal b-value evolution with confidence is more difficult, one reason being the very different time scales that have been suggested for a precursory drop in b-value, from a few days to decadal gradients. We demonstrate with detailed case studies of the 2009 M6.3 L'Aquila and 2011 M9 Tohoku earthquakes that significant and meaningful temporal b-value variability can be detected throughout the sequences, which suggests, for example, that foreshock probabilities are not generic but subject to significant spatio-temporal variability. Such conclusions require and motivate the systematic study of many sequences to investigate whether general patterns exist that might eventually be useful for time-dependent or even real-time seismic hazard assessment.
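The b-value time series at the heart of such analyses is usually estimated with Aki's maximum-likelihood formula inside a moving event window; a minimal sketch, with the completeness magnitude and the synthetic catalog being illustrative choices:

```python
import math
import random

def b_value(mags, mc, dm=0.0):
    """Aki (1965) maximum-likelihood b-value for events above completeness mc;
    dm is the magnitude binning width (0 for continuous magnitudes)."""
    m = [x for x in mags if x >= mc]
    return math.log10(math.e) / (sum(m) / len(m) - (mc - dm / 2.0))

# synthetic Gutenberg-Richter catalog with true b = 1 above mc = 0
random.seed(1)
beta = 1.0 * math.log(10)
catalog = [random.expovariate(beta) for _ in range(5000)]
b = b_value(catalog, mc=0.0)
```

Sliding this estimator over consecutive event windows, with bootstrap uncertainties, is the standard way such temporal b-value variations are tracked.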
Metoyer, Candace N.; Walsh, Stephen J.; Tardiff, Mark F.; Chilton, Lawrence
2008-10-30
The detection and identification of weak gaseous plumes using thermal imaging data is complicated by many factors. These include variability due to atmosphere, ground and plume temperature, and background clutter. This paper presents an analysis of one formulation of the physics-based model that describes the at-sensor observed radiance. The motivating question for the analyses performed in this paper is as follows. Given a set of backgrounds, is there a way to predict the background over which the probability of detecting a given chemical will be the highest? Two statistics were developed to address this question. These statistics incorporate data from the long-wave infrared band to predict the background over which chemical detectability will be the highest. These statistics can be computed prior to data collection. As a preliminary exploration into the predictive ability of these statistics, analyses were performed on synthetic hyperspectral images. Each image contained one chemical (either carbon tetrachloride or ammonia) spread across six distinct background types. The statistics were used to generate predictions for the background ranks. Then, the predicted ranks were compared to the empirical ranks obtained from the analyses of the synthetic images. For the simplified images under consideration, the predicted and empirical ranks showed a promising amount of agreement. One statistic accurately predicted the best and worst background for detection in all of the images. Future work may include explorations of more complicated plume ingredients, background types, and noise structures.
Garud, Nandita R; Rosenberg, Noah A
2015-06-01
Soft selective sweeps represent an important form of adaptation in which multiple haplotypes bearing adaptive alleles rise to high frequency. Most statistical methods for detecting selective sweeps from genetic polymorphism data, however, have focused on identifying hard selective sweeps in which a favored allele appears on a single haplotypic background; these methods might be underpowered to detect soft sweeps. Among exceptions is the set of haplotype homozygosity statistics introduced for the detection of soft sweeps by Garud et al. (2015). These statistics, examining frequencies of multiple haplotypes in relation to each other, include H12, a statistic designed to identify both hard and soft selective sweeps, and H2/H1, a statistic that, conditional on high H12 values, seeks to distinguish between hard and soft sweeps. A challenge in the use of H2/H1 is that its range depends on the associated value of H12, so that equal H2/H1 values might provide different levels of support for a soft sweep model at different values of H12. Here, we enhance the H12 and H2/H1 haplotype homozygosity statistics for selective sweep detection by deriving the upper bound on H2/H1 as a function of H12, thereby generating a statistic that normalizes H2/H1 to lie between 0 and 1. Through a reanalysis of resequencing data from inbred lines of Drosophila, we show that the enhanced statistic both strengthens interpretations obtained with the unnormalized statistic and leads to empirical insights that are less readily apparent without the normalization. PMID:25891325
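The unnormalized statistics are simple functions of sorted haplotype frequencies; a sketch of H1, H12 and H2/H1 (the upper-bound normalization derived in the paper is not reproduced here, and the example spectra are illustrative):

```python
def haplotype_stats(freqs):
    """H1 = sum of squared haplotype frequencies; H12 pools the two most
    frequent haplotypes, boosting power for soft sweeps; H2/H1 drops the
    top haplotype, helping separate hard from soft sweeps."""
    p = sorted(freqs, reverse=True)
    h1 = sum(f * f for f in p)
    h12 = h1 + 2 * p[0] * p[1]
    h2 = h1 - p[0] ** 2
    return h1, h12, h2 / h1

# hard-sweep-like vs soft-sweep-like frequency spectra
hard = haplotype_stats([0.9, 0.05, 0.05])
soft = haplotype_stats([0.45, 0.45, 0.10])
```

Both spectra give high H12, but the soft spectrum gives a much larger H2/H1, which is the signal the conditional comparison exploits.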
NASA Astrophysics Data System (ADS)
Fujimoto, K.; Yanagisawa, T.; Uetsuhara, M.
Automated detection and tracking of faint objects in optical, or bearing-only, sensor imagery is a topic of immense interest in space surveillance. Robust methods in this realm will lead to better space situational awareness (SSA) while reducing the cost of sensors and optics. They are especially relevant in the search for high area-to-mass ratio (HAMR) objects, as their apparent brightness can change significantly over time. A track-before-detect (TBD) approach has been shown to be suitable for faint, low signal-to-noise ratio (SNR) images of resident space objects (RSOs). TBD does not rely upon the extraction of feature points within the image based on some thresholding criterion, but rather directly takes as input the intensity information from the image file. Not only is all of the available information from the image used, but TBD also avoids the computational intractability of the conventional feature-based line detection (i.e., "string of pearls") approach to track detection for low-SNR data. Implementation of TBD rooted in finite set statistics (FISST) theory has been proposed recently by Vo, et al. Compared to other TBD methods applied so far to SSA, such as the stacking method or multi-pass multi-period denoising, the FISST approach is statistically rigorous and has been shown to be more computationally efficient, thus paving the way toward on-line processing. In this paper, we intend to apply a multi-Bernoulli filter to actual CCD imagery of RSOs. The multi-Bernoulli filter can explicitly account for the birth and death of multiple targets in a measurement arc. TBD is achieved via a sequential Monte Carlo implementation. Preliminary results with simulated single-target data indicate that a Bernoulli filter can successfully track and detect objects with measurement SNR as low as 2.4. Although the advent of fast-cadence scientific CMOS sensors has made the automation of faint object detection a realistic goal, it is nonetheless a difficult goal, as measurements
Statistical signal processing for detection of buried land mines using quadrupole resonance
NASA Astrophysics Data System (ADS)
Liu, Feng; Tantum, Stacy L.; Collins, Leslie M.; Carin, Lawrence
2000-08-01
Quadrupole resonance (QR) is a technique that discriminates mines from clutter by exploiting unique properties of explosives, rather than attributes of the mine that exist in many forms of anthropic clutter. After exciting the explosive with a properly designed electromagnetic-induction (EMI) system, one attempts to sense late-time spin echoes, which are characterized by radiation at particular frequencies. It is this narrow-band radiation that indicates the presence of explosives, since the effect is not seen in most clutter, natural or anthropic. However, explosives detection via QR is complicated by several practical issues. First, the late-time radiation is often very weak, particularly for TNT, and therefore the signal-to-noise ratio must be high for extracting the QR response. Further, the frequency at which the radiation occurs is often a strong function of the background environment, and therefore in practice the QR radiation frequency is not known a priori. Also, at frequencies of interest there is a significant amount of background radiation, which induces radio-frequency interference (RFI). In addition, the response properties of the system are sensitive to the height of the sensor above the ground, and the QR sensor effectively becomes 'de-tuned'. Finally, present QR systems cannot detect the explosive in metal-cased mines, so the system and associated signal processing must be extended to also operate as a metal detector. Previously, we have shown that adaptive noise cancellation techniques, in particular the least-mean-square algorithm, provide an effective means of RFI mitigation and can dramatically improve QR detection. In this paper we discuss several signal processing tools we have developed to further enhance the utility of QR explosives detection. In particular, with regard to the uncertainties concerning the background environment and sensor height, we explore statistical signal processing strategies to rigorously account for
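The least-mean-square canceller mentioned above can be sketched compactly: a reference channel that sees the RFI but not the QR response is filtered to predict, and subtract, the RFI in the primary channel. Filter length, step size, and the toy signals are illustrative assumptions, not the authors' configuration.

```python
import numpy as np

def lms_cancel(primary, reference, n_taps=8, mu=0.01):
    """LMS adaptive noise cancellation: the error signal (primary minus the
    filter's RFI prediction) converges to the RFI-suppressed QR signal."""
    w = np.zeros(n_taps)
    out = np.zeros(len(primary))
    for n in range(n_taps, len(primary)):
        x = reference[n - n_taps:n][::-1]   # most recent reference samples
        e = primary[n] - w @ x              # cancellation residual
        w += 2 * mu * e * x                 # stochastic gradient update
        out[n] = e
    return out

t = np.arange(4000)
rfi = np.sin(0.2 * t)          # narrow-band interference seen by both antennas
qr = 0.1 * np.sin(0.5 * t)     # weak "QR" response buried in the RFI
cleaned = lms_cancel(qr + rfi, rfi)
```

Because the filter can only synthesize components correlated with the reference, the weak QR response passes through while the RFI is driven toward zero.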
Huang, Ruili; Southall, Noel; Xia, Menghang; Cho, Ming-Hsuang; Jadhav, Ajit; Nguyen, Dac-Trung; Inglese, James; Tice, Raymond R.; Austin, Christopher P.
2009-01-01
In support of the U.S. Tox21 program, we have developed a simple and chemically intuitive model we call weighted feature significance (WFS) to predict the toxicological activity of compounds, based on the statistical enrichment of structural features in toxic compounds. We trained and tested the model on the following: (1) data from quantitative high-throughput screening cytotoxicity and caspase activation assays conducted at the National Institutes of Health Chemical Genomics Center, (2) data from Salmonella typhimurium reverse mutagenicity assays conducted by the U.S. National Toxicology Program, and (3) hepatotoxicity data published in the Registry of Toxic Effects of Chemical Substances. Enrichments of structural features in toxic compounds are evaluated for their statistical significance and compiled into a simple additive model of toxicity and then used to score new compounds for potential toxicity. The predictive power of the model for cytotoxicity was validated using an independent set of compounds from the U.S. Environmental Protection Agency tested also at the National Institutes of Health Chemical Genomics Center. We compared the performance of our WFS approach with classical classification methods such as Naive Bayesian clustering and support vector machines. In most test cases, WFS showed similar or slightly better predictive power, especially in the prediction of hepatotoxic compounds, where WFS appeared to have the best performance among the three methods. The new algorithm has the important advantages of simplicity, power, interpretability, and ease of implementation. PMID:19805409
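The enrichment significance underlying this kind of feature scoring is typically a hypergeometric tail probability: how surprising it is that k of the n compounds carrying a feature are toxic. A generic sketch with toy counts; WFS's exact weighting scheme is not reproduced here.

```python
import math

def enrichment_p(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n): the chance that at least
    k of n feature-bearing compounds are toxic when K of N compounds are
    toxic overall; a small value marks a toxicity-enriched feature."""
    return sum(math.comb(K, i) * math.comb(N - K, n - i) / math.comb(N, n)
               for i in range(k, min(n, K) + 1))

# all 5 compounds with the feature are toxic, in a library of 10 with 5 toxic
p = enrichment_p(5, 5, 5, 10)
```

Per-feature p-values like this one can then be log-transformed and summed into an additive toxicity score for a new compound.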
Zhang, Han; Ni, Weiping; Yan, Weidong; Bian, Hui; Wu, Junzheng
2014-01-01
A novel fast SAR image change detection method is presented in this paper. Based on a Bayesian approach, the prior information that speckle follows the Nakagami distribution is incorporated into the difference image (DI) generation process. The new DI performs much better than the familiar log-ratio (LR) DI as well as the cumulant-based Kullback-Leibler divergence (CKLD) DI. The statistical region merging (SRM) approach is introduced to the change detection context for the first time. A new clustering procedure with the region variance as the statistical inference variable is presented, tailored to SAR image change detection, with only two classes in the final map: the unchanged and changed classes. The most prominent advantages of the proposed modified SRM (MSRM) method are its ability to cope with noise corruption and its quick implementation. Experimental results show that the proposed method is superior in both change detection accuracy and operational efficiency.
Webb-Robertson, Bobbie-Jo M.; McCue, Lee Ann; Waters, Katrina M.; Matzke, Melissa M.; Jacobs, Jon M.; Metz, Thomas O.; Varnum, Susan M.; Pounds, Joel G.
2010-11-01
Liquid chromatography-mass spectrometry-based (LC-MS) proteomics uses peak intensities of proteolytic peptides to infer the differential abundance of peptides/proteins. However, substantial run-to-run variability in peptide intensities and observations (presence/absence) of peptides makes data analysis quite challenging. The missing abundance values in LC-MS proteomics data are difficult to address with traditional imputation-based approaches because the mechanisms by which data are missing are unknown a priori. Data can be missing due to random mechanisms such as experimental error, or non-random mechanisms such as a true biological effect. We present a statistical approach that uses a test of independence known as a G-test to test the null hypothesis of independence between the number of missing values and the experimental groups. We pair the G-test results evaluating independence of missing data (IMD) with a standard analysis of variance (ANOVA) that uses only means and variances computed from the observed data. Each peptide is therefore represented by two statistical confidence metrics, one for qualitative differential observation and one for quantitative differential intensity. We use two simulated and two real LC-MS datasets to demonstrate the robustness and sensitivity of the ANOVA-IMD approach for assigning confidence to peptides with significant differential abundance among experimental groups.
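A G-test of independence on a 2x2 table of observed/missing counts per group can be written directly from its definition. The counts below are invented for illustration; the p-value uses the closed form erfc(sqrt(G/2)) for the chi-square survival function with one degree of freedom:

```python
from math import log, sqrt, erfc

def g_test_2x2(obs_a, miss_a, obs_b, miss_b):
    """G-test of independence between experimental group and observation
    status (observed vs missing) for one peptide. Returns (G, p); the
    p-value uses the chi-square survival function for 1 degree of freedom,
    which reduces to erfc(sqrt(G/2))."""
    table = [[obs_a, miss_a], [obs_b, miss_b]]
    n = obs_a + miss_a + obs_b + miss_b
    rows = [obs_a + miss_a, obs_b + miss_b]
    cols = [obs_a + obs_b, miss_a + miss_b]
    g = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            if table[i][j] > 0:
                g += 2.0 * table[i][j] * log(table[i][j] / expected)
    return g, erfc(sqrt(g / 2.0))

# Hypothetical peptide: observed in 9/10 runs of group A but only 2/10 of
# group B, suggesting non-random missingness (a qualitative group difference).
g, p = g_test_2x2(obs_a=9, miss_a=1, obs_b=2, miss_b=8)
```

In the ANOVA-IMD pairing described above, this qualitative p-value would sit alongside a quantitative ANOVA p-value computed from the observed intensities only.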
Detection of microcalcifications in mammograms using statistical measures based region-growing
NASA Astrophysics Data System (ADS)
Shanmugavadivu, Pitchai; Lakshmi Narayanan, S. G.
2013-01-01
A novel technique to detect microcalcifications in digital mammograms, presented in this paper, uses statistical measures, namely mean and variance, as the criterion to classify the pixels representing microcalcifications. This method has proved its credentials by accurately segmenting the microcalcifications in the mammogram image. The approach fixes the boundary of the microcalcifications accurately, which confirms its qualitative performance.
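A minimal mean/variance region-growing segmenter, in the spirit of (but not identical to) the method described, might look like the following. The synthetic 10x10 image, the 4-connectivity, and the threshold are invented for illustration:

```python
from collections import deque

def region_grow(img, seed, thresh=3.0):
    """Grow a region from seed, adding 4-neighbours whose intensity lies
    within thresh standard deviations of the current region mean (a std
    floor of 1.0 avoids a degenerate zero-variance criterion)."""
    h, w = len(img), len(img[0])
    in_region = {seed}
    vals = [img[seed[0]][seed[1]]]
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        mean = sum(vals) / len(vals)
        std = max((sum((v - mean) ** 2 for v in vals) / len(vals)) ** 0.5, 1.0)
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < h and 0 <= nc < w and (nr, nc) not in in_region \
                    and abs(img[nr][nc] - mean) <= thresh * std:
                in_region.add((nr, nc))
                vals.append(img[nr][nc])
                queue.append((nr, nc))
    return in_region

# Synthetic 10x10 image: background 50 with a bright 3x3 "microcalcification".
img = [[50] * 10 for _ in range(10)]
for r in range(4, 7):
    for c in range(4, 7):
        img[r][c] = 200
region = region_grow(img, (5, 5))
```

Growth stops exactly at the intensity boundary because background pixels fall far outside the region's mean/variance criterion.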
Dual-band, infrared buried mine detection using a statistical pattern recognition approach
Buhl, M.R.; Hernandez, J.E.; Clark, G.A.; Sengupta, S.K.
1993-08-01
The main objective of this work was to detect surrogate land mines, which were buried in clay and sand, using dual-band, infrared images. A statistical pattern recognition approach was used to achieve this objective. This approach is discussed and results of applying it to real images are given.
ERIC Educational Resources Information Center
Turk-Browne, Nicholas B.; Scholl, Brian J.; Chun, Marvin M.; Johnson, Marcia K.
2009-01-01
Our environment contains regularities distributed in space and time that can be detected by way of statistical learning. This unsupervised learning occurs without intent or awareness, but little is known about how it relates to other types of learning, how it affects perceptual processing, and how quickly it can occur. Here we use fMRI during…
NASA Astrophysics Data System (ADS)
Goovaerts, P.; Jacquez, G. M.; Marcus, A. W.
2004-12-01
Spatial data are periodically collected and processed to monitor, analyze and interpret developments in our changing environment. Remote sensing is a modern way of data collection and has seen enormous growth since the launch of modern satellites and the development of airborne sensors. In particular, the recent availability of high spatial resolution hyperspectral imagery (spatial resolution of less than 5 meters, with data collected over 64 or more bands of electromagnetic radiation for each pixel) offers great potential to significantly enhance environmental mapping and our ability to model spatial systems. High spatial resolution imagery contains a remarkable quantity of information that could be used to analyze spatial breaks (boundaries), areas of similarity (clusters), and spatial autocorrelation (associations) across the landscape. This paper addresses the specific issue of soil disturbance detection, which could indicate the presence of land mines or recent movements of troops and heavy equipment. A challenge presented by soil detection is to retain the measurement of fine-scale features (i.e. mineral soil changes, organic content changes, vegetation disturbance related changes, aspect changes) while still covering proportionally large spatial areas. An additional difficulty is that no ground data might be available for the calibration of spectral signatures, and little might be known about the size of patches of disturbed soils to be detected. This paper describes a new technique for automatic target detection which capitalizes on both spatial correlation and correlation across spectral bands, does not require any a priori information on the target spectral signature, but does not allow discrimination between targets. This approach successively involves a multivariate statistical analysis (principal component analysis) of all spectral bands, a geostatistical filtering of noise and regional background in the first principal components using factorial kriging, and…
NASA Astrophysics Data System (ADS)
Choquet, Élodie; Pueyo, Laurent; Soummer, Rémi; Perrin, Marshall D.; Hagan, J. Brendan; Gofas-Salas, Elena; Rajan, Abhijith; Aguilar, Jonathan
2015-09-01
The ALICE program, for Archival Legacy Investigation of Circumstellar Environment, is currently conducting a virtual survey of about 400 stars by re-analyzing the HST-NICMOS coronagraphic archive with advanced post-processing techniques. We present here the strategy that we adopted to identify detections and potential candidates for follow-up observations, and we give a preliminary overview of our detections. We present a statistical analysis conducted to evaluate the confidence level of these detections and the completeness of our candidate search.
Detection and classification using higher-order statistics of optical matched filters
NASA Astrophysics Data System (ADS)
Sadler, Brian M.
1990-09-01
In this paper we consider the problem of detection and classification of signals in the presence of additive Gaussian noise of unknown covariance, using higher than second-order statistics (HOS) of the output of a matched filter. Specifically, we apply the HOS-based method developed in [1,2] to phase-only matched filters. The main result of this paper is that the HOS-based statistic is appropriate for use with phase-only matched filter (POMF) outputs. Simulation results are presented which indicate the ability of the matched filter and the POMF, when augmented with HOS, to detect a 2-d signal at signal-to-noise ratios below which the matched filters alone are incapable of making a detection.
NASA Astrophysics Data System (ADS)
Wang, H. J.; Shi, W. L.; Chen, X. H.
2006-05-01
The West Development Policy being implemented in China is causing significant land use and land cover (LULC) changes in West China. With the up-to-date satellite database of the Global Land Cover Characteristics Database (GLCCD) that characterizes the lower boundary conditions, the regional climate model RIEMS-TEA is used to simulate possible impacts of the significant LULC variation. The model was run for five continuous three-month periods from 1 June to 1 September of 1993, 1994, 1995, 1996, and 1997, and the results of the five groups are examined by means of a Student's t-test to identify the statistical significance of regional climate variation. The main results are: (1) The regional climate is affected by the LULC variation because the equilibrium of water and heat transfer in the air-vegetation interface is changed. (2) The integrated impact of the LULC variation on regional climate is not limited to West China, where the LULC varies, but extends to some areas in the model domain where the LULC does not vary at all. (3) The East Asian monsoon system and its vertical structure are adjusted by the large scale LULC variation in western China, where the consequences are the enhancement of the westward water vapor transfer from the east and the relevant increase of wet-hydrostatic energy in the middle-upper atmospheric layers. (4) The ecological engineering in West China affects significantly the regional climate in Northwest China, North China and the middle-lower reaches of the Yangtze River; there are obvious effects in South, Northeast, and Southwest China, but minor effects in Tibet.
NASA Astrophysics Data System (ADS)
Ren, W. X.; Lin, Y. Q.; Fang, S. E.
2011-11-01
One of the key issues in vibration-based structural health monitoring is to extract the damage-sensitive but environment-insensitive features from sampled dynamic response measurements and to carry out the statistical analysis of these features for structural damage detection. A new damage feature is proposed in this paper by using the system matrices of the forward innovation model based on the covariance-driven stochastic subspace identification of a vibrating system. To overcome the variations of the system matrices, a non-singularity transposition matrix is introduced so that the system matrices are normalized to their standard forms. For reducing the effects of modeling errors, noise and environmental variations on measured structural responses, a statistical pattern recognition paradigm is incorporated into the proposed method. The Mahalanobis and Euclidean distance decision functions of the damage feature vector are adopted by defining a statistics-based damage index. The proposed structural damage detection method is verified against one numerical signal and two numerical beams. It is demonstrated that the proposed statistics-based damage index is sensitive to damage and shows some robustness to the noise and false estimation of the system ranks. The method is capable of locating damage of the beam structures under different types of excitations. The robustness of the proposed damage detection method to the variations in environmental temperature is further validated in a companion paper by a reinforced concrete beam tested in the laboratory and a full-scale arch bridge tested in the field.
Statistical detection of slow-mode waves in solar polar regions with SDO/AIA
Su, J. T.
2014-10-01
Observations from the Atmospheric Imaging Assembly (AIA) on board the Solar Dynamics Observatory are utilized to statistically investigate the propagating quasi-periodic oscillations in the solar polar plume and inter-plume regions. On average, the periods are found to be nearly equal in the three coronal channels of AIA 171 Å, 193 Å, and 211 Å, and the wavelengths increase with temperature from 171 Å through 193 Å to 211 Å. The phase speeds may be inferred from the above parameters. Furthermore, the speed ratios v_193/v_171 and v_211/v_171 are derived, e.g., 1.4 ± 0.8 and 2.0 ± 1.9 in the plume regions, respectively, which are equivalent to the theoretical values for acoustic waves. We find that there are no significant differences in the detected parameters between the plume and inter-plume regions. To our knowledge, this is the first time that the phase speeds of slow-mode waves have been obtained simultaneously in the three channels in open coronal magnetic structures, owing to the method adopted in the present work, which is able to minimize the influence of jets or eruptions on wave signals.
Efficient detection of wound-bed and peripheral skin with statistical colour models.
Veredas, Francisco J; Mesa, Héctor; Morente, Laura
2015-04-01
A pressure ulcer is a clinical pathology of localised damage to the skin and underlying tissue caused by pressure, shear or friction. Reliable diagnosis supported by precise wound evaluation is crucial for successful treatment decisions. This paper presents a computer-vision approach to wound-area detection based on statistical colour models. Starting with a training set consisting of 113 real wound images, colour histogram models are created for four different tissue types. Back-projections of colour pixels on those histogram models are used, from a Bayesian perspective, to get an estimate of the posterior probability of a pixel belonging to any of those tissue classes. Performance measures obtained from contingency tables based on a gold standard of segmented images supplied by experts have been used for model selection. The resulting fitted model has been validated on a set consisting of 322 wound images manually segmented and labelled by expert clinicians. The final fitted segmentation model shows robustness and gives high mean performance rates [AUC: .9426 (SD .0563); accuracy: .8777 (SD .0799); F-score: .7389 (SD .1550); Cohen's kappa: .6585 (SD .1787)] when segmenting significant wound areas that include healing tissues.
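Reduced to its core, the back-projection idea is: build per-class colour histograms, then assign each pixel to the class maximizing P(colour | class) P(class). A sketch on synthetic colours follows; the RGB means, class priors, and bin count are assumptions for illustration, not the paper's fitted four-tissue model:

```python
import numpy as np

def fit_histograms(pixels_by_class, bins=8):
    """Per-class colour histograms over a bins^3-quantized RGB cube,
    normalized (with light smoothing) to estimates of P(colour | class)."""
    hists = {}
    width = 256 // bins
    for cls, px in pixels_by_class.items():
        idx = (px // width).astype(int)
        h = np.zeros((bins, bins, bins))
        np.add.at(h, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
        hists[cls] = (h + 1e-6) / (h.sum() + 1e-6 * bins ** 3)
    return hists

def classify(pixel, hists, priors, bins=8):
    """MAP tissue class of one RGB pixel: argmax P(colour|class) * P(class)."""
    i, j, k = (np.asarray(pixel) // (256 // bins)).astype(int)
    return max(hists, key=lambda c: hists[c][i, j, k] * priors[c])

# Hypothetical training pixels (colours and priors are invented): reddish
# wound tissue vs tan peripheral skin.
rng = np.random.default_rng(7)
wound = np.clip(rng.normal([180, 40, 40], 15, size=(5000, 3)), 0, 255)
skin = np.clip(rng.normal([210, 170, 140], 15, size=(5000, 3)), 0, 255)
hists = fit_histograms({"wound": wound, "skin": skin})
priors = {"wound": 0.3, "skin": 0.7}
```

Applying `classify` to every pixel of an image yields the segmentation map that the contingency-table performance measures would then be computed from.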
Kossobokov, V.G.; Romashkova, L.L.; Keilis-Borok, V. I.; Healy, J.H.
1999-01-01
Algorithms M8 and MSc (i.e., the Mendocino Scenario) were used in a real-time intermediate-term research prediction of the strongest earthquakes in the Circum-Pacific seismic belt. Predictions are made by M8 first. Then, the areas of alarm are reduced by MSc at the cost that some earthquakes are missed in the second approximation of prediction. In 1992-1997, five earthquakes of magnitude 8 and above occurred in the test area: all of them were predicted by M8 and MSc identified correctly the locations of four of them. The space-time volume of the alarms is 36% and 18%, correspondingly, when estimated with a normalized product measure of empirical distribution of epicenters and uniform time. The statistical significance of the achieved results is beyond 99% both for M8 and MSc. For magnitude 7.5+, 10 out of 19 earthquakes were predicted by M8 in 40% and five were predicted by M8-MSc in 13% of the total volume considered. This implies a significance level of 81% for M8 and 92% for M8-MSc. The lower significance levels might result from a global change in seismic regime in 1993-1996, when the rate of the largest events has doubled and all of them become exclusively normal or reversed faults. The predictions are fully reproducible; the algorithms M8 and MSc in complete formal definitions were published before we started our experiment [Keilis-Borok, V.I., Kossobokov, V.G., 1990. Premonitory activation of seismic flow: Algorithm M8, Phys. Earth and Planet. Inter. 61, 73-83; Kossobokov, V.G., Keilis-Borok, V.I., Smith, S.W., 1990. Localization of intermediate-term earthquake prediction, J. Geophys. Res., 95, 19763-19772; Healy, J.H., Kossobokov, V.G., Dewey, J.W., 1992. A test to evaluate the earthquake prediction algorithm, M8. U.S. Geol. Surv. OFR 92-401]. M8 is available from the IASPEI Software Library [Healy, J.H., Keilis-Borok, V.I., Lee, W.H.K. (Eds.), 1997. Algorithms for Earthquake Statistics and Prediction, Vol. 6. IASPEI Software Library]. © 1999 Elsevier
Probability of Detection (POD) as a statistical model for the validation of qualitative methods.
Wehling, Paul; LaBudde, Robert A; Brunelle, Sharon L; Nelson, Maria T
2011-01-01
A statistical model is presented for use in validation of qualitative methods. This model, termed Probability of Detection (POD), harmonizes the statistical concepts and parameters between quantitative and qualitative method validation. POD characterizes method response with respect to concentration as a continuous variable. The POD model provides a tool for graphical representation of response curves for qualitative methods. In addition, the model allows comparisons between candidate and reference methods, and provides calculations of repeatability, reproducibility, and laboratory effects from collaborative study data. Single laboratory study and collaborative study examples are given.
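A minimal POD computation might estimate POD as detections/trials per concentration and interpolate the level at which POD crosses 0.5 (often called the LOD50). The data below are hypothetical, not from any collaborative study, and the linear interpolation is a simplification of the continuous POD curve the model describes:

```python
def pod_curve(levels):
    """POD at each level as detections/trials, plus linear interpolation of
    the concentration where POD crosses 0.5 (the LOD50).
    levels: (concentration, detections, trials), sorted by concentration."""
    pods = [(c, d / n) for c, d, n in levels]
    lod50 = None
    for (c0, p0), (c1, p1) in zip(pods, pods[1:]):
        if p0 < 0.5 <= p1:
            lod50 = c0 + (0.5 - p0) * (c1 - c0) / (p1 - p0)
            break
    return pods, lod50

# Hypothetical data: (spiking level, positives, replicates); the level-0 row
# gives the method's false-positive rate.
data = [(0.0, 1, 60), (0.1, 12, 60), (1.0, 45, 60), (10.0, 60, 60)]
pods, lod50 = pod_curve(data)
```

Comparing candidate and reference methods then amounts to comparing their POD curves (and confidence bands) point-wise across concentration.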
NASA Astrophysics Data System (ADS)
Wilson, Mark; Mitra, Sunanda; Roberson, Glenn H.; Shieh, Yao-Yang
1997-10-01
Currently early detection of breast cancer is primarily accomplished by mammography, and suspicious findings may lead to a decision for performing a biopsy. Digital enhancement and pattern recognition techniques may aid in early detection of some patterns such as microcalcification clusters indicating onset of DCIS (ductal carcinoma in situ), which accounts for 20% of all mammographically detected breast cancers and could be treated when detected early. These individual calcifications are hard to detect due to size and shape variability and inhomogeneous background texture. Our study addresses only early detection of microcalcifications that allows the radiologist to interpret the x-ray findings in computer-aided enhanced form more easily than evaluating the x-ray film directly. We present an algorithm which locates microcalcifications based on local grayscale variability of tissue structures and image statistics. Threshold filters with lower and upper bounds computed from the image statistics of the entire image and selected subimages were designed to enhance the entire image. This enhanced image was used as the initial image for identifying the microcalcifications based on the variable box threshold filters at different resolutions. The test images came from the Texas Tech University Health Sciences Center and the MIAS mammographic database, which are classified into various categories including microcalcifications. Classification of other types of abnormalities in mammograms based on their characteristic features is addressed in later studies.
ERIC Educational Resources Information Center
Harrison, Judith; Thompson, Bruce; Vannest, Kimberly J.
2009-01-01
This article reviews the literature on interventions targeting the academic performance of students with attention-deficit/hyperactivity disorder (ADHD) and does so within the context of the statistical significance testing controversy. Both the arguments for and against null hypothesis statistical significance tests are reviewed. Recent standards…
ERIC Educational Resources Information Center
Weigle, David C.
The purposes of the present paper are to address the historical development of statistical significance testing and to briefly examine contemporary practices regarding such testing in the light of these historical origins. Precursors leading to the advent of statistical significance testing are examined as are more recent controversies surrounding…
Avalappampatty Sivasamy, Aneetha; Sundan, Bose
2015-01-01
The ever expanding communication requirements in today's world demand extensive and efficient network systems with equally efficient and reliable security features integrated for safe, confident, and secured communication and data transfer. Providing effective security protocols for any network environment, therefore, assumes paramount importance. Attempts are continuously being made to design more efficient and dynamic network intrusion detection models. In this work, an approach based on Hotelling's T² method, a multivariate statistical analysis technique, has been employed for intrusion detection, especially in network environments. Components such as preprocessing, multivariate statistical analysis, and attack detection have been incorporated in developing the multivariate Hotelling's T² statistical model, and necessary profiles have been generated based on the T-square distance metrics. With a threshold range obtained using the central limit theorem, observed traffic profiles have been classified either as normal or attack types. Performance of the model, as evaluated through validation and testing using the KDD Cup'99 dataset, has shown very high detection rates for all classes with low false alarm rates. Accuracy of the model presented in this work, in comparison with the existing models, has been found to be much better. PMID:26357668
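The core of a Hotelling's T²-style detector is a Mahalanobis-type distance of each observed profile from a baseline of normal traffic. A sketch on synthetic features follows; the feature set, scales, and example observations are illustrative assumptions, not the paper's KDD Cup'99 pipeline or its central-limit-theorem threshold:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical baseline of normal traffic profiles: 500 samples of 3 features
# (e.g. packets/s, bytes/s, distinct ports); the scales are invented.
baseline = rng.normal([100.0, 5000.0, 8.0], [10.0, 400.0, 2.0], size=(500, 3))
mu = baseline.mean(axis=0)
S_inv = np.linalg.inv(np.cov(baseline, rowvar=False))

def t_squared(x):
    """Hotelling's T-square distance of one observed profile from baseline."""
    d = np.asarray(x) - mu
    return float(d @ S_inv @ d)

normal_obs = [102.0, 5100.0, 9.0]
attack_obs = [400.0, 90000.0, 60.0]   # e.g. a flooding-attack profile
```

A profile would be flagged as an attack when its T² value exceeds a threshold chosen from the baseline distribution.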
Nicol, Samuel; Roach, Jennifer K.; Griffith, Brad
2013-01-01
Over the past 50 years, the number and size of high-latitude lakes have decreased throughout many regions; however, individual lake trends have been variable in direction and magnitude. This spatial heterogeneity in lake change makes statistical detection of temporal trends challenging, particularly in small analysis areas where weak trends are difficult to separate from inter- and intra-annual variability. Factors affecting trend detection include inherent variability, trend magnitude, and sample size. In this paper, we investigated how the statistical power to detect average linear trends in lake size of 0.5, 1.0 and 2.0 %/year was affected by the size of the analysis area and the number of years of monitoring in National Wildlife Refuges in Alaska. We estimated power for large (930–4,560 sq km) study areas within refuges and for 2.6, 12.9, and 25.9 sq km cells nested within study areas over temporal extents of 4–50 years. We found that: (1) trends in study areas could be detected within 5–15 years, (2) trends smaller than 2.0 %/year would take >50 years to detect in cells within study areas, and (3) there was substantial spatial variation in the time required to detect change among cells. Power was particularly low in the smallest cells which typically had the fewest lakes. Because small but ecologically meaningful trends may take decades to detect, early establishment of long-term monitoring will enhance power to detect change. Our results have broad applicability and our method is useful for any study involving change detection among variable spatial and temporal extents.
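A power analysis of this kind can be mimicked with a small Monte Carlo: simulate series with and without a linear trend, take the null distribution's 95th-percentile slope as the critical value, and count how often the trended series exceed it. The trend and noise magnitudes below are illustrative only, not the study's estimates for Alaskan refuges:

```python
import random
import statistics

def ols_slope(y):
    """Least-squares slope of y against time indices 0..n-1."""
    n = len(y)
    xbar = (n - 1) / 2.0
    ybar = statistics.fmean(y)
    sxy = sum((x - xbar) * (v - ybar) for x, v in enumerate(y))
    sxx = sum((x - xbar) ** 2 for x in range(n))
    return sxy / sxx

def trend_power(trend, sigma, years, nsim=2000, alpha=0.05, seed=1):
    """Monte Carlo power: fraction of simulated trended series whose fitted
    slope exceeds the (1 - alpha) quantile of no-trend slopes."""
    rng = random.Random(seed)
    null = sorted(ols_slope([rng.gauss(0.0, sigma) for _ in range(years)])
                  for _ in range(nsim))
    crit = null[int((1 - alpha) * nsim)]
    hits = sum(ols_slope([trend * t + rng.gauss(0.0, sigma) for t in range(years)]) > crit
               for _ in range(nsim))
    return hits / nsim

# Hypothetical: a 2 %/yr change in lake size against 5 % interannual noise.
p_5yr = trend_power(trend=2.0, sigma=5.0, years=5)
p_15yr = trend_power(trend=2.0, sigma=5.0, years=15)
```

The jump in power with monitoring duration illustrates the paper's point that early establishment of long-term monitoring is what makes small trends detectable.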
Akahori, Takuya; Gaensler, B. M.; Ryu, Dongsu
2014-08-01
Rotation measure (RM) grids of extragalactic radio sources have been widely used for studying cosmic magnetism. However, their potential for exploring the intergalactic magnetic field (IGMF) in filaments of galaxies is unclear, since other Faraday-rotation media such as the radio source itself, intervening galaxies, and the interstellar medium of our Galaxy are all significant contributors. We study statistical techniques for discriminating the Faraday rotation of filaments from other sources of Faraday rotation in future large-scale surveys of radio polarization. We consider a 30° × 30° field of view toward the south Galactic pole, while varying the number of sources detected in both present and future observations. We select sources located at high redshifts and toward which depolarization and optical absorption systems are not observed so as to reduce the RM contributions from the sources and intervening galaxies. It is found that a high-pass filter can satisfactorily reduce the RM contribution from the Galaxy since the angular scale of this component toward high Galactic latitudes would be much larger than that expected for the IGMF. Present observations do not yet provide a sufficient source density to be able to estimate the RM of filaments. However, from the proposed approach with forthcoming surveys, we predict significant residuals of RM that should be ascribable to filaments. The predicted structure of the IGMF down to scales of 0.1° should be observable with data from the Square Kilometre Array, if we achieve selections of sources toward which sightlines do not contain intervening galaxies and RM errors are less than a few rad m^-2.
Recommended methods for statistical analysis of data containing less-than-detectable measurements
Atwood, C.L.; Blackwood, L.G.; Harris, G.A.; Loehr, C.A.
1990-09-01
This report is a manual for statistical workers dealing with environmental measurements, when some of the measurements are not given exactly but are only reported as less than detectable. For some statistical settings with such data, many methods have been proposed in the literature, while for others few or none have been proposed. This report gives a recommended method in each of the settings considered. The body of the report gives a brief description of each recommended method. Appendix A gives example programs using the statistical package SAS, for those methods that involve nonstandard methods. Appendix B presents the methods that were compared and the reasons for selecting each recommended method, and explains any fine points that might be of interest. This is an interim version. Future revisions will complete the recommendations. 34 refs., 2 figs., 11 tabs.
Statistics provide guidance for indigenous organic carbon detection on Mars missions.
Sephton, Mark A; Carter, Jonathan N
2014-08-01
Data from the Viking and Mars Science Laboratory missions indicate the presence of organic compounds that are not definitively martian in origin. Both contamination and confounding mineralogies have been suggested as alternatives to indigenous organic carbon. Intuitive thought suggests that we are repeatedly obtaining data that confirms the same level of uncertainty. Bayesian statistics may suggest otherwise. If an organic detection method has a true positive to false positive ratio greater than one, then repeated organic matter detection progressively increases the probability of indigeneity. Bayesian statistics also reveal that methods with higher ratios of true positives to false positives give higher overall probabilities and that detection of organic matter in a sample with a higher prior probability of indigenous organic carbon produces greater confidence. Bayesian statistics, therefore, provide guidance for the planning and operation of organic carbon detection activities on Mars. Suggestions for future organic carbon detection missions and instruments are as follows: (i) On Earth, instruments should be tested with analog samples of known organic content to determine their true positive to false positive ratios. (ii) On the mission, for an instrument with a true positive to false positive ratio above one, it should be recognized that each positive detection of organic carbon will result in a progressive increase in the probability of indigenous organic carbon being present; repeated measurements, therefore, can overcome some of the deficiencies of a less-than-definitive test. (iii) For a fixed number of analyses, the highest true positive to false positive ratio method or instrument will provide the greatest probability that indigenous organic carbon is present. (iv) On Mars, analyses should concentrate on samples with highest prior probability of indigenous organic carbon; intuitive desires to contrast samples of high prior probability and low prior
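The repeated-detection argument is an odds update by the likelihood ratio (true-positive rate over false-positive rate) raised to the number of positive detections. The prior and rates below are invented for illustration, not properties of any real flight instrument:

```python
def posterior_indigenous(prior, tp_rate, fp_rate, n_detections):
    """Posterior probability that indigenous organic carbon is present after
    n independent positive detections, via a Bayesian odds update with
    likelihood ratio tp_rate / fp_rate per detection."""
    odds = prior / (1.0 - prior)
    odds *= (tp_rate / fp_rate) ** n_detections
    return odds / (1.0 + odds)

# Hypothetical instrument: TP/FP ratio of 4 (0.8 vs 0.2), sceptical prior 0.1.
p_after_1 = posterior_indigenous(0.1, 0.8, 0.2, 1)
p_after_3 = posterior_indigenous(0.1, 0.8, 0.2, 3)
```

As long as the ratio exceeds one, each repeated positive measurement multiplies the odds again, which is exactly why a less-than-definitive test can still accumulate confidence.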
Bornmann, Lutz; Leydesdorff, Loet
2013-01-01
Using the InCites tool of Thomson Reuters, this study compares normalized citation impact values calculated for China, Japan, France, Germany, United States, and the UK throughout the time period from 1981 to 2010. InCites offers a unique opportunity to study the normalized citation impacts of countries using (i) a long publication window (1981 to 2010), (ii) a differentiation in (broad or more narrow) subject areas, and (iii) allowing for the use of statistical procedures in order to obtain an insightful investigation of national citation trends across the years. Using four broad categories, our results show significantly increasing trends in citation impact values for France, the UK, and especially Germany across the last thirty years in all areas. The citation impact of papers from China is still at a relatively low level (mostly below the world average), but the country follows an increasing trend line. The USA exhibits a stable pattern of high citation impact values across the years. With small impact differences between the publication years, the US trend is increasing in engineering and technology but decreasing in medical and health sciences as well as in agricultural sciences. Similar to the USA, Japan follows increasing as well as decreasing trends in different subject areas, but the variability across the years is small. In most of the years, papers from Japan perform below or approximately at the world average in each subject area. PMID:23418600
Baldewijns, Greet; Luca, Stijn; Nagels, William; Vanrumste, Bart; Croonenborghs, Tom
2015-01-01
It has been shown that gait speed and transfer times are good measures of functional ability in the elderly. However, data currently acquired by systems that measure either gait speed or transfer times in the homes of elderly people require manual reviewing by healthcare workers. This reviewing process is time-consuming. To alleviate this burden, this paper proposes the use of statistical process control (SPC) methods to automatically detect both positive and negative changes in transfer times. Three SPC techniques, tabular CUSUM, standardized CUSUM and EWMA, all known for their ability to detect small shifts in the data, are evaluated on simulated transfer times. This analysis shows that EWMA is the best-suited method, with a detection accuracy of 82% and an average detection time of 9.64 days. PMID:26737425
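As a concrete illustration of the EWMA idea described in this abstract, the sketch below implements a textbook EWMA control chart and flags a simulated upward shift in transfer times. This is not the authors' implementation; the baseline mean, sigma, smoothing weight `lam`, and limit multiplier `L` are invented for demonstration.

```python
def ewma_chart(data, mu0, sigma, lam=0.2, L=3.0):
    """Return indices where the EWMA statistic leaves its control limits."""
    z = mu0
    alarms = []
    for i, x in enumerate(data):
        z = lam * x + (1 - lam) * z
        # time-varying control-limit half-width for the EWMA statistic
        half = L * sigma * ((lam / (2 - lam)) *
                            (1 - (1 - lam) ** (2 * (i + 1)))) ** 0.5
        if abs(z - mu0) > half:
            alarms.append(i)
    return alarms

# Invented transfer times: stable around 10 s, then a sustained upward shift
baseline = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 10.1, 9.9]
shifted = [11.0, 11.2, 11.1, 11.3, 11.0, 11.2]
alarms = ewma_chart(baseline + shifted, mu0=10.0, sigma=0.15)
print(alarms)
```

Because the EWMA statistic pools recent observations, it reacts within one sample of the simulated shift here, which is the small-shift sensitivity the abstract exploits.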
Ye, Sujuan; Wu, Yanying; Zhai, Xiaomo; Tang, Bo
2015-08-18
Simultaneous detection of cancer biomarkers holds great promise for the early diagnosis of different cancers. However, in the presence of high-concentration biomarkers, the signals of lower-expression biomarkers are obscured. Existing techniques are not suitable for simultaneously detecting multiple biomarkers at concentrations that differ by several orders of magnitude. Here, we propose an asymmetric signal amplification method for simultaneously detecting multiple biomarkers with significantly different levels. Using the bifunctional probe, a linear amplification mode responds to high-concentration markers, and a quadratic amplification mode responds to low-concentration markers. With the combined biobarcode probe and hybridization chain reaction (HCR) amplification method, the detection limits of microRNA (miRNA) and ATP via surface-enhanced Raman scattering (SERS) detection are 0.15 fM and 20 nM, respectively, covering a detection concentration difference of over 11 orders of magnitude. Furthermore, successful determination of miRNA and ATP in cancer cells supports the practicability of the assay. This methodology promises to open an exciting new avenue for the detection of various types of biomolecules. PMID:26218034
Malm, Christer B; Khoo, Nelson S; Granlund, Irene; Lindstedt, Emilia; Hult, Andreas
2016-01-01
The discovery of erythropoietin (EPO) simplified blood doping in sports, but improved detection methods for EPO have forced cheating athletes to return to blood transfusion. Autologous blood transfusion with cryopreserved red blood cells (RBCs) is the method of choice, because no valid method exists to accurately detect such an event. In endurance sports, it can be estimated that elite athletes improve performance by up to 3% with blood doping, regardless of method. Valid detection methods for autologous blood doping are important to maintain the credibility of athletic performances. Recreational male (N = 27) and female (N = 11) athletes served as Transfusion (N = 28) and Control (N = 10) subjects in two different transfusion settings. Hematological variables and physical performance were measured before donation of 450 or 900 mL whole blood, and until four weeks after re-infusion of the cryopreserved RBC fraction. Blood was analyzed for transferrin, iron, Hb, EVF, MCV, MCHC, reticulocytes, leucocytes and EPO. Repeated measures multivariate analysis of variance (MANOVA) and pattern recognition using Principal Component Analysis (PCA) and Orthogonal Projections of Latent Structures (OPLS) discriminant analysis (DA) investigated differences between Control and Transfusion groups over time. A significant increase in performance (15 ± 8%) and VO2max (17 ± 10%) (mean ± SD) could be measured 48 h after RBC re-infusion, and remained increased for up to four weeks in some subjects. In total, 533 blood samples were included in the study (Clean = 220, Transfused = 313). In response to blood transfusion, the largest change in hematological variables occurred 48 h after blood donation, when Control and Transfused groups could be separated with OPLS-DA (R2 = 0.76/Q2 = 0.59). RBC re-infusion resulted in the best model (R2 = 0.40/Q2 = 0.10) at the first sampling point (48 h), predicting one false positive and one false negative. Overall, a 25% and 86% false positives ratio was
Investigation of novel spectral and wavelet statistics for UGS-based intrusion detection
NASA Astrophysics Data System (ADS)
Narayanaswami, Ranga; Gandhe, Avinash; Tyurina, Anastasia; McComas, Michael; Mehra, Raman K.
2012-06-01
Seismic Unattended Ground Sensors (UGS) are low cost and covert, making them a suitable candidate for border patrol. Current seismic UGS systems use cadence-based intrusion detection algorithms and easily confuse humans with animals. This poor discrimination ability between humans and animals results in missed detections as well as higher false (nuisance) alarm rates. In order for seismic UGS systems to be deployed successfully, new signal processing algorithms with better discrimination ability between humans and animals are needed. We have characterized the seismic signals using frequency-domain and time-frequency-domain statistics, which improve the discrimination between humans, animals and vehicles.
Signal Waveform Detection with Statistical Automaton for Internet and Web Service Streaming
Tseng, Kuo-Kun; Ji, Yuzhu; Liu, Yiming; Huang, Nai-Lun; Zeng, Fufu; Lin, Fang-Ying
2014-01-01
In recent years, many approaches have been suggested for Internet and web streaming detection. In this paper, we propose an approach to signal waveform detection for Internet and web streaming, with novel statistical automatons. The system records network connections over a period of time to form a signal waveform and computes suspicious characteristics of the waveform. Network streaming can then be classified according to these selected waveform features by our newly designed Aho-Corasick (AC) automatons. We developed two versions, that is, basic AC and advanced AC-histogram waveform automata, and conducted comprehensive experimentation. The results confirm that our approach is feasible and suitable for deployment. PMID:25032231
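The Aho-Corasick automaton at the core of this approach can be sketched in a few dozen lines. The following is a generic AC matcher, not the paper's waveform-specific automaton, and the patterns below are placeholder strings rather than real waveform features; it builds the goto, failure, and output functions and reports every pattern occurrence in a single pass over the input.

```python
from collections import deque

def build_ac(patterns):
    """Build goto, failure and output functions of an Aho-Corasick automaton."""
    goto, fail, out = [{}], [0], [set()]
    for p in patterns:                      # insert each pattern into the trie
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({}); fail.append(0); out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)
    q = deque(goto[0].values())             # BFS to set failure links
    while q:
        s = q.popleft()
        for ch, t in goto[s].items():
            q.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] |= out[fail[t]]          # inherit outputs via failure link
    return goto, fail, out

def search(text, patterns):
    """Return (start_index, pattern) for every pattern occurrence in text."""
    goto, fail, out = build_ac(patterns)
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in sorted(out[s]):
            hits.append((i - len(p) + 1, p))
    return hits

print(search("ababcab", ["ab", "abc", "bc"]))
```

The failure links are what make the scan linear in the text length, which is presumably why the authors chose an AC-style automaton for streaming classification.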
NASA Astrophysics Data System (ADS)
Gilbert, Richard O.; O'Brien, Robert F.; Wilson, John E.; Pulsipher, Brent A.; McKinstry, Craig A.
2003-09-01
It may not be feasible to completely survey large tracts of land suspected of containing minefields. It is desirable to develop a characterization protocol that will confidently identify minefields within these large land tracts if they exist. Naturally, surveying areas of greatest concern and most likely locations would be necessary but will not provide the needed confidence that an unknown minefield had not eluded detection. Once minefields are detected, methods are needed to bound the area that will require detailed mine detection surveys. The US Department of Defense Strategic Environmental Research and Development Program (SERDP) is sponsoring the development of statistical survey methods and tools for detecting potential UXO targets. These methods may be directly applicable to demining efforts. Statistical methods are employed to determine the optimal geophysical survey transect spacing to have confidence of detecting target areas of a critical size, shape, and anomaly density. Other methods under development determine the proportion of a land area that must be surveyed to confidently conclude that there are no UXO present. Adaptive sampling schemes are also being developed as an approach for bounding the target areas. These methods and tools will be presented and the status of relevant research in this area will be discussed.
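The link between transect spacing and the chance of intersecting a target area of critical size can be illustrated with a small Monte-Carlo sketch. This is an assumption-laden toy model, not one of the SERDP tools: it treats the target as a circle of radius `target_radius` placed at a uniformly random offset relative to parallel transects, for which the analytic crossing probability is about 2r/spacing.

```python
import random

def detection_prob(spacing, target_radius, n_trials=10000, seed=1):
    """Monte-Carlo estimate of the probability that at least one of a set
    of parallel survey transects crosses a circular target area.
    Toy model: transects at x = k*spacing + offset, target centre at x = 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        offset = rng.uniform(0.0, spacing)      # random grid placement
        d = min(offset, spacing - offset)       # distance to nearest transect
        hits += d <= target_radius
    return hits / n_trials

# For spacing 10 and radius 2 the analytic answer is 2*2/10 = 0.4
print(detection_prob(10.0, 2.0))
```

Inverting this kind of relationship (choosing the spacing that achieves a required detection probability for the smallest target of concern) is the planning question the abstract describes.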
A statistical model of the photomultiplier gain process with applications to optical pulse detection
NASA Technical Reports Server (NTRS)
Tan, H. H.
1982-01-01
A Markov diffusion model was used to determine an approximate probability density for the random gain. This approximate density preserves the correct second-order statistics and appears to be in reasonably good agreement with experimental data. The receiver operating curve for a pulse counter detector of PMT cathode emission events was analyzed using this density. The error performance of a simple binary direct detection optical communication system was also derived. Previously announced in STAR as N82-25100
Banks-Leite, Cristina; Pardini, Renata; Boscolo, Danilo; Cassano, Camila Righetto; Püttker, Thomas; Barros, Camila Santos; Barlow, Jos
2014-01-01
1. In recent years, there has been a fast development of models that adjust for imperfect detection. These models have revolutionized the analysis of field data, and their use has repeatedly demonstrated the importance of sampling design and data quality. There are, however, several practical limitations associated with the use of detectability models which restrict their relevance to tropical conservation science. 2. We outline the main advantages of detectability models, before examining their limitations associated with their applicability to the analysis of tropical communities, rare species and large-scale data sets. Finally, we discuss whether detection probability needs to be controlled before and/or after data collection. 3. Models that adjust for imperfect detection allow ecologists to assess data quality by estimating uncertainty and to obtain adjusted ecological estimates of populations and communities. Importantly, these models have allowed informed decisions to be made about the conservation and management of target species. 4. Data requirements for obtaining unadjusted estimates are substantially lower than for detectability-adjusted estimates, which require relatively high detection/recapture probabilities and a number of repeated surveys at each location. These requirements can be difficult to meet in large-scale environmental studies where high levels of spatial replication are needed, or in the tropics where communities are composed of many naturally rare species. However, while imperfect detection can only be adjusted statistically, covariates of detection probability can also be controlled through study design. Using three study cases where we controlled for covariates of detection probability through sampling design, we show that the variation in unadjusted ecological estimates from nearly 100 species was qualitatively the same as that obtained from adjusted estimates. Finally, we discuss that the decision as to whether one should control for
Fernández-Llamazares, Alvaro; Belmonte, Jordina; Delgado, Rosario; De Linares, Concepción
2014-04-01
Airborne pollen records are a suitable indicator for the study of climate change. The present work focuses on the role of annual pollen indices in the detection of bioclimatic trends through the analysis of the aerobiological spectra of 11 taxa of great biogeographical relevance in Catalonia over an 18-year period (1994-2011), by means of different parametric and non-parametric statistical methods. Among others, two non-parametric rank-based statistical tests were performed for detecting monotonic trends in time series data of the selected airborne pollen types, and we observed that they have similar power in detecting trends. Except for those cases in which the pollen data can be well modeled by a normal distribution, it is better to apply non-parametric statistical methods to aerobiological studies. Our results provide a reliable representation of the pollen trends in the region and suggest that greater pollen quantities have been liberated to the atmosphere in recent years, especially by Mediterranean taxa such as Pinus, Total Quercus and Evergreen Quercus, although the trends may differ geographically. Longer aerobiological monitoring periods are required to corroborate these results and to survey the increasing levels of certain pollen types that could have an impact on public health.
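A standard non-parametric rank-based trend test of the kind described here is the Mann-Kendall test, which is easy to sketch. The version below ignores ties for simplicity (the abstract does not say which tie correction the authors used), and the pollen-index series is invented for illustration.

```python
import math

def mann_kendall(x):
    """Mann-Kendall S statistic and its normal-approximation Z (no ties)."""
    n = len(x)
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n) for j in range(i + 1, n))
    var = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var)
    elif s < 0:
        z = (s + 1) / math.sqrt(var)
    else:
        z = 0.0
    return s, z

# Invented annual pollen-index series with an upward tendency
series = [120, 135, 128, 150, 160, 155, 170, 182, 178, 195]
s, z = mann_kendall(series)
print(s, round(z, 2))
```

A Z value beyond roughly ±1.96 indicates a statistically significant monotonic trend at the 5% level, without assuming the pollen indices are normally distributed.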
NASA Astrophysics Data System (ADS)
Cho, Baek Hwan; Chang, Chuho; Lee, Jong-Ha; Ko, Eun Young; Seong, Yeong Kyeong; Woo, Kyoung-Gu
2013-02-01
The existence of microcalcifications (MCs) is an important marker of malignancy in breast cancer. In spite of its benefits in mass detection for dense breasts, ultrasonography is believed to be unreliable for detecting MCs. For computer-aided diagnosis systems, however, accurate detection of MCs has the possibility of improving performance in both Breast Imaging-Reporting and Data System (BI-RADS) lexicon description for calcifications and malignancy classification. We propose a new efficient and effective method for MC detection using image enhancement and threshold adjacency statistics (TAS). The main idea of TAS is to threshold an image and to count the number of white pixels with a given number of adjacent white pixels. Our contribution is to adopt TAS features and apply image enhancement to facilitate MC detection in ultrasound images. We employed fuzzy logic, a tophat filter, and a texture filter to enhance images for MCs. Using a total of 591 images, the classification accuracy of the proposed method in MC detection was 82.75%, which is comparable to that of Haralick texture features (81.38%). When combined, the performance was as high as 85.11%. In addition, our method also showed ability in mass classification when combined with existing features. In conclusion, the proposed method exploiting image enhancement and TAS features has the potential to deal with MC detection in ultrasound images efficiently and to extend to the real-time localization and visualization of MCs.
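The threshold adjacency statistics idea, counting white pixels by how many of their 8-neighbours are also white after thresholding, can be sketched directly. This is a generic TAS computation on a toy binary image, not the paper's enhanced-ultrasound pipeline, and the input grid is invented.

```python
def tas(binary):
    """Threshold adjacency statistics: for each white pixel, count its white
    8-neighbours (0-8) and return the normalized 9-bin histogram."""
    h, w = len(binary), len(binary[0])
    counts = [0] * 9
    total = 0
    for y in range(h):
        for x in range(w):
            if not binary[y][x]:
                continue
            n = sum(binary[y2][x2]
                    for y2 in range(max(0, y - 1), min(h, y + 2))
                    for x2 in range(max(0, x - 1), min(w, x + 2))
                    if (y2, x2) != (y, x))
            counts[n] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * 9

# A 2x2 bright blob: every white pixel has exactly 3 white neighbours
blob = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
print(tas(blob))
```

The resulting 9-bin histogram is the feature vector fed to a classifier; compact bright blobs (like MCs) concentrate mass in the middle bins, while noise pixels fall in the low-adjacency bins.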
Denton, Debra L; Diamond, Jerry; Zheng, Lei
2011-05-01
The U.S. Environmental Protection Agency (U.S. EPA) and state agencies implement the Clean Water Act, in part, by evaluating the toxicity of effluent and surface water samples. A common goal for both regulatory authorities and permittees is confidence in an individual test result (e.g., no-observed-effect concentration [NOEC], pass/fail, 25% effective concentration [EC25]), which is used to make regulatory decisions, such as reasonable potential determinations, permit compliance, and watershed assessments. This paper discusses an additional statistical approach (test of significant toxicity [TST]), based on bioequivalence hypothesis testing, or, more appropriately, test of noninferiority, which examines whether there is a nontoxic effect at a single concentration of concern compared with a control. Unlike the traditional hypothesis testing approach in whole effluent toxicity (WET) testing, TST is designed to incorporate explicitly both α and β error rates at levels of toxicity that are unacceptable and acceptable, given routine laboratory test performance for a given test method. Regulatory management decisions are used to identify unacceptable toxicity levels for acute and chronic tests, and the null hypothesis is constructed such that test power is associated with the ability to declare correctly a truly nontoxic sample as acceptable. This approach provides a positive incentive to generate high-quality WET data to make informed decisions regarding regulatory decisions. This paper illustrates how α and β error rates were established for specific test method designs and tests the TST approach using both simulation analyses and actual WET data. In general, those WET test endpoints having higher routine (e.g., 50th percentile) within-test control variation, on average, have higher method-specific α values (type I error rate), to maintain a desired type II error rate. This paper delineates the technical underpinnings of this approach and demonstrates the benefits
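The TST logic, testing the null hypothesis that a sample is unacceptably toxic so that rejection declares it acceptable, can be sketched with a normal approximation. This is an illustrative noninferiority-style test, not U.S. EPA's exact procedure; the effect threshold `b`, `alpha`, and the organism-response data are invented.

```python
from statistics import NormalDist, mean, stdev

def tst_acceptable(control, treatment, b=0.75, alpha=0.05):
    """Noninferiority-style sketch of the TST idea (normal approximation):
    H0 says the treatment mean is at or below b times the control mean
    (unacceptable toxicity); rejecting H0 declares the sample acceptable."""
    nc, nt = len(control), len(treatment)
    mc, mt = mean(control), mean(treatment)
    se = (stdev(treatment) ** 2 / nt + (b ** 2) * stdev(control) ** 2 / nc) ** 0.5
    z = (mt - b * mc) / se
    p = 1.0 - NormalDist().cdf(z)
    return z, p, p < alpha

# Invented organism responses: treatment well above 75% of control
control = [100.0, 102.0, 98.0, 101.0]
treatment = [95.0, 97.0, 96.0, 94.0]
print(tst_acceptable(control, treatment))
```

Note how the burden of proof is reversed relative to a conventional NOEC test: a noisy experiment can no longer "pass" by mere failure to detect a difference, which is the incentive structure the abstract emphasizes.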
Nakamura, Naoki; Tsunoda, Hiroko; Takahashi, Osamu; Kikuchi, Mari; Honda, Satoshi; Shikama, Naoto; Akahane, Keiko; Sekiguchi, Kenji
2012-11-01
Purpose: To determine the frequency and clinical significance of previously undetected incidental findings found on computed tomography (CT) simulation images for breast cancer patients. Methods and Materials: All CT simulation images were first interpreted prospectively by radiation oncologists and then double-checked by diagnostic radiologists. The official reports of CT simulation images for 881 consecutive postoperative breast cancer patients from 2009 to 2010 were retrospectively reviewed. Potentially important incidental findings (PIIFs) were defined as any previously undetected benign or malignancy-related findings requiring further medical follow-up or investigation. For all patients in whom a PIIF was detected, we reviewed the clinical records to determine the clinical significance of the PIIF. If the findings from the additional studies prompted by a PIIF required a change in management, the PIIF was also recorded as a clinically important incidental finding (CIIF). Results: There were a total of 57 (6%) PIIFs. The 57 patients in whom a PIIF was detected were followed for a median of 17 months (range, 3-26). Six cases of CIIFs (0.7% of total) were detected. Of the six CIIFs, three (50%) cases had not been noted by the radiation oncologist until the diagnostic radiologist detected the finding. On multivariate analysis, previous CT examination was an independent predictor for PIIF (p = 0.04). Patients who had not previously received chest CT examinations within 1 year had a statistically significantly higher risk of PIIF than those who had received CT examinations within 6 months (odds ratio, 3.54; 95% confidence interval, 1.32-9.50; p = 0.01). Conclusions: The rate of incidental findings prompting a change in management was low. However, radiation oncologists appear to have some difficulty in detecting incidental findings that require a change in management. Considering cost, it may be reasonable that routine interpretations are given to those who have not
NASA Astrophysics Data System (ADS)
Akula, Aparna; Khanna, Nidhi; Ghosh, Ripul; Kumar, Satish; Das, Amitava; Sardana, H. K.
2014-03-01
A robust contour-based statistical background subtraction method for the detection of non-uniform thermal targets in infrared imagery is presented. The first step of the method is the generation of a background frame using statistical information from an initial set of frames containing no targets. The generated background frame is made adaptive by continuously updating it with the motion information of the scene. The background subtraction method, followed by a clutter rejection stage, ensures the detection of foreground objects. The next step comprises detection of contours and distinguishing the target boundaries from the noisy background. This is achieved by using the Canny edge detector to extract the contours, followed by a k-means clustering approach to differentiate the object contour from the background contours. The post-processing step comprises a morphological edge-linking approach to close any broken contours; finally, flood fill is performed to generate the silhouettes of moving targets. This method is validated on infrared video data consisting of a variety of moving targets. Experimental results demonstrate a high detection rate with minimal false alarms, establishing the robustness of the proposed method.
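The adaptive background stage described above can be sketched as a per-pixel running mean with a fixed-threshold foreground test. This is a generic scheme, not the paper's full pipeline (which adds clutter rejection, Canny contours, and k-means); the learning rate `alpha`, threshold, and toy frames are assumptions.

```python
def update_background(bg, frame, alpha=0.05):
    """Exponential running-mean background update (per pixel)."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]

def foreground_mask(bg, frame, thresh=10.0):
    """Mark pixels deviating from the background model by more than thresh."""
    return [[abs(f - b) > thresh for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]

# Toy 3x3 thermal frames: uniform background, one hot target pixel
bg = [[10.0] * 3 for _ in range(3)]
frame = [[10.0, 10.0, 10.0],
         [10.0, 40.0, 10.0],
         [10.0, 10.0, 10.0]]
mask = foreground_mask(bg, frame)
bg = update_background(bg, frame)
print(mask)
print(bg[1][1])
```

A small `alpha` makes the model absorb slow scene changes (illumination, thermal drift) while leaving fast-moving targets in the foreground mask.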
Statistical tests for detection of misspecified relationships by use of genome-screen data.
McPeek, M S; Sun, L
2000-03-01
Misspecified relationships can have serious consequences for linkage studies, resulting in either reduced power or false-positive evidence for linkage. If some individuals in the pedigree are untyped, then Mendelian errors may not be observed. Previous approaches to detection of misspecified relationships by use of genotype data were developed for sib and half-sib pairs. We extend the likelihood calculations of Göring and Ott and Boehnke and Cox to more-general relative pairs, for which identity-by-descent (IBD) status is no longer a Markov chain, and we propose a likelihood-ratio test. We also extend the identity-by-state (IBS)-based test of Ehm and Wagner to nonsib relative pairs. The likelihood-ratio test has high power, but its drawbacks include the need to construct and apply a separate Markov chain for each possible alternative relationship and the need for simulation to assess significance. The IBS-based test is simpler but has lower power. We propose two new test statistics, conditional expected IBD (EIBD) and adjusted IBS (AIBS), designed to retain the simplicity of IBS while increasing power by taking into account chance sharing. In simulations, the power of EIBD is generally close to that of the likelihood-ratio test. The power of AIBS is higher than that of IBS, in all cases considered. We suggest a strategy of initial screening by use of EIBD and AIBS, followed by application of the likelihood-ratio test to only a subset of relative pairs, identified by use of EIBD and AIBS. We apply the methods to a Genetic Analysis Workshop 11 data set from the Collaborative Study on the Genetics of Alcoholism.
NASA Technical Reports Server (NTRS)
Friedlander, Alan L.; Harry, David P., III
1960-01-01
An exploratory analysis of vehicle guidance during the approach to a target planet is presented. The objective of the guidance maneuver is to guide the vehicle to a specific perigee distance with a high degree of accuracy and minimum corrective velocity expenditure. The guidance maneuver is simulated by considering the random sampling of real measurements with significant error and reducing this information to prescribe appropriate corrective action. The instrumentation system assumed includes optical and/or infrared devices to indicate range and a reference angle in the trajectory plane. Statistical results are obtained by Monte-Carlo techniques and are shown as the expectation of guidance accuracy and velocity-increment requirements. Results are nondimensional and applicable to any planet within limits of two-body assumptions. The problem of determining how many corrections to make and when to make them is a consequence of the conflicting requirement of accurate trajectory determination and propulsion. Optimum values were found for a vehicle approaching a planet along a parabolic trajectory with an initial perigee distance of 5 radii and a target perigee of 1.02 radii. In this example, measurement errors were less than 1 minute of arc. Results indicate that four corrections applied in the vicinity of 50, 16, 15, and 1.5 radii, respectively, yield minimum velocity-increment requirements. Thrust devices capable of producing a large variation of velocity-increment size are required. For a vehicle approaching the earth, miss distances within 32 miles are obtained with 90-percent probability. Total velocity increments used in guidance are less than 3300 feet per second with 90-percent probability. It is noted that the above representative results are valid only for the particular guidance scheme hypothesized in this analysis. A parametric study is presented which indicates the effects of measurement error size, initial perigee, and initial energy on the guidance
Statistical issues and challenges associated with rapid detection of bio-terrorist attacks.
Fienberg, Stephen E; Shmueli, Galit
2005-02-28
The traditional focus for detecting outbreaks of an epidemic or bio-terrorist attack has been on the collection and analysis of medical and public health data. Although such data are the most direct indicators of symptoms, they tend to be collected, delivered, and analysed days, weeks, and even months after the outbreak. By the time this information reaches decision makers it is often too late to treat the infected population or to react in some other way. In this paper, we explore different sources of data, traditional and non-traditional, that can be used for detecting a bio-terrorist attack in a timely manner. We set our discussion in the context of state-of-the-art syndromic surveillance systems and we focus on statistical issues and challenges associated with non-traditional data sources and the timely integration of multiple data sources for detection purposes.
Statistical method for detecting phase shifts in alpha rhythm from human electroencephalogram data
NASA Astrophysics Data System (ADS)
Naruse, Yasushi; Takiyama, Ken; Okada, Masato; Umehara, Hiroaki
2013-04-01
We developed a statistical method for detecting discontinuous phase changes (phase shifts) in fluctuating alpha rhythms in the human brain from electroencephalogram (EEG) data obtained in a single trial. This method uses the state space models and the line process technique, which is a Bayesian method for detecting discontinuity in an image. By applying this method to simulated data, we were able to detect the phase and amplitude shifts in a single simulated trial. Further, we demonstrated that this method can detect phase shifts caused by a visual stimulus in the alpha rhythm from experimental EEG data even in a single trial. The results for the experimental data showed that the timings of the phase shifts in the early latency period were similar between many of the trials, and that those in the late latency period were different between the trials. The conventional averaging method can only detect phase shifts that occur at similar timings between many of the trials, and therefore, the phase shifts that occur at differing timings cannot be detected using the conventional method. Consequently, our obtained results indicate the practicality of our method. Thus, we believe that our method will contribute to studies examining the phase dynamics of nonlinear alpha rhythm oscillators.
NASA Astrophysics Data System (ADS)
Millard, Steven P.; Deverel, Steven J.
1988-12-01
As concern over the effects of trace amounts of pollutants has increased, so has the need for statistical methods that deal appropriately with data that include values reported as "less than" the detection limit. It has become increasingly common for water quality data to include censored values that reflect more than one detection limit for a single analyte. For such multiply censored data sets, standard statistical methods (for example, to compare analyte concentration in two areas) are not valid. In such cases, methods from the biostatistical field of survival analysis are applicable. Several common two-sample censored data rank tests are explained, and their behaviors are studied via a Monte Carlo simulation in which sample sizes and censoring mechanisms are varied under an assumed lognormal distribution. These tests are applied to shallow groundwater chemistry data from two sites in the San Joaquin Valley, California. The best overall test, in terms of maintained α level, is the normal scores test based on a permutation variance. In cases where the α level is maintained, however, the Peto-Prentice statistic based on an asymptotic variance performs as well or better.
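A two-sample censored-data rank test of the kind compared here can be sketched with Gehan-style pairwise scoring and a permutation reference distribution, mirroring the permutation-variance idea in the abstract. This is a generic illustration, not the exact normal-scores or Peto-Prentice procedures; the data are invented, with each observation a `(value, is_censored)` pair where censored means "below the detection limit `value`".

```python
import random

def gehan_score(a, b):
    """Pairwise comparison for left-censored data. Observations are
    (value, is_censored); a censored value lies somewhere below value."""
    va, ca = a
    vb, cb = b
    if not ca and not cb:
        return (va > vb) - (va < vb)
    if ca and not cb:
        return -1 if va <= vb else 0    # a is definitely smaller
    if cb and not ca:
        return 1 if vb <= va else 0
    return 0                            # both censored: incomparable

def gehan_stat(x, y):
    return sum(gehan_score(a, b) for a in x for b in y)

def perm_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the Gehan statistic."""
    rng = random.Random(seed)
    obs = abs(gehan_stat(x, y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(gehan_stat(pooled[:len(x)], pooled[len(x):])) >= obs:
            hits += 1
    return hits / n_perm

# Invented analyte concentrations at two sites, with "<DL" values censored
site_a = [(8.0, False), (9.0, False), (7.0, False), (6.0, False)]
site_b = [(1.0, False), (2.0, False), (1.0, True), (0.5, True)]
print(gehan_stat(site_a, site_b), perm_pvalue(site_a, site_b))
```

The pairwise scoring handles multiple detection limits naturally, because a censored value is compared only where the ordering is unambiguous, which is exactly the situation that invalidates naive substitution-based methods.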
NASA Astrophysics Data System (ADS)
Ortega-Martinez, Antonio; Padilla-Martinez, Juan Pablo; Franco, Walfre
2016-04-01
The skin contains several fluorescent molecules, or fluorophores, that serve as markers of structure, function and composition. UV fluorescence excitation photography is a simple and effective way to image specific intrinsic fluorophores, such as the one ascribed to tryptophan, which emits at a wavelength of 345 nm upon excitation at 295 nm and is a marker of cellular proliferation. Earlier, we built a clinical UV photography system to image cellular proliferation. In some samples, the naturally low intensity of the fluorescence can make it difficult to separate the fluorescence of cells in higher proliferation states from background fluorescence and other imaging artifacts, such as electronic noise. In this work, we describe a statistical image segmentation method to separate the fluorescence of interest. Statistical image segmentation is based on image averaging, background subtraction and pixel statistics. This method allows better quantification of the intensity and surface distributions of fluorescence, which in turn simplifies the detection of borders. Using this method we delineated the borders of highly proliferative skin conditions and diseases, in particular allergic contact dermatitis, psoriatic lesions and basal cell carcinoma. Segmented images clearly define lesion borders. UV fluorescence excitation photography along with statistical image segmentation may serve as a quick and simple diagnostic tool for clinicians.
Osche, G R
2000-08-20
Single- and multiple-pulse detection statistics are presented for aperture-averaged direct detection optical receivers operating against partially developed speckle fields. A partially developed speckle field arises when the probability density function of the received intensity does not follow negative exponential statistics. The case of interest here is the target surface that exhibits diffuse as well as specular components in the scattered radiation. An approximate expression is derived for the integrated intensity at the aperture, which leads to single- and multiple-pulse discrete probability density functions for the case of a Poisson signal in Poisson noise with an additive coherent component. In the absence of noise, the single-pulse discrete density function is shown to reduce to a generalized negative binomial distribution. The radar concept of integration loss is discussed in the context of direct detection optical systems where it is shown that, given an appropriate set of system parameters, multiple-pulse processing can be more efficient than single-pulse processing over a finite range of the integration parameter n. PMID:18350006
Performance analysis of Wald-statistic based network detection methods for radiation sources
Sen, Satyabrata; Rao, Nageswara S; Wu, Qishi; Barry, M. L.; Grieme, M.; Brooks, Richard R; Cordone, G.
2016-01-01
There have been increasingly large deployments of radiation detection networks that require computationally fast algorithms to produce prompt results over ad-hoc sub-networks of mobile devices, such as smart-phones. These algorithms are in sharp contrast to complex network algorithms that necessitate all measurements to be sent to powerful central servers. In this work, at individual sensors, we employ Wald-statistic based detection algorithms, which are computationally very fast and are implemented as one of three Z-tests and four chi-square tests. At the fusion center, we apply K-out-of-N fusion to combine the sensors' hard decisions. We characterize the performance of the detection methods by deriving analytical expressions for the distributions of the underlying test statistics, and by analyzing the fusion performance in terms of K, N, and the false-alarm rates of the individual detectors. We experimentally validate our methods using measurements from indoor and outdoor characterization tests of the Intelligence Radiation Sensors Systems (IRSS) program. In particular, utilizing the outdoor measurements, we construct two important real-life scenarios, boundary surveillance and portal monitoring, and present the results of our algorithms.
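When the sensors' decisions are independent, the K-out-of-N fusion rule has a closed-form false-alarm characteristic. A brief sketch (the per-sensor rate and network size are illustrative values, not those of the IRSS tests):

```python
from math import comb

def fused_false_alarm(p, n, k):
    """P(at least k of n independent detectors fire), given each
    detector's individual false-alarm rate p (binomial tail sum)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Raising K trades false alarms against detection sensitivity:
single = 0.05                                   # assumed per-sensor false-alarm rate
rates = [fused_false_alarm(single, 10, k) for k in (1, 2, 3)]
```

With 10 sensors at a 5% individual rate, requiring K = 3 concurrent detections drives the network false-alarm rate to roughly 1%.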
Irshad, Humayun; Roux, Ludovic; Racoceanu, Daniel
2013-01-01
Accurate counting of mitoses in breast cancer histopathology plays a critical role in the grading process. Manual counting of mitoses is tedious and subject to considerable inter- and intra-reader variations. This work aims at improving the accuracy of mitosis detection by selecting the color channels that better capture the statistical and morphological features that discriminate mitoses from other objects. The proposed framework includes a comprehensive analysis of first- and second-order statistical features together with morphological features in selected color channels, and a study on balancing the skewed dataset using the SMOTE method to increase the predictive accuracy of mitosis classification. The proposed framework was evaluated on the MITOS data set during the ICPR 2012 contest, where it ranked second among 17 finalists, achieving a 74% detection rate, 70% precision and 72% F-measure. In future work, we plan to apply our mitosis detection tool to images produced by different types of slide scanners, including multi-spectral and multi-focal microscopy.
Oberer, R.B.
2002-11-12
The current practice of nondestructive assay (NDA) of fissile materials using neutrons is dominated by the ³He detector. This has been the case since the mid 1980s, when Fission Multiplicity Detection (FMD) was replaced with thermal well counters and neutron multiplicity counting (NMC). The thermal well counters detect neutrons by neutron capture in the ³He detector subsequent to moderation. The process of detection requires from 30 to 60 µs. As will be explained in Section 3.3, the rate of detecting correlated neutrons (signal) from the same fission is independent of this time, but the rate of accidental correlations (noise) is proportional to it. The well counters are therefore at a distinct disadvantage when there is a large source of uncorrelated neutrons present, from (α, n) reactions for example. Plastic scintillating detectors, as were used in FMD, require only about 20 ns to detect neutrons from fission, so one thousandth as many accidental coincidences are accumulated. The major problem with the use of fast-plastic scintillation detectors, however, is that both neutrons and gamma rays are detected, and the pulses from the two are indistinguishable in these detectors. For this thesis, a new technique was developed that uses higher-order time correlation statistics to distinguish combinations of neutron and gamma-ray detections in fast-plastic scintillation detectors. A system of analysis to describe these correlations was developed based on simple physical principles. Other sources of correlations from non-fission events are identified and integrated into the analysis developed for fission events. A number of ratios and metrics are identified to determine physical properties of the source from the correlations. It is possible to determine both the quantity being measured and the detection efficiency from these ratios in a single measurement, without a separate calibration. To account for detector dead-time, an alternative analytical technique
NASA Astrophysics Data System (ADS)
Chung, Moo K.; Kim, Seung-Goo; Schaefer, Stacey M.; van Reekum, Carien M.; Peschke-Schmitz, Lara; Sutterer, Matthew J.; Davidson, Richard J.
2014-03-01
The sparse regression framework has been widely used in medical image processing and analysis. However, it has rarely been used in anatomical studies. We present a sparse shape modeling framework using the Laplace-Beltrami (LB) eigenfunctions of the underlying shape and show its improvement of statistical power. Traditionally, the LB eigenfunctions are used as a basis for intrinsically representing surface shapes as a form of Fourier descriptors. To reduce high-frequency noise, only the first few terms are used in the expansion and the higher-frequency terms are simply thrown away. However, some lower-frequency terms may not necessarily contribute significantly to reconstructing the surfaces. Motivated by this idea, we present an LB-based method that retains only the significant eigenfunctions by imposing a sparse penalty. For dense anatomical data such as deformation fields on a surface mesh, the sparse regression behaves like a smoothing process, which reduces false negatives and hence improves statistical power. The sparse shape model is then applied to investigating the influence of age on amygdala and hippocampus shapes in the normal population. The advantage of the LB sparse framework is demonstrated by showing the increased statistical power.
NASA Technical Reports Server (NTRS)
Moore, G. K.
1976-01-01
An investigation was carried out to determine the feasibility of mapping lineaments on SKYLAB photographs of central Tennessee and to determine the hydrologic significance of these lineaments, particularly as concerns the occurrence and productivity of ground water. Sixty-nine percent more lineaments were found on SKYLAB photographs by stereo viewing than by projection viewing, but longer lineaments were detected by projection viewing. Most SKYLAB lineaments consisted of topographic depressions, and they followed or paralleled the streams. The remainder were found by vegetation alignments and the straight sides of ridges. Test drilling showed that the median yield of wells located on SKYLAB lineaments was about six times the median yield of wells located by random drilling. The best single detection method, in terms of potential savings, was stereo viewing. Larger savings might be achieved by locating wells on lineaments detected by both stereo viewing and projection viewing.
Detection of microcalcifications in mammograms using error of prediction and statistical measures
NASA Astrophysics Data System (ADS)
Acha, Begoña; Serrano, Carmen; Rangayyan, Rangaraj M.; Leo Desautels, J. E.
2009-01-01
A two-stage method for detecting microcalcifications in mammograms is presented. In the first stage, the determination of the candidates for microcalcifications is performed. For this purpose, a 2-D linear prediction error filter is applied, and for those pixels where the prediction error is larger than a threshold, a statistical measure is calculated to determine whether they are candidates for microcalcifications or not. In the second stage, a feature vector is derived for each candidate, and after a classification step using a support vector machine, the final detection is performed. The algorithm is tested with 40 mammographic images, from Screen Test: The Alberta Program for the Early Detection of Breast Cancer, with 50-μm resolution, and the results are evaluated using a free-response receiver operating characteristics curve. Two different analyses are performed: an individual microcalcification detection analysis and a cluster analysis. In the analysis of individual microcalcifications, detection sensitivity values of 0.75 and 0.81 are obtained at 2.6 and 6.2 false positives per image, on average, respectively. The best performance is characterized by a sensitivity of 0.89, a specificity of 0.99, and a positive predictive value of 0.79. In cluster analysis, a sensitivity value of 0.97 is obtained at 1.77 false positives per image, and a value of 0.90 is achieved at 0.94 false positives per image.
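The first-stage idea, predicting each pixel from its causal neighbours and flagging pixels with large prediction error, can be sketched as follows. The predictor coefficients, threshold rule, and test image are illustrative, not the paper's fitted filter:

```python
import numpy as np

def prediction_error_map(img):
    """Error of a causal 2-D linear predictor: each interior pixel is
    predicted from its left, top and top-left neighbours
    (illustrative fixed coefficients)."""
    img = img.astype(float)
    pred = 0.5 * img[1:, :-1] + 0.25 * img[:-1, 1:] + 0.25 * img[:-1, :-1]
    err = np.zeros_like(img)
    err[1:, 1:] = np.abs(img[1:, 1:] - pred)
    return err

rng = np.random.default_rng(0)
img = rng.normal(100.0, 2.0, (64, 64))          # smooth synthetic background
img[32, 32] += 40.0                             # bright, calcification-like spot

err = prediction_error_map(img)
candidates = err > err.mean() + 4.0 * err.std() # first-stage candidate pixels
```

Smooth background is well predicted and yields small errors, so only localized bright outliers survive the threshold; those pixels then go on to feature extraction and SVM classification.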
Greenhalgh, T.
1997-01-01
It is possible to be seriously misled by taking the statistical competence (and/or the intellectual honesty) of authors for granted. Some common errors committed (deliberately or inadvertently) by the authors of papers are given in the final box. PMID:9277611
Detection of coronal mass ejections using AdaBoost on grayscale statistic features
NASA Astrophysics Data System (ADS)
Zhang, Ling; Yin, Jian-qin; Lin, Jia-ben; Wang, Xiao-fan; Guo, Juan
2016-10-01
We present an automatic algorithm to detect coronal mass ejections (CMEs) in Large Angle Spectrometric Coronagraph (LASCO) C2 running difference images. The algorithm includes 3 steps: (1) split the running difference images into blocks according to slice size and analyze the grayscale statistics of the blocks from a set of images with and without CMEs; (2) select the optimal parameters for slice size, gray threshold and fraction of the bright points and (3) use AdaBoost to combine the weak classifiers designed according to the optimal parameters. Experimental results show that our method is effective and has a high accuracy rate.
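Steps (1)-(2), block-wise grayscale statistics thresholded into a weak classifier, might be sketched as below. The block size, gray threshold, and bright-point fraction are placeholder parameters; the paper selects the optimal values from training images and then combines many such weak classifiers with AdaBoost:

```python
import numpy as np

def bright_fraction_per_block(diff_img, block=8, gray_thresh=30):
    """Fraction of pixels brighter than gray_thresh in each block x block tile."""
    h, w = diff_img.shape
    h, w = h - h % block, w - w % block
    tiles = diff_img[:h, :w].reshape(h // block, block, w // block, block)
    return (tiles > gray_thresh).mean(axis=(1, 3))

def weak_cme_classifier(diff_img, block=8, gray_thresh=30, frac=0.5):
    """Weak classifier: flag the running-difference image if any tile's
    bright-point fraction exceeds frac."""
    return bool((bright_fraction_per_block(diff_img, block, gray_thresh) > frac).any())

quiet = np.full((64, 64), 10)       # difference image with no CME
cme = quiet.copy()
cme[20:36, 8:40] = 80               # synthetic bright CME-like front
```

Each (block, threshold, fraction) triple defines one weak classifier; AdaBoost weights and combines them into the final detector.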
Statistical Analysis of Probability of Detection Hit/Miss Data for Small Data Sets
NASA Astrophysics Data System (ADS)
Harding, C. A.; Hugo, G. R.
2003-03-01
This paper examines the validity of statistical methods for determining nondestructive inspection probability of detection (POD) curves from relatively small hit/miss POD data sets. One method published in the literature is shown to be invalid for analysis of POD hit/miss data. Another standard method is shown to be valid only for data sets containing more than 200 observations. An improved method is proposed which allows robust lower 95% confidence limit POD curves to be determined from data sets containing as few as 50 hit/miss observations.
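A common parametrisation behind hit/miss POD analysis fits a logistic model in log flaw size to the binary record. A minimal maximum-likelihood sketch on synthetic data (plain gradient ascent stands in for whichever estimator the paper analyses, and the a90 computation is illustrative, without the lower confidence bound):

```python
import numpy as np

def fit_pod(sizes, hits, iters=20000, lr=0.5):
    """Maximum-likelihood fit of POD(a) = logistic(b0 + b1*log(a))
    to hit/miss data by plain gradient ascent (illustrative)."""
    x = np.log(sizes)
    b0 = b1 = 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(b0 + b1 * x)))
        b0 += lr * np.mean(hits - p)
        b1 += lr * np.mean((hits - p) * x)
    return b0, b1

rng = np.random.default_rng(1)
sizes = rng.uniform(0.5, 5.0, 200)                    # flaw sizes, arbitrary units
true_p = 1.0 / (1.0 + np.exp(-(-1.0 + 3.0 * np.log(sizes))))
hits = (rng.uniform(size=200) < true_p).astype(float) # simulated hit/miss record

b0, b1 = fit_pod(sizes, hits)
a90 = np.exp((np.log(0.9 / 0.1) - b0) / b1)           # flaw size with POD = 0.90
```

The small-sample question the paper studies is how reliable the lower 95% confidence bound on such a curve is when the record has only tens, rather than hundreds, of observations.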
Hüneburg, Robert; Kukuk, Guido; Nattermann, Jacob; Endler, Christoph; Penner, Arndt-Hendrik; Wolter, Karsten; Schild, Hans; Strassburg, Christian; Sauerbruch, Tilman; Schmitz, Volker; Willinek, Winfried
2016-01-01
Background and study aims: Colorectal cancer (CRC) is one of the most common cancers worldwide, and several efforts have been made to reduce its occurrence or severity. Although colonoscopy is considered the gold standard in CRC prevention, it has its disadvantages: missed lesions, bleeding, and perforation. Furthermore, a high number of patients undergo this procedure even though no polyps are detected. Therefore, an initial screening examination may be warranted. Our aim was to compare the adenoma detection rate of magnetic resonance colonography (MRC) with that of optical colonoscopy. Patients and methods: A total of 25 patients with an intermediate risk for CRC (17 men, 8 women; mean age 57.6 years, standard deviation 11) underwent MRC with a 3.0-tesla magnet, followed by colonoscopy. The endoscopist was initially blinded to the results of MRC and unblinded immediately after examining the distal rectum. Following endoscopic excision, the size, anatomical localization, and appearance of all polyps were described according to the Paris classification. Results: A total of 93 lesions were detected during colonoscopy. These included a malignant infiltration of the transverse colon due to gastric cancer in 1 patient, 28 adenomas in 10 patients, 19 hyperplastic polyps in 9 patients, and 45 non-neoplastic lesions. In 5 patients, no lesion was detected. MRC detected significantly fewer lesions: 1 adenoma (P = 0.001) and 1 hyperplastic polyp (P = 0.004). The malignant infiltration was seen with both modalities. Of the 28 adenomas, 23 (82 %) were 5 mm or smaller, and only 4 (14 %) were 10 mm or larger. Conclusion: MRC does not detect adenomas adequately, regardless of lesion location. Even advanced lesions were missed. Therefore, colonoscopy should still be considered the current gold standard, even for diagnostic purposes. PMID:26878043
Statistical detection and imaging of objects hidden in turbid media using ballistic photons.
Farsiu, Sina; Christofferson, James; Eriksson, Brian; Milanfar, Peyman; Friedlander, Benjamin; Shakouri, Ali; Nowak, Robert
2007-08-10
We exploit recent advances in active high-resolution imaging through scattering media with ballistic photons. We derive the fundamental limits on the accuracy of the estimated parameters of a mathematical model that describes such an imaging scenario and compare the performance of ballistic and conventional imaging systems. This model is later used to derive optimal single-pixel statistical tests for detecting objects hidden in turbid media. To improve the detection rate of the aforementioned single-pixel detectors, we develop a multiscale algorithm based on the generalized likelihood ratio test framework. Moreover, considering the effect of diffraction, we derive a lower bound on the achievable spatial resolution of the proposed imaging systems. Furthermore, we present the first experimental ballistic scanner that directly takes advantage of novel adaptive sampling and reconstruction techniques.
NASA Astrophysics Data System (ADS)
Shao, Quanxi; Wang, You-Gan
2009-09-01
Power calculation and sample size determination are critical in designing environmental monitoring programs. The traditional approach based on comparing the mean values may become statistically inappropriate and even invalid when substantial proportions of the response values are below the detection limits or censored because strong distributional assumptions have to be made on the censored observations when implementing the traditional procedures. In this paper, we propose a quantile methodology that is robust to outliers and can also handle data with a substantial proportion of below-detection-limit observations without the need of imputing the censored values. As a demonstration, we applied the methods to a nutrient monitoring project, which is a part of the Perth Long-Term Ocean Outlet Monitoring Program. In this example, the sample size required by our quantile methodology is, in fact, smaller than that by the traditional t-test, illustrating the merit of our method.
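The robustness argument, that upper quantiles are unaffected by how non-detects are coded as long as the censoring proportion stays below the quantile of interest, is easy to demonstrate. Synthetic lognormal data stand in for the nutrient measurements, and the DL/2 substitution is a common convention rather than the paper's recommendation:

```python
import numpy as np

rng = np.random.default_rng(7)
true_vals = rng.lognormal(mean=1.0, sigma=0.5, size=500)  # synthetic nutrient data
dl = np.quantile(true_vals, 0.4)                          # detection limit: 40% censored
censored = np.where(true_vals < dl, dl / 2.0, true_vals)  # DL/2 substitution for non-detects

# The 0.8 quantile is insensitive to how the bottom 40% is coded; the mean is not:
q_true, q_cens = np.quantile(true_vals, 0.8), np.quantile(censored, 0.8)
m_true, m_cens = true_vals.mean(), censored.mean()
```

The upper quantile depends only on the order statistics above the censoring fraction, so no distributional assumption about the censored values is needed; the mean, by contrast, changes with every imputation choice.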
Shia, Jinru
2016-01-01
The last two decades have seen significant advancement in our understanding of colorectal tumors with DNA mismatch repair (MMR) deficiency. The ever-emerging revelations of new molecular and genetic alterations in various clinical conditions have necessitated constant refinement of disease terminology and classification. Thus, a case with the clinical condition of hereditary non-polyposis colorectal cancer as defined by the Amsterdam criteria may be one of Lynch syndrome characterized by a germline defect in one of the several MMR genes, one of the yet-to-be-defined “Lynch-like syndrome” if there is evidence of MMR deficiency in the tumor but no detectable germline MMR defect or tumor MLH1 promoter methylation, or “familial colorectal cancer type X” if there is no evidence of MMR deficiency. The detection of these conditions carries significant clinical implications. The detection tools and strategies are constantly evolving. The Bethesda guidelines symbolize a selective approach that uses clinical information and tumor histology as the basis to select high-risk individuals. Such a selective approach has subsequently been found to have limited sensitivity, and is thus gradually giving way to the alternative universal approach that tests all newly diagnosed colorectal cancers. Notably, the universal approach also has its own limitations; its cost-effectiveness in real practice, in particular, remains to be determined. Meanwhile, technological advances such as the next-generation sequencing are offering the promise of direct genetic testing for MMR deficiency at an affordable cost probably in the near future. This article reviews the up-to-date molecular definitions of the various conditions related to MMR deficiency, and discusses the tools and strategies that have been used in detecting these conditions. Special emphasis will be placed on the evolving nature and the clinical importance of the disease definitions and the detection strategies. PMID:25716099
Improved bowel preparation increases polyp detection and unmasks significant polyp miss rate
Papanikolaou, Ioannis S; Sioulas, Athanasios D; Magdalinos, Nektarios; Beintaris, Iosif; Lazaridis, Lazaros-Dimitrios; Polymeros, Dimitrios; Malli, Chrysoula; Dimitriadis, George D; Triantafyllou, Konstantinos
2015-01-01
AIM: To retrospectively compare previous-day vs split-dose preparation in terms of bowel cleanliness and polyp detection in patients referred for polypectomy. METHODS: Fifty patients underwent two colonoscopies: one diagnostic in a private clinic and a second for polypectomy in a University Hospital. The latter procedures were performed within 12 wk of the index ones. Examinations were accomplished by two experienced endoscopists, different in each facility. Twenty-seven patients underwent screening/surveillance colonoscopy, while the rest were symptomatic. Previous-day bowel preparation was used for the index procedures and split-dose preparation for polypectomy. Colon cleansing was evaluated using the Aronchick scale. We measured the number of detected polyps, and the polyp miss rates per polyp. RESULTS: Excellent/good preparation was reported in 38 cases with previous-day preparation (76%) vs 46 with split-dose (92%), respectively (P = 0.03). One hundred and twenty-six polyps were detected initially and 169 subsequently (P < 0.0001); 88 vs 126 polyps were diminutive (P < 0.0001), 25 vs 29 small (P = 0.048) and 13 vs 14 equal or larger than 10 mm. The miss rates for total, diminutive, small and large polyps were 25.4%, 30.1%, 13.7% and 6.6%, respectively. Multivariate analysis revealed that split-dose preparation was significantly associated (OR, P) with increased number of polyps detected overall (0.869, P < 0.001), in the right (0.418, P = 0.008) and in the left colon (0.452, P = 0.02). CONCLUSION: Split-dose preparation improved colon cleansing, enhanced polyp detection and unmasked significant polyp miss rates. PMID:26488024
Using statistical distances to detect changes in the normal behavior of ECG-Holter signals
NASA Astrophysics Data System (ADS)
Bastos de Figueiredo, Julio C.; Furuie, Sergio S.
2001-05-01
One of the main problems in the study of complex systems is to define a good metric that can distinguish between different dynamical behaviors in a nonlinear system. In this work we describe a method to detect different types of behaviors in a long-term ECG-Holter recording using short portions of the Holter signal. This method is based on the calculation of the statistical distance between two distributions in the phase-space of a dynamical system. A short portion of an ECG-Holter signal with normal behavior is used to reconstruct the trajectory of an attractor in a low-dimensional phase-space. The points in this trajectory are interpreted as statistical distributions in the phase-space and assumed to represent the normal dynamical behavior of the ECG recording in this space. A fast algorithm is then used to compute the statistical distance between this attractor and all other attractors that are built using a sliding temporal window over the signal. For normal cases the distance stayed almost constant and below a threshold. For cases with abnormal transients, the distance increased consistently with the morphological changes over the abnormal portions of the ECG.
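The core computation, delay-embed a reference window, histogram the resulting attractor, and track a statistical distance over a sliding window, can be sketched like this. A Bhattacharyya distance on 2-D histograms stands in for the paper's distance measure, and the sinusoidal "rhythms" are synthetic:

```python
import numpy as np

def delay_embed(x, dim=2, tau=5):
    """Reconstruct a phase-space trajectory by time-delay embedding."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau:i * tau + n] for i in range(dim)])

def phase_space_hist(x, bins=20, lim=2.0):
    """Normalised 2-D histogram of the embedded trajectory."""
    pts = delay_embed(x)
    h, _, _ = np.histogram2d(pts[:, 0], pts[:, 1], bins=bins,
                             range=[[-lim, lim], [-lim, lim]])
    return h / h.sum()

def bhattacharyya(p, q):
    """Bhattacharyya distance between two discretised distributions."""
    return -np.log(np.sum(np.sqrt(p * q)) + 1e-12)

t = np.arange(4000) * 0.05
sig = np.concatenate([np.sin(t[:2000]),             # "normal" rhythm
                      np.sin(3 * t[2000:]) + 0.3])  # changed morphology

ref = phase_space_hist(sig[:1000])                  # reference attractor
d_norm = bhattacharyya(ref, phase_space_hist(sig[1000:2000]))
d_abn = bhattacharyya(ref, phase_space_hist(sig[2500:3500]))
```

While the signal stays on the reference attractor the distance stays near zero; when the morphology changes, the sliding-window distribution moves to different phase-space bins and the distance rises.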
NASA Astrophysics Data System (ADS)
Svärd, Carl; Nyberg, Mattias; Frisk, Erik; Krysander, Mattias
2014-03-01
An important step in model-based fault detection is residual evaluation, where residuals are evaluated with the aim to detect changes in their behavior caused by faults. To handle residuals subject to time-varying uncertainties and disturbances, which indeed are present in practice, a novel statistical residual evaluation approach is presented. The main contribution is to base the residual evaluation on an explicit comparison of the probability distribution of the residual, estimated online using current data, with a no-fault residual distribution. The no-fault distribution is based on a set of a priori known no-fault residual distributions, and is continuously adapted to the current situation. As a second contribution, a method is proposed for estimating the required set of no-fault residual distributions off-line from no-fault training data. The proposed residual evaluation approach is evaluated with measurement data on a residual for fault detection in the gas-flow system of a Scania truck diesel engine. Results show that small faults can be reliably detected with the proposed approach in cases where regular methods fail.
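The essence of the approach, compare the online residual distribution against a stored no-fault distribution, can be illustrated with a two-sample Kolmogorov-Smirnov statistic. The paper's actual distance measure and its online adaptation scheme differ; everything below is a synthetic stand-in:

```python
import numpy as np

def ks_stat(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (distance only, no p-value)."""
    grid = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return float(np.abs(cdf_a - cdf_b).max())

rng = np.random.default_rng(3)
no_fault = rng.normal(0.0, 1.0, 5000)   # stored no-fault residual distribution
healthy = rng.normal(0.0, 1.0, 300)     # current window, fault-free
faulty = rng.normal(0.8, 1.0, 300)      # window with a small additive fault
```

Comparing whole distributions rather than a single residual mean is what lets small, persistent faults stand out against time-varying noise.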
Dipnall, Joanna F.
2016-01-01
Background Atheoretical large-scale data mining techniques using machine learning algorithms have promise in the analysis of large epidemiological datasets. This study illustrates the use of a hybrid methodology for variable selection that took account of missing data and complex survey design to identify key biomarkers associated with depression from a large epidemiological study. Methods The study used a three-step methodology amalgamating multiple imputation, a machine learning boosted regression algorithm and logistic regression, to identify key biomarkers associated with depression in the National Health and Nutrition Examination Study (2009–2010). Depression was measured using the Patient Health Questionnaire-9 and 67 biomarkers were analysed. Covariates in this study included gender, age, race, smoking, food security, Poverty Income Ratio, Body Mass Index, physical activity, alcohol use, medical conditions and medications. The final imputed weighted multiple logistic regression model included possible confounders and moderators. Results After the creation of 20 imputation data sets from multiple chained regression sequences, machine learning boosted regression initially identified 21 biomarkers associated with depression. Using traditional logistic regression methods, including controlling for possible confounders and moderators, a final set of three biomarkers were selected. The final three biomarkers from the novel hybrid variable selection methodology were red cell distribution width (OR 1.15; 95% CI 1.01, 1.30), serum glucose (OR 1.01; 95% CI 1.00, 1.01) and total bilirubin (OR 0.12; 95% CI 0.05, 0.28). Significant interactions were found between total bilirubin with Mexican American/Hispanic group (p = 0.016), and current smokers (p<0.001). Conclusion The systematic use of a hybrid methodology for variable selection, fusing data mining techniques using a machine learning algorithm with traditional statistical modelling, accounted for missing data and
NASA Astrophysics Data System (ADS)
Perles, Stephanie J.; Wagner, Tyler; Irwin, Brian J.; Manning, Douglas R.; Callahan, Kristina K.; Marshall, Matthew R.
2014-09-01
Forests are socioeconomically and ecologically important ecosystems that are exposed to a variety of natural and anthropogenic stressors. As such, monitoring forest condition and detecting temporal changes therein remain critical to sound public and private forestland management. The National Park Service's Vital Signs monitoring program collects information on many forest health indicators, including species richness, cover by exotics, browse pressure, and forest regeneration. We applied a mixed-model approach to partition variability in data for 30 forest health indicators collected from several national parks in the eastern United States. We then used the estimated variance components in a simulation model to evaluate trend detection capabilities for each indicator. We investigated the extent to which the following factors affected ability to detect trends: (a) sample design: using simple panel versus connected panel design, (b) effect size: increasing trend magnitude, (c) sample size: varying the number of plots sampled each year, and (d) stratified sampling: post-stratifying plots into vegetation domains. Statistical power varied among indicators; however, indicators that measured the proportion of a total yielded higher power when compared to indicators that measured absolute or average values. In addition, the total variability for an indicator appeared to influence power to detect temporal trends more than how total variance was partitioned among spatial and temporal sources. Based on these analyses and the monitoring objectives of the Vital Signs program, the current sampling design is likely overly intensive for detecting a 5 % trend·year⁻¹ for all indicators and is appropriate for detecting a 1 % trend·year⁻¹ in most indicators.
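Monte-Carlo power analysis of the kind described can be sketched in a few lines. The sketch below generates data from year-level and plot-level variance components but deliberately fits a naive OLS slope that ignores the repeated-measures structure, which inflates the nominal false-alarm rate and illustrates why the paper's mixed-model variance partitioning matters. All variance values are invented:

```python
import numpy as np

def trend_power(trend, years=10, plots=30, sd_year=0.02, sd_res=0.05,
                nsim=400, seed=0):
    """Monte-Carlo rejection rate of a naive OLS slope test for a linear
    trend, with data simulated from year- and plot-level variance
    components (all values invented for illustration)."""
    rng = np.random.default_rng(seed)
    t = np.repeat(np.arange(years), plots)
    hits = 0
    for _ in range(nsim):
        y = (1.0 + trend * t
             + rng.normal(0.0, sd_year, years)[t]   # shared year effect
             + rng.normal(0.0, sd_res, len(t)))     # plot-level noise
        tc = t - t.mean()
        slope = (tc * y).sum() / (tc ** 2).sum()
        resid = y - y.mean() - slope * tc
        se = np.sqrt((resid ** 2).sum() / (len(t) - 2) / (tc ** 2).sum())
        hits += abs(slope / se) > 1.96
    return hits / nsim

p_null = trend_power(0.0)            # naive test: false-alarm rate well above 0.05
p_trend = trend_power(0.01, seed=1)  # roughly a 1% trend per year on a unit baseline
```

The inflated null rejection rate comes from treating the shared year effects as independent plot-level noise; a mixed model that partitions the two variance sources restores a valid test and honest power estimates.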
Kim, Jihoon; Grillo, Janice M; Ohno-Machado, Lucila
2011-01-01
Objective To determine whether statistical and machine-learning methods, when applied to electronic health record (EHR) access data, could help identify suspicious (ie, potentially inappropriate) access to EHRs. Methods From EHR access logs and other organizational data collected over a 2-month period, the authors extracted 26 features likely to be useful in detecting suspicious accesses. Selected events were marked as either suspicious or appropriate by privacy officers, and served as the gold standard set for model evaluation. The authors trained logistic regression (LR) and support vector machine (SVM) models on 10-fold cross-validation sets of 1291 labeled events. The authors evaluated the sensitivity of final models on an external set of 58 events that were identified as truly inappropriate and investigated independently from this study using standard operating procedures. Results The area under the receiver operating characteristic curve of the models on the whole data set of 1291 events was 0.91 for LR, and 0.95 for SVM. The sensitivity of the baseline model on this set was 0.8. When the final models were evaluated on the set of 58 investigated events, all of which were determined as truly inappropriate, the sensitivity was 0 for the baseline method, 0.76 for LR, and 0.79 for SVM. Limitations The LR and SVM models may not generalize because of interinstitutional differences in organizational structures, applications, and workflows. Nevertheless, our approach for constructing the models using statistical and machine-learning techniques can be generalized. An important limitation is the relatively small sample used for the training set due to the effort required for its construction. Conclusion The results suggest that statistical and machine-learning methods can play an important role in helping privacy officers detect suspicious accesses to EHRs. PMID:21672912
Significance of parametric spectral ratio methods in detection and recognition of whispered speech
NASA Astrophysics Data System (ADS)
Mathur, Arpit; Reddy, Shankar M.; Hegde, Rajesh M.
2012-12-01
In this article the significance of a new parametric spectral ratio method that can be used to detect whispered speech segments within normally phonated speech is described. Adaptation methods based on the maximum likelihood linear regression (MLLR) are then used to realize a mismatched train-test style speech recognition system. This proposed parametric spectral ratio method computes a ratio spectrum of the linear prediction (LP) and the minimum variance distortionless response (MVDR) methods. The smoothed ratio spectrum is then used to detect whispered segments of speech within neutral speech segments effectively. The proposed LP-MVDR ratio method exhibits robustness at different SNRs as indicated by the whisper diarization experiments conducted on the CHAINS and the cell phone whispered speech corpus. The proposed method also performs reasonably better than the conventional methods for whisper detection. In order to integrate the proposed whisper detection method into a conventional speech recognition engine with minimal changes, adaptation methods based on the MLLR are used herein. The hidden Markov models corresponding to neutral mode speech are adapted to the whispered mode speech data in the whispered regions as detected by the proposed ratio method. The performance of this method is first evaluated on whispered speech data from the CHAINS corpus. The second set of experiments is conducted on the cell phone corpus of whispered speech. This corpus is collected using a setup that is used commercially for handling public transactions. The proposed whisper speech recognition system exhibits reasonably better performance when compared to several conventional methods. The results shown indicate the possibility of a whispered speech recognition system for cell phone based transactions.
Wagner, Tyler; Irwin, Brian J.; Bence, James R.; Hayes, Daniel B.
2016-01-01
Monitoring to detect temporal trends in biological and habitat indices is a critical component of fisheries management. Thus, it is important that management objectives are linked to monitoring objectives. This linkage requires a definition of what constitutes a management-relevant “temporal trend.” It is also important to develop expectations for the amount of time required to detect a trend (i.e., statistical power) and for choosing an appropriate statistical model for analysis. We provide an overview of temporal trends commonly encountered in fisheries management, review published studies that evaluated statistical power of long-term trend detection, and illustrate dynamic linear models in a Bayesian context, as an additional analytical approach focused on shorter term change. We show that monitoring programs generally have low statistical power for detecting linear temporal trends and argue that often management should be focused on different definitions of trends, some of which can be better addressed by alternative analytical approaches.
Towards spatial localisation of harmful algal blooms; statistics-based spatial anomaly detection
NASA Astrophysics Data System (ADS)
Shutler, J. D.; Grant, M. G.; Miller, P. I.
2005-10-01
Harmful algal blooms are believed to be increasing in occurrence and their toxins can be concentrated by filter-feeding shellfish and cause amnesia or paralysis when ingested. As a result, fisheries and beaches in the vicinity of blooms may need to be closed and the local population informed. For such avoidance planning, timely information on the existence of a bloom, its species, and an accurate map of its extent is needed. Current research to detect these blooms from space has mainly concentrated on spectral approaches towards determining species. We present a novel statistics-based background-subtraction technique that produces improved descriptions of an anomaly's extent from remotely-sensed ocean colour data. This is achieved by extracting bulk information from a background model; this is complemented by a computer vision ramp filtering technique to specifically detect the perimeter of the anomaly. The complete extraction technique uses temporal-variance estimates which control the subtraction of the scene of interest from the time-weighted background estimate, producing confidence maps of anomaly extent. Through the variance estimates the method learns the associated noise present in the data sequence, providing robustness, and allowing generic application. Further, the use of the median for the background model reduces the effects of anomalies that appear within the time sequence used to generate it, allowing seasonal variations in the background levels to be closely followed. To illustrate the detection algorithm's application, it has been applied to two spectrally different oceanic regions.
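A toy version of the background-subtraction idea (per-pixel temporal median as the background model, temporal variance controlling the threshold) might look like the following; the paper's time-weighted background and ramp filtering are not reproduced, and all names are illustrative.

```python
from statistics import median, pstdev

def anomaly_map(history, scene, k=3.0):
    """Flag pixels that depart from a temporal median background.

    history : list of earlier scenes, each a flat list of pixel values.
    scene   : the scene of interest, same length as each history frame.
    The background is the per-pixel median of the history, and the
    per-pixel temporal standard deviation sets the subtraction
    threshold (k standard deviations).
    """
    flags = []
    for i in range(len(scene)):
        series = [frame[i] for frame in history]
        bg = median(series)
        sd = pstdev(series) or 1e-9  # guard against a perfectly flat series
        flags.append(abs(scene[i] - bg) > k * sd)
    return flags
```

Because the median is robust, a transient anomaly inside the history window barely shifts the background, which is the property the abstract highlights.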
Zhao, Xing; Zhou, Xiao-Hua; Feng, Zijian; Guo, Pengfei; He, Hongyan; Zhang, Tao; Duan, Lei; Li, Xiaosong
2013-01-01
As a useful tool for geographical cluster detection of events, the spatial scan statistic is widely applied in many fields and plays an increasingly important role. The classic version of the spatial scan statistic for a binary outcome was developed by Kulldorff, based on the Bernoulli or the Poisson probability model. In this paper, we apply the Hypergeometric probability model to construct the likelihood function under the null hypothesis. Compared with existing methods, constructing the likelihood function under the null hypothesis is an alternative, indirect way to identify the potential cluster, and the test statistic is the extreme value of the likelihood function. As in Kulldorff's methods, we adopt a Monte Carlo test of significance. Both methods are applied to detect spatial clusters of Japanese encephalitis in Sichuan province, China, in 2009, and the detected clusters are identical. A simulation on independent benchmark data indicates that the test statistic based on the Hypergeometric model outperforms Kulldorff's statistics for clusters of high population density or large size; otherwise, Kulldorff's statistics are superior.
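The Monte Carlo significance test common to these scan-statistic methods is simple to sketch: recompute the scan statistic on data randomized under the null and rank the observed value among the replicates. The toy max-window statistic below stands in for the likelihood-based statistic; all names are illustrative.

```python
import random

def scan_max_window(counts, width):
    """Toy scan statistic: the maximum total count over any window of
    `width` contiguous regions (a stand-in for the likelihood-based
    statistic scanned over candidate clusters)."""
    return max(sum(counts[i:i + width])
               for i in range(len(counts) - width + 1))

def randomize(total, n_regions, rng):
    """Redistribute `total` cases uniformly at random over the regions:
    the null hypothesis of no cluster (equal risk per region)."""
    counts = [0] * n_regions
    for _ in range(total):
        counts[rng.randrange(n_regions)] += 1
    return counts

def monte_carlo_p(t_obs, null_stats):
    """Monte Carlo p-value: rank of the observed statistic among the
    null replicates, as used with Kulldorff-style scan statistics."""
    ge = sum(1 for t in null_stats if t >= t_obs)
    return (1 + ge) / (1 + len(null_stats))
```

With 199 replicates the smallest attainable p-value is 1/200 = 0.005, which is why replicate counts like 199, 999, or 9999 are conventional.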
A Hybrid Swarm Intelligence Algorithm for Intrusion Detection Using Significant Features.
Amudha, P; Karthik, S; Sivakumari, S
2015-01-01
Intrusion detection has become a main part of network security due to the huge number of attacks that affect computers. This is due to the extensive growth of internet connectivity and accessibility to information systems worldwide. To deal with this problem, a hybrid algorithm is proposed in this paper that integrates Modified Artificial Bee Colony (MABC) with Enhanced Particle Swarm Optimization (EPSO) to address the intrusion detection problem. The algorithms are combined to obtain better optimization results, and classification accuracies are obtained by the 10-fold cross-validation method. The purpose of this paper is to select the most relevant features that can represent the pattern of the network traffic and test their effect on the success of the proposed hybrid classification algorithm. To investigate the performance of the proposed method, the intrusion detection KDDCup'99 benchmark dataset from the UCI Machine Learning repository is used. The performance of the proposed method is compared with other machine learning algorithms and found to be significantly different. PMID:26221625
Three-dimensional building detection and modeling using a statistical approach.
Cord, M; Declercq, D
2001-01-01
In this paper, we address the problem of building reconstruction in high-resolution stereoscopic aerial imagery. We present a hierarchical strategy to detect and model buildings in urban sites, based on a global focusing process followed by a local modeling. During the first step, we extract the building regions by fully exploiting the depth information obtained with a new adaptive correlation stereo matching. In the modeling step, we propose a statistical approach that is competitive with sequential methods using segmentation and modeling. This parametric method is based on a multiplane model of the data, interpreted as a mixture model. From a Bayesian point of view, the so-called augmentation of the model with indicator variables allows using stochastic algorithms to achieve both model parameter estimation and plane segmentation. We then report a Monte Carlo study of the performance of the stochastic algorithm on synthetic data, before displaying results on real data.
Du, Fei; Li, Yibo; Jin, Shijiu
2015-01-01
An accurate performance analysis of the MDL criterion for source enumeration in array processing is presented in this paper. The enumeration results of MDL can be predicted precisely by the proposed procedure via statistical analysis of the sample eigenvalues, whose distributional properties are investigated taking their interactions into account. A novel approach is also developed for the performance evaluation when the source number is underestimated by a number greater than one, denoted as “multiple-missed detection”, and the probability of a specific underestimated source number can be estimated by ratio distribution analysis. Simulation results are included to demonstrate the superiority of the presented method over available results and confirm the ability of the proposed approach to perform multiple-missed detection analysis. PMID:26295232
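For reference, the classical (Wax–Kailath) MDL criterion whose enumeration performance the paper analyzes can be sketched as follows: choose the model order k that minimizes a log-likelihood term built on the smallest sample eigenvalues plus a complexity penalty. This is a minimal illustrative implementation, not the authors' analysis code.

```python
import math

def mdl_order(eigvals, n):
    """Wax-Kailath MDL source enumeration.

    eigvals : sample eigenvalues in decreasing order (length p).
    n       : number of snapshots.
    Returns the k in 0..p-1 minimizing
        -n (p-k) log( geo_mean / arith_mean of the p-k smallest ) +
        0.5 k (2p - k) log n.
    """
    p = len(eigvals)
    best_k, best = 0, float("inf")
    for k in range(p):
        tail = eigvals[k:]
        m = p - k
        geo = math.exp(sum(math.log(v) for v in tail) / m)
        arith = sum(tail) / m
        mdl = (-n * m * math.log(geo / arith)
               + 0.5 * k * (2 * p - k) * math.log(n))
        if mdl < best:
            best, best_k = mdl, k
    return best_k
```

When the noise eigenvalues are equal, the ratio of geometric to arithmetic mean is 1 and the first term vanishes, so the penalty alone selects the true order.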
COMPASS server for homology detection: improved statistical accuracy, speed and functionality
Sadreyev, Ruslan I.; Tang, Ming; Kim, Bong-Hyun; Grishin, Nick V.
2009-01-01
COMPASS is a profile-based method for the detection of remote sequence similarity and the prediction of protein structure. Here we describe a recently improved public web server of COMPASS, http://prodata.swmed.edu/compass. The server features three major developments: (i) improved statistical accuracy; (ii) increased speed from parallel implementation; and (iii) new functional features facilitating structure prediction. These features include visualization tools that allow the user to quickly and effectively analyze specific local structural region predictions suggested by COMPASS alignments. As an application example, we describe the structural, evolutionary and functional analysis of a protein with unknown function that served as a target in the recent CASP8 (Critical Assessment of Techniques for Protein Structure Prediction round 8). URL: http://prodata.swmed.edu/compass PMID:19435884
Hardingham, Jennifer E; Grover, Phulwinder; Winter, Marnie; Hewett, Peter J; Price, Timothy J; Thierry, Benjamin
2015-01-01
Circulating tumor cells (CTC) may be defined as tumor- or metastasis-derived cells that are present in the bloodstream. The CTC pool in colorectal cancer (CRC) patients may include not only epithelial tumor cells, but also tumor cells undergoing epithelial–mesenchymal transition (EMT) and tumor stem cells. A significant number of patients diagnosed with early stage CRC subsequently relapse with recurrent or metastatic disease despite undergoing “curative” resection of their primary tumor. This suggests that an occult metastatic disease process was already underway, with viable tumor cells being shed from the primary tumor site, at least some of which have proliferative and metastatic potential and the ability to survive in the bloodstream. Such tumor cells are considered to be responsible for disease relapse in these patients. Their detection in peripheral blood at the time of diagnosis or after resection of the primary tumor may identify those early-stage patients who are at risk of developing recurrent or metastatic disease and who would benefit from adjuvant therapy. CTC may also be a useful adjunct to radiological assessment of tumor response to therapy. Over the last 20 years many approaches have been developed for the isolation and characterization of CTC. However, none of these methods can be considered the gold standard for detection of the entire pool of CTC. Recently our group has developed novel unbiased inertial microfluidics to enrich for CTC, followed by identification of CTC by imaging flow cytometry. Here, we provide a review of progress on CTC detection and clinical significance over the last 20 years. PMID:26605644
Hébert-Dufresne, Laurent; Grochow, Joshua A.; Allard, Antoine
2016-01-01
We introduce a network statistic that measures structural properties at the micro-, meso-, and macroscopic scales, while still being easy to compute and interpretable at a glance. Our statistic, the onion spectrum, is based on the onion decomposition, which refines the k-core decomposition, a standard network fingerprinting method. The onion spectrum is exactly as easy to compute as the k-cores: It is based on the stages at which each vertex gets removed from a graph in the standard algorithm for computing the k-cores. Yet, the onion spectrum reveals much more information about a network, and at multiple scales; for example, it can be used to quantify node heterogeneity, degree correlations, centrality, and tree- or lattice-likeness. Furthermore, unlike the k-core decomposition, the combined degree-onion spectrum immediately gives a clear local picture of the network around each node which allows the detection of interesting subgraphs whose topological structure differs from the global network organization. This local description can also be leveraged to easily generate samples from the ensemble of networks with a given joint degree-onion distribution. We demonstrate the utility of the onion spectrum for understanding both static and dynamic properties on several standard graph models and on many real-world networks. PMID:27535466
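A minimal sketch of the onion decomposition described above: run the standard k-core peeling algorithm and record the stage at which each vertex is removed; the per-stage counts then give the onion spectrum. Function and variable names below are illustrative.

```python
def onion_decomposition(adj):
    """Onion decomposition of a simple undirected graph given as
    {vertex: set_of_neighbours}.

    Returns (coreness, layer): the k-core number of each vertex and the
    peeling stage (onion layer) at which it was removed.  Layers refine
    cores: vertices deleted in the same pass of the standard k-core
    algorithm share a layer.
    """
    adj = {v: set(nbrs) for v, nbrs in adj.items()}
    coreness, layer = {}, {}
    stage = 0
    while adj:
        k = min(len(nbrs) for nbrs in adj.values())
        # peel until the remainder is a (k+1)-core; each pass is a layer
        while adj and min(len(nbrs) for nbrs in adj.values()) <= k:
            stage += 1
            batch = {v for v, nbrs in adj.items() if len(nbrs) <= k}
            for v in batch:
                coreness[v], layer[v] = k, stage
            for v in batch:              # detach the layer from the rest
                for u in adj[v]:
                    if u not in batch:
                        adj[u].discard(v)
            for v in batch:
                del adj[v]
    return coreness, layer
```

On a triangle with a two-vertex pendant path, the path peels off in two 1-core layers before the triangle is removed as a single 2-core layer.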
NASA Astrophysics Data System (ADS)
Robila, Stefan A.
2005-03-01
Hyperspectral data is modeled as an unknown mixture of original features (such as the materials present in the scene). The goal is to find the unmixing matrix and to perform the inversion in order to recover them. Unlike first- and second-order techniques (such as PCA), higher-order statistics (HOS) methods assume the data has non-Gaussian behavior and are able to represent much subtler differences among the original features. The HOS algorithms transform the data such that the resulting components are uncorrelated and their non-Gaussianity is maximized (the resulting components are statistically independent). Subpixel targets in a natural background can be seen as anomalies of the image scene. They exhibit strongly non-Gaussian behavior and correspond to independent components, leading to their detection when HOS techniques are employed. The methods presented in this paper start by preprocessing the hyperspectral image through centering and sphering. The resulting bands are transformed using gradient-based optimization of the HOS measure. Next, the data are reduced through a selection of the components associated with small targets using the changes of the slope in the scree graph of the non-Gaussianity values. The targets are filtered using histogram-based analysis. The end result is a map of the pixels associated with small targets.
Structural damage detection using extended Kalman filter combined with statistical process control
NASA Astrophysics Data System (ADS)
Jin, Chenhao; Jang, Shinae; Sun, Xiaorong
2015-04-01
Traditional modal-based methods, which identify damage based upon changes in vibration characteristics of the structure on a global basis, have received considerable attention in the past decades. However, the effectiveness of the modal-based methods is dependent on the type of damage and the accuracy of the structural model, and these methods may also have difficulties when applied to complex structures. The extended Kalman filter (EKF) algorithm, which has the capability to estimate parameters and catch abrupt changes, is currently used in continuous and automatic structural damage detection to overcome disadvantages of traditional methods. Structural parameters are typically slow-changing variables under the effects of operational and environmental conditions, so it would be difficult to observe and quantify structural damage in real time with the EKF alone. In this paper, Statistical Process Control (SPC) is combined with the EKF method in order to overcome this difficulty. Based on historical measurements of damage-sensitive features involved in the state-space dynamic models, the EKF algorithm is used to produce real-time estimates of these features as well as their standard deviations, which can then be used to form control ranges for SPC to detect any abnormality in the selected features. Moreover, confidence levels of the detection can be adjusted by choosing different multiples of sigma and different numbers of adjacent out-of-range points. The proposed method is tested using simulated data of a three-story linear building in different damage scenarios, and numerical results demonstrate the high damage-detection accuracy and low computational cost of the presented method.
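The SPC stage of this approach can be sketched on its own: build mean ± n·sigma control ranges from healthy-history estimates and raise an alarm only when several adjacent points fall out of range. The EKF producing those estimates is not reproduced here; names and defaults are illustrative.

```python
from statistics import mean, pstdev

def spc_alarms(history, stream, n_sigma=3.0, run_length=2):
    """Flag damage when `run_length` consecutive new estimates fall
    outside control limits built from the healthy history.

    history : parameter estimates from the undamaged state.
    stream  : incoming estimates (e.g. produced online by an EKF).
    Returns the indices in `stream` at which an alarm is raised.
    Raising n_sigma or run_length raises the confidence level of a
    detection at the cost of sensitivity.
    """
    mu, sd = mean(history), pstdev(history)
    lo, hi = mu - n_sigma * sd, mu + n_sigma * sd
    out = [not (lo <= x <= hi) for x in stream]
    return [i for i in range(run_length - 1, len(out))
            if all(out[i - run_length + 1:i + 1])]
```

A single spurious excursion never alarms when run_length > 1, which is how the adjacent-points rule suppresses false positives from noisy estimates.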
Frome, EL
2005-09-20
Environmental exposure measurements are, in general, positive and may be subject to left censoring; i.e., the measured value is less than a ''detection limit''. In occupational monitoring, strategies for assessing workplace exposures typically focus on the mean exposure level or the probability that any measurement exceeds a limit. Parametric methods used to determine acceptable levels of exposure are often based on a two-parameter lognormal distribution. The mean exposure level, an upper percentile, and the exceedance fraction are used to characterize exposure levels, and confidence limits are used to describe the uncertainty in these estimates. Statistical methods for random samples (without non-detects) from the lognormal distribution are well known for each of these situations. In this report, methods for estimating these quantities based on the maximum likelihood method for randomly left-censored lognormal data are described, and graphical methods are used to evaluate the lognormal assumption. If the lognormal model is in doubt and an alternative distribution for the exposure profile of a similar exposure group is not available, then nonparametric methods for left-censored data are used. The mean exposure level, along with the upper confidence limit, is obtained using the product limit estimate, and the upper confidence limit on an upper percentile (i.e., the upper tolerance limit) is obtained using a nonparametric approach. All of these methods are well known, but computational complexity has limited their use in routine data analysis with left-censored data. The recent development of the R environment for statistical data analysis and graphics has greatly enhanced the availability of high-quality nonproprietary (open source) software that serves as the basis for implementing the methods in this paper.
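A minimal sketch of the parametric side described here: maximize the left-censored lognormal likelihood (detected values contribute the density, non-detects contribute the CDF at their detection limits) and derive the exceedance fraction from the fitted parameters. The crude grid search below stands in for a proper optimizer; all names are illustrative.

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def censored_lognormal_mle(detects, nondetect_limits, grid=60):
    """Grid-search MLE of (mu, sigma) for left-censored lognormal data.

    detects          : measured values above their detection limits.
    nondetect_limits : detection limits of the censored observations;
                       each contributes log Phi((log DL - mu)/sigma).
    """
    logs = [math.log(x) for x in detects]
    lo_mu, hi_mu = min(logs) - 1.0, max(logs) + 1.0
    best = (None, None, -float("inf"))
    for i in range(grid + 1):
        mu = lo_mu + (hi_mu - lo_mu) * i / grid
        for j in range(1, grid + 1):
            s = 3.0 * j / grid
            # detected values: lognormal log-density
            ll = sum(-math.log(s * x * math.sqrt(2 * math.pi))
                     - (math.log(x) - mu) ** 2 / (2 * s * s)
                     for x in detects)
            # non-detects: probability of falling below the limit
            ll += sum(math.log(max(phi((math.log(dl) - mu) / s), 1e-300))
                      for dl in nondetect_limits)
            if ll > best[2]:
                best = (mu, s, ll)
    return best[0], best[1]

def exceedance_fraction(mu, sigma, limit):
    """Estimated fraction of exposures exceeding an occupational limit."""
    return 1.0 - phi((math.log(limit) - mu) / sigma)
```

With no censoring the fit collapses to the ordinary lognormal MLE (sample mean and standard deviation of the logs), which provides an easy sanity check.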
Yokoyama, Shozo; Takenaka, Naomi
2005-04-01
Red-green color vision is strongly suspected to enhance the survival of its possessors. Despite being red-green color blind, however, many species have successfully competed in nature, which brings into question the evolutionary advantage of achieving red-green color vision. Here, we propose a new method of identifying positive selection at individual amino acid sites with the premise that if positive Darwinian selection has driven the evolution of the protein under consideration, then it should be found mostly at the branches in the phylogenetic tree where its function had changed. The statistical and molecular methods have been applied to 29 visual pigments with the wavelengths of maximal absorption at approximately 510-540 nm (green- or middle wavelength-sensitive [MWS] pigments) and at approximately 560 nm (red- or long wavelength-sensitive [LWS] pigments), which are sampled from a diverse range of vertebrate species. The results show that the MWS pigments are positively selected through amino acid replacements S180A, Y277F, and T285A and that the LWS pigments have been subjected to strong evolutionary conservation. The fact that these positively selected M/LWS pigments are found not only in animals with red-green color vision but also in those with red-green color blindness strongly suggests that both red-green color vision and color blindness have undergone adaptive evolution independently in different species.
Kurtz, S.E.; Fields, D.E.
1983-10-01
This report describes a version of the TERPED/P computer code that is very useful for small data sets. A new algorithm for determining the Kolmogorov-Smirnov (KS) statistics is used to extend program applicability. The TERPED/P code facilitates the analysis of experimental data and assists the user in determining its probability distribution function. Graphical and numerical tests are performed interactively in accordance with the user's assumption of normally or log-normally distributed data. Statistical analysis options include computation of the chi-square statistic and the KS one-sample test statistic and the corresponding significance levels. Cumulative probability plots of the user's data are generated either via a local graphics terminal, a local line printer or character-oriented terminal, or a remote high-resolution graphics device such as the FR80 film plotter or the Calcomp paper plotter. Several useful computer methodologies suffer from limitations of their implementations of the KS nonparametric test. This test is one of the more powerful analysis tools for examining the validity of an assumption about the probability distribution of a set of data. KS algorithms are found in other analysis codes, including the Statistical Analysis Subroutine (SAS) package and earlier versions of TERPED. The inability of these algorithms to generate significance levels for sample sizes less than 50 has limited their usefulness. The release of the TERPED code described herein contains algorithms to allow computation of the KS statistic and significance level for data sets of as few as three points, if the user wishes. Values computed for the KS statistic are within 3% of the correct value for all data set sizes.
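The KS one-sample statistic itself takes only a few lines, and for small samples (the regime TERPED/P targets) its significance level can be obtained by simulation under the null rather than from asymptotic formulas, which are unreliable below n = 50. This is an illustrative sketch, not TERPED's algorithm.

```python
import random

def ks_statistic(data, cdf):
    """One-sample Kolmogorov-Smirnov statistic D against a hypothesized
    continuous CDF: the largest gap between the empirical and
    hypothesized distribution functions, checked on both sides of each
    step of the empirical CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_p_monte_carlo(d_obs, n, sims=2000, seed=1):
    """Small-sample significance level of D by simulation under H0.
    By the probability integral transform, uniform samples against the
    identity CDF suffice for any continuous hypothesized distribution."""
    rng = random.Random(seed)
    count = 0
    for _ in range(sims):
        sample = [rng.random() for _ in range(n)]
        if ks_statistic(sample, lambda u: u) >= d_obs:
            count += 1
    return count / sims
```

For n = 3 the 5% critical value of D is about 0.708, so a simulated p-value for D = 0.7 should land just above 0.05.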
NASA Astrophysics Data System (ADS)
Flach, Milan; Mahecha, Miguel; Gans, Fabian; Rodner, Erik; Bodesheim, Paul; Guanche-Garcia, Yanira; Brenning, Alexander; Denzler, Joachim; Reichstein, Markus
2016-04-01
The number of available Earth observations (EOs) is currently substantially increasing. Detecting anomalous patterns in these multivariate time series is an important step in identifying changes in the underlying dynamical system. Likewise, data quality issues might result in anomalous multivariate data constellations and have to be identified before corrupting subsequent analyses. In industrial applications, a common strategy is to monitor production chains with several sensors coupled to some statistical process control (SPC) algorithm. The basic idea is to raise an alarm when these sensor data depict some anomalous pattern according to the SPC, i.e. the production chain is considered 'out of control'. In fact, the industrial applications are conceptually similar to the on-line monitoring of EOs. However, algorithms used in the context of SPC or process monitoring are rarely considered for supervising multivariate spatio-temporal Earth observations. The objective of this study is to exploit the potential and transferability of SPC concepts to Earth system applications. We compare a range of different algorithms typically applied by SPC systems and evaluate their capability to detect e.g. known extreme events in land surface processes. Specifically two main issues are addressed: (1) identifying the most suitable combination of data pre-processing and detection algorithm for a specific type of event and (2) analyzing the limits of the individual approaches with respect to the magnitude and spatio-temporal size of the event as well as the data's signal-to-noise ratio. Extensive artificial data sets that represent the typical properties of Earth observations are used in this study. Our results show that the majority of the algorithms used can be considered for the detection of multivariate spatio-temporal events and directly transferred to real Earth observation data as currently assembled in different projects at the European scale, e.g. http://baci-h2020.eu
Empirical Bayes scan statistics for detecting clusters of disease risk variants in genetic studies.
McCallum, Kenneth J; Ionita-Laza, Iuliana
2015-12-01
Recent developments of high-throughput genomic technologies offer an unprecedented detailed view of the genetic variation in various human populations, and promise to lead to significant progress in understanding the genetic basis of complex diseases. Despite this tremendous advance in data generation, it remains very challenging to analyze and interpret these data due to their sparse and high-dimensional nature. Here, we propose novel applications and new developments of empirical Bayes scan statistics to identify genomic regions significantly enriched with disease risk variants. We show that the proposed empirical Bayes methodology can be substantially more powerful than existing scan statistics methods especially so in the presence of many non-disease risk variants, and in situations when there is a mixture of risk and protective variants. Furthermore, the empirical Bayes approach has greater flexibility to accommodate covariates such as functional prediction scores and additional biomarkers. As proof-of-concept we apply the proposed methods to a whole-exome sequencing study for autism spectrum disorders and identify several promising candidate genes.
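As a non-Bayesian baseline for the scan-statistic idea (the paper's empirical Bayes machinery is not reproduced here), a fixed-width window scan over variant positions with a Monte Carlo p-value might look like the following sketch; the region length, variant counts, and cluster placement are illustrative assumptions.

```python
import numpy as np

def scan_statistic(positions, window):
    """Maximum number of variant positions falling in any window of given width."""
    positions = np.sort(positions)
    return max(((positions >= p) & (positions < p + window)).sum()
               for p in positions)

rng = np.random.default_rng(8)
L, n, w = 10_000, 40, 500            # region length, variant count, window width
# 28 background variants plus 12 clustered in a 300-bp stretch
obs_pos = np.concatenate([rng.uniform(0, L, 28), rng.uniform(4000, 4300, 12)])
obs = scan_statistic(obs_pos, w)
# Monte Carlo null: same number of variants, uniform over the region
null = [scan_statistic(rng.uniform(0, L, n), w) for _ in range(500)]
pval = (1 + sum(s >= obs for s in null)) / (len(null) + 1)
```

The empirical Bayes refinement described in the abstract would replace the raw window count with a statistic that weights variants by their estimated risk, which is what buys power when many variants are non-causal.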
Statistical modeling, detection, and segmentation of stains in digitized fabric images
NASA Astrophysics Data System (ADS)
Gururajan, Arunkumar; Sari-Sarraf, Hamed; Hequet, Eric F.
2007-02-01
This paper describes a novel and automated system, based on a computer vision approach, for objective evaluation of stain release on cotton fabrics. Digitized color images of the stained fabrics are obtained, and the pixel values in the color and intensity planes of these images are probabilistically modeled as a Gaussian Mixture Model (GMM). Stain detection is posed as a decision-theoretic problem, where the null hypothesis corresponds to the absence of a stain. The null hypothesis and the alternate hypothesis mathematically translate into a first-order GMM and a second-order GMM, respectively. The parameters of the GMM are estimated using a modified Expectation-Maximization (EM) algorithm. Minimum Description Length (MDL) is then used as the test statistic to decide the verity of the null hypothesis. The stain is then segmented by a decision rule based on the probability map generated by the EM algorithm. The proposed approach was tested on a dataset of 48 fabric images soiled with stains of ketchup, corn oil, mustard, Ragu sauce, Revlon makeup and grape juice. The decision-theoretic part of the algorithm produced a correct detection rate (true positive) of 93% and a false alarm rate of 5% on this set of images.
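A stripped-down version of the hypothesis test described above — fit a one-component and a two-component Gaussian mixture by EM and compare MDL scores — can be sketched in one dimension. The paper works on color/intensity planes with a modified EM; everything here (1-D data, plain EM, parameter counts) is a simplification for illustration.

```python
import numpy as np

def log_gauss(x, mu, var):
    return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def gmm2_loglik(x, iters=200):
    """EM fit of a 1-D two-component Gaussian mixture; returns final log-likelihood."""
    mu = np.array([x.min(), x.max()])
    var = np.array([x.var(), x.var()])
    w = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibilities (log-space for stability)
        ll = np.stack([np.log(w[k]) + log_gauss(x, mu[k], var[k]) for k in range(2)])
        r = np.exp(ll - ll.max(axis=0))
        r /= r.sum(axis=0)
        # M-step: weighted parameter updates
        nk = r.sum(axis=1)
        w, mu = nk / len(x), (r * x).sum(axis=1) / nk
        var = (r * (x - mu[:, None]) ** 2).sum(axis=1) / nk
    ll = np.stack([np.log(w[k]) + log_gauss(x, mu[k], var[k]) for k in range(2)])
    return np.logaddexp(ll[0], ll[1]).sum()

def mdl(loglik, n_params, n):
    """MDL score: negative log-likelihood plus a complexity penalty."""
    return -loglik + 0.5 * n_params * np.log(n)

rng = np.random.default_rng(1)
# background pixels around 0, "stain" pixels around 6
x = np.concatenate([rng.normal(0, 1, 400), rng.normal(6, 1, 100)])
mdl1 = mdl(log_gauss(x, x.mean(), x.var()).sum(), 2, len(x))   # H0: no stain
mdl2 = mdl(gmm2_loglik(x), 5, len(x))                          # H1: stain present
stain_detected = mdl2 < mdl1
```

The decision rule mirrors the abstract: the model order with the lower description length wins, so a clearly bimodal pixel distribution is flagged as stained.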
NASA Astrophysics Data System (ADS)
Zakaria, Chahnez; Curé, Olivier; Salzano, Gabriella; Smaïli, Kamel
In Computer Supported Cooperative Work (CSCW), it is crucial for project leaders to detect conflicting situations as early as possible. Generally, this task is performed manually by studying a set of documents exchanged between team members. In this paper, we propose a full-fledged automatic solution that identifies documents, subjects and actors involved in relational conflicts. Our approach detects conflicts in emails, probably the most popular type of document in CSCW, but the methods used can handle other text-based documents. These methods rely on the combination of statistical and ontological operations. The proposed solution is decomposed into several steps: (i) we enrich a simple negative emotion ontology with terms occurring in the corpus of emails, (ii) we categorize each conflicting email according to the concepts of this ontology and (iii) we identify emails, subjects and team members involved in conflicting emails using possibilistic description logic and a set of proposed measures. Each of these steps is evaluated and validated on concrete examples. Moreover, this approach's framework is generic and can be easily adapted to domains other than conflicts, e.g. security issues, and extended with operations making use of our proposed set of measures.
Early detection of illness associated with poisonings of public health significance.
Wolkin, Amy F; Patel, Manish; Watson, William; Belson, Martin; Rubin, Carol; Schier, Joshua; Kilbourne, Edwin M; Crawford, Carol Gotway; Wattigney, Wendy; Litovitz, Toby
2006-02-01
Since September 11, 2001, concern about potential terrorist attacks has increased in the United States. To reduce morbidity and mortality from outbreaks of illness from the intentional release of chemical agents, we examine data from the Toxic Exposure Surveillance System (TESS). TESS, a national system for timely collection of reports from US poison control centers, can facilitate early recognition of outbreaks of illness from chemical exposures. TESS data can serve as proxy markers for a diagnosis and may provide early alerts to potential outbreaks of covert events. We use 3 categories of information from TESS to detect potential outbreaks, including call volume, clinical effect, and substance-specific data. Analysis of the data identifies aberrations by comparing the observed number of events with a threshold based on historical data. Using TESS, we have identified several events of potential public health significance, including an arsenic poisoning at a local church gathering in Maine, the TOPOFF 2 national preparedness exercise, and contaminated food and water during the northeastern US blackout. Integration of poison control centers into the public health network will enhance the detection and response to emerging chemical threats. Traditionally, emergency physicians and other health care providers have used poison control centers for management information; their reporting to these centers is crucial in poisoning surveillance efforts.
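The aberration-detection step described above — compare an observed count against a threshold derived from historical data — reduces, in its simplest form, to something like the following. The mean-plus-z-sigma rule and the call counts are generic illustrative assumptions, not TESS's actual algorithm or data.

```python
import numpy as np

def aberration_alarm(history, today, z=2.0):
    """Flag today's count if it exceeds the historical mean by > z std devs."""
    mu, sd = np.mean(history), np.std(history, ddof=1)
    return today > mu + z * sd

# daily poison-center call counts for one substance (hypothetical numbers)
baseline = [12, 9, 14, 11, 10, 13, 12, 8, 11, 10]
spike_flagged = aberration_alarm(baseline, 30)     # sudden surge -> alarm
normal_ok = not aberration_alarm(baseline, 12)     # ordinary day -> no alarm
```

Operational systems typically refine this with seasonal baselines and day-of-week adjustment, but the observed-versus-historical-threshold comparison is the core of the approach the abstract describes.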
Liu, Ran; Jin, Cuiyun; Song, Fengjuan; Liu, Jing
2013-01-01
The conductivity and permittivity of tumors are known to differ significantly from those of normal tissues. Electrical impedance tomography (EIT) is a relatively new imaging method for exploiting these differences. However, the accuracy of data capture is one of the difficult problems that urgently needs to be solved before EIT technology can be applied clinically. A new concept of EIT sensitizers is put forward in this paper, with the goal of expanding the contrast ratio between tumor and healthy tissue to enhance EIT imaging quality. The use of nanoparticles for changing tumor characteristics and determining the infiltration vector for easier detection has been widely accepted in the biomedical field. Ultra-pure water, normal saline, and gold nanoparticles, three kinds of material with large differences in electrical characteristics, are considered as sensitizers and undergo mathematical model analysis and animal experimentation. Our preliminary results suggest that nanoparticles are promising for sensitization work. Furthermore, in experimental and simulation results, we found that different sensitizers should be selected for the detection of different types and stages of tumor. PMID:23319858
Pepe, Pietro; Pennisi, Michele; Fraggetta, Filippo
2015-01-01
ABSTRACT Purpose: The detection rate for anterior prostate cancer (PCa) in men who underwent initial and repeat biopsy was prospectively evaluated. Materials and Methods: From January 2013 to March 2014, 400 patients, all of Caucasian origin (median age 63.5 years), underwent initial (285 cases) or repeat (115 cases) prostate biopsy; all the men had negative digital rectal examination and the indications for biopsy were: PSA values > 10 ng/mL, or PSA between 4.1-10 or 2.6-4 ng/mL with free/total PSA ≤25% and ≤20%, respectively. A median of 22 cores (initial biopsy) and 31 cores (repeat biopsy) were taken transperineally, including 4 cores of the anterior zone (AZ) and 4 cores of the AZ plus 2 cores of the transition zone (TZ), respectively. Results: Median PSA was 7.9 ng/mL; overall, a PCa was found in 180 (45%) patients: in 135 (47.4%) and 45 (36%) of the men who underwent initial and repeat biopsy, respectively. An exclusive PCa of the anterior zone was found in 8.9% (initial biopsy) vs 13.3% (repeat biopsy) of the men; a single microfocus of cancer was found in 61.2% of the cases; moreover, in 7 out of 18 AZ PCa cases the biopsy histology was predictive of significant cancer: in 2 (28.5%) and 5 (71.5%) men who underwent initial and repeat biopsy, respectively. Conclusions: Although AZ biopsies increased the detection rate for PCa (10% of the cases), the majority of AZ PCa with histological findings predictive of clinically significant cancer were found at repeat biopsy (about 70% of the cases). PMID:26689509
Akrami, Yashar; Savage, Christopher; Scott, Pat; Conrad, Jan; Edsjö, Joakim E-mail: savage@fysik.su.se E-mail: conrad@fysik.su.se
2011-07-01
Models of weak-scale supersymmetry offer viable dark matter (DM) candidates. Their parameter spaces are however rather large and complex, such that pinning down the actual parameter values from experimental data can depend strongly on the employed statistical framework and scanning algorithm. In frequentist parameter estimation, a central requirement for properly constructed confidence intervals is that they cover true parameter values, preferably at exactly the stated confidence level when experiments are repeated infinitely many times. Since most widely-used scanning techniques are optimised for Bayesian statistics, one needs to assess their abilities in providing correct confidence intervals in terms of the statistical coverage. Here we investigate this for the Constrained Minimal Supersymmetric Standard Model (CMSSM) when only constrained by data from direct searches for dark matter. We construct confidence intervals from one-dimensional profile likelihoods and study the coverage by generating several pseudo-experiments for a few benchmark sets of pseudo-true parameters. We use nested sampling to scan the parameter space and evaluate the coverage for the benchmarks when either flat or logarithmic priors are imposed on gaugino and scalar mass parameters. The sampling algorithm has been used in the configuration usually adopted for exploration of the Bayesian posterior. We observe both under- and over-coverage, which in some cases vary quite dramatically when benchmarks or priors are modified. We show how most of the variation can be explained as the impact of explicit priors as well as sampling effects, where the latter are indirectly imposed by physicality conditions. For comparison, we also evaluate the coverage for Bayesian credible intervals, and observe significant under-coverage in those cases.
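The notion of statistical coverage being tested above can be illustrated with a toy version of the procedure: repeat pseudo-experiments at a fixed pseudo-true parameter and count how often the constructed interval contains it. A Gaussian mean stands in for the CMSSM parameters here; the standard 95% interval is used purely as an example.

```python
import numpy as np

rng = np.random.default_rng(2)
true_mu, sigma, n, trials = 3.0, 2.0, 50, 2000
covered = 0
for _ in range(trials):
    x = rng.normal(true_mu, sigma, n)               # one pseudo-experiment
    half = 1.96 * x.std(ddof=1) / np.sqrt(n)        # nominal 95% interval
    covered += (x.mean() - half) <= true_mu <= (x.mean() + half)
coverage = covered / trials                          # should sit near 0.95
```

In the paper's setting the intervals come from profile likelihoods over a scanned parameter space, which is exactly where the under- and over-coverage they report can creep in; the toy above shows the baseline behavior a well-calibrated interval should exhibit.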
Detecting trends in raptor counts: power and type I error rates of various statistical tests
Hatfield, J.S.; Gould, W.R.; Hoover, B.A.; Fuller, M.R.; Lindquist, E.L.
1996-01-01
We conducted simulations that estimated power and type I error rates of statistical tests for detecting trends in raptor population count data collected from a single monitoring site. Results of the simulations were used to help analyze count data of bald eagles (Haliaeetus leucocephalus) from 7 national forests in Michigan, Minnesota, and Wisconsin during 1980-1989. Seven statistical tests were evaluated, including simple linear regression on the log scale and linear regression with a permutation test. Using 1,000 replications each, we simulated n = 10 and n = 50 years of count data and trends ranging from -5 to 5% change/year. We evaluated the tests at 3 critical levels (alpha = 0.01, 0.05, and 0.10) for both upper- and lower-tailed tests. Exponential count data were simulated by adding sampling error with a coefficient of variation of 40% from either a log-normal or autocorrelated log-normal distribution. Not surprisingly, tests performed with 50 years of data were much more powerful than tests with 10 years of data. Positive autocorrelation inflated alpha-levels upward from their nominal levels, making the tests less conservative and more likely to reject the null hypothesis of no trend. Of the tests studied, Cox and Stuart's test and Pollard's test clearly had lower power than the others. Surprisingly, the linear regression t-test, Collins' linear regression permutation test, and the nonparametric Lehmann's and Mann's tests all had similar power in our simulations. Analyses of the count data suggested that bald eagles had increasing trends on at least 2 of the 7 national forests during 1980-1989.
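The simulation design described above — log-scale linear regression on simulated exponential counts with log-normal error, CV of 40%, n = 10 years — can be sketched as follows. The one-sided critical value t(0.95, df = 8) ≈ 1.860 is hard-coded to keep the sketch dependency-free; the baseline count of 100 is an arbitrary assumption.

```python
import numpy as np

def reject_uptrend(counts, t_crit=1.860):           # t_{0.95}, df = 8 for n = 10
    """One-sided t-test for a positive slope of log-counts on year."""
    yr = np.arange(len(counts))
    ly = np.log(counts)
    slope, intercept = np.polyfit(yr, ly, 1)
    resid = ly - (slope * yr + intercept)
    sxx = ((yr - yr.mean()) ** 2).sum()
    se = np.sqrt(resid @ resid / (len(counts) - 2) / sxx)
    return slope / se > t_crit

rng = np.random.default_rng(3)
n_years, sims = 10, 2000
sigma = np.sqrt(np.log(1 + 0.4 ** 2))               # log-normal sigma for CV = 40%

def simulate(trend):                                 # trend = proportional change/year
    yr = np.arange(n_years)
    return np.exp(np.log(100) + np.log(1 + trend) * yr
                  + rng.normal(0, sigma, n_years))

alpha_hat = np.mean([reject_uptrend(simulate(0.00)) for _ in range(sims)])  # type I
power_5 = np.mean([reject_uptrend(simulate(0.05)) for _ in range(sims)])    # power
```

With independent errors the empirical type I rate lands near the nominal 0.05; adding positive autocorrelation to the simulated errors reproduces the alpha inflation the study reports.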
Alkerwi, Ala'a; Shivappa, Nitin; Crichton, Georgina; Hébert, James R
2014-12-01
Recently, there has been an influx of research interest regarding the anti-inflammatory role that diet has in chronic and metabolic diseases. A literature-based dietary inflammatory index (DII) that can be used to characterize the inflammation-modulating capacity of individuals' diets has even been developed and validated in an American population. We hypothesized that the DII could predict levels of high-sensitivity C-reactive protein (CRP), which is an important inflammatory marker, as well as metabolic measures that include the metabolic syndrome and its components in European adults. This hypothesis was tested according to data from 1352 participants from the Observation of Cardiovascular Risk Factors in Luxembourg study, a nationwide, cross-sectional survey based in Luxembourg. Statistical methods consisted of descriptive and multivariable logistic regression analyses. The DII ranged from a minimum of -4.02 (most anti-inflammatory) to a maximum of 4.00 points, with a mean value of -0.41. Participants with higher DII score were significantly younger and had lower body mass index, waist circumferences, and systolic blood pressure levels. Other cardiovascular biomarkers including diastolic blood pressure, CRP, lipids, and glycemic biomarkers did not vary significantly across DII tertiles. Participants with proinflammatory (>1) DII scores had increased adjusted odds (odds ratio, 1.46; 95% confidence interval, 1.00-2.13) of having a low high-density lipoprotein cholesterol, compared with those with anti-inflammatory scores (DII ≤1). There were no significant relationships between high-sensitivity CRP and the DII. This study, which tested the inflammatory capacity of the DII outside the United States, did not detect a significant independent relationship with cardiometabolic biomarkers, by using Food Frequency Questionnaire-collected data. These results are informative and representative of a relevant step in directing future research for nutrition and diet
Clare, Elizabeth L
2014-01-01
The emerging field of ecological genomics contains several broad research areas. Comparative genomic and conservation genetic analyses are providing great insight into adaptive processes, species bottlenecks, population dynamics and areas of conservation priority. Now the same technological advances in high-throughput sequencing, coupled with taxonomically broad sequence repositories, are providing greater resolution and fundamentally new insights into functional ecology. In particular, we now have the capacity in some systems to rapidly identify thousands of species-level interactions using non-invasive methods based on the detection of trace DNA. This represents a powerful tool for conservation biology, for example allowing the identification of species with particularly inflexible niches and the investigation of food-webs or interaction networks with unusual or vulnerable dynamics. As they develop, these analyses will no doubt provide significant advances in the field of restoration ecology and the identification of appropriate locations for species reintroduction, as well as highlighting species at ecological risk. Here, I describe emerging patterns that have come from the various initial model systems, the advantages and limitations of the technique and key areas where these methods may significantly advance our empirical and applied conservation practices. PMID:25553074
A new efficient statistical test for detecting variability in the gene expression data.
Mathur, Sunil; Dolo, Samuel
2008-08-01
DNA microarray technology allows researchers to monitor the expression of thousands of genes under different conditions. The detection of differential gene expression under two different conditions is very important in microarray studies. Microarray experiments are multi-step procedures, and each step is a potential source of variance. This makes the measurement of variability difficult, because an approach based on gene-by-gene estimation of variance will have few degrees of freedom. It is highly possible that the assumption of equal variance for all the expression levels may not hold. Also, the assumption of normality of gene expressions may not hold. Thus it is essential to have a statistical procedure that is not based on the normality assumption and that can efficiently detect genes with differential variance. The detection of differential gene expression variance will allow us to identify experimental variables that affect different biological processes and the accuracy of DNA microarray measurements. In this article, a new nonparametric test for scale is developed based on the arctangent of the ratio of two expression levels. Most of the tests available in the literature require the assumption of a normal distribution, which makes them inapplicable in many situations, and it is also hard to verify the suitability of the normal distribution assumption for a given data set. The proposed test does not require an assumption about the distribution of the underlying population, which makes it more practical and widely applicable. The asymptotic relative efficiency is calculated under different distributions, showing that the proposed test is very powerful when the assumption of normality breaks down. Monte Carlo simulation studies are performed to compare the power of the proposed test with some of the existing procedures. It is found that the proposed test is more powerful than commonly used tests under almost all the distributions considered in the study.
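The paper's arctangent-based statistic is not reproduced here; as a hedged stand-in, the following sketch shows a generic distribution-free permutation test for a difference in spread between two expression samples (Levene-style absolute deviations from the median), which shares the abstract's goal of testing scale without a normality assumption.

```python
import numpy as np

def perm_scale_test(x, y, n_perm=2000, seed=0):
    """Permutation p-value for unequal spread between two samples."""
    rng = np.random.default_rng(seed)
    dx, dy = np.abs(x - np.median(x)), np.abs(y - np.median(y))
    obs = abs(dx.mean() - dy.mean())          # observed spread difference
    pooled = np.concatenate([dx, dy])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                   # relabel under the null
        hits += abs(pooled[:len(dx)].mean() - pooled[len(dx):].mean()) >= obs
    return (hits + 1) / (n_perm + 1)

rng = np.random.default_rng(4)
# expression levels with equal centers but sd 1 vs sd 3
p = perm_scale_test(rng.normal(0, 1, 60), rng.normal(0, 3, 60))
```

Because the null distribution is generated by permutation, the test's validity does not rest on any distributional form, which is the property the abstract emphasizes.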
Boareto, Marcelo; Caticha, Nestor
2014-01-01
Microarray data analysis typically consists in identifying a list of differentially expressed genes (DEG), i.e., the genes that are differentially expressed between two experimental conditions. Variance shrinkage methods have been considered a better choice than the standard t-test for selecting the DEG because they correct the dependence of the error with the expression level. This dependence is mainly caused by errors in background correction, which more severely affects genes with low expression values. Here, we propose a new method for identifying the DEG that overcomes this issue and does not require background correction or variance shrinkage. Unlike current methods, our methodology is easy to understand and implement. It consists of applying the standard t-test directly on the normalized intensity data, which is possible because the probe intensity is proportional to the gene expression level and because the t-test is scale- and location-invariant. This methodology considerably improves the sensitivity and robustness of the list of DEG when compared with the t-test applied to preprocessed data and to the most widely used shrinkage methods, Significance Analysis of Microarrays (SAM) and Linear Models for Microarray Data (LIMMA). Our approach is useful especially when the genes of interest have small differences in expression and therefore get ignored by standard variance shrinkage methods.
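The key observation above — that the t statistic is scale- and location-invariant, so it can be applied to normalized intensities directly — is easy to verify numerically. This sketch uses the Welch form of the t statistic on synthetic data; the specific numbers are illustrative.

```python
import numpy as np

def welch_t(a, b):
    """Welch two-sample t statistic."""
    va, vb = a.var(ddof=1) / len(a), b.var(ddof=1) / len(b)
    return (a.mean() - b.mean()) / np.sqrt(va + vb)

rng = np.random.default_rng(5)
a, b = rng.normal(5, 1, 20), rng.normal(7, 1, 20)    # probe intensities, 2 conditions
t_raw = welch_t(a, b)
# apply the same affine transform to both groups (a global normalization step)
t_norm = welch_t(3.0 * a + 10.0, 3.0 * b + 10.0)
# the statistic is unchanged, so the test can skip background correction
```

This invariance is exactly why the authors can run the t-test on normalized intensity data without the background-correction step that distorts low-expression genes.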
NASA Astrophysics Data System (ADS)
Poghosyan, G. V.
2013-12-01
A statistical analysis of time intervals between the dates of birth of genetic relatives has been carried out on the basis of 33 family trees. Using the Monte Carlo method, a significant departure of the distribution of birthdays from random results is detected relative to two long-period solar harmonics known from the theory of the Earth tides, i.e., a solar elliptical wave (Sa) with a period of an anomalistic year (365.259640 days) and a solar declinational wave (Ssa) with a period of half of the tropical year (182.621095 days). Further research requires larger statistical samples and involves clarifying the effect of long-period lunar harmonics, i.e., a lunar elliptical wave (Mm) with a period of an anomalistic month (27.554551 days) and a lunar declinational wave (Mf) with a period of half of a tropical month (13.660791 days), as well as the impact of important lunar and solar tides with periods of half (14.765294 days, the interval between syzygial tides at new and full moon) and a whole (29.530588 days) synodic month. It is known that the periodic compression and stretching of the Earth's crust at the time of the tides lead, by means of the piezoelectric effect, to the generation of long-period electric oscillations with periods corresponding to the harmonics of the theory of the Earth tides. The detection of these harmonics in connection with biological processes will make it possible to determine the impact of regular cosmogeophysical fluctuations (tidal waves) on processes in the biosphere.
KLIP-ing for Analogs - Detection Statistics for HR8799-like systems
NASA Astrophysics Data System (ADS)
Hanson, Jake R.; Apai, Daniel
2015-01-01
In late 2008, the discovery of the directly imaged quadruple planetary system HR8799 was announced. This system is unique not only for the number of planets it contains but also because it poses a serious challenge to our current understanding of planetary core accretion. Namely, the observed radial separations between the planets and their A/F-type host star are not consistent with the amount of gas we would expect the planets to have accreted, and the system as a whole contains more than 70 times the mass of our own solar system. In order to examine whether or not planetary systems similar to HR8799 are anomalous, this project has conducted the largest survey to date of directly imaged A/F-type stars. Using the NACO-VLT imaging system, we implement a modern image reduction algorithm known as KLIP on over 60 targets to detect analogs. KLIP is a PCA-based algorithm and operates by creating a library of PSF eigenimages for a given set of input images. This library contains all of the time-independent PSF sources that rotate with the field of view for the input images. Once the PSF library is created, KLIP recreates any target from the input images as a superposition of known PSF eigenimages from the library and subtracts this from the original, leaving behind possible planetary candidates. The results of this project provide a quantitative comparison of KLIP and other image reduction algorithms for this data set. We will also use a Monte Carlo based simulation to determine the frequency of HR8799 analogs around A/F-type stars based on our detection statistics.
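A bare-bones version of the KLIP step described above — build a PCA basis from reference PSFs, project the target onto it, subtract — might look like this. The toy Gaussian PSF, the single injected point source, and the mode count are all assumptions for illustration; this is not the NACO-VLT pipeline.

```python
import numpy as np

def klip_subtract(target, refs, k=5):
    """Subtract the projection of `target` onto the top-k PCA modes of `refs`."""
    R = refs.reshape(len(refs), -1)
    R = R - R.mean(axis=1, keepdims=True)             # mean-subtract each reference
    t = target.ravel() - target.mean()
    _, _, Vt = np.linalg.svd(R, full_matrices=False)  # Karhunen-Loeve eigenimages
    basis = Vt[:k]
    return (t - basis.T @ (basis @ t)).reshape(target.shape)

rng = np.random.default_rng(6)
g = np.exp(-((np.arange(32) - 16) ** 2) / 18.0)
psf = np.outer(g, g)                                  # toy stellar PSF
refs = np.stack([psf * (1 + 0.05 * rng.normal()) + 0.01 * rng.normal(size=(32, 32))
                 for _ in range(20)])                 # reference library
target = psf + 0.01 * rng.normal(size=(32, 32))
target[8, 24] += 0.5                                  # injected "planet"
residual = klip_subtract(target, refs, k=5)
peak = np.unravel_index(np.argmax(residual), residual.shape)  # recovered location
```

Because the planet signal is not in the span of the stellar PSF eigenimages, it survives the subtraction while the star is removed, which is the core of KLIP's contrast gain.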
Nonparametric simulation-based statistics for detecting linkage in general pedigrees
Davis, S.; Schroeder, M.; Weeks, D.E.; Goldin, L.R.
1996-04-01
We present here four nonparametric statistics for linkage analysis that test whether pairs of affected relatives share marker alleles more often than expected. These statistics are based on simulating the null distribution of a given statistic conditional on the unaffecteds' marker genotypes. Each statistic uses a different measure of marker sharing: the SimAPM statistic uses the simulation-based affected-pedigree-member measure based on identity-by-state (IBS) sharing. The SimKIN (kinship) measure is 1.0 for identity-by-descent (IBD) sharing, 0.0 for no IBD sharing, and the kinship coefficient when the IBD status is ambiguous. The simulation-based IBD (SimIBD) statistic uses a recursive algorithm to determine the probability of two affecteds sharing a specific allele IBD. The SimISO statistic is identical to SimIBD, except that it also measures marker similarity between unaffected pairs. We evaluated our statistics on data simulated under different two-locus disease models, comparing our results to those obtained with several other nonparametric statistics. Use of IBD information produces dramatic increases in power over the SimAPM method, which uses only IBS information. The power of our best statistic in most cases meets or exceeds the power of the other nonparametric statistics. Furthermore, our statistics perform comparisons between all affected relative pairs within general pedigrees and are not restricted to sib pairs or nuclear families. 32 refs., 5 figs., 6 tabs.
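The core simulation idea in these statistics — build the null distribution of the sharing measure by simulation and read off an empirical p-value — can be distilled to the sketch below. The toy sharing statistic (pairs matching with probability 0.25 under the null) is a stand-in, not SimIBD itself.

```python
import numpy as np

def monte_carlo_pvalue(observed, simulate_null, n_sim=5000, seed=7):
    """Empirical upper-tail p-value with the standard add-one correction."""
    rng = np.random.default_rng(seed)
    null = np.array([simulate_null(rng) for _ in range(n_sim)])
    return (1 + (null >= observed).sum()) / (n_sim + 1)

# Toy: 10 affected relative pairs, each sharing an allele with
# probability 0.25 under the null; we observed 8 sharing pairs.
p = monte_carlo_pvalue(8, lambda rng: rng.binomial(10, 0.25))
```

Conditioning the simulation on the unaffecteds' observed genotypes, as the paper does, keeps the null distribution faithful to the actual pedigree structure rather than to a generic binomial model.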
Prognostic significance of computed tomography-detected extramural vascular invasion in colon cancer
Yao, Xun; Yang, Su-Xing; Song, Xing-He; Cui, Yan-Cheng; Ye, Ying-Jiang; Wang, Yi
2016-01-01
AIM To compare disease-free survival (DFS) between extramural vascular invasion (EMVI)-positive and -negative colon cancer patients evaluated by computed tomography (CT). METHODS Colon cancer patients (n = 194) undergoing curative surgery between January 2009 and December 2013 were included. Each patient’s demographics, cancer characteristics, EMVI status, pathological status and survival outcomes were recorded. All included patients had been routinely monitored until December 2015. EMVI was defined as tumor tissue within adjacent vessels beyond the colon wall as seen on enhanced CT. Disease recurrence was defined as metachronous metastases, local recurrence, or death due to colon cancer. Kaplan-Meier analyses were used to compare DFS between the EMVI-positive and -negative groups. Cox’s proportional hazards models were used to measure the impact of confounding variables on survival rates. RESULTS EMVI was observed on CT (ctEMVI) in 60 patients (30.9%, 60/194). One year after surgery, there was no statistically significant difference regarding the rates of progressive events between EMVI-positive and -negative patients [11.7% (7/60) and 6.7% (9/134), respectively; P = 0.266]. At the study endpoint, the EMVI-positive patients had significantly more progressive events than the EMVI-negative patients [43.3% (26/60) and 14.9% (20/134), respectively; odds ratio = 4.4, P < 0.001]. Based on the Kaplan-Meier method, the cumulative 1-year DFS rates were 86.7% (95%CI: 82.3-91.1) and 92.4% (95%CI: 90.1-94.7) for EMVI-positive and EMVI-negative patients, respectively. The cumulative 3-year DFS rates were 49.5% (95%CI: 42.1-56.9) and 85.8% (95%CI: 82.6-89.0), respectively. Cox proportional hazards regression analysis revealed that ctEMVI was an independent predictor of DFS with a hazard ratio of 2.15 (95%CI: 1.12-4.14, P = 0.023). CONCLUSION ctEMVI may be helpful when evaluating disease progression in colon cancer patients.
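The Kaplan-Meier machinery used for the DFS comparison above can be written compactly as a product-limit estimator; the follow-up times below are illustrative numbers, not the study's data.

```python
import numpy as np

def kaplan_meier(times, events):
    """Return [(event_time, S(t))] for the product-limit survival estimate.
    `events`: 1 = recurrence/death observed, 0 = censored at that time."""
    order = np.argsort(times, kind="stable")
    times = np.asarray(times)[order]
    events = np.asarray(events)[order]
    s, curve = 1.0, []
    for i, (t, e) in enumerate(zip(times, events)):
        if e:                                    # censored subjects shrink the
            s *= 1.0 - 1.0 / (len(times) - i)    # at-risk set but add no factor
            curve.append((float(t), s))
    return curve

# months to recurrence (event = 1) or last follow-up (event = 0)
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Comparing two such curves (EMVI-positive vs -negative) with a log-rank test, and then adjusting for confounders with a Cox proportional hazards model, is the standard pipeline the abstract follows.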
Significance of the detection of esters of p-hydroxybenzoic acid (parabens) in human breast tumours.
Harvey, Philip W; Everett, David J
2004-01-01
This issue of the Journal of Applied Toxicology publishes the paper 'Concentrations of Parabens in Human Breast Tumours' by Darbre et al. (2004), which reports that esters of p-hydroxybenzoic acid (parabens) can be detected in samples of tissue from human breast tumours. Breast tumour samples were supplied from 20 patients, in collaboration with the Edinburgh Breast Unit Research Group, and analysed by high-pressure liquid chromatography and tandem mass spectrometry. The parabens are used as antimicrobial preservatives in underarm deodorants and antiperspirants and in a wide range of other consumer products. The parabens also have inherent oestrogenic and other hormone-related activity (increased progesterone receptor gene expression). As oestrogen is a major aetiological factor in the growth and development of the majority of human breast cancers, it has previously been suggested by Darbre that parabens and other chemicals in underarm cosmetics may contribute to the rising incidence of breast cancer. The significance of the finding of parabens in tumour samples is discussed here in terms of (1) Darbre et al.'s study design, (2) what can and cannot be inferred from this type of data (such as the cause of these tumours), (3) the toxicology of these compounds, and (4) the limitations of the existing toxicology database and the need to consider data that are appropriate to human exposures.
Abbate, V; Kicman, A T; Evans-Brown, M; McVeigh, J; Cowan, D A; Wilson, C; Coles, S J; Walker, C J
2015-07-01
Twenty-four products suspected of containing anabolic steroids and sold in fitness equipment shops in the United Kingdom (UK) were analyzed for their qualitative and semi-quantitative content using full scan gas chromatography-mass spectrometry (GC-MS), accurate mass liquid chromatography-mass spectrometry (LC-MS), high pressure liquid chromatography with diode array detection (HPLC-DAD), UV-Vis, and nuclear magnetic resonance (NMR) spectroscopy. In addition, X-ray crystallography enabled the identification of one of the compounds, where a reference standard was not available. Of the 24 products tested, 23 contained steroids, including known anabolic agents; 16 of these contained steroids different to those indicated on the packaging, and one product contained no steroid at all. Overall, 13 different steroids were identified; 12 of these are controlled in the UK under the Misuse of Drugs Act 1971. Several of the products contained steroids that may be considered to have considerable pharmacological activity, based on their chemical structures and the amounts present. This could unwittingly expose users to a significant risk to their health, which is of particular concern for naïve users.
NASA Technical Reports Server (NTRS)
Moore, G. K. (Principal Investigator)
1976-01-01
The author has identified the following significant results. Lineaments were detected on Skylab photographs by stereo viewing, projection viewing, and composite viewing. Sixty-nine percent more lineaments were found by stereo viewing than by projection, but segments of projection lineaments are longer; the total length of lineaments found by these two methods is nearly the same. Most Skylab lineaments consist of topographic depressions: stream channel alignments, straight valley walls, elongated swales, and belts where sinkholes are abundant. Most of the remainder are vegetation alignments. Lineaments are most common in dissected areas having a thin soil cover. Results of test drilling show: (1) the median yield of test wells on Skylab lineaments is about six times the median yield of all existing wells; (2) three out of seven wells on Skylab lineaments yield more than 6.3 l/s (110 gal/min); (3) low yields are possible on lineaments as well as in other favorable locations; and (4) the largest well yields can be obtained at well locations on Skylab lineaments that also are favorably located with respect to topography and geologic structure, and are in the vicinity of wells with large yields.
NASA Technical Reports Server (NTRS)
Druzhinin, I. P.; Khamyanova, N. V.; Yagodinskiy, V. N.
1974-01-01
Statistical evaluations of the significance of the relationship between abrupt changes in solar activity and discontinuities in the multi-year pattern of an epidemic process are reported. They reliably (with a probability of more than 99.9%) show the real nature of this relationship and its great specific weight (about half) in the formation of discontinuities in the multi-year pattern of the processes in question.
Mark Burden, Adrian; Lewis, Sandra Elizabeth; Willcox, Emma
2014-12-01
Numerous ways exist to process raw electromyograms (EMGs). However, the effect of altering processing methods on peak and mean EMG has seldom been investigated. The aim of this study was to investigate the effect of using different root mean square (RMS) window lengths and overlaps on the amplitude, reliability and inter-individual variability of gluteus maximus EMGs recorded during the clam exercise, and on the statistical significance and clinical relevance of amplitude differences between two exercise conditions. Mean and peak RMS of 10 repetitions from 17 participants were obtained using processing window lengths of 0.01, 0.15, 0.2, 0.25 and 1 s, with no overlap and overlaps of 25, 50 and 75% of window length. The effect of manipulating window length on reliability and inter-individual variability was greater for peak EMG (coefficient of variation [CV] <9%) than for mean EMG (CV <3%), with the 1 s window generally displaying the lowest variability. As a consequence, neither the statistical significance nor the clinical relevance (effect size [ES]) of mean EMG was affected by manipulation of window length. Statistical significance of peak EMG was more sensitive to changes in window length, with lower p-values generally being recorded for the 1 s window. As the use of different window lengths has a greater effect on the variability and statistical significance of the peak EMG, clinicians should use the mean EMG. They should also be aware that the use of different numbers of exercise repetitions and participants can have a greater effect on EMG parameters than the length of the processing window.
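The window-length and overlap manipulation studied above can be reproduced with a simple sliding RMS smoother. The sketch below is illustrative only; the sampling rate, signal shape, and parameter values are invented, not taken from the study:

```python
import math

def sliding_rms(signal, fs, window_s, overlap_frac):
    """RMS-smooth a signal given a window length (s) and fractional overlap."""
    win = max(1, int(round(window_s * fs)))            # samples per window
    step = max(1, int(round(win * (1 - overlap_frac))))  # hop between windows
    out = []
    for start in range(0, len(signal) - win + 1, step):
        seg = signal[start:start + win]
        out.append(math.sqrt(sum(x * x for x in seg) / win))
    return out

# Hypothetical 1 kHz EMG-like trace: a burst in the middle of low baseline noise.
fs = 1000
emg = [0.05] * 500 + [1.0] * 200 + [0.05] * 300
env = sliding_rms(emg, fs, window_s=0.25, overlap_frac=0.5)
print(len(env), round(max(env), 3))
```

Shortening `window_s` tracks the burst more tightly but makes the peak value noisier, which is exactly the peak-versus-mean sensitivity the study reports.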
NASA Astrophysics Data System (ADS)
Zechlin, Hannes-S.; Cuoco, Alessandro; Donato, Fiorenza; Fornengo, Nicolao; Vittino, Andrea
2016-08-01
The source-count distribution as a function of flux, dN/dS, is one of the main quantities characterizing gamma-ray source populations. We employ statistical properties of the Fermi Large Area Telescope (LAT) photon counts map to measure the composition of the extragalactic gamma-ray sky at high latitudes (|b| ≥ 30°) between 1 and 10 GeV. We present a new method, generalizing the use of standard pixel-count statistics, to decompose the total observed gamma-ray emission into (a) point-source contributions, (b) the Galactic foreground contribution, and (c) a truly diffuse isotropic background contribution. Using the 6 yr Fermi-LAT data set (P7REP), we show that the dN/dS distribution in the regime of so far undetected point sources can be consistently described with a power law with an index between 1.9 and 2.0. We measure dN/dS down to an integral flux of ~2 × 10^-11 cm^-2 s^-1, improving beyond the 3FGL catalog detection limit by about one order of magnitude. The overall dN/dS distribution is consistent with a broken power law, with a break at 2.1(+1.0/-1.3) × 10^-8 cm^-2 s^-1. The power-law index n1 = 3.1(+0.7/-0.5) for bright sources above the break hardens to n2 = 1.97 ± 0.03 for fainter sources below the break. A possible second break of the dN/dS distribution is constrained to be at fluxes below 6.4 × 10^-11 cm^-2 s^-1 at 95% confidence level. The high-latitude gamma-ray sky between 1 and 10 GeV is shown to be composed of ~25% point sources, ~69.3% diffuse Galactic foreground emission, and ~6% isotropic diffuse background.
A Multi-Scale Sampling Strategy for Detecting Physiologically Significant Signals in AVIRIS Imagery
NASA Technical Reports Server (NTRS)
Gamon, John A.; Lee, Lai-Fun; Qiu, Hong-Lie; Davis, Stephen; Roberts, Dar A.; Ustin, Susan L.
1998-01-01
Models of photosynthetic production at ecosystem and global scales require multiple input parameters specifying physical and physiological surface features. While certain physical parameters (e.g., absorbed photosynthetically active radiation) can be derived from current satellite sensors, other physiologically relevant measures (e.g., vegetation type, water status, carboxylation capacity, or photosynthetic light-use efficiency) are not generally directly available from current satellite sensors at the appropriate geographic scale. Consequently, many model parameters must be assumed or derived from independent sources, often at an inappropriate scale. An abundance of ecophysiological studies at the leaf and canopy scales suggests strong physiological control of vegetation-atmosphere CO2 and water vapor fluxes, particularly in evergreen vegetation subjected to diurnal or seasonal stresses. For example, hot, dry conditions can lead to stomatal closure and associated "downregulation" of photosynthetic biochemical processes, a phenomenon often manifested as a "midday photosynthetic depression". A recent study with the revised simple biosphere (SiB2) model demonstrated that photosynthetic downregulation can significantly impact global climate. However, at the global scale, the exact significance of downregulation remains unclear, largely because appropriate physiological measures are generally unavailable at this scale. Clearly, there is a need to develop reliable ways of extracting physiologically relevant information from remote sensing. Narrow-band spectrometers offer many opportunities for deriving physiological parameters needed for ecosystem and global scale photosynthetic models. Experimental studies on the ground at the leaf- to stand-scale have indicated that several narrow-band features can be used to detect plant physiological status. One physiological signal is caused by xanthophyll cycle pigment activity, and is often expressed as the Photochemical
Statistical Approaches to Detecting and Analyzing Tandem Repeats in Genomic Sequences
Anisimova, Maria; Pečerska, Julija; Schaper, Elke
2015-01-01
Tandem repeats (TRs) are frequently observed in genomes across all domains of life. Evidence suggests that some TRs are crucial for proteins with fundamental biological functions and can be associated with virulence, resistance, and infectious/neurodegenerative diseases. Genome-scale systematic studies of TRs have the potential to unveil core mechanisms governing TR evolution and TR roles in shaping genomes. However, TR-related studies are often non-trivial due to heterogeneous and sometimes fast evolving TR regions. In this review, we discuss these intricacies and their consequences. We present our recent contributions to computational and statistical approaches for TR significance testing, sequence profile-based TR annotation, TR-aware sequence alignment, phylogenetic analyses of TR unit number and order, and TR benchmarks. Importantly, all these methods explicitly rely on the evolutionary definition of a tandem repeat as a sequence of adjacent repeat units stemming from a common ancestor. The discussed work has a focus on protein TRs, yet is generally applicable to nucleic acid TRs, sharing similar features. PMID:25853125
Simonson, K.M.
1998-08-01
The rate at which a mine detection system falsely identifies man-made or natural clutter objects as mines is referred to as the system's false alarm rate (FAR). Generally expressed as a rate per unit area or time, the FAR is one of the primary metrics used to gauge system performance. In this report, an overview is given of statistical methods appropriate for the analysis of data relating to FAR. Techniques are presented for determining a suitable size for the clutter collection area, for summarizing the performance of a single sensor, and for comparing different sensors. For readers requiring more thorough coverage of the topics discussed, references to the statistical literature are provided. A companion report addresses statistical issues related to the estimation of mine detection probabilities.
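One of the single-sensor summaries such a report covers is the FAR per unit area together with an exact Poisson confidence interval. A sketch of that calculation is below; the alarm count and swept area are invented, and the bisection-based interval solver is one pragmatic way to get the exact limits without a statistics library:

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson(lam) count."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))

def far_confidence_interval(n_false_alarms, area, alpha=0.05):
    """Exact Poisson interval for a false alarm rate per unit area,
    found by bisection on the Poisson CDF."""
    def solve(target, k, lo, hi):
        for _ in range(200):                  # bisection on the rate parameter
            mid = 0.5 * (lo + hi)
            if poisson_cdf(k, mid) > target:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    k = n_false_alarms
    cap = 10.0 * (k + 10)                     # generous upper bracket
    upper = solve(alpha / 2, k, 0.0, cap)     # P(X <= k) = alpha/2
    lower = 0.0 if k == 0 else solve(1 - alpha / 2, k - 1, 0.0, cap)
    return lower / area, upper / area

# Hypothetical trial: 12 false alarms over 3.0 hectares swept.
lo, hi = far_confidence_interval(12, 3.0)
print(round(lo, 2), round(hi, 2))
```

The interval width makes plain why a suitably sized clutter collection area matters: with only a dozen alarms, the upper limit is more than three times the lower one.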
NASA Astrophysics Data System (ADS)
Meng, X.; Daniels, C.; Smith, E.; Peng, Z.; Chen, X.; Wagner, L. S.; Fischer, K. M.; Hawman, R. B.
2015-12-01
Since 2001, the number of M>3 earthquakes has increased significantly in the Central and Eastern United States (CEUS), likely due to waste-water injection; such events are known as "induced earthquakes" [Ellsworth, 2013]. Because induced earthquakes are driven by short-term external forcing, they may behave like earthquake swarms, which are not well characterized by branching point-process models such as the Epidemic Type Aftershock Sequence (ETAS) model [Ogata, 1988]. In this study we focus on the 02/15/2014 M4.1 South Carolina and the 06/16/2014 M4.3 Oklahoma earthquakes, which likely represent intraplate tectonic and induced events, respectively. For the South Carolina event, only one M3.0 aftershock was identified in the ANSS catalog, which may reflect a lack of low-magnitude events in this catalog. We apply a recently developed matched filter technique to detect earthquakes from 02/08/2014 to 02/22/2014 around the epicentral region. 15 seismic stations (both permanent and temporary USArray networks) within 100 km of the mainshock are used for detection. The mainshock and aftershock are used as templates for the initial detection. Newly detected events are employed as new templates, and the same detection procedure repeats until no new event can be added. Overall we have identified more than 10 events, including one foreshock that occurred ~11 min before the M4.1 mainshock. However, the number of aftershocks is still much smaller than predicted by the modified Bath's law. For the Oklahoma event, we use 1270 events from the ANSS catalog and 182 events from a relocated catalog as templates to scan through continuous recordings from 3 days before to 7 days after the mainshock. 12 seismic stations within the vicinity of the mainshock are included in the study. After obtaining more complete catalogs for both sequences, we plan to compare the statistical parameters (e.g., b, a, K, and p values) between the two sequences, as well as their spatial-temporal migration pattern, which may
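The core of a matched-filter detector is normalized cross-correlation of a known waveform template against continuous data. The single-channel toy sketch below is not the authors' multi-station implementation; the template and deterministic "noise" are synthetic, chosen only so the buried event is recoverable:

```python
import math

def normalized_cross_correlation(template, data):
    """Slide a waveform template along continuous data; return the
    normalized correlation coefficient at each offset."""
    n = len(template)
    t_mean = sum(template) / n
    t0 = [x - t_mean for x in template]
    t_norm = math.sqrt(sum(x * x for x in t0))
    ccs = []
    for off in range(len(data) - n + 1):
        seg = data[off:off + n]
        s_mean = sum(seg) / n
        s0 = [x - s_mean for x in seg]
        s_norm = math.sqrt(sum(x * x for x in s0)) or 1.0  # guard flat segments
        ccs.append(sum(a * b for a, b in zip(t0, s0)) / (t_norm * s_norm))
    return ccs

# Hypothetical single-channel trace: the template buried at offset 30.
template = [0.0, 1.0, -0.5, 0.25, -0.125]
noise = [0.01 * ((i * 37) % 11 - 5) for i in range(60)]
data = noise[:30] + [n + s for n, s in zip(noise[30:35], template)] + noise[35:]
cc = normalized_cross_correlation(template, data)
best = max(range(len(cc)), key=lambda i: cc[i])
print(best, round(cc[best], 2))
```

In practice detections are declared where the correlation exceeds a threshold (often a multiple of the correlation trace's median absolute deviation), and each new detection can itself be recycled as a template, as the iterative procedure above describes.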
NASA Astrophysics Data System (ADS)
Passaro, Marcello; Benveniste, Jérôme; Cipollini, Paolo; Fenoglio-Marc, Luciana
For more than two decades, it has been possible to map the Significant Wave Height (SWH) globally through Satellite Altimetry. SWH estimation is possible because the shape of an altimetric waveform, which usually presents a sharp leading edge and a slowly decaying trailing edge, depends on the sea state: in particular, the higher the sea state, the longer the rising time of the leading edge. The algorithm for SWH also depends on the width of the point target response (PTR) function, which is usually approximated by a constant value that contributes to the rising time. Particularly challenging for SWH detection are coastal data and low sea states. The first are usually flagged as unreliable due to land and calm water interference in the altimeter footprint; the second are characterized by an extremely sharp leading edge that is consequently poorly sampled in the digitalized waveform. ALES, a new algorithm for reprocessing altimetric waveforms, has recently been validated for sea surface height estimation (Passaro et al. 2014). The aim of this work is to check its validity also for SWH estimation in a particularly challenging area. The German Bight region presents both low sea state and coastal issues and is particularly suitable for validation, thanks to the extended network of buoys of the Bundesamt für Seeschifffahrt und Hydrographie (BSH). In-situ data include open sea, off-shore and coastal sea conditions, respectively at the Helgoland, lighthouse Alte Weser and Westerland locations. Reprocessed data from Envisat, Jason-1 and Jason-2 tracks are validated against those three buoys. The in-situ validation is applied both at the nearest point and at points along-track. The skill metric is based on bias, standard deviation, slope of the regression line, scatter index, and the number of cycles with correlation larger than 90%. The same metric is applied to the altimeter data obtained by standard processing and the validation results are compared. Data are evaluated at high
Beyond Gaussian statistical analysis for man-made object detection in hyperspectral images
NASA Astrophysics Data System (ADS)
Bernhardt, Mark; Roberts, Joanne M.
1999-12-01
Emerging Hyper-Spectral imaging technology allows the acquisition of data 'cubes' which simultaneously have high-resolution spatial and spectral components. There is a wealth of information in this data and effective techniques for extracting and processing this information are vital. Previous work by ERIM on man-made object detection has demonstrated that there is a huge amount of discriminatory information in hyperspectral images. This work used the hypothesis that the spectral characteristics of natural backgrounds can be described by a multivariate Gaussian model. The Mahalanobis distance (derived from the covariance matrix) between the background and other objects in the spectral data is the key discriminant. Other work (by DERA and Pilkington Optronics Ltd) has confirmed these findings, but indicates that in order to obtain the lowest possible false alarm probability, a way of including higher order statistics is necessary. There are many ways in which this could be done ranging from neural networks to classical density estimation approaches. In this paper we report on a new method for extending the Gaussian approach to more complex spectral signatures. By using ideas from the theory of Support Vector Machines we are able to map the spectral data into a higher dimensional space. The co-ordinates of this space are derived from all possible multiplicative combinations of the original spectral line intensities, up to a given order d -- which is the main parameter of the method. The data in this higher dimensional space are then analyzed using a multivariate Gaussian approach. Thus when d equals 1 we recover the ERIM model -- in this case the mapping is the identity. In order for such an approach to be at all tractable we must solve the 'combinatorial explosion' problem implicit in this mapping for large numbers of spectral lines in the signature data. In order to do this we note that in the final analysis of this approach it is only the inner (dot) products
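The baseline Gaussian model described above scores each pixel by its Mahalanobis distance from the background distribution. A two-band toy sketch follows (the spectra are invented; real imagery would have many bands and use a general matrix inverse rather than the explicit 2x2 formula):

```python
# Mahalanobis-distance anomaly score for 2-band "spectra" (pure Python, 2x2 case).

def mean_and_cov(points):
    """Sample mean and 2x2 covariance of (x, y) background points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), ((sxx, sxy), (sxy, syy))

def mahalanobis2(point, mean, cov):
    """Squared Mahalanobis distance using the explicit 2x2 inverse."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

# Hypothetical background pixels (two correlated spectral bands) and one outlier.
background = [(1.0, 2.0), (1.2, 2.3), (0.9, 1.9), (1.1, 2.2), (1.0, 2.1), (1.3, 2.4)]
mean, cov = mean_and_cov(background)
print(round(mahalanobis2((1.1, 2.2), mean, cov), 2))   # typical pixel: small score
print(round(mahalanobis2((1.1, 1.2), mean, cov), 2))   # off the correlation line: large score
```

Because the covariance encodes the band-to-band correlation, a pixel can be flagged even when each band value individually looks ordinary; the higher-order extension in the paper keeps this machinery but applies it in the polynomially expanded feature space.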
NASA Astrophysics Data System (ADS)
Monaco, E.; Memmolo, V.; Ricci, F.; Boffa, N. D.; Maio, L.
2015-03-01
Maintenance approaches based on sensorised structures and Structural Health Monitoring (SHM) systems have for many years represented one of the most promising innovations in the field of aerostructures, particularly where composite materials (fibre-reinforced resins) are concerned. Layered materials still suffer today from drastic reductions of maximum allowable stress values during the design phase, as well as from costly and recurrent inspections during the life cycle, which prevent their structural and economic potential from being fully exploited in today's aircraft. Those penalizing measures are necessary mainly to account for the possible presence of undetected hidden flaws within the layered sequence (delaminations) or in bonded areas (partial disbonding). In order to relax design and maintenance constraints, a system based on sensors permanently installed on the structure to detect and locate such flaws (an SHM system) can be considered, once its effectiveness and reliability have been statistically demonstrated via a rigorous Probability Of Detection (POD) function definition and evaluation. This paper presents an experimental approach with a statistical procedure for the evaluation of the detection threshold of a guided-wave-based SHM system oriented to delamination detection on a typical composite layered wing panel. The experimental tests are mostly oriented to characterizing the statistical distribution of measurements and damage metrics, as well as to characterizing the system detection capability using this approach. Numerically it is not possible to substitute the part of the experimental tests aimed at POD where the noise in the system response is crucial. Results of the experiments are presented in the paper and analyzed.
Moore, Gerald K.
1976-01-01
Lineaments were detected on SKYLAB photographs by stereo viewing, projection viewing, and composite viewing. Large well yields of 25 gal/min or more can be obtained in the study area by locating future wells on SKYLAB lineaments rather than on lineaments detected on either high-altitude aerial photographs, LANDSAT images, or by random drilling. Larger savings might be achieved by locating wells on lineaments detected by both stereo viewing and projection. The test site is underlain by dense, fractured, flat-lying limestones. Soil cover averages 4 ft thick in the Central Basin and about 40 ft thick on the Eastern Highland rim. Groundwater occurs mostly in horizontal, sheetlike solution cavities, and the trends of these cavities are controlled by joints. (Lardner-ISWS)
Significance of detection of extra metacentric microchromosome in amniotic cell culture.
Bernstein, R; Hakim, C; Hardwick, B; Nurse, G T
1978-01-01
A metacentric bisatellited microchromosome was detected in all metaphases from an amniotic culture performed because of maternal age. A wide-ranging survey of the literature failed to disclose any consistent anomaly associated with such a marker, but did reveal that the clinical picture of patients manifesting it could range from complete normality through mental retardation to a variety of deformities. The parents elected for termination, and the only deformity detected in the abortus was fixed talipes equinovarus. The implications of the finding of this marker chromosome on amniocentesis, believed to be reported for the first time here, are discussed particularly in the context of genetic counselling. PMID:641948
Ferrari, Ricardo J; Pinto, Carlos H Villa; da Silva, Bruno C Gregório; Bernardes, Danielle; Carvalho-Tavares, Juliana
2015-02-01
Intravital microscopy is an important experimental tool for the study of cellular and molecular mechanisms of the leukocyte-endothelial interactions in the microcirculation of various tissues and in different inflammatory conditions of in vivo specimens. However, due to the limited control over the conditions of the image acquisition, motion blur and artifacts, resulting mainly from the heartbeat and respiratory movements of the in vivo specimen, will very often be present. This problem can significantly undermine the results of either visual or computerized analysis of the acquired video images. Since only a fraction of the total number of images are usually corrupted by severe motion blur, it is necessary to have a procedure to automatically identify such images in the video for either further restoration or removal. This paper proposes a new technique for the detection of motion blur in intravital video microscopy based on directional statistics of local energy maps computed using a bank of 2D log-Gabor filters. Quantitative assessments using both artificially corrupted images and real microscopy data were conducted to test the effectiveness of the proposed method. Results showed an area under the receiver operating characteristic curve (AUC) of 0.95 (AUC = 0.95; 95 % CI 0.93-0.97) when tested on 329 video images visually ranked by four observers.
Carroll, Adam J.; Zhang, Peng; Whitehead, Lynne; Kaines, Sarah; Tcherkez, Guillaume; Badger, Murray R.
2015-01-01
This article describes PhenoMeter (PM), a new type of metabolomics database search that accepts metabolite response patterns as queries and searches the MetaPhen database of reference patterns for responses that are statistically significantly similar or inverse for the purposes of detecting functional links. To identify a similarity measure that would detect functional links as reliably as possible, we compared the performance of four statistics in correctly top-matching metabolic phenotypes of Arabidopsis thaliana metabolism mutants affected in different steps of the photorespiration metabolic pathway to reference phenotypes of mutants affected in the same enzymes by independent mutations. The best performing statistic, the PM score, was a function of both Pearson correlation and Fisher’s Exact Test of directional overlap. This statistic outperformed Pearson correlation, biweight midcorrelation and Fisher’s Exact Test used alone. To demonstrate general applicability, we show that the PM reliably retrieved the most closely functionally linked response in the database when queried with responses to a wide variety of environmental and genetic perturbations. Attempts to match metabolic phenotypes between independent studies were met with varying success and possible reasons for this are discussed. Overall, our results suggest that integration of pattern-based search tools into metabolomics databases will aid functional annotation of newly recorded metabolic phenotypes analogously to the way sequence similarity search algorithms have aided the functional annotation of genes and proteins. PM is freely available at MetabolomeExpress (https://www.metabolome-express.org/phenometer.php). PMID:26284240
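The published PM score is a function of Pearson correlation and a Fisher's Exact Test of directional overlap. The sketch below is an illustrative stand-in, not the actual PhenoMeter statistic: the combination rule, the toy 2x2 table construction, and the example response vectors are all invented for demonstration:

```python
import math
from math import comb

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

def fisher_one_sided(a, b, c, d):
    """P(count >= a) for the 2x2 table [[a, b], [c, d]] (hypergeometric tail)."""
    n = a + b + c + d
    return sum(comb(a + b, k) * comb(c + d, (a + c) - k)
               for k in range(a, min(a + b, a + c) + 1)) / comb(n, a + c)

def pm_like_score(query, reference):
    """Toy similarity score: |Pearson r| weighted by -log10 of a Fisher
    p-value for shared response directions (up/down agreement)."""
    same = sum(1 for q, r in zip(query, reference) if q * r > 0)
    diff = len(query) - same
    p = fisher_one_sided(same, diff, diff, same)   # symmetric toy table
    return abs(pearson(query, reference)) * -math.log10(max(p, 1e-300))

# Hypothetical log2 metabolite responses for two mutants in the same pathway.
query = [1.2, 0.8, -0.5, 2.0, -1.1, 0.9]
reference = [1.0, 0.6, -0.7, 1.8, -0.9, 1.1]
print(round(pm_like_score(query, reference), 2))
```

The design intent mirrors the article's finding: correlation alone rewards matching magnitudes, while the directional-overlap test rewards consistent signs, and combining them is more robust than either component by itself.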
Aouinti, Safa; Malouche, Dhafer; Giudicelli, Véronique; Kossida, Sofia; Lefranc, Marie-Paule
2015-01-01
The adaptive immune responses of humans and of other jawed vertebrate species (gnathostomata) are characterized by the B and T cells and their specific antigen receptors, the immunoglobulins (IG) or antibodies and the T cell receptors (TR) (up to 2 × 10^12 different IG and TR per individual). IMGT, the international ImMunoGeneTics information system (http://www.imgt.org), was created in 1989 by Marie-Paule Lefranc (Montpellier University and CNRS) to manage the huge and complex diversity of these antigen receptors. IMGT, built on IMGT-ONTOLOGY concepts of identification (keywords), description (labels), classification (gene and allele nomenclature) and numerotation (IMGT unique numbering), is at the origin of immunoinformatics, a science at the interface between immunogenetics and bioinformatics. IMGT/HighV-QUEST, the first web portal, and so far the only one, for the next generation sequencing (NGS) analysis of IG and TR, is the paradigm for immune repertoire standardized outputs and immunoprofiles of the adaptive immune responses. It provides the identification of the variable (V), diversity (D) and joining (J) genes and alleles, analysis of the V-(D)-J junction and complementarity determining region 3 (CDR3) and the characterization of the 'IMGT clonotype (AA)' (AA for amino acid) diversity and expression. IMGT/HighV-QUEST compares outputs of different batches, up to one million nucleotide sequences, for the statistical module. These high throughput IG and TR repertoire immunoprofiles are of prime importance in vaccination, cancer, infectious diseases, autoimmunity and lymphoproliferative disorders; however, their comparative statistical analysis still remains a challenge. We present a standardized statistical procedure to analyze IMGT/HighV-QUEST outputs for the evaluation of the significance of the IMGT clonotype (AA) diversity differences in proportions, per gene of a given group, between NGS IG and TR repertoire immunoprofiles. The procedure is generic and
Detection of obstacles on runway using Ego-Motion compensation and tracking of significant features
NASA Technical Reports Server (NTRS)
Kasturi, Rangachar (Principal Investigator); Camps, Octavia (Principal Investigator); Gandhi, Tarak; Devadiga, Sadashiva
1996-01-01
This report describes a method for obstacle detection on a runway for autonomous navigation and landing of an aircraft. Detection is done in the presence of extraneous features such as tire marks. Suitable features are extracted from the image, and warping using approximately known camera and plane parameters is performed in order to compensate for ego-motion as far as possible. Residual disparity after warping is estimated using an optical flow algorithm. Features are tracked from frame to frame so as to obtain more reliable estimates of their motion. Corrections are made to motion parameters with the residual disparities using a robust method, and features having large residual disparities are signaled as obstacles. A sensitivity analysis of the procedure is also presented. Nelson's optical flow constraint is proposed to separate moving obstacles from stationary ones. A Bayesian framework is used at every stage so that the confidence in the estimates can be determined.
Best, R; Harrell, A; Geesey, C; Libby, B; Wijesooriya, K
2014-06-15
Purpose: The purpose of this study is to inter-compare and find statistically significant differences between flattened-field fixed-beam (FB) IMRT and flattening-filter-free (FFF) volumetric modulated arc therapy (VMAT) for stereotactic body radiation therapy (SBRT). Methods: SBRT plans using FB IMRT and FFF VMAT were generated for fifteen SBRT lung patients using 6 MV beams. For each patient, both IMRT and VMAT plans were created for comparison. Plans were generated utilizing RTOG 0915 (peripheral, 10 patients) and RTOG 0813 (medial, 5 patients) lung protocols. Target dose, critical structure dose, and treatment time were compared and tested for statistical significance. Parameters of interest included prescription isodose surface coverage, target dose heterogeneity, high dose spillage (location and volume), low dose spillage (location and volume), lung dose spillage, and critical structure maximum- and volumetric-dose limits. Results: For all criteria, we found equivalent or higher conformality with VMAT plans as well as reduced critical structure doses. Several differences passed a Student's t-test of significance: VMAT reduced the high dose spillage, evaluated with conformality index (CI), by an average of 9.4%±15.1% (p=0.030) compared to IMRT. VMAT plans reduced the lung volume receiving 20 Gy by 16.2%±15.0% (p=0.016) compared with IMRT. For the RTOG 0915 peripheral lesions, the volumes of lung receiving 12.4 Gy and 11.6 Gy were reduced by 27.0%±13.8% and 27.5%±12.6% (for both, p<0.001) in VMAT plans. Of the 26 protocol pass/fail criteria, VMAT plans were able to achieve an average of 0.2±0.7 (p=0.026) more constraints than the IMRT plans. Conclusions: FFF VMAT has dosimetric advantages over fixed-beam IMRT for lung SBRT. Significant advantages included increased dose conformity and reduced organs-at-risk doses. The overall improvements in terms of protocol pass/fail criteria were more modest and will require more patient data to establish difference
The significance of clinical practice guidelines on adult varicocele detection and management
Shridharani, Anand; Owen, Ryan C; Elkelany, Osama O; Kim, Edward D
2016-01-01
Varicoceles are the most common correctable etiology of male factor infertility. However, the detection and management of varicoceles have not been standardized. This has led to decades of debate regarding the effect of varicocele on male infertility and, subsequently, whether repair leads to an improved fertility status. The current body of evidence investigating the role of varicocele and varicocelectomy is weak and conflicting. The stance taken by the AUA and ASRM suggests that there are insufficient outcomes data to support evidence-based guidelines, citing that the evidence used to provide current recommendations is generally of low quality. On the other hand, the EAU Guidelines give a level 1a of evidence for management of varicoceles that are clinically palpable, associated with subnormal semen analyses and otherwise unexplained infertility. Besides aiding with clinical varicocele detection and management, clinical practice opinion statements and guidelines aim to direct and strengthen the infrastructure of future studies. We review the current status of opinion statements and guidelines in varicocele detection and management with a focus on their application in practice. PMID:26806081
A comparative study of four significance measures for periodicity detection in astronomical surveys
NASA Astrophysics Data System (ADS)
Süveges, Maria; Guy, Leanne P.; Eyer, Laurent; Cuypers, Jan; Holl, Berry; Lecoeur-Taïbi, Isabelle; Mowlavi, Nami; Nienartowicz, Krzysztof; Blanco, Diego Ordóñez; Rimoldini, Lorenzo; Ruiz, Idoia
2015-06-01
We study the problem of periodicity detection in massive data sets of photometric or radial velocity time series, as presented by ESA's Gaia mission. Periodicity detection hinges on the estimation of the false alarm probability of the extremum of the periodogram of the time series. We consider the problem of its estimation with two main issues in mind. First, for a given number of observations and signal-to-noise ratio, the rate of correct periodicity detections should be constant for all realized cadences of observations regardless of the observational time patterns, in order to avoid sky biases that are difficult to assess. Second, the computational loads should be kept feasible even for millions of time series. Using the Gaia case, we compare the FM method of Paltani and Schwarzenberg-Czerny, the Baluev method and the GEV method of Süveges, as well as a method for the direct estimation of a threshold. Three methods involve some unknown parameters, which are obtained by fitting a regression-type predictive model using easily obtainable covariates derived from observational time series. We conclude that the GEV and the Baluev methods both provide good solutions to the issues posed by large-scale processing. The first of these yields the best scientific quality at the price of some moderately costly pre-processing. When this pre-processing is impossible for some reason (e.g. the computational costs are prohibitive or good regression models cannot be constructed), the Baluev method provides a computationally inexpensive alternative with slight biases in regions where time samplings exhibit strong aliases.
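The false alarm probability (FAP) of a periodogram maximum, central to the comparison above, can always be estimated by brute-force Monte Carlo simulation; the methods compared in the paper are essentially cheaper substitutes for this baseline. A minimal sketch under simplifying assumptions (classical periodogram, Gaussian white noise, hypothetical function names):

```python
import math
import random

def periodogram_max(t, y, freqs):
    # Maximum classical periodogram power over a set of trial frequencies.
    n = len(y)
    best = 0.0
    for f in freqs:
        c = sum(yi * math.cos(2 * math.pi * f * ti) for ti, yi in zip(t, y))
        s = sum(yi * math.sin(2 * math.pi * f * ti) for ti, yi in zip(t, y))
        best = max(best, (c * c + s * s) / n)
    return best

def false_alarm_probability(t, y, freqs, n_sim=100, seed=0):
    # Brute-force FAP: fraction of pure-noise simulations (same cadence,
    # same variance) whose periodogram maximum exceeds the observed one.
    rng = random.Random(seed)
    obs = periodogram_max(t, y, freqs)
    sd = math.sqrt(sum(v * v for v in y) / len(y))
    hits = sum(
        periodogram_max(t, [rng.gauss(0.0, sd) for _ in y], freqs) >= obs
        for _ in range(n_sim)
    )
    return hits / n_sim
```

The cost of this simulation for millions of series is exactly what motivates analytic approximations such as the Baluev and GEV methods.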
NASA Astrophysics Data System (ADS)
Zhao, J. Q.; Yang, J.; Li, P. X.; Liu, M. Y.; Shi, Y. M.
2016-06-01
Accurate and timely change detection of Earth's surface features is extremely important for understanding relationships and interactions between people and natural phenomena. Many traditional methods of change detection use only a part of the polarization information and rely on supervised threshold selection; such methods are insufficient and time-consuming. In this paper, we present a novel unsupervised change-detection method based on quad-polarimetric SAR data and automatic threshold selection. First, speckle noise is removed from the two registered SAR images. Second, the similarity measure is calculated by the test statistic, and KI automatic threshold selection is introduced to obtain the change map. The efficiency of the proposed method is demonstrated on quad-pol SAR images acquired by Radarsat-2 over Wuhan, China.
Malm, Christer B.; Khoo, Nelson S.; Granlund, Irene; Lindstedt, Emilia; Hult, Andreas
2016-01-01
The discovery of erythropoietin (EPO) simplified blood doping in sports, but improved detection methods for EPO have forced cheating athletes to return to blood transfusion. Autologous blood transfusion with cryopreserved red blood cells (RBCs) is the method of choice, because no valid method exists to accurately detect such an event. In endurance sports, it can be estimated that elite athletes improve performance by up to 3% with blood doping, regardless of method. Valid detection methods for autologous blood doping are important to maintain the credibility of athletic performances. Recreational male (N = 27) and female (N = 11) athletes served as Transfusion (N = 28) and Control (N = 10) subjects in two different transfusion settings. Hematological variables and physical performance were measured before donation of 450 or 900 mL whole blood, and until four weeks after re-infusion of the cryopreserved RBC fraction. Blood was analyzed for transferrin, iron, Hb, EVF, MCV, MCHC, reticulocytes, leucocytes and EPO. Repeated measures multivariate analysis of variance (MANOVA) and pattern recognition using Principal Component Analysis (PCA) and Orthogonal Projections of Latent Structures (OPLS) discriminant analysis (DA) were used to investigate differences between Control and Transfusion groups over time. A significant increase in performance (15 ± 8%) and VO2max (17 ± 10%) (mean ± SD) could be measured 48 h after RBC re-infusion, and remained increased for up to four weeks in some subjects. In total, 533 blood samples were included in the study (Clean = 220, Transfused = 313). In response to blood transfusion, the largest change in hematological variables occurred 48 h after blood donation, when Control and Transfused groups could be separated with OPLS-DA (R2 = 0.76/Q2 = 0.59). RBC re-infusion resulted in the best model (R2 = 0.40/Q2 = 0.10) at the first sampling point (48 h), predicting one false positive and one false negative. Overall, a 25% and 86% false positive ratio was
Buhl, M.R.; Clark, G.A.; Candy, J.V.; Thomas, G.H.
1993-12-01
The goal of this work was to detect "single-leg separated" Bjoerk-Shiley Convexo-Concave heart valves which had been implanted in sheep. A "single-leg separated" heart valve contains a fracture in the outlet strut resulting in an increased risk of mechanical failure. The approach presented in this report detects such fractures by applying statistical pattern recognition with the nearest neighbor classifier to the acoustic signatures of the valve opening. This approach is discussed and results of applying it to real data are given.
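A nearest neighbor classifier of the kind applied to the valve-opening acoustic signatures can be sketched in a few lines; the feature vectors and labels below are illustrative, not from the report:

```python
import math

def nearest_neighbor_classify(query, training):
    # 1-NN: return the label of the training example whose feature vector
    # is closest (Euclidean distance) to the query vector.
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(training, key=lambda pair: dist(query, pair[0]))[1]
```

In practice the feature vectors would be derived from the acoustic signature of each valve opening, with labeled examples of intact and fractured valves serving as the training set.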
Buhl, M.R.; Clark, G.A.; Candy, J.V.; Thomas, G.H.
1993-07-16
The goal of this work was to detect "single-leg separated" Bjoerk-Shiley Convexo-Concave heart valves which had been implanted in sheep. A "single-leg separated" heart valve contains a fracture in the outlet strut resulting in an increased risk of mechanical failure. The approach presented in this report detects such fractures by applying statistical pattern recognition with the nearest neighbor classifier to the acoustic signatures of the valve opening. This approach is discussed and results of applying it to real data are given.
Statistical Detection of Multiple-Choice Answer Copying: Review and Commentary.
ERIC Educational Resources Information Center
Frary, Robert B.
1993-01-01
Methods for detecting copying of multiple-choice test responses are reviewed and compared with respect to their effectiveness and the practicality of their application for groups of varying sizes. Reasons why effective detection methods are seldom applied in standardized and classroom testing are discussed. (SLD)
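Most copying indices of the kind reviewed compare the observed number of identical responses between two examinees to a chance baseline. A deliberately simplified match-count z score (not any specific published index; real indices condition on item difficulty and the examinees' own response patterns):

```python
import math

def copying_z_score(responses_a, responses_b, n_choices=4):
    # Compare the observed number of identical responses to a naive
    # chance baseline of 1/n_choices matches per item, via a normal
    # approximation to the binomial.
    n = len(responses_a)
    matches = sum(a == b for a, b in zip(responses_a, responses_b))
    p = 1.0 / n_choices
    return (matches - n * p) / math.sqrt(n * p * (1 - p))
```

A large z suggests more agreement than chance alone would produce; the effectiveness comparisons in the review concern far more refined versions of this idea.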
Research of adaptive threshold edge detection algorithm based on statistics canny operator
NASA Astrophysics Data System (ADS)
Xu, Jian; Wang, Huaisuo; Huang, Hua
2015-12-01
The traditional Canny operator cannot obtain an optimal threshold across different scenes. On this foundation, an improved Canny edge detection algorithm based on adaptive thresholding is proposed. Experimental results on test images indicate that the improved algorithm obtains a reasonable threshold and achieves better accuracy and precision in edge detection.
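One common statistics-based way to adapt Canny thresholds per scene is to derive them from the gradient-magnitude distribution. The abstract does not specify the authors' exact rule, so the following is only a plausible sketch (high threshold = mean plus k standard deviations, low threshold = half of high):

```python
def adaptive_canny_thresholds(grad_mags, k=1.0):
    # Derive hysteresis thresholds from gradient-magnitude statistics:
    # high = mean + k*std, low = high / 2 (a common Canny heuristic).
    n = len(grad_mags)
    mean = sum(grad_mags) / n
    var = sum((g - mean) ** 2 for g in grad_mags) / n
    high = mean + k * var ** 0.5
    return high / 2, high
```

Because the thresholds follow the statistics of each image's gradients, no per-scene manual tuning is needed.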
NASA Astrophysics Data System (ADS)
Yu, Gang; Li, Changning; Zhang, Jianfeng
2013-12-01
Due to the limited information given by traditional local statistics, a new statistical modeling method for rolling element bearing fault signals is proposed based on the alpha-stable distribution. In order to take full advantage of the complete information provided by the alpha-stable distribution, this paper focuses on testing the validity of the proposed statistical model. A number of hypothesis testing methods were applied to practical bearing fault vibration signals with different fault types and degrees. Through testing the consistency of three alpha-stable parameter estimation methods, and the goodness of fit between the probability density functions of fault signals and their corresponding hypothesized alpha-stable distributions, it can be concluded that such a non-Gaussian model is sufficient to thoroughly describe the statistical characteristics of bearing fault signals with impulsive behaviors, and consequently the alpha-stable hypothesis is verified. Furthermore, a new bearing fault detection method based on the kurtogram and the α parameter of the alpha-stable model is proposed; experimental results show that the proposed method performs better at detecting incipient bearing faults than the traditional kurtogram-based method.
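The kurtogram underlying the proposed detector ranks sub-bands by kurtosis, which grows with the impulsiveness characteristic of bearing faults. A minimal excess-kurtosis computation (illustrative only; estimating the α parameter itself requires a dedicated alpha-stable fitting routine, which is beyond a stdlib sketch):

```python
def excess_kurtosis(x):
    # Sample excess kurtosis: 0 for Gaussian data, strongly positive for
    # impulsive signals such as bearing fault vibrations.
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n   # second central moment
    m4 = sum((v - m) ** 4 for v in x) / n   # fourth central moment
    return m4 / (m2 * m2) - 3.0
```

A kurtogram applies this statistic per frequency band and window length to locate the band where fault impulses stand out most clearly.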
de Marcellus, Pierre; Bertrand, Marylène; Nuevo, Michel; Westall, Frances; Le Sergeant d'Hendecourt, Louis
2011-11-01
The delivery of extraterrestrial organic materials to primitive Earth from meteorites or micrometeorites has long been postulated to be one of the origins of the prebiotic molecules involved in the subsequent apparition of life. Here, we report on experiments in which vacuum UV photo-irradiation of interstellar/circumstellar ice analogues containing H(2)O, CH(3)OH, and NH(3) led to the production of several molecules of prebiotic interest. These were recovered at room temperature in the semi-refractory, water-soluble residues after evaporation of the ice. In particular, we detected small quantities of hydantoin (2,4-imidazolidinedione), a species suspected to play an important role in the formation of poly- and oligopeptides. In addition, hydantoin is known to form under extraterrestrial, abiotic conditions, since it has been detected, along with various other derivatives, in the soluble part of organic matter of primitive carbonaceous meteorites. This result, together with other related experiments reported recently, points to the potential importance of the photochemistry of interstellar "dirty" ices in the formation of organics in Solar System materials. Such molecules could then have been delivered to the surface of primitive Earth, as well as other telluric (exo-) planets, to help trigger first prebiotic reactions with the capacity to lead to some form of primitive biomolecular activity. PMID:22059641
An Automated Statistical Process Control Study of Inline Mixing Using Spectrophotometric Detection
ERIC Educational Resources Information Center
Dickey, Michael D.; Stewart, Michael D.; Willson, C. Grant
2006-01-01
An experiment is described, which is designed for a junior-level chemical engineering "fundamentals of measurements and data analysis" course, where students are introduced to the concept of statistical process control (SPC) through a simple inline mixing experiment. The students learn how to create and analyze control charts in an effort to…
Bamidis, P D; Lithari, C; Konstantinidis, S T
2010-01-01
With the number of scientific papers published in journals, conference proceedings, and international literature ever increasing, authors and reviewers are not only presented with an abundance of information, but unfortunately are also continuously confronted with risks associated with the erroneous copying of another's material. In parallel, Information Communication Technology (ICT) tools provide researchers novel and continuously more effective ways to analyze and present their work. Software tools for statistical analysis offer scientists the chance to validate their work and enhance the quality of published papers. Moreover, from the reviewer's and the editor's perspective, it is now possible to check the (text-content) originality of a scientific article with automated software tools for plagiarism detection. In this paper, we provide a step-by-step demonstration of two categories of tools, namely, statistical analysis and plagiarism detection. The aim is not to come up with a specific tool recommendation, but rather to provide useful guidelines on the proper use and efficiency of either category of tools. In the context of this special issue, this paper offers a useful tutorial on specific problems concerned with scientific writing and review discourse. A specific neuroscience experimental case example is utilized to illustrate the young researcher's statistical analysis burden, while a test scenario is purpose-built using open access journal articles to exemplify the use and comparative outputs of seven plagiarism detection software packages. PMID:21487489
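At their core, text-overlap plagiarism detectors score shared word sequences between documents. A toy illustration using Jaccard similarity over word n-grams (real tools add normalization, indexing at scale, and handling of quotations and citations):

```python
def ngram_overlap(text_a, text_b, n=3):
    # Jaccard similarity over word n-grams: |shared| / |union|.
    def grams(t):
        w = t.lower().split()
        return {tuple(w[i:i + n]) for i in range(len(w) - n + 1)}
    a, b = grams(text_a), grams(text_b)
    return len(a & b) / len(a | b) if a | b else 0.0
```

A score near 1.0 flags near-verbatim reuse; intermediate scores typically trigger human review rather than an automatic verdict.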
Nakashima, R; Hosono, Y; Mimori, T
2016-07-01
Anti-aminoacyl-tRNA synthetase (ARS) and anti-melanoma differentiation-associated gene 5 (MDA5) antibodies are closely associated with interstitial lung disease in polymyositis and dermatomyositis. Anti-ARS-positive patients develop common clinical characteristics termed anti-synthetase syndrome and share a common clinical course, in which they respond well to initial treatment with glucocorticoids but in which disease tends to recur when glucocorticoids are tapered. Anti-MDA5 antibody is associated with rapidly progressive interstitial lung disease and poor prognosis, particularly in Asia. Therefore, intensive immunosuppressive therapy is required for anti-MDA5-positive patients from the early phase of the disease. New enzyme-linked immunosorbent assays to detect anti-ARS and anti-MDA5 antibodies have recently been established and are suggested to be efficient and useful. These assays are expected to be widely applied in daily practice. PMID:27252271
Maric, Marija; de Haan, Else; Hogendoorn, Sanne M; Wolters, Lidewij H; Huizenga, Hilde M
2015-03-01
Single-case experimental designs are useful methods in clinical research practice to investigate individual client progress. Their proliferation might have been hampered by methodological challenges such as the difficulty applying existing statistical procedures. In this article, we describe a data-analytic method to analyze univariate (i.e., one symptom) single-case data using the common package SPSS. This method can help the clinical researcher to investigate whether an intervention works as compared with a baseline period or another intervention type, and to determine whether symptom improvement is clinically significant. First, we describe the statistical method in a conceptual way and show how it can be implemented in SPSS. Simulation studies were performed to determine the number of observation points required per intervention phase. Second, to illustrate this method and its implications, we present a case study of an adolescent with anxiety disorders treated with cognitive-behavioral therapy techniques in an outpatient psychotherapy clinic, whose symptoms were regularly assessed before each session. We provide a description of the data analyses and results of this case study. Finally, we discuss the advantages and shortcomings of the proposed method.
Detecting significant change in stream benthic macroinvertebrate communities in wilderness areas
Milner, Alexander M.; Woodward, Andrea; Freilich, Jerome E.; Black, Robert W.; Resh, Vincent H.
2015-01-01
Within a region, both MDS analyses typically identified similar years as exceeding reference condition variation, illustrating the utility of the approach for identifying wider spatial scale effects that influence more than one stream. MDS responded to both simulated water temperature stress and a pollutant event, and generally outlying years on MDS plots could be explained by environmental variables, particularly higher precipitation. Multivariate control charts successfully identified whether shifts in community structure identified by MDS were significant and whether the shift represented a press disturbance (long-term change) or a pulse disturbance. We consider a combination of TD and MDS with control charts to be a potentially powerful tool for determining years significantly outside of a reference condition variation.
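A univariate analogue of the control-chart step is easy to sketch: limits are set from reference-condition observations, and later years are flagged when they fall outside them. The study uses multivariate control charts on MDS ordinations; the version below is a deliberate simplification for illustration:

```python
import statistics

def shewhart_limits(reference, k=3.0):
    # Control limits from reference-condition observations: mean ± k·sd.
    m = statistics.mean(reference)
    s = statistics.pstdev(reference)
    return m - k * s, m + k * s

def exceeds_reference(values, limits):
    # Flag observations falling outside the reference-condition limits.
    lo, hi = limits
    return [v < lo or v > hi for v in values]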
Statistical power of detecting trends in total suspended sediment loads to the Great Barrier Reef.
Darnell, Ross; Henderson, Brent; Kroon, Frederieke J; Kuhnert, Petra
2012-01-01
The export of pollutant loads from coastal catchments is of primary interest to natural resource management. For example, Reef Plan, a joint initiative by the Australian Government and the Queensland Government, has indicated that a 20% reduction in sediment is required by 2020. There is an obvious need to consider our ability to detect any trend if we are to set realistic targets or to reliably identify changes to catchment loads. We investigate the number of years of monitoring aquatic pollutant loads necessary to detect trends. Instead of modelling the trend in the annual loads directly, given their strong relationship to flow, we consider trends through the reduction in concentration for a given flow. Our simulations show very low power (<40%) of detecting changes of 20% over time periods of several decades, indicating that the chances of detecting trends of reasonable magnitudes over these time frames are very small. PMID:22551850
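Power calculations like the one described can be run by simulation: generate series with a known trend and noise level, and count how often the trend is declared significant. A simplified sketch (ordinary least squares slope, known noise variance, hypothetical parameter values; the study's flow-adjusted concentration model is more elaborate):

```python
import math
import random

def trend_power(years, slope, sigma, n_sim=500, seed=1):
    # Monte Carlo power at the two-sided 5% level: fraction of simulated
    # series whose OLS slope is significant under a known-sigma normal
    # approximation.
    rng = random.Random(seed)
    t = list(range(years))
    tm = sum(t) / years
    sxx = sum((ti - tm) ** 2 for ti in t)
    se = sigma / math.sqrt(sxx)  # standard error of the slope
    z = 1.96  # two-sided 5% normal critical value
    hits = 0
    for _ in range(n_sim):
        y = [slope * ti + rng.gauss(0.0, sigma) for ti in t]
        ym = sum(y) / years
        b = sum((ti - tm) * (yi - ym) for ti, yi in zip(t, y)) / sxx
        if abs(b / se) > z:
            hits += 1
    return hits / n_sim
```

Runs of this kind, with realistic noise levels, are what yield the reported conclusion that power stays below 40% over multi-decade windows for a 20% change.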
Aouinti, Safa; Malouche, Dhafer; Giudicelli, Véronique; Kossida, Sofia; Lefranc, Marie-Paule
2015-01-01
The adaptive immune responses of humans and of other jawed vertebrate species (Gnathostomata) are characterized by the B and T cells and their specific antigen receptors, the immunoglobulins (IG) or antibodies and the T cell receptors (TR) (up to 2 × 10^12 different IG and TR per individual). IMGT, the international ImMunoGeneTics information system (http://www.imgt.org), was created in 1989 by Marie-Paule Lefranc (Montpellier University and CNRS) to manage the huge and complex diversity of these antigen receptors. IMGT, built on the IMGT-ONTOLOGY concepts of identification (keywords), description (labels), classification (gene and allele nomenclature) and numerotation (IMGT unique numbering), is at the origin of immunoinformatics, a science at the interface between immunogenetics and bioinformatics. IMGT/HighV-QUEST, the first web portal, and so far the only one, for the next generation sequencing (NGS) analysis of IG and TR, is the paradigm for immune repertoire standardized outputs and immunoprofiles of the adaptive immune responses. It provides the identification of the variable (V), diversity (D) and joining (J) genes and alleles, analysis of the V-(D)-J junction and complementarity determining region 3 (CDR3) and the characterization of the 'IMGT clonotype (AA)' (AA for amino acid) diversity and expression. IMGT/HighV-QUEST compares outputs of different batches, up to one million nucleotide sequences for the statistical module. These high throughput IG and TR repertoire immunoprofiles are of prime importance in vaccination, cancer, infectious diseases, autoimmunity and lymphoproliferative disorders; however, their comparative statistical analysis still remains a challenge. We present a standardized statistical procedure to analyze IMGT/HighV-QUEST outputs for the evaluation of the significance of the IMGT clonotype (AA) diversity differences in proportions, per gene of a given group, between NGS IG and TR repertoire immunoprofiles. The procedure is generic and
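Comparing a clonotype's proportion per gene between two repertoires reduces, in its simplest form, to a two-proportion test. A pooled z statistic sketch (the paper's standardized procedure, including multiple-testing control across genes, is more involved):

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    # Two-sample z test for a difference in proportions with a pooled
    # standard error, e.g. a gene's clonotype count x out of n total
    # clonotypes in each of two repertoires.
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # pooled proportion under H0
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se
```

With many genes tested per comparison, the resulting p-values would need a multiplicity correction before declaring repertoire differences significant.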
Čunderlíková, B
2016-09-01
Our understanding of cancer has evolved mainly from results of studies utilizing experimental models. The simplification inherent to in vitro cell culture models has enabled potential modes of cell behaviour in response to various external stimuli to be described, but it has also led to disappointments in clinical trials, presumably due to the lack of crucial tissue components, including the extracellular matrix (ECM). The ECM and its role in healthy and diseased tissues are being explored extensively, and the significance of the ECM for cell behaviour has been evidenced experimentally. Part of the information gathered in such research that is relevant to the natural conditions of the human body can be identified by carefully designed analyses of human tissue samples. This review summarizes published information on the clinical significance of the ECM in cancer and examines whether the effects of the ECM on cell behaviour evidenced in vitro can be supported by clinically based data acquired from the analysis of tissue samples. Based on current approaches of clinical immunohistochemical analysis, the impact of ECM components on tumour cell behaviour remains vague. Apart from traditionally considered limitations, other reasons may include a lack of stratification of analyzed cases based on clinicopathologic parameters, the inclusion of patients treated postoperatively with different treatments, or neglect of the complexity of interactions among tumour constituents. Nevertheless, reliable immunohistochemical studies represent a source of crucial information for the design of tumour models comprising ECM corresponding to the real clinical situation. Knowledge gathered from such immunohistochemical studies, combined with achievements in tissue engineering, holds promise for a reversal of the unfavourable trends in current translational oncologic research. PMID:27443915
Bartnik, A.; Wachulak, P.; Fiedorowicz, H.; Fok, T.; Jarocki, R.; Szczurek, M.
2013-11-15
In this work, spectral investigations of photoionized He plasmas were performed. The photoionized plasmas were created by irradiating a helium gas stream with intense pulses from a laser-plasma extreme ultraviolet (EUV) source. The EUV source was based on a double-stream Xe/Ne gas-puff target irradiated with 10 ns/10 J Nd:YAG laser pulses. The most intense emission from the source spanned a relatively narrow spectral region below 20 nm; however, the spectrally integrated intensity at longer wavelengths was also significant. The EUV radiation was focused onto a gas stream injected into a vacuum chamber synchronously with the EUV pulse. The long-wavelength part of the EUV radiation was used for backlighting of the photoionized plasmas to obtain absorption spectra. Both emission and absorption spectra in the EUV range were investigated. Significant differences between absorption spectra acquired for neutral helium and low temperature photoionized plasmas were demonstrated for the first time. A strong increase in the intensities and spectral widths of absorption lines, together with a red shift of the K-edge, was shown.
NASA Astrophysics Data System (ADS)
Bartnik, A.; Wachulak, P.; Fiedorowicz, H.; Fok, T.; Jarocki, R.; Szczurek, M.
2013-11-01
In this work, spectral investigations of photoionized He plasmas were performed. The photoionized plasmas were created by irradiating a helium gas stream with intense pulses from a laser-plasma extreme ultraviolet (EUV) source. The EUV source was based on a double-stream Xe/Ne gas-puff target irradiated with 10 ns/10 J Nd:YAG laser pulses. The most intense emission from the source spanned a relatively narrow spectral region below 20 nm; however, the spectrally integrated intensity at longer wavelengths was also significant. The EUV radiation was focused onto a gas stream injected into a vacuum chamber synchronously with the EUV pulse. The long-wavelength part of the EUV radiation was used for backlighting of the photoionized plasmas to obtain absorption spectra. Both emission and absorption spectra in the EUV range were investigated. Significant differences between absorption spectra acquired for neutral helium and low temperature photoionized plasmas were demonstrated for the first time. A strong increase in the intensities and spectral widths of absorption lines, together with a red shift of the K-edge, was shown.
Zeng, W; Liu, B
1999-01-01
Digital watermarking has been proposed as a means for copyright protection of multimedia data. Many existing watermarking schemes have focused on robust means of marking an image invisibly without really addressing the ends of these schemes. This paper first discusses some scenarios in which many current watermarking schemes fail to resolve the rightful ownership of an image. The key problems are then identified, and some crucial requirements for valid invisible watermark detection are discussed. In particular, we show that, for the particular application of resolving rightful ownership using invisible watermarks, it might be crucial to require that the original image not be directly involved in the watermark detection process. A general framework for validly detecting invisible watermarks is then proposed. Some requirements on the claimed signature/watermarks to be used for detection are discussed to prevent the existence of any counterfeit scheme. The optimal detection strategy within the framework is derived. We show the effectiveness of this technique based on some visual-model-based watermark encoding schemes. PMID:18267429
Prendes, Jorge; Chabert, Marie; Pascal, Frederic; Giros, Alain; Tourneret, Jean-Yves
2015-03-01
Remote sensing images are commonly used to monitor the evolution of the Earth's surface. This surveillance can be conducted by detecting changes between images acquired at different times, and possibly by different kinds of sensors. A representative case is when an optical image of a given area is available and a new image is acquired in an emergency situation (resulting from a natural disaster, for instance) by a radar satellite. In such a case, images with heterogeneous properties have to be compared for change detection. This paper proposes a new approach for similarity measurement between images acquired by heterogeneous sensors. The approach exploits the physical properties of the sensors considered, especially the associated measurement noise models and local joint distributions. These properties are inferred through manifold learning. The resulting similarity measure has been successfully applied to detect changes between many kinds of images, including pairs of optical images and pairs of optical-radar images.
Dantec, Loïck Le; Chagné, David; Pot, David; Cantin, Olivier; Garnier-Géré, Pauline; Bedon, Frank; Frigerio, Jean-Marc; Chaumeil, Philippe; Léger, Patrick; Garcia, Virginie; Laigret, Frédéric; De Daruvar, Antoine; Plomion, Christophe
2004-02-01
We developed an automated pipeline for the detection of single nucleotide polymorphisms (SNPs) in expressed sequence tag (EST) data sets, by combining three DNA sequence analysis programs: Phred, Phrap and PolyBayes. This application requires access to the individual electrophoregram traces. First, a reference set of 65 SNPs was obtained from the sequencing of 30 gametes in 13 maritime pine (Pinus pinaster Ait.) gene fragments (6671 bp), resulting in a frequency of 1 SNP every 102.6 bp. Second, the parameters of the three programs were optimized in order to retrieve as many true SNPs as possible, while keeping the rate of false positives as low as possible. Overall, the efficiency of detection of true SNPs was 83.1%. However, this rate varied largely as a function of the rare SNP allele frequency: down to 41% for rare SNP alleles (frequency < 10%), up to 98% for allele frequencies above 10%. Third, the detection method was applied to the 18498 assembled maritime pine ESTs, allowing a total of 1400 candidate SNPs to be identified in contigs containing between 4 and 20 sequence reads. These genetic resources, described for the first time in a forest tree species, were made available at http://www.pierroton.inra/genetics/Pinesnps. We also derived an analytical expression for the SNP detection probability as a function of the SNP allele frequency, the number of haploid genomes used to generate the EST sequence database, and the sample size of the contigs considered for SNP detection. The frequency of the SNP allele was shown to be the main factor influencing the probability of SNP detection.
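The analytical expression mentioned is not reproduced in the abstract, but a simplified with-replacement version conveys the idea: a SNP is detectable in a contig only if both alleles are sampled at least once. The sketch below ignores the finite pool of haploid genomes that the paper's full expression accounts for:

```python
def snp_detection_probability(p, reads):
    # Probability that a contig of `reads` independent sequences contains
    # both alleles of a SNP whose minor-allele frequency is p
    # (with-replacement approximation).
    return 1.0 - p ** reads - (1.0 - p) ** reads
```

This makes the abstract's main finding intuitive: for rare alleles the first term dominates and detection probability collapses, while for balanced alleles even small contigs almost surely sample both.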
ERIC Educational Resources Information Center
Liu, Ming-Tsung; Yu, Pao-Ta
2011-01-01
A personalized e-learning service provides learning content to fit learners' individual differences. Learning achievements are influenced by cognitive as well as non-cognitive factors such as mood, motivation, interest, and personal styles. This paper proposes the Learning Caution Indexes (LCI) to detect aberrant learning patterns. The philosophy…
NASA Astrophysics Data System (ADS)
Li, Feixue; Li, Manchun; Liang, Jian; Liu, Yongxue; Chen, Zhenjie; Chen, Dong
2008-10-01
Numerous remote sensing change detection methods have been used in urban land use change identification and analysis, among which image regression is regarded as effective as other approaches. Traditional image regression approaches for change detection often produce unsatisfactory results by assuming that the relationships in the study data are constant over space, and the spatial correlation between pixels inherent in remote sensing images is usually ignored in the analysis. Geographically Weighted Regression (GWR) addresses this weakness by obtaining local parameter estimates for each observation. This paper reports preliminary results from a study applying GWR to land use change detection in the urban center and urban fringe of Nanjing city, China, using satellite images from 2000 and 2004. The results show that GWR can identify the land use change, the global patterns, the local patterns, as well as the points not consistent with local patterns in the urban environment; the under-development and over-development points are also detected by the GWR model.
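GWR replaces a single global regression with one weighted fit per location, with weights decaying with distance from that location. A one-predictor sketch with a Gaussian kernel and 1-D coordinates (variable names and data layout are illustrative; real GWR uses 2-D coordinates and full design matrices):

```python
import math

def gwr_local_slope(x, y, coords, target, bandwidth):
    # Locally weighted least-squares slope at `target`: each observation
    # is weighted by a Gaussian kernel of its distance to the target.
    w = [math.exp(-0.5 * ((c - target) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted means
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    num = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    den = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    return num / den
```

Evaluating the slope at every location yields the spatially varying parameter surface that lets GWR separate global patterns from local anomalies.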
Efficient snoring and breathing detection based on sub-band spectral statistics.
Sun, Xiang; Kim, Jin Young; Won, Yonggwan; Kim, Jung-Ja; Kim, Kyung-Ah
2015-01-01
Snoring, a common symptom in the general population, may indicate the presence of obstructive sleep apnea (OSA). In order to detect snoring events in sleep sound recordings, a novel method is proposed in this paper. The proposed method operates by analyzing the acoustic characteristics of the snoring sounds. Based on these acoustic properties, feature vectors are obtained using the mean and standard deviation of the sub-band spectral energy. A support vector machine is then applied to perform the frame-based classification procedure. This method was demonstrated experimentally to be effective for snoring detection. The database for detection included full-night audio recordings from four individuals who acknowledged having snoring habits. The performance of the proposed method was evaluated by classifying different events (snoring, breathing and silence) from the sleep sound recordings and comparing the classification against ground truth. The proposed algorithm was able to achieve an accuracy of 99.61% for detecting snoring events, 99.16% for breathing, and 99.55% for silence. PMID:26406075
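The feature extraction step described (mean and standard deviation of sub-band spectral energy per frame) can be sketched as follows. A direct DFT is used for clarity; a real implementation would use an FFT and the paper's own band layout:

```python
import math

def subband_features(frame, n_bands=4):
    # Per-frame features: mean and standard deviation of spectral energy
    # in equal-width sub-bands of the magnitude-squared spectrum.
    n = len(frame)
    spec = []
    for k in range(n // 2):  # power at each positive frequency bin
        re = sum(v * math.cos(-2 * math.pi * k * i / n) for i, v in enumerate(frame))
        im = sum(v * math.sin(-2 * math.pi * k * i / n) for i, v in enumerate(frame))
        spec.append(re * re + im * im)
    band = len(spec) // n_bands
    feats = []
    for b in range(n_bands):
        chunk = spec[b * band:(b + 1) * band]
        m = sum(chunk) / len(chunk)
        sd = math.sqrt(sum((e - m) ** 2 for e in chunk) / len(chunk))
        feats.extend([m, sd])
    return feats
```

Each frame's feature vector would then be fed to the support vector machine to label it as snoring, breathing, or silence.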
Statistical Detection of Multiple-Choice Test Answer Copying: State of the Art.
ERIC Educational Resources Information Center
Frary, Robert B.
Practical and effective methods for detecting copying of multiple-choice test responses have been available for many years. These methods have been used routinely by large admissions and licensing testing programs. However, these methods are seldom applied in the areas of standardized or classroom testing in schools or colleges, and knowledge…
NASA Astrophysics Data System (ADS)
Salih, A. L.; Mühlbauer, M.; Grumpe, A.; Pasckert, J. H.; Wöhler, C.; Hiesinger, H.
2016-06-01
The analysis of the impact crater size-frequency distribution (CSFD) is a well-established approach to the determination of the age of planetary surfaces. Classically, estimation of the CSFD is achieved by manual crater counting and size determination in spacecraft images, which, however, becomes very time-consuming for large surface areas and/or high image resolution. With increasing availability of high-resolution (nearly) global image mosaics of planetary surfaces, a variety of automated methods for the detection of craters based on image data and/or topographic data have been developed. In this contribution a template-based crater detection algorithm is used which analyses image data acquired under known illumination conditions. Its results are used to establish the CSFD for the examined area, which is then used to estimate the absolute model age of the surface. The detection threshold of the automatic crater detection algorithm is calibrated based on a region with available manually determined CSFD such that the age inferred from the manual crater counts corresponds to the age inferred from the automatic crater detection results. With this detection threshold, the automatic crater detection algorithm can be applied to a much larger surface region around the calibration area. The proposed age estimation method is demonstrated for a Kaguya Terrain Camera image mosaic of 7.4 m per pixel resolution of the floor region of the lunar crater Tsiolkovsky, which consists of dark and flat mare basalt and has an area of nearly 10,000 km2. The region used for calibration, for which manual crater counts are available, has an area of 100 km2. In order to obtain a spatially resolved age map, CSFDs and surface ages are computed for overlapping quadratic regions of about 4.4 x 4.4 km2 size offset by a step width of 74 m. Our constructed surface age map of the floor of Tsiolkovsky shows age values of typically 3.2-3.3 Ga, while for small regions lower (down to 2.9 Ga) and higher
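The calibration step described above — choosing the detection threshold so that the automatic count in the calibration region matches the manual crater count — can be sketched as a bisection. The per-candidate score array is an assumed interface for the template matcher's output, not the authors' code:

```python
import numpy as np

def calibrate_threshold(scores, target_count, iters=40):
    """Bisect for the detection threshold at which the automatic
    detection count in the calibration region matches the manual
    crater count. 'scores' (per-candidate template-match responses)
    is an assumed interface."""
    lo, hi = float(scores.min()), float(scores.max())
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if int((scores >= mid).sum()) > target_count:
            lo = mid   # too many detections -> raise the threshold
        else:
            hi = mid   # at or below target -> try a lower threshold
    return hi          # hi always satisfies count <= target

scores = np.linspace(0.0, 1.0, 101)       # toy detector responses
t = calibrate_threshold(scores, target_count=20)
print(int((scores >= t).sum()))           # matches the target: 20
```

The calibrated threshold can then be applied unchanged to the much larger survey region, as the record describes.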
Jäger, Markus; Bottlender, Ronald; Strauss, Anton; Möller, Hans-Jürgen
2005-01-01
Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV), after Kraepelin's original description of "manic-depressive insanity," embodied a broad concept of affective disorders including mood-congruent and mood-incongruent psychotic features. Controversial results have been reported about the prognostic significance of psychotic symptoms in depressive disorders challenging this broad concept of affective disorders. One hundred seventeen inpatients first hospitalized in 1980 to 1982 who retrospectively fulfilled the DSM-IV criteria for depressive disorders with mood-congruent or mood-incongruent psychotic features (n = 20), nonpsychotic depressive disorders (n = 33), or schizophrenia (n = 64) were followed up 15 years after their first hospitalization. Global functioning was recorded with the Global Assessment Scale; the clinical picture at follow-up was assessed using the Hamilton Rating Scale for Depression, the Positive and Negative Syndrome Scale, and the Scale for the Assessment of Negative Symptoms. With respect to global functioning, clinical picture, and social impairment at follow-up, depressive disorders with psychotic features were similar to those without, but markedly different from schizophrenia. However, patients with psychotic depressive disorders experienced more rehospitalizations than those with nonpsychotic ones. The findings indicating low prognostic significance of psychotic symptoms in depressive disorders are in line with the broad concept of affective disorders in DSM-IV.
Detecting hippocampal shape changes in Alzheimer's disease using statistical shape models
NASA Astrophysics Data System (ADS)
Shen, Kaikai; Bourgeat, Pierrick; Fripp, Jurgen; Meriaudeau, Fabrice; Salvado, Olivier
2011-03-01
The hippocampus is affected at an early stage in the development of Alzheimer's disease (AD). Using brain Magnetic Resonance (MR) images, we can investigate the effect of AD on the morphology of the hippocampus. Statistical shape models (SSM) are usually used to describe and model the hippocampal shape variations among the population. We use the shape variation from SSM as features to classify AD from normal control cases (NC). Conventional SSM uses principal component analysis (PCA) to compute the modes of variations among the population. Although these modes are representative of variations within the training data, they are not necessarily discriminant on labelled data. In this study, a Hotelling's T² test is used to qualify the landmarks which can be used for PCA. The resulting variation modes are used as predictors of AD from NC. The discrimination ability of these predictors is evaluated in terms of their classification performances using support vector machines (SVM). Using only landmarks statistically discriminant between AD and NC in SSM showed a better separation between AD and NC. These predictors also showed better correlation to the cognitive scores such as mini-mental state examination (MMSE) and Alzheimer's disease assessment scale (ADAS).
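The landmark-selection idea can be sketched minimally: a per-landmark two-sample Hotelling's T², then PCA restricted to the retained landmarks. Here the top-k landmarks by T² stand in for the paper's p-value qualification, and all data shapes are invented:

```python
import numpy as np

def hotelling_t2(a, b):
    """Two-sample Hotelling's T² for one landmark; a, b are
    (subjects x 3) coordinate arrays for the two groups."""
    na, nb = len(a), len(b)
    d = a.mean(0) - b.mean(0)
    pooled = ((na - 1) * np.cov(a.T) + (nb - 1) * np.cov(b.T)) / (na + nb - 2)
    return na * nb / (na + nb) * d @ np.linalg.solve(pooled, d)

def discriminant_modes(shapes_a, shapes_b, n_keep=10, n_modes=3):
    """Keep the n_keep landmarks with the largest between-group T²,
    then run PCA (via SVD) on those landmarks only; the paper
    qualifies landmarks by p-value, top-k is used here for simplicity."""
    t2 = np.array([hotelling_t2(shapes_a[:, i], shapes_b[:, i])
                   for i in range(shapes_a.shape[1])])
    keep = np.argsort(t2)[-n_keep:]
    x = np.concatenate([shapes_a, shapes_b])[:, keep]
    x = x.reshape(len(x), -1)
    x = x - x.mean(0)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_modes] * s[:n_modes]   # mode scores -> SVM predictors

rng = np.random.default_rng(0)
ad = rng.standard_normal((20, 30, 3))
ad[:, :5] += 2.0                          # 5 landmarks shifted in 'AD'
nc = rng.standard_normal((20, 30, 3))
print(discriminant_modes(ad, nc).shape)   # (40, 3)
```

The returned mode scores are the per-subject predictors that would be fed to the SVM classifier.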
NASA Astrophysics Data System (ADS)
Vanli, O. Arda; Jung, Sungmoon
2014-01-01
Health monitoring of large structures with embedded, distributed sensor systems is gaining importance. This study proposes a new probabilistic model updating method in order to improve the damage prediction capability of a finite element analysis (FEA) model with experimental observations from a Lamb-wave sensing system. The approach statistically calibrates unknown parameters of the FEA model and estimates a bias-correcting function to achieve a good match between the model predictions and sensor observations. An experimental validation study is presented in which a set of controlled damages are generated on a composite panel. Time-series signals are collected with the damage condition using a Lamb-wave sensing system and a one dimensional FEA model of the panel is constructed to quantify the damages. The damage indices from both the experiments and the computational model are used to calibrate assumed parameters of the FEA model and to estimate a bias-correction function. The updated model is used to predict the size (extent) and location of damage. It is shown that the proposed model updating approach achieves a prediction accuracy that is superior to a purely statistical approach or a deterministic model calibration approach.
Xu, Kai; Yoshida, Ruriko
2010-01-01
Although exchange of genetic information by recombination plays an important role in the evolution of viruses, it is not clear how it generates diversity. Understanding recombination events helps with the study of the evolution of new virus strains or new viruses. Geminiviruses are plant viruses which have ambisense single-stranded circular DNA genomes and are among the most economically important plant viruses in agricultural production. Small circular single-stranded DNA satellites, termed DNA-β, have recently been found to be associated with some geminivirus infections. In this paper we analyze several DNA-β sequences of geminiviruses for recombination events using phylogenetic and statistical analysis, and we find that one strain from ToLCMaB has a recombination pattern and is a recombinant molecule between two strains from two species, PaLCuB-[IN:Chi:05] (major parent) and ToLCB-[IN:CP:04] (minor parent). We propose that this recombination event contributed to the evolution of this strain of ToLCMaB in South India. The Hidden Markov Model (HMM) method developed by Webb et al. (2009), which estimates the phylogenetic tree throughout the whole alignment, provides us with a recombination history of these DNA-β strains. This is the first time that this statistical method has been used to study DNA-β recombination, and it gives a clear recombination history of DNA-β. PMID:21423447
Statistical approaches to nonstationary EEGs for the detection of slow vertex responses.
Fujikake, M; Ninomija, S P; Fujita, H
1989-06-01
A slow vertex response (SVR) is an electric auditory evoked response used for an objective hearing power test. One of the aims of an objective hearing power test is to find infants whose hearing is less than that of normal infants. Early medical treatment is important for infants with a loss of hearing so that their development is not delayed. To measure SVRs, we generally use the averaged summation method of an electroencephalogram (EEG), because the signal-to-noise ratio (SVR to EEG) is very poor. To increase the reliability and stability of measured SVRs, and at the same time to lighten the burden of testing, it is necessary to devise an effective measurement method for SVRs. Two factors must be considered: (1) SVR waveforms change following the changes of EEGs caused by sleeping, and (2) EEGs are considered as nonstationary data in prolonged measurement. In this paper, five statistical methods are used on two different models: a stationary model and a nonstationary model. Through the comparison of waves obtained by each method, we clarify the statistical characteristics of the original data (EEGs including SVRs) and consider the conditions that affect the measurement method of an SVR. PMID:2794816
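Why averaged summation is needed at such a poor signal-to-noise ratio can be shown with a toy simulation: averaging N stimulus-locked epochs shrinks the residual noise roughly as 1/√N. The waveform shape and noise level below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 200)
svr = np.exp(-((t - 0.3) ** 2) / 0.005)   # toy evoked-response waveform
noise_sd = 5.0                            # raw EEG swamps the response
epochs = svr + noise_sd * rng.standard_normal((400, t.size))

avg = epochs.mean(axis=0)                 # averaged summation, N = 400
residual = (avg - svr).std()              # ~ noise_sd / sqrt(N) = 0.25
print(round(float(residual), 2))
```

The 1/√N scaling holds only while the EEG noise is stationary across epochs, which is exactly the assumption the paper questions for prolonged measurements.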
NASA Astrophysics Data System (ADS)
Lin, Y. Q.; Ren, W. X.; Fang, S. E.
2011-11-01
Although most vibration-based damage detection methods can acquire satisfactory verification on analytical or numerical structures, most of them may encounter problems when applied to real-world structures under varying environments. The damage detection methods that directly extract damage features from the periodically sampled dynamic time history response measurements are desirable but relevant research and field application verification are still lacking. In this second part of a two-part paper, the robustness and performance of the statistics-based damage index using the forward innovation model by stochastic subspace identification of a vibrating structure proposed in the first part have been investigated against two prestressed reinforced concrete (RC) beams tested in the laboratory and a full-scale RC arch bridge tested in the field under varying environments. Experimental verification is focused on temperature effects. It is demonstrated that the proposed statistics-based damage index is insensitive to temperature variations but sensitive to the structural deterioration or state alteration. This makes it possible to detect the structural damage for the real-scale structures experiencing ambient excitations and varying environmental conditions.
Guilera, Georgina; Gómez-Benito, Juana; Hidalgo, Maria Dolores; Sánchez-Meca, Julio
2013-12-01
This article presents a meta-analysis of studies investigating the effectiveness of the Mantel-Haenszel (MH) procedure when used to detect differential item functioning (DIF). Studies were located electronically in the main databases, representing the codification of 3,774 different simulation conditions, 1,865 related to Type I error and 1,909 to statistical power. The homogeneity of effect-size distributions was assessed by the Q statistic. The extremely high heterogeneity in both error rates (I² = 94.70) and power (I² = 99.29), due to the fact that numerous studies test the procedure in extreme conditions, means that the main interest of the results lies in explaining the variability in detection rates. One-way analysis of variance was used to determine the effects of each variable on detection rates, showing that the MH test was more effective when purification procedures were used, when the data fitted the Rasch model, when test contamination was below 20%, and with sample sizes above 500. The results imply a series of recommendations for practitioners who wish to study DIF with the MH test. A limitation, one inherent to all meta-analyses, is that not all the possible moderator variables, or the levels of variables, have been explored. This serves to remind us of certain gaps in the scientific literature (i.e., regarding the direction of DIF or variances in ability distribution) and is an aspect that methodologists should consider in future simulation studies. PMID:24127986
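The MH procedure itself reduces, per item, to a common odds ratio pooled over ability strata. A minimal sketch (the 2×2 tables below are invented):

```python
def mh_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio over 2x2 tables
    ((a, b), (c, d)) per ability stratum: a, b = reference group
    correct/incorrect; c, d = focal group correct/incorrect."""
    num = sum(a * d / (a + b + c + d) for (a, b), (c, d) in tables)
    den = sum(b * c / (a + b + c + d) for (a, b), (c, d) in tables)
    return num / den

# invented strata; an item without DIF gives a ratio near 1
tables = [((40, 10), (38, 12)), ((30, 20), (29, 21)), ((15, 35), (14, 36))]
print(round(mh_odds_ratio(tables), 2))    # 1.14
```

A ratio far from 1 (conventionally, after transforming to the delta scale) flags the item for DIF; the meta-analysis above concerns how reliably that flagging works under varying conditions.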
Röling, Wilfred F M; Aerts, Joost W; Patty, C H Lucas; ten Kate, Inge Loes; Ehrenfreund, Pascale; Direito, Susana O L
2015-06-01
The detection of biomarkers plays a central role in our effort to establish whether there is, or was, life beyond Earth. In this review, we address the importance of considering mineralogy in relation to the selection of locations and biomarker detection methodologies with characteristics most promising for exploration. We review relevant mineral-biomarker and mineral-microbe interactions. The local mineralogy on a particular planet reflects its past and current environmental conditions and allows a habitability assessment by comparison with life under extreme conditions on Earth. The type of mineral significantly influences the potential abundances and types of biomarkers and microorganisms containing these biomarkers. The strong adsorptive power of some minerals aids in the preservation of biomarkers and may have been important in the origin of life. On the other hand, this strong adsorption as well as oxidizing properties of minerals can interfere with efficient extraction and detection of biomarkers. Differences in mechanisms of adsorption and in properties of minerals and biomarkers suggest that it will be difficult to design a single extraction procedure for a wide range of biomarkers. While on Mars samples can be used for direct detection of biomarkers such as nucleic acids, amino acids, and lipids, on other planetary bodies remote spectrometric detection of biosignatures has to be relied upon. The interpretation of spectral signatures of photosynthesis can also be affected by local mineralogy. We identify current gaps in our knowledge and indicate how they may be filled to improve the chances of detecting biomarkers on Mars and beyond.
Accounting for imperfect detection and survey bias in statistical analysis of presence-only data
Dorazio, Robert M.
2014-01-01
Using mathematical proof and simulation-based comparisons, I demonstrate that biases induced by errors in detection or biased selection of survey locations can be reduced or eliminated by using the hierarchical model to analyse presence-only data in conjunction with counts observed in planned surveys. I show that a relatively small number of high-quality data (from planned surveys) can be used to leverage the information in presence-only observations, which usually have broad spatial coverage but may not be informative of both occurrence and detectability of individuals. Because a variety of sampling protocols can be used in planned surveys, this approach to the analysis of presence-only data is widely applicable. In addition, since the point-process model is formulated at the level of an individual, it can be extended to account for biological interactions between individuals and temporal changes in their spatial distributions.
Acoustic detection of North Atlantic right whale contact calls using spectrogram-based statistics.
Urazghildiiev, Ildar R; Clark, Christopher W
2007-08-01
This paper considers the problem of detection of contact calls produced by the critically endangered North Atlantic right whale, Eubalaena glacialis. To reduce computational time, the class of acceptable detectors is constrained by the detectors implemented as a bank of two-dimensional linear FIR filters and using the data spectrogram as the input. The closed form representations for the detectors are derived and the detection performance is compared with that of the generalized likelihood ratio test (GLRT) detector. The test results demonstrate that in the presence of impulsive noise, the spectrogram-based detector using the French hat wavelet as the filter kernel outperforms the GLRT detector and decreases computational time by a factor of 6. PMID:17672627
NASA Technical Reports Server (NTRS)
Natarajan, Suresh; Gardner, C. S.
1987-01-01
Receiver timing synchronization of an optical Pulse-Position Modulation (PPM) communication system can be achieved using a phase-locked loop (PLL), provided the photodetector output is suitably processed. The magnitude of the PLL phase error is a good indicator of the timing error at the receiver decoder. The statistics of the phase error are investigated while varying several key system parameters such as PPM order, signal and background strengths, and PLL bandwidth. A practical optical communication system utilizing a laser diode transmitter and an avalanche photodiode in the receiver is described, and the sampled phase error data are presented. A linear regression analysis is applied to the data to obtain estimates of the relational constants involving the phase error variance and incident signal power.
Robust Statistical Approaches for RSS-Based Floor Detection in Indoor Localization.
Razavi, Alireza; Valkama, Mikko; Lohan, Elena Simona
2016-01-01
Floor detection for indoor 3D localization of mobile devices is currently an important challenge in the wireless world. Many approaches currently exist, but usually the robustness of such approaches is not addressed or investigated. The goal of this paper is to show how to robustify the floor estimation when probabilistic approaches with a low number of parameters are employed. Indeed, such an approach would allow a building-independent estimation and a lower computing power at the mobile side. Four robustified algorithms are presented: a robust weighted centroid localization method, a robust linear trilateration method, a robust nonlinear trilateration method, and a robust deconvolution method. The proposed approaches use the received signal strengths (RSS) measured by the Mobile Station (MS) from various heard WiFi access points (APs) and provide an estimate of the vertical position of the MS, which can be used for floor detection. We will show that robustification can indeed increase the performance of the RSS-based floor detection algorithms. PMID:27258279
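A weighted-centroid floor estimate with a simple trimming robustification can be sketched as follows; the trimming rule and the RSS-to-weight mapping are assumptions for illustration, not the paper's exact algorithms:

```python
import numpy as np

def robust_weighted_centroid_z(ap_z, rss, trim=0.2):
    """Vertical position estimate: centroid of access-point heights
    weighted by linearised RSS, after trimming the most extreme RSS
    values (trimming rule and weighting are assumptions)."""
    order = np.argsort(rss)
    k = int(len(rss) * trim)
    keep = order[k:len(rss) - k] if k > 0 else order
    w = 10.0 ** (rss[keep] / 10.0)        # dBm -> linear power weights
    return float(np.sum(w * ap_z[keep]) / np.sum(w))

ap_z = np.array([0.0, 3.0, 3.0, 6.0, 6.0])   # AP heights (m), 3 m floors
rss = np.array([-80.0, -55.0, -60.0, -75.0, -90.0])
z = robust_weighted_centroid_z(ap_z, rss)
print(round(z / 3.0))                     # nearest floor index: 1
```

Mapping the continuous height estimate to the nearest floor is the final detection step; trimming protects the centroid from single anomalous RSS readings.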
NASA Astrophysics Data System (ADS)
Ciuonzo, D.; De Maio, A.; Orlando, D.
2016-06-01
This paper deals with the problem of adaptive multidimensional/multichannel signal detection in homogeneous Gaussian disturbance with unknown covariance matrix and structured deterministic interference. The aforementioned problem corresponds to a generalization of the well-known Generalized Multivariate Analysis of Variance (GMANOVA). In this first part of the work, we formulate the considered problem in canonical form and, after identifying a desirable group of transformations for the considered hypothesis testing, we derive a Maximal Invariant Statistic (MIS) for the problem at hand. Furthermore, we provide the MIS distribution in the form of a stochastic representation. Finally, strong connections to the MIS obtained in the open literature in simpler scenarios are underlined.
Kulp, Marjean Taylor; Ying, Gui-shuang; Huang, Jiayan; Maguire, Maureen; Quinn, Graham; Ciner, Elise B.; Cyert, Lynn A.; Orel-Bixler, Deborah A.; Moore, Bruce D.
2014-01-01
Purpose. To evaluate, by receiver operating characteristic (ROC) analysis, the ability of noncycloplegic retinoscopy (NCR), Retinomax Autorefractor (Retinomax), and SureSight Vision Screener (SureSight) to detect significant refractive errors (RE) among preschoolers. Methods. Refraction results of eye care professionals using NCR, Retinomax, and SureSight (n = 2588) and of nurse and lay screeners using Retinomax and SureSight (n = 1452) were compared with masked cycloplegic retinoscopy results. Significant RE was defined as hyperopia greater than +3.25 diopters (D), myopia greater than 2.00 D, astigmatism greater than 1.50 D, and anisometropia greater than 1.00 D interocular difference in hyperopia, greater than 3.00 D interocular difference in myopia, or greater than 1.50 D interocular difference in astigmatism. The ability of each screening test to identify presence, type, and/or severity of significant RE was summarized by the area under the ROC curve (AUC) and calculated from weighted logistic regression models. Results. For detection of each type of significant RE, AUC of each test was high; AUC was better for detecting the most severe levels of RE than for all REs considered important to detect (AUC 0.97–1.00 vs. 0.92–0.93). The area under the curve of each screening test was high for myopia (AUC 0.97–0.99). Noncycloplegic retinoscopy and Retinomax performed better than SureSight for hyperopia (AUC 0.92–0.99 and 0.90–0.98 vs. 0.85–0.94, P ≤ 0.02), Retinomax performed better than NCR for astigmatism greater than 1.50 D (AUC 0.95 vs. 0.90, P = 0.01), and SureSight performed better than Retinomax for anisometropia (AUC 0.85–1.00 vs. 0.76–0.96, P ≤ 0.07). Performance was similar for nurse and lay screeners in detecting any significant RE (AUC 0.92–1.00 vs. 0.92–0.99). Conclusions. Each test had a very high discriminatory power for detecting children with any significant RE. PMID:24481262
Outlier detection using some methods of mathematical statistic in meteorological time-series
NASA Astrophysics Data System (ADS)
Elias, Michal; Dousa, Jan
2016-06-01
In many applications of Global Navigation Satellite Systems the meteorological time-series play a very important role, especially when representing a source of input data for other calculations such as corrections for very precise positioning. We are interested in those corrections which are related to troposphere delay modelling. Time-series might contain some non-homogeneities, depending on the type of the data source. In this paper outlier detection is discussed. For the investigation we used a method based on an autoregressive model, and the results of its application were compared with those of a regression model.
Statistical analysis of detection of, and upper limits on, dark matter lines
Conrad, Jan; Ylinen, Tomi; Scargle, Jeffrey
2007-07-12
In this note we present calculations of coverage and power for three different methods which could be used to calculate upper limits and/or claim discovery in a GLAST-LAT search for a dark matter line. The methods considered are profile likelihood, Bayes factors and likelihood ratio confidence intervals, and the calculations are done considering a simple benchmark model of two uncorrelated Poissonian measurements. Profile likelihood has the best coverage properties, the standard χ² test the worst. For the power, the situation is vice versa. In choosing a method one has to consider the false detection rate (1 - coverage) and weigh it against the achievable power.
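The notion of coverage used in this note can be illustrated with a toy Monte Carlo for a single Poisson measurement (pure standard library; the grid-searched classical limit and the settings are illustrative, not the note's actual benchmark model):

```python
import random
from math import exp, factorial

def pois_cdf(n, mu):
    return sum(mu ** k * exp(-mu) / factorial(k) for k in range(n + 1))

def upper_limit(n, cl=0.9, step=0.01):
    """Smallest s with P(N <= n | s) <= 1 - cl (classical Poisson
    upper limit, found by grid search for simplicity)."""
    s = 0.0
    while pois_cdf(n, s) > 1 - cl:
        s += step
    return s

def coverage(true_s, cl=0.9, n_sim=5000):
    """Monte Carlo coverage: fraction of toy experiments whose upper
    limit contains true_s (single Poisson measurement, no background)."""
    random.seed(0)
    limits, covered = {}, 0
    for _ in range(n_sim):
        # draw n ~ Poisson(true_s) by CDF inversion
        u, n, p = random.random(), 0, exp(-true_s)
        cdf = p
        while u > cdf:
            n += 1
            p *= true_s / n
            cdf += p
        if n not in limits:
            limits[n] = upper_limit(n, cl)
        covered += limits[n] >= true_s
    return covered / n_sim

print(coverage(3.0))   # about 0.95: classical limits over-cover
```

A method with coverage below the nominal confidence level undercovers (too many false exclusions); the note's comparison of profile likelihood, Bayes factors and the χ² test is exactly this kind of study on its two-measurement benchmark.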
Statistical modelling and power analysis for detecting trends in total suspended sediment loads
NASA Astrophysics Data System (ADS)
Wang, You-Gan; Wang, Shen S. J.; Dunlop, Jason
2015-01-01
The export of sediments from coastal catchments can have detrimental impacts on estuaries and near shore reef ecosystems such as the Great Barrier Reef. Catchment management approaches aimed at reducing sediment loads require monitoring to evaluate their effectiveness in reducing loads over time. However, load estimation is not a trivial task due to the complex behaviour of constituents in natural streams, the variability of water flows and often a limited amount of data. Regression is commonly used for load estimation and provides a fundamental tool for trend estimation by standardising the other time specific covariates such as flow. This study investigates whether load estimates and resultant power to detect trends can be enhanced by (i) modelling the error structure so that temporal correlation can be better quantified, (ii) making use of predictive variables, and (iii) by identifying an efficient and feasible sampling strategy that may be used to reduce sampling error. To achieve this, we propose a new regression model that includes an innovative compounding errors model structure and uses two additional predictive variables (average discounted flow and turbidity). By combining this modelling approach with a new, regularly optimised, sampling strategy, which adds uniformity to the event sampling strategy, the predictive power was increased to 90%. Using the enhanced regression model proposed here, it was possible to detect a trend of 20% over 20 years. This result is in stark contrast to previous conclusions presented in the literature.
A proposal for statistical evaluation of the detection of gunshot residues on a suspect.
Cardinetti, Bruno; Ciampini, Claudio; Abate, Sergio; Marchetrti, Christian; Ferrari, Francesco; Di Tullio, Donatello; D'Onofrio, Carlo; Orlando, Giovanni; Gravina, Luciano; Torresi, Luca; Saporita, Giuseppe
2006-01-01
The possibility of accidental contamination of a suspect by gunshot residues (GSRs) is considered. If two hypotheses are taken into account ("the suspect has shot a firearm" and "the suspect has not shot a firearm"), the likelihood ratio of the conditional probabilities of finding a number n of GSRs is defined. Choosing two Poisson distributions, the parameter lambda of the first one coincides with the mean number of GSRs that can be found on a firearm shooter, while the parameter mu of the second one is the mean number of GSRs that can be found on a nonshooter. In this scenario, the likelihood ratio of the conditional probabilities of finding a number n of GSRs in the two hypotheses can be easily calculated. The evaluation of the two parameters lambda and mu and of the goodness of the two probability distributions is performed by using different sets of data: "exclusive" lead-antimony-barium GSRs have been detected in two populations of 31 and 28 police officers at diverse fixed times since firearm practice, and in a population of 81 police officers who stated that they had not handled firearms for almost 1 month. The results show that the Poisson distributions well fit the data for both shooters and nonshooters, and that the probability of detection of two or more GSRs is normally greater if the suspect has shot firearms. PMID:16878785
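The likelihood ratio described here is a one-liner given the two Poisson means; the numeric values of lambda and mu below are illustrative, not the paper's estimates from the police-officer populations:

```python
from math import exp, factorial

def poisson_pmf(n, rate):
    return rate ** n * exp(-rate) / factorial(n)

def gsr_likelihood_ratio(n, lam, mu):
    """LR = P(n GSR particles | shooter) / P(n | non-shooter),
    with Poisson means lam (shooters) and mu (non-shooters)."""
    return poisson_pmf(n, lam) / poisson_pmf(n, mu)

# illustrative means: 4 particles on shooters, 0.5 on non-shooters
for n in range(4):
    # LR exceeds 1 from n = 2 upward with these means
    print(n, round(gsr_likelihood_ratio(n, 4.0, 0.5), 2))
```

With lam > mu the ratio grows geometrically in n (here by a factor lam/mu = 8 per particle), which matches the paper's conclusion that two or more detected particles favour the shooter hypothesis.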
Farrington, C. Paddy; Noufaily, Angela; Andrews, Nick J.; Charlett, Andre
2016-01-01
A large-scale multiple surveillance system for infectious disease outbreaks has been in operation in England and Wales since the early 1990s. Changes to the statistical algorithm at the heart of the system were proposed and the purpose of this paper is to compare two new algorithms with the original algorithm. Test data to evaluate performance are created from weekly counts of the number of cases of each of more than 2000 diseases over a twenty-year period. The time series of each disease is separated into one series giving the baseline (background) disease incidence and a second series giving disease outbreaks. One series is shifted forward by twelve months and the two are then recombined, giving a realistic series in which it is known where outbreaks have been added. The metrics used to evaluate performance include a scoring rule that appropriately balances sensitivity against specificity and is sensitive to variation in probabilities near 1. In the context of disease surveillance, a scoring rule can be adapted to reflect the size of outbreaks and this was done. Results indicate that the two new algorithms are comparable to each other and better than the algorithm they were designed to replace. PMID:27513749
Shen, Kai-kai; Fripp, Jurgen; Mériaudeau, Fabrice; Chételat, Gaël; Salvado, Olivier; Bourgeat, Pierrick
2012-02-01
The hippocampus is affected at an early stage in the development of Alzheimer's disease (AD). With the use of structural magnetic resonance (MR) imaging, we can investigate the effect of AD on the morphology of the hippocampus. The hippocampal shape variations among a population can usually be described using statistical shape models (SSMs). Conventional SSMs model the modes of variation among the population via principal component analysis (PCA). Although these modes are representative of variations within the training data, they are not necessarily discriminative on labeled data or relevant to the differences between the subpopulations. We use the shape descriptors from the SSM as features to classify AD from normal control (NC) cases. In this study, a Hotelling's T2 test is performed to select a subset of landmarks which are used in PCA. The resulting variation modes are used as predictors of AD from NC. The discrimination ability of these predictors is evaluated in terms of their classification performance with bagged support vector machines (SVMs). Restricting the model to landmarks with better separation between AD and NC increases the discrimination power of the SSM. The predictors extracted on the subregions also showed stronger correlation with memory-related measurements such as Logical Memory, the Auditory Verbal Learning Test (AVLT) and the memory subscores of the Alzheimer's Disease Assessment Scale (ADAS).
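The landmark-selection step rests on the standard two-sample Hotelling's T2 statistic. A minimal sketch (numpy only; in the study this would be applied landmark by landmark to coordinate vectors across the AD and NC groups, keeping landmarks whose statistic exceeds a threshold):

```python
import numpy as np

def hotelling_t2(X, Y):
    """Two-sample Hotelling's T^2 statistic for p-dimensional observations.

    X: (n1, p) array for one group (e.g. AD landmark coordinates),
    Y: (n2, p) array for the other group (e.g. NC).
    Uses the pooled covariance estimate of the two samples.
    """
    n1, n2 = len(X), len(Y)
    d = X.mean(axis=0) - Y.mean(axis=0)
    S = ((n1 - 1) * np.cov(X, rowvar=False)
         + (n2 - 1) * np.cov(Y, rowvar=False)) / (n1 + n2 - 2)
    return (n1 * n2) / (n1 + n2) * float(d @ np.linalg.solve(S, d))
```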
Wille, Anja; Gruissem, Wilhelm; Bühlmann, Peter; Hennig, Lars
2007-11-01
Accurately identifying differentially expressed genes from microarray data is not a trivial task, partly because of poor variance estimates of gene expression signals. Here, after analyzing 380 replicated microarray experiments, we found that probesets have typical, distinct variances that can be estimated based on a large number of microarray experiments. These probeset-specific variances depend at least in part on the function of the probed gene: genes for ribosomal or structural proteins often have a small variance, while genes implicated in stress responses often have large variances. We used these variance estimates to develop a statistical test for differentially expressed genes called EVE (external variance estimation). The EVE algorithm performs better than the t-test and LIMMA on some real-world data, where external information from appropriate databases is available. Thus, EVE helps to maximize the information gained from a typical microarray experiment. Nonetheless, only a large number of replicates will guarantee to identify nearly all truly differentially expressed genes. However, our simulation studies suggest that even limited numbers of replicates will usually result in good coverage of strongly differentially expressed genes.
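The abstract does not give the EVE statistic itself; the sketch below shows the general idea of an external-variance test: replace the poorly estimated within-experiment variance with a probeset-specific variance taken from a large database, so the statistic can be referred to the standard normal. The function name and interface are hypothetical.

```python
import math

def eve_like_z_test(x_treat, x_ctrl, ext_var):
    """Compare mean expression of a probeset between two conditions using an
    externally estimated variance (ext_var) instead of the noisy
    within-experiment estimate. Returns (z, two_sided_p); the p-value comes
    from the standard normal because the variance is treated as known."""
    n1, n2 = len(x_treat), len(x_ctrl)
    m1 = sum(x_treat) / n1
    m2 = sum(x_ctrl) / n2
    z = (m1 - m2) / math.sqrt(ext_var * (1 / n1 + 1 / n2))
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p
```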
Tai, Caroline G.; Graff, Rebecca E.; Liu, Jinghua; Passarelli, Michael N.; Mefford, Joel A.; Shaw, Gary M.; Hoffmann, Thomas J.; Witte, John S.
2015-01-01
Background The National Birth Defects Prevention Study (NBDPS) contains a wealth of information on affected and unaffected family triads, and thus provides numerous opportunities to study gene-environment interactions (GxE) in the etiology of birth defect outcomes. Depending on the research objective, several analytic options exist to estimate GxE effects that utilize varying combinations of individuals drawn from available triads. Methods In this paper we discuss several considerations in the collection of genetic data and environmental exposures. We will also present several population- and family-based approaches that can be applied to data from the NBDPS including case-control, case-only, family-based trio, and maternal versus fetal effects. For each, we describe the data requirements, applicable statistical methods, advantages and disadvantages. Discussion A range of approaches can be used to evaluate potentially important GxE effects in the NBDPS. Investigators should be aware of the limitations inherent to each approach when choosing a study design and interpreting results. PMID:26010994
Heggeseth, Brianna; Harley, Kim; Warner, Marcella; Jewell, Nicholas; Eskenazi, Brenda
2015-01-01
It has been hypothesized that environmental exposures at key development periods such as in utero play a role in childhood growth and obesity. To investigate whether in utero exposure to endocrine-disrupting chemicals, dichlorodiphenyltrichloroethane (DDT) and its metabolite, dichlorodiphenyldichloroethane (DDE), is associated with childhood physical growth, we took a novel statistical approach to analyze data from the CHAMACOS cohort study. To model heterogeneity in the growth patterns, we used a finite mixture model in combination with a data transformation to characterize body mass index (BMI) with four groups and estimated the association between exposure and group membership. In boys, higher maternal concentrations of DDT and DDE during pregnancy are associated with a BMI growth pattern that is stable until about age five followed by increased growth through age nine. In contrast, higher maternal DDT exposure during pregnancy is associated with a flat, relatively stable growth pattern in girls. This study suggests that in utero exposure to DDT and DDE may be associated with childhood BMI growth patterns, not just BMI level, and both the magnitude of exposure and sex may impact the relationship. PMID:26125556
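The group-membership machinery can be illustrated with a minimal EM fit of a one-dimensional Gaussian mixture (a sketch only: the study models whole BMI growth trajectories with four groups and a data transformation, and relates exposure to the posterior membership probabilities returned here as `resp`):

```python
import numpy as np

def gmm_em(x, k, n_iter=100):
    """Minimal 1-D EM for a Gaussian mixture: returns component means,
    variances, weights, and posterior group-membership probabilities."""
    x = np.asarray(x, float)
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)  # spread initial means
    var = np.full(k, x.var())
    w = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation
        dens = (w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var)
                / np.sqrt(2 * np.pi * var))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from responsibility-weighted data
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        w = nk / len(x)
    return mu, var, w, resp
```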
Enki, Doyo G; Garthwaite, Paul H; Farrington, C Paddy; Noufaily, Angela; Andrews, Nick J; Charlett, Andre
2016-01-01
A large-scale multiple surveillance system for infectious disease outbreaks has been in operation in England and Wales since the early 1990s. Changes to the statistical algorithm at the heart of the system were proposed and the purpose of this paper is to compare two new algorithms with the original algorithm. Test data to evaluate performance are created from weekly counts of the number of cases of each of more than 2000 diseases over a twenty-year period. The time series of each disease is separated into one series giving the baseline (background) disease incidence and a second series giving disease outbreaks. One series is shifted forward by twelve months and the two are then recombined, giving a realistic series in which it is known where outbreaks have been added. The metrics used to evaluate performance include a scoring rule that appropriately balances sensitivity against specificity and is sensitive to variation in probabilities near 1. In the context of disease surveillance, a scoring rule can be adapted to reflect the size of outbreaks and this was done. Results indicate that the two new algorithms are comparable to each other and better than the algorithm they were designed to replace. PMID:27513749
Graph-based and statistical approaches for detecting spectrally variable target materials
NASA Astrophysics Data System (ADS)
Ziemann, Amanda K.; Theiler, James
2016-05-01
In discriminating target materials from background clutter in hyperspectral imagery, one must contend with variability in both. Most algorithms focus on the clutter variability, but for some materials there is considerable variability in the spectral signatures of the target. This is especially the case for solid target materials, whose signatures depend on morphological properties (particle size, packing density, etc.) that are rarely known a priori. In this paper, we investigate detection algorithms that explicitly take into account the diversity of signatures for a given target. In particular, we investigate variable target detectors when applied to new representations of the hyperspectral data: a manifold learning based approach, and a residual based approach. The graph theory and manifold learning based approach incorporates multiple spectral signatures of the target material of interest; this is built upon previous work that used a single target spectrum. In this approach, we first build an adaptive nearest neighbors (ANN) graph on the data and target spectra, and use a biased locally linear embedding (LLE) transformation to perform nonlinear dimensionality reduction. This biased transformation results in a lower-dimensional representation of the data that better separates the targets from the background. The residual approach uses an annulus based computation to represent each pixel after an estimate of the local background is removed, which suppresses local backgrounds and emphasizes the target-containing pixels. We will show detection results in the original spectral space, the dimensionality-reduced space, and the residual space, all using subspace detectors: ranked spectral angle mapper (rSAM), subspace adaptive matched filter (ssAMF), and subspace adaptive cosine/coherence estimator (ssACE). Results of this exploratory study will be shown on a ground-truthed hyperspectral image with variable target spectra and both full and mixed pixel targets.
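The spectral angle at the core of SAM-type detectors is simply the angle between a pixel spectrum and a target spectrum. A sketch (the "ranked" refinement of rSAM and the subspace detectors are not reproduced here; taking the best angle over a library of variable target signatures is a simplified stand-in):

```python
import numpy as np

def spectral_angle(pixel, target):
    """Spectral angle (radians) between a pixel spectrum and a target
    spectrum; smaller angles mean closer spectral shape, independent of
    overall illumination scale."""
    cos = np.dot(pixel, target) / (np.linalg.norm(pixel) * np.linalg.norm(target))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def best_angle(pixel, targets):
    """Score a pixel against a library of target signatures by the smallest
    angle over all of them (a simplified stand-in for a variable-target
    detector, not the published rSAM)."""
    return min(spectral_angle(pixel, t) for t in targets)
```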
NASA Astrophysics Data System (ADS)
von Larcher, Thomas; Harlander, Uwe; Alexandrov, Kiril; Wang, Yongtai
2010-05-01
Experiments on baroclinic wave instabilities in a rotating cylindrical gap have long been performed, e.g., to reveal regular waves of different zonal wave numbers, to better understand the transition to the quasi-chaotic regime, and to uncover the underlying dynamical processes of complex wave flows. We present the application of appropriate multivariate data analysis methods to time series data sets acquired with non-intrusive measurement techniques of quite different natures. While highly accurate Laser-Doppler-Velocimetry (LDV) is used for measurements of the radial velocity component at equidistant azimuthal positions, a highly sensitive thermographic camera measures the surface temperature field. The measurements are performed at particular parameter points where our former studies showed that complex wave patterns occur [1, 2]. Owing to the particular measurement techniques, the temperature data set has much greater information content than the velocity data set. Both sets of time series data are analyzed using multivariate statistical techniques: the LDV data sets with Multi-Channel Singular Spectrum Analysis (M-SSA), and the temperature data sets with Empirical Orthogonal Functions (EOF). Our goal is (a) to verify the results yielded by the analysis of the velocity data and (b) to compare the data analysis methods. Therefore, the temperature data are processed so as to become comparable to the LDV data, i.e., the data set is reduced as if the temperature measurements had been performed only at equidistant azimuthal positions. This approach initially results in a great loss of information, but applying M-SSA to the reduced temperature data sets enables us to compare the methods. [1] Th. von Larcher and C. Egbers, Experiments on transitions of baroclinic waves in a differentially heated rotating annulus, Nonlinear Processes in Geophysics
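EOF analysis of a space-time field reduces to an SVD of the mean-removed data matrix. A minimal sketch (numpy only; M-SSA additionally embeds lagged copies of each channel before the decomposition, which is omitted here):

```python
import numpy as np

def eof_decompose(field):
    """EOF analysis of a space-time field (rows = time, columns = space):
    remove the time mean, then the SVD gives spatial patterns (EOFs),
    temporal principal components, and the fraction of variance per mode."""
    anomalies = field - field.mean(axis=0)
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    pcs = u * s                       # temporal amplitudes of each mode
    eofs = vt                         # spatial patterns, one per row
    var_frac = s ** 2 / np.sum(s ** 2)
    return eofs, pcs, var_frac
```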
Zhang, Yuan; Wang, Xiaobei; Ma, Ling; Wang, Zehua; Hu, Lihua
2009-06-01
This study evaluated the clinical significance of hTERC gene amplification detection by fluorescence in situ hybridization (FISH) in the screening of cervical lesions. Cervical specimens of 50 high risk patients were examined by thin liquid-based cytology. The patients whose cytological results were classified as ASCUS or above were subjected to subsequent colposcopic biopsies. Slides prepared from these 50 cervical specimens were analyzed for hTERC gene amplification using interphase FISH with the two-color hTERC probe. The results of the cytological analysis and those of subsequent biopsies, when available, were compared with the FISH-detected hTERC abnormalities. It was found that the positive rates of hTERC gene amplification in NILM, ASCUS, LSIL, HSIL, and SCC groups were 0.00, 28.57%, 57.14%, 100%, and 100%, respectively. The positive rates of hTERC gene amplification in HSIL and SCC groups were significantly higher than those in NILM, ASCUS and LSIL groups (all P<0.05). The mean percentages of cells with hTERC gene amplification in NILM, ASCUS, LSIL, HSIL, and SCC groups were 0.00, 10.50%, 36.00%, 79.00%, and 96.50%, respectively. Patients with HSIL or SCC cytological diagnoses had significantly higher mean percentages of cells with hTERC gene amplification than did patients with NILM, ASCUS or LSIL cytological diagnoses (all P<0.05). It was concluded that two-color interphase FISH could detect hTERC gene amplification to accurately distinguish HSIL from LSIL in cervical cells. It may be an adjunct to cytology screening, especially in high-risk patients.
Aerts, Joost W.; Patty, C.H. Lucas; ten Kate, Inge Loes; Ehrenfreund, Pascale; Direito, Susana O.L.
2015-01-01
The detection of biomarkers plays a central role in our effort to establish whether there is, or was, life beyond Earth. In this review, we address the importance of considering mineralogy in relation to the selection of locations and biomarker detection methodologies with characteristics most promising for exploration. We review relevant mineral-biomarker and mineral-microbe interactions. The local mineralogy on a particular planet reflects its past and current environmental conditions and allows a habitability assessment by comparison with life under extreme conditions on Earth. The type of mineral significantly influences the potential abundances and types of biomarkers and microorganisms containing these biomarkers. The strong adsorptive power of some minerals aids in the preservation of biomarkers and may have been important in the origin of life. On the other hand, this strong adsorption as well as oxidizing properties of minerals can interfere with efficient extraction and detection of biomarkers. Differences in mechanisms of adsorption and in properties of minerals and biomarkers suggest that it will be difficult to design a single extraction procedure for a wide range of biomarkers. While on Mars samples can be used for direct detection of biomarkers such as nucleic acids, amino acids, and lipids, on other planetary bodies remote spectrometric detection of biosignatures has to be relied upon. The interpretation of spectral signatures of photosynthesis can also be affected by local mineralogy. We identify current gaps in our knowledge and indicate how they may be filled to improve the chances of detecting biomarkers on Mars and beyond. Key Words: DNA—Lipids—Photosynthesis—Extremophiles—Mineralogy—Subsurface. Astrobiology 15, 492–507. PMID:26060985
Himemoto, Yoshiaki; Hiramatsu, Takashi; Taruya, Atsushi; Kudoh, Hideaki
2007-01-15
We discuss a robust data analysis method to detect a stochastic background of gravitational waves in the presence of non-Gaussian noise. In contrast to the standard cross-correlation (SCC) statistic frequently used in stochastic background searches, we consider a generalized cross-correlation (GCC) statistic, which is nearly optimal even in the presence of non-Gaussian noise. The detection efficiency of the GCC statistic is investigated analytically, focusing in particular on the statistical relation between the false-alarm and false-dismissal probabilities, and on the minimum detectable amplitude of gravitational-wave signals. We derive simple analytic formulas for these statistical quantities. The robustness of the GCC statistic is clarified based on these formulas, and one finds that the detection efficiency of the GCC statistic roughly corresponds to that of the SCC statistic with the contribution of non-Gaussian tails neglected. This remarkable property is checked by performing Monte Carlo simulations, and good agreement between analytic and simulation results is found.
NASA Astrophysics Data System (ADS)
Tibaduiza, D.-A.; Torres-Arredondo, M.-A.; Mujica, L. E.; Rodellar, J.; Fritzen, C.-P.
2013-12-01
This article is concerned with the practical use of Multiway Principal Component Analysis (MPCA), the Discrete Wavelet Transform (DWT), Squared Prediction Error (SPE) measures and Self-Organizing Maps (SOM) to detect and classify damage in mechanical structures. The formalism is based on a distributed piezoelectric active sensor network for the excitation and detection of structural dynamic responses. Statistical models are built using PCA when the structure is known to be healthy, either directly from the dynamic responses or from wavelet coefficients at different scales representing time-frequency information. Different damage scenarios on the tested structures are simulated by adding masses at different positions. The data from the structure in different states (damaged or not) are then projected into the different principal component models by each actuator in order to obtain the input feature vectors for a SOM from the scores and the SPE measures. An aircraft fuselage from an Airbus A320 and a multi-layered carbon fiber reinforced plastic (CFRP) plate are used as examples to test the approaches. Results are presented, compared and discussed in order to determine their potential in structural health monitoring. These results showed that all the simulated damage scenarios were detectable and the selected features proved capable of separating all damage conditions from the undamaged state for both approaches.
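The SPE (Q-statistic) damage indicator used above has a standard definition: the squared residual of a new response vector after projection onto the principal components of the healthy-state model. A minimal sketch under that common definition (the wavelet, multiway-unfolding and SOM stages are omitted):

```python
import numpy as np

def fit_pca(X, n_comp):
    """Fit a PCA model on baseline (healthy-state) data.
    Returns the data mean and the retained loadings P (features x n_comp)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_comp].T

def spe(x, mean, P):
    """Squared Prediction Error of a new sample w.r.t. the PCA model:
    the squared norm of the part of x not captured by the retained
    components. Large SPE flags behaviour outside the healthy model."""
    xc = x - mean
    resid = xc - P @ (P.T @ xc)
    return float(resid @ resid)
```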
NASA Astrophysics Data System (ADS)
Mills, R. T.; Kumar, J.; Hoffman, F. M.; Hargrove, W. W.; Spruce, J.
2011-12-01
Variations in vegetation phenology, the annual temporal pattern of leaf growth and senescence, can be a strong indicator of ecological change or disturbance. However, phenology is also strongly influenced by seasonal, interannual, and long-term trends in climate, making identification of changes in forest ecosystems a challenge. Forest ecosystems are vulnerable to extreme weather events, insect and disease attacks, wildfire, harvesting, and other land use change. Normalized difference vegetation index (NDVI), a remotely sensed measure of greenness, provides a proxy for phenology. NDVI for the conterminous United States (CONUS) derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) at 250 m resolution was used in this study to develop phenological signatures of ecological regimes called phenoregions. By applying a quantitative data mining technique to the NDVI measurements for every eight days over the entire MODIS record, annual maps of phenoregions were developed. This geospatiotemporal cluster analysis technique employs high performance computing resources, enabling analysis of such very large data sets. It produces a prescribed number of prototypical phenological states to which every location belongs in any year. Analysis of the shifts among phenological states yields information about responses to interannual climate variability and, more importantly, changes in ecosystem health due to disturbances. Moreover, a large change in the phenological states occupied by a single location over time indicates a significant disturbance or ecological shift. This methodology has been applied to the identification of various forest disturbance events, including wildfire, tree mortality due to Mountain Pine Beetle and other insect infestations and diseases, as well as extreme events like storms and hurricanes in the U.S. Results from the analysis of phenological state dynamics will be presented, along with disturbance and validation data.
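The clustering step can be illustrated with plain k-means on annual NDVI trajectories (a sketch: the study uses a scalable geospatiotemporal clustering technique on high performance computing resources, but the idea of assigning every pixel-year to one of k prototypical phenological states is the same):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain k-means: cluster annual NDVI trajectories (rows of X) into k
    prototypical phenological states; returns centroids and labels."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        # assign each trajectory to its nearest centroid
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # recompute centroids; keep old centroid if a cluster empties
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids, labels
```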
Kato, Hiroki; Shimosegawa, Eku; Fujino, Koichi; Hatazawa, Jun
2016-01-01
Background Integrated SPECT/CT enables non-uniform attenuation correction (AC) using built-in CT instead of the conventional uniform AC. The effect of CT-based AC on voxel-based statistical analyses of brain SPECT findings has not yet been clarified. Here, we assessed differences in the detectability of regional cerebral blood flow (CBF) reduction using SPECT voxel-based statistical analyses based on the two types of AC methods. Subjects and Methods N-isopropyl-p-[123I]iodoamphetamine (IMP) CBF SPECT images were acquired for all the subjects and were reconstructed using 3D-OSEM with two different AC methods: Chang’s method (Chang’s AC) and the CT-based AC method. A normal database was constructed for the analysis using SPECT findings obtained for 25 healthy normal volunteers. Voxel-based Z-statistics were also calculated for SPECT findings obtained for 15 patients with chronic cerebral infarctions and 10 normal subjects. We assumed that an analysis with a higher specificity would likely produce a lower mean absolute Z-score for normal brain tissue, and that a more sensitive voxel-based statistical analysis would likely produce a higher absolute Z-score in old infarct lesions, where the CBF was severely decreased. Results The inter-subject variation in the voxel values in the normal database was lower using CT-based AC, compared with Chang’s AC, for most of the brain regions. The absolute Z-score indicating a SPECT count reduction in infarct lesions was also significantly higher in the images reconstructed using CT-based AC, compared with Chang’s AC (P = 0.003). The mean absolute value of the Z-score in the 10 intact brains was significantly lower in the images reconstructed using CT-based AC than in those reconstructed using Chang’s AC (P = 0.005). Conclusions Non-uniform CT-based AC by integrated SPECT/CT significantly improved the sensitivity and specificity of the voxel-based statistical analyses for regional SPECT count reductions, compared with
A statistical model of ChIA-PET data for accurate detection of chromatin 3D interactions
Paulsen, Jonas; Rødland, Einar A.; Holden, Lars; Holden, Marit; Hovig, Eivind
2014-01-01
Identification of three-dimensional (3D) interactions between regulatory elements across the genome is crucial to unravel the complex regulatory machinery that orchestrates proliferation and differentiation of cells. ChIA-PET is a novel method to identify such interactions, where physical contacts between regions bound by a specific protein are quantified using next-generation sequencing. However, determining the significance of the observed interaction frequencies in such datasets is challenging, and few methods have been proposed. Despite the fact that regions that are close in linear genomic distance have a much higher tendency to interact by chance, no methods to date are capable of taking such dependency into account. Here, we propose a statistical model taking into account the genomic distance relationship, as well as the general propensity of anchors to be involved in contacts overall. Using both real and simulated data, we show that the previously proposed statistical test, based on Fisher's exact test, leads to invalid results when data are dependent on genomic distance. We also evaluate our method on previously validated cell-line specific and constitutive 3D interactions, and show that relevant interactions are significant, while avoiding over-estimating the significance of short nearby interactions. PMID:25114054
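The key point, that the null expectation must depend on genomic distance, can be sketched as follows. This is an illustration of the idea only, not the published model: the expected count for an anchor pair is estimated from pairs at comparable genomic distance, and significance is then assessed against a Poisson null.

```python
import math

def poisson_sf(n, lam):
    """P(X >= n) for X ~ Poisson(lam)."""
    return 1.0 - sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(n))

def distance_aware_pvalue(obs, distance, pairs):
    """Significance of an observed anchor-pair interaction count.

    pairs: list of (genomic_distance, count) for other anchor pairs.
    The null mean is the average count among pairs within a factor of two
    in genomic distance, so nearby pairs are held to a higher bar than
    distant ones (a sketch of the distance-dependence idea)."""
    similar = [c for d, c in pairs if 0.5 * distance <= d <= 2 * distance]
    lam = sum(similar) / len(similar)
    return poisson_sf(obs, lam)
```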
Gentile, Mauro; De Vito, Alessandro; Azzini, Cristiano; Tamborino, Carmine; Casetta, Ilaria
2014-11-01
Contrast-transcranial Doppler and contrast-transcranial color-coded duplex sonography (c-TCCD) have been reported to have high sensitivity in detecting patent foramen ovale as compared with transesophageal echocardiography. An international consensus meeting (Jauss and Zanette 2000) recommended that the contrast agent for right-to-left shunt (RLS) detection using contrast-transcranial Doppler be prepared by mixing 9 mL of isotonic saline solution and 1 mL of air. The aim of our study was to determine whether adding blood to the contrast agent results in improved detection of RLS. We enrolled all consecutive patients admitted to our neurosonology laboratory for RLS diagnosis. For each patient, we performed c-TCCD both at rest and during the Valsalva maneuver using two different contrast agents: ANSs (1 mL of air mixed with 9 mL of normal saline) and ANSHBs (1 mL of air mixed with 8 mL of normal saline and 1 mL of the patient's blood). To classify RLS, we used a four-level visual categorization: (i) no occurrence of micro-embolic signals; (ii) grade I, 1-10 signals; (iii) grade II, >10 signals but no curtain; (iv) grade III, curtain pattern. We included 80 patients, 33 men and 47 women. RLS was detected in 18.8% at rest and in 35% during the Valsalva maneuver using ANSs, and in 31.3% and in 46.3% using ANSHBs, respectively (p < 0.0001). There was a statistically significant increase in the number of micro-embolic signals with the use of ANSHBs. The use of blood mixed with saline solution and air as a c-TCCD contrast agent produced an increase in positive tests and a higher grade of RLS compared with normal saline and air alone, either with or without the Valsalva maneuver.
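The four-level visual categorization translates directly into code (grades returned as integers 0-3):

```python
def rls_grade(n_signals, curtain=False):
    """Four-level visual categorization of right-to-left shunt from c-TCCD
    micro-embolic signal counts, as described in the abstract above."""
    if curtain:
        return 3          # grade III: curtain pattern
    if n_signals == 0:
        return 0          # no micro-embolic signals
    if n_signals <= 10:
        return 1          # grade I: 1-10 signals
    return 2              # grade II: >10 signals but no curtain
```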
Cohn, T.A.; England, J.F.; Berenbrock, C.E.; Mason, R.R.; Stedinger, J.R.; Lamontagne, J.R.
2013-01-01
The Grubbs-Beck test is recommended by the federal guidelines for detection of low outliers in flood flow frequency computation in the United States. This paper presents a generalization of the Grubbs-Beck test for normal data (similar to the Rosner (1983) test; see also Spencer and McCuen (1996)) that can provide a consistent standard for identifying multiple potentially influential low flows. In cases where low outliers have been identified, they can be represented as “less-than” values, and a frequency distribution can be developed using censored-data statistical techniques, such as the Expected Moments Algorithm. This approach can improve the fit of the right-hand tail of a frequency distribution and provide protection from lack-of-fit due to unimportant but potentially influential low flows (PILFs) in a flood series, thus making the flood frequency analysis procedure more robust.
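A sketch of the Grubbs-Beck idea on log-transformed flows (hypothetical interface: in practice the critical value comes from published tables or the federal guidelines, and the generalized test differs in how it treats multiple low outliers):

```python
import math

def grubbs_beck_statistic(flows):
    """Grubbs-Beck-type statistic for the smallest observation of an annual
    peak-flow series, computed on log10 flows: how many sample standard
    deviations the minimum lies below the mean."""
    x = sorted(math.log10(q) for q in flows)
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    return (mean - x[0]) / sd

def flag_low_outliers(flows, crit):
    """Iteratively flag potentially influential low flows: drop the smallest
    value while its statistic exceeds the supplied critical value."""
    flows = sorted(flows)
    flagged = []
    while len(flows) > 2 and grubbs_beck_statistic(flows) > crit:
        flagged.append(flows.pop(0))
    return flagged
```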
Broccolo, Francesco; Bossolasco, Simona; Careddu, Anna M.; Tambussi, Giuseppe; Lazzarin, Adriano; Cinque, Paola
2002-01-01
The frequency and clinical significance of detection of DNA of cytomegalovirus (CMV), Epstein-Barr virus (EBV), human herpesvirus 6 (HHV-6), HHV-7, and HHV-8 in plasma were investigated by PCR. The plasma was obtained from 120 selected human immunodeficiency virus (HIV)-infected patients, of whom 75 had AIDS-related manifestations, 32 had primary HIV infection (PHI), and 13 had asymptomatic infections. Nested PCR analysis revealed that none of the lymphotropic herpesviruses tested were found in patients with PHI, in asymptomatic HIV-positive individuals, or in HIV-negative controls. By contrast, DNA of one or more of the viruses was found in 42 (56%) of 75 patients with AIDS-related manifestations, including CMV disease (CMV-D) or AIDS-related tumors. The presence of CMV DNA in plasma was significantly associated with CMV-D (P < 0.001). By contrast, EBV detection was not significantly associated with AIDS-related lymphomas (P = 0.31). Interestingly, the presence of HHV-8 DNA in plasma was significantly associated with Kaposi's sarcoma (KS) disease (P < 0.001) and with the clinical status of KS patients (P < 0.001). CMV (primarily), EBV, and HHV-8 were the viruses most commonly reactivated in the context of severe immunosuppression (P < 0.05). In contrast, HHV-6 and HHV-7 infections were infrequent at any stage of disease. In conclusion, plasma PCR was confirmed to be useful in the diagnosis of CMV-D but not in that of tumors or other conditions possibly associated with EBV, HHV-6, and HHV-7. Our findings support the hypothesis of a direct involvement of HHV-8 replication in KS pathogenesis, thus emphasizing the usefulness of sensitive and specific diagnostic tests to monitor HHV-8 infection. PMID:12414753
Minor changes in the indicator used to measure fine PM, which cause only modest changes in Mass concentrations, can lead to dramatic changes in the statistical relationship of fine PM mass with cardiovascular mortality. An epidemiologic study in Phoenix (Mar et al., 2000), augme...
NASA Astrophysics Data System (ADS)
Meng, X.; Peng, Z.
2014-12-01
It is now well established that extraction of fossil fuels and/or waste water disposal can cause earthquakes in the Central and Eastern United States (CEUS). However, the physics underlying the nucleation of induced earthquakes still remains elusive. In particular, do induced and tectonic earthquake sequences in CEUS share the same statistics, for example Omori's law [Utsu et al., 1995] and the Gutenberg-Richter law? Some studies have shown that most naturally occurring earthquake sequences are driven by cascading-type triggering. Hence, they would follow the typical Gutenberg-Richter relation and Omori's aftershock decay and could be well described by multi-dimensional point-process models such as the Epidemic Type Aftershock Sequence (ETAS) model [Ogata, 1988; Zhuang et al., 2012]. However, induced earthquakes are likely driven by external forcing such as injected fluid pressure, and hence would not be well described by the ETAS model [Llenos and Michael, 2013]. Existing catalogs in CEUS (e.g. the ANSS catalog) have a relatively high magnitude of completeness [e.g., Van Der Elst et al., 2013] and hence may not be ideal for a detailed ETAS modeling analysis. A waveform matched filter technique has been successfully applied to detect many missing earthquakes in CEUS with a sparse network in Illinois [Yang et al., 2009] and on single stations in Texas, Oklahoma and Colorado [e.g., Van Der Elst et al., 2013]. In addition, the deployment of USArray stations in CEUS has also helped to expand the station coverage. In this study, we systematically detect missing events during 14 moderate-size (M>=4) earthquake sequences since 2000 in CEUS and quantify their statistical parameters (e.g. b, a, K, and p values) and spatio-temporal evolutions. Then we compare the statistical parameters and the spatio-temporal evolution pattern between induced and naturally occurring earthquake sequences to see if one or more diagnostic parameters exist. Our comprehensive analysis of earthquake sequences
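One of the statistical parameters mentioned, the Gutenberg-Richter b value, has a simple maximum-likelihood estimator (Aki's formula; the half-bin-width correction for binned magnitudes is omitted here for brevity):

```python
import math

def b_value(mags, m_c):
    """Maximum-likelihood Gutenberg-Richter b-value (Aki's estimator) for
    events at or above the completeness magnitude m_c:
    b = log10(e) / (mean(M) - m_c)."""
    above = [m for m in mags if m >= m_c]
    mean_m = sum(above) / len(above)
    return math.log10(math.e) / (mean_m - m_c)
```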
NASA Astrophysics Data System (ADS)
Kheir, Rania Bou; Hansmann, Berthold; Abdallah, Chadi
2010-05-01
Soil erosion by water is one of the major causes of land degradation in Mediterranean karst environments, including Lebanon, which represents a good case study. This research deals with the use of Geographic Information Systems (GIS) to establish the relationships between gully erosion occurrence and different environmental parameters over a representative region of Lebanon. Factors influencing the gully erosion process can be represented by different parameters, and each can be extracted from remote sensing, digital elevation models (DEMs), ancillary maps or field observations. These parameters can be endogenous/quasi-static, i.e. soil type, organic matter content, soil depth, lithology, proximity to fault zone, karstification, distance to drainage line, slope gradient, slope aspect, slope curvature, or exogenous/dynamic, triggering the erosion process, i.e. land cover/use, proximity to sources and rainfall erosivity. All these parameters have been analyzed and correlated with existing gullies under a GIS environment. The gullies were first detected through visual interpretation of two stereo-pairs of SPOT 4 images (anaglyph) at 10 m resolution. This study indicates, based on bivariate statistical correlations computed within the GIS (Kendall's tau-b), that soil type is the factor most strongly influencing gully erosion occurrence. It also shows that the strongest statistical correlations with gullies occur for the following parameter pairs, in decreasing order of importance: soil type-lithology, soil-land cover/use, soil-slope gradient, lithology-distance to drainage line, and soil type-karstification at the 1% level of significance, and lithology-proximity to fault line, slope aspect-land cover/use, and soil type-slope curvature at the 5% level of significance. These correlations were verified through field observations and explained using univariate statistical correlations. Therefore, they could be extrapolated to other Mediterranean karst
Hyde, J M; Cerezo, A; Williams, T J
2009-04-01
Statistical analysis of atom probe data has improved dramatically in the last decade and it is now possible to determine the size, the number density and the composition of individual clusters or precipitates such as those formed in reactor pressure vessel (RPV) steels during irradiation. However, the characterisation of the onset of clustering or co-segregation is more difficult and has traditionally focused on the use of composition frequency distributions (for detecting clustering) and contingency tables (for detecting co-segregation). In this work, the authors investigate the possibility of directly examining the neighbourhood of each individual solute atom as a means of identifying the onset of solute clustering and/or co-segregation. The methodology involves comparing the mean observed composition around a particular type of solute with that expected from the overall composition of the material. The methodology has been applied to atom probe data obtained from several irradiated RPV steels. The results show that the new approach is more sensitive to fine scale clustering and co-segregation than that achievable using composition frequency distribution and contingency table analyses.
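The core idea, comparing the mean solute fraction among each solute atom's nearest neighbours with the bulk expectation, can be sketched as below. The positions are synthetic and randomly labelled, so observed and expected agree; "Cu", the 5% level, and k = 10 neighbours are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
pos = rng.random((n, 3))        # synthetic atom positions (arbitrary units)
is_cu = rng.random(n) < 0.05    # randomly labelled "Cu" atoms, no clustering

def mean_neighbour_fraction(pos, mask, k=10):
    """Mean fraction of masked atoms among the k nearest neighbours of each
    masked atom (brute-force distances, for clarity rather than speed)."""
    fracs = []
    for i in np.where(mask)[0]:
        d = np.linalg.norm(pos - pos[i], axis=1)
        d[i] = np.inf           # exclude the atom itself
        nearest = np.argsort(d)[:k]
        fracs.append(mask[nearest].mean())
    return float(np.mean(fracs))

observed = mean_neighbour_fraction(pos, is_cu)
expected = is_cu.mean()         # random-solution expectation
print(observed, expected)       # similar values => no detectable clustering
```

In clustered material the observed fraction would exceed the expectation, which is the signature the authors exploit.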
NASA Astrophysics Data System (ADS)
Zheng, Hao; Holzworth, Robert H.; Brundell, James B.; Jacobson, Abram R.; Wygant, John R.; Hospodarsky, George B.; Mozer, Forrest S.; Bonnell, John
2016-03-01
Lightning-generated whistler waves are electromagnetic plasma waves in the very low frequency (VLF) band, which play an important role in the dynamics of radiation belt particles. In this paper, we statistically analyze simultaneous waveform data from the Van Allen Probes (Radiation Belt Storm Probes, RBSP) and global lightning data from the World Wide Lightning Location Network (WWLLN). Data were obtained between July and September 2013 and between March and April 2014. For each day during these periods, we predicted the most probable 10 min for which each of the two RBSP satellites would be magnetically conjugate to lightning-producing regions. The prediction method uses integrated WWLLN stroke data for that day obtained during the three previous years. Using these predicted times for magnetic conjugacy to lightning activity regions, we recorded high time resolution, burst mode waveform data. Here we show that whistlers are observed by the satellites in more than 80% of the downloaded waveform data. About 22.9% of the whistlers observed by RBSP are one-to-one coincident with source lightning strokes detected by WWLLN. A further 40.1% of the whistlers are found to be one-to-one coincident with lightning if source regions are extended out 2000 km from the satellites' footpoints. Lightning strokes with far-field radiated VLF energy larger than about 100 J are able to generate a detectable whistler wave in the inner magnetosphere. One-to-one coincidences between whistlers observed by RBSP and lightning strokes detected by WWLLN are clearly shown in the L shell range of L = 1-3. Nose whistlers observed in July 2014 show that it may be possible to extend this coincidence to the region of L ≥ 4.
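The one-to-one coincidence counting between whistler detections and lightning strokes can be sketched as a greedy match within a fixed time window. The 1 s window and the times below are illustrative assumptions, not RBSP/WWLLN values:

```python
import bisect

def match_whistlers(whistler_times, stroke_times, window=1.0):
    """Pair each whistler with at most one stroke whose time differs by less
    than `window` seconds; each stroke is used at most once."""
    strokes = sorted(stroke_times)
    used, pairs = set(), []
    for t in sorted(whistler_times):
        i = bisect.bisect_left(strokes, t - window)
        while i < len(strokes) and strokes[i] < t + window:
            if i not in used:
                used.add(i)
                pairs.append((t, strokes[i]))
                break
            i += 1
    return pairs

pairs = match_whistlers([10.2, 15.7, 42.0], [9.9, 15.1, 30.0])
print(pairs)  # [(10.2, 9.9), (15.7, 15.1)]
```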
NASA Astrophysics Data System (ADS)
Larsen, L.; Watts, D.; Khurana, A.; Anderson, J. L.; Xu, C.; Merritts, D. J.
2015-12-01
The classic signal of self-organization in nature is pattern formation. However, the interactions and feedbacks that organize depositional landscapes do not always result in regular or fractal patterns. How might we detect their existence and effects in these "irregular" landscapes? Emergent landscapes such as newly forming deltaic marshes or some restoration sites provide opportunities to study the autogenic processes that organize landscapes and their physical signatures. Here we describe a quest to understand autogenic vs. allogenic controls on landscape evolution in Big Spring Run, PA, a landscape undergoing restoration from bare-soil conditions to a target wet meadow landscape. The contemporary motivation for asking questions about autogenic vs. allogenic controls is to evaluate how important initial conditions or environmental controls may be for the attainment of management objectives. However, these questions can also inform interpretation of the sedimentary record by enabling researchers to separate signals that may have arisen through self-organization processes from those resulting from environmental perturbations. Over three years at Big Spring Run, we mapped the dynamic evolution of floodplain vegetation communities and distributions of abiotic variables and topography. We used principal component analysis and transition probability analysis to detect associative interactions between vegetation and geomorphic variables and convergent cross-mapping on lidar data to detect causal interactions between biomass and topography. Exploratory statistics revealed that plant communities with distinct morphologies exerted control on landscape evolution through stress divergence (i.e., channel initiation) and promoting the accumulation of fine sediment in channels. Together, these communities participated in a negative feedback that maintains low energy and multiple channels. Because of the spatially explicit nature of this feedback, causal interactions could not
Nieto, A; Peña, L; Pérez-Alenza, M D; Sánchez, M A; Flores, J M; Castaño, M
2000-05-01
Eighty-nine canine mammary tumors and dysplasias of 66 bitches were investigated to determine the immunohistochemical expression of classical estrogen receptor (ER-alpha) and its clinical and pathologic associations and prognostic value. A complete clinical examination was performed and reproductive history was evaluated. After surgery, all animals were followed up for 18 months, with clinical examinations every 3-4 months. ER-alpha expression was higher in tumors of genitally intact and young bitches (P < 0.01, P < 0.01) and in animals with regular estrous periods (P = 0.03). Malignant tumors of the bitches with a previous clinical history of pseudopregnancy expressed significantly more ER-alpha (P = 0.04). Immunoexpression of ER-alpha decreased significantly with tumor size (P = 0.05) and skin ulceration (P = 0.01). Low levels of ER-alpha were significantly associated with lymph node involvement (P < 0.01). Malignant tumors had lower ER-alpha expression than did benign tumors (P < 0.01). The proliferation index measured by proliferating cell nuclear antigen immunostaining was inversely correlated with ER-alpha scores (P = 0.05) in all tumors. Low ER-alpha levels in primary malignant tumors were significantly associated with the occurrence of metastases during follow-up (P = 0.03). Multivariate analyses were performed to determine the prognostic significance of some follow-up variables. ER-alpha value, Ki-67 index, and age were independent factors that could predict disease-free survival. Lymph node status, age, and ER-alpha index were independent prognostic factors for overall survival. The immunohistochemical detection of ER-alpha in canine mammary tumors is a simple technique with prognostic value that could be useful in selecting appropriate hormonal therapy.
NASA Astrophysics Data System (ADS)
de Laat, Jos; van Weele, Michiel; van der A, Ronald
2015-04-01
An important new landmark in present-day ozone research is presented through MLS satellite observations of significant ozone increases during the ozone hole season that are attributed unequivocally to declining ozone depleting substances. For many decades the Antarctic ozone hole has been the prime example both of the detrimental effects of human activities on our environment and of how to construct effective and successful environmental policies. Nowadays atmospheric concentrations of ozone depleting substances are on the decline, and first signs of recovery of stratospheric ozone and of ozone in the Antarctic ozone hole have been observed. The claimed detection of significant recovery, however, is still a subject of debate. In this talk we first discuss current uncertainties in the assessment of ozone recovery in the Antarctic ozone hole using multi-variate regression methods, and second present an alternative approach to identify ozone hole recovery unequivocally. Even though multi-variate regression methods help to reduce uncertainties in estimates of ozone recovery, great care has to be taken in their application due to the existence of uncertainties and degrees of freedom in the choice of independent variables. We show that, taking all uncertainties into account in the regressions, the formal recovery of ozone in the Antarctic ozone hole cannot be established yet, though it is likely before the end of the decade (before 2020). Rather than focusing on time and area averages of total ozone columns or ozone profiles, we argue that the time evolution of the probability distribution of vertically resolved ozone in the Antarctic ozone hole contains a better fingerprint for the detection of ozone recovery. The advantages of this method over more traditional methods of trend analysis based on spatio-temporal average ozone are discussed. The 10-year record of MLS satellite measurements of ozone in the Antarctic ozone hole shows a
NASA Astrophysics Data System (ADS)
Liu, Fangfang
The thesis is composed of three independent projects: (i) analyzing transposon-sequencing data to infer the functions of genes in bacterial growth (chapter 2), (ii) developing a semi-parametric Bayesian method for differential gene expression analysis with RNA-sequencing data (chapter 3), and (iii) solving the group selection problem for survival data (chapter 4). All projects are motivated by statistical challenges raised in biological research. The first project is motivated by the need to develop statistical models that accommodate transposon insertion sequencing (Tn-Seq) data. Tn-Seq data consist of sequence reads around each transposon insertion site. The detection of a transposon insertion at a given site indicates that the disruption of the genomic sequence at this site does not cause essential function loss and the bacteria can still grow. Hence, such measurements have been used to infer the function of each gene in bacterial growth. We propose a zero-inflated Poisson regression method for analyzing the Tn-Seq count data, and derive an Expectation-Maximization (EM) algorithm to obtain parameter estimates. We also propose a multiple testing procedure that categorizes genes into each of three states, hypo-tolerant, tolerant, and hyper-tolerant, while controlling the false discovery rate. Simulation studies show our method provides good estimation of model parameters and inference on gene functions. In the second project, we model the count data from an RNA-sequencing experiment for each gene using a Poisson-Gamma hierarchical model, or equivalently, a negative binomial (NB) model. We derive a full semi-parametric Bayesian approach with a Dirichlet process as the prior for the fold changes between two treatment means. An inference strategy using a Gibbs algorithm is developed for differential expression analysis. We evaluate our method with several simulation studies, and the results demonstrate that our method outperforms other methods including the popularly applied ones such as edge
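The zero-inflated Poisson model underlying the first project can be illustrated with a stripped-down EM fit for the two-parameter case (a single zero-inflation probability `pi` and Poisson mean `lam`); the actual method is a regression with covariates and an FDR-controlling test, which this sketch omits:

```python
import math

def zip_em(counts, n_iter=200):
    """EM estimates of (pi, lam) for a zero-inflated Poisson mixture."""
    n, total = len(counts), sum(counts)
    pi, lam = 0.5, max(total / n, 1e-6)
    for _ in range(n_iter):
        # E-step: posterior probability that each observed zero is a
        # structural zero from the inflation component
        z = [pi / (pi + (1 - pi) * math.exp(-lam)) if c == 0 else 0.0
             for c in counts]
        # M-step: nonzero counts all have z = 0, so the weighted count
        # total equals `total`
        pi = sum(z) / n
        lam = total / (n - sum(z))
    return pi, lam

counts = [0] * 60 + [3, 4, 5, 2, 6, 4, 3, 5, 4, 4] * 4  # toy Tn-Seq-like data
pi, lam = zip_em(counts)
print(round(pi, 2), round(lam, 2))
```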
Liang, Jin-Hua; Sun, Jin; Wang, Li; Fan, Lei; Chen, Yao-Yu; Qu, Xiao-Yan; Li, Tian-Nv; Li, Jian-Yong; Xu, Wei
2016-01-01
The aim of this study was to examine the prognostic value of bone marrow involvement (BMI) assessed by baseline PET-CT (PET(0)-BMI) in treatment-naïve patients with diffuse large B-cell lymphoma (DLBCL). All patients from a single centre diagnosed with DLBCL between 2005 and 2014 had data extracted from staging PET-CT (PET(0)-CT), bone marrow biopsy (BMB), and treatment records. PET(3)-CT (a PET-CT scan after cycle 3 of immunochemotherapy) was performed on all patients with PET(0)-BMI positivity (PET(0)-BMI(+)). Of 169 patients, 20 (11.8%) had BMI on BMB, whereas 35 (20.7%) were PET(0)-BMI positive. Among PET(0)-BMI(+) patients, those with a maximum standardized uptake value (SUVmax) of bone marrow (SUVmax(BM)) greater than 8.6 were significantly associated with a high IPI score (3–5) (P=0.002) and with worse progression-free survival (PFS) and overall survival (OS) (P=0.025 and P=0.002, respectively). In the 68 stage IV cases, 3-year OS was higher in patients with negative PET(0)-BMI (PET(0)-BMI(−)) than in those with PET(0)-BMI(+) (84.2%±6.5% vs. 44.1%±8.6%; P=0.003), while 3-year PFS only showed a trend toward statistical significance (P=0.077) between the two groups. Among the 69 patients at intermediate risk by IPI (2–3), patients with PET(0)-BMI(+) had significantly inferior PFS and OS compared with those with PET(0)-BMI(−) (P=0.009 and P<0.001, respectively). The cut-off value for the percentage decrease of SUVmax(BM) between PET(0)-CT and PET(3)-CT (ΔSUVmax(BM)) was 70.0%, which can predict PFS (P=0.003) and OS (P=0.023). These data confirm that, along with the increased sensitivity and accuracy of identifying bone marrow involvement by PET-CT, novel prognostic value of marrow involvement was found in patients with DLBCL. PMID:26919239
Myers, Jamie L.; Sekar, Raju; Richardson, Laurie L.
2007-01-01
Black band disease (BBD) is a pathogenic, sulfide-rich microbial mat dominated by filamentous cyanobacteria that infect corals worldwide. We isolated cyanobacteria from BBD into culture, confirmed their presence in the BBD community by using denaturing gradient gel electrophoresis (DGGE), and demonstrated their ecological significance in terms of physiological sulfide tolerance and photosynthesis-versus-irradiance values. Twenty-nine BBD samples were collected from nine host coral species, four of which have not previously been investigated, from reefs of the Florida Keys, the Bahamas, St. Croix, and the Philippines. From these samples, seven cyanobacteria were isolated into culture. Cloning and sequencing of the 16S rRNA gene using universal primers indicated that four isolates were related to the genus Geitlerinema and three to the genus Leptolyngbya. DGGE results, obtained using Cyanobacteria-specific 16S rRNA primers, revealed that the most common BBD cyanobacterial sequence, detected in 26 BBD field samples, was related to that of an Oscillatoria sp. The next most common sequence, 99% similar to that of the Geitlerinema BBD isolate, was present in three samples. One Leptolyngbya- and one Phormidium-related sequence were also found. Laboratory experiments using isolates of BBD Geitlerinema and Leptolyngbya revealed that they could carry out sulfide-resistant oxygenic photosynthesis, a relatively rare characteristic among cyanobacteria, and that they are adapted to the sulfide-rich, low-light BBD environment. The presence of the cyanotoxin microcystin in these cultures and in BBD suggests a role in BBD pathogenicity. Our results confirm the presence of Geitlerinema in the BBD microbial community and its ecological significance, which have been challenged, and provide evidence of a second ecologically significant BBD cyanobacterium, Leptolyngbya. PMID:17601818
ERIC Educational Resources Information Center
Tabor, Josh
2010-01-01
On the 2009 AP[c] Statistics Exam, students were asked to create a statistic to measure skewness in a distribution. This paper explores several of the most popular student responses and evaluates which statistic performs best when sampling from various skewed populations. (Contains 8 figures, 3 tables, and 4 footnotes.)
NASA Astrophysics Data System (ADS)
Leich, Marcus; Kiltz, Stefan; Krätzer, Christian; Dittmann, Jana; Vielhauer, Claus
2011-03-01
According to the European Commission around 200,000 counterfeit Euro coins are removed from circulation every year. While approaches exist to automatically detect these coins, satisfying error rates are usually only reached for low-quality forgeries, so-called "local classes". High-quality minted forgeries ("common classes") pose a problem for these methods as well as for trained humans. This paper presents a first approach for statistical analysis of coins based on high resolution 3D data acquired with a chromatic white light sensor. The goal of this analysis is to determine whether two coins are of common origin. The test set for these first and new investigations consists of 62 coins from not more than five different sources. The analysis is based on the assumption that, apart from markings caused by wear such as scratches and residue consisting of grease and dust, coins of common origin have a more similar height field than coins from different mints. First results suggest that the selected approach is heavily affected by influences of wear such as dents and scratches, and further research is required to eliminate this influence. A course for future work is outlined.
NASA Astrophysics Data System (ADS)
Neubert, A.; Fripp, J.; Engstrom, C.; Schwarz, R.; Lauer, L.; Salvado, O.; Crozier, S.
2012-12-01
Recent advances in high resolution magnetic resonance (MR) imaging of the spine provide a basis for the automated assessment of intervertebral disc (IVD) and vertebral body (VB) anatomy. High resolution three-dimensional (3D) morphological information contained in these images may be useful for early detection and monitoring of common spine disorders, such as disc degeneration. This work proposes an automated approach to extract the 3D segmentations of lumbar and thoracic IVDs and VBs from MR images using statistical shape analysis and registration of grey level intensity profiles. The algorithm was validated on a dataset of volumetric scans of the thoracolumbar spine of asymptomatic volunteers obtained on a 3T scanner using the relatively new 3D T2-weighted SPACE pulse sequence. Manual segmentations and expert radiological findings of early signs of disc degeneration were used in the validation. There was good agreement between manual and automated segmentation of the IVD and VB volumes with the mean Dice scores of 0.89 ± 0.04 and 0.91 ± 0.02 and mean absolute surface distances of 0.55 ± 0.18 mm and 0.67 ± 0.17 mm respectively. The method compares favourably to existing 3D MR segmentation techniques for VBs. This is the first time IVDs have been automatically segmented from 3D volumetric scans and shape parameters obtained were used in preliminary analyses to accurately classify (100% sensitivity, 98.3% specificity) disc abnormalities associated with early degenerative changes.
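The Dice score used to compare the manual and automated segmentations can be sketched on toy binary masks (2D here, standing in for the 3D voxel volumes):

```python
import numpy as np

def dice(a, b):
    """Dice overlap 2|A∩B| / (|A| + |B|) for two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

manual = np.zeros((10, 10), bool); manual[2:8, 2:8] = True  # 36 voxels
auto = np.zeros((10, 10), bool);   auto[3:8, 2:8] = True    # 30 voxels
print(round(dice(manual, auto), 3))  # 2*30 / (36+30) = 0.909
```

A Dice score of 1.0 means perfect agreement; the paper's means of 0.89 and 0.91 indicate close but imperfect overlap.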
Zhu, M F; Ye, X P; Huang, Y Y; Guo, Z Y; Zhuang, Z F; Liu, S H
2014-01-01
Raman spectroscopy has been shown to have the potential for revealing the oxygenation and spin state of hemoglobin. In this study, confocal micro-Raman spectroscopy is developed to monitor the effect of sodium nitrite on oxyhemoglobin (HbO2) in whole blood. We observe that the band at 1,638 cm(-1), which is sensitive to the oxidation state, decreases dramatically, while the 1,586 cm(-1) band (low-spin state marker) decreases in both methemoglobin (MetHb) and poisoned blood. Our results show that adding sodium nitrite leads to the transition from HbO2 (Fe(2+)) to MetHb (Fe(3+)) in whole blood, and the iron atom converts from the low-spin state to the high-spin state with a delocalization from the porphyrin plane. Moreover, multivariate statistical techniques, including principal component analysis (PCA) and linear discriminant analysis (LDA), are employed to develop effective diagnostic algorithms for classification of spectra from pure and poisoned blood. The diagnostic algorithms based on PCA-LDA yield a diagnostic sensitivity of 100% and specificity of 100% for separating poisoned blood from normal blood. A receiver operating characteristic (ROC) curve further confirms the effectiveness of the diagnostic algorithm based on the PCA-LDA technique. The results from this study demonstrate that Raman spectroscopy combined with PCA-LDA algorithms has tremendous potential for the non-invasive detection of nitrite-poisoned blood. PMID:24729434
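The PCA-LDA classification pipeline can be sketched with scikit-learn on synthetic "spectra"; the injected band change, channel count, and sample sizes are assumptions, not the study's data:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
n, p = 60, 100                      # 60 spectra per class, 100 channels
normal = rng.normal(0.0, 1.0, (n, p))
poisoned = rng.normal(0.0, 1.0, (n, p))
poisoned[:, 40:45] += 2.0           # assumed intensity change in one band
X = np.vstack([normal, poisoned])
y = np.array([0] * n + [1] * n)

# PCA reduces the spectra to a few components; LDA separates the classes.
clf = make_pipeline(PCA(n_components=10), LinearDiscriminantAnalysis())
acc = cross_val_score(clf, X, y, cv=5).mean()   # cross-validated accuracy
print(round(acc, 2))
```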
Trka, J; Kalinová, M; Hrusák, O; Zuna, J; Krejcí, O; Madzo, J; Sedlácek, P; Vávra, V; Michalová, K; Jarosová, M; Starý, J
2002-07-01
The clinical significance of WT1 gene expression at diagnosis and during therapy of AML has not yet been resolved. We analysed WT1 expression at presentation in an unselected group of 47 childhood AML patients using real-time quantitative reverse-transcription PCR. We also showed that within the first 30 h following aspiration RQ-RT-PCR results were not influenced by transportation time. We observed lower levels of WT1 transcript in AML M5 (P = 0.0015); no association was found between expression levels and sex, initial leukocyte count and karyotype-based prognostic groups. There was significant correlation between very low WT1 expression at presentation and excellent outcome (EFS P = 0.0014). Combined analysis of WT1 levels, three-colour flow cytometry residual disease detection and the course of the disease in 222 samples from 28 children with AML showed remarkable correlation. Fourteen patients expressed high WT1 levels at presentation. In eight of them, who suffered relapse or did not reach complete remission, dynamics of WT1 levels clearly correlated with the disease status and residual disease by flow cytometry. We conclude that very low WT1 levels at presentation represent a good prognostic factor and that RQ-RT-PCR-based analysis of WT1 expression is a promising and rapid approach for monitoring of MRD in approximately half of paediatric AML patients.
ERIC Educational Resources Information Center
Cicchetti, Domenic V.; Koenig, Kathy; Klin, Ami; Volkmar, Fred R.; Paul, Rhea; Sparrow, Sara
2011-01-01
The objectives of this report are: (a) to trace the theoretical roots of the concept clinical significance that derives from Bayesian thinking, Marginal Utility/Diminishing Returns in Economics, and the "just noticeable difference", in Psychophysics. These concepts then translated into: Effect Size (ES), strength of agreement, clinical…
NASA Astrophysics Data System (ADS)
Tejos, Nicolas; Prochaska, J. Xavier; Crighton, Neil H. M.; Morris, Simon L.; Werk, Jessica K.; Theuns, Tom; Padilla, Nelson; Bielby, Rich M.; Finn, Charles W.
2016-01-01
Modern analyses of structure formation predict a universe tangled in a `cosmic web' of dark matter and diffuse baryons. These theories further predict that at low z, a significant fraction of the baryons will be shock-heated to T ˜ 105-107 K yielding a warm-hot intergalactic medium (WHIM), but whose actual existence has eluded a firm observational confirmation. We present a novel experiment to detect the WHIM, by targeting the putative filaments connecting galaxy clusters. We use HST/COS to observe a remarkable quasi-stellar object (QSO) sightline that passes within Δd = 3 Mpc from the seven intercluster axes connecting seven independent cluster pairs at redshifts 0.1 ≤ z ≤ 0.5. We find tentative excesses of total H I, narrow H I (NLA; Doppler parameters b < 50 km s-1), broad H I (BLA; b ≥ 50 km s-1) and O VI absorption lines within rest-frame velocities of Δv ≲ 1000 km s-1 from the cluster-pairs redshifts, corresponding to ˜2, ˜1.7, ˜6 and ˜4 times their field expectations, respectively. Although the excess of O VI likely comes from gas close to individual galaxies, we conclude that most of the excesses of NLAs and BLAs are truly intergalactic. We find the covering fractions, fc, of BLAs close to cluster pairs are ˜4-7 times higher than the random expectation (at the ˜2σ c.l.), whereas the fc of NLAs and O VI are not significantly enhanced. We argue that a larger relative excess of BLAs compared to those of NLAs close to cluster pairs may be a signature of the WHIM in intercluster filaments. By extending this analysis to tens of sightlines, our experiment offers a promising route to detect the WHIM.
NASA Astrophysics Data System (ADS)
Tombesi, F.; Cappi, M.; Reeves, J. N.; Palumbo, G. G. C.; Yaqoob, T.; Braito, V.; Dadina, M.
2010-10-01
Context. Blue-shifted Fe K absorption lines have been detected in recent years between 7 and 10 keV in the X-ray spectra of several radio-quiet AGNs. The derived blue-shifted velocities of the lines can often reach mildly relativistic values, up to 0.2-0.4c. These findings are important because they suggest the presence of a previously unknown massive and highly ionized absorbing material outflowing from their nuclei, possibly connected with accretion disk winds/outflows. Aims: The scope of the present work is to statistically quantify the parameters and incidence of the blue-shifted Fe K absorption lines through a uniform analysis of a large sample of radio-quiet AGNs. This allows us to assess their global detection significance and to overcome any possible publication bias. Methods: We performed a blind search for narrow absorption features at energies greater than 6.4 keV in a sample of 42 radio-quiet AGNs observed with XMM-Newton. A simple uniform model composed of an absorbed power-law plus Gaussian emission and absorption lines provided a good fit for all the data sets. We derived the absorption line parameters and calculated their detailed detection significance making use of the classical F-test and extensive Monte Carlo simulations. Results: We detect 36 narrow absorption lines in a total of 101 XMM-Newton EPIC pn observations. The number of absorption lines at rest-frame energies higher than 7 keV is 22. The global probability that they are generated by random fluctuations is very low, less than 3 × 10-8, and their detection has been independently confirmed by a spectral analysis of the MOS data, with associated random probability <10-7. We identify the lines as Fe XXV and Fe XXVI K-shell resonant absorption. They are systematically blue-shifted, with a velocity distribution ranging from zero up to ~0.3c, with a peak and mean value at ~0.1c. We detect variability of the lines in both EWs and blue-shifted velocities among different XMM-Newton observations
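The logic of a global significance estimate, that many individually marginal detections are jointly very unlikely to be noise, can be illustrated with a toy binomial tail; the numbers below are illustrative only, and the authors' actual estimate rests on F-tests and Monte Carlo simulations:

```python
from math import comb

def binom_tail(N, k, p):
    """P(at least k false detections in N observations), each observation
    having false-alarm probability p."""
    return sum(comb(N, j) * p**j * (1 - p)**(N - j) for j in range(k, N + 1))

# e.g. 101 observations with an assumed 1% per-observation false-alarm rate:
print(binom_tail(101, 5, 0.01))   # chance of >= 5 spurious detections
```

The tail probability drops rapidly as the number of coincident detections grows, which is why 22 lines above 7 keV yield such a small global random probability.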
ERIC Educational Resources Information Center
Hidalgo-Montesinos, Maria Dolores; Lopez-Pina, Jose Antonio
2002-01-01
Examined the effect of test purification in detecting differential item functioning (DIF) by means of polytomous extensions of the Raju area measures (N. Raju, 1990) and the Lord statistic (F. Lord, 1980). Simulation results suggest the necessity of using a two-stage equating purification process with the Raju exact measures and the Lord statistic…
NASA Astrophysics Data System (ADS)
Narayanan, Gopal; Shi, Yun Qing
We first develop a probability mass function (PMF) for quantized block discrete cosine transform (DCT) coefficients in JPEG compression using statistical analysis of quantization, with a Generalized Gaussian model considered as the PDF for non-quantized block DCT coefficients. We subsequently propose a novel method to detect potential JPEG compression history in bitmap images using the PMF that has been developed. We show that this method outperforms a classical approach to compression history detection in terms of effectiveness. We also show that it detects history with both Independent JPEG Group (IJG) and custom quantization tables.
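One way to see how such a PMF arises is to integrate a continuous coefficient model over each quantization bin; the sketch below uses a Laplacian (a Generalized Gaussian with shape parameter 1) and an assumed quantization step `q`, rather than the paper's full model:

```python
import math

def quantized_pmf(q, b, kmax=8):
    """P(round(X / q) = k) for X ~ Laplace(scale b), k = -kmax..kmax."""
    def cdf(x):
        return 0.5 * math.exp(x / b) if x < 0 else 1 - 0.5 * math.exp(-x / b)
    return {k: cdf((k + 0.5) * q) - cdf((k - 0.5) * q)
            for k in range(-kmax, kmax + 1)}

pmf = quantized_pmf(q=8, b=10)
print(round(pmf[0], 3), round(sum(pmf.values()), 3))
```

Comparing a histogram of DCT coefficients observed in a bitmap against such PMFs over candidate steps `q` is the flavour of test the paper formalizes.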
NASA Technical Reports Server (NTRS)
Wilson, Robert M.
2001-01-01
Since 1750, the number of cataclysmic volcanic eruptions (volcanic explosivity index (VEI)>=4) per decade spans 2-11, with 96 percent located in the tropics and extra-tropical Northern Hemisphere. A two-point moving average of the volcanic time series has higher values since the 1860's than before, being 8.00 in the 1910's (the highest value) and 6.50 in the 1980's, the highest since the 1910's peak. Because of the usual behavior of the first difference of the two-point moving averages, one infers that its value for the 1990's will measure approximately 6.50 +/- 1, implying that approximately 7 +/- 4 cataclysmic volcanic eruptions should be expected during the present decade (2000-2009). Because cataclysmic volcanic eruptions (especially those having VEI>=5) nearly always have been associated with short-term episodes of global cooling, the occurrence of even one might confuse our ability to assess the effects of global warming. Poisson probability distributions reveal that the probability of one or more events with a VEI>=4 within the next ten years is >99 percent. It is approximately 49 percent for an event with a VEI>=5, and 18 percent for an event with a VEI>=6. Hence, the likelihood that a climatically significant volcanic eruption will occur within the next ten years appears reasonably high.
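The quoted probabilities follow from the Poisson model P(N >= 1) = 1 - exp(-lambda); a quick worked check, where the rates for VEI>=5 and VEI>=6 are back-solved assumptions chosen to reproduce the quoted tails:

```python
import math

def p_at_least_one(lam):
    """P(N >= 1) for N ~ Poisson(lam) over the decade."""
    return 1 - math.exp(-lam)

print(p_at_least_one(7.0) > 0.99)      # VEI>=4 at ~7 events/decade: >99%
print(round(p_at_least_one(0.67), 2))  # VEI>=5: ~0.49
print(round(p_at_least_one(0.20), 2))  # VEI>=6: ~0.18
```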
Technology Transfer Automated Retrieval System (TEKTRAN)
Statistically robust sampling strategies form an integral component of grain storage and handling activities throughout the world. Developing sampling strategies to target biological pests such as insects in stored grain is inherently difficult due to species biology and behavioral characteristics. ...
Kalinin, A V; Krasheninnikov, V N; Sviridov, A P; Titov, V N
2015-11-01
The content of clinically important fatty acids and individual triglycerides in food and biological mediums are traditionally detected by gas and fluid chromatography in various methodical modifications. The techniques are hard-to-get in laboratories of clinical biochemistry. The study was carried out to develop procedures and equipment for operative quantitative detection of concentration of fatty acids, primarily palmitic saturated fatty acid and oleic mono unsaturated fatty acid. Also detection was applied to sums ofpolyenoic (eicosapentaenoic and docosahexaenoic acid) fatty acids in biological mediums (cod-liver oil, tissues, blood plasma) using spectrometers of short-range infrared band of different types: with Fourier transform, diffraction and combined scattering. The evidences of reliable and reproducible quantitative detection offatty acids were received on the basis of technique of calibration (regression) by projection on latent structures using standard samples of mixtures of oils and fats. The evaluation is implemented concerning possibility of separate detection of content of palmitic and oleic triglycerides in mediums with presence of water The choice of technical conditions and mode of application of certain types of infrared spectrometers and techniques of their calibration is substantiated PMID:26999859
Maruvada, Padma; Srivastava, Sudhir
2006-06-01
Cancer remains the second leading cause of death in the United States, in spite of tremendous advances made in therapeutic and diagnostic strategies. Successful cancer treatment depends on improved methods to detect cancers at early stages when they can be treated more effectively. Biomarkers for early detection of cancer enable screening of asymptomatic populations and thus play a critical role in cancer diagnosis. However, the approaches for validating biomarkers have yet to be addressed clearly. In an effort to delineate the ambiguities related to biomarker validation and related statistical considerations, the National Cancer Institute, in collaboration with the Food and Drug Administration, conducted a workshop in July 2004 entitled "Research Strategies, Study Designs, and Statistical Approaches to Biomarker Validation for Cancer Diagnosis and Detection." The main objective of this workshop was to review basic considerations underpinning the study designs, statistical methodologies, and novel approaches necessary to rapidly advance the clinical application of cancer biomarkers. The current commentary describes various aspects of statistical considerations and study designs for cancer biomarker validation discussed in this workshop.
NASA Astrophysics Data System (ADS)
Skatter, Sondre; Fritsch, Sebastian; Schlomka, Jens-Peter
2016-05-01
The performance limits were explored for an X-ray diffraction based explosives detection system for baggage scanning. This XDi system offers 4D imaging that comprises three spatial dimensions with voxel sizes on the order of ~(0.5 cm)^3, and one spectral dimension for material discrimination. Because only a very small number of photons are observed for an individual voxel, material discrimination cannot work reliably at the voxel level. Therefore, an initial 3D reconstruction is performed, which allows the identification of objects of interest. Combining all the measured photons that scattered within an object, more reliable spectra are determined on the object level. As a case study we looked at two liquid materials, one threat and one innocuous, with very similar spectral characteristics, but with a 15% difference in electron density. Simulations showed that Poisson statistics alone reduce the material discrimination performance to undesirable levels when the photon counts drop to 250. When additional, uncontrolled variation sources are considered, the photon count plays a less dominant role in detection performance, but limits the performance also for photon counts of 500 and higher. Experimental data confirmed the presence of such non-Poisson variation sources also in the XDi prototype system, which suggests that the present system can still be improved without necessarily increasing the photon flux, but by better controlling and accounting for these variation sources. When the classification algorithm was allowed to use spectral differences in the experimental data, the discrimination between the two materials improved significantly, proving the potential of X-ray diffraction also for liquid materials.
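The impact of pure Poisson counting noise on two-material discrimination can be sketched with a toy midpoint-threshold classifier; the 15% count difference mirrors the stated electron-density difference, while the threshold rule and trial counts are assumptions:

```python
import numpy as np

rng = np.random.default_rng(7)

def error_rate(mean_counts, trials=20000):
    """Average misclassification rate for two materials whose expected
    photon counts differ by 15%, with Poisson noise only."""
    thr = 1.075 * mean_counts                    # midpoint of N and 1.15*N
    a = rng.poisson(mean_counts, trials)         # innocuous material
    b = rng.poisson(1.15 * mean_counts, trials)  # threat material
    return 0.5 * ((a >= thr).mean() + (b < thr).mean())

print(error_rate(250) > error_rate(1000))  # more photons, fewer errors
```

Because the relative Poisson spread scales as 1/sqrt(N), the two count distributions overlap heavily at 250 photons and separate at higher counts, consistent with the simulation result quoted above.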
Maity, Debabrata; Jiang, Juanjuan; Ehlers, Martin; Wu, Junchen; Schmuck, Carsten
2016-05-01
A cationic molecular peptide beacon NAP1 functionalized with a fluorescence resonance energy transfer-pair at its ends allows the ratiometric detection of ds-DNA with a preference for AT rich sequences. NAP1 most likely binds in a folded form into the minor groove of ds-DNA, which results in a remarkable change in its fluorescence properties. As NAP1 exhibits quite low cytotoxicity, it can also be used for imaging of nuclear DNA in cells. PMID:27071707
Cottrell, R. Les; Logg, Connie; Chhaparia, Mahesh; Grigoriev, Maxim; Haro, Felipe; Nazir, Fawad; Sandford, Mark
2006-01-25
End-to-end fault and performance problem detection in wide area production networks is becoming increasingly hard as the complexity of the paths, the diversity of the performance, and dependency on the network increase. Several monitoring infrastructures have been built to monitor different network metrics and collect monitoring information from thousands of hosts around the globe. Typically there are hundreds to thousands of time-series plots of network metrics which need to be examined to identify network performance problems or anomalous variations in the traffic. Furthermore, most commercial products rely on comparison with user-configured static thresholds and often require access to SNMP-MIB information, to which a typical end-user does not usually have access. In this paper we propose new techniques to detect network performance problems proactively in close to real time, without relying on static thresholds or SNMP-MIB information. We describe and compare several different algorithms that we have implemented to detect persistent network problems using anomalous variation analysis in real end-to-end Internet performance measurements. We also provide methods and/or guidance for how to set the user-settable parameters. The measurements are based on active probes running on 40 production network paths with bottlenecks varying from 0.5 Mbit/s to 1000 Mbit/s. For well-behaved data (no missed measurements and no very large outliers) with small seasonal changes, most algorithms identify similar events. We compare the algorithms' robustness with respect to false positives and missed events, especially when there are large seasonal effects in the data. Our proposed techniques cover a wide variety of network paths and traffic patterns. We also discuss the applicability of the algorithms in terms of their intuitiveness, their speed of execution as implemented, and areas of applicability. Our encouraging results compare and evaluate the accuracy of our detection
Imanishi, M; Newton, A E; Vieira, A R; Gonzalez-Aviles, G; Kendall Scott, M E; Manikonda, K; Maxwell, T N; Halpin, J L; Freeman, M M; Medalla, F; Ayers, T L; Derado, G; Mahon, B E; Mintz, E D
2015-08-01
Although rare, typhoid fever cases acquired in the United States continue to be reported. Detection and investigation of outbreaks in these domestically acquired cases offer opportunities to identify chronic carriers. We searched surveillance and laboratory databases for domestically acquired typhoid fever cases, used a space-time scan statistic to identify clusters, and classified clusters as outbreaks or non-outbreaks. From 1999 to 2010, domestically acquired cases accounted for 18% of 3373 reported typhoid fever cases; their isolates were less often multidrug-resistant (2% vs. 15%) compared to isolates from travel-associated cases. We identified 28 outbreaks and two possible outbreaks within 45 space-time clusters of ⩾2 domestically acquired cases, including three outbreaks involving ⩾2 molecular subtypes. The approach detected seven of the ten outbreaks published in the literature or reported to CDC. Although this approach did not definitively identify any previously unrecognized outbreaks, it showed the potential to detect outbreaks of typhoid fever that may escape detection by routine analysis of surveillance data. Sixteen outbreaks had been linked to a carrier. Every case of typhoid fever acquired in a non-endemic country warrants thorough investigation. Space-time scan statistics, together with shoe-leather epidemiology and molecular subtyping, may improve outbreak detection.
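The clustering step above can be illustrated with a purely temporal, one-dimensional analogue of the scan statistic. This is a toy sketch, not the study's actual space-time analysis: it scans short windows of a weekly case series against a uniform-risk Poisson baseline and returns the window with the highest Kulldorff-style log-likelihood ratio. The case counts are invented for illustration.

```python
import math

def best_cluster(counts, max_len=4):
    """Scan all short windows of a weekly case series and return the
    window with the highest Poisson log-likelihood ratio (a purely
    temporal analogue of the space-time scan statistic)."""
    total = sum(counts)
    n = len(counts)
    best = (0.0, None)
    for start in range(n):
        for length in range(1, max_len + 1):
            if start + length > n:
                break
            c = sum(counts[start:start + length])  # observed in window
            e = total * length / n                 # expected under uniform risk
            if c <= e or c == total:
                continue
            llr = (c * math.log(c / e)
                   + (total - c) * math.log((total - c) / (total - e)))
            if llr > best[0]:
                best = (llr, (start, start + length))
    return best

# sporadic background cases with an injected excess in weeks 10-12
weekly = [1, 0, 2, 1, 0, 1, 1, 0, 1, 0, 6, 7, 5, 1, 0, 1, 2, 0, 1, 1]
llr, window = best_cluster(weekly)
print(window, round(llr, 2))
```

In practice the scan is run over cylinders in space and time and the maximum LLR is calibrated by Monte Carlo replication; the window returned here is simply the temporal slice with the strongest excess.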
Tonello, Lucio; Conway de Macario, Everly; Marino Gammazza, Antonella; Cocchi, Massimo; Gabrielli, Fabio; Zummo, Giovanni; Cappello, Francesco; Macario, Alberto J L
2015-03-01
The pathogenesis of Hashimoto's thyroiditis includes autoimmunity involving thyroid antigens, autoantibodies, and possibly cytokines. It is unclear what role Hsp60 plays, but our recent data indicate that it may contribute to pathogenesis as an autoantigen. Its role in the induction of cytokine production, pro- or anti-inflammatory, was not elucidated, except that we found that peripheral blood mononuclear cells (PBMC) from patients or from healthy controls did not respond with cytokine production upon stimulation by Hsp60 in vitro with patterns that would differentiate patients from controls with statistical significance. This "negative" outcome appeared when the data were pooled and analyzed with conventional statistical methods. We re-analyzed our data with non-conventional statistical methods based on data mining using the classification and regression tree learning algorithm and clustering methodology. The results indicate that by focusing on IFN-γ and IL-2 levels before and after Hsp60 stimulation of PBMC in each patient, it is possible to differentiate patients from controls. A major general conclusion is that when trying to identify disease markers such as levels of cytokines and Hsp60, reference to standards obtained from pooled data from many patients may be misleading. The chosen biomarker, e.g., production of IFN-γ and IL-2 by PBMC upon stimulation with Hsp60, must be assessed before and after stimulation and the results compared within each patient and analyzed with conventional and data mining statistical methods.
Nondetect (ND) or below detection limit (BDL) results cannot be measured accurately, and, therefore, are reported as less than certain detection limit (DL) values. However, since the presence of some contaminants (e.g., dioxin) in environmental media may pose a threat to human he...
NASA Technical Reports Server (NTRS)
Mitrofanov, I. G.; Chernenko, A. M.; Pozanenko, A. S.; Fishman, G. J.; Meegan, C. A.; Briggs, M. S.; Paciesas, W. S.; Sagdeev, R. Z.
1995-01-01
A new method for the statistical study of cosmic gamma-ray bursts, based on the averaging of time profiles, is presented. A comparison is made between bright and dim events: while no differences were found between the average flux curves, the hardness ratios pointed to a hardness/brightness correlation.
Mokrousov, Igor; Vyazovaya, Anna; Zhuravlev, Viacheslav; Otten, Tatiana; Millet, Julie; Jiao, Wei-Wei; Shen, A-Dong; Rastogi, Nalin; Vishnevsky, Boris; Narvskaya, Olga
2014-01-01
Mycobacterium tuberculosis Beijing genotype strains are rapidly disseminating, frequently hypervirulent, and multidrug resistant. Here, we describe a method for their rapid detection by real-time PCR that targets the specific IS6110 insertion in the dnaA-dnaN genome region. The method was evaluated with a geographically and genetically diverse collection representing areas in East Asia and the former Soviet Union in which the Beijing genotype is endemic and epidemic (i.e., major foci of its global propagation) and with clinical specimens. PMID:24523461
Gladysz, Szymon; Yaitskova, Natalia; Christou, Julian C
2010-11-01
This paper is an introduction to the problem of modeling the probability density function of adaptive-optics speckle. We show that with the modified Rician distribution one cannot describe the statistics of light on axis. A dual solution is proposed: the modified Rician distribution for off-axis speckle and gamma-based distribution for the core of the point spread function. From these two distributions we derive optimal statistical discriminators between real sources and quasi-static speckles. In the second part of the paper the morphological difference between the two probability density functions is used to constrain a one-dimensional, "blind," iterative deconvolution at the position of an exoplanet. Separation of the probability density functions of signal and speckle yields accurate differential photometry in our simulations of the SPHERE planet finder instrument.
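The modified Rician density mentioned above has a standard closed form in terms of the deterministic intensity Ic and the speckle intensity Is (the parameter names follow the usual speckle-statistics convention, not necessarily the paper's notation). The stdlib-only sketch below implements it with a power-series Bessel I0 and numerically checks its two basic properties: unit mass and mean Ic + Is.

```python
import math

def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    term, total, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= (x / 2.0) ** 2 / k ** 2
        total += term
    return total

def modified_rician(i, ic, is_):
    """Modified Rician PDF of speckle intensity: Ic is the deterministic
    (diffraction-core) part, Is the random speckle part."""
    return (1.0 / is_) * math.exp(-(i + ic) / is_) \
        * bessel_i0(2.0 * math.sqrt(i * ic) / is_)

# sanity checks: the density integrates to ~1 and has mean Ic + Is
ic, is_ = 1.0, 0.5
step = 0.002
grid = [k * step for k in range(1, 12501)]  # integrate out to I = 25
pdf = [modified_rician(x, ic, is_) for x in grid]
norm = sum(p * step for p in pdf)
mean = sum(x * p * step for x, p in zip(grid, pdf))
print(round(norm, 4), round(mean, 4))
```

A likelihood-ratio discriminator of the kind the paper derives would compare this density against a gamma-based density fitted to the on-axis core, point by point in the image.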
Hillengass, Jens; Ritsch, Judith; Merz, Maximilian; Wagner, Barbara; Kunz, Christina; Hielscher, Thomas; Laue, Hendrik; Bäuerle, Tobias; Zechmann, Christian M; Ho, Anthony D; Schlemmer, Heinz-Peter; Goldschmidt, Hartmut; Moehler, Thomas M; Delorme, Stefan
2016-07-01
This prospective study aimed to investigate the prognostic significance of dynamic contrast enhanced magnetic resonance imaging (DCE-MRI) as a non-invasive imaging technique delivering the quantitative parameters amplitude A (reflecting blood volume) and exchange rate constant kep (reflecting vascular permeability) in patients with asymptomatic monoclonal plasma cell diseases. We analysed DCE-MRI parameters in 33 healthy controls and 148 patients with monoclonal gammopathy of undetermined significance (MGUS) or smouldering multiple myeloma (SMM) according to the 2003 IMWG guidelines. All individuals underwent standardized DCE-MRI of the lumbar spine. Regions of interest were drawn manually on T1-weighted images encompassing the bone marrow of each of the 5 lumbar vertebrae sparing the vertebral vessel. Prognostic significance for median of amplitude A (univariate: P < 0·001, hazard ratio (HR) 2·42, multivariate P = 0·02, HR 2·7) and exchange rate constant kep (univariate P = 0·03, HR 1·92, multivariate P = 0·46, HR 1·5) for time to progression of 79 patients with SMM was found. Patients with amplitude A above the optimal cut-off point of 0·89 arbitrary units had a 2-year progression rate into symptomatic disease of 80%. In conclusion, DCE-MRI parameters are of prognostic significance for time to progression in patients with SMM but not in individuals with MGUS. PMID:26991959
Platts-Mills, James A.; Liu, Jie; Gratz, Jean; Mduma, Esto; Amour, Caroline; Swai, Ndealilia; Taniuchi, Mami; Begum, Sharmin; Peñataro Yori, Pablo; Tilley, Drake H.; Lee, Gwenyth; Shen, Zeli; Whary, Mark T.; Fox, James G.; McGrath, Monica; Kosek, Margaret; Haque, Rashidul
2014-01-01
Campylobacter is a common bacterial enteropathogen that can be detected in stool by culture, enzyme immunoassay (EIA), or PCR. We compared culture for C. jejuni/C. coli, EIA (ProSpecT), and duplex PCR to distinguish Campylobacter jejuni/C. coli and non-jejuni/coli Campylobacter on 432 diarrheal and matched control stool samples from infants in a multisite longitudinal study of enteric infections in Tanzania, Bangladesh, and Peru. The sensitivity and specificity of culture were 8.5% and 97.6%, respectively, compared with the results of EIA and 8.7% and 98.0%, respectively, compared with the results of PCR for C. jejuni/C. coli. Most (71.6%) EIA-positive samples were positive by PCR for C. jejuni/C. coli, but 27.6% were positive for non-jejuni/coli Campylobacter species. Sequencing of 16S rRNA from 53 of these non-jejuni/coli Campylobacter samples showed that it most closely matched the 16S rRNA of C. hyointestinalis subsp. lawsonii (56%), C. troglodytis (33%), C. upsaliensis (7.7%), and C. jejuni/C. coli (2.6%). Campylobacter-negative stool spiked with each of the above-mentioned Campylobacter species revealed reactivity with EIA. PCR detection of Campylobacter species was strongly associated with diarrhea in Peru (odds ratio [OR] = 3.66, P < 0.001) but not in Tanzania (OR = 1.56, P = 0.24) or Bangladesh (OR = 1.13, P = 0.75). According to PCR, Campylobacter jejuni/C. coli infections represented less than half of all infections with Campylobacter species. In sum, in infants in developing country settings, the ProSpecT EIA and PCR for Campylobacter reveal extremely high rates of positivity. We propose the use of PCR because it retains high sensitivity, can ascertain burden, and can distinguish between Campylobacter infections at the species level. PMID:24452175
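The headline test metrics in this abstract come from standard 2x2-table arithmetic. The sketch below is illustrative only: the cell counts are hypothetical (chosen so sensitivity matches the reported 8.5%; the false-positive and true-negative cells are invented, since the paper reports rates rather than raw cells).

```python
def diagnostics(tp, fp, fn, tn):
    """Sensitivity and specificity of an index test against a reference,
    plus the odds ratio used in case-control association analyses."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    odds_ratio = (tp * tn) / (fp * fn) if fp and fn else float("inf")
    return sens, spec, odds_ratio

# hypothetical counts, for illustration only
sens, spec, oratio = diagnostics(tp=17, fp=5, fn=183, tn=227)
print(round(sens, 3), round(spec, 3), round(oratio, 2))
```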
Ahn, Hyo-Sae; Son, Whee Sung; Shin, Ji-Hoon; Ahn, Myun-Whan
2016-01-01
Study Design: Retrospective exploratory imaging study. Purpose: To investigate the significance of coronal magnetic resonance imaging (MRI) using the Proset technique to detect the hidden zone in patients with mid-zone stenosis, by comparison with conventional axial and sagittal MRI, and to explore the morphologic characteristic patterns of mid-zone stenosis. Overview of Literature: Despite advancements in diagnostic modalities such as computed tomography and MRI, stenotic lesions under the pedicle and pars interarticularis, also called the mid-zone, are still difficult to detect definitively with conventional axial and sagittal MRI due to the zone's inherent anatomical peculiarity. Methods: Of 180 patients scheduled to undergo selective nerve root block, 20 patients with mid-zone stenosis were analyzed using MRI. Characteristic group patterns were also explored morphologically by comparing MRI views of each group after verifying statistical differences between them. Hierarchical cluster analysis was performed to classify morphological characteristic groups based on three-dimensional radiologic grade for stenosis at all three zones. Results: At the mid-zone, stenosis of grade 2 or more was found in 14 cases in the coronal image, 13 cases in the sagittal image, and 9 cases in the axial image (p<0.05). In particular, mid-zone stenosis was not detected in six of 20 cases on the axial images. At the entrance and exit zones, the coronal image was also associated with more accurate detection of the hidden zone compared to the axial and sagittal views. After repeated statistical verification, the morphological patterns of the hidden zone were classified into 5 groups: 6 cases in group I; 1 case in group II; 4 cases in group III; 7 cases in group IV; and 2 cases in group V. Conclusions: Coronal MRI using the Proset technique detected the hidden zone of mid-zone stenosis more accurately than conventional axial and sagittal images. PMID:27559443
Ziessman, H.A.; Wahl, R.L.; Lahti, D.; Juni, J.E.; Thrall, J.H.; Keyes, J.W.
1984-01-01
HAPS has proven useful in the clinical management of patients receiving intraarterial chemotherapy for liver cancer. This therapy can be successful when the entire tumor-bearing liver is perfused; however, abdominal EHP may reduce tumor exposure and increase systemic toxicity. EHP can be difficult to determine or be overlooked, particularly when the stomach overlaps an enlarged left lobe of the liver. This study reports the frequency and clinical significance of EHP and evaluates the use of SPECT and EZ Gas (NaHCO3) to localize it. EHP was seen in 14% of 147 pts with surgically placed catheters, but was significantly more frequent, 53%, in 57 pts with percutaneously placed catheters (p < .005). Significantly more pts with EHP (70%) had symptoms of drug toxicity compared to 19% without EHP (p < .005). Review of 73 HAPS studies using SPECT in addition to planar images showed EHP in 7. SPECT was very helpful in evaluating EHP suspected on planar images in 6 cases, confirming it in 4 and excluding it in 2. Two planar studies with likely EHP were confirmed by SPECT and 1 was inconclusive. EZ Gas effervescent granules have also been found useful in defining gastric EHP in 20 planar HAPS studies. These "air-contrast" views were helpful in confirming or excluding EHP in 80% of the studies, and the initial impression was changed in 20%. Results were corroborated by oral Tc-DTPA and angiography. This study demonstrates that EHP is frequent and has important clinical significance but can be difficult to determine on HAPS. SPECT, EZ Gas and Tc-DTPA are very helpful in confirming or excluding suspected EHP.
NASA Astrophysics Data System (ADS)
Xiao, Yongshuang; Ma, Daoyuan; Xu, Shihong; Liu, Qinghua; Wang, Yanfeng; Xiao, Zhizhong; Li, Jun
2016-05-01
Oplegnathus fasciatus (rock bream) is a commercial rocky reef fish species in East Asia that has been considered for aquaculture. We estimated the population genetic diversity and population structure of the species along the coastal waters of China using fluorescent amplified fragment length polymorphism technology. Using 53 individuals from three populations and four pairs of selective primers, we amplified 1,264 bands, 98.73% of which were polymorphic. The Zhoushan population showed the highest Nei's genetic diversity and Shannon genetic diversity. The results of analysis of molecular variance (AMOVA) showed that 59.55% of genetic variation existed among populations and 40.45% occurred within populations, which indicated that a significant population genetic structure existed in the species. The pairwise fixation index Fst ranged from 0.20 to 0.63, and all values were significant after sequential Bonferroni correction. The topology of an unweighted pair group method with arithmetic mean tree showed two significant genealogical branches corresponding to the sampling locations of North and South China. The AMOVA and STRUCTURE analyses suggested that the O. fasciatus populations examined comprise two stocks.
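For dominant markers such as AFLP bands, Nei's gene diversity and the Shannon index per locus are usually computed from the band-presence frequency under a Hardy-Weinberg assumption (the band-absent allele frequency is estimated as the square root of the absence frequency). The sketch below shows these standard estimators; it is not necessarily the exact software pipeline the authors used.

```python
import math

def dominant_marker_diversity(band_presence_freq):
    """Nei's gene diversity and Shannon index for one AFLP band,
    assuming Hardy-Weinberg so that q = sqrt(freq of band absence)."""
    q = math.sqrt(1.0 - band_presence_freq)  # null (band-absent) allele freq
    p = 1.0 - q
    nei = 1.0 - p * p - q * q
    shannon = 0.0
    for f in (p, q):
        if f > 0:
            shannon -= f * math.log(f)
    return nei, shannon

# a band present in 75% of individuals implies p = q = 0.5
nei, shannon = dominant_marker_diversity(0.75)
print(round(nei, 4), round(shannon, 4))
```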
Fleg, J.L.; Gerstenblith, G.; Zonderman, A.B.; Becker, L.C.; Weisfeldt, M.L.; Costa, P.T. Jr.; Lakatta, E.G.
1990-02-01
Although a silent ischemic electrocardiographic response to treadmill exercise in clinically healthy populations is associated with an increased likelihood of future coronary events (i.e., angina pectoris, myocardial infarction, or cardiac death), such a response has a low predictive value for future events because of the low prevalence of disease in asymptomatic populations. To examine whether detection of reduced regional perfusion by thallium scintigraphy improved the predictive value of exercise-induced ST segment depression, we performed maximal treadmill exercise electrocardiography (ECG) and thallium scintigraphy (201Tl) in 407 asymptomatic volunteers 40-96 years of age (mean = 60) from the Baltimore Longitudinal Study of Aging. The prevalence of exercise-induced silent ischemia, defined by concordant ST segment depression and a thallium perfusion defect, increased more than sevenfold from 2% in the fifth and sixth decades to 15% in the ninth decade. Over a mean follow-up period of 4.6 years, cardiac events developed in 9.8% of subjects and consisted of 20 cases of new angina pectoris, 13 myocardial infarctions, and seven deaths. Events occurred in 7% of individuals with both negative 201Tl and ECG, 8% of those with either test positive, and 48% of those in whom both tests were positive (p < 0.001). By proportional hazards analysis, age, hypertension, exercise duration, and a concordant positive ECG and 201Tl result were independent predictors of coronary events. Furthermore, those with positive ECG and 201Tl had a 3.6-fold relative risk for subsequent coronary events, independent of conventional risk factors.
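The opening point, that even a reasonable test has low predictive value when disease prevalence is low, is a direct consequence of Bayes' rule. The numbers below are purely illustrative and are not taken from the study.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule."""
    tp = sensitivity * prevalence
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return tp / (tp + fp)

# illustrative numbers only: the same test has a low PPV when screening
# an asymptomatic (low-prevalence) population and a high PPV otherwise
low = ppv(sensitivity=0.70, specificity=0.90, prevalence=0.05)
high = ppv(sensitivity=0.70, specificity=0.90, prevalence=0.40)
print(round(low, 3), round(high, 3))
```

Requiring concordance of two tests, as the study does, effectively raises the combined specificity, which is what lifts the predictive value in the concordant-positive group.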
NASA Astrophysics Data System (ADS)
Jing, Yu; Wang, Yaxuan; Liu, Jianxin; Liu, Zhaoxia
2015-08-01
Edge detection is a crucial method for the location and quantity estimation of oil slicks when oil is spilled at sea. In this paper, we present a robust active contour edge detection algorithm for oil spill remote sensing images. In the proposed algorithm, we define a local Gaussian data fitting energy term with spatially varying means and variances, and this data fitting energy term is introduced into a global minimization active contour (GMAC) framework. The energy function minimization is achieved quickly by a dual formulation of the weighted total variation norm. The proposed algorithm avoids local minima, does not require the definition of an initial contour, and is robust to the weak boundaries, high noise and severe intensity inhomogeneity existing in oil slick remote sensing images. Furthermore, the edge detection of the oil slick and the correction of intensity inhomogeneity are achieved simultaneously by the proposed algorithm. The experimental results show superior performance of the proposed algorithm over state-of-the-art edge detection algorithms. In addition, the proposed algorithm can also deal with special images in which the object and background have the same intensity means but different variances.
Girard, Philippe
2011-01-01
Null alleles are common technical artifacts in genetic-based analyses. Powerful methods enabling their detection in either panmictic or inbred populations have been proposed. However, none of these methods appears unbiased in both types of mating systems, necessitating a priori knowledge of the inbreeding level of the population under study. To counter this problem, I propose to use the software FDist2 to detect the atypical fixation indices that characterize markers with null alleles. The rationale behind this approach and the parameter settings are explained. The power of the method for various sample sizes, degrees of inbreeding and null allele frequencies is evaluated using simulated microsatellite and SNP datasets and then compared to two other null allele detection methods. The results clearly show the robustness of the method proposed here as well as its greater accuracy in both panmictic and inbred populations for both types of marker. By allowing proper detection of null alleles for a wide range of mating systems and markers, this new method is particularly appealing for numerous genetic studies using co-dominant loci. PMID:21381434
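The "atypical fixation indices" the abstract relies on arise because null heterozygotes are scored as homozygotes, which depresses observed heterozygosity and inflates Fis. The simulation below is a sketch of that mechanism only (one biallelic visible locus plus a null allele, invented parameters), not the FDist2 procedure itself.

```python
import random

def fis_with_null(null_freq, n_ind=4000, seed=11):
    """Simulate one codominant locus with two visible alleles plus a null
    allele; null heterozygotes are scored as homozygotes, inflating Fis."""
    rng = random.Random(seed)
    freqs = [(1 - null_freq) / 2, (1 - null_freq) / 2, null_freq]
    alleles = ["A", "B", None]

    def draw():
        r = rng.random()
        if r < freqs[0]:
            return alleles[0]
        return alleles[1] if r < freqs[0] + freqs[1] else alleles[2]

    het_obs = hom_obs = 0
    for _ in range(n_ind):
        a, b = draw(), draw()
        if a is None and b is None:
            continue              # null homozygote: typing failure, dropped
        if a is None or b is None or a == b:
            hom_obs += 1          # null heterozygotes look homozygous
        else:
            het_obs += 1
    h_obs = het_obs / (het_obs + hom_obs)
    # expected heterozygosity from APPARENT allele counts (A vs B are 50/50)
    h_exp = 0.5
    return 1.0 - h_obs / h_exp

print(round(fis_with_null(0.0), 3), round(fis_with_null(0.2), 3))
```

Without a null allele the estimated Fis hovers near zero; a 20% null allele pushes it strongly positive at this single locus even though the population is panmictic, which is exactly the outlier signature an Fst/Fis scan can pick up.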
Dolan, T E; Lynch, P D; Karazsia, J L; Serafy, J E
2016-03-01
An expansion is underway of a nuclear power plant on the shoreline of Biscayne Bay, Florida, USA. While the precise effects of its construction and operation are unknown, impacts on surrounding marine habitats and biota are considered by experts to be likely. The objective of the present study was to determine the adequacy of an ongoing monitoring survey of fish communities associated with mangrove habitats directly adjacent to the power plant to detect fish community changes, should they occur, at three spatial scales. Using seasonally resolved data recorded during 532 fish surveys over an 8-year period, power analyses were performed for four mangrove fish metrics (fish diversity, fish density, and the occurrence of two ecologically important fish species: gray snapper (Lutjanus griseus) and goldspotted killifish (Floridichthys carpio). Results indicated that the monitoring program at current sampling intensity allows for detection of <33% changes in fish density and diversity metrics in both the wet and the dry season in the two larger study areas. Sampling effort was found to be insufficient in either season to detect changes at this level (<33%) in species-specific occurrence metrics for the two fish species examined. The option of supplementing ongoing, biological monitoring programs for improved, focused change detection deserves consideration from both ecological and cost-benefit perspectives. PMID:26903208
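The power analyses described above can be approximated with a normal-theory sketch: the power of a two-sample z-test to detect a given relative change in a mean depends on the metric's coefficient of variation (CV) and the number of surveys. The CV values and sample sizes below are assumptions for illustration, not the study's estimates.

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def power_pct_change(n_per_group, cv, pct_change, alpha=0.05):
    """Approximate power of a two-sample z-test to detect a relative
    change `pct_change` in a mean whose coefficient of variation is `cv`."""
    effect = pct_change / cv           # standardized effect size
    se = math.sqrt(2.0 / n_per_group)  # SE of the standardized difference
    z_crit = 1.959964                  # two-sided alpha = 0.05
    return 1.0 - phi(z_crit - effect / se)

# a 33% change in a noisy metric (CV ~ 1.0, typical of count data) needs far
# more surveys than the same change in a tighter metric (CV ~ 0.5)
p_noisy = power_pct_change(n_per_group=60, cv=1.0, pct_change=0.33)
p_tight = power_pct_change(n_per_group=60, cv=0.5, pct_change=0.33)
print(round(p_noisy, 2), round(p_tight, 2))
```

This is why the survey could resolve <33% changes in density and diversity but not in the sparser species-occurrence metrics: occurrence data have a much larger effective CV at the same sampling effort.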
ERIC Educational Resources Information Center
Fidalgo, Angel M.
2011-01-01
Mantel-Haenszel (MH) methods constitute one of the most popular nonparametric differential item functioning (DIF) detection procedures. GMHDIF has been developed to provide an easy-to-use program for conducting DIF analyses. Some of the advantages of this program are that (a) it performs two-stage DIF analyses in multiple groups simultaneously;…
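The core MH computation behind such DIF software is the common odds ratio pooled across ability-matched strata. The sketch below shows that computation only (the strata counts are invented); a full DIF analysis would add the MH chi-square and an effect-size classification.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio across score-matched strata.
    Each stratum is (a, b, c, d): reference group correct/incorrect,
    focal group correct/incorrect on the studied item."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# no DIF: both groups have identical odds at every ability stratum
no_dif = [(40, 10, 20, 5), (30, 30, 15, 15), (10, 40, 5, 20)]
# DIF favouring the reference group at each stratum
dif = [(45, 5, 15, 10), (40, 20, 10, 20), (20, 30, 2, 23)]
print(round(mantel_haenszel_or(no_dif), 3), round(mantel_haenszel_or(dif), 3))
```

An MH odds ratio near 1 indicates no DIF; sustained departure from 1 across strata flags the item for review.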
NASA Astrophysics Data System (ADS)
Pasmanik, Dmitry; Hayosh, Mykhaylo; Demekhov, Andrei; Santolík, Ondřej; Nemec, František; Parrot, Michel
2015-04-01
We present a statistical study of the quasi-periodic (QP) ELF/VLF emissions measured by the DEMETER spacecraft. Events with a modulation period larger than 10 s and a frequency bandwidth of more than 200 Hz were visually selected from six years of measurements. The selected QP-emission events occur mostly at frequencies from about 750 Hz to 2 kHz, but they may be observed at frequencies as low as 500 Hz and as high as 8 kHz. The statistical analysis clearly shows that QP events with larger modulation periods have a lower frequency drift and smaller wave amplitude. Intense QP events have higher frequency drifts and larger frequency bandwidths. Numerical simulation of the QP emissions based on the theoretical model of the flow cyclotron maser is performed. Calculations were made for a wide range of plasma parameters (cold plasma density, L-shell, energetic electron flux, etc.). The numerical results are in good agreement with the observed relationships between different parameters of the QP emissions. The comparison between theoretical results and observations allows us to estimate the typical properties of the source of the QP emissions observed by the DEMETER satellite.
NASA Astrophysics Data System (ADS)
Li, Jianyong; Meng, Guojie; Wang, Min; Liao, Hua; Shen, Xuhui
2009-10-01
Ionospheric TEC (total electron content) time series are derived from GPS measurements at 13 stations around the epicenter of the 2008 Wenchuan earthquake. Defining anomaly bounds for a sliding window by the quartiles and 2 standard deviations of TEC values, this paper analyzes the characteristics of ionospheric changes before and after the destructive event. The Neyman-Pearson signal detection method is employed to compute the probabilities of TEC abnormalities. Results show that one week before the Wenchuan earthquake, ionospheric TEC over the epicenter and its vicinity displayed obvious abnormal disturbances, most of which were positive anomalies. The largest TEC abnormal changes appeared on May 9, three days prior to the seismic event. Signal detection shows that the largest probability of a TEC abnormality on May 9 is 50.74%, indicating that the ionospheric abnormalities three days before the main shock are likely related to the preparation process of the Ms 8.0 Wenchuan earthquake.
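A minimal version of the sliding-window bounds described above can be sketched as follows. This is an illustration, not the paper's exact procedure: the window length, the combination rule (a point must breach both the 2-sigma bound and the quartile/IQR bound), and the synthetic TEC series are all assumptions.

```python
import statistics

def anomaly_flags(series, window=15, k_sigma=2.0, k_iqr=1.5):
    """Flag points lying outside BOTH a mean +/- 2-sigma bound and a
    quartile (IQR) bound computed from the trailing window."""
    flags = []
    for t in range(window, len(series)):
        past = series[t - window:t]
        mu = statistics.fmean(past)
        sd = statistics.stdev(past)
        q1, _, q3 = statistics.quantiles(past, n=4)
        iqr = q3 - q1
        x = series[t]
        out_sigma = abs(x - mu) > k_sigma * sd
        out_iqr = x > q3 + k_iqr * iqr or x < q1 - k_iqr * iqr
        flags.append((t, out_sigma and out_iqr))
    return flags

# quiet oscillating background with one strong positive excursion at t = 25
tec = [10.0 + 0.3 * ((-1) ** t) for t in range(40)]
tec[25] = 16.0
flagged = [t for t, bad in anomaly_flags(tec) if bad]
print(flagged)
```

On real TEC data the diurnal and geomagnetic backgrounds would first be removed (or the window chosen to span them), otherwise the bounds chase the daily cycle rather than genuine anomalies.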
NASA Astrophysics Data System (ADS)
Gurcan, Metin N.; Petrick, Nicholas; Sahiner, Berkman; Chan, Heang-Ping; Cascade, Philip N.; Kazerooni, Ella A.; Hadjiiski, Lubomir M.
2001-07-01
We are developing a computer-aided diagnosis (CAD) system for lung nodule detection on thoracic helical computed tomography (CT) images. In the first stage of this CAD system, lung regions are identified and suspicious structures are segmented. These structures may include true lung nodules or normal structures that consist mainly of vascular structures. We have designed rule-based classifiers to distinguish nodules and normal structures using 2D and 3D features. After rule-based classification, linear discriminant analysis (LDA) is used to further reduce the number of false positive (FP) objects. We have performed a preliminary study using CT images from 17 patients with 31 lung nodules. When only LDA classification was applied to the segmented objects, the sensitivity was 84% (26/31) with 2.53 (1549/612) FP objects per slice. When the LDA followed the rule-based classifier, the number of FP objects per slice decreased to 1.75 (1072/612) at the same sensitivity. These preliminary results demonstrate the feasibility of our approach for nodule detection and FP reduction on CT images. The inclusion of rule-based classification leads to an improvement in detection accuracy for the CAD system.
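The two-stage cascade described above, rules first and then a linear discriminant to prune false positives, can be sketched on synthetic candidates. Everything here is invented for illustration (feature names, distributions, and the rule threshold); the point is the structure: a cheap rule stage removes obvious vessels, then Fisher LDA separates the survivors.

```python
import random

def mean2(xs):
    n = len(xs)
    return [sum(v[0] for v in xs) / n, sum(v[1] for v in xs) / n]

def cov2(xs, m):
    n = len(xs)
    sxx = sum((v[0] - m[0]) ** 2 for v in xs) / n
    syy = sum((v[1] - m[1]) ** 2 for v in xs) / n
    sxy = sum((v[0] - m[0]) * (v[1] - m[1]) for v in xs) / n
    return sxx, sxy, syy

def fisher_direction(pos, neg):
    """w = Sw^-1 (m+ - m-) with a hand-rolled 2x2 inverse."""
    mp, mn = mean2(pos), mean2(neg)
    a1, b1, c1 = cov2(pos, mp)
    a2, b2, c2 = cov2(neg, mn)
    a, b, c = a1 + a2, b1 + b2, c1 + c2  # pooled within-class scatter
    det = a * c - b * b
    d0, d1 = mp[0] - mn[0], mp[1] - mn[1]
    w = ((c * d0 - b * d1) / det, (a * d1 - b * d0) / det)
    thr = 0.5 * (w[0] * (mp[0] + mn[0]) + w[1] * (mp[1] + mn[1]))
    return w, thr

rng = random.Random(7)
# synthetic candidates: (volume, elongation); vessels are tubular
nodules = [(rng.gauss(50, 8), rng.gauss(1.2, 0.2)) for _ in range(40)]
vessels = [(rng.gauss(30, 12), rng.gauss(3.0, 0.8)) for _ in range(400)]

# stage 1: rule-based filter on elongation
def keep(v):
    return v[1] < 2.2

cand_pos = [v for v in nodules if keep(v)]
cand_neg = [v for v in vessels if keep(v)]

# stage 2: Fisher LDA on the survivors
w, thr = fisher_direction(cand_pos, cand_neg)
def score(v):
    return w[0] * v[0] + w[1] * v[1]

tp = sum(score(v) > thr for v in cand_pos)
fp = sum(score(v) > thr for v in cand_neg)
print(len(cand_neg), fp, tp, len(cand_pos))
```

As in the paper's numbers, most of the false-positive reduction comes from the cascade: the rule stage discards the bulk of the vessel candidates cheaply, and the discriminant stage trims the remainder at little cost in sensitivity.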
Bint, Susan; Irving, Melita D.; Kyle, Phillipa M.; Akolekar, Ranjit; Mohammed, Shehla N.; Mackie Ogilvie, Caroline
2014-01-01
Purpose. To design and validate a prenatal chromosomal microarray testing strategy that moves away from size-based detection thresholds, towards a more clinically relevant analysis, providing higher resolution than G-banded chromosomes but avoiding the detection of copy number variants (CNVs) of unclear prognosis that cause parental anxiety. Methods. All prenatal samples fulfilling our criteria for karyotype analysis (n = 342) were tested by chromosomal microarray and only CNVs of established deletion/duplication syndrome regions and any other CNV >3 Mb were detected and reported. A retrospective full-resolution analysis of 249 of these samples was carried out to ascertain the performance of this testing strategy. Results. Using our prenatal analysis, 23/342 (6.7%) samples were found to be abnormal. Of the remaining samples, 249 were anonymized and reanalyzed at full-resolution; a further 46 CNVs were detected in 44 of these cases (17.7%). None of these additional CNVs were of clear clinical significance. Conclusion. This prenatal chromosomal microarray strategy detected all CNVs of clear prognostic value and did not miss any CNVs of clear clinical significance. This strategy avoided both the problems associated with interpreting CNVs of uncertain prognosis and the parental anxiety that are a result of such findings. PMID:24795849
NASA Astrophysics Data System (ADS)
Gómez González, A.; Fassois, S. D.
2016-03-01
The problem of vibration-based damage detection under varying environmental conditions and uncertainty is considered, and a novel, supervised, PCA-type statistical methodology is postulated. The methodology employs vibration data records from the healthy and damaged states of a structure under various environmental conditions. Unlike standard PCA-type methods in which a feature vector corresponding to the least important eigenvalues is formed in a single step, the postulated methodology uses supervised learning in which damaged-state data records are employed to sequentially form a feature vector by appending a transformed scalar element at a time under the condition that it optimally, among all remaining elements, improves damage detectability. This leads to the formulation of feature vectors with optimized sensitivity to damage, and thus high damage detectability. Within this methodology three particular methods, two non-parametric and one parametric, are formulated. These are validated and comparatively assessed via a laboratory case study focusing on damage detection on a scale wind turbine blade under varying temperature and the potential presence of sprayed water. Damage detection performance is shown to be excellent based on a single vibration response sensor and a limited frequency bandwidth.
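The sequential feature-building idea above can be sketched as greedy forward selection driven by a detectability score. This is a deliberate simplification: the paper appends transformed elements and evaluates detectability conditionally on what has already been selected, whereas the sketch below ranks features by a univariate Fisher ratio on synthetic healthy/damaged data.

```python
def fisher_ratio(healthy, damaged):
    """Univariate detectability score: squared mean shift over pooled variance."""
    n_h, n_d = len(healthy), len(damaged)
    m_h = sum(healthy) / n_h
    m_d = sum(damaged) / n_d
    v_h = sum((x - m_h) ** 2 for x in healthy) / n_h
    v_d = sum((x - m_d) ** 2 for x in damaged) / n_d
    return (m_h - m_d) ** 2 / (v_h + v_d + 1e-12)

def greedy_feature_order(healthy_rows, damaged_rows):
    """Append one feature at a time, each time choosing the remaining
    feature with the best damage detectability."""
    n_feat = len(healthy_rows[0])
    remaining = set(range(n_feat))
    order = []
    while remaining:
        best = max(remaining, key=lambda j: fisher_ratio(
            [r[j] for r in healthy_rows], [r[j] for r in damaged_rows]))
        order.append(best)
        remaining.remove(best)
    return order

# feature 2 shifts strongly under damage, feature 0 weakly, feature 1 not at all
healthy = [(0.0, 5.0, 1.0), (0.2, 5.1, 1.1), (-0.1, 4.9, 0.9), (0.1, 5.0, 1.0)]
damaged = [(0.5, 5.0, 3.0), (0.6, 5.1, 3.2), (0.4, 4.9, 2.9), (0.5, 5.0, 3.1)]
order = greedy_feature_order(healthy, damaged)
print(order)
```

The output ordering puts the most damage-sensitive feature first, which is the property the postulated methodology exploits to build feature vectors with optimized sensitivity to damage.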
Grimm, Lars J.; Ghate, Sujata V.; Yoon, Sora C.; Kim, Connie; Kuzmiak, Cherie M.; Mazurowski, Maciej A.
2014-03-15
Purpose: The purpose of this study is to explore Breast Imaging-Reporting and Data System (BI-RADS) features as predictors of individual errors made by trainees when detecting masses in mammograms. Methods: Ten radiology trainees and three expert breast imagers reviewed 100 mammograms comprising bilateral medial lateral oblique and craniocaudal views on a research workstation. The cases consisted of normal and biopsy-proven benign and malignant masses. For cases with actionable abnormalities, the experts recorded breast (density and axillary lymph nodes) and mass (shape, margin, and density) features according to the BI-RADS lexicon, as well as the abnormality location (depth and clock face). For each trainee, a user-specific multivariate model was constructed to predict the trainee's likelihood of error based on BI-RADS features. The performance of the models was assessed using the area under the receiver operating characteristic curve (AUC). Results: Despite the variability in errors between different trainees, the individual models were able to predict the likelihood of error for the trainees with a mean AUC of 0.611 (range: 0.502–0.739, 95% Confidence Interval: 0.543–0.680, p < 0.002). Conclusions: Patterns in detection errors for mammographic masses made by radiology trainees can be modeled using BI-RADS features. These findings may have potential implications for the development of future educational materials that are personalized to individual trainees.
Barbosa, Daniel J C; Ramos, Jaime; Correia, José Higino; Lima, Carlos S
2009-01-01
Traditional endoscopic methods do not allow the visualization of the entire Gastrointestinal (GI) tract. Wireless Capsule Endoscopy (CE) is a diagnostic procedure that overcomes this limitation of the traditional endoscopic methods. The CE video frames possess rich information about the condition of the stomach and intestine mucosa, encoded as color and texture patterns. It has long been known that human perception of texture is based on a multi-scale analysis of patterns, which can be modeled by multi-resolution approaches. Furthermore, modeling the covariance of textural descriptors has been successfully used in the classification of colonoscopy videos. Therefore, the present paper proposes a frame classification scheme based on statistical textural descriptors taken from the Discrete Curvelet Transform (DCT) domain, a recent multi-resolution mathematical tool. The DCT is based on an anisotropic notion of scale and high directional sensitivity in multiple directions, and is therefore suited to the characterization of complex patterns such as texture. The covariance of texture descriptors taken at a given detail level, in different angles, is used as the classification feature, in a scheme designated Color Curvelet Covariance. The classification step is performed by a multilayer perceptron neural network. The proposed method has been applied to real data taken from several capsule endoscopic exams and reaches 97.2% sensitivity and 97.4% specificity. These promising results support the feasibility of the proposed method.
Parker, S
2015-06-15
Purpose: To evaluate the ability of statistical process control methods to detect systematic errors when using a two dimensional (2D) detector array for routine electron beam energy verification. Methods: Electron beam energy constancy was measured using an aluminum wedge and a 2D diode array on four linear accelerators. Process control limits were established. Measurements were recorded in control charts and compared with both calculated process control limits and TG-142 recommended specification limits. The data were tested for normality, process capability and process acceptability. Additional measurements were recorded while systematic errors were intentionally introduced. Systematic errors included shifts in the alignment of the wedge, incorrect orientation of the wedge, and incorrect array calibration. Results: Control limits calculated for each beam were smaller than the recommended specification limits. Process capability and process acceptability ratios were greater than one in all cases. All data were normally distributed. Shifts in the alignment of the wedge were most apparent for low energies. The smallest shift (0.5 mm) was detectable using process control limits in some cases, while the largest shift (2 mm) was detectable using specification limits in only one case. The wedge orientation tested did not affect the measurements as this did not affect the thickness of aluminum over the detectors of interest. Array calibration dependence varied with energy and selected array calibration. 6 MeV was the least sensitive to array calibration selection while 16 MeV was the most sensitive. Conclusion: Statistical process control methods demonstrated that the data were normally distributed, the process was capable of meeting specifications, and the process was centered within the specification limits. Though not all systematic errors were distinguishable from random errors, process control limits increased the ability to detect systematic errors.
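The control-chart arithmetic behind such a study can be illustrated with a standard individuals (I-MR) chart, where the control limits sit at the centre line plus or minus 2.66 times the average moving range, together with a capability ratio Cp comparing the specification width to six process standard deviations. The readings and the ±2% specification window below are invented for illustration and are not the paper's data.

```python
import statistics

def individuals_limits(x):
    """Individuals (I-MR) chart: centre line +/- 2.66 * average moving range."""
    mr = [abs(b - a) for a, b in zip(x, x[1:])]     # successive moving ranges
    centre = statistics.mean(x)
    half_width = 2.66 * statistics.mean(mr)
    return centre - half_width, centre + half_width  # (LCL, UCL)

def capability(x, lsl, usl):
    """Process capability Cp: specification width over six sigma."""
    return (usl - lsl) / (6 * statistics.stdev(x))

# Hypothetical energy-constancy readings (ratio to baseline), with an
# illustrative +/-2% specification window about 1.00.
readings = [1.001, 0.999, 1.002, 0.998, 1.000, 1.001, 0.999, 1.000]
lcl, ucl = individuals_limits(readings)
print(round(lcl, 4), round(ucl, 4))          # control limits
print(capability(readings, 0.98, 1.02) > 1)  # capable process
```

Note the pattern the abstract reports: the calculated control limits are tighter than the specification limits, so a small systematic shift can breach the control limits while staying inside specification.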
Grossling, Bernardo F.
1975-01-01
Exploratory drilling is still in incipient or youthful stages in those areas of the world where the bulk of the potential petroleum resources is yet to be discovered. Methods of assessing resources from projections based on historical production and reserve data are limited to mature areas. For most of the world's petroleum-prospective areas, a more speculative situation calls for a critical review of resource-assessment methodology. The language of mathematical statistics is required to define more rigorously the appraisal of petroleum resources. Basically, two approaches have been used to appraise the amounts of undiscovered mineral resources in a geologic province: (1) projection models, which use statistical data on the past outcome of exploration and development in the province; and (2) estimation models of the overall resources of the province, which use certain known parameters of the province together with the outcome of exploration and development in analogous provinces. These two approaches often lead to widely different estimates. Some of the controversy that arises results from a confusion of the probabilistic significance of the quantities yielded by each of the two approaches. Also, inherent limitations of analytic projection models, such as those using the logistic and Gompertz functions, have often been ignored. The resource-assessment problem should be recast in terms that provide for consideration of the probability of existence of the resource and of the probability of discovery of a deposit. Then the two above-mentioned models occupy the two ends of the probability range. The new approach accounts for (1) what can be expected with reasonably high certainty by mere projections of what has been accomplished in the past; (2) the inherent biases of decision-makers and resource estimators; (3) upper bounds that can be set up as goals for exploration; and (4) the uncertainties in geologic conditions in a search for minerals. Actual outcomes can then
Georgouli, Konstantia; Martinez Del Rincon, Jesus; Koidis, Anastasios
2017-02-15
The main objective of this work was to develop a novel dimensionality reduction technique as a part of an integrated pattern recognition solution capable of identifying adulterants such as hazelnut oil in extra virgin olive oil at low percentages based on spectroscopic chemical fingerprints. A novel Continuous Locality Preserving Projections (CLPP) technique is proposed which allows the modelling of the continuous nature of the in-house-produced admixtures as data series instead of discrete points. The maintenance of the continuous structure of the data manifold enables better visualisation of the examined classification problem and facilitates more accurate utilisation of the manifold for detecting the adulterants. The performance of the proposed technique is validated with two different spectroscopic techniques (Raman and Fourier transform infrared, FT-IR). In all cases studied, CLPP accompanied by the k-Nearest Neighbors (kNN) algorithm was found to outperform other state-of-the-art pattern recognition techniques. PMID:27664692
NASA Astrophysics Data System (ADS)
Calderon, Christopher P.; Weiss, Lucien E.; Moerner, W. E.
2014-05-01
Experimental advances have improved the two- (2D) and three-dimensional (3D) spatial resolution that can be extracted from in vivo single-molecule measurements. This enables researchers to quantitatively infer the magnitude and directionality of forces experienced by biomolecules in their native environment. Situations where such force information is relevant range from mitosis to directed transport of protein cargo along cytoskeletal structures. Models commonly applied to quantify single-molecule dynamics assume that effective forces and velocity in the x,y (or x,y,z) directions are statistically independent, but this assumption is physically unrealistic in many situations. We present a hypothesis testing approach capable of determining if there is evidence of statistical dependence between positional coordinates in experimentally measured trajectories; if the hypothesis of independence between spatial coordinates is rejected, then a new model accounting for 2D (3D) interactions can and should be considered. Our hypothesis testing technique is robust, meaning it can detect interactions, even if the noise statistics are not well captured by the model. The approach is demonstrated on control simulations and on experimental data (directed transport of intraflagellar transport protein 88 homolog in the primary cilium).
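A minimal version of such an independence test can be built from the step increments of a trajectory: under the null hypothesis of independent coordinates, the Pearson correlation of the x- and y-increments, after a Fisher transformation, is approximately standard normal. The authors' actual method is likelihood-based and robust to noise misspecification; the sketch below only illustrates the idea on a simulated trajectory whose coordinates are deliberately coupled.

```python
import math
import random

def increment_correlation_test(xs, ys):
    """Pearson correlation between x- and y-step increments, plus a
    Fisher z-statistic for H0: the coordinates move independently."""
    dx = [b - a for a, b in zip(xs, xs[1:])]
    dy = [b - a for a, b in zip(ys, ys[1:])]
    n = len(dx)
    mx, my = sum(dx) / n, sum(dy) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(dx, dy)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in dx) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in dy) / n)
    r = cov / (sx * sy)
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - 3)
    return r, z  # |z| > 1.96 rejects independence at roughly the 5% level

random.seed(1)
# Simulated 2D random walk whose y-steps partly copy the x-steps.
xs, ys = [0.0], [0.0]
for _ in range(500):
    step = random.gauss(0, 1)
    xs.append(xs[-1] + step)
    ys.append(ys[-1] + 0.7 * step + random.gauss(0, 1))
r, z = increment_correlation_test(xs, ys)
print(abs(z) > 1.96)  # dependence detected
```

If the test rejects, the appropriate next step, as the abstract argues, is to fit a model with an explicit cross-coordinate coupling term rather than two independent 1D models.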
Arismendi, Ivan; Johnson, Sherri L.; Dunham, Jason
2015-01-01
Statistics of central tendency and dispersion may not capture relevant or desired characteristics of the distribution of continuous phenomena and, thus, they may not adequately describe temporal patterns of change. Here, we present two methodological approaches that can help to identify temporal changes in environmental regimes. First, we use higher-order statistical moments (skewness and kurtosis) to examine potential changes of empirical distributions at decadal extents. Second, we adapt a statistical procedure combining a non-metric multidimensional scaling technique and higher density region plots to detect potentially anomalous years. We illustrate the use of these approaches by examining long-term stream temperature data from minimally and highly human-influenced streams. In particular, we contrast predictions about thermal regime responses to changing climates and human-related water uses. Using these methods, we effectively diagnose years with unusual thermal variability and patterns in variability through time, as well as spatial variability linked to regional and local factors that influence stream temperature. Our findings highlight the complexity of responses of thermal regimes of streams and reveal their differential vulnerability to climate warming and human-related water uses. The two approaches presented here can be applied with a variety of other continuous phenomena to address historical changes, extreme events, and their associated ecological responses.
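The higher-order moments named above are easy to compute directly; the sketch below contrasts a symmetric series with one containing a single hot anomaly, showing how skewness flags a change that the mean alone would understate. The temperature values are invented for illustration.

```python
import math

def moments(x):
    """Sample skewness and excess kurtosis (population-style estimators)."""
    n = len(x)
    m = sum(x) / n
    s2 = sum((v - m) ** 2 for v in x) / n
    s = math.sqrt(s2)
    skew = sum((v - m) ** 3 for v in x) / (n * s ** 3)
    kurt = sum((v - m) ** 4 for v in x) / (n * s2 ** 2) - 3.0
    return skew, kurt

# Hypothetical decadal temperature samples (degrees C): one symmetric,
# one with a right tail caused by a single extreme warm year.
decade_a = [10, 11, 12, 13, 14, 15, 16]
decade_b = [10, 10, 11, 11, 12, 12, 25]
print(moments(decade_a)[0])      # symmetric: skewness is zero
print(moments(decade_b)[0] > 1)  # strong positive skew flags the anomaly
```

Both decades have the same mean (13 °C), which is exactly the situation the abstract describes: central-tendency statistics miss a change that the distribution's shape reveals.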
Cosmic statistics of statistics
NASA Astrophysics Data System (ADS)
Szapudi, István; Colombi, Stéphane; Bernardeau, Francis
1999-12-01
The errors on statistics measured in finite galaxy catalogues are exhaustively investigated. The theory of errors on factorial moments by Szapudi & Colombi is applied to cumulants via a series expansion method. All results are subsequently extended to the weakly non-linear regime. Together with previous investigations this yields an analytic theory of the errors for moments and connected moments of counts in cells from highly non-linear to weakly non-linear scales. For non-linear functions of unbiased estimators, such as the cumulants, the phenomenon of cosmic bias is identified and computed. Since it is subdued by the cosmic errors in the range of applicability of the theory, correction for it is inconsequential. In addition, the method of Colombi, Szapudi & Szalay concerning sampling effects is generalized, adapting the theory for inhomogeneous galaxy catalogues. While previous work focused on the variance only, the present article calculates the cross-correlations between moments and connected moments as well for a statistically complete description. The final analytic formulae representing the full theory are explicit but somewhat complicated. Therefore we have made available a fortran program capable of calculating the described quantities numerically (for further details e-mail SC at colombi@iap.fr). An important special case is the evaluation of the errors on the two-point correlation function, for which this should be more accurate than any method put forward previously. This tool will be immensely useful in the future for assessing the precision of measurements from existing catalogues, as well as aiding the design of new galaxy surveys. To illustrate the applicability of the results and to explore the numerical aspects of the theory qualitatively and quantitatively, the errors and cross-correlations are predicted under a wide range of assumptions for the future Sloan Digital Sky Survey. The principal results concerning the cumulants ξ, Q3 and Q4 is that
NASA Astrophysics Data System (ADS)
Lee, Lopaka; Helsel, Dennis
2007-05-01
Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data—perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply-censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
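The standard trick for applying right-censored Kaplan-Meier machinery to the left-censored data described above is to flip each concentration about a constant larger than any observation, which turns "below detection limit" values into right-censored ones. The bare-bones product-limit estimator below is a sketch of that idea, not the authors' S-language software; the concentrations and censoring flags are invented.

```python
def km_survival(times, observed):
    """Kaplan-Meier product-limit estimate for right-censored data:
    S(t) steps down by (1 - d/n_at_risk) at each observed event time."""
    events = sorted(t for t, obs in zip(times, observed) if obs)
    s, curve = 1.0, []
    for t in sorted(set(events)):
        n_at_risk = sum(1 for v in times if v >= t)
        d = events.count(t)
        s *= 1.0 - d / n_at_risk
        curve.append((t, s))
    return curve

# Hypothetical concentrations; True = reported only as "<DL" (left-censored).
conc     = [0.5, 1.2, 0.8, 2.0, 0.3, 1.5]
censored = [True, False, False, False, True, False]
FLIP = 10.0                            # any constant above the data range
flipped = [FLIP - c for c in conc]     # left-censoring becomes right-censoring
detected = [not c for c in censored]   # censoring direction reverses too
print(km_survival(flipped, detected))
```

Summary statistics estimated on the flipped scale are mapped back by subtracting from the flip constant; consistent with the abstract, the estimator never extrapolates below the smallest detection limit.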
Zhang, Han; Zhao, Yang-Yu; Song, Jing; Zhu, Qi-Ying; Yang, Hua; Zheng, Mei-Ling; Xuan, Zhao-Ling; Wei, Yuan; Chen, Yang; Yuan, Peng-Bo; Yu, Yang; Li, Da-Wei; Liang, Jun-Bin; Fan, Ling; Chen, Chong-Jian; Qiao, Jie
2015-01-01
Analyses of cell-free fetal DNA (cff-DNA) from maternal plasma using massively parallel sequencing enable the noninvasive detection of feto-placental chromosome aneuploidy; this technique has been widely used in clinics worldwide. Noninvasive prenatal tests (NIPT) based on cff-DNA have achieved very high accuracy; however, they suffer from maternal copy-number variations (CNV) that may cause false positives and false negatives. In this study, we developed an algorithm to exclude the effect of maternal CNV and refined the Z-score that is used to determine fetal aneuploidy. The simulation results showed that the algorithm is robust against variations of fetal concentration and maternal CNV size. We also introduced a method based on the discrepancy between feto-placental concentrations to help reduce the false-positive ratio. A total of 6615 pregnant women were enrolled in a prospective study to validate the accuracy of our method. All 106 fetuses with T21, 20 with T18, and three with T13 were tested using our method, with sensitivity of 100% and specificity of 99.97%. In the results, two cases with maternal duplications in chromosome 21, which were falsely predicted as T21 by the previous NIPT method, were correctly classified as normal by our algorithm, which demonstrated the effectiveness of our approach. PMID:26534864
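The effect of excluding maternal-CNV regions before forming the Z-score can be sketched as follows. The bin depths, CNV mask, and reference parameters are all hypothetical, and the paper's actual algorithm both detects the maternal CNVs and refines the Z-score in a more principled way; this toy only shows why an unmasked maternal duplication inflates the score.

```python
def z_score(bin_depths, cnv_mask, ref_mean, ref_sd):
    """Chromosome-representation Z-score computed over bins NOT flagged
    as lying inside a maternal CNV, so a duplication cannot inflate it."""
    kept = [d for d, in_cnv in zip(bin_depths, cnv_mask) if not in_cnv]
    frac = sum(kept) / len(kept)        # mean normalized depth, clean bins
    return (frac - ref_mean) / ref_sd

# Hypothetical normalized chromosome-21 bin depths; bins 3-4 sit inside
# a maternal duplication (depth ~1.5x) in a euploid pregnancy.
bins = [1.00, 1.01, 0.99, 1.52, 1.49, 1.00, 1.01]
mask = [False, False, False, True, True, False, False]
print(round(z_score(bins, mask, ref_mean=1.0, ref_sd=0.01), 2))
z_naive = z_score(bins, [False] * 7, ref_mean=1.0, ref_sd=0.01)
print(z_naive > 3)  # without masking, the duplication mimics trisomy 21
```

The masked score stays near zero while the naive score crosses the usual Z > 3 calling threshold, which is exactly the false-positive mechanism the study set out to remove.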
Kim, Tae Sun; Ko, Kwang Jin; Shin, Seung Jea; Ryoo, Hyun Soo; Song, Wan; Sung, Hyun Hwan; Han, Deok Hyun; Jeong, Byong Chang; Seo, Seong Il; Jeon, Seong Soo; Lee, Kyu Sung; Lee, Sung Won; Lee, Hyun Moo; Choi, Han Yong
2015-01-01
Purpose To investigate the differences in the cancer detection rate and pathological findings on a second prostate biopsy according to benign diagnosis, high-grade prostatic intraepithelial neoplasia (HGPIN), and atypical small acinar proliferation (ASAP) on first biopsy. Materials and Methods We retrospectively reviewed the records of 1,323 patients who underwent a second prostate biopsy between March 1995 and November 2012. We divided the patients into three groups according to the pathologic findings on the first biopsy (benign diagnosis, HGPIN, and ASAP). We compared the cancer detection rate and Gleason scores on second biopsy and the unfavorable disease rate after radical prostatectomy among the three groups. Results A total of 214 patients (16.2%) were diagnosed with prostate cancer on a second biopsy. The rate of cancer detection was 14.6% in the benign diagnosis group, 22.1% in the HGPIN group, and 32.1% in the ASAP group, respectively (p<0.001). When patients were divided into subgroups according to the number of positive cores, the rate of cancer detection was 16.7%, 30.5%, 31.0%, and 36.4% in patients with a single core of HGPIN, more than one core of HGPIN, a single core of ASAP, and more than one core of ASAP, respectively. There were no significant differences in Gleason scores on second biopsy (p=0.324) or in the unfavorable disease rate after radical prostatectomy among the three groups (benign diagnosis vs. HGPIN, p=0.857, and benign diagnosis vs. ASAP, p=0.957, respectively). Conclusions Patients with multiple cores of HGPIN or any core number of ASAP on a first biopsy had a significantly higher cancer detection rate on a second biopsy. Repeat biopsy should be considered and not be delayed in those patients. PMID:26682019
NASA Astrophysics Data System (ADS)
Coleman, N.; Abramson, L.
2004-05-01
Yucca Mt. (YM) is a potential repository site for high-level radioactive waste and spent fuel. One issue is the potential for future igneous activity to intersect the repository. If the event probability is <1E-8/yr, it need not be considered in licensing. Plio-Quaternary volcanos and older basalts occur near YM. Connor et al (JGR, 2000) estimate a probability of 1E-8/yr to 1E-7/yr for a basaltic dike to intersect the potential repository. Based on aeromagnetic data, Hill and Stamatakos (CNWRA, 2002) propose that additional volcanos may lie buried in nearby basins. They suggest if these volcanos are part of temporal-clustered volcanic activity, the probability of an intrusion may be as high as 1E-6/yr. We examine whether recurrence probabilities >2E-7/yr are realistic given that no dikes have been found in or above the 1.3E7 yr-old potential repository block. For 2E-7/yr (or 1E-6/yr), the expected number of penetrating dikes is 2.6 (respectively, 13), and the probability of at least one penetration is 0.93 (0.999). These results are not consistent with the exploration evidence. YM is one of the most intensively studied places on Earth. Over 20 yrs of studies have included surface and subsurface mapping, geophysical surveys, construction of 10+ km of tunnels in the mountain, drilling of many boreholes, and construction of many pits (DOE, Site Recommendation, 2002). It seems unlikely that multiple dikes could exist within the proposed repository footprint and escape detection. A dike complex dated 11.7 Ma (Smith et al, UNLV, 1997) or 10 Ma (Carr and Parrish, 1985) does exist NW of YM and west of the main Solitario Canyon Fault. These basalts intruded the Tiva Canyon Tuff (12.7 Ma) in an epoch of caldera-forming pyroclastic eruptions that ended millions of yrs ago. We would conclude that basaltic volcanism related to Miocene silicic volcanism may also have ended. Given the nondetection of dikes in the potential repository, we can use a Poisson model to estimate an
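The Poisson arithmetic quoted above is easy to reproduce: with intensity lam = rate * age, the expected dike count is lam and the probability of at least one penetration is 1 - exp(-lam). The short check below recovers the abstract's figures of 2.6 expected dikes and P = 0.93 for the 2E-7/yr rate.

```python
import math

def poisson_at_least_one(rate_per_yr, years):
    """Expected count and P(at least one event) under a Poisson model."""
    lam = rate_per_yr * years
    return lam, 1.0 - math.exp(-lam)

AGE = 1.3e7  # age of the potential repository block, years
for rate in (2e-7, 1e-6):
    lam, p = poisson_at_least_one(rate, AGE)
    print(rate, round(lam, 1), round(p, 3))
```

Since decades of intensive exploration found no dikes in the block, observing zero events when 2.6 (or 13) were expected is the abstract's basis for doubting recurrence rates at or above 2E-7/yr.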
Cloud vertical structures detected by lidar and its statistical results at HeRO site in Hefei, China
NASA Astrophysics Data System (ADS)
Sun, Lu; Liu, Dong; Wang, Zhien; Wang, Zhenzhu; Wu, Decheng; Bo, Guangyu; Wang, Yingjian
2014-11-01
Extensive studies have illustrated the importance of obtaining exact vertical structures of clouds and aerosols for satellite and relevant climate simulations. However, challenges remain, for example, in distinguishing clouds from aerosols at times. Accurate cloud vertical profiles are mainly determined by cloud bases and heights. Based on the ground-based lidar observations in Hefei Radiation Observatory (HeRO), the vertical structures of clouds and aerosols in the Hefei area (31.89°N, 117.17°E) during May 2012-May 2014 have been investigated. The results show that the cloud fraction in the autumn and winter is less than that in the summer and spring, and is largest in the spring followed by the summer. The cloud fractions in the autumn and winter are comparable. Low cloud accounts for the largest portion of the total. Compared with clouds at other heights, high cloud is least frequent in winter. Nearly 50% of the total vertical profiles can be detected by lidar as clouds, and the proportions of clouds at different heights seem stable from year to year. The fraction of low cloud is nearly 45%, medium cloud nearly 35%, and high cloud nearly 20%. In comparison with the results derived from CALIPSO, it is found that high cloud is often missed by the ground-based lidar, while low cloud is often missed in the satellite observations. A combination of ground-based and space-borne lidar could lead to more reliable results. Further analysis will be performed in future studies.
NASA Astrophysics Data System (ADS)
Hao, Q.; Fang, C.; Cao, W.; Chen, P. F.
2015-12-01
We improve our filament automated detection method which was proposed in our previous works. It is then applied to process the full disk Hα data mainly obtained by the Big Bear Solar Observatory from 1988 to 2013, spanning nearly three solar cycles. The butterfly diagrams of the filaments, showing the information of the filament area, spine length, tilt angle, and the barb number, are obtained. The variations of these features with the calendar year and the latitude band are analyzed. The drift velocities of the filaments in different latitude bands are calculated and studied. We also investigate the north-south (N-S) asymmetries of the filament numbers in total and in each subclass classified according to the filament area, spine length, and tilt angle. The latitudinal distribution of the filament number is found to be bimodal. About 80% of all the filaments have tilt angles within [0°, 60°]. For the filaments within latitudes lower (higher) than 50°, the northeast (northwest) direction is dominant in the northern hemisphere and the southeast (southwest) direction is dominant in the southern hemisphere. The latitudinal migrations of the filaments experience three stages with declining drift velocities in each of solar cycles 22 and 23, and it seems that the drift velocity is faster in shorter solar cycles. Most filaments in latitudes lower (higher) than 50° migrate toward the equator (polar region). The N-S asymmetry indices indicate that the southern hemisphere is the dominant hemisphere in solar cycle 22 and the northern hemisphere is the dominant one in solar cycle 23.
Cao, Mei; Yie, Shang-Mian; Wu, Sheng-Min; Chen, Shu; Lou, Be; He, Xu; Ye, Shang-Rong; Xie, Ke; Rao, Lin; Gao, En; Ye, Nai-Yao
2009-01-01
We previously demonstrated that the detection of circulating cancer cells (CCC) expressing survivin mRNA could provide valuable information for predicting recurrence in patients with breast, lung, gastric and colorectal carcinoma. The purpose of this study is to investigate whether the detection of survivin-expressing CCC in the peripheral blood is also useful for predicting recurrence in patients with esophageal squamous cell carcinoma (ESCC). Blood samples obtained from 108 ESCC patients and 75 healthy volunteers were quantitatively investigated by a technique that detected reverse transcription-polymerase chain reaction products using a hybridization-based enzyme-linked immunosorbent assay. Not all of the patients were available for the follow-up study. Only 48 patients who were treated with similar adjuvant therapy regimens were available and followed-up for 33 months after the initial assay test. Survivin-expressing CCC were detected in 51 (47.2%) patients. The presence of survivin-expressing CCC was found to be significantly associated with depth of invasion, vascular invasion, nodal status, and disease stages (P = 0.032, 0.019, 0.018, and 0.001, respectively). During the follow-up period, patients who had positive survivin expressions had a higher relapse rate and a shorter survival time than those who had negative survivin expressions (P = 0.002 and 0.016, respectively). Examination of survivin-expressing CCC could provide valuable information in the prediction of haematogenous recurrence as well as in the prognosis of ESCC. PMID:19521785
Ning, Lihua; Kan, Guizhen; Du, Wenkai; Guo, Shiwei; Wang, Qing; Zhang, Guozheng; Cheng, Hao; Yu, Deyue
2016-03-01
Tolerance to low-phosphorus soil is a desirable trait in soybean cultivars. Previous quantitative trait locus (QTL) studies for phosphorus-deficiency tolerance were mainly derived from bi-parental segregating populations, with few reports from natural populations. The objective of this study was to detect QTLs that regulate phosphorus-deficiency tolerance in soybean using an association mapping approach. Phosphorus-deficiency tolerance was evaluated according to five traits (plant shoot height, shoot dry weight, phosphorus concentration, phosphorus acquisition efficiency and use efficiency) comprising a conditional phenotype at the seedling stage. Association mapping of the conditional phenotype detected 19 SNPs, including 13 that were significantly associated with the five traits across two years. A novel cluster of SNPs associated with more than one trait, including three SNPs that consistently showed significant effects over both years, was detected on chromosome 3. All favorable alleles, which were determined based on the mean of conditional phenotypic values of each trait over the two years, could be pyramided into one cultivar through parental cross combination. The best three cross combinations were predicted with the aim of simultaneously improving phosphorus acquisition efficiency and use efficiency. These results will provide a thorough understanding of the genetic basis of phosphorus deficiency tolerance in soybean.
Ning, Lihua; Kan, Guizhen; Du, Wenkai; Guo, Shiwei; Wang, Qing; Zhang, Guozheng; Cheng, Hao; Yu, Deyue
2016-01-01
Tolerance to low-phosphorus soil is a desirable trait in soybean cultivars. Previous quantitative trait locus (QTL) studies of phosphorus-deficiency tolerance were mainly derived from bi-parental segregating populations, with few reports from natural populations. The objective of this study was to detect QTLs that regulate phosphorus-deficiency tolerance in soybean using an association mapping approach. Phosphorus-deficiency tolerance was evaluated according to five traits (plant shoot height, shoot dry weight, phosphorus concentration, phosphorus acquisition efficiency and use efficiency) comprising a conditional phenotype at the seedling stage. Association mapping of the conditional phenotype detected 19 SNPs, including 13 SNPs that were significantly associated with the five traits across two years. A novel cluster of SNPs associated with more than one trait, including three SNPs that consistently showed significant effects over two years, was detected on chromosome 3. All favorable alleles, which were determined based on the mean of conditional phenotypic values of each trait over the two years, could be pyramided into one cultivar through parental cross combination. The best three cross combinations were predicted with the aim of simultaneously improving phosphorus acquisition efficiency and use efficiency. These results will provide a thorough understanding of the genetic basis of phosphorus-deficiency tolerance in soybean. PMID:27162491
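The single-marker association scan described above can be illustrated with a minimal additive-model regression. Everything below is a hypothetical sketch: the sample size, genotype coding (0/1/2 minor-allele copies), and true SNP effect of 0.5 are invented, not the study's data or its statistical model.

```python
import math
import random

def snp_association(genotypes, phenotypes):
    """Regress phenotype on additively coded genotype (0/1/2);
    return the estimated allele effect and its t statistic."""
    n = len(genotypes)
    mx = sum(genotypes) / n
    my = sum(phenotypes) / n
    sxx = sum((g - mx) ** 2 for g in genotypes)
    sxy = sum((g - mx) * (y - my) for g, y in zip(genotypes, phenotypes))
    beta = sxy / sxx                      # estimated additive SNP effect
    alpha = my - beta * mx
    sse = sum((y - (alpha + beta * g)) ** 2
              for g, y in zip(genotypes, phenotypes))
    se = math.sqrt(sse / (n - 2) / sxx)   # standard error of beta
    return beta, beta / se

# Hypothetical data: 200 plants, one SNP with a true additive effect of 0.5
random.seed(0)
geno = [random.choice([0, 1, 2]) for _ in range(200)]
pheno = [0.5 * g + random.gauss(0, 1) for g in geno]
beta, t = snp_association(geno, pheno)
```

A real analysis would additionally correct for population structure and multiple testing across all SNPs, which this sketch omits.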
Pieralice, Francesca; Proietti, Raffaele; La Valle, Paola; Giorgi, Giordano; Mazzolena, Marco; Taramelli, Andrea; Nicoletti, Luisa
2014-12-01
The Marine Strategy Framework Directive (MSFD, 2008/56/EC) is focused on protection, preservation and restoration of the marine environment by achieving and maintaining Good Environmental Status (GES) by 2020. Within this context, this paper presents a methodological approach for fast and repeatable monitoring that allows quantitative assessment of seabed abrasion pressure due to recreational boat anchoring. The methodology consists of two steps: a semi-automatic procedure based on an algorithm for ship detection in SAR imagery, and a statistical model to obtain maps of the spatial and temporal distribution density of anchored boats. Ship detection processing has been performed on 36 ASAR VV-pol images of the Liguria test site for the three years 2008, 2009 and 2010. Starting from the pointwise distribution layer produced by ship detection in imagery, boat points have been subdivided into 4 areas where a constant distribution density has been assumed for the entire period 2008-2010. In the future, this methodology will also be applied to higher-resolution data from the Sentinel-1 mission, specifically designed for the operational needs of the European Programme Copernicus. PMID:25096752
Chan, Ian; Wells, William; Mulkern, Robert V; Haker, Steven; Zhang, Jianqing; Zou, Kelly H; Maier, Stephan E; Tempany, Clare M C
2003-09-01
A multichannel statistical classifier for detecting prostate cancer was developed and validated by combining information from three different magnetic resonance (MR) methodologies: T2-weighted, T2-mapping, and line scan diffusion imaging (LSDI). From these MR sequences, four different sets of image intensities were obtained: T2-weighted (T2W) from T2-weighted imaging, apparent diffusion coefficient (ADC) from LSDI, and proton density (PD) and T2 (T2 Map) from T2-mapping imaging. Manually segmented tumor labels from a radiologist, which were validated by biopsy results, served as tumor "ground truth." Textural features were extracted from the images using the co-occurrence matrix (CM) and discrete cosine transform (DCT). The anatomical location of voxels was described by a cylindrical coordinate system. A statistical jack-knife approach was used to evaluate our classifiers. Single-channel maximum likelihood (ML) classifiers were based on 1 of the 4 basic image intensities. Our multichannel classifiers, support vector machine (SVM) and Fisher linear discriminant (FLD), utilized five different sets of derived features. Each classifier generated a summary statistical map that indicated tumor likelihood in the peripheral zone (PZ) of the prostate gland. To assess classifier accuracy, the average areas under the receiver operator characteristic (ROC) curves over all subjects were compared. Our best FLD classifier achieved an average ROC area of 0.839 (+/-0.064), and our best SVM classifier achieved an average ROC area of 0.761 (+/-0.043). The T2W ML classifier, our best single-channel classifier, only achieved an average ROC area of 0.599 (+/-0.146). Compared to the best single-channel ML classifier, our best multichannel FLD and SVM classifiers have statistically superior ROC performance (P = 0.0003 and 0.0017, respectively) by pairwise two-sided t-tests. By integrating the information from multiple images and capturing the textural and anatomical features in tumor areas, summary
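The ROC-area comparison above can be sketched with the rank-based (Mann-Whitney) AUC estimator. The scores below are simulated under invented class separations, chosen only so that a "multichannel" classifier outperforms a "single-channel" one; they are not the study's classifiers or data.

```python
import random

def roc_auc(scores, labels):
    """AUC via the Mann-Whitney statistic: the probability that a
    randomly chosen positive case outscores a randomly chosen negative,
    counting ties as half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores: the "multichannel" classifier separates the
# classes more strongly than the "single-channel" one
random.seed(1)
labels = [1] * 50 + [0] * 50
multi = [random.gauss(1.5 if l else 0.0, 1.0) for l in labels]
single = [random.gauss(0.3 if l else 0.0, 1.0) for l in labels]
auc_multi = roc_auc(multi, labels)
auc_single = roc_auc(single, labels)
```

The paper compares per-subject AUCs with paired t-tests; this sketch only shows how a single AUC value is obtained from scores and labels.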
Hunt, N C; Ghosh, K M; Blain, A P; Rushton, S P; Longstaff, L M; Deehan, D J
2015-05-01
The aim of this study was to compare the maximum laxity conferred by the cruciate-retaining (CR) and posterior-stabilised (PS) Triathlon single-radius total knee arthroplasty (TKA) for anterior drawer, varus-valgus opening and rotation in eight cadaver knees through a defined arc of flexion (0° to 110°). The null hypothesis was that the limits of laxity of CR- and PS-TKAs are not significantly different. The investigation was undertaken in eight loaded cadaver knees undergoing subjective stress testing using a measurement rig. First the native knee was tested prior to preparation for CR-TKA, and subsequently for PS-TKA implantation. Surgical navigation was used to track maximal displacements/rotations at 0°, 30°, 60°, 90° and 110° of flexion. Mixed-effects modelling was used to define the behaviour of the TKAs. The laxity measured for the CR- and PS-TKAs revealed no statistically significant differences over the studied flexion arc for the two versions of TKA. Compared with the native knee, both TKAs exhibited slightly increased anterior drawer and decreased varus-valgus and internal-external rotational laxities. We believe further study is required to define the clinical states for which the additional constraint offered by a PS-TKA implant may be beneficial.
NASA Astrophysics Data System (ADS)
Temme, F. P.
1992-12-01
Realisation of the invariance properties of the p ⩽ 2 number-partitional inventory components of the 20-fold spin algebra associated with [A]_20 nuclear spin clusters under SU(2) × L_20 allows the mappings {[λ] → Γ} to be derived. In addition, recent general inner tensor product expressions under L_n, for n even (odd), also facilitate the evaluation of many higher [λ](L_20; p = 3) correlative mappings onto SU(3) ↓ SO(3) × L_20 ↓ T A_5 subduced symmetry from SU(2) duality, thus providing results that determine the nature of adapted NMR bases for both dodecahedrane and its d_20 analogue. The significance of this work lies in the pertinence of nuclear spin statistics to both selective MQ-NMR and to other spectroscopic aspects of cage clusters, e.g., [13C]_n, n = 20, 60, fullerenes. Mappings onto L_n irrep sets of specific p ⩽ 3 number partitions arise in the combinatorial treatment of {M_i t_i} Rota fields, defining scalar invariants in the context of Cayley algebra. Inclusion of the L_n group in the specific Racah chain for NMR symmetry gives rise to significant further physical insight.
Shanmugam, Nesan; Boerdlein, Annegret; Proff, Jochen; Ong, Peter; Valencia, Oswaldo; Maier, Sebastian K.G.; Bauer, Wolfgang R.; Paul, Vince; Sack, Stefan
2012-01-01
Aims Uncertainty exists over the importance of device-detected short-duration atrial arrhythmias. Continuous atrial diagnostics, through home monitoring (HM) technology (BIOTRONIK, Berlin, Germany), provides a unique opportunity to assess the frequency and quantity of atrial fibrillation (AF) episodes, defined as atrial high-rate events (AHRE). Methods and results Prospective data from 560 heart failure (HF) patients (age 67 ± 10 years, median ejection fraction 27%) with a cardiac resynchronization therapy (CRT) device capable of HM from two multi-centre studies were analysed. AHRE burden was defined as the duration of mode switch in a 24-h period with atrial rates of >180 beats/min for at least 1% of the day, or a total of 14 min per day. The primary endpoint was the incidence of a thromboembolic (TE) event. Secondary endpoints were cardiovascular death, hospitalization because of AF, or worsening HF. Over a median 370-day follow-up, AHRE occurred in 40% of patients, with 11 (2%) patients developing TE complications and a mortality rate of 4.3% (24 deaths, 16 with cardiovascular aetiology). Compared with patients without detected AHRE, patients with detected AHRE of >3.8 h over a day were nine times more likely to develop TE complications (P = 0.006). The majority of patients (73%) did not show a temporal association between the detected atrial episode and their adverse event, with a mean interval of 46.7 ± 71.9 days (range 0–194) before the TE complication. Conclusion In a high-risk cohort of HF patients, device-detected atrial arrhythmias are associated with an increased incidence of TE events. A cut-off point of 3.8 h over 24 h was associated with a significant increase in the event rate. Routine assessment of AHRE should be considered with other data when assessing stroke risk and considering anti-coagulation initiation, and should also prompt the optimization of cardioprotective HF therapy in CRT patients. PMID:21933802
NASA Technical Reports Server (NTRS)
Bowles, Roland L.; Buck, Bill K.
2009-01-01
The objective of the research developed and presented in this document was to statistically assess turbulence hazard detection performance employing airborne pulse Doppler radar systems. The FAA certification methodology for forward-looking airborne turbulence radars will require estimating the probabilities of missed and false hazard indications under operational conditions. Analytical approaches must be used due to the near impossibility of obtaining sufficient statistics experimentally. This report describes an end-to-end analytical technique for estimating these probabilities for Enhanced Turbulence (E-Turb) Radar systems under noise-limited conditions, for a variety of aircraft types, as defined in FAA TSO-C134. This technique provides one means, but not the only means, by which an applicant can demonstrate compliance with the FAA-directed ATDS Working Group performance requirements. Turbulence hazard algorithms were developed that derived predictive estimates of aircraft hazards from basic radar observables. These algorithms were designed to prevent false turbulence indications while accurately predicting areas of elevated turbulence risk to aircraft, passengers, and crew; they were successfully flight tested on a NASA B757-200 and a Delta Air Lines B737-800. Application of this methodology for calculating the probability of missed and false hazard indications, taking into account the effect of the various algorithms used, is demonstrated for representative transport aircraft and radar performance characteristics.
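The report derives missed- and false-indication probabilities analytically; purely as an illustration of what those two quantities mean, they can also be estimated by simulation under a toy noise model. The alert threshold, noise level, and hazard levels below are invented for the sketch and are not the report's values.

```python
import random

random.seed(2)
THRESHOLD = 1.0   # hazard alert issued when the estimate exceeds this
NOISE_SD = 0.3    # assumed noise on the radar's hazard estimate

def alert_probability(true_hazard, trials=100_000):
    """Monte Carlo estimate of the probability that a noisy hazard
    estimate crosses the alert threshold at a given true hazard level."""
    hits = sum(true_hazard + random.gauss(0.0, NOISE_SD) > THRESHOLD
               for _ in range(trials))
    return hits / trials

p_false = alert_probability(0.5)        # sub-threshold turbulence: false alerts
p_missed = 1 - alert_probability(1.5)   # hazardous turbulence: missed alerts
```

With Gaussian noise both probabilities could be written in closed form; the simulation is shown only because it mirrors how such probabilities are defined operationally.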
Shakir, Nabeel A.; George, Arvin K.; Siddiqui, M. Minhaj; Rothwax, Jason T.; Rais-Bahrami, Soroush; Stamatakis, Lambros; Su, Daniel; Okoro, Chinonyerem; Raskolnikov, Dima; Walton-Diaz, Annerleim; Simon, Richard; Turkbey, Baris; Choyke, Peter L.; Merino, Maria J.; Wood, Bradford J.; Pinto, Peter A.
2015-01-01
Purpose Prostate specific antigen sensitivity increases with lower threshold values but with a corresponding decrease in specificity. Magnetic resonance imaging/ultrasound targeted biopsy detects prostate cancer more efficiently and of higher grade than standard 12-core transrectal ultrasound biopsy but the optimal population for its use is not well defined. We evaluated the performance of magnetic resonance imaging/ultrasound targeted biopsy vs 12-core biopsy across a prostate specific antigen continuum. Materials and Methods We reviewed the records of all patients enrolled in a prospective trial who underwent 12-core transrectal ultrasound and magnetic resonance imaging/ultrasound targeted biopsies from August 2007 through February 2014. Patients were stratified by each of 4 prostate specific antigen cutoffs. The greatest Gleason score using either biopsy method was compared in and across groups as well as across the population prostate specific antigen range. Clinically significant prostate cancer was defined as Gleason 7 (4 + 3) or greater. Univariate and multivariate analyses were performed. Results A total of 1,003 targeted and 12-core transrectal ultrasound biopsies were performed, of which 564 diagnosed prostate cancer for a 56.2% detection rate. Targeted biopsy led to significantly more upgrading to clinically significant disease compared to 12-core biopsy. This trend increased more with increasing prostate specific antigen, specifically in patients with prostate specific antigen 4 to 10 and greater than 10 ng/ml. Prostate specific antigen 5.2 ng/ml or greater captured 90% of upgrading by targeted biopsy, corresponding to 64% of patients who underwent multiparametric magnetic resonance imaging and subsequent fusion biopsy. Conversely a greater proportion of clinically insignificant disease was detected by 12-core vs targeted biopsy overall. These differences persisted when controlling for potential confounders on multivariate analysis. Conclusions Prostate
Parra-Blanco, Adolfo; Nicolás-Pérez, David; Gimeno-García, Antonio; Grosso, Begoña; Jiménez, Alejandro; Ortega, Juan; Quintero, Enrique
2006-01-01
AIM: To compare the cleansing quality of polyethylene glycol electrolyte solution and sodium phosphate with different schedules of administration, and to evaluate whether the timing of the administration of bowel preparation affects the detection of polyps. METHODS: One hundred and seventy-seven consecutive outpatients scheduled for colonoscopy were randomized in one of four groups to receive polyethylene glycol electrolyte solution or oral sodium phosphate with two different timing schedules. Quality of cleansing, polyp detection, and tolerance were evaluated. RESULTS: Patients receiving polyethylene glycol or sodium phosphate on the same day as the colonoscopy, obtained good to excellent global cleansing scores more frequently than patients who received polyethylene glycol or sodium phosphate on the day prior to the procedure (P < 0.001). Flat lesions, but not flat adenomas, were more frequent in patients prepared on the same day (P = 0.02). CONCLUSION: The quality of colonic cleansing and the detection of flat lesions are significantly improved when the preparation is taken on the day of the colonoscopy. PMID:17036388
Lee, Dong Hoon; Nam, Jong Kil; Park, Sung Woo; Lee, Seung Soo; Han, Ji-Yeon; Lee, Sang Don; Lee, Joon Woo
2016-01-01
Purpose To compare prostate cancer detection rates between 12-core transrectal ultrasound-guided prostate biopsy (TRUS-Bx) and visually estimated multiparametric magnetic resonance imaging (mp-MRI)-targeted prostate biopsy (MRI-visual-Bx) for patients with a prostate specific antigen (PSA) level less than 10 ng/mL. Materials and Methods In total, 76 patients with PSA levels below 10 ng/mL underwent 3.0 Tesla mp-MRI and TRUS-Bx prospectively in 2014. In patients with abnormal lesions on mp-MRI, we performed additional MRI-visual-Bx. We compared pathologic results, including the rate of clinically significant prostate cancer cores (cancer length greater than 5 mm and/or any Gleason grade greater than 3 in the biopsy core). Results The mean PSA was 6.43 ng/mL. In total, 48 of 76 (63.2%) patients had abnormal lesions on mp-MRI, and 116 targeted biopsy cores, an average of 2.42 per patient, were taken. The overall detection rates of prostate cancer using TRUS-Bx and MRI-visual-Bx were 26/76 (34.2%) and 23/48 (47.9%), respectively. Comparing the pathologic results of TRUS-Bx and MRI-visual-Bx cores, the positive rates were 8.4% (77 of 912 cores) and 46.6% (54 of 116 cores), respectively (p<0.001). Mean cancer core lengths and mean cancer core percentages were 3.2 mm and 24.5%, respectively, in TRUS-Bx and 6.3 mm and 45.4% in MRI-visual-Bx (p<0.001). In addition, a Gleason score ≥7 was noted more frequently using MRI-visual-Bx (p=0.028). The detection rate of clinically significant prostate cancer was 27/77 (35.1%) and 40/54 (74.1%) for TRUS-Bx and MRI-visual-Bx, respectively (p<0.001). Conclusion MRI-visual-Bx showed better performance in the detection of clinically significant prostate cancer, compared to TRUS-Bx, among patients with a PSA level less than 10 ng/mL. PMID:26996553
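The reported p<0.001 for the clinically significant detection rates (27/77 vs 40/54) can be sanity-checked with a pooled two-proportion z-test. This is a sketch: the abstract does not state which test the authors actually used.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for comparing two detection rates using the
    pooled standard error (large-sample two-proportion test)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p2 - p1) / se

# Clinically significant cancer detection from the abstract:
# 27/77 cores (TRUS-Bx) vs 40/54 cores (MRI-visual-Bx)
z = two_proportion_z(27, 77, 40, 54)   # z of about 4.4, i.e. p < 0.001
```

A z statistic near 4.4 corresponds to a two-sided p-value well below 0.001, consistent with the abstract.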
Lima, Monique da Rocha Queiroz; Nogueira, Rita Maria Ribeiro; Filippis, Ana Maria Bispo de; Nunes, Priscila Conrado Guerra; Sousa, Carla Santos de; Silva, Manoela Heringer da; Santos, Flavia Barreto dos
2014-08-01
The secreted form of the dengue virus (DENV) nonstructural-1 (NS1) glycoprotein has been shown to be useful for the diagnosis of DENV infections in patients' serum samples. In a number of studies, the sensitivity of the commercially available DENV NS1 glycoprotein detection assays was higher against some DENV serotypes (DENV-1>DENV-3>DENV-2=DENV-4) than others, and was also lower using patients' serum samples with secondary versus primary DENV infections. In this study, 471 DENV-4 positive acute-phase patients' serum samples were selected from a large panel collected in Brazil from March 2011 to October 2012 by RT-PCR and/or virus isolation followed by serotype determination. The sera from primary (n=228) and secondary (n=238) DENV-4 infections were identified using IgM and IgG capture ELISAs. The sensitivity of a commercial DENV NS1 glycoprotein detection ELISA was then assessed when these serum samples were not pre-treated or were pre-treated by acid or heat dissociation prior to being tested. Acid and heat dissociation of patients' serum samples with primary and secondary DENV-4 infections significantly increased the sensitivity of the DENV NS1 glycoprotein detection ELISA from 54.4% to 77.2% (p<0.05) and 82% (p<0.05), and from 39.1% to 63.9% (p<0.05) and 73.1% (p<0.05), respectively. Treatment of DENV-infected patients' serum samples using a simple and rapid heat dissociation step (100°C for 5 min) was, therefore, shown to be very useful for increasing the sensitivity of the DENV NS1 glycoprotein detection ELISA using serum samples from either primary or secondary DENV-infected patients.
Blokland, M H; Van Tricht, E F; Van Rossum, H J; Sterk, S S; Nielen, M W F
2012-01-01
For years it has been suspected that natural hormones are illegally used as growth promoters in cattle in the European Union. Unfortunately, there is a lack of methods and criteria that can be used to detect the abuse of natural hormones and distinguish treated from non-treated animals. Pattern recognition of steroid profiles is a promising approach for tracing/detecting the abuse of natural hormones administered to cattle. Traditionally, steroids are analysed in urine as the free steroid after deconjugation of the glucuronide (and sulphate) conjugates. The disadvantage of this deconjugation is that valuable information about the steroid profile in the sample is lost. In this study we developed a method to analyse steroids at very low concentration levels (ng l(-1)) as the free steroid and as glucuronide and sulphate conjugates in urine samples. This method was used to determine concentrations of natural (pro)hormones in a large population (n = 620) of samples from male and female bovine animals and from bovine animals treated with testosterone-cypionate, estradiol-benzoate, dihydroepiandrosterone and pregnenolone. The data acquired were used to build a statistical model applying the multivariate technique 'Soft Independent Modeling of Class Analogy' (SIMCA). It is demonstrated that, by using this model, the results of the urine analysis can indicate which animals may have been illegally treated with natural (pro)hormones.
NASA Astrophysics Data System (ADS)
Blondeau-Patissier, David; Gower, James F. R.; Dekker, Arnold G.; Phinn, Stuart R.; Brando, Vittorio E.
2014-04-01
The need for more effective environmental monitoring of the open and coastal ocean has recently led to notable advances in satellite ocean color technology and algorithm research. Satellite ocean color sensors' data are widely used for the detection, mapping and monitoring of phytoplankton blooms because earth observation provides a synoptic view of the ocean, both spatially and temporally. Algal blooms are indicators of marine ecosystem health; thus, their monitoring is a key component of effective management of coastal and oceanic resources. Since the late 1970s, a wide variety of operational ocean color satellite sensors and algorithms have been developed. The comprehensive review presented in this article captures the details of the progress and discusses the advantages and limitations of the algorithms used with the multi-spectral ocean color sensors CZCS, SeaWiFS, MODIS and MERIS. Present challenges include overcoming the severe limitation of these algorithms in coastal waters and refining detection limits in various oceanic and coastal environments. To understand the spatio-temporal patterns of algal blooms and their triggering factors, it is essential to consider the possible effects of environmental parameters, such as water temperature, turbidity, solar radiation and bathymetry. Hence, this review will also discuss the use of statistical techniques and additional datasets derived from ecosystem models or other satellite sensors to characterize further the factors triggering or limiting the development of algal blooms in coastal and open ocean waters.
Experimental Mathematics and Computational Statistics
Bailey, David H.; Borwein, Jonathan M.
2009-04-30
The field of statistics has long been noted for techniques to detect patterns and regularities in numerical data. In this article we explore connections between statistics and the emerging field of 'experimental mathematics'. These include applications of experimental mathematics in statistics, as well as statistical methods applied to computational mathematics.
Jaspers, Veerle L B; Herzke, Dorte; Eulaers, Igor; Gillespie, Brenda W; Eens, Marcel
2013-02-01
Perfluoroalkyl substances (PFASs) were investigated in tail feathers and soft tissues (liver, muscle, preen gland and adipose tissue) of barn owl (Tyto alba) road-kill victims (n=15) collected in the province of Antwerp (Belgium). A major PFAS-producing facility is located in the Antwerp area, and levels of PFASs in biota from that region have been found to be very high in previous studies. We aimed to investigate for the first time the main sources of PFASs in feathers of a terrestrial bird species. Throughout this study, we have used statistical methods for left-censored data to cope with levels below the limit of detection (LOD), instead of traditional, potentially biased, substitution methods. Perfluorooctane sulfonate (PFOS) was detected in all tissues (range: 11 ng/g ww in muscle to 1208 ng/g ww in preen oil) and in tail feathers (<2.2-56.6 ng/g ww). Perfluorooctanoate (PFOA) was measured at high levels in feathers (<14-670 ng/g ww), but not in tissues (more than 50%
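The substitution bias that the left-censored methods above avoid can be illustrated with a small simulation. The distribution, sample size, and LOD below are assumptions chosen for the sketch, not the barn-owl data: replacing non-detects with LOD/2 shifts the estimated mean away from the truth whenever the censored values do not actually average LOD/2.

```python
import math
import random

random.seed(3)
LOD = 2.2  # assumed detection limit for the sketch, ng/g ww

# Simulated log-normal concentrations; anything below the LOD is a non-detect
true_vals = [math.exp(random.gauss(1.0, 0.8)) for _ in range(500)]
observed = [v if v >= LOD else None for v in true_vals]

# Traditional substitution: replace each non-detect with LOD/2
substituted = [v if v is not None else LOD / 2 for v in observed]
mean_sub = sum(substituted) / len(substituted)
true_mean = sum(true_vals) / len(true_vals)
# Here mean_sub underestimates true_mean, illustrating substitution bias;
# proper left-censored estimators (e.g. Kaplan-Meier-type) avoid this choice.
```

The direction and size of the bias depend on the unknown distribution below the LOD, which is exactly why substitution is described as potentially biased.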
Gnanapragasam, V. J.; Burling, K.; George, A.; Stearn, S.; Warren, A.; Barrett, T.; Koo, B.; Gallagher, F. A.; Doble, A.; Kastner, C.; Parker, R. A.
2016-01-01
Both multi-parametric MRI (mpMRI) and the Prostate Health Index (PHI) have shown promise in predicting a positive biopsy in men with suspected prostate cancer. Here we investigated the value o