2009-02-01
range of modal analysis and the high frequency region of statistical energy analysis, is referred to as the mid-frequency range. The corresponding...predictions. The averaging process is consistent with the averaging done in statistical energy analysis for stochastic systems. The FEM will always
Statistical energy analysis computer program, user's guide
NASA Technical Reports Server (NTRS)
Trudell, R. W.; Yano, L. I.
1981-01-01
A high frequency random vibration analysis (the statistical energy analysis (SEA) method) is examined. The SEA method accomplishes high frequency prediction of arbitrary structural configurations. A general SEA computer program is described. A summary of SEA theory, example problems of SEA program application, and a complete program listing are presented.
The Shock and Vibration Bulletin. Part 2. Invited Papers, Structural Dynamics
1974-08-01
VIKING LANDER DYNAMICS 41 Mr. Joseph C. Pohlen, Martin Marietta Aerospace, Denver, Colorado Structural Dynamics PERFORMANCE OF STATISTICAL ENERGY ANALYSIS 47...aerospace structures. Analytical prediction of these environments is beyond the current scope of classical modal techniques. Statistical energy analysis methods...have been developed that circumvent the difficulties of high-frequency modal analysis. These statistical energy analysis methods are evaluated
Numerical Analysis of Stochastic Dynamical Systems in the Medium-Frequency Range
2003-02-01
frequency vibration analysis such as the statistical energy analysis (SEA), the traditional modal analysis (well-suited for high and low frequency...that the first few structural normal modes primarily constitute the total response. In the higher frequency range, the statistical energy analysis (SEA
Harris, Alex; Reeder, Rachelle; Hyun, Jenny
2011-01-01
The authors surveyed 21 editors and reviewers from major psychology journals to identify and describe the statistical and design errors they encounter most often and to get their advice regarding prevention of these problems. Content analysis of the text responses revealed themes in 3 major areas: (a) problems with research design and reporting (e.g., lack of an a priori power analysis, lack of congruence between research questions and study design/analysis, failure to adequately describe statistical procedures); (b) inappropriate data analysis (e.g., improper use of analysis of variance, too many statistical tests without adjustments, inadequate strategy for addressing missing data); and (c) misinterpretation of results. If researchers attended to these common methodological and analytic issues, the scientific quality of manuscripts submitted to high-impact psychology journals might be significantly improved.
Development of Composite Materials with High Passive Damping Properties
2006-05-15
frequency response function analysis. Sound transmission through sandwich panels was studied using the statistical energy analysis (SEA). Modal density...2.2.3 Finite element models 14 2.2.4 Statistical energy analysis method 15 CHAPTER 3 ANALYSIS OF DAMPING IN SANDWICH MATERIALS. 24 3.1 Equation of...sheets and the core. 2.2.4 Statistical energy analysis method Finite element models are generally only efficient for problems at low and middle frequencies
Network Data: Statistical Theory and New Models
2016-02-17
During this period of review, Bin Yu worked on many thrusts of high-dimensional statistical theory and methodologies. Her...research covered a wide range of topics in statistics including analysis and methods for spectral clustering for sparse and structured networks...2,7,8,21], sparse modeling (e.g. Lasso) [4,10,11,17,18,19], statistical guarantees for the EM algorithm [3], statistical analysis of algorithm leveraging
Statistical Symbolic Execution with Informed Sampling
NASA Technical Reports Server (NTRS)
Filieri, Antonio; Pasareanu, Corina S.; Visser, Willem; Geldenhuys, Jaco
2014-01-01
Symbolic execution techniques have been proposed recently for the probabilistic analysis of programs. These techniques seek to quantify the likelihood of reaching program events of interest, e.g., assert violations. They have many promising applications but have scalability issues due to high computational demand. To address this challenge, we propose a statistical symbolic execution technique that performs Monte Carlo sampling of the symbolic program paths and uses the obtained information for Bayesian estimation and hypothesis testing with respect to the probability of reaching the target events. To speed up the convergence of the statistical analysis, we propose Informed Sampling, an iterative symbolic execution that first explores the paths that have high statistical significance, prunes them from the state space and guides the execution towards less likely paths. The technique combines Bayesian estimation with a partial exact analysis for the pruned paths leading to provably improved convergence of the statistical analysis. We have implemented statistical symbolic execution with informed sampling in the Symbolic PathFinder tool. We show experimentally that the informed sampling obtains more precise results and converges faster than a purely statistical analysis and may also be more efficient than an exact symbolic analysis. When the latter does not terminate, symbolic execution with informed sampling can give meaningful results under the same time and memory limits.
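As a rough, self-contained illustration of the Bayesian-estimation step described in the abstract above (not the authors' Symbolic PathFinder implementation), the sketch below treats each Monte Carlo exploration of a path condition as a Bernoulli trial and updates a Beta prior on the probability of reaching the target event; the toy path predicate, the Beta(1,1) prior, and the normal-approximation interval are assumptions made for illustration.

```python
import math
import random

def estimate_reach_probability(run_path, trials=10_000, alpha=1.0, beta=1.0, seed=0):
    """Estimate the probability that a program run reaches a target event.

    Each Monte Carlo trial reports True (target reached) or False.
    With a Beta(alpha, beta) prior, the posterior after h hits in n trials
    is Beta(alpha + h, beta + n - h); its mean is the Bayesian estimate.
    """
    rng = random.Random(seed)
    hits = sum(run_path(rng) for _ in range(trials))
    post_alpha = alpha + hits
    post_beta = beta + trials - hits
    mean = post_alpha / (post_alpha + post_beta)
    # Normal approximation to the posterior gives a rough credible interval.
    var = (post_alpha * post_beta) / ((post_alpha + post_beta) ** 2 * (post_alpha + post_beta + 1))
    half_width = 1.96 * math.sqrt(var)
    return mean, (mean - half_width, mean + half_width)

# Toy "program": the target is reached when two independent inputs both exceed 0.9.
if __name__ == "__main__":
    prob, ci = estimate_reach_probability(lambda rng: rng.random() > 0.9 and rng.random() > 0.9)
    print(f"estimated reach probability: {prob:.4f}, ~95% credible interval: {ci}")
```

Informed sampling would additionally prune exactly analyzed high-probability paths and add their known probability mass to the estimate, which is what drives the faster convergence reported in the abstract.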
Statistical analysis of 59 inspected SSME HPFTP turbine blades (uncracked and cracked)
NASA Technical Reports Server (NTRS)
Wheeler, John T.
1987-01-01
The numerical results of statistical analysis of the test data of Space Shuttle Main Engine high pressure fuel turbopump second-stage turbine blades, including some with cracks, are presented. Several statistical methods are applied to the test data to assess differences in frequency variations between the uncracked and cracked blades.
NASA Technical Reports Server (NTRS)
Grosveld, Ferdinand W.; Schiller, Noah H.; Cabell, Randolph H.
2011-01-01
Comet Enflow is a commercially available, high frequency vibroacoustic analysis software founded on Energy Finite Element Analysis (EFEA) and Energy Boundary Element Analysis (EBEA). Energy Finite Element Analysis (EFEA) was validated on a floor-equipped composite cylinder by comparing EFEA vibroacoustic response predictions with Statistical Energy Analysis (SEA) and experimental results. Statistical Energy Analysis (SEA) predictions were made using the commercial software program VA One 2009 from ESI Group. The frequency region of interest for this study covers the one-third octave bands with center frequencies from 100 Hz to 4000 Hz.
Optimal Regulation of Structural Systems with Uncertain Parameters.
1981-02-02
been addressed, in part, by Statistical Energy Analysis. Motivated by a concern with high frequency vibration and acoustical-structural...Parameter Systems," AFOSR-TR-79-0753 (May, 1979). 25. R. H. Lyon, Statistical Energy Analysis of Dynamical Systems: Theory and Applications, (M.I.T...Press, Cambridge, Mass., 1975). 26. E. E. Ungar, "Statistical Energy Analysis of Vibrating Systems," Trans. ASME, J. Eng. Ind. 89, 626 (1967). 139 27
An analysis of the relationship of flight hours and naval rotary wing aviation mishaps
2017-03-01
evidence to support indicators used for sequestration, high flight hours, night flight, and overwater flight had statistically significant effects on...
Statistical Analysis on the Mechanical Properties of Magnesium Alloys
Liu, Ruoyu; Jiang, Xianquan; Zhang, Hongju; Zhang, Dingfei; Wang, Jingfeng; Pan, Fusheng
2017-01-01
Knowledge of statistical characteristics of mechanical properties is very important for the practical application of structural materials. Unfortunately, the scatter characteristics of magnesium alloys for mechanical performance remain poorly understood until now. In this study, the mechanical reliability of magnesium alloys is systematically estimated using Weibull statistical analysis. Interestingly, the Weibull modulus, m, of strength for magnesium alloys is as high as that for aluminum and steels, confirming the very high reliability of magnesium alloys. The high predictability in the tensile strength of magnesium alloys represents the capability of preventing catastrophic premature failure during service, which is essential for safety and reliability assessment. PMID:29113116
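For readers unfamiliar with the Weibull framework used above, a minimal statement of the two-parameter model and the standard linearization from which the Weibull modulus m is usually estimated (the specific estimation choices of the cited study are not reproduced here) is:

```latex
F(\sigma) = 1 - \exp\!\left[-\left(\frac{\sigma}{\sigma_{0}}\right)^{m}\right],
\qquad
\ln\!\left[\ln\frac{1}{1 - F(\sigma)}\right] = m\,\ln\sigma - m\,\ln\sigma_{0},
```

where F(σ) is the failure probability at stress σ, σ0 is the characteristic strength, and m (the slope of the linearized plot) quantifies scatter: a higher m means tighter, more predictable strength values.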
NAUSEA and the Principle of Supplementarity of Damping and Isolation in Noise Control.
1980-02-01
New approaches and uses of the statistical energy analysis (NAUSEA) have been considered and developed in recent months. The advances were made...possible in that the requirement, in the olde statistical energy analysis, that the dynamic systems be highly reverberant and the couplings between the...analytical consideration in terms of the statistical energy analysis (SEA). A brief discussion and simple examples that relate to these recent advances
Noise Reduction in High-Throughput Gene Perturbation Screens
USDA-ARS?s Scientific Manuscript database
Motivation: Accurate interpretation of perturbation screens is essential for a successful functional investigation. However, the screened phenotypes are often distorted by noise, and their analysis requires specialized statistical analysis tools. The number and scope of statistical methods available...
Wu, Robert; Glen, Peter; Ramsay, Tim; Martel, Guillaume
2014-06-28
Observational studies dominate the surgical literature. Statistical adjustment is an important strategy to account for confounders in observational studies. Research has shown that published articles are often poor in statistical quality, which may jeopardize their conclusions. The Statistical Analyses and Methods in the Published Literature (SAMPL) guidelines have been published to help establish standards for statistical reporting. This study will seek to determine whether the quality of statistical adjustment and the reporting of these methods are adequate in surgical observational studies. We hypothesize that incomplete reporting will be found in all surgical observational studies, and that the quality and reporting of these methods will be of lower quality in surgical journals when compared with medical journals. Finally, this work will seek to identify predictors of high-quality reporting. This work will examine the top five general surgical and medical journals, based on a 5-year impact factor (2007-2012). All observational studies investigating an intervention related to an essential component area of general surgery (defined by the American Board of Surgery), with an exposure, outcome, and comparator, will be included in this systematic review. Essential elements related to statistical reporting and quality were extracted from the SAMPL guidelines and include domains such as intent of analysis, primary analysis, multiple comparisons, numbers and descriptive statistics, association and correlation analyses, linear regression, logistic regression, Cox proportional hazard analysis, analysis of variance, survival analysis, propensity analysis, and independent and correlated analyses. Each article will be scored as a proportion based on fulfilling criteria in relevant analyses used in the study. A logistic regression model will be built to identify variables associated with high-quality reporting. A comparison will be made between the scores of surgical observational studies published in medical versus surgical journals. Secondary outcomes will pertain to individual domains of analysis. Sensitivity analyses will be conducted. This study will explore the reporting and quality of statistical analyses in surgical observational studies published in the most referenced surgical and medical journals in 2013 and examine whether variables (including the type of journal) can predict high-quality reporting.
Reif, David M.; Israel, Mark A.; Moore, Jason H.
2007-01-01
The biological interpretation of gene expression microarray results is a daunting challenge. For complex diseases such as cancer, wherein the body of published research is extensive, the incorporation of expert knowledge provides a useful analytical framework. We have previously developed the Exploratory Visual Analysis (EVA) software for exploring data analysis results in the context of annotation information about each gene, as well as biologically relevant groups of genes. We present EVA as a flexible combination of statistics and biological annotation that provides a straightforward visual interface for the interpretation of microarray analyses of gene expression in the most commonly occurring class of brain tumors, glioma. We demonstrate the utility of EVA for the biological interpretation of statistical results by analyzing publicly available gene expression profiles of two important glial tumors. The results of a statistical comparison between 21 malignant, high-grade glioblastoma multiforme (GBM) tumors and 19 indolent, low-grade pilocytic astrocytomas were analyzed using EVA. By using EVA to examine the results of a relatively simple statistical analysis, we were able to identify tumor class-specific gene expression patterns having both statistical and biological significance. Our interactive analysis highlighted the potential importance of genes involved in cell cycle progression, proliferation, signaling, adhesion, migration, motility, and structure, as well as candidate gene loci on a region of Chromosome 7 that has been implicated in glioma. Because EVA does not require statistical or computational expertise and has the flexibility to accommodate any type of statistical analysis, we anticipate EVA will prove a useful addition to the repertoire of computational methods used for microarray data analysis. EVA is available at no charge to academic users and can be found at http://www.epistasis.org. PMID:19390666
Kim, Sung-Min; Choi, Yosoon
2017-01-01
To develop appropriate measures to prevent soil contamination in abandoned mining areas, an understanding of the spatial variation of the potentially toxic trace elements (PTEs) in the soil is necessary. For the purpose of effective soil sampling, this study uses hot spot analysis, which calculates a z-score based on the Getis-Ord Gi* statistic to identify a statistically significant hot spot sample. To constitute a statistically significant hot spot, a feature with a high value should also be surrounded by other features with high values. Using relatively cost- and time-effective portable X-ray fluorescence (PXRF) analysis, sufficient input data are acquired from the Busan abandoned mine and used for hot spot analysis. To calibrate the PXRF data, which have a relatively low accuracy, the PXRF analysis data are transformed using the inductively coupled plasma atomic emission spectrometry (ICP-AES) data. The transformed PXRF data of the Busan abandoned mine are classified into four groups according to their normalized content and z-scores: high content with a high z-score (HH), high content with a low z-score (HL), low content with a high z-score (LH), and low content with a low z-score (LL). The HL and LH cases may be due to measurement errors. Additional or complementary surveys are required for the areas surrounding these suspect samples or for significant hot spot areas. The soil sampling is conducted according to a four-phase procedure in which the hot spot analysis and proposed group classification method are employed to support the development of a sampling plan for the following phase. Overall, 30, 50, 80, and 100 samples are investigated and analyzed in phases 1–4, respectively. The method implemented in this case study may be utilized in the field for the assessment of statistically significant soil contamination and the identification of areas for which an additional survey is required. PMID:28629168
Kim, Sung-Min; Choi, Yosoon
2017-06-18
To develop appropriate measures to prevent soil contamination in abandoned mining areas, an understanding of the spatial variation of the potentially toxic trace elements (PTEs) in the soil is necessary. For the purpose of effective soil sampling, this study uses hot spot analysis, which calculates a z-score based on the Getis-Ord Gi* statistic to identify a statistically significant hot spot sample. To constitute a statistically significant hot spot, a feature with a high value should also be surrounded by other features with high values. Using relatively cost- and time-effective portable X-ray fluorescence (PXRF) analysis, sufficient input data are acquired from the Busan abandoned mine and used for hot spot analysis. To calibrate the PXRF data, which have a relatively low accuracy, the PXRF analysis data are transformed using the inductively coupled plasma atomic emission spectrometry (ICP-AES) data. The transformed PXRF data of the Busan abandoned mine are classified into four groups according to their normalized content and z-scores: high content with a high z-score (HH), high content with a low z-score (HL), low content with a high z-score (LH), and low content with a low z-score (LL). The HL and LH cases may be due to measurement errors. Additional or complementary surveys are required for the areas surrounding these suspect samples or for significant hot spot areas. The soil sampling is conducted according to a four-phase procedure in which the hot spot analysis and proposed group classification method are employed to support the development of a sampling plan for the following phase. Overall, 30, 50, 80, and 100 samples are investigated and analyzed in phases 1-4, respectively. The method implemented in this case study may be utilized in the field for the assessment of statistically significant soil contamination and the identification of areas for which an additional survey is required.
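The content/z-score grouping described above can be sketched in a few lines; the 1.96 significance threshold and the 0.5 split on normalized content are illustrative assumptions, not values taken from the study.

```python
import numpy as np

def classify_hotspots(content, z_scores, z_crit=1.96, content_split=0.5):
    """Label samples by normalized element content and Getis-Ord Gi* z-score.

    HH: high content, significant hot spot    LH: low content, significant hot spot
    HL: high content, not significant         LL: low content, not significant
    """
    content = np.asarray(content, dtype=float)
    z_scores = np.asarray(z_scores, dtype=float)
    span = content.max() - content.min()
    norm = (content - content.min()) / span if span > 0 else np.zeros_like(content)
    high_content = norm >= content_split      # assumed split point, not from the paper
    hot = z_scores >= z_crit                  # assumed significance threshold
    labels = np.where(high_content & hot, "HH",
             np.where(high_content, "HL",
             np.where(hot, "LH", "LL")))
    return labels

# Example with made-up PXRF-like readings and z-scores.
print(classify_hotspots([120, 340, 80, 400, 150], [0.3, 2.5, 2.1, 1.0, -0.8]))
```

In the study's workflow, the HL and LH labels flag suspect samples for re-measurement or a complementary survey before the next sampling phase.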
Zhu, Yun; Fan, Ruzong; Xiong, Momiao
2017-01-01
Investigating the pleiotropic effects of genetic variants can increase statistical power, provide important information to achieve deep understanding of the complex genetic structures of disease, and offer powerful tools for designing effective treatments with fewer side effects. However, the current multiple phenotype association analysis paradigm lacks breadth (number of phenotypes and genetic variants jointly analyzed at the same time) and depth (hierarchical structure of phenotype and genotypes). A key issue for high dimensional pleiotropic analysis is to effectively extract informative internal representation and features from high dimensional genotype and phenotype data. To explore correlation information of genetic variants, effectively reduce data dimensions, and overcome critical barriers in advancing the development of novel statistical methods and computational algorithms for genetic pleiotropic analysis, we proposed a new statistical method referred to as a quadratically regularized functional CCA (QRFCCA) for association analysis, which combines three approaches: (1) quadratically regularized matrix factorization, (2) functional data analysis and (3) canonical correlation analysis (CCA). Large-scale simulations show that the QRFCCA has a much higher power than that of the ten competing statistics while retaining the appropriate type 1 errors. To further evaluate performance, the QRFCCA and ten other statistics are applied to the whole genome sequencing dataset from the TwinsUK study. We identify a total of 79 genes with rare variants and 67 genes with common variants significantly associated with the 46 traits using QRFCCA. The results show that the QRFCCA substantially outperforms the ten other statistics. PMID:29040274
The prior statistics of object colors.
Koenderink, Jan J
2010-02-01
The prior statistics of object colors is of much interest because extensive statistical investigations of reflectance spectra reveal highly non-uniform structure in color space common to several very different databases. This common structure is due to the visual system rather than to the statistics of environmental structure. Analysis involves an investigation of the proper sample space of spectral reflectance factors and of the statistical consequences of the projection of spectral reflectances on the color solid. Even in the case of reflectance statistics that are translationally invariant with respect to the wavelength dimension, the statistics of object colors is highly non-uniform. The qualitative nature of this non-uniformity is due to trichromacy.
NASA Astrophysics Data System (ADS)
Ridgley, James Alexander, Jr.
This dissertation is an exploratory quantitative analysis of various independent variables to determine their effect on the professional longevity (years of service) of high school science teachers in the state of Florida for the academic years 2011-2012 to 2013-2014. Data are collected from the Florida Department of Education, National Center for Education Statistics, and the National Assessment of Educational Progress databases. The following research hypotheses are examined: H1 - There are statistically significant differences in Level 1 (teacher variables) that influence the professional longevity of a high school science teacher in Florida. H2 - There are statistically significant differences in Level 2 (school variables) that influence the professional longevity of a high school science teacher in Florida. H3 - There are statistically significant differences in Level 3 (district variables) that influence the professional longevity of a high school science teacher in Florida. H4 - When tested in a hierarchical multiple regression, there are statistically significant differences in Level 1, Level 2, or Level 3 that influence the professional longevity of a high school science teacher in Florida. The professional longevity of a Floridian high school science teacher is the dependent variable. The independent variables are: (Level 1) a teacher's sex, age, ethnicity, earned degree, salary, number of schools taught in, migration count, and various years of service in different areas of education; (Level 2) a school's geographic location, residential population density, average class size, charter status, and SES; and (Level 3) a school district's average SES and average spending per pupil. Statistical analyses of exploratory MLRs and a HMR are used to support the research hypotheses. The final results of the HMR analysis show a teacher's age, salary, earned degree (unknown, associate, and doctorate), and ethnicity (Hispanic and Native Hawaiian/Pacific Islander); a school's charter status; and a school district's average SES are all significant predictors of a Florida high school science teacher's professional longevity. Although statistically significant in the initial exploratory MLR analyses, a teacher's ethnicity (Asian and Black), a school's geographic location (city and rural), and a school's SES are not statistically significant in the final HMR model.
Spatiotemporal Analysis of the Ebola Hemorrhagic Fever in West Africa in 2014
NASA Astrophysics Data System (ADS)
Xu, M.; Cao, C. X.; Guo, H. F.
2017-09-01
Ebola hemorrhagic fever (EHF) is an acute hemorrhagic disease caused by the Ebola virus, which is highly contagious. This paper aimed to explore the possible gathering area of EHF cases in West Africa in 2014, and identify endemic areas and their tendency by means of time-space analysis. We mapped the distribution of EHF incidences and explored statistically significant space, time and space-time disease clusters. We utilized hotspot analysis to find the spatial clustering pattern on the basis of the actual outbreak cases. Spatial-temporal cluster analysis is used to analyze the spatial or temporal distribution of agglomeration disease and examine whether its distribution is statistically significant. Local clusters were investigated using Kulldorff's scan statistic approach. The result reveals that the epidemic mainly gathered in the western part of Africa near the North Atlantic with obvious regional distribution. For the current epidemic, we have found areas with a high incidence of EVD by means of spatial cluster analysis.
ERIC Educational Resources Information Center
Kadel, Robert
2004-01-01
To her surprise, Ms. Logan had just conducted a statistical analysis of her 10th grade biology students' quiz scores. The results indicated that she needed to reinforce mitosis before the students took the high-school proficiency test in three weeks, as required by the state. "Oh! That's easy!" She exclaimed. Teachers like Ms. Logan are…
Active Structural Acoustic Control as an Approach to Acoustic Optimization of Lightweight Structures
2001-06-01
appropriate approach based on Statistical Energy Analysis (SEA) would facilitate investigations of the structural behavior at a high modal density. On the way...higher frequency investigations an approach based on the Statistical Energy Analysis (SEA) is recommended to describe the structural dynamic behavior
TRAPR: R Package for Statistical Analysis and Visualization of RNA-Seq Data.
Lim, Jae Hyun; Lee, Soo Youn; Kim, Ju Han
2017-03-01
High-throughput transcriptome sequencing, also known as RNA sequencing (RNA-Seq), is a standard technology for measuring gene expression with unprecedented accuracy. Numerous Bioconductor packages have been developed for the statistical analysis of RNA-Seq data. However, these tools focus on specific aspects of the data analysis pipeline, and are difficult to appropriately integrate with one another due to their disparate data structures and processing methods. They also lack visualization methods to confirm the integrity of the data and the process. In this paper, we propose an R-based RNA-Seq analysis pipeline called TRAPR, an integrated tool that facilitates the statistical analysis and visualization of RNA-Seq expression data. TRAPR provides various functions for data management, the filtering of low-quality data, normalization, transformation, statistical analysis, data visualization, and result visualization that allow researchers to build customized analysis pipelines.
Content analysis to detect high stress in oral interviews and text documents
NASA Technical Reports Server (NTRS)
Thirumalainambi, Rajkumar (Inventor); Jorgensen, Charles C. (Inventor)
2012-01-01
A system of interrogation to estimate whether a subject of interrogation is likely experiencing high stress, emotional volatility and/or internal conflict in the subject's responses to an interviewer's questions. The system applies one or more of four procedures, a first statistical analysis, a second statistical analysis, a third analysis and a heat map analysis, to identify one or more documents containing the subject's responses for which further examination is recommended. Words in the documents are characterized in terms of dimensions representing different classes of emotions and states of mind, in which the subject's responses that manifest high stress, emotional volatility and/or internal conflict are identified. A heat map visually displays the dimensions manifested by the subject's responses in different colors, textures, geometric shapes or other visually distinguishable indicia.
RooStatsCms: A tool for analysis modelling, combination and statistical studies
NASA Astrophysics Data System (ADS)
Piparo, D.; Schott, G.; Quast, G.
2010-04-01
RooStatsCms is an object oriented statistical framework based on the RooFit technology. Its scope is to allow the modelling, statistical analysis and combination of multiple search channels for new phenomena in High Energy Physics. It provides a variety of methods described in literature implemented as classes, whose design is oriented to the execution of multiple CPU intensive jobs on batch systems or on the Grid.
Statistical analysis of fNIRS data: a comprehensive review.
Tak, Sungho; Ye, Jong Chul
2014-01-15
Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activities using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of historical developments of statistical analyses of fNIRS signal, which include motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and statistical parameter mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effect model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some of the open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.
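As context for the unified view mentioned in the review, many of the listed inference techniques can be written as a general linear model with correlated noise; a generic formulation (notation assumed here, not quoted from the review) is

```latex
y = X\beta + \varepsilon, \qquad \varepsilon \sim \mathcal{N}\!\left(0,\; \sigma^{2} V\right),
\qquad
\hat{\beta} = \left(X^{\top} V^{-1} X\right)^{-1} X^{\top} V^{-1} y,
\qquad
t = \frac{c^{\top}\hat{\beta}}{\sqrt{\sigma^{2}\, c^{\top}\left(X^{\top} V^{-1} X\right)^{-1} c}},
```

where V encodes the serial correlation of the noise and, in the mixed-model view, the variance components are estimated by ReML; the standard t-test, F-test, and ANOVA cases then correspond to particular choices of the design matrix X and contrast c.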
Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong
2015-01-01
Background: Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. Objectives: This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. Methods: We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. Results: There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. Conclusion: The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent. PMID:26053876
Wu, Yazhou; Zhou, Liang; Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong
2015-01-01
Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent.
Analysis of High-Throughput ELISA Microarray Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
White, Amanda M.; Daly, Don S.; Zangar, Richard C.
Our research group develops analytical methods and software for the high-throughput analysis of quantitative enzyme-linked immunosorbent assay (ELISA) microarrays. ELISA microarrays differ from DNA microarrays in several fundamental aspects and most algorithms for analysis of DNA microarray data are not applicable to ELISA microarrays. In this review, we provide an overview of the steps involved in ELISA microarray data analysis and how the statistically sound algorithms we have developed provide an integrated software suite to address the needs of each data-processing step. The algorithms discussed are available in a set of open-source software tools (http://www.pnl.gov/statistics/ProMAT).
Rossi, Pierre; Gillet, François; Rohrbach, Emmanuelle; Diaby, Nouhou; Holliger, Christof
2009-01-01
The variability of terminal restriction fragment polymorphism analysis applied to complex microbial communities was assessed statistically. Recent technological improvements were implemented in the successive steps of the procedure, resulting in a standardized procedure which provided a high level of reproducibility. PMID:19749066
Power flow as a complement to statistical energy analysis and finite element analysis
NASA Technical Reports Server (NTRS)
Cuschieri, J. M.
1987-01-01
Present methods of analysis of the structural response and the structure-borne transmission of vibrational energy use either finite element (FE) techniques or statistical energy analysis (SEA) methods. The FE methods are a very useful tool at low frequencies where the number of resonances involved in the analysis is rather small. On the other hand SEA methods can predict with acceptable accuracy the response and energy transmission between coupled structures at relatively high frequencies where the structural modal density is high and a statistical approach is the appropriate solution. In the mid-frequency range, a relatively large number of resonances exist which makes the finite element method too costly. On the other hand SEA methods can only predict average levels. In this mid-frequency range a possible alternative is to use power flow techniques, where the input and flow of vibrational energy to excited and coupled structural components can be expressed in terms of input and transfer mobilities. This power flow technique can be extended from low to high frequencies and this can be integrated with established FE models at low frequencies and SEA models at high frequencies to form a verification of the method. This method of structural analysis using power flow and mobility methods, and its integration with SEA and FE analysis, is applied to the case of two thin beams joined together at right angles.
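To make the mobility formulation concrete, standard power-flow relations (not reproduced from the paper itself) express the time-averaged input power from a point force F at frequency ω, and the power passed across a coupling point, as

```latex
P_{\mathrm{in}} = \tfrac{1}{2}\,\lvert F\rvert^{2}\,\operatorname{Re}\{Y_{\mathrm{in}}(\omega)\},
\qquad
Y_{\mathrm{in}}(\omega) = \frac{v(\omega)}{F(\omega)},
\qquad
P_{12} = \tfrac{1}{2}\,\operatorname{Re}\{F_{c}\, v_{c}^{*}\},
```

where Y_in is the input mobility at the excitation point and F_c, v_c are the force and velocity at the junction, themselves obtainable from the input and transfer mobilities of the coupled components; this is the sense in which the input and flow of vibrational energy are expressed in terms of input and transfer mobilities.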
Orchestrating high-throughput genomic analysis with Bioconductor
Huber, Wolfgang; Carey, Vincent J.; Gentleman, Robert; Anders, Simon; Carlson, Marc; Carvalho, Benilton S.; Bravo, Hector Corrada; Davis, Sean; Gatto, Laurent; Girke, Thomas; Gottardo, Raphael; Hahne, Florian; Hansen, Kasper D.; Irizarry, Rafael A.; Lawrence, Michael; Love, Michael I.; MacDonald, James; Obenchain, Valerie; Oleś, Andrzej K.; Pagès, Hervé; Reyes, Alejandro; Shannon, Paul; Smyth, Gordon K.; Tenenbaum, Dan; Waldron, Levi; Morgan, Martin
2015-01-01
Bioconductor is an open-source, open-development software project for the analysis and comprehension of high-throughput data in genomics and molecular biology. The project aims to enable interdisciplinary research, collaboration and rapid development of scientific software. Based on the statistical programming language R, Bioconductor comprises 934 interoperable packages contributed by a large, diverse community of scientists. Packages cover a range of bioinformatic and statistical applications. They undergo formal initial review and continuous automated testing. We present an overview for prospective users and contributors. PMID:25633503
Extracting chemical information from high-resolution Kβ X-ray emission spectroscopy
NASA Astrophysics Data System (ADS)
Limandri, S.; Robledo, J.; Tirao, G.
2018-06-01
High-resolution X-ray emission spectroscopy allows studying the chemical environment of a wide variety of materials. Chemical information can be obtained by fitting the X-ray spectra and observing the behavior of some spectral features. Spectral changes can also be quantified by means of statistical parameters calculated by considering the spectrum as a probability distribution. Another possibility is to perform statistical multivariate analysis, such as principal component analysis. In this work the performance of these procedures for extracting chemical information in X-ray emission spectroscopy spectra for mixtures of Mn2+ and Mn4+ oxides is studied. A detailed analysis of the parameters obtained, as well as the associated uncertainties, is shown. The methodologies are also applied for Mn oxidation state characterization of double perovskite oxides Ba1+xLa1-xMnSbO6 (with 0 ≤ x ≤ 0.7). The results show that statistical parameters and multivariate analysis are the most suitable for the analysis of this kind of spectra.
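A minimal sketch of the two generic procedures mentioned above (treating the spectrum as a probability distribution whose statistical parameters are computed, and principal component analysis of a set of spectra) is given below; the toy energy grid and the absence of background handling or instrument-specific corrections are simplifying assumptions.

```python
import numpy as np

def spectral_moments(energy, counts):
    """Treat a (background-subtracted) emission spectrum as a probability
    distribution over energy and compute its first statistical parameters."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    e = np.asarray(energy, dtype=float)
    mean = np.sum(p * e)                      # centroid (1st moment)
    var = np.sum(p * (e - mean) ** 2)         # width (2nd central moment)
    skew = np.sum(p * (e - mean) ** 3) / var ** 1.5
    kurt = np.sum(p * (e - mean) ** 4) / var ** 2 - 3.0
    return mean, var, skew, kurt

def pca_scores(spectra, n_components=2):
    """Principal component analysis of a (n_spectra, n_channels) matrix via SVD."""
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T           # projections onto leading components

if __name__ == "__main__":
    e = np.linspace(6470.0, 6510.0, 200)                        # toy Kβ energy grid (eV), assumed
    spectrum = np.exp(-0.5 * ((e - 6490.5) / 2.0) ** 2) + 0.05  # toy peak plus flat background
    print(spectral_moments(e, spectrum))
    print(pca_scores(np.vstack([spectrum, spectrum * 1.1, spectrum + 0.02])).shape)
```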
Statistical analysis of secondary particle distributions in relativistic nucleus-nucleus collisions
NASA Technical Reports Server (NTRS)
Mcguire, Stephen C.
1987-01-01
The use is described of several statistical techniques to characterize structure in the angular distributions of secondary particles from nucleus-nucleus collisions in the energy range 24 to 61 GeV/nucleon. The objective of this work was to determine whether there are correlations between emitted particle intensity and angle that may be used to support the existence of the quark gluon plasma. The techniques include chi-square null hypothesis tests, the method of discrete Fourier transform analysis, and fluctuation analysis. We have also used the method of composite unit vectors to test for azimuthal asymmetry in a data set of 63 JACEE-3 events. Each method is presented in a manner that provides the reader with some practical detail regarding its application. Of those events with relatively high statistics, Fe approaches 0 at 55 GeV/nucleon was found to possess an azimuthal distribution with a highly non-random structure. No evidence of non-statistical fluctuations was found in the pseudo-rapidity distributions of the events studied. It is seen that the most effective application of these methods relies upon the availability of many events or single events that possess very high multiplicities.
Trivedi, Prinal; Edwards, Jode W; Wang, Jelai; Gadbury, Gary L; Srinivasasainagendra, Vinodh; Zakharkin, Stanislav O; Kim, Kyoungmi; Mehta, Tapan; Brand, Jacob P L; Patki, Amit; Page, Grier P; Allison, David B
2005-04-06
Many efforts in microarray data analysis are focused on providing tools and methods for the qualitative analysis of microarray data. HDBStat! (High-Dimensional Biology-Statistics) is a software package designed for analysis of high dimensional biology data such as microarray data. It was initially developed for the analysis of microarray gene expression data, but it can also be used for some applications in proteomics and other aspects of genomics. HDBStat! provides statisticians and biologists a flexible and easy-to-use interface to analyze complex microarray data using a variety of methods for data preprocessing, quality control analysis and hypothesis testing. Results generated from data preprocessing methods, quality control analysis and hypothesis testing methods are output in the form of Excel CSV tables, graphs and an Html report summarizing data analysis. HDBStat! is a platform-independent software that is freely available to academic institutions and non-profit organizations. It can be downloaded from our website http://www.soph.uab.edu/ssg_content.asp?id=1164.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-22
... statistically significant relationship is evaluated by way of the correlation coefficient (r) with statistical... The analysis revealed a significant high correlation between reduced predicted crew effectiveness (as...
Challenges of Big Data Analysis.
Fan, Jianqing; Han, Fang; Liu, Han
2014-06-01
Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinguished and require a new computational and statistical paradigm. This article gives overviews on the salient features of Big Data and how these features impact on paradigm change on statistical and computational methods as well as computing architectures. We also provide various new perspectives on the Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.
Challenges of Big Data Analysis
Fan, Jianqing; Han, Fang; Liu, Han
2014-01-01
Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottleneck, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinguished and require a new computational and statistical paradigm. This article gives overviews on the salient features of Big Data and how these features impact on paradigm change on statistical and computational methods as well as computing architectures. We also provide various new perspectives on the Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions. PMID:25419469
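A small simulation makes the spurious-correlation point concrete: with a modest sample size and a growing number of mutually independent predictors, the largest sample correlation between the response and some predictor becomes large even though every true correlation is zero. This is an illustrative sketch, not code from the article; the sample size n = 60 is an arbitrary choice.

```python
import numpy as np

def max_spurious_correlation(n=60, p=1000, seed=0):
    """Largest absolute sample correlation between an independent response
    and any of p independent predictors; it grows with p even though every
    true correlation is zero (the 'spurious correlation' effect)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)
    yc = (y - y.mean()) / y.std()
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    corrs = (Xc.T @ yc) / n            # sample correlations with the response
    return np.abs(corrs).max()

if __name__ == "__main__":
    for p in (10, 100, 1000, 10000):
        print(p, round(max_spurious_correlation(p=p), 3))
```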
Black Females in High School: A Statistical Educational Profile
ERIC Educational Resources Information Center
Muhammad, Crystal Gafford; Dixson, Adrienne D.
2008-01-01
In life as in literature, both the mainstream public and the Black community writ large overlook the Black female experiences, both adolescent and adult. In order to contribute to the knowledge base regarding this population, we present through our study a statistical portrait of Black females in high school. To do so, we present an analysis of…
ERIC Educational Resources Information Center
Lau, Joann M.; Korn, Robert W.
2007-01-01
In this article, the authors present a laboratory exercise in data collection and statistical analysis in biological space using clustered stomates on leaves of "Begonia" plants. The exercise can be done in middle school classes by students making their own slides and seeing imprints of cells, or at the high school level through collecting data of…
ERIC Educational Resources Information Center
Chen, Xianglei; Lauff, Erich; Arbeit, Caren A.; Henke, Robin; Skomsvold, Paul; Hufford, Justine
2017-01-01
This Statistical Analysis Report tracks a cohort of 2002 high school sophomores over 10 years, examining the extent to which cohort members had reached such life course milestones as finishing school, starting a job, leaving home, getting married, and having children. The analyses in this report are based on data from the Education Longitudinal…
Ji, Jun; Ling, Jeffrey; Jiang, Helen; Wen, Qiaojun; Whitin, John C; Tian, Lu; Cohen, Harvey J; Ling, Xuefeng B
2013-03-23
Mass spectrometry (MS) has evolved to become the primary high throughput tool for proteomics-based biomarker discovery. Until now, multiple challenges in protein MS data analysis remain: large-scale and complex data set management; MS peak identification, indexing; and high dimensional peak differential analysis with the concurrent statistical tests based false discovery rate (FDR). "Turnkey" solutions are needed for biomarker investigations to rapidly process MS data sets to identify statistically significant peaks for subsequent validation. Here we present an efficient and effective solution, which provides experimental biologists easy access to "cloud" computing capabilities to analyze MS data. The web portal can be accessed at http://transmed.stanford.edu/ssa/. The presented web application supports online uploading and analysis of large-scale MS data through a simple user interface. This bioinformatic tool will facilitate the discovery of potential protein biomarkers using MS.
NASA Astrophysics Data System (ADS)
ten Veldhuis, Marie-Claire; Schleiss, Marc
2017-04-01
In this study, we introduced an alternative approach for analysis of hydrological flow time series, using an adaptive sampling framework based on inter-amount times (IATs). The main difference with conventional flow time series is the rate at which low and high flows are sampled: the unit of analysis for IATs is a fixed flow amount, instead of a fixed time window. We analysed statistical distributions of flows and IATs across a wide range of sampling scales to investigate sensitivity of statistical properties such as quantiles, variance, skewness, scaling parameters and flashiness indicators to the sampling scale. We did this based on streamflow time series for 17 (semi)urbanised basins in North Carolina, US, ranging from 13 km2 to 238 km2 in size. Results showed that adaptive sampling of flow time series based on inter-amounts leads to a more balanced representation of low flow and peak flow values in the statistical distribution. While conventional sampling gives a lot of weight to low flows, as these are most ubiquitous in flow time series, IAT sampling gives relatively more weight to high flow values, when given flow amounts are accumulated in a shorter time. As a consequence, IAT sampling gives more information about the tail of the distribution associated with high flows, while conventional sampling gives relatively more information about low flow periods. We will present results of statistical analyses across a range of subdaily to seasonal scales and will highlight some interesting insights that can be derived from IAT statistics with respect to basin flashiness and the impact of urbanisation on hydrological response.
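A minimal sketch of the IAT sampling idea, assuming a regularly sampled flow record and linear interpolation of the cumulative flow curve (the chosen flow amount plays the role of the sampling scale discussed above):

```python
import numpy as np

def inter_amount_times(flow, dt, amount):
    """Convert a regularly sampled flow series (volume per unit time) into
    inter-amount times: the time needed to accumulate each successive fixed
    flow amount. High flows yield short IATs, low flows long IATs."""
    flow = np.asarray(flow, dtype=float)
    cum = np.cumsum(flow) * dt                      # cumulative amount at each step
    targets = np.arange(amount, cum[-1], amount)    # successive fixed amounts
    # Linear interpolation of the (monotone) cumulative curve gives crossing times.
    times = np.interp(targets, cum, np.arange(1, len(flow) + 1) * dt)
    return np.diff(np.concatenate(([0.0], times)))  # inter-amount times

if __name__ == "__main__":
    flow = np.array([1.0, 1.0, 5.0, 10.0, 4.0, 1.0, 0.5, 0.5])   # toy hydrograph
    print(inter_amount_times(flow, dt=1.0, amount=2.0))
```

High-flow periods then appear as many short inter-amount times, which is why this sampling gives relatively more weight to the upper tail of the flow distribution.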
A κ-generalized statistical mechanics approach to income analysis
NASA Astrophysics Data System (ADS)
Clementi, F.; Gallegati, M.; Kaniadakis, G.
2009-02-01
This paper proposes a statistical mechanics approach to the analysis of income distribution and inequality. A new distribution function, having its roots in the framework of κ-generalized statistics, is derived that is particularly suitable for describing the whole spectrum of incomes, from the low-middle income region up to the high income Pareto power-law regime. Analytical expressions for the shape, moments and some other basic statistical properties are given. Furthermore, several well-known econometric tools for measuring inequality, which all exist in a closed form, are considered. A method for parameter estimation is also discussed. The model is shown to fit remarkably well the data on personal income for the United States, and the analysis of inequality performed in terms of its parameters is revealed as very powerful.
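For reference, the κ-exponential that underlies this family and the resulting complementary distribution function commonly used for income data are, in the usual notation of the κ-generalized literature (parameter conventions may differ slightly from the paper),

```latex
\exp_{\kappa}(x) = \left(\sqrt{1+\kappa^{2}x^{2}} + \kappa x\right)^{1/\kappa},
\qquad
P(X > x) = \exp_{\kappa}\!\left(-\beta x^{\alpha}\right),
```

which reduces to an ordinary exponential (Weibull-type) form as κ → 0 and behaves as a Pareto power law with exponent α/κ for large incomes, matching the two regimes described in the abstract.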
Marateb, Hamid Reza; Mansourian, Marjan; Adibi, Peyman; Farina, Dario
2014-01-01
Background: selecting the correct statistical test and data mining method depends highly on the measurement scale of data, type of variables, and purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are studied using several medical examples. We have presented two ordinal-variable clustering examples, ordinal variables being more challenging to analyze, using Wisconsin Breast Cancer Data (WBCD). Ordinal-to-Interval scale conversion example: a breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed by two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold standard groups of malignant and benign cases that had been identified by clinical tests. Results: the sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively. Their specificity was comparable. Conclusion: by using an appropriate clustering algorithm based on the measurement scale of the variables in the study, high performance is attained. Moreover, descriptive and inferential statistics in addition to the modeling approach must be selected based on the scale of the variables. PMID:24672565
Fu, Wenjiang J.; Stromberg, Arnold J.; Viele, Kert; Carroll, Raymond J.; Wu, Guoyao
2009-01-01
Over the past two decades, there have been revolutionary developments in life science technologies characterized by high throughput, high efficiency, and rapid computation. Nutritionists now have the advanced methodologies for the analysis of DNA, RNA, protein, low-molecular-weight metabolites, as well as access to bioinformatics databases. Statistics, which can be defined as the process of making scientific inferences from data that contain variability, has historically played an integral role in advancing nutritional sciences. Currently, in the era of systems biology, statistics has become an increasingly important tool to quantitatively analyze information about biological macromolecules. This article describes general terms used in statistical analysis of large, complex experimental data. These terms include experimental design, power analysis, sample size calculation, and experimental errors (type I and II errors) for nutritional studies at population, tissue, cellular, and molecular levels. In addition, we highlighted various sources of experimental variations in studies involving microarray gene expression, real-time polymerase chain reaction, proteomics, and other bioinformatics technologies. Moreover, we provided guidelines for nutritionists and other biomedical scientists to plan and conduct studies and to analyze the complex data. Appropriate statistical analyses are expected to make an important contribution to solving major nutrition-associated problems in humans and animals (including obesity, diabetes, cardiovascular disease, cancer, ageing, and intrauterine fetal retardation). PMID:20233650
Laser diagnostics of native cervix dabs with human papilloma virus in high carcinogenic risk
NASA Astrophysics Data System (ADS)
Peresunko, O. P.; Karpenko, Ju. G.; Burkovets, D. N.; Ivashko, P. V.; Nikorych, A. V.; Yermolenko, S. B.; Gruia, Ion; Gruia, M. J.
2015-11-01
The results of experimental studies of coordinate distributions of Mueller matrix elements are presented for the following types of cervical scraping tissue: norm - low-grade to highly differentiated dysplasia (CIN1-CIN3) - adenocarcinoma of high, medium and low levels of differentiation (G1-G3). The rationale is given for choosing the statistical moments of the 1st-4th orders of the polarized coherent radiation field, transformed by interaction with the oncologically modified "epithelium-stroma" biological layers, as a quantitative criterion for polarimetric optical differentiation of the state of human biological tissues. The analysis of the obtained Mueller matrix elements by statistical and correlation methods, systematized by the types of tissues studied, is carried out. The results of imaging the Mueller matrix element m34 for low-grade dysplasia (CIN2), together with its statistical and correlation analysis, are presented.
NASA Astrophysics Data System (ADS)
Rubtsov, Vladimir; Kapralov, Sergey; Chalyk, Iuri; Ulianova, Onega; Ulyanov, Sergey
2013-02-01
Statistical properties of laser speckles formed in the skin and mucosa of the colon have been analyzed and compared. It has been demonstrated that first and second order statistics of "skin" speckles and "mucosa" speckles are quite different. It is shown that speckles formed in the mucosa are not Gaussian. The layered structure of the colon mucosa causes formation of speckled biospeckles. First- and second-order statistics of speckled speckles have been reviewed in this paper. Statistical properties of Fresnel and Fraunhofer doubly scattered and cascade speckles are described. Non-Gaussian statistics of biospeckles may lead to high localization of intensity of coherent light in human tissue during laser surgery. A way of suppressing highly localized non-Gaussian speckles is suggested.
Smith, Paul F.
2017-01-01
Effective inferential statistical analysis is essential for high quality studies in neuroscience. However, recently, neuroscience has been criticised for the poor use of experimental design and statistical analysis. Many of the statistical issues confronting neuroscience are similar to other areas of biology; however, there are some that occur more regularly in neuroscience studies. This review attempts to provide a succinct overview of some of the major issues that arise commonly in the analyses of neuroscience data. These include: the non-normal distribution of the data; inequality of variance between groups; extensive correlation in data for repeated measurements across time or space; excessive multiple testing; inadequate statistical power due to small sample sizes; pseudo-replication; and an over-emphasis on binary conclusions about statistical significance as opposed to effect sizes. Statistical analysis should be viewed as just another neuroscience tool, which is critical to the final outcome of the study. Therefore, it needs to be done well and it is a good idea to be proactive and seek help early, preferably before the study even begins. PMID:29371855
Smith, Paul F
2017-01-01
Effective inferential statistical analysis is essential for high quality studies in neuroscience. However, recently, neuroscience has been criticised for the poor use of experimental design and statistical analysis. Many of the statistical issues confronting neuroscience are similar to other areas of biology; however, there are some that occur more regularly in neuroscience studies. This review attempts to provide a succinct overview of some of the major issues that arise commonly in the analyses of neuroscience data. These include: the non-normal distribution of the data; inequality of variance between groups; extensive correlation in data for repeated measurements across time or space; excessive multiple testing; inadequate statistical power due to small sample sizes; pseudo-replication; and an over-emphasis on binary conclusions about statistical significance as opposed to effect sizes. Statistical analysis should be viewed as just another neuroscience tool, which is critical to the final outcome of the study. Therefore, it needs to be done well and it is a good idea to be proactive and seek help early, preferably before the study even begins.
Pathway analysis with next-generation sequencing data.
Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao
2015-04-01
Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. The power of the SFPCA-based statistic and of 22 additional existing statistics is also evaluated. We found that the SFPCA-based statistic has much higher power than the other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic yields much smaller P-values for identifying pathway associations than other existing methods.
Local image statistics: maximum-entropy constructions and perceptual salience
Victor, Jonathan D.; Conte, Mary M.
2012-01-01
The space of visual signals is high-dimensional and natural visual images have a highly complex statistical structure. While many studies suggest that only a limited number of image statistics are used for perceptual judgments, a full understanding of visual function requires analysis not only of the impact of individual image statistics, but also, how they interact. In natural images, these statistical elements (luminance distributions, correlations of low and high order, edges, occlusions, etc.) are intermixed, and their effects are difficult to disentangle. Thus, there is a need for construction of stimuli in which one or more statistical elements are introduced in a controlled fashion, so that their individual and joint contributions can be analyzed. With this as motivation, we present algorithms to construct synthetic images in which local image statistics—including luminance distributions, pair-wise correlations, and higher-order correlations—are explicitly specified and all other statistics are determined implicitly by maximum-entropy. We then apply this approach to measure the sensitivity of the human visual system to local image statistics and to sample their interactions. PMID:22751397
Diagnosis checking of statistical analysis in RCTs indexed in PubMed.
Lee, Paul H; Tse, Andy C Y
2017-11-01
Statistical analysis is essential for reporting the results of randomized controlled trials (RCTs), as well as for evaluating their effectiveness. However, the validity of a statistical analysis also depends on whether the assumptions of that analysis are valid. Our aim was to review all RCTs published in journals indexed in PubMed during December 2014 to provide a complete picture of how RCTs handle the assumptions of statistical analysis. We reviewed all RCTs published in December 2014 that appeared in journals indexed in PubMed using the Cochrane highly sensitive search strategy. The 2014 impact factors of the journals were used as proxies for their quality. The type of statistical analysis used and whether the assumptions of the analysis were tested were reviewed. In total, 451 papers were included. Of the 278 papers that reported a crude analysis for the primary outcomes, 31 (27·2%) reported whether the outcome was normally distributed. Of the 172 papers that reported an adjusted analysis for the primary outcomes, diagnosis checking was rarely conducted, with only 20%, 8·6% and 7% checked for generalized linear models, Cox proportional hazards models and multilevel models, respectively. Study characteristics (study type, drug trial, funding sources, journal type and endorsement of CONSORT guidelines) were not associated with the reporting of diagnosis checking. Diagnosis checking of statistical analyses in RCTs published in PubMed-indexed journals was usually absent. Journals should provide guidelines about the reporting of a diagnosis of assumptions. © 2017 Stichting European Society for Clinical Investigation Journal Foundation.
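The kind of diagnosis checking the review looked for can be illustrated with a short, hypothetical sketch: test normality and homogeneity of variance before choosing a crude two-group comparison. The data, thresholds, and fallback to a non-parametric test below are illustrative assumptions, not the procedures of any reviewed trial.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
# Hypothetical primary outcome in a two-arm trial (right-skewed)
arm_a = rng.lognormal(mean=0.0, sigma=0.6, size=40)
arm_b = rng.lognormal(mean=0.2, sigma=0.6, size=40)

# Diagnosis checking before a crude analysis
_, p_norm_a = stats.shapiro(arm_a)          # normality within each arm
_, p_norm_b = stats.shapiro(arm_b)
_, p_var = stats.levene(arm_a, arm_b)       # homogeneity of variance

if min(p_norm_a, p_norm_b) < 0.05:
    # normality doubtful: report a non-parametric comparison instead
    stat, p = stats.mannwhitneyu(arm_a, arm_b, alternative="two-sided")
    test = "Mann-Whitney U"
else:
    stat, p = stats.ttest_ind(arm_a, arm_b, equal_var=p_var >= 0.05)
    test = "t-test"
print(f"{test}: statistic={stat:.2f}, p={p:.4f}")
```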
ERIC Educational Resources Information Center
Essid, Hedi; Ouellette, Pierre; Vigeant, Stephane
2010-01-01
The objective of this paper is to measure the efficiency of high schools in Tunisia. We use a statistical data envelopment analysis (DEA)-bootstrap approach with quasi-fixed inputs to estimate the precision of our measure. To do so, we developed a statistical model serving as the foundation of the data generation process (DGP). The DGP is…
Statistical Analysis of Big Data on Pharmacogenomics
Fan, Jianqing; Liu, Han
2013-01-01
This paper discusses statistical methods for estimating complex correlation structure from large pharmacogenomic datasets. We selectively review several prominent statistical methods for estimating large covariance matrix for understanding correlation structure, inverse covariance matrix for network modeling, large-scale simultaneous tests for selecting significantly differently expressed genes and proteins and genetic markers for complex diseases, and high dimensional variable selection for identifying important molecules for understanding molecule mechanisms in pharmacogenomics. Their applications to gene network estimation and biomarker selection are used to illustrate the methodological power. Several new challenges of Big data analysis, including complex data distribution, missing data, measurement error, spurious correlation, endogeneity, and the need for robust statistical methods, are also discussed. PMID:23602905
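One ingredient reviewed above, estimating a large covariance matrix when variables far outnumber samples, can be sketched with off-the-shelf shrinkage. The example below uses scikit-learn's Ledoit-Wolf estimator on simulated expression-like data; the dimensions and data are hypothetical, and the method is a generic stand-in for the estimators surveyed in the paper.

```python
import numpy as np
from sklearn.covariance import LedoitWolf, empirical_covariance

rng = np.random.default_rng(3)
n_samples, n_genes = 50, 200          # p >> n, typical of pharmacogenomics
X = rng.normal(size=(n_samples, n_genes))

sample_cov = empirical_covariance(X)  # rank-deficient when p > n
lw = LedoitWolf().fit(X)              # shrinkage towards a scaled identity

print("sample covariance rank:", np.linalg.matrix_rank(sample_cov))
print("shrinkage intensity:", round(lw.shrinkage_, 3))
# The shrunk estimate is well-conditioned and invertible, which matters for
# downstream network modelling via the inverse covariance (precision) matrix.
precision = np.linalg.inv(lw.covariance_)
```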
The effects of multiple repairs on Inconel 718 weld mechanical properties
NASA Technical Reports Server (NTRS)
Russell, C. K.; Nunes, A. C., Jr.; Moore, D.
1991-01-01
Inconel 718 weldments were repaired 3, 6, 9, and 13 times using the gas tungsten arc welding process. The welded panels were machined into mechanical test specimens, postweld heat treated, and nondestructively tested. Tensile properties and high cycle fatigue life were evaluated and the results compared to unrepaired weld properties. Mechanical property data were analyzed using the statistical methods of difference in means for tensile properties and difference in log means and Weibull analysis for high cycle fatigue properties. Statistical analysis performed on the data did not show a significant decrease in tensile or high cycle fatigue properties due to the repeated repairs. Some degradation was observed in all properties; however, it was minimal.
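The two statistical comparisons named above, difference in log means and Weibull analysis of high cycle fatigue lives, can be sketched as follows. The fatigue-life numbers are simulated placeholders, not the Inconel 718 data; the sketch only shows the general form of the analysis.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Hypothetical high-cycle-fatigue lives (cycles to failure)
unrepaired = rng.weibull(a=2.0, size=15) * 2.0e6
repaired   = rng.weibull(a=2.0, size=15) * 1.8e6

# Difference in log means: two-sample t-test on log10(life)
t, p = stats.ttest_ind(np.log10(unrepaired), np.log10(repaired))
print(f"difference in log means: t={t:.2f}, p={p:.3f}")

# Two-parameter Weibull fit (location fixed at zero) for each population
shape_u, _, scale_u = stats.weibull_min.fit(unrepaired, floc=0)
shape_r, _, scale_r = stats.weibull_min.fit(repaired, floc=0)
print(f"unrepaired: shape={shape_u:.2f}, scale={scale_u:.3g} cycles")
print(f"repaired:   shape={shape_r:.2f}, scale={scale_r:.3g} cycles")
```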
An Analysis Methodology for the Gamma-ray Large Area Space Telescope
NASA Technical Reports Server (NTRS)
Morris, Robin D.; Cohen-Tanugi, Johann
2004-01-01
The Large Area Telescope (LAT) instrument on the Gamma Ray Large Area Space Telescope (GLAST) has been designed to detect high-energy gamma rays and determine their direction of incidence and energy. We propose a reconstruction algorithm based on recent advances in statistical methodology. This method, an alternative to the standard event analysis inherited from high-energy collider physics experiments, incorporates more accurately the physical processes occurring in the detector and makes full use of the statistical information available. It could thus provide a better estimate of the direction and energy of the primary photon.
A new u-statistic with superior design sensitivity in matched observational studies.
Rosenbaum, Paul R
2011-09-01
In an observational or nonrandomized study of treatment effects, a sensitivity analysis indicates the magnitude of bias from unmeasured covariates that would need to be present to alter the conclusions of a naïve analysis that presumes adjustments for observed covariates suffice to remove all bias. The power of sensitivity analysis is the probability that it will reject a false hypothesis about treatment effects allowing for a departure from random assignment of a specified magnitude; in particular, if this specified magnitude is "no departure" then this is the same as the power of a randomization test in a randomized experiment. A new family of u-statistics is proposed that includes Wilcoxon's signed rank statistic but also includes other statistics with substantially higher power when a sensitivity analysis is performed in an observational study. Wilcoxon's statistic has high power to detect small effects in large randomized experiments-that is, it often has good Pitman efficiency-but small effects are invariably sensitive to small unobserved biases. Members of this family of u-statistics that emphasize medium to large effects can have substantially higher power in a sensitivity analysis. For example, in one situation with 250 pair differences that are Normal with expectation 1/2 and variance 1, the power of a sensitivity analysis that uses Wilcoxon's statistic is 0.08 while the power of another member of the family of u-statistics is 0.66. The topic is examined by performing a sensitivity analysis in three observational studies, using an asymptotic measure called the design sensitivity, and by simulating power in finite samples. The three examples are drawn from epidemiology, clinical medicine, and genetic toxicology. © 2010, The International Biometric Society.
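For orientation, the following hypothetical sketch reproduces the randomized-experiment end of the setting described above (250 pair differences that are Normal with expectation 1/2 and variance 1) and estimates the power of Wilcoxon's signed rank test by simulation. The sensitivity-analysis machinery and the proposed u-statistic family are not implemented here; the simulation size and seed are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n_pairs, n_sim, alpha = 250, 2000, 0.05

rejections = 0
for _ in range(n_sim):
    # matched-pair treated-minus-control differences: Normal(0.5, 1)
    d = rng.normal(loc=0.5, scale=1.0, size=n_pairs)
    _, p = stats.wilcoxon(d, alternative="greater")
    rejections += p < alpha

print(f"power of Wilcoxon signed-rank test with no hidden bias: "
      f"{rejections / n_sim:.3f}")
# Rosenbaum's point is that this power collapses once the sensitivity
# analysis allows a modest departure from random assignment, and that
# other members of the proposed u-statistic family retain far more power.
```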
Effect of the absolute statistic on gene-sampling gene-set analysis methods.
Nam, Dougu
2017-06-01
Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating curve for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
Improved Statistics for Genome-Wide Interaction Analysis
Ueki, Masao; Cordell, Heather J.
2012-01-01
Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new “joint effects” statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al.'s originally-proposed statistics, on account of the inflated error rate that can result. PMID:22496670
Mathur, Sunil; Sadana, Ajit
2015-12-01
We present a rank-based test statistic for the identification of differentially expressed genes using a distance measure. The proposed test statistic is highly robust against extreme values and does not assume the distribution of the parent population. Simulation studies show that the proposed test is more powerful than some of the commonly used methods, such as the paired t-test, the Wilcoxon signed rank test, and significance analysis of microarrays (SAM), under certain non-normal distributions. The asymptotic distribution of the test statistic and the p-value function are discussed. The application of the proposed method is shown using a real-life data set. © The Author(s) 2011.
Probabilistic Signal Recovery and Random Matrices
2016-12-08
applications in statistics, biomedical data analysis, quantization, dimension reduction, and network science. 1. High-dimensional inference and geometry Our...low-rank approximation, with applications to community detection in networks, Annals of Statistics 44 (2016), 373–400. [7] C. Le, E. Levina, R. Vershynin, Concentration...
Defining the ecological hydrology of Taiwan Rivers using multivariate statistical methods
NASA Astrophysics Data System (ADS)
Chang, Fi-John; Wu, Tzu-Ching; Tsai, Wen-Ping; Herricks, Edwin E.
2009-09-01
The identification and verification of ecohydrologic flow indicators has found new support as the importance of ecological flow regimes is recognized in modern water resources management, particularly in river restoration and reservoir management. An ecohydrologic indicator system reflecting the unique characteristics of Taiwan's water resources and hydrology has been developed, the Taiwan ecohydrological indicator system (TEIS). A major challenge for the water resources community is using the TEIS to provide environmental flow rules that improve existing water resources management. This paper examines data from the extensive network of flow monitoring stations in Taiwan using TEIS statistics to define and refine environmental flow options in Taiwan. Multivariate statistical methods were used to examine TEIS statistics for 102 stations representing the geographic and land use diversity of Taiwan. The Pearson correlation coefficient showed high multicollinearity between the TEIS statistics. Watersheds were separated into upper and lower-watershed locations. An analysis of variance indicated significant differences between upstream, more natural, and downstream, more developed, locations in the same basin, with hydrologic indicator redundancy in flow change and magnitude statistics. Issues of multicollinearity were examined using a Principal Component Analysis (PCA), with the first three components related to general flow and high/low flow statistics, frequency and time statistics, and quantity statistics. These principal components explain about 85% of the total variation. A major conclusion is that managers must be aware of differences among basins, as well as differences within basins, that will require careful selection of management procedures to achieve needed flow regimes.
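The multivariate steps described above (correlation screening followed by PCA of standardized indicator statistics) follow a standard pattern. The sketch below is a generic illustration with simulated indicator values for 102 stations; the variables and the injected collinearity are assumptions, not TEIS data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(6)
# Hypothetical matrix: 102 stations x 12 ecohydrologic indicator statistics
n_stations, n_indicators = 102, 12
X = rng.normal(size=(n_stations, n_indicators))
X[:, 1] = 0.9 * X[:, 0] + 0.1 * rng.normal(size=n_stations)  # built-in collinearity

# Pearson correlation matrix flags multicollinear indicators
corr = np.corrcoef(X, rowvar=False)
print("max off-diagonal |r|:", round(np.max(np.abs(corr - np.eye(n_indicators))), 2))

# PCA on standardized indicators; cumulative variance of leading components
pca = PCA().fit(StandardScaler().fit_transform(X))
cumvar = np.cumsum(pca.explained_variance_ratio_)
print("variance explained by first 3 components:", round(cumvar[2], 2))
```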
A national streamflow network gap analysis
Kiang, Julie E.; Stewart, David W.; Archfield, Stacey A.; Osborne, Emily B.; Eng, Ken
2013-01-01
The U.S. Geological Survey (USGS) conducted a gap analysis to evaluate how well the USGS streamgage network meets a variety of needs, focusing on the ability to calculate various statistics at locations that have streamgages (gaged) and that do not have streamgages (ungaged). This report presents the results of analysis to determine where there are gaps in the network of gaged locations, how accurately desired statistics can be calculated with a given length of record, and whether the current network allows for estimation of these statistics at ungaged locations. The analysis indicated that there is variability across the Nation’s streamflow data-collection network in terms of the spatial and temporal coverage of streamgages. In general, the Eastern United States has better coverage than the Western United States. The arid Southwestern United States, Alaska, and Hawaii were observed to have the poorest spatial coverage, using the dataset assembled for this study. Except in Hawaii, these areas also tended to have short streamflow records. Differences in hydrology lead to differences in the uncertainty of statistics calculated in different regions of the country. Arid and semiarid areas of the Central and Southwestern United States generally exhibited the highest levels of interannual variability in flow, leading to larger uncertainty in flow statistics. At ungaged locations, information can be transferred from nearby streamgages if there is sufficient similarity between the gaged watersheds and the ungaged watersheds of interest. Areas where streamgages exhibit high correlation are most likely to be suitable for this type of information transfer. The areas with the most highly correlated streamgages appear to coincide with mountainous areas of the United States. Lower correlations are found in the Central United States and coastal areas of the Southeastern United States. Information transfer from gaged basins to ungaged basins is also most likely to be successful when basin attributes show high similarity. At the scale of the analysis completed in this study, the attributes of basins upstream of USGS streamgages cover the full range of basin attributes observed at potential locations of interest fairly well. Some exceptions included very high or very low elevation areas and very arid areas.
Data exploration, quality control and statistical analysis of ChIP-exo/nexus experiments.
Welch, Rene; Chung, Dongjun; Grass, Jeffrey; Landick, Robert; Keles, Sündüz
2017-09-06
ChIP-exo/nexus experiments rely on innovative modifications of the commonly used ChIP-seq protocol for high resolution mapping of transcription factor binding sites. Although many aspects of the ChIP-exo data analysis are similar to those of ChIP-seq, these high throughput experiments pose a number of unique quality control and analysis challenges. We develop a novel statistical quality control pipeline and accompanying R/Bioconductor package, ChIPexoQual, to enable exploration and analysis of ChIP-exo and related experiments. ChIPexoQual evaluates a number of key issues including strand imbalance, library complexity, and signal enrichment of data. Assessment of these features are facilitated through diagnostic plots and summary statistics computed over regions of the genome with varying levels of coverage. We evaluated our QC pipeline with both large collections of public ChIP-exo/nexus data and multiple, new ChIP-exo datasets from Escherichia coli. ChIPexoQual analysis of these datasets resulted in guidelines for using these QC metrics across a wide range of sequencing depths and provided further insights for modelling ChIP-exo data. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
A Vehicle for Bivariate Data Analysis
ERIC Educational Resources Information Center
Roscoe, Matt B.
2016-01-01
Instead of reserving the study of probability and statistics for special fourth-year high school courses, the Common Core State Standards for Mathematics (CCSSM) takes a "statistics for all" approach. The standards recommend that students in grades 6-8 learn to summarize and describe data distributions, understand probability, draw…
Landing Site Dispersion Analysis and Statistical Assessment for the Mars Phoenix Lander
NASA Technical Reports Server (NTRS)
Bonfiglio, Eugene P.; Adams, Douglas; Craig, Lynn; Spencer, David A.; Strauss, William; Seelos, Frank P.; Seelos, Kimberly D.; Arvidson, Ray; Heet, Tabatha
2008-01-01
The Mars Phoenix Lander launched on August 4, 2007 and successfully landed on Mars 10 months later on May 25, 2008. Landing ellipse predictions and hazard maps were key in selecting safe surface targets for Phoenix. Hazard maps were based on terrain slopes, geomorphology maps and automated rock counts from MRO's High Resolution Imaging Science Experiment (HiRISE) images. The expected landing dispersion which led to the selection of Phoenix's surface target is discussed, as well as the actual landing dispersion predictions determined during operations in the weeks, days, and hours before landing. A statistical assessment of these dispersions is performed, comparing the actual landing-safety probabilities to criteria levied by the project. Also discussed are applications of this statistical analysis that were used by the Phoenix project, including verifying the effectiveness of a pre-planned maneuver menu and calculating the probability of future maneuvers.
Guo, Hui; Zhang, Zhen; Yao, Yuan; Liu, Jialin; Chang, Ruirui; Liu, Zhao; Hao, Hongyuan; Huang, Taohong; Wen, Jun; Zhou, Tingting
2018-08-30
Semen sojae praeparatum, with homology of medicine and food, is a famous traditional Chinese medicine. A simple and effective quality fingerprint analysis, coupled with chemometrics methods, was developed for quality assessment of Semen sojae praeparatum. First, similarity analysis (SA) and hierarchical clustering analysis (HCA) were applied to select the qualitative markers that most strongly influence the quality of Semen sojae praeparatum; 21 chemicals were selected and characterized by high resolution ion trap/time-of-flight mass spectrometry (LC-IT-TOF-MS). Subsequently, principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) were conducted to select the quantitative markers of Semen sojae praeparatum samples from different origins. Moreover, 11 compounds with statistical significance were determined quantitatively, which provided accurate and informative data for quality evaluation. This study proposes a new strategy of "statistical analysis-based fingerprint establishment", which would be a valuable reference for further study. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Ohyanagi, S.; Dileonardo, C.
2013-12-01
As a natural phenomenon, earthquake occurrence is difficult to predict. Statistical analysis of earthquake data was performed using candlestick chart and Bollinger Band methods. These statistical methods, commonly used in the financial world to analyze market trends, were tested against earthquake data. Earthquakes above Mw 4.0 located off the shore of Sanriku (37.75°N ~ 41.00°N, 143.00°E ~ 144.50°E) from February 1973 to May 2013 were selected for analysis. Two specific patterns in earthquake occurrence were recognized through the analysis. One is a spread of the candlestick prior to the occurrence of events greater than Mw 6.0. A second pattern shows convergence in the Bollinger Band, which implies a positive or negative change in the trend of earthquakes. Both patterns match general models for the buildup and release of strain through the earthquake cycle, and agree with the characteristics of both the candlestick chart and the Bollinger Band analysis. These results show a high correlation between patterns in earthquake occurrence and the trend analysis given by these two statistical methods. The results of this study support the appropriateness of applying these financial analysis methods to the analysis of earthquake occurrence.
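A minimal sketch of the Bollinger Band construction applied to a monthly earthquake statistic is shown below. The event counts, window length, and convergence rule are hypothetical choices for illustration, not the parameters used in the study.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
# Hypothetical monthly count of Mw >= 4.0 events in the study region
months = pd.date_range("1973-02", periods=484, freq="MS")
counts = pd.Series(rng.poisson(lam=6, size=len(months)), index=months)

window = 20                                   # hypothetical look-back window
mid = counts.rolling(window).mean()           # middle band
sd = counts.rolling(window).std()
upper, lower = mid + 2 * sd, mid - 2 * sd     # Bollinger Bands

# Band convergence ("squeeze") is the pattern associated above with a
# change in trend: the band width narrows relative to its recent history.
width = (upper - lower) / mid
squeeze = width < width.rolling(window).quantile(0.1)
print(squeeze.sum(), "months flagged as band convergence")
```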
NASA Astrophysics Data System (ADS)
Lin, Shu; Wang, Rui; Xia, Ning; Li, Yongdong; Liu, Chunliang
2018-01-01
Statistical multipactor theories are critical prediction approaches for multipactor breakdown determination. However, these approaches still require a trade-off between calculation efficiency and accuracy. This paper presents an improved stationary statistical theory for efficient threshold analysis of two-surface multipactor. A general integral equation over the distribution function of the electron emission phase is formulated, with both single-sided and double-sided impacts considered. The modeling results indicate that the improved stationary statistical theory not only achieves accuracy of multipactor threshold calculation as good as that of the nonstationary statistical theory, but also attains high calculation efficiency concurrently. By using this improved stationary statistical theory, the total time consumed in calculating the full multipactor susceptibility zones of parallel plates can be decreased by as much as a factor of four relative to the nonstationary statistical theory. It is also shown that the effect of single-sided impacts is indispensable for accurate multipactor prediction of coaxial lines and is more significant for high-order multipactor. Finally, the influence of secondary emission yield (SEY) properties on the multipactor threshold is further investigated. It is observed that the first cross energy and the energy range between the first cross and the SEY maximum both play a significant role in determining the multipactor threshold, which agrees with numerical simulation results in the literature.
Aerosol, a health hazard during ultrasonic scaling: A clinico-microbiological study.
Singh, Akanksha; Shiva Manjunath, R G; Singla, Deepak; Bhattacharya, Hirak S; Sarkar, Arijit; Chandra, Neeraj
2016-01-01
Ultrasonic scaling is a routinely used treatment to remove plaque and calculus from tooth surfaces. These scalers use water as a coolant, which is splattered during the vibration of the tip. The splatter, when mixed with the saliva and plaque of the patient, renders the aerosol highly infectious and acts as a major risk factor for transmission of disease. In spite of necessary protection, the operator might sometimes get infected because of the infectious nature of the splatter. The aim was to evaluate the aerosol contamination produced during ultrasonic scaling with the help of microbiological analysis. This clinico-microbiological study consisted of twenty patients. Two agar plates were used for each patient; the first was kept at the center of the operatory room 20 min before the treatment, while the second agar plate was kept 40 cm away from the patient's chest during the treatment. Both agar plates were sent for microbiological analysis. The statistical analysis was done with the help of STATA 11.0 (StataCorp. 2013. Stata Statistical Software, Release 13. College Station, TX: StataCorp LP, 4905 Lakeway Drive, College Station, Texas, USA), and P < 0.001 was considered statistically significant. The results for bacterial counts were highly significant when compared before and during the treatment. Gram staining showed the presence of Staphylococcus and Streptococcus species in high numbers. The aerosols and splatters produced during dental procedures have the potential to spread infection to dental personnel. Therefore, proper precautions should be taken to minimize the risk of infection to the operator.
ERIC Educational Resources Information Center
Edge, D. Michael
2011-01-01
This non-experimental study attempted to determine how the different prescribed mathematic tracks offered at a comprehensive technical high school influenced the mathematics performance of low-achieving students on standardized assessments of mathematics achievement. The goal was to provide an analysis of any statistically significant differences…
The US EPA's ToxCast™ program seeks to combine advances in high-throughput screening technology with methodologies from statistics and computer science to develop high-throughput decision support tools for assessing chemical hazard and risk. To develop new methods of analysis of...
Random forests for classification in ecology
Cutler, D.R.; Edwards, T.C.; Beard, K.H.; Cutler, A.; Hess, K.T.; Gibson, J.; Lawler, J.J.
2007-01-01
Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature. © 2007 by the Ecological Society of America.
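A minimal sketch of the workflow described above (cross-validated accuracy plus variable importance) using scikit-learn's random forest implementation is given below. The synthetic presence/absence data and parameter choices are assumptions for illustration, not the Lava Beds, lichen, or nest-site data sets.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for species presence/absence with environmental predictors
X, y = make_classification(n_samples=500, n_features=10, n_informative=4,
                           random_state=0)

rf = RandomForestClassifier(n_estimators=500, random_state=0)

# Cross-validated classification accuracy, as used to compare classifiers
acc = cross_val_score(rf, X, y, cv=5, scoring="accuracy")
print("mean CV accuracy:", round(acc.mean(), 3))

# Variable importance, the RF feature highlighted in the abstract
rf.fit(X, y)
ranking = np.argsort(rf.feature_importances_)[::-1]
print("most important predictors (indices):", ranking[:4])
```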
Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria; Lin, Yu; Hero, Alfred; Smith, Barry; He, Yongqun
2016-09-14
Statistics play a critical role in biological and clinical research. However, most reports of scientific results in the published literature make it difficult for the reader to reproduce the statistical analyses performed in achieving those results because they provide inadequate documentation of the statistical tests and algorithms applied. The Ontology of Biological and Clinical Statistics (OBCS) is put forward here as a step towards solving this problem. The terms in OBCS, including 'data collection', 'data transformation in statistics', 'data visualization', 'statistical data analysis', and 'drawing a conclusion based on data', cover the major types of statistical processes used in basic biological research and clinical outcome studies. OBCS is aligned with the Basic Formal Ontology (BFO) and extends the Ontology of Biomedical Investigations (OBI), an OBO (Open Biological and Biomedical Ontologies) Foundry ontology supported by over 20 research communities. Currently, OBCS comprises 878 terms, representing 20 BFO classes, 403 OBI classes, 229 OBCS specific classes, and 122 classes imported from ten other OBO ontologies. We discuss two examples illustrating how the ontology is being applied. In the first (biological) use case, we describe how OBCS was applied to represent the high throughput microarray data analysis of immunological transcriptional profiles in human subjects vaccinated with an influenza vaccine. In the second (clinical outcomes) use case, we applied OBCS to represent the processing of electronic health care data to determine the associations between hospital staffing levels and patient mortality. Our case studies were designed to show how OBCS can be used for the consistent representation of statistical analysis pipelines under two different research paradigms. Other ongoing projects using OBCS for statistical data processing are also discussed. The OBCS source code and documentation are available at: https://github.com/obcs/obcs . The Ontology of Biological and Clinical Statistics (OBCS) is a community-based open source ontology in the domain of biological and clinical statistics. OBCS is a timely ontology that represents statistics-related terms and their relations in a rigorous fashion, facilitates standard data analysis and integration, and supports reproducible biological and clinical research.
Statistical innovations in the medical device world sparked by the FDA.
Campbell, Gregory; Yue, Lilly Q
2016-01-01
The world of medical devices while highly diverse is extremely innovative, and this facilitates the adoption of innovative statistical techniques. Statisticians in the Center for Devices and Radiological Health (CDRH) at the Food and Drug Administration (FDA) have provided leadership in implementing statistical innovations. The innovations discussed include: the incorporation of Bayesian methods in clinical trials, adaptive designs, the use and development of propensity score methodology in the design and analysis of non-randomized observational studies, the use of tipping-point analysis for missing data, techniques for diagnostic test evaluation, bridging studies for companion diagnostic tests, quantitative benefit-risk decisions, and patient preference studies.
Biological Parametric Mapping: A Statistical Toolbox for Multi-Modality Brain Image Analysis
Casanova, Ramon; Ryali, Srikanth; Baer, Aaron; Laurienti, Paul J.; Burdette, Jonathan H.; Hayasaka, Satoru; Flowers, Lynn; Wood, Frank; Maldjian, Joseph A.
2006-01-01
In recent years multiple brain MR imaging modalities have emerged; however, analysis methodologies have mainly remained modality specific. In addition, when comparing across imaging modalities, most researchers have been forced to rely on simple region-of-interest type analyses, which do not allow the voxel-by-voxel comparisons necessary to answer more sophisticated neuroscience questions. To overcome these limitations, we developed a toolbox for multimodal image analysis called biological parametric mapping (BPM), based on a voxel-wise use of the general linear model. The BPM toolbox incorporates information obtained from other modalities as regressors in a voxel-wise analysis, thereby permitting investigation of more sophisticated hypotheses. The BPM toolbox has been developed in MATLAB with a user friendly interface for performing analyses, including voxel-wise multimodal correlation, ANCOVA, and multiple regression. It has a high degree of integration with the SPM (statistical parametric mapping) software relying on it for visualization and statistical inference. Furthermore, statistical inference for a correlation field, rather than a widely-used T-field, has been implemented in the correlation analysis for more accurate results. An example with in-vivo data is presented demonstrating the potential of the BPM methodology as a tool for multimodal image analysis. PMID:17070709
A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data
Vinaixa, Maria; Samino, Sara; Saez, Isabel; Duran, Jordi; Guinovart, Joan J.; Yanes, Oscar
2012-01-01
Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest in the research sample, metabolites can then be unequivocally identified. This paper reports on a comprehensive overview of a workflow for statistical analysis to rank relevant metabolite features that will be selected for further MS/MS experiments. We focus on univariate data analysis applied in parallel on all detected features. Characteristics and challenges of this analysis are discussed and illustrated using four different real LC/MS untargeted metabolomic datasets. We demonstrate the influence of considering or violating mathematical assumptions on which univariate statistical tests rely, using high-dimensional LC/MS datasets. Issues in data analysis such as determination of sample size, analytical variation, assumption of normality and homoscedasticity, or correction for multiple testing are discussed and illustrated in the context of our four untargeted LC/MS working examples. PMID:24957762
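The per-feature univariate workflow outlined above can be sketched as follows: check assumptions for each feature, pick a parametric or non-parametric test accordingly, and correct for multiple testing before ranking features for MS/MS follow-up. The simulated feature table, sample sizes, and Benjamini-Hochberg correction below are illustrative assumptions, not the datasets or exact procedure of the paper.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(8)
# Hypothetical feature table: 500 LC/MS features, 10 cases vs 10 controls
n_feat = 500
cases = rng.lognormal(0.0, 0.5, size=(n_feat, 10))
controls = rng.lognormal(0.0, 0.5, size=(n_feat, 10))
cases[:25] *= 1.8                      # a few genuinely altered features

p = np.empty(n_feat)
for i in range(n_feat):
    # assumption check per feature: normality of log intensities
    normal = (stats.shapiro(np.log(cases[i]))[1] > 0.05 and
              stats.shapiro(np.log(controls[i]))[1] > 0.05)
    if normal:
        p[i] = stats.ttest_ind(np.log(cases[i]), np.log(controls[i]))[1]
    else:
        p[i] = stats.mannwhitneyu(cases[i], controls[i],
                                  alternative="two-sided")[1]

# Benjamini-Hochberg correction across all features before ranking them
reject, p_adj, _, _ = multipletests(p, alpha=0.05, method="fdr_bh")
print("features flagged for MS/MS follow-up:", int(reject.sum()))
```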
NASA Astrophysics Data System (ADS)
Kovalevsky, Louis; Langley, Robin S.; Caro, Stephane
2016-05-01
Due to the high cost of experimental EMI measurements, significant attention has been focused on numerical simulation. Classical methods such as the Method of Moments or Finite Difference Time Domain are not well suited for this type of problem, as they require a fine discretisation of space and fail to take uncertainties into account. In this paper, the authors show that Statistical Energy Analysis is well suited for this type of application. The SEA is a statistical approach employed to solve high frequency problems of electromagnetically reverberant cavities at a reduced computational cost. The key aspects of this approach are (i) to consider an ensemble of systems that share the same gross parameters, and (ii) to avoid solving Maxwell's equations inside the cavity by using the power balance principle. The output is an estimate of the field magnitude distribution in each cavity. The method is applied to a typical aircraft structure.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Halligan, Matthew
Radiated power calculation approaches for practical scenarios of incomplete high-density interface characterization information and incomplete incident power information are presented. The suggested approaches build upon a method that characterizes power losses through the definition of power loss constant matrices. Potential radiated power estimates include using total power loss information, partial radiated power loss information, worst case analysis, and statistical bounding analysis. A method is also proposed to calculate radiated power when incident power information is not fully known for non-periodic signals at the interface. Incident data signals are modeled from a two-state Markov chain from which bit state probabilities are derived. The total spectrum for windowed signals is postulated as the superposition of spectra from individual pulses in a data sequence. Statistical bounding methods are proposed as a basis for the radiated power calculation due to the statistical complexity of finding a radiated power probability density function.
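The signal model mentioned above, a two-state Markov chain for data bits, can be sketched briefly: derive the stationary bit-state probabilities from the transition matrix and simulate a non-periodic bit sequence. The transition probabilities and sequence length below are hypothetical values for illustration.

```python
import numpy as np

# Hypothetical two-state Markov chain for data bits (0/1):
# P[i, j] = probability of moving from bit state i to bit state j
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

# Stationary bit-state probabilities: left eigenvector of P for eigenvalue 1
evals, evecs = np.linalg.eig(P.T)
pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
pi = pi / pi.sum()
print("stationary P(bit=0), P(bit=1):", np.round(pi, 3))

# Simulate a non-periodic bit sequence from the chain
rng = np.random.default_rng(9)
bits = np.empty(10_000, dtype=int)
bits[0] = 0
for k in range(1, bits.size):
    bits[k] = rng.choice(2, p=P[bits[k - 1]])
print("empirical P(bit=1):", round(bits.mean(), 3))
```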
The Statistical Consulting Center for Astronomy (SCCA)
NASA Technical Reports Server (NTRS)
Akritas, Michael
2001-01-01
The process by which raw astronomical data acquisition is transformed into scientifically meaningful results and interpretation typically involves many statistical steps. Traditional astronomy limits itself to a narrow range of old and familiar statistical methods: means and standard deviations; least-squares methods like chi-squared minimization; and simple nonparametric procedures such as the Kolmogorov-Smirnov tests. These tools are often inadequate for the complex problems and datasets under investigation, and recent years have witnessed an increased usage of maximum-likelihood, survival analysis, multivariate analysis, wavelet and advanced time-series methods. The Statistical Consulting Center for Astronomy (SCCA) assisted astronomers with the use of sophisticated tools and with matching these tools to specific problems. The SCCA operated with two professors of statistics and a professor of astronomy working together. Questions were received by e-mail, and were discussed in detail with the questioner. Summaries of those questions and answers leading to new approaches were posted on the Web (www.state.psu.edu/ mga/SCCA). In addition to serving individual astronomers, the SCCA established a Web site for general use that provides hypertext links to selected on-line public-domain statistical software and services. The StatCodes site (www.astro.psu.edu/statcodes) provides over 200 links in the areas of: Bayesian statistics; censored and truncated data; correlation and regression, density estimation and smoothing, general statistics packages and information; image analysis; interactive Web tools; multivariate analysis; multivariate clustering and classification; nonparametric analysis; software written by astronomers; spatial statistics; statistical distributions; time series analysis; and visualization tools. StatCodes has received a remarkably high and constant hit rate of 250 hits/week (over 10,000/year) since its inception in mid-1997. It is of interest to scientists both within and outside of astronomy. The most popular sections are multivariate techniques, image analysis, and time series analysis. Hundreds of copies of the ASURV, SLOPES and CENS-TAU codes developed by SCCA scientists were also downloaded from the StatCodes site. In addition to formal SCCA duties, SCCA scientists continued a variety of related activities in astrostatistics, including refereeing of statistically oriented papers submitted to the Astrophysical Journal, talks in meetings including Feigelson's talk to science journalists entitled "The reemergence of astrostatistics" at the American Association for the Advancement of Science meeting, and published papers of astrostatistical content.
Statistical quality control through overall vibration analysis
NASA Astrophysics Data System (ADS)
Carnero, M. a. Carmen; González-Palma, Rafael; Almorza, David; Mayorga, Pedro; López-Escobar, Carlos
2010-05-01
The present study introduces the concept of statistical quality control in automotive wheel bearings manufacturing processes. Defects on products under analysis can have a direct influence on passengers' safety and comfort. At present, the use of vibration analysis on machine tools for quality control purposes is not very extensive in manufacturing facilities. Noise and vibration are common quality problems in bearings. These failure modes likely occur under certain operating conditions and do not require high vibration amplitudes but relate to certain vibration frequencies. The vibration frequencies are affected by the type of surface problems (chattering) of ball races that are generated through grinding processes. The purpose of this paper is to identify grinding process variables that affect the quality of bearings by using statistical principles in the field of machine tools. In addition, the quality results of the finished parts under different combinations of process variables are assessed. This paper intends to establish the foundations to predict the quality of the products through the analysis of self-induced vibrations during the contact between the grinding wheel and the parts. To achieve this goal, the overall self-induced vibration readings under different combinations of process variables are analysed using statistical tools. The analysis of data and design of experiments follows a classical approach, considering all potential interactions between variables. The analysis of data is conducted through analysis of variance (ANOVA) for data sets that meet normality and homoscedasticity criteria. This paper utilizes different statistical tools to support the conclusions, such as chi-squared, Shapiro-Wilk, symmetry, kurtosis, Cochran, Bartlett, Hartley, and Kruskal-Wallis tests. The analysis presented is the starting point to extend the use of predictive techniques (vibration analysis) for quality control. This paper demonstrates the existence of predictive variables (high-frequency vibration displacements) that are sensitive to the process setup and the quality of the products obtained. Based on the result of this overall vibration analysis, a second paper will analyse self-induced vibration spectra in order to define limit vibration bands, controllable every cycle or connected to permanent vibration-monitoring systems able to adjust sensitive process variables identified by ANOVA, once the vibration readings exceed established quality limits.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Steed, Chad Allen
EDENx is a multivariate data visualization tool that allows interactive, user-driven analysis of large-scale data sets with high dimensionality. EDENx builds on our earlier system, called EDEN, to enable analysis of more dimensions and larger scale data sets. EDENx provides an initial overview of summary statistics for each variable in the data set under investigation. EDENx allows the user to interact with graphical summary plots of the data to investigate subsets and their statistical associations. These plots include histograms, binned scatterplots, binned parallel coordinate plots, timeline plots, and graphical correlation indicators. From the EDENx interface, a user can select a subsample of interest and launch a more detailed data visualization via the EDEN system. EDENx is best suited for high-level, aggregate analysis tasks while EDEN is more appropriate for detailed data investigations.
Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti
2016-07-01
A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts, limit conducting multivariate tests. We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Code is available at https://github.com/aalto-ics-kepaco. Contact: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
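metaCCA itself operates on summary statistics, which a few lines cannot reproduce; as a point of reference, the sketch below shows ordinary canonical correlation analysis on simulated individual-level genotype and trait data using scikit-learn, the individual-level analysis that metaCCA is designed to approximate. All data and dimensions are hypothetical.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(10)
n = 1000
# Hypothetical individual-level data: 5 genotype dosages, 3 correlated traits
G = rng.binomial(2, 0.3, size=(n, 5)).astype(float)
latent = G[:, [0]] * 0.4 + rng.normal(size=(n, 1))
Y = np.hstack([latent + rng.normal(size=(n, 1)) for _ in range(3)])

# Canonical correlation analysis between the genotype and phenotype blocks
cca = CCA(n_components=1).fit(G, Y)
U, V = cca.transform(G, Y)
r = np.corrcoef(U[:, 0], V[:, 0])[0, 1]
print("leading canonical correlation:", round(r, 3))
```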
Huncharek, M; Kupelnick, B
2001-01-01
The etiology of epithelial ovarian cancer is unknown. Prior work suggests that high dietary fat intake is associated with an increased risk of this tumor, although this association remains speculative. A meta-analysis was performed to evaluate this suspected relationship. Using previously described methods, a protocol was developed for a meta-analysis examining the association between high vs. low dietary fat intake and the risk of epithelial ovarian cancer. Literature search techniques, study inclusion criteria, and statistical procedures were prospectively defined. Data from observational studies were pooled using a general variance-based meta-analytic method employing confidence intervals (CI) previously described by Greenland. The outcome of interest was a summary relative risk (RRs) reflecting the risk of ovarian cancer associated with high vs. low dietary fat intake. Sensitivity analyses were performed when necessary to evaluate any observed statistical heterogeneity. The literature search yielded 8 observational studies enrolling 6,689 subjects. Data were stratified into three dietary fat intake categories: total fat, animal fat, and saturated fat. Initial tests for statistical homogeneity demonstrated that hospital-based studies accounted for observed heterogeneity possibly because of selection bias. Accounting for this, an RRs was calculated for high vs. low total fat intake, yielding a value of 1.24 (95% CI = 1.07-1.43), a statistically significant result. That is, high total fat intake is associated with a 24% increased risk of ovarian cancer development. The RRs for high saturated fat intake was 1.20 (95% CI = 1.04-1.39), suggesting a 20% increased risk of ovarian cancer among subjects with these dietary habits. High vs. low animal fat diet gave an RRs of 1.70 (95% CI = 1.43-2.03), consistent with a statistically significant 70% increased ovarian cancer risk. High dietary fat intake appears to represent a significant risk factor for the development of ovarian cancer. The magnitude of this risk associated with total fat and saturated fat is rather modest. Ovarian cancer risk associated with high animal fat intake appears significantly greater than that associated with the other types of fat intake studied, although this requires confirmation via larger analyses. Further work is needed to clarify factors that may modify the effects of dietary fat in vivo.
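The general variance-based pooling described above can be sketched as an inverse-variance meta-analysis on the log relative-risk scale, with standard errors recovered from the reported confidence intervals. The study-level numbers below are invented placeholders, not the eight studies analysed.

```python
import numpy as np

# Hypothetical study-level relative risks with 95% confidence intervals
rr = np.array([1.10, 1.35, 1.05, 1.50, 1.20])
ci_low = np.array([0.85, 1.05, 0.80, 1.10, 0.95])
ci_high = np.array([1.42, 1.74, 1.38, 2.05, 1.52])

log_rr = np.log(rr)
se = (np.log(ci_high) - np.log(ci_low)) / (2 * 1.96)  # SE from CI width
w = 1.0 / se**2                                       # inverse-variance weights

pooled = np.sum(w * log_rr) / np.sum(w)
pooled_se = np.sqrt(1.0 / np.sum(w))
ci = np.exp(pooled + np.array([-1.96, 1.96]) * pooled_se)
print(f"summary RR = {np.exp(pooled):.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f})")
```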
Mathysen, Danny G P; Aclimandos, Wagih; Roelant, Ella; Wouters, Kristien; Creuzot-Garcher, Catherine; Ringens, Peter J; Hawlina, Marko; Tassignon, Marie-José
2013-11-01
The aim was to investigate whether the introduction of item-response theory (IRT) analysis, in parallel to the 'traditional' statistical analysis methods available for performance evaluation of multiple T/F items as used in the European Board of Ophthalmology Diploma (EBOD) examination, has proved beneficial, and secondly, to study whether the overall assessment performance of the current written part of EBOD is sufficiently high (KR-20 ≥ 0.90) to be kept as the examination format in future EBOD editions. 'Traditional' analysis methods for individual MCQ item performance comprise P-statistics, Rit-statistics and item discrimination, while overall reliability is evaluated through KR-20 for multiple T/F items. The additional set of statistical analysis methods for the evaluation of EBOD comprises mainly IRT analysis. These analysis techniques are used to monitor whether the introduction of negative marking for incorrect answers (since EBOD 2010) has a positive influence on the statistical performance of EBOD as a whole and of its individual test items in particular. IRT analysis demonstrated that item performance parameters should not be evaluated individually, but should be related to one another. Before the introduction of negative marking, the overall EBOD reliability (KR-20) was good, though with room for improvement (EBOD 2008: 0.81; EBOD 2009: 0.78). After the introduction of negative marking, the overall reliability of EBOD improved significantly (EBOD 2010: 0.92; EBOD 2011: 0.91; EBOD 2012: 0.91). Although many statistical performance parameters are available to evaluate individual items, our study demonstrates that the overall reliability assessment remains the only crucial parameter allowing comparison. While individual item performance analysis is worthwhile to undertake as a secondary analysis, drawing final conclusions from it seems to be more difficult. Performance parameters need to be related to one another, as shown by IRT analysis. Therefore, IRT analysis has proved beneficial for the statistical analysis of EBOD. The introduction of negative marking has led to a significant increase in reliability (KR-20 > 0.90), indicating that the current examination format can be kept for future EBOD examinations. © 2013 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.
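The KR-20 reliability coefficient reported above is straightforward to compute from a dichotomously scored item-response matrix. The sketch below uses responses simulated from a simple one-parameter IRT-style model; the candidate and item counts are hypothetical, not EBOD data.

```python
import numpy as np

def kr20(scores):
    """Kuder-Richardson formula 20 for a 0/1 item-score matrix
    (rows = candidates, columns = items)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    p = scores.mean(axis=0)               # proportion correct per item
    item_var = np.sum(p * (1 - p))
    total_var = scores.sum(axis=1).var()  # variance of candidates' total scores
    return (k / (k - 1)) * (1 - item_var / total_var)

# Hypothetical examination: 300 candidates, 260 dichotomously scored items
rng = np.random.default_rng(11)
ability = rng.normal(size=(300, 1))
difficulty = rng.normal(size=(1, 260))
prob = 1 / (1 + np.exp(-(ability - difficulty)))   # 1-parameter IRT-style model
responses = rng.random((300, 260)) < prob
print("KR-20:", round(kr20(responses), 3))
```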
Publishing in "SERJ": An Analysis of Papers from 2002-2009
ERIC Educational Resources Information Center
Zieffler, Andrew; Garfield, Joan; delMas, Robert C.; Le, Laura; Isaak, Rebekah; Bjornsdottir, Audbjorg; Park, Jiyoon
2011-01-01
"SERJ" has provided a high quality professional publication venue for researchers in statistics education for close to a decade. This paper presents a review of the articles published to explore what they suggest about the field of statistics education, the researchers, the questions addressed, and the growing knowledge base on teaching and…
Statistical description of tectonic motions
NASA Technical Reports Server (NTRS)
Agnew, Duncan Carr
1993-01-01
This report summarizes investigations regarding tectonic motions. The topics discussed include statistics of crustal deformation, Earth rotation studies, using multitaper spectrum analysis techniques applied to both space-geodetic data and conventional astrometric estimates of the Earth's polar motion, and the development, design, and installation of high-stability geodetic monuments for use with the global positioning system.
Selected Statistics on the Status of Asian-American Women
ERIC Educational Resources Information Center
Fong, Pauline; Cabezas, Amado
1977-01-01
Taken from a paper on "The Economic and Employment Status of Asian Women in America" by Pauline Fong and Amado Cabezas of ASIAN, Inc., this brief analysis of statistics on Asian women indicates that highly educated Asian women do not have higher incomes or better jobs than many of those with less education.
ERIC Educational Resources Information Center
Wagler, Amy; Wagler, Ron
2014-01-01
Every high school graduate should be able to use data analysis and statistical reasoning to draw conclusions about the world. Two core statistical concepts for students to understand are the role of variability in measures and evaluating the effect of a variable. In the activity presented in this article, students investigate a scientific question…
High order statistical signatures from source-driven measurements of subcritical fissile systems
NASA Astrophysics Data System (ADS)
Mattingly, John Kelly
1998-11-01
This research focuses on the development and application of high order statistical analyses applied to measurements performed with subcritical fissile systems driven by an introduced neutron source. The signatures presented are derived from counting statistics of the introduced source and radiation detectors that observe the response of the fissile system. It is demonstrated that successively higher order counting statistics possess progressively higher sensitivity to reactivity. Consequently, these signatures are more sensitive to changes in the composition, fissile mass, and configuration of the fissile assembly. Furthermore, it is shown that these techniques are capable of distinguishing the response of the fissile system to the introduced source from its response to any internal or inherent sources. This ability combined with the enhanced sensitivity of higher order signatures indicates that these techniques will be of significant utility in a variety of applications. Potential applications include enhanced radiation signature identification of weapons components for nuclear disarmament and safeguards applications and augmented nondestructive analysis of spent nuclear fuel. In general, these techniques expand present capabilities in the analysis of subcritical measurements.
Parallel processing of genomics data
NASA Astrophysics Data System (ADS)
Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario
2016-10-01
The availability of high-throughput experimental platforms for the analysis of biological samples, such as mass spectrometry, microarrays and Next Generation Sequencing, has made it possible to analyze a whole genome in a single experiment. Such platforms produce an enormous volume of data per experiment, and this flow of data poses several challenges in terms of storage, preprocessing, and analysis. To face these issues, efficient, possibly parallel, bioinformatics software needs to be used to preprocess and analyze the data, for instance to highlight genetic variation associated with complex diseases. In this paper we present a parallel algorithm for the preprocessing and statistical analysis of genomics data, able to cope with high-dimensional data while achieving good response times. The proposed system is able to find statistically significant biological markers that discriminate classes of patients who respond to drugs in different ways. Experiments performed on real and synthetic genomic datasets show good speed-up and scalability.
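A hedged sketch of one way such per-marker statistical tests can be parallelized in Python with the standard multiprocessing module; the genotype matrix, response labels and chi-square test below are illustrative stand-ins, not the algorithm of the paper.

import numpy as np
from multiprocessing import Pool
from scipy.stats import chi2_contingency

# Toy data: rows = samples, columns = markers coded 0/1/2, plus a binary response.
rng = np.random.default_rng(1)
genotypes = rng.integers(0, 3, size=(500, 10_000))
response  = rng.integers(0, 2, size=500)

def marker_test(j):
    """Chi-square test of association between marker j and the response."""
    table = np.zeros((3, 2))
    for g in range(3):
        for r in range(2):
            table[g, r] = np.sum((genotypes[:, j] == g) & (response == r))
    table = table[table.sum(axis=1) > 0]     # drop empty genotype rows
    return j, chi2_contingency(table)[1]     # p-value

if __name__ == "__main__":
    with Pool() as pool:                     # one worker per available core
        pvalues = dict(pool.map(marker_test, range(genotypes.shape[1])))
    hits = [j for j, p in pvalues.items() if p < 0.05 / len(pvalues)]  # Bonferroni
    print(len(hits), "markers pass the multiple-testing threshold")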
Guo, Jing; Yuan, Yahong; Dou, Pei; Yue, Tianli
2017-10-01
Fifty-one kiwifruit juice samples of seven kiwifruit varieties from five regions in China were analyzed to determine their polyphenol contents and to trace fruit varieties and geographical origins by multivariate statistical analysis. Twenty-one polyphenols belonging to four compound classes were determined by ultra-high-performance liquid chromatography coupled with ultra-high-resolution TOF mass spectrometry. (-)-Epicatechin, (+)-catechin, procyanidin B1 and caffeic acid derivatives were the predominant phenolic compounds in the juices. Principal component analysis (PCA) allowed a clear separation of the juices according to kiwifruit variety. Stepwise linear discriminant analysis (SLDA) yielded satisfactory categorization of samples, providing a 100% success rate according to kiwifruit variety and a 92.2% success rate according to geographical origin. The results showed that the polyphenolic profiles of kiwifruit juices contain enough information to trace fruit varieties and geographical origins. Copyright © 2017 Elsevier Ltd. All rights reserved.
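The chemometric workflow above (PCA for an unsupervised view, then discriminant analysis for classification) can be sketched with scikit-learn as follows; the concentration matrix and labels are simulated placeholders, and plain LDA is used instead of the stepwise variant reported in the study.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 51 juice samples x 21 quantified polyphenols, 7 variety labels.
rng = np.random.default_rng(2)
X = rng.lognormal(mean=1.0, sigma=0.4, size=(51, 21))
y = np.arange(51) % 7

# Unsupervised view: PCA on standardized concentrations.
pca = PCA(n_components=2)
scores = pca.fit_transform(StandardScaler().fit_transform(X))
print("variance explained:", pca.explained_variance_ratio_)

# Supervised classification; with these random data the rate is only illustrative.
clf = make_pipeline(StandardScaler(), LinearDiscriminantAnalysis())
print("cross-validated classification rate:", cross_val_score(clf, X, y, cv=5).mean())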
An Overview of R in Health Decision Sciences.
Jalal, Hawre; Pechlivanoglou, Petros; Krijkamp, Eline; Alarid-Escudero, Fernando; Enns, Eva; Hunink, M G Myriam
2017-10-01
As the complexity of health decision science applications increases, high-level programming languages are increasingly adopted for statistical analyses and numerical computations. These programming languages facilitate sophisticated modeling, model documentation, and analysis reproducibility. Among the high-level programming languages, the statistical programming framework R is gaining increased recognition. R is freely available, cross-platform compatible, and open source. A large community of users who have generated an extensive collection of well-documented packages and functions supports it. These functions facilitate applications of health decision science methodology as well as the visualization and communication of results. Although R's popularity is increasing among health decision scientists, methodological extensions of R in the field of decision analysis remain isolated. The purpose of this article is to provide an overview of existing R functionality that is applicable to the various stages of decision analysis, including model design, input parameter estimation, and analysis of model outputs.
Analysis tools for discovering strong parity violation at hadron colliders
NASA Astrophysics Data System (ADS)
Backović, Mihailo; Ralston, John P.
2011-07-01
Several arguments suggest parity violation may be observable in high energy strong interactions. We introduce new analysis tools to describe the azimuthal dependence of multiparticle distributions, or “azimuthal flow.” Analysis uses the representations of the orthogonal group O(2) and dihedral groups DN necessary to define parity completely in two dimensions. Classification finds that collective angles used in event-by-event statistics represent inequivalent tensor observables that cannot generally be represented by a single “reaction plane.” Many new parity-violating observables exist that have never been measured, while many parity-conserving observables formerly lumped together are now distinguished. We use the concept of “event-shape sorting” to suggest separating right- and left-handed events, and we discuss the effects of transverse and longitudinal spin. The analysis tools are statistically robust, and can be applied equally to low or high multiplicity events at the Tevatron, RHIC or RHIC Spin, and the LHC.
Fernee, Christianne; Browne, Martin; Zakrzewski, Sonia
2017-01-01
This paper introduces statistical shape modelling (SSM) for use in osteoarchaeology research. SSM is a full field, multi-material analytical technique, and is presented as a supplementary geometric morphometric (GM) tool. Lower mandibular canines from two archaeological populations and one modern population were sampled, digitised using micro-CT, aligned, registered to a baseline and statistically modelled using principal component analysis (PCA). Sample material properties were incorporated as a binary enamel/dentin parameter. Results were assessed qualitatively and quantitatively using anatomical landmarks. Finally, the technique’s application was demonstrated for inter-sample comparison through analysis of the principal component (PC) weights. It was found that SSM could provide high detail qualitative and quantitative insight with respect to archaeological inter- and intra-sample variability. This technique has value for archaeological, biomechanical and forensic applications including identification, finite element analysis (FEA) and reconstruction from partial datasets. PMID:29216199
An issue of literacy on pediatric arterial hypertension
NASA Astrophysics Data System (ADS)
Teodoro, M. Filomena; Romana, Andreia; Simão, Carla
2017-11-01
Arterial hypertension in pediatric age is a public health problem whose prevalence has increased significantly over time. Pediatric arterial hypertension (PAH) is under-diagnosed in most cases; although highly prevalent, it appears without notice and has multiple consequences for the health of children and of the adults they will become. Caregivers and close family must be aware of the existence of PAH, its negative consequences and its risk factors, and must practice prevention. In [12, 13] a statistical analysis of data collected with a simpler questionnaire, introduced in [4], can be found, carried out as a preliminary study of caregivers' acquaintance with PAH. A continuation of that analysis is detailed in [14]. An extension of the questionnaire was built, applied to a distinct population and filled in online. The statistical approach is partially reproduced in the present work. Several statistical models were estimated using different approaches, namely multivariate analysis (factor analysis), together with methods adequate for the kind of data under study.
1997-06-01
career success for academy graduates relative to officers commissioned from other sources. Favoritism occurs if high-ranking officers who are service... career success as a naval officer? 6 The thesis investigates several databases in an effort to paint a complete statistical picture of naval officer...including both public and private sector career success was conducted by the Standard & Poor’s Corporation with a related analysis by Professor Michael Useem
ERIC Educational Resources Information Center
Raker, Jeffrey R.; Holme, Thomas A.
2014-01-01
A cluster analysis was conducted with a set of survey data on chemistry faculty familiarity with 13 assessment terms. Cluster groupings suggest a high, middle, and low overall familiarity with the terminology and an independent high and low familiarity with terms related to fundamental statistics. The six resultant clusters were found to be…
Analysis of high-resolution foreign exchange data of USD-JPY for 13 years
NASA Astrophysics Data System (ADS)
Mizuno, Takayuki; Kurihara, Shoko; Takayasu, Misako; Takayasu, Hideki
2003-06-01
We analyze high-resolution foreign exchange data consisting of 20 million data points of USD-JPY over 13 years to report firm statistical laws in the distributions and correlations of exchange rate fluctuations. A conditional probability density analysis clearly shows the existence of trend-following movements at a time scale of 8 ticks, about 1 min.
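A toy version of such a conditional analysis can be written in a few lines of Python: generate a tick series with a weak built-in trend-following component (standing in for the USD-JPY ticks, which are not reproduced here) and compute the mean next move conditional on the previous move.

import numpy as np

rng = np.random.default_rng(3)
n = 200_000
noise = rng.normal(0, 1, n)
dx = np.empty(n)
dx[0] = noise[0]
for t in range(1, n):
    dx[t] = 0.1 * dx[t - 1] + noise[t]       # weak positive autocorrelation

prev, nxt = dx[:-1], dx[1:]
edges = np.quantile(prev, np.linspace(0, 1, 11))
idx = np.digitize(prev, edges[1:-1])         # decile of the previous move
cond_mean = [nxt[idx == b].mean() for b in range(10)]
print(np.round(cond_mean, 3))                # rises with the previous move => trend following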
Spatial Differentiation of Landscape Values in the Murray River Region of Victoria, Australia
NASA Astrophysics Data System (ADS)
Zhu, Xuan; Pfueller, Sharron; Whitelaw, Paul; Winter, Caroline
2010-05-01
This research advances the understanding of the location of perceived landscape values through a statistically based approach to spatial analysis of value densities. Survey data were obtained from a sample of people living in and using the Murray River region, Australia, where declining environmental quality prompted a reevaluation of its conservation status. When densities of 12 perceived landscape values were mapped using geographic information systems (GIS), valued places clustered along the entire river bank and in associated National/State Parks and reserves. While simple density mapping revealed high value densities in various locations, it did not indicate what density of a landscape value could be regarded as a statistically significant hotspot or distinguish whether overlapping areas of high density for different values indicate identical or adjacent locations. A spatial statistic Getis-Ord Gi* was used to indicate statistically significant spatial clusters of high value densities or “hotspots”. Of 251 hotspots, 40% were for single non-use values, primarily spiritual, therapeutic or intrinsic. Four hotspots had 11 landscape values. Two, lacking economic value, were located in ecologically important river red gum forests and two, lacking wilderness value, were near the major towns of Echuca-Moama and Albury-Wodonga. Hotspots for eight values showed statistically significant associations with another value. There were high associations between learning and heritage values while economic and biological diversity values showed moderate associations with several other direct and indirect use values. This approach may improve confidence in the interpretation of spatial analysis of landscape values by enhancing understanding of value relationships.
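For readers unfamiliar with the Getis-Ord statistic used above, the following Python sketch implements the textbook Gi* z-score directly with NumPy on an invented one-dimensional set of sites; it is not the study's GIS workflow, only the formula.

import numpy as np

def getis_ord_gi_star(values, weights):
    """Gi* z-scores; weights[i, j] is the spatial weight between sites i and j,
    with w_ii > 0 so that each site counts itself (the 'star' variant)."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    xbar = x.mean()
    s = np.sqrt((x**2).mean() - xbar**2)
    wx, wsum, w2sum = w @ x, w.sum(axis=1), (w**2).sum(axis=1)
    denom = s * np.sqrt((n * w2sum - wsum**2) / (n - 1))
    return (wx - xbar * wsum) / denom

# Toy example: 30 sites along a river with binary contiguity weights.
rng = np.random.default_rng(4)
density = rng.poisson(3, 30).astype(float)
density[10:14] += 12                         # an artificial cluster of high values
W = np.eye(30) + np.eye(30, k=1) + np.eye(30, k=-1)
z = getis_ord_gi_star(density, W)
print("hotspot sites:", np.where(z > 1.96)[0])   # ~5% significance, one-sided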
NASA Astrophysics Data System (ADS)
Zabolotna, Natalia I.; Radchenko, Kostiantyn O.; Karas, Oleksandr V.
2018-01-01
Diagnosis of breast fibroadenoma is proposed using statistical analysis (determination and analysis of statistical moments of the 1st-4th order) of polarization images of the imaginary Jones-matrix elements of optically thin (attenuation coefficient τ ≤ 0.1) blood plasma films, followed by intelligent differentiation based on "fuzzy" logic and discriminant analysis. The accuracy of differentiating blood plasma samples into the "norm" and breast "fibroadenoma" classes was 82.7% with linear discriminant analysis and 95.3% with the "fuzzy" logic method. The obtained results confirm the potentially high reliability of differentiation by "fuzzy" analysis.
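The 1st-4th order statistical moments mentioned above are just the mean, variance, skewness and kurtosis of the pixel distribution; a hedged Python illustration on a simulated image (standing in for a Jones-matrix element map, which is not reproduced here):

import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
image = rng.gamma(shape=2.0, scale=0.5, size=(256, 256))   # stand-in polarization map

pixels = image.ravel()
moments = {
    "M1 (mean)":     pixels.mean(),
    "M2 (variance)": pixels.var(),
    "M3 (skewness)": stats.skew(pixels),
    "M4 (kurtosis)": stats.kurtosis(pixels),
}
for name, value in moments.items():
    print(f"{name:15s} {value:8.4f}")
# These four numbers per sample are the features a discriminant-analysis or
# "fuzzy"-logic classifier would then operate on.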
Multivariate analysis: A statistical approach for computations
NASA Astrophysics Data System (ADS)
Michu, Sachin; Kaushik, Vandana
2014-10-01
Multivariate analysis is a statistical approach commonly used in automotive diagnosis, education, cluster evaluation in finance and, more recently, in the health-related professions. The objective of the paper is to provide a detailed exploratory discussion of factor analysis (FA) in image retrieval and of correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database based on their content. The problem is made more difficult by the high dimension of the variable space in which the images are represented. Multivariate correlation analysis proposes an anomaly detection and analysis method based on the correlation coefficient matrix. Anomalous behaviors in the network include various attacks on the network, such as DDoS attacks and network scanning.
NASA Astrophysics Data System (ADS)
Zainudin, M. N. Shah; Sulaiman, Md Nasir; Mustapha, Norwati; Perumal, Thinagaran
2017-10-01
Prior knowledge in pervasive computing has recently garnered a lot of attention because of its high demand in various application domains. Human activity recognition (HAR) is one of the applications widely explored by experts, as it provides valuable information about people. Accelerometer-based approaches are commonly used in HAR research because the sensors are small and already built into various types of smartphone. However, high inter-class similarity tends to degrade recognition performance. This work therefore presents a method for activity recognition using features that combine spectral analysis with statistical descriptors, able to tackle the issue of differentiating stationary and locomotion activities. The signal is noise-filtered using the Fourier transform before features are extracted from two different groups: spectral frequency analysis and statistical descriptors. The extracted features are then classified using random forest ensemble classifier models. The recognition results show good accuracy for stationary and locomotion activities on the USC-HAD dataset.
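A hedged sketch of the feature-extraction-plus-random-forest idea in Python; the simulated windows, window length, sampling rate and frequency bands below are illustrative choices, not the USC-HAD processing pipeline.

import numpy as np
from numpy.fft import rfft
from scipy import stats
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def window_features(sig):
    """Statistical descriptors plus coarse spectral-energy features for one window."""
    spectrum = np.abs(rfft(sig - sig.mean()))
    bands = [spectrum[1:5].sum(), spectrum[5:15].sum(), spectrum[15:].sum()]
    return [sig.mean(), sig.std(), stats.skew(sig), stats.kurtosis(sig), *bands]

# Toy accelerometer windows: class 0 ~ stationary (noise only),
# class 1 ~ locomotion (periodic component plus noise).
rng = np.random.default_rng(6)
t = np.arange(128) / 50.0                    # 128-sample windows at 50 Hz
X, y = [], []
for _ in range(300):
    X.append(window_features(rng.normal(0, 0.05, 128)))
    y.append(0)
    X.append(window_features(np.sin(2 * np.pi * 2.0 * t) + rng.normal(0, 0.2, 128)))
    y.append(1)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("cross-validated accuracy:", cross_val_score(clf, np.array(X), np.array(y), cv=5).mean())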
The impact of varicocelectomy on sperm parameters: a meta-analysis.
Schauer, Ingrid; Madersbacher, Stephan; Jost, Romy; Hübner, Wilhelm Alexander; Imhof, Martin
2012-05-01
We determined the impact of 3 surgical techniques (high ligation, inguinal varicocelectomy and the subinguinal approach) for varicocelectomy on sperm parameters (count and motility) and pregnancy rates. By searching the literature using MEDLINE and the Cochrane Library with the last search performed in February 2011, focusing on the last 20 years, a total of 94 articles published between 1975 and 2011 reporting on sperm parameters before and after varicocelectomy were identified. Inclusion criteria for this meta-analysis were at least 2 semen analyses (before and 3 or more months after the procedure), patient age older than 19 years, clinical subfertility and/or abnormal semen parameters, and a clinically palpable varicocele. To rule out skewing factors a bias analysis was performed, and statistical analysis was done with RevMan 5® and SPSS 15.0®. A total of 14 articles were included in the statistical analysis. All 3 surgical approaches led to significant or highly significant postoperative improvement of both parameters with only slight numeric differences among the techniques. This difference did not reach statistical significance for sperm count (p = 0.973) or sperm motility (p = 0.372). After high ligation surgery sperm count increased by 10.85 million per ml (p = 0.006) and motility by 6.80% (p <0.00001) on the average. Inguinal varicocelectomy led to an improvement in sperm count of 7.17 million per ml (p <0.0001) while motility changed by 9.44% (p = 0.001). Subinguinal varicocelectomy provided an increase in sperm count of 9.75 million per ml (p = 0.002) and sperm motility by 12.25% (p = 0.001). Inguinal varicocelectomy showed the highest pregnancy rate of 41.48% compared to 26.90% and 26.56% after high ligation and subinguinal varicocelectomy, respectively, and the difference was statistically significant (p = 0.035). This meta-analysis suggests that varicocelectomy leads to significant improvements in sperm count and motility regardless of surgical technique, with the inguinal approach offering the highest pregnancy rate. Copyright © 2012 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
Statistical process control methods allow the analysis and improvement of anesthesia care.
Fasting, Sigurd; Gisvold, Sven E
2003-10-01
Quality aspects of the anesthetic process are reflected in the rate of intraoperative adverse events. The purpose of this report is to illustrate how the quality of the anesthesia process can be analyzed using statistical process control methods, and exemplify how this analysis can be used for quality improvement. We prospectively recorded anesthesia-related data from all anesthetics for five years. The data included intraoperative adverse events, which were graded into four levels, according to severity. We selected four adverse events, representing important quality and safety aspects, for statistical process control analysis. These were: inadequate regional anesthesia, difficult emergence from general anesthesia, intubation difficulties and drug errors. We analyzed the underlying process using 'p-charts' for statistical process control. In 65,170 anesthetics we recorded adverse events in 18.3%; mostly of lesser severity. Control charts were used to define statistically the predictable normal variation in problem rate, and then used as a basis for analysis of the selected problems with the following results: Inadequate plexus anesthesia: stable process, but unacceptably high failure rate; Difficult emergence: unstable process, because of quality improvement efforts; Intubation difficulties: stable process, rate acceptable; Medication errors: methodology not suited because of low rate of errors. By applying statistical process control methods to the analysis of adverse events, we have exemplified how this allows us to determine if a process is stable, whether an intervention is required, and if quality improvement efforts have the desired effect.
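A small Python sketch of the p-chart logic described above, with invented monthly counts; the centre line is the overall event rate and the 3-sigma limits vary with each month's caseload.

import numpy as np

# Hypothetical monthly anesthetics performed and adverse events observed.
n_cases  = np.array([1050, 980, 1120, 1005, 990, 1075, 1010, 1150, 995, 1060])
n_events = np.array([ 190, 175,  210,  180, 260,  195,  185,  200, 170,  192])

p = n_events / n_cases
p_bar = n_events.sum() / n_cases.sum()       # centre line

sigma = np.sqrt(p_bar * (1 - p_bar) / n_cases)
ucl = p_bar + 3 * sigma
lcl = np.clip(p_bar - 3 * sigma, 0, None)

for month, (rate, lo, hi) in enumerate(zip(p, lcl, ucl), start=1):
    flag = "  <-- out of control" if not (lo <= rate <= hi) else ""
    print(f"month {month:2d}: p = {rate:.3f}  limits [{lo:.3f}, {hi:.3f}]{flag}")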
RepExplore: addressing technical replicate variance in proteomics and metabolomics data analysis.
Glaab, Enrico; Schneider, Reinhard
2015-07-01
High-throughput omics datasets often contain technical replicates included to account for technical sources of noise in the measurement process. Although summarizing these replicate measurements by using robust averages may help to reduce the influence of noise on downstream data analysis, the information on the variance across the replicate measurements is lost in the averaging process and therefore typically disregarded in subsequent statistical analyses. We introduce RepExplore, a web-service dedicated to exploiting the information captured in the technical replicate variance to provide more reliable and informative differential expression and abundance statistics for omics datasets. The software builds on previously published statistical methods, which have been applied successfully to biomedical omics data but are difficult to use without prior experience in programming or scripting. RepExplore facilitates the analysis by providing fully automated data processing and interactive ranking tables, whisker plots, heat maps and principal component analysis visualizations to interpret omics data and derived statistics. RepExplore is freely available at http://www.repexplore.tk (contact: enrico.glaab@uni.lu). Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Assessment of statistical methods used in library-based approaches to microbial source tracking.
Ritter, Kerry J; Carruthers, Ethan; Carson, C Andrew; Ellender, R D; Harwood, Valerie J; Kingsley, Kyle; Nakatsu, Cindy; Sadowsky, Michael; Shear, Brian; West, Brian; Whitlock, John E; Wiggins, Bruce A; Wilbur, Jayson D
2003-12-01
Several commonly used statistical methods for fingerprint identification in microbial source tracking (MST) were examined to assess the effectiveness of pattern-matching algorithms to correctly identify sources. Although numerous statistical methods have been employed for source identification, no widespread consensus exists as to which is most appropriate. A large-scale comparison of several MST methods, using identical fecal sources, presented a unique opportunity to assess the utility of several popular statistical methods. These included discriminant analysis, nearest neighbour analysis, maximum similarity and average similarity, along with several measures of distance or similarity. Threshold criteria for excluding uncertain or poorly matched isolates from final analysis were also examined for their ability to reduce false positives and increase prediction success. Six independent libraries used in the study were constructed from indicator bacteria isolated from fecal materials of humans, seagulls, cows and dogs. Three of these libraries were constructed using the rep-PCR technique and three relied on antibiotic resistance analysis (ARA). Five of the libraries were constructed using Escherichia coli and one using Enterococcus spp. (ARA). Overall, the outcome of this study suggests a high degree of variability across statistical methods. Despite large differences in correct classification rates among the statistical methods, no single statistical approach emerged as superior. Thresholds failed to consistently increase rates of correct classification and improvement was often associated with substantial effective sample size reduction. Recommendations are provided to aid in selecting appropriate analyses for these types of data.
Mishima, Hiroyuki; Lidral, Andrew C; Ni, Jun
2008-05-28
Genetic association studies have been used to map disease-causing genes. A newly introduced statistical method, called exhaustive haplotype association study, analyzes genetic information consisting of different numbers and combinations of DNA sequence variations along a chromosome. Such studies involve a large number of statistical calculations and subsequently high computing power. It is possible to develop parallel algorithms and codes to perform the calculations on a high performance computing (HPC) system. However, most existing commonly-used statistic packages for genetic studies are non-parallel versions. Alternatively, one may use the cutting-edge technology of grid computing and its packages to conduct non-parallel genetic statistical packages on a centralized HPC system or distributed computing systems. In this paper, we report the utilization of a queuing scheduler built on the Grid Engine and run on a Rocks Linux cluster for our genetic statistical studies. Analysis of both consecutive and combinational window haplotypes was conducted by the FBAT (Laird et al., 2000) and Unphased (Dudbridge, 2003) programs. The dataset consisted of 26 loci from 277 extended families (1484 persons). Using the Rocks Linux cluster with 22 compute-nodes, FBAT jobs performed about 14.4-15.9 times faster, while Unphased jobs performed 1.1-18.6 times faster compared to the accumulated computation duration. Execution of exhaustive haplotype analysis using non-parallel software packages on a Linux-based system is an effective and efficient approach in terms of cost and performance.
Mishima, Hiroyuki; Lidral, Andrew C; Ni, Jun
2008-01-01
Background Genetic association studies have been used to map disease-causing genes. A newly introduced statistical method, called exhaustive haplotype association study, analyzes genetic information consisting of different numbers and combinations of DNA sequence variations along a chromosome. Such studies involve a large number of statistical calculations and subsequently high computing power. It is possible to develop parallel algorithms and codes to perform the calculations on a high performance computing (HPC) system. However, most existing commonly-used statistic packages for genetic studies are non-parallel versions. Alternatively, one may use the cutting-edge technology of grid computing and its packages to conduct non-parallel genetic statistical packages on a centralized HPC system or distributed computing systems. In this paper, we report the utilization of a queuing scheduler built on the Grid Engine and run on a Rocks Linux cluster for our genetic statistical studies. Results Analysis of both consecutive and combinational window haplotypes was conducted by the FBAT (Laird et al., 2000) and Unphased (Dudbridge, 2003) programs. The dataset consisted of 26 loci from 277 extended families (1484 persons). Using the Rocks Linux cluster with 22 compute-nodes, FBAT jobs performed about 14.4–15.9 times faster, while Unphased jobs performed 1.1–18.6 times faster compared to the accumulated computation duration. Conclusion Execution of exhaustive haplotype analysis using non-parallel software packages on a Linux-based system is an effective and efficient approach in terms of cost and performance. PMID:18541045
A weighted U-statistic for genetic association analyses of sequencing data.
Wei, Changshuai; Li, Ming; He, Zihuai; Vsevolozhskaya, Olga; Schaid, Daniel J; Lu, Qing
2014-12-01
With advancements in next-generation sequencing technology, a massive amount of sequencing data is generated, which offers a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, the high-dimensional sequencing data poses a great challenge for statistical analysis. The association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a Weighted U Sequencing test, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a nonparametric U-statistic, WU-SEQ makes no assumption of the underlying disease model and phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed a commonly used sequence kernel association test (SKAT) method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL4 and very low density lipoprotein cholesterol. © 2014 WILEY PERIODICALS, INC.
NASA Astrophysics Data System (ADS)
Magazù, Salvatore; Mezei, Ferenc; Migliardo, Federica
2018-05-01
In a variety of applications of inelastic neutron scattering spectroscopy the goal is to single out the elastic scattering contribution from the total scattered spectrum as a function of momentum transfer and sample environment parameters. The elastic part of the spectrum is defined in such a case by the energy resolution of the spectrometer. Variable elastic energy resolution offers a way to distinguish between elastic and quasi-elastic intensities. Correlation spectroscopy lends itself as an efficient, high intensity approach for accomplishing this both at continuous and pulsed neutron sources. On the one hand, in beam modulation methods the Liouville theorem coupling between intensity and resolution is relaxed and time-of-flight velocity analysis of the neutron velocity distribution can be performed with 50 % duty factor exposure for all available resolutions. On the other hand, the (quasi)elastic part of the spectrum generally contains the major part of the integrated intensity at a given detector, and thus correlation spectroscopy can be applied with most favorable signal to statistical noise ratio. The novel spectrometer CORELLI at SNS is an example for this type of application of the correlation technique at a pulsed source. On a continuous neutron source a statistical chopper can be used for quasi-random time dependent beam modulation and the total time-of-flight of the neutron from the statistical chopper to detection is determined by the analysis of the correlation between the temporal fluctuation of the neutron detection rate and the statistical chopper beam modulation pattern. The correlation analysis can either be used for the determination of the incoming neutron velocity or for the scattered neutron velocity, depending of the position of the statistical chopper along the neutron trajectory. These two options are considered together with an evaluation of spectrometer performance compared to conventional spectroscopy, in particular for variable resolution elastic neutron scattering (RENS) studies of relaxation processes and the evolution of mean square displacements. A particular focus of our analysis is the unique feature of correlation spectroscopy of delivering high and resolution independent beam intensity, thus the same statistical chopper scan contains both high intensity and high resolution information at the same time, and can be evaluated both ways. This flexibility for variable resolution data handling represents an additional asset for correlation spectroscopy in variable resolution work. Changing the beam width for the same statistical chopper allows us to additionally trade resolution for intensity in two different experimental runs, similarly for conventional single slit chopper spectroscopy. The combination of these two approaches is a capability of particular value in neutron spectroscopy studies requiring variable energy resolution, such as the systematic study of quasi-elastic scattering and mean square displacement. Furthermore the statistical chopper approach is particularly advantageous for studying samples with low scattering intensity in the presence of a high, sample independent background.
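As an idealized numerical illustration of the correlation analysis described above, the Python sketch below modulates a known time-of-flight spectrum with a random ~50% duty-cycle open/closed pattern and recovers it by circular cross-correlation; a real statistical chopper would use a pseudo-random sequence with sharper autocorrelation, so the recovery here is only approximate.

import numpy as np

rng = np.random.default_rng(7)
nbins = 1024

# True (unknown) time-of-flight spectrum: a quasi-elastic peak on a flat background.
tof = np.arange(nbins)
true_spectrum = 0.2 + 5.0 * np.exp(-0.5 * ((tof - 300) / 6.0) ** 2)

# Statistical chopper: binary open/closed modulation pattern, ~50% duty factor.
pattern = rng.integers(0, 2, nbins).astype(float)

# Detector rate = circular convolution of modulation and spectrum, plus counting noise.
counts = np.real(np.fft.ifft(np.fft.fft(pattern) * np.fft.fft(true_spectrum)))
counts = rng.poisson(np.clip(counts, 0, None) * 50) / 50.0

# Correlation analysis: cross-correlate with the mean-removed pattern and normalise by
# the pattern's autocorrelation peak to concentrate intensity back into TOF channels.
p0 = pattern - pattern.mean()
recovered = np.real(np.fft.ifft(np.fft.fft(counts) * np.conj(np.fft.fft(p0))))
recovered /= np.sum(p0 * pattern)

print("true peak channel:", true_spectrum.argmax(), " recovered peak channel:", recovered.argmax())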
NASA Astrophysics Data System (ADS)
Coelho, Carlos A.; Marques, Filipe J.
2013-09-01
In this paper the authors combine the equicorrelation and equivariance test introduced by Wilks [13] with the likelihood ratio test (l.r.t.) for independence of groups of variables to obtain the l.r.t. of block equicorrelation and equivariance. This test, or its single-block version, may find applications in many areas such as psychology, education, medicine and genetics, and is important "in many tests of multivariate analysis, e.g. in MANOVA, Profile Analysis, Growth Curve analysis, etc" [12, 9]. By decomposing the overall hypothesis into the hypotheses of independence of groups of variables and the hypothesis of equicorrelation and equivariance we are able to obtain the expressions for the overall l.r.t. statistic and its moments. From these we obtain a suitable factorization of the characteristic function (c.f.) of the logarithm of the l.r.t. statistic, which enables us to develop highly manageable and precise near-exact distributions for the test statistic.
Quantile regression for the statistical analysis of immunological data with many non-detects.
Eilers, Paul H C; Röder, Esther; Savelkoul, Huub F J; van Wijk, Roy Gerth
2012-07-07
Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an implementation to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.
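A minimal statsmodels illustration of the approach, assuming an invented two-group dataset in which well over half of the measurements fall below the detection limit; the 75th percentile is modelled because it lies above the limit in both groups.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
group = np.repeat([0, 1], 100)                       # e.g. placebo vs. treatment
true = np.exp(rng.normal(0.0 + 0.5 * group, 1.0))    # underlying marker values
LOD = 1.5
y = np.where(true < LOD, LOD, true)                  # non-detects set to the detection limit
print("fraction of non-detects:", np.mean(true < LOD).round(2))

X = sm.add_constant(group)
fit75 = sm.QuantReg(y, X).fit(q=0.75)                # quantile regression at the 75th percentile
print(fit75.summary().tables[1])                     # group effect despite the non-detects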
Estimating the proportion of true null hypotheses when the statistics are discrete.
Dialsingh, Isaac; Austin, Stefanie R; Altman, Naomi S
2015-07-15
In high-dimensional testing problems, π0, the proportion of null hypotheses that are true, is an important parameter. For discrete test statistics, the P values come from a discrete distribution with finite support, and the null distribution may depend on an ancillary statistic, such as a table margin, that varies among the test statistics. Methods for estimating π0 developed for continuous test statistics, which depend on a uniform or identical null distribution of P values, may not perform well when applied to discrete testing problems. This article introduces a number of π0 estimators, the regression and 'T' methods, that perform well with discrete test statistics, and also assesses how well methods developed for or adapted from continuous tests perform with discrete tests. We demonstrate the usefulness of these estimators in the analysis of high-throughput biological RNA-seq and single-nucleotide polymorphism data. The methods are implemented in R. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
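For orientation, a Storey-type λ-threshold estimator of π0 is sketched below on simulated continuous p-values; the regression and 'T' estimators introduced in the paper for discrete statistics are not reproduced here.

import numpy as np

def pi0_lambda(pvalues, lam=0.5):
    """Storey-type estimator: the share of p-values above lam, rescaled by (1 - lam).
    Designed for continuous p-values; with discrete statistics it tends to be
    conservative, which motivates the specialised estimators discussed above."""
    p = np.asarray(pvalues)
    return min(1.0, np.mean(p > lam) / (1.0 - lam))

# Toy mixture: 80% true nulls (uniform p-values), 20% alternatives (small p-values).
rng = np.random.default_rng(9)
p_null = rng.uniform(size=8000)
p_alt  = rng.beta(0.2, 5.0, size=2000)
print("estimated pi0:", round(pi0_lambda(np.concatenate([p_null, p_alt])), 3))   # ~0.8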
López-Carr, David; Pricope, Narcisa G.; Aukema, Juliann E.; Jankowska, Marta M.; Funk, Christopher C.; Husak, Gregory J.; Michaelsen, Joel C.
2014-01-01
We present an integrative measure of exposure and sensitivity components of vulnerability to climatic and demographic change for the African continent in order to identify “hot spots” of high potential population vulnerability. Getis-Ord Gi* spatial clustering analyses reveal statistically significant locations of spatio-temporal precipitation decline coinciding with high population density and increase. Statistically significant areas are evident, particularly across central, southern, and eastern Africa. The highly populated Lake Victoria basin emerges as a particularly salient hot spot. People located in the regions highlighted in this analysis suffer exceptionally high exposure to negative climate change impacts (as populations increase on lands with decreasing rainfall). Results may help inform further hot spot mapping and related research on demographic vulnerabilities to climate change. Results may also inform more suitable geographical targeting of policy interventions across the continent.
Software for the Integration of Multiomics Experiments in Bioconductor.
Ramos, Marcel; Schiffer, Lucas; Re, Angela; Azhar, Rimsha; Basunia, Azfar; Rodriguez, Carmen; Chan, Tiffany; Chapman, Phil; Davis, Sean R; Gomez-Cabrero, David; Culhane, Aedin C; Haibe-Kains, Benjamin; Hansen, Kasper D; Kodali, Hanish; Louis, Marie S; Mer, Arvind S; Riester, Markus; Morgan, Martin; Carey, Vince; Waldron, Levi
2017-11-01
Multiomics experiments are increasingly commonplace in biomedical research and add layers of complexity to experimental design, data integration, and analysis. R and Bioconductor provide a generic framework for statistical analysis and visualization, as well as specialized data classes for a variety of high-throughput data types, but methods are lacking for integrative analysis of multiomics experiments. The MultiAssayExperiment software package, implemented in R and leveraging Bioconductor software and design principles, provides for the coordinated representation of, storage of, and operation on multiple diverse genomics data. We provide the unrestricted multiple 'omics data for each cancer tissue in The Cancer Genome Atlas as ready-to-analyze MultiAssayExperiment objects and demonstrate in these and other datasets how the software simplifies data representation, statistical analysis, and visualization. The MultiAssayExperiment Bioconductor package reduces major obstacles to efficient, scalable, and reproducible statistical analysis of multiomics data and enhances data science applications of multiple omics datasets. Cancer Res; 77(21); e39-42. ©2017 American Association for Cancer Research (AACR).
Kobayashi, Katsuhiro; Jacobs, Julia; Gotman, Jean
2013-01-01
Objective A novel type of statistical time-frequency analysis was developed to elucidate changes of high-frequency EEG activity associated with epileptic spikes. Methods The method uses the Gabor Transform and detects changes of power in comparison to background activity using t-statistics that are controlled by the false discovery rate (FDR) to correct type I error of multiple testing. The analysis was applied to EEGs recorded at 2000 Hz from three patients with mesial temporal lobe epilepsy. Results Spike-related increase of high-frequency oscillations (HFOs) was clearly shown in the FDR-controlled t-spectra: it was most dramatic in spikes recorded from the hippocampus when the hippocampus was the seizure onset zone (SOZ). Depression of fast activity was observed immediately after the spikes, especially consistently in the discharges from the hippocampal SOZ. It corresponded to the slow wave part in case of spike-and-slow-wave complexes, but it was noted even in spikes without apparent slow waves. In one patient, a gradual increase of power above 200 Hz preceded spikes. Conclusions FDR-controlled t-spectra clearly detected the spike-related changes of HFOs that were unclear in standard power spectra. Significance We developed a promising tool to study the HFOs that may be closely linked to the pathophysiology of epileptogenesis. PMID:19394892
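The FDR-controlled t-map idea can be illustrated generically in Python: pointwise t-tests over a time-frequency power map followed by the Benjamini-Hochberg step-up rule; the simulated data and the plain t-test stand in for the Gabor-transform pipeline of the paper.

import numpy as np
from scipy import stats

# Toy power maps: 40 spike epochs vs. 40 background epochs, 50 freq x 100 time bins,
# with extra high-frequency power injected around the simulated spike.
rng = np.random.default_rng(10)
spikes = rng.normal(0, 1, (40, 50, 100))
background = rng.normal(0, 1, (40, 50, 100))
spikes[:, 35:, 45:55] += 1.2

t, p = stats.ttest_ind(spikes, background, axis=0)   # pointwise t-statistics

# Benjamini-Hochberg step-up procedure over all time-frequency pixels.
alpha = 0.05
p_flat = p.ravel()
order = np.argsort(p_flat)
m = p_flat.size
passed = p_flat[order] <= alpha * np.arange(1, m + 1) / m
k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
significant = np.zeros(m, dtype=bool)
significant[order[:k]] = True
significant = significant.reshape(p.shape)
print("FDR-significant pixels:", int(significant.sum()), "of", m)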
Similar protein expression profiles of ovarian and endometrial high-grade serous carcinomas.
Hiramatsu, Kosuke; Yoshino, Kiyoshi; Serada, Satoshi; Yoshihara, Kosuke; Hori, Yumiko; Fujimoto, Minoru; Matsuzaki, Shinya; Egawa-Takata, Tomomi; Kobayashi, Eiji; Ueda, Yutaka; Morii, Eiichi; Enomoto, Takayuki; Naka, Tetsuji; Kimura, Tadashi
2016-03-01
Ovarian and endometrial high-grade serous carcinomas (HGSCs) have similar clinical and pathological characteristics; however, exhaustive protein expression profiling of these cancers has yet to be reported. We performed protein expression profiling on 14 cases of HGSCs (7 ovarian and 7 endometrial) and 18 endometrioid carcinomas (9 ovarian and 9 endometrial) using iTRAQ-based exhaustive and quantitative protein analysis. We identified 828 tumour-expressed proteins and evaluated the statistical similarity of protein expression profiles between ovarian and endometrial HGSCs using unsupervised hierarchical cluster analysis (P<0.01). Using 45 statistically highly expressed proteins in HGSCs, protein ontology analysis detected two enriched terms and proteins composing each term: IMP2 and MCM2. Immunohistochemical analyses confirmed the higher expression of IMP2 and MCM2 in ovarian and endometrial HGSCs as well as in tubal and peritoneal HGSCs than in endometrioid carcinomas (P<0.01). The knockdown of either IMP2 or MCM2 by siRNA interference significantly decreased the proliferation rate of ovarian HGSC cell line (P<0.01). We demonstrated the statistical similarity of the protein expression profiles of ovarian and endometrial HGSC beyond the organs. We suggest that increased IMP2 and MCM2 expression may underlie some of the rapid HGSC growth observed clinically.
A Virtual Study of Grid Resolution on Experiments of a Highly-Resolved Turbulent Plume
NASA Astrophysics Data System (ADS)
Maisto, Pietro M. F.; Marshall, Andre W.; Gollner, Michael J.; Fire Protection Engineering Department Collaboration
2017-11-01
An accurate representation of sub-grid scale turbulent mixing is critical for modeling fire plumes and smoke transport. In this study, PLIF and PIV diagnostics are used with the saltwater modeling technique to provide highly-resolved instantaneous field measurements in unconfined turbulent plumes useful for statistical analysis, physical insight, and model validation. The effect of resolution was investigated employing a virtual interrogation window (of varying size) applied to the high-resolution field measurements. Motivated by LES low-pass filtering concepts, the high-resolution experimental data in this study can be analyzed within the interrogation windows (i.e. statistics at the sub-grid scale) and on interrogation windows (i.e. statistics at the resolved scale). A dimensionless resolution threshold (L/D*) criterion was determined to achieve converged statistics on the filtered measurements. Such a criterion was then used to establish the relative importance between large and small-scale turbulence phenomena while investigating specific scales for the turbulent flow. First order data sets start to collapse at a resolution of 0.3D*, while for second and higher order statistical moments the interrogation window size drops down to 0.2D*.
Local sensitivity analysis for inverse problems solved by singular value decomposition
Hill, M.C.; Nolan, B.T.
2010-01-01
Local sensitivity analysis provides computationally frugal ways to evaluate models commonly used for resource management, risk assessment, and so on. This includes diagnosing inverse model convergence problems caused by parameter insensitivity and(or) parameter interdependence (correlation), understanding what aspects of the model and data contribute to measures of uncertainty, and identifying new data likely to reduce model uncertainty. Here, we consider sensitivity statistics relevant to models in which the process model parameters are transformed using singular value decomposition (SVD) to create SVD parameters for model calibration. The statistics considered include the PEST identifiability statistic, and combined use of the process-model parameter statistics composite scaled sensitivities and parameter correlation coefficients (CSS and PCC). The statistics are complementary in that the identifiability statistic integrates the effects of parameter sensitivity and interdependence, while CSS and PCC provide individual measures of sensitivity and interdependence. PCC quantifies correlations between pairs or larger sets of parameters; when a set of parameters is intercorrelated, the absolute value of PCC is close to 1.00 for all pairs in the set. The number of singular vectors to include in the calculation of the identifiability statistic is somewhat subjective and influences the statistic. To demonstrate the statistics, we use the USDA's Root Zone Water Quality Model to simulate nitrogen fate and transport in the unsaturated zone of the Merced River Basin, CA. There are 16 log-transformed process-model parameters, including water content at field capacity (WFC) and bulk density (BD) for each of five soil layers. Calibration data consisted of 1,670 observations comprising soil moisture, soil water tension, aqueous nitrate and bromide concentrations, soil nitrate concentration, and organic matter content. All 16 of the SVD parameters could be estimated by regression based on the range of singular values. Identifiability statistic results varied based on the number of SVD parameters included. Identifiability statistics calculated for four SVD parameters indicate the same three most important process-model parameters as CSS/PCC (WFC1, WFC2, and BD2), but the order differed. Additionally, the identifiability statistic showed that BD1 was almost as dominant as WFC1. The CSS/PCC analysis showed that this results from its high correlation with WFC1 (-0.94), and not its individual sensitivity. Such distinctions, combined with analysis of how high correlations and(or) sensitivities result from the constructed model, can produce important insights into, for example, the use of sensitivity analysis to design monitoring networks. In conclusion, the statistics considered identified similar important parameters. They differ because (1) use of CSS/PCC can be more awkward because sensitivity and interdependence are considered separately, and (2) identifiability requires consideration of how many SVD parameters to include. A continuing challenge is to understand how these computationally efficient methods compare with computationally demanding global methods like Markov-Chain Monte Carlo given common nonlinear processes and the often even more nonlinear models.
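A hedged numerical sketch of the two process-model parameter statistics named above, computed from an invented weighted Jacobian rather than the RZWQM application: composite scaled sensitivities (CSS) are the root-mean-square of the parameter-scaled sensitivities, and parameter correlation coefficients (PCC) come from the approximate covariance matrix.

import numpy as np

# Toy weighted Jacobian: 200 observations x 4 parameters, with the last two
# parameters made nearly collinear to mimic an interdependent pair.
rng = np.random.default_rng(11)
J = rng.normal(size=(200, 4))
J[:, 3] = -0.95 * J[:, 2] + 0.05 * rng.normal(size=200)
params = np.array([0.30, 1.45, 0.25, 1.30])          # current parameter values

scaled = J * params                                   # parameter-scaled sensitivities
css = np.sqrt((scaled**2).mean(axis=0))               # composite scaled sensitivities

cov = np.linalg.inv(J.T @ J)                          # approximate parameter covariance
d = np.sqrt(np.diag(cov))
pcc = cov / np.outer(d, d)                            # parameter correlation coefficients

print("CSS:", np.round(css, 3))
print("PCC between parameters 3 and 4:", round(pcc[2, 3], 2))   # near +/-1 => interdependent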
Advanced statistical energy analysis
NASA Astrophysics Data System (ADS)
Heron, K. H.
1994-09-01
A high-frequency theory (advanced statistical energy analysis (ASEA)) is developed which takes account of the mechanism of tunnelling and uses a ray theory approach to track the power flowing around a plate or a beam network, and then uses statistical energy analysis (SEA) to take care of any residual power. ASEA divides the energy of each sub-system into energy that is freely available for transfer to other sub-systems and energy that is fixed within the sub-systems that are physically separate, and can be interpreted as a series of mathematical models, the first of which is identical to standard SEA, while subsequent higher order models are convergent on an accurate prediction. Using a structural assembly of six rods as an example, ASEA is shown to converge onto the exact results while SEA is shown to overpredict by up to 60 dB.
NASA Astrophysics Data System (ADS)
Dibike, Y. B.; Eum, H. I.; Prowse, T. D.
2017-12-01
Flows originating from alpine-dominated cold-region watersheds typically experience extended winter low flows followed by spring snowmelt and summer rainfall-driven high flows. In a warmer climate, there will be a temperature-induced shift in precipitation from snow towards rain, as well as changes in snowmelt timing affecting the frequency of extreme high and low flow events, which could significantly alter ecosystem services. This study examines the potential changes in the frequency and severity of hydrologic extremes in the Athabasca River watershed in Alberta, Canada based on the Variable Infiltration Capacity (VIC) hydrologic model and selected and statistically downscaled climate change scenario data from the latest Coupled Model Intercomparison Project (CMIP5). The sensitivity of these projected changes is also examined by applying different extreme flow analysis methods. The hydrological model projections show an overall increase in mean annual streamflow in the watershed and a corresponding shift in the freshet timing to an earlier period. Most of the streams are projected to experience increases during the winter and spring seasons and decreases during the summer and early fall seasons, with overall projected increases in extreme high flows, especially for low-frequency events. While the middle and lower parts of the watershed are characterised by projected increases in extreme high flows, the high-elevation alpine region is mainly characterised by corresponding decreases in extreme low flow events. However, the magnitude of projected changes in extreme flow varies over a wide range, especially for low-frequency events, depending on the climate scenario and period of analysis, and sometimes in a nonlinear way. Nonetheless, the sensitivity of the projected changes to the statistical method of analysis is found to be relatively small compared to the inter-model variability.
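One common way to express such changes in extreme high flows is a generalized extreme value (GEV) fit to annual maxima and a comparison of return levels; the Python sketch below uses simulated baseline and future series and is not the frequency-analysis method of the study.

import numpy as np
from scipy.stats import genextreme

rng = np.random.default_rng(12)
# Toy annual maximum flows (m^3/s) for a baseline and a future scenario period.
baseline = genextreme.rvs(c=-0.1, loc=900, scale=200, size=30, random_state=rng)
future   = genextreme.rvs(c=-0.1, loc=1000, scale=260, size=30, random_state=rng)

def return_level(sample, return_period):
    """Flow magnitude exceeded on average once every `return_period` years."""
    c, loc, scale = genextreme.fit(sample)
    return genextreme.ppf(1.0 - 1.0 / return_period, c, loc=loc, scale=scale)

for T in (10, 50, 100):
    q0, q1 = return_level(baseline, T), return_level(future, T)
    print(f"{T:3d}-yr flow: baseline {q0:7.0f}  future {q1:7.0f}  change {100 * (q1 / q0 - 1):+.1f}%")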
Estimation of integral curves from high angular resolution diffusion imaging (HARDI) data.
Carmichael, Owen; Sakhanenko, Lyudmila
2015-05-15
We develop statistical methodology for a popular brain imaging technique HARDI based on the high order tensor model by Özarslan and Mareci [10]. We investigate how uncertainty in the imaging procedure propagates through all levels of the model: signals, tensor fields, vector fields, and fibers. We construct asymptotically normal estimators of the integral curves or fibers which allow us to trace the fibers together with confidence ellipsoids. The procedure is computationally intense as it blends linear algebra concepts from high order tensors with asymptotical statistical analysis. The theoretical results are illustrated on simulated and real datasets. This work generalizes the statistical methodology proposed for low angular resolution diffusion tensor imaging by Carmichael and Sakhanenko [3], to several fibers per voxel. It is also a pioneering statistical work on tractography from HARDI data. It avoids all the typical limitations of the deterministic tractography methods and it delivers the same information as probabilistic tractography methods. Our method is computationally cheap and it provides well-founded mathematical and statistical framework where diverse functionals on fibers, directions and tensors can be studied in a systematic and rigorous way.
Estimation of integral curves from high angular resolution diffusion imaging (HARDI) data
Carmichael, Owen; Sakhanenko, Lyudmila
2015-01-01
We develop statistical methodology for a popular brain imaging technique HARDI based on the high order tensor model by Özarslan and Mareci [10]. We investigate how uncertainty in the imaging procedure propagates through all levels of the model: signals, tensor fields, vector fields, and fibers. We construct asymptotically normal estimators of the integral curves or fibers which allow us to trace the fibers together with confidence ellipsoids. The procedure is computationally intense as it blends linear algebra concepts from high order tensors with asymptotical statistical analysis. The theoretical results are illustrated on simulated and real datasets. This work generalizes the statistical methodology proposed for low angular resolution diffusion tensor imaging by Carmichael and Sakhanenko [3], to several fibers per voxel. It is also a pioneering statistical work on tractography from HARDI data. It avoids all the typical limitations of the deterministic tractography methods and it delivers the same information as probabilistic tractography methods. Our method is computationally cheap and it provides well-founded mathematical and statistical framework where diverse functionals on fibers, directions and tensors can be studied in a systematic and rigorous way. PMID:25937674
Comparative analysis of positive and negative attitudes toward statistics
NASA Astrophysics Data System (ADS)
Ghulami, Hassan Rahnaward; Ab Hamid, Mohd Rashid; Zakaria, Roslinazairimah
2015-02-01
Many statistics lecturers and statistics education researchers are interested in their students' attitudes toward statistics during the statistics course. A positive attitude toward statistics is vital because it encourages students to take an interest in the course and to master its core content. Students with negative attitudes toward statistics tend to feel depressed, especially in group assignments, are at risk of failure, are often highly emotional, and struggle to move forward. Therefore, this study investigates students' attitudes towards learning statistics. Six latent constructs measure students' attitudes toward learning statistics: affect, cognitive competence, value, difficulty, interest, and effort. The questionnaire was adopted and adapted from the reliable and validated Survey of Attitudes Toward Statistics (SATS) instrument. The study was conducted among undergraduate engineering students at Universiti Malaysia Pahang (UMP). The respondents were students taking the applied statistics course in different faculties. From the analysis, it is found that the questionnaire is acceptable, and the relationships among the constructs were proposed and investigated. Students show full effort to master the statistics course, find the course enjoyable, are confident in their intellectual capacity, and hold more positive than negative attitudes towards learning statistics. In conclusion, positive attitudes were mostly exhibited on the affect, cognitive competence, value, interest and effort constructs, while negative attitudes were mostly exhibited on the difficulty construct.
NASA Astrophysics Data System (ADS)
McCray, Wilmon Wil L., Jr.
The research was prompted by the need for a study assessing the process improvement, quality management and analytical techniques taught to students in undergraduate and graduate systems engineering and computing science (e.g., software engineering, computer science, and information technology) degree programs at U.S. colleges and universities, techniques that can be applied to quantitatively manage processes for performance. Everyone involved in executing repeatable processes in the software and systems development lifecycle needs to become familiar with the concepts of quantitative management, statistical thinking, process improvement methods and how they relate to process performance. Organizations are starting to embrace the de facto Software Engineering Institute (SEI) Capability Maturity Model Integration (CMMI) models as process improvement frameworks to improve business process performance. High maturity process areas in the CMMI model imply the use of analytical, statistical and quantitative management techniques, and process performance modeling, to identify and eliminate sources of variation, continually improve process performance, reduce cost and predict future outcomes. The research study identifies and discusses in detail the gap analysis findings on process improvement and quantitative analysis techniques taught in U.S. university systems engineering and computing science degree programs, gaps that exist in the literature, and a comparison analysis which identifies the gaps between the SEI's "healthy ingredients" of a process performance model and courses taught in U.S. university degree programs. The research also heightens awareness that academicians have conducted little research on applicable statistics and quantitative techniques that can be used to demonstrate high maturity as implied in the CMMI models. The research also includes a Monte Carlo simulation optimization model and dashboard that demonstrate the use of statistical methods, statistical process control, sensitivity analysis, and quantitative and optimization techniques to establish a baseline and predict future customer satisfaction index scores (outcomes). The American Customer Satisfaction Index (ACSI) model and industry benchmarks were used as a framework for the simulation model.
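As a toy sketch of the kind of Monte Carlo prediction described above, the Python snippet below propagates three uncertain satisfaction drivers through an invented linear score model; the drivers, weights and ranges are hypothetical and are not the ACSI model or the dissertation's fitted model.

import numpy as np

rng = np.random.default_rng(13)
n = 100_000

# Hypothetical uncertain drivers of the index (triangular: low, most likely, high).
quality     = rng.triangular(70, 82, 95, n)
expectation = rng.triangular(60, 75, 90, n)
value       = rng.triangular(65, 78, 92, n)

# Invented linear model standing in for a fitted process-performance model.
score = 0.5 * quality + 0.2 * expectation + 0.3 * value + rng.normal(0, 2, n)

low, high = np.percentile(score, [5, 95])
print(f"predicted index: mean {score.mean():.1f}, 90% interval [{low:.1f}, {high:.1f}]")
print(f"P(score >= 80) = {np.mean(score >= 80):.2f}")

# Crude sensitivity check: correlation of each driver with the simulated outcome.
for name, x in [("quality", quality), ("expectation", expectation), ("value", value)]:
    print(f"corr({name}, score) = {np.corrcoef(x, score)[0, 1]:.2f}")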
Modeling and replicating statistical topology and evidence for CMB nonhomogeneity
Agami, Sarit
2017-01-01
Under the banner of “big data,” the detection and classification of structure in extremely large, high-dimensional, data sets are two of the central statistical challenges of our times. Among the most intriguing new approaches to this challenge is “TDA,” or “topological data analysis,” one of the primary aims of which is providing nonmetric, but topologically informative, preanalyses of data which make later, more quantitative, analyses feasible. While TDA rests on strong mathematical foundations from topology, in applications, it has faced challenges due to difficulties in handling issues of statistical reliability and robustness, often leading to an inability to make scientific claims with verifiable levels of statistical confidence. We propose a methodology for the parametric representation, estimation, and replication of persistence diagrams, the main diagnostic tool of TDA. The power of the methodology lies in the fact that even if only one persistence diagram is available for analysis—the typical case for big data applications—the replications permit conventional statistical hypothesis testing. The methodology is conceptually simple and computationally practical, and provides a broadly effective statistical framework for persistence diagram TDA analysis. We demonstrate the basic ideas on a toy example, and the power of the parametric approach to TDA modeling in an analysis of cosmic microwave background (CMB) nonhomogeneity. PMID:29078301
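The replication idea can be illustrated, very loosely, with a generic parametric bootstrap. The sketch below is not the authors' methodology: it simply fits a Gamma model to hypothetical persistence lengths from a single diagram and draws replicate diagrams from the fit so that a summary statistic can be compared against a null distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Hypothetical persistence lengths (death - birth) from one observed
# persistence diagram; in practice these would come from a TDA library.
observed = rng.gamma(shape=1.5, scale=0.2, size=80)

# Parametric representation: fit a Gamma model to the persistence lengths.
shape, loc, scale = stats.gamma.fit(observed, floc=0)

# Replication: draw synthetic diagrams of the same size from the fitted model
# and build a reference distribution for a summary statistic (longest persistence).
n_rep = 2000
replicated = rng.gamma(shape, scale, size=(n_rep, observed.size))
p_value = (replicated.max(axis=1) >= observed.max()).mean()
print(f"fitted shape={shape:.2f}, scale={scale:.2f}, "
      f"P(replicate max >= observed max) = {p_value:.3f}")
```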
Text mining and network analysis to find functional associations of genes in high altitude diseases.
Bhasuran, Balu; Subramanian, Devika; Natarajan, Jeyakumar
2018-05-02
Travel to elevations above 2500 m is associated with the risk of developing one or more forms of acute altitude illness such as acute mountain sickness (AMS), high altitude cerebral edema (HACE) or high altitude pulmonary edema (HAPE). Our work aims to identify the functional association of genes involved in high altitude diseases. In this work we identified the gene networks responsible for high altitude diseases by using the principle of gene co-occurrence statistics from literature and network analysis. First, we mined the literature data from PubMed on high-altitude diseases and extracted the co-occurring gene pairs. Next, gene pairs were ranked based on their co-occurrence frequency. Finally, a gene association network was created using statistical measures to explore potential relationships. Network analysis results revealed that EPO, ACE, IL6 and TNF are among the top five genes, each found to co-occur with 20 or more genes, while the association between the EPAS1 and EGLN1 genes is strongly substantiated. The network constructed in this study proposes a large number of genes that act in concert under high altitude conditions. Overall, the result provides a good reference for further study of the genetic relationships in high altitude diseases. Copyright © 2018 Elsevier Ltd. All rights reserved.
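A small sketch of the network-construction step is given below, assuming co-occurrence counts have already been extracted from abstracts. The gene pairs and counts here are invented for illustration and do not reproduce the study's results.

```python
import networkx as nx

# Hypothetical co-occurrence counts extracted from abstracts:
# (gene_a, gene_b, number of abstracts mentioning both).
pairs = [("EPO", "EPAS1", 12), ("EPO", "ACE", 9), ("ACE", "IL6", 7),
         ("IL6", "TNF", 15), ("EPAS1", "EGLN1", 21), ("TNF", "EPO", 5)]

G = nx.Graph()
for a, b, count in pairs:
    G.add_edge(a, b, weight=count)

# Rank genes by how many partners they co-occur with (degree),
# analogous to identifying hub genes in the association network.
hubs = sorted(G.degree, key=lambda kv: kv[1], reverse=True)
print("hub genes:", hubs)

# Edges above a count threshold are kept as strongly substantiated associations.
strong = [(a, b, d["weight"]) for a, b, d in G.edges(data=True) if d["weight"] >= 10]
print("strong associations:", strong)
```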
The impact of mother's literacy on child dental caries: Individual data or aggregate data analysis?
Haghdoost, Ali-Akbar; Hessari, Hossein; Baneshi, Mohammad Reza; Rad, Maryam; Shahravan, Arash
2017-01-01
To evaluate the impact of mother's literacy on child dental caries based on a national oral health survey in Iran and to investigate the possibility of ecological fallacy in aggregate data analysis. Existing data were from the second national oral health survey, carried out in 2004, which included 8725 six-year-old participants. The association of mother's literacy with caries occurrence (DMF (Decayed, Missing, Filled) total score >0) in her child was assessed using individual data with a logistic regression model. Then the association between the percentage of literate mothers and the percentage of decayed teeth in each of the 30 provinces of Iran was assessed using aggregated data retrieved from the second national oral health survey of Iran and, alternatively, from the census of the Statistical Center of Iran, using a linear regression model. The significance level was set at 0.05 for all analyses. Individual data analysis showed a statistically significant association between mother's literacy and decayed teeth of children (P = 0.02, odds ratio = 0.83). There was no statistically significant association between mother's literacy and child dental caries in the aggregate data analysis of the oral health survey (P = 0.79, B = 0.03) or of the census of the Statistical Center of Iran (P = 0.60, B = 0.14). Maternal literacy has a preventive effect on the occurrence of dental caries in children. Given the high percentage of illiterate parents in Iran, it is reasonable to consider oral health education methods that do not require reading or writing. Aggregate data analysis and individual data analysis yielded completely different results in this study.
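The contrast between individual-level and ecological analysis can be sketched as follows. The data are simulated stand-ins (the assumed effect size is only loosely motivated by the reported odds ratio), not the survey data, and the column names are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated stand-in for the survey: children nested in 30 provinces,
# binary maternal literacy and binary caries outcome.
n, provinces = 8725, 30
df = pd.DataFrame({
    "province": rng.integers(0, provinces, n),
    "literate": rng.integers(0, 2, n),
})
# Assume a modest protective effect of literacy (log-odds -0.19, about OR 0.83).
logit = 0.6 - 0.19 * df["literate"]
df["caries"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Individual-level analysis: logistic regression.
ind = smf.logit("caries ~ literate", data=df).fit(disp=False)
print("individual OR:", np.exp(ind.params["literate"]).round(2))

# Aggregate (ecological) analysis: province percentages, linear regression.
agg = df.groupby("province").agg(pct_literate=("literate", "mean"),
                                 pct_caries=("caries", "mean")).reset_index()
eco = smf.ols("pct_caries ~ pct_literate", data=agg).fit()
print("ecological slope:", eco.params["pct_literate"].round(2),
      "p =", eco.pvalues["pct_literate"].round(2))
```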
Analysis of traffic crash data in Kentucky (2011-2015).
DOT National Transportation Integrated Search
2016-09-01
This report documents an analysis of traffic crash data in Kentucky for the years of 2011 through 2015. A primary objective of this study was to determine average crash statistics for Kentucky highways. Rates were calculated for various types of high...
Analysis of Traffic Crash Data in Kentucky (2012-2016).
DOT National Transportation Integrated Search
2017-09-01
This report documents an analysis of traffic crash data in Kentucky for the years of 2012 through 2016. A primary objective of this study was to determine average crash statistics for Kentucky highways. Rates were calculated for various types of high...
Analysis of traffic crash data in Kentucky (2009-2013).
DOT National Transportation Integrated Search
2014-09-01
This report documents an analysis of traffic crash data in Kentucky for the years of 2009 through 2013. A primary objective of this study was to determine average crash statistics for Kentucky highways. Rates were calculated for various types of high...
Mapping Quantitative Traits in Unselected Families: Algorithms and Examples
Dupuis, Josée; Shi, Jianxin; Manning, Alisa K.; Benjamin, Emelia J.; Meigs, James B.; Cupples, L. Adrienne; Siegmund, David
2009-01-01
Linkage analysis has been widely used to identify from family data genetic variants influencing quantitative traits. Common approaches have both strengths and limitations. Likelihood ratio tests typically computed in variance component analysis can accommodate large families but are highly sensitive to departure from normality assumptions. Regression-based approaches are more robust but their use has primarily been restricted to nuclear families. In this paper, we develop methods for mapping quantitative traits in moderately large pedigrees. Our methods are based on the score statistic which in contrast to the likelihood ratio statistic, can use nonparametric estimators of variability to achieve robustness of the false positive rate against departures from the hypothesized phenotypic model. Because the score statistic is easier to calculate than the likelihood ratio statistic, our basic mapping methods utilize relatively simple computer code that performs statistical analysis on output from any program that computes estimates of identity-by-descent. This simplicity also permits development and evaluation of methods to deal with multivariate and ordinal phenotypes, and with gene-gene and gene-environment interaction. We demonstrate our methods on simulated data and on fasting insulin, a quantitative trait measured in the Framingham Heart Study. PMID:19278016
DAnTE: a statistical tool for quantitative analysis of -omics data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Polpitiya, Ashoka D.; Qian, Weijun; Jaitly, Navdeep
2008-05-03
DAnTE (Data Analysis Tool Extension) is a statistical tool designed to address challenges unique to quantitative bottom-up, shotgun proteomics data. This tool has also been demonstrated for microarray data and can easily be extended to other high-throughput data types. DAnTE features selected normalization methods, missing value imputation algorithms, peptide to protein rollup methods, an extensive array of plotting functions, and a comprehensive ANOVA scheme that can handle unbalanced data and random effects. The Graphical User Interface (GUI) is designed to be very intuitive and user friendly.
GIS Adoption among Senior High School Geography Teachers in Taiwan
ERIC Educational Resources Information Center
Lay, Jinn-Guey; Chen, Yu-Wen; Chi, Yu-Lin
2013-01-01
This article explores the adoption of geographic information system (GIS) knowledge and skills through in-service training for high school geography teachers in Taiwan. Through statistical analysis of primary data collected from a census of Taiwan's high school geography teachers, it explores what motivates these teachers to undertake GIS…
Xia, Li C; Ai, Dongmei; Cram, Jacob A; Liang, Xiaoyi; Fuhrman, Jed A; Sun, Fengzhu
2015-09-21
Local trend (i.e. shape) analysis of time series data reveals co-changing patterns in dynamics of biological systems. However, slow permutation procedures to evaluate the statistical significance of local trend scores have limited its applications to high-throughput time series data analysis, e.g., data from the next generation sequencing technology based studies. By extending the theories for the tail probability of the range of sum of Markovian random variables, we propose formulae for approximating the statistical significance of local trend scores. Using simulations and real data, we show that the approximate p-value is close to that obtained using a large number of permutations (starting at time points >20 with no delay and >30 with delay of at most three time steps) in that the non-zero decimals of the p-values obtained by the approximation and the permutations are mostly the same when the approximate p-value is less than 0.05. In addition, the approximate p-value is slightly larger than that based on permutations making hypothesis testing based on the approximate p-value conservative. The approximation enables efficient calculation of p-values for pairwise local trend analysis, making large scale all-versus-all comparisons possible. We also propose a hybrid approach by integrating the approximation and permutations to obtain accurate p-values for significantly associated pairs. We further demonstrate its use with the analysis of the Polymouth Marine Laboratory (PML) microbial community time series from high-throughput sequencing data and found interesting organism co-occurrence dynamic patterns. The software tool is integrated into the eLSA software package that now provides accelerated local trend and similarity analysis pipelines for time series data. The package is freely available from the eLSA website: http://bitbucket.org/charade/elsa.
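The permutation baseline that the approximation replaces can be sketched with a simplified scoring rule. Here the local trend score is taken as the best contiguous stretch of agreement between sign-of-change series; the real eLSA scoring differs in detail, so this is only an illustration on toy data.

```python
import numpy as np

rng = np.random.default_rng(1)

def trend(x):
    """Sign of step-to-step change: +1 up, -1 down, 0 flat."""
    return np.sign(np.diff(x))

def local_trend_score(x, y):
    """Best contiguous stretch of trend agreement (Kadane's algorithm)."""
    s = trend(x) * trend(y)          # +1 where the two series co-change
    best = cur = 0.0
    for v in s:
        cur = max(0.0, cur + v)
        best = max(best, cur)
    return best

def permutation_pvalue(x, y, n_perm=2000):
    obs = local_trend_score(x, y)
    null = [local_trend_score(x, rng.permutation(y)) for _ in range(n_perm)]
    return obs, (1 + sum(v >= obs for v in null)) / (n_perm + 1)

# Two toy series that share a rising-then-falling shape plus noise.
t = np.linspace(0, 2 * np.pi, 40)
x = np.sin(t) + rng.normal(0, 0.2, t.size)
y = np.sin(t) + rng.normal(0, 0.2, t.size)
print(permutation_pvalue(x, y))
```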
Clesse, Christophe; Lighezzolo-Alnot, Joëlle; De Lavergne, Sylvie; Hamlin, Sandrine; Scheffler, Michèle
2018-06-01
The authors' purpose for this article is to identify, review and interpret all publications about the episiotomy rates worldwide. Based on the criteria from the PRISMA guidelines, twenty databases were scrutinized. All studies which include national statistics related to episiotomy were selected, as well as studies presenting estimated data. Sixty-one papers were selected with publication dates between 1995 and 2016. A static and dynamic analysis of all the results was carried out. The assumption for the decline in the number of episiotomies is discussed and confirmed, recalling that nowadays high rates of episiotomy remain in less industrialized countries and East Asia. Finally, our analysis aims to investigate the potential determinants which influence apparent statistical disparities.
Coupling strength assumption in statistical energy analysis
Lafont, T.; Totaro, N.
2017-01-01
This paper is a discussion of the hypothesis of weak coupling in statistical energy analysis (SEA). The examples of coupled oscillators and statistical ensembles of coupled plates excited by broadband random forces are discussed. In each case, a reference calculation is compared with the SEA calculation. First, it is shown that the main SEA relation, the coupling power proportionality, is always valid for two oscillators irrespective of the coupling strength. But the case of three subsystems, consisting of oscillators or ensembles of plates, indicates that the coupling power proportionality fails when the coupling is strong. Strong coupling leads to non-zero indirect coupling loss factors and, sometimes, even to a reversal of the energy flow direction from low to high vibrational temperature. PMID:28484335
Statistical characterization of planar two-dimensional Rayleigh-Taylor mixing layers
NASA Astrophysics Data System (ADS)
Sendersky, Dmitry
2000-10-01
The statistical evolution of a planar, randomly perturbed fluid interface subject to Rayleigh-Taylor instability is explored through numerical simulation in two space dimensions. The data set, generated by the front-tracking code FronTier, is highly resolved and covers a large ensemble of initial perturbations, allowing a more refined analysis of closure issues pertinent to the stochastic modeling of chaotic fluid mixing. We closely approach a two-fold convergence of the mean two-phase flow: convergence of the numerical solution under computational mesh refinement, and statistical convergence under increasing ensemble size. Quantities that appear in the two-phase averaged Euler equations are computed directly and analyzed for numerical and statistical convergence. Bulk averages show a high degree of convergence, while interfacial averages are convergent only in the outer portions of the mixing zone, where there is a coherent array of bubble and spike tips. Comparison with the familiar bubble/spike penetration law h = αAgt² is complicated by the lack of scale invariance, inability to carry the simulations to late time, the increasing Mach numbers of the bubble/spike tips, and sensitivity to the method of data analysis. Finally, we use the simulation data to analyze some constitutive properties of the mixing process.
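Fitting the growth coefficient in the penetration law h = αAgt² is a one-parameter least-squares problem, sketched below on synthetic data. The Atwood number, gravity, noise level, and "true" α are all assumed values, not results from the simulations described above.

```python
import numpy as np

# Synthetic bubble-penetration record h(t); A (Atwood number) and g are assumed.
A, g = 0.5, 9.8
alpha_true = 0.06
t = np.linspace(0.5, 3.0, 26)
rng = np.random.default_rng(3)
h = alpha_true * A * g * t**2 * (1 + rng.normal(0, 0.03, t.size))  # noisy "data"

# Least-squares estimate of alpha in h = alpha * A * g * t^2:
# regress h on the single basis function A*g*t^2 (no intercept).
basis = A * g * t**2
alpha_hat = np.sum(basis * h) / np.sum(basis**2)
print(f"fitted alpha = {alpha_hat:.3f}")

# Departures from pure t^2 scaling would show up as a trend in the residuals.
residuals = h - alpha_hat * basis
print("residual trend vs t:", np.polyfit(t, residuals, 1)[0])
```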
Statistical Analysis of Spectral Properties and Prosodic Parameters of Emotional Speech
NASA Astrophysics Data System (ADS)
Přibil, J.; Přibilová, A.
2009-01-01
The paper addresses reflection of microintonation and spectral properties in male and female acted emotional speech. Microintonation component of speech melody is analyzed regarding its spectral and statistical parameters. According to psychological research of emotional speech, different emotions are accompanied by different spectral noise. We control its amount by spectral flatness according to which the high frequency noise is mixed in voiced frames during cepstral speech synthesis. Our experiments are aimed at statistical analysis of cepstral coefficient values and ranges of spectral flatness in three emotions (joy, sadness, anger), and a neutral state for comparison. Calculated histograms of spectral flatness distribution are visually compared and modelled by Gamma probability distribution. Histograms of cepstral coefficient distribution are evaluated and compared using skewness and kurtosis. Achieved statistical results show good correlation comparing male and female voices for all emotional states portrayed by several Czech and Slovak professional actors.
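The distribution fitting and moment comparisons described above can be sketched as follows. The spectral-flatness and cepstral values are randomly generated stand-ins, not measurements from the Czech and Slovak speech corpus.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Stand-in for per-frame spectral flatness values of one emotional style
# (real values would come from cepstral analysis of the speech recordings).
flatness = rng.gamma(shape=2.0, scale=0.05, size=500)

# Fit a Gamma distribution (location fixed at 0, since flatness is non-negative).
shape, loc, scale = stats.gamma.fit(flatness, floc=0)
print(f"Gamma fit: shape={shape:.2f}, scale={scale:.3f}")

# Evaluate the fitted density at a few points for comparison with a histogram.
xs = np.linspace(0.01, flatness.max(), 5)
print("fitted pdf at", xs.round(2), "=", stats.gamma.pdf(xs, shape, loc, scale).round(2))

# Skewness and kurtosis, as used to compare cepstral-coefficient histograms.
cepstral_c1 = rng.normal(0.2, 0.8, 500)          # hypothetical c1 values
print("skewness:", stats.skew(cepstral_c1).round(3),
      "kurtosis:", stats.kurtosis(cepstral_c1).round(3))
```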
PCA as a practical indicator of OPLS-DA model reliability.
Worley, Bradley; Powers, Robert
Principal Component Analysis (PCA) and Orthogonal Projections to Latent Structures Discriminant Analysis (OPLS-DA) are powerful statistical modeling tools that provide insights into separations between experimental groups based on high-dimensional spectral measurements from NMR, MS or other analytical instrumentation. However, when used without validation, these tools may lead investigators to statistically unreliable conclusions. This danger is especially real for Partial Least Squares (PLS) and OPLS, which aggressively force separations between experimental groups. As a result, OPLS-DA is often used as an alternative method when PCA fails to expose group separation, but this practice is highly dangerous. Without rigorous validation, OPLS-DA can easily yield statistically unreliable group separation. A Monte Carlo analysis of PCA group separations and OPLS-DA cross-validation metrics was performed on NMR datasets with statistically significant separations in scores-space. A linearly increasing amount of Gaussian noise was added to each data matrix followed by the construction and validation of PCA and OPLS-DA models. With increasing added noise, the PCA scores-space distance between groups rapidly decreased and the OPLS-DA cross-validation statistics simultaneously deteriorated. A decrease in correlation between the estimated loadings (added noise) and the true (original) loadings was also observed. While the validity of the OPLS-DA model diminished with increasing added noise, the group separation in scores-space remained basically unaffected. Supported by the results of Monte Carlo analyses of PCA group separations and OPLS-DA cross-validation metrics, we provide practical guidelines and cross-validatory recommendations for reliable inference from PCA and OPLS-DA models.
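The Monte Carlo design (adding increasing Gaussian noise and tracking PCA group separation) can be sketched on simulated data. The group structure, dimensions, and noise levels below are assumptions for illustration, not the NMR datasets used in the study.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(11)

# Simulated two-group "spectral" data: 20 samples per group, 200 variables,
# with a small mean offset between the groups (stand-in for NMR bins).
n_per, p = 20, 200
offset = np.zeros(p); offset[:10] = 1.0
group_a = rng.normal(0, 1, (n_per, p))
group_b = rng.normal(0, 1, (n_per, p)) + offset

for noise_sd in [0.0, 1.0, 2.0, 4.0]:
    X = np.vstack([group_a, group_b]) + rng.normal(0, noise_sd, (2 * n_per, p))
    scores = PCA(n_components=2).fit_transform(X)
    d = np.linalg.norm(scores[:n_per].mean(0) - scores[n_per:].mean(0))
    print(f"added noise sd={noise_sd:.1f} -> PCA centroid distance {d:.2f}")
# The shrinking centroid distance mirrors the degradation described above;
# a supervised method such as OPLS-DA can still show separated scores on such
# data, which is why cross-validation metrics, not scores plots, should be
# used to judge model reliability.
```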
Lee, Yong Seuk; Lee, Sang Bok; Oh, Won Seok; Kwon, Yong Eok; Lee, Beom Koo
2016-01-01
The objectives of this study were (1) to evaluate the clinical and radiologic outcomes of open-wedge high tibial osteotomy focusing on patellofemoral alignment and (2) to search for correlation between variables and patellofemoral malalignment. A total of 46 knees (46 patients) from 32 females and 14 males who underwent open-wedge high tibial osteotomy were included in this retrospective case series. Outcomes were evaluated using clinical scales and radiologic parameters at the last follow-up. Pre-operative and final follow-up values were compared for the outcome analysis. For the focused analysis of the patellofemoral joint, correlation analyses between patellofemoral variables and pre- and post-operative weight-bearing line (WBL), clinical score, posterior slope, Blackburn Peel ratio, lateral patellar tilt, lateral patellar shift, and congruence angle were performed. The minimum follow-up period was 2 years and median follow-up period was 44 months (range 24-88 months). The percentage of weight-bearing line was shifted from 17.2 ± 11.1 to 56.7 ± 12.7%, and it was statistically significant (p < 0.01). Regarding the clinical results, statistical significance was observed using all scores (p < 0.01). In the radiologic evaluation, patellar descent was observed with statistical significance (p < 0.01). Last follow-up lateral patellar tilt was decreased with statistical significance (p < 0.01). In correlation analysis between variables of patellofemoral malalignment, the pre-operative weight-bearing line showed an association with the change in lateral patellar tilt and lateral patellar shift (correlation coefficient: 0.3). After open-wedge high tibial osteotomy, clinical results showed improvement, compared to pre-operative values. The patellar tilt and lateral patellar shift were not changed; however, descent of the patella was observed. Therefore, mild patellofemoral problems should not be a contraindication of the open-wedge high tibial osteotomy. Case series, Level IV.
NASA Astrophysics Data System (ADS)
Amin, Asad; Nasim, Wajid; Mubeen, Muhammad; Kazmi, Dildar Hussain; Lin, Zhaohui; Wahid, Abdul; Sultana, Syeda Refat; Gibbs, Jim; Fahad, Shah
2017-09-01
Unpredictable precipitation trends, largely influenced by climate change, have prolonged droughts or floods in South Asia. Statistical analysis of monthly, seasonal, and annual precipitation trends was carried out for different temporal (1996-2015 and 2041-2060) and spatial scales (39 meteorological stations) in Pakistan. A statistical downscaling model (SimCLIM) was used for future precipitation projection (2041-2060) and analyzed by a statistical approach. An ensemble approach combined with representative concentration pathways (RCPs) at the medium level was used for future projections. The magnitude and slope of trends were derived by applying the Mann-Kendall and Sen's slope statistical approaches. Geo-statistical applications were used to generate precipitation trend maps. Comparison of base and projected precipitation by statistical analysis is represented by maps and graphical visualization, which facilitates trend detection. Results of this study project that the precipitation trend was increasing at more than 70% of weather stations for February, March, April, August, and September in the base years. The precipitation trend decreased from February to April but increased from July to October in the projected years. The strongest decreasing trend was reported in January for the base years, and it also decreased in the projected years. Greater variation in precipitation trends between projected and base years was reported from February to April. Variations in the projected precipitation trend for Punjab and Baluchistan were most pronounced in March and April. Seasonal analysis shows large variation in winter, with an increasing trend at more than 30% of weather stations, rising to 40% for projected precipitation. High risk was reported in the base-year pre-monsoon season, where 90% of weather stations show an increasing trend, but in the projected years this proportion decreased to 33%. Finally, the annual precipitation trend increased at more than 90% of meteorological stations in the base years (1996-2015) but decreased for the projected years (2041-2060) at up to 76% of stations. These results reveal that the overall precipitation trend is decreasing in future years, which may prolong drought at 14% of the weather stations under study.
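The two trend statistics named above can be computed with a few lines of code. The sketch below implements a basic Mann-Kendall test (without a tie correction) and Sen's slope on an invented annual precipitation series; the station data are hypothetical.

```python
import numpy as np
from scipy import stats

def mann_kendall(x):
    """Mann-Kendall trend test (no tie correction) and Sen's slope."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    s = 0
    slopes = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += np.sign(x[j] - x[i])
            slopes.append((x[j] - x[i]) / (j - i))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / np.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / np.sqrt(var_s)
    else:
        z = 0.0
    p = 2 * (1 - stats.norm.cdf(abs(z)))
    return z, p, np.median(slopes)

# Hypothetical annual precipitation totals (mm) for one station.
precip = [310, 295, 330, 350, 340, 365, 360, 380, 375, 400,
          390, 410, 405, 430, 420, 445, 450, 460, 455, 480]
z, p, sen = mann_kendall(precip)
print(f"Z = {z:.2f}, p = {p:.4f}, Sen's slope = {sen:.1f} mm/year")
```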
Hadoop and friends - first experience at CERN with a new platform for high throughput analysis steps
NASA Astrophysics Data System (ADS)
Duellmann, D.; Surdy, K.; Menichetti, L.; Toebbicke, R.
2017-10-01
The statistical analysis of infrastructure metrics comes with several specific challenges, including the fairly large volume of unstructured metrics from a large set of independent data sources. Hadoop and Spark provide an ideal environment in particular for the first steps of skimming rapidly through hundreds of TB of low relevance data to find and extract the much smaller data volume that is relevant for statistical analysis and modelling. This presentation will describe the new Hadoop service at CERN and the use of several of its components for high throughput data aggregation and ad-hoc pattern searches. We will describe the hardware setup used, the service structure with a small set of decoupled clusters and the first experience with co-hosting different applications and performing software upgrades. We will further detail the common infrastructure used for data extraction and preparation from continuous monitoring and database input sources.
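A skimming step of the kind described (reducing a large volume of monitoring records to a small aggregate for later statistical analysis) might look like the PySpark sketch below. The paths, field names, and filter values are hypothetical and are not taken from the CERN setup.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("metrics-skim").getOrCreate()

# Hypothetical location and schema for raw monitoring metrics in JSON form.
metrics = spark.read.json("hdfs:///monitoring/metrics/2017/*.json")

# Skim: keep one service, aggregate to hourly averages per host, and write a
# much smaller extract that downstream statistical analysis can work with.
relevant = (metrics
            .filter(F.col("service") == "storage")
            .groupBy(F.window("timestamp", "1 hour"), "host")
            .agg(F.avg("value").alias("avg_value")))

relevant.write.mode("overwrite").parquet("hdfs:///user/analysis/storage_hourly")
spark.stop()
```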
Stanzel, Sven; Weimer, Marc; Kopp-Schneider, Annette
2013-06-01
High-throughput screening approaches are carried out for the toxicity assessment of a large number of chemical compounds. In such large-scale in vitro toxicity studies several hundred or thousand concentration-response experiments are conducted. The automated evaluation of concentration-response data using statistical analysis scripts saves time and yields more consistent results in comparison to data analysis performed by the use of menu-driven statistical software. Automated statistical analysis requires that concentration-response data are available in a standardised data format across all compounds. To obtain consistent data formats, a standardised data management workflow must be established, including guidelines for data storage, data handling and data extraction. In this paper two procedures for data management within large-scale toxicological projects are proposed. Both procedures are based on Microsoft Excel files as the researcher's primary data format and use a computer programme to automate the handling of data files. The first procedure assumes that data collection has not yet started whereas the second procedure can be used when data files already exist. Successful implementation of the two approaches into the European project ACuteTox is illustrated. Copyright © 2012 Elsevier Ltd. All rights reserved.
Lamm, Steven H; Ferdosi, Hamid; Dissen, Elisabeth K; Li, Ji; Ahn, Jaeil
2015-12-07
High levels (> 200 µg/L) of inorganic arsenic in drinking water are known to be a cause of human lung cancer, but the evidence at lower levels is uncertain. We have sought the epidemiological studies that have examined the dose-response relationship between arsenic levels in drinking water and the risk of lung cancer over a range that includes both high and low levels of arsenic. Regression analysis, based on six studies identified from an electronic search, examined the relationship between the log of the relative risk and the log of the arsenic exposure over a range of 1-1000 µg/L. The best-fitting continuous meta-regression model was sought and found to be a no-constant linear-quadratic analysis where both the risk and the exposure had been logarithmically transformed. This yielded both a statistically significant positive coefficient for the quadratic term and a statistically significant negative coefficient for the linear term. Sub-analyses by study design yielded results that were similar for both ecological studies and non-ecological studies. Statistically significant X-intercepts consistently found no increased level of risk at approximately 100-150 µg/L arsenic.
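The model form described (a no-constant linear-quadratic regression on log-transformed risk and exposure) can be sketched as below. The dose and relative-risk values are invented study-level placeholders, and the simple unweighted OLS fit here stands in for the full meta-regression.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical study-level data: arsenic level (µg/L) and relative risk of
# lung cancer (the published analysis pooled six studies over 1-1000 µg/L).
dose = np.array([5, 20, 60, 150, 400, 800], dtype=float)
rr   = np.array([0.95, 0.90, 0.92, 1.05, 1.60, 2.80])

log_dose = np.log(dose)
y = np.log(rr)

# No-constant linear-quadratic model on the log-log scale:
# log(RR) = b1*log(dose) + b2*log(dose)^2  (no intercept term).
X = np.column_stack([log_dose, log_dose**2])
fit = sm.OLS(y, X).fit()
b1, b2 = fit.params
print(fit.params.round(4))

# For dose > 1, log(RR) crosses zero again where b1 + b2*log(dose) = 0,
# i.e. at dose = exp(-b1/b2), analogous to the X-intercept discussed above.
if b2 != 0:
    print("estimated no-excess-risk crossover at", round(np.exp(-b1 / b2), 1), "µg/L")
```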
NASA Astrophysics Data System (ADS)
Zavaletta, Vanessa A.; Bartholmai, Brian J.; Robb, Richard A.
2007-03-01
Diffuse lung diseases, such as idiopathic pulmonary fibrosis (IPF), can be characterized and quantified by analysis of volumetric high resolution CT scans of the lungs. These data sets typically have dimensions of 512 x 512 x 400. It is too subjective and labor intensive for a radiologist to analyze each slice and quantify regional abnormalities manually. Thus, computer aided techniques are necessary, particularly texture analysis techniques which classify various lung tissue types. Second and higher order statistics which relate the spatial variation of the intensity values are good discriminatory features for various textures. The intensity values in lung CT scans range between [-1024, 1024]. Calculation of second order statistics on this range is too computationally intensive so the data is typically binned between 16 or 32 gray levels. There are more effective ways of binning the gray level range to improve classification. An optimal and very efficient way to nonlinearly bin the histogram is to use a dynamic programming algorithm. The objective of this paper is to show that nonlinear binning using dynamic programming is computationally efficient and improves the discriminatory power of the second and higher order statistics for more accurate quantification of diffuse lung disease.
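Nonlinear binning by dynamic programming can be illustrated with a generic optimal one-dimensional partition that minimizes the count-weighted within-bin squared error. This is a small demonstration on a synthetic histogram, not the authors' exact objective function or the full CT intensity range.

```python
import numpy as np

def optimal_bins(values, counts, k):
    """Partition sorted gray levels into k contiguous bins minimising the total
    within-bin (count-weighted) squared error, via dynamic programming."""
    v = np.asarray(values, dtype=float)
    w = np.asarray(counts, dtype=float)
    n = len(v)
    # Prefix sums for O(1) segment-cost queries.
    W = np.concatenate([[0], np.cumsum(w)])
    S = np.concatenate([[0], np.cumsum(w * v)])
    Q = np.concatenate([[0], np.cumsum(w * v * v)])

    def cost(i, j):               # weighted SSE of levels i..j inclusive
        ww, ss, qq = W[j + 1] - W[i], S[j + 1] - S[i], Q[j + 1] - Q[i]
        return 0.0 if ww == 0 else qq - ss * ss / ww

    D = np.full((k + 1, n), np.inf)
    cut = np.zeros((k + 1, n), dtype=int)
    for j in range(n):
        D[1, j] = cost(0, j)
    for b in range(2, k + 1):
        for j in range(b - 1, n):
            for i in range(b - 1, j + 1):
                c = D[b - 1, i - 1] + cost(i, j)
                if c < D[b, j]:
                    D[b, j], cut[b, j] = c, i
    # Recover bin boundaries from the cut table.
    bounds, j = [], n - 1
    for b in range(k, 1, -1):
        i = cut[b, j]
        bounds.append(v[i])
        j = i - 1
    return sorted(bounds), D[k, n - 1]

# Toy demo: a bimodal histogram over 64 gray levels split into 4 bins.
levels = np.arange(64)
hist = np.exp(-(levels - 15)**2 / 30) * 800 + np.exp(-(levels - 45)**2 / 60) * 400
print(optimal_bins(levels, hist, k=4))
```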
Lamm, Steven H.; Ferdosi, Hamid; Dissen, Elisabeth K.; Li, Ji; Ahn, Jaeil
2015-01-01
High levels (> 200 µg/L) of inorganic arsenic in drinking water are known to be a cause of human lung cancer, but the evidence at lower levels is uncertain. We have sought the epidemiological studies that have examined the dose-response relationship between arsenic levels in drinking water and the risk of lung cancer over a range that includes both high and low levels of arsenic. Regression analysis, based on six studies identified from an electronic search, examined the relationship between the log of the relative risk and the log of the arsenic exposure over a range of 1–1000 µg/L. The best-fitting continuous meta-regression model was sought and found to be a no-constant linear-quadratic analysis where both the risk and the exposure had been logarithmically transformed. This yielded both a statistically significant positive coefficient for the quadratic term and a statistically significant negative coefficient for the linear term. Sub-analyses by study design yielded results that were similar for both ecological studies and non-ecological studies. Statistically significant X-intercepts consistently found no increased level of risk at approximately 100–150 µg/L arsenic. PMID:26690190
Validating an Air Traffic Management Concept of Operation Using Statistical Modeling
NASA Technical Reports Server (NTRS)
He, Yuning; Davies, Misty Dawn
2013-01-01
Validating a concept of operation for a complex, safety-critical system (like the National Airspace System) is challenging because of the high dimensionality of the controllable parameters and the infinite number of states of the system. In this paper, we use statistical modeling techniques to explore the behavior of a conflict detection and resolution algorithm designed for the terminal airspace. These techniques predict the robustness of the system simulation to both nominal and off-nominal behaviors within the overall airspace. They also can be used to evaluate the output of the simulation against recorded airspace data. Additionally, the techniques carry with them a mathematical value of the worth of each prediction-a statistical uncertainty for any robustness estimate. Uncertainty Quantification (UQ) is the process of quantitative characterization and ultimately a reduction of uncertainties in complex systems. UQ is important for understanding the influence of uncertainties on the behavior of a system and therefore is valuable for design, analysis, and verification and validation. In this paper, we apply advanced statistical modeling methodologies and techniques on an advanced air traffic management system, namely the Terminal Tactical Separation Assured Flight Environment (T-TSAFE). We show initial results for a parameter analysis and safety boundary (envelope) detection in the high-dimensional parameter space. For our boundary analysis, we developed a new sequential approach based upon the design of computer experiments, allowing us to incorporate knowledge from domain experts into our modeling and to determine the most likely boundary shapes and its parameters. We carried out the analysis on system parameters and describe an initial approach that will allow us to include time-series inputs, such as the radar track data, into the analysis
A perceptual space of local image statistics.
Victor, Jonathan D; Thengone, Daniel J; Rizvi, Syed M; Conte, Mary M
2015-12-01
Local image statistics are important for visual analysis of textures, surfaces, and form. There are many kinds of local statistics, including those that capture luminance distributions, spatial contrast, oriented segments, and corners. While sensitivity to each of these kinds of statistics have been well-studied, much less is known about visual processing when multiple kinds of statistics are relevant, in large part because the dimensionality of the problem is high and different kinds of statistics interact. To approach this problem, we focused on binary images on a square lattice - a reduced set of stimuli which nevertheless taps many kinds of local statistics. In this 10-parameter space, we determined psychophysical thresholds to each kind of statistic (16 observers) and all of their pairwise combinations (4 observers). Sensitivities and isodiscrimination contours were consistent across observers. Isodiscrimination contours were elliptical, implying a quadratic interaction rule, which in turn determined ellipsoidal isodiscrimination surfaces in the full 10-dimensional space, and made predictions for sensitivities to complex combinations of statistics. These predictions, including the prediction of a combination of statistics that was metameric to random, were verified experimentally. Finally, check size had only a mild effect on sensitivities over the range from 2.8 to 14min, but sensitivities to second- and higher-order statistics was substantially lower at 1.4min. In sum, local image statistics form a perceptual space that is highly stereotyped across observers, in which different kinds of statistics interact according to simple rules. Copyright © 2015 Elsevier Ltd. All rights reserved.
A perceptual space of local image statistics
Victor, Jonathan D.; Thengone, Daniel J.; Rizvi, Syed M.; Conte, Mary M.
2015-01-01
Local image statistics are important for visual analysis of textures, surfaces, and form. There are many kinds of local statistics, including those that capture luminance distributions, spatial contrast, oriented segments, and corners. While sensitivity to each of these kinds of statistics have been well-studied, much less is known about visual processing when multiple kinds of statistics are relevant, in large part because the dimensionality of the problem is high and different kinds of statistics interact. To approach this problem, we focused on binary images on a square lattice – a reduced set of stimuli which nevertheless taps many kinds of local statistics. In this 10-parameter space, we determined psychophysical thresholds to each kind of statistic (16 observers) and all of their pairwise combinations (4 observers). Sensitivities and isodiscrimination contours were consistent across observers. Isodiscrimination contours were elliptical, implying a quadratic interaction rule, which in turn determined ellipsoidal isodiscrimination surfaces in the full 10-dimensional space, and made predictions for sensitivities to complex combinations of statistics. These predictions, including the prediction of a combination of statistics that was metameric to random, were verified experimentally. Finally, check size had only a mild effect on sensitivities over the range from 2.8 to 14 min, but sensitivities to second- and higher-order statistics was substantially lower at 1.4 min. In sum, local image statistics forms a perceptual space that is highly stereotyped across observers, in which different kinds of statistics interact according to simple rules. PMID:26130606
A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis.
Lin, Johnny; Bentler, Peter M
2012-01-01
Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic.
[Notes on vital statistics for the study of perinatal health].
Juárez, Sol Pía
2014-01-01
Vital statistics, published by the National Statistics Institute in Spain, are a highly important source for the study of perinatal health nationwide. However, the process of data collection is not well-known and has implications both for the quality and interpretation of the epidemiological results derived from this source. The aim of this study was to present how the information is collected and some of the associated problems. This study is the result of an analysis of the methodological notes from the National Statistics Institute and first-hand information obtained from hospitals, the Central Civil Registry of Madrid, and the Madrid Institute for Statistics. Greater integration between these institutions is required to improve the quality of birth and stillbirth statistics. Copyright © 2014 SESPAS. Published by Elsevier Espana. All rights reserved.
NASA Astrophysics Data System (ADS)
Lee, An-Sheng; Lu, Wei-Li; Huang, Jyh-Jaan; Chang, Queenie; Wei, Kuo-Yen; Lin, Chin-Jung; Liou, Sofia Ya Hsuan
2016-04-01
Through the geology and climate characteristic in Taiwan, generally rivers carry a lot of suspended particles. After these particles settled, they become sediments which are good sorbent for heavy metals in river system. Consequently, sediments can be found recording contamination footprint at low flow energy region, such as estuary. Seven sediment cores were collected along Nankan River, northern Taiwan, which is seriously contaminated by factory, household and agriculture input. Physico-chemical properties of these cores were derived from Itrax-XRF Core Scanner and grain size analysis. In order to interpret these complex data matrices, the multivariate statistical techniques (cluster analysis, factor analysis and discriminant analysis) were introduced to this study. Through the statistical determination, the result indicates four types of sediment. One of them represents contamination event which shows high concentration of Cu, Zn, Pb, Ni and Fe, and low concentration of Si and Zr. Furthermore, three possible contamination sources of this type of sediment were revealed by Factor Analysis. The combination of sediment analysis and multivariate statistical techniques used provides new insights into the contamination depositional history of Nankan River and could be similarly applied to other river systems to determine the scale of anthropogenic contamination.
Nojima, Masanori; Tokunaga, Mutsumi; Nagamura, Fumitaka
2018-05-05
To investigate under what circumstances inappropriate use of 'multivariate analysis' is likely to occur and to identify the population that needs more support with medical statistics. The frequency of inappropriate regression model construction in multivariate analysis and related factors were investigated in observational medical research publications. The inappropriate algorithm of using only variables that were significant in univariate analysis was estimated to occur at 6.4% (95% CI 4.8% to 8.5%). This was observed in 1.1% of the publications with a medical statistics expert (hereinafter 'expert') as the first author, 3.5% if an expert was included as coauthor and in 12.2% if experts were not involved. In the publications where the number of cases was 50 or less and the study did not include experts, inappropriate algorithm usage was observed with a high proportion of 20.2%. The OR of the involvement of experts for this outcome was 0.28 (95% CI 0.15 to 0.53). A further, nation-level, analysis showed that the involvement of experts and the implementation of unfavourable multivariate analysis are associated at the nation-level analysis (R=-0.652). Based on the results of this study, the benefit of participation of medical statistics experts is obvious. Experts should be involved for proper confounding adjustment and interpretation of statistical models. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
DnaSAM: Software to perform neutrality testing for large datasets with complex null models.
Eckert, Andrew J; Liechty, John D; Tearse, Brandon R; Pande, Barnaly; Neale, David B
2010-05-01
Patterns of DNA sequence polymorphisms can be used to understand the processes of demography and adaptation within natural populations. High-throughput generation of DNA sequence data has historically been the bottleneck with respect to data processing and experimental inference. Advances in marker technologies have largely solved this problem. Currently, the limiting step is computational, with most molecular population genetic software allowing a gene-by-gene analysis through a graphical user interface. An easy-to-use analysis program that allows both high-throughput processing of multiple sequence alignments along with the flexibility to simulate data under complex demographic scenarios is currently lacking. We introduce a new program, named DnaSAM, which allows high-throughput estimation of DNA sequence diversity and neutrality statistics from experimental data along with the ability to test those statistics via Monte Carlo coalescent simulations. These simulations are conducted using the ms program, which is able to incorporate several genetic parameters (e.g. recombination) and demographic scenarios (e.g. population bottlenecks). The output is a set of diversity and neutrality statistics with associated probability values under a user-specified null model that are stored in easy to manipulate text file. © 2009 Blackwell Publishing Ltd.
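Two of the diversity statistics such a pipeline computes per alignment are sketched below (nucleotide diversity as average pairwise differences, and Watterson's theta). The toy alignment is invented, and the coalescent-simulation comparison with ms is only indicated in a comment.

```python
import itertools
import numpy as np

def diversity_stats(seqs):
    """Per-alignment segregating sites, nucleotide diversity (pi) and
    Watterson's theta, given equal-length DNA sequences."""
    n = len(seqs)
    cols = list(zip(*seqs))
    # Number of segregating (polymorphic) sites.
    S = sum(len(set(c)) > 1 for c in cols)
    # Average pairwise differences across all sequence pairs.
    diffs = [sum(a != b for a, b in zip(s1, s2))
             for s1, s2 in itertools.combinations(seqs, 2)]
    pi = np.mean(diffs)
    a1 = sum(1.0 / i for i in range(1, n))
    theta_w = S / a1
    return S, pi, theta_w

# Toy alignment of 5 sequences; a DnaSAM-style workflow would loop over many
# alignments and compare the observed statistics with the distribution from
# coalescent simulations (e.g. generated by the ms program) under a null model.
alignment = ["ACGTACGTAC",
             "ACGTACGTAC",
             "ACGAACGTAC",
             "ACGTACGTTC",
             "ACGAACGTTC"]
print(diversity_stats(alignment))
```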
Meta-analysis inside and outside particle physics: two traditions that should converge?
Baker, Rose D; Jackson, Dan
2013-06-01
The use of meta-analysis in medicine and epidemiology really took off in the 1970s. However, in high-energy physics, the Particle Data Group has been carrying out meta-analyses of measurements of particle masses and other properties since 1957. Curiously, there has been virtually no interaction between those working inside and outside particle physics. In this paper, we use statistical models to study two major differences in practice. The first is the usefulness of systematic errors, which physicists are now beginning to quote in addition to statistical errors. The second is whether it is better to treat heterogeneity by scaling up errors as do the Particle Data Group or by adding a random effect as does the rest of the community. Besides fitting models, we derive and use an exact test of the error-scaling hypothesis. We also discuss the other methodological differences between the two streams of meta-analysis. Our conclusion is that systematic errors are not currently very useful and that the conventional random effects model, as routinely used in meta-analysis, has a useful role to play in particle physics. The moral we draw for statisticians is that we should be more willing to explore 'grassroots' areas of statistical application, so that good statistical practice can flow both from and back to the statistical mainstream. Copyright © 2012 John Wiley & Sons, Ltd. Copyright © 2012 John Wiley & Sons, Ltd.
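The two heterogeneity treatments compared in the paper can be sketched side by side: scaling the error of the weighted average by sqrt(chi-square/dof) versus adding a between-study variance (DerSimonian-Laird random effects). The measurements below are invented placeholders.

```python
import numpy as np

# Hypothetical measurements of one quantity with standard errors
# (e.g. several determinations of a particle mass, or study effect sizes).
y  = np.array([10.2, 10.8, 9.9, 10.5, 11.4])
se = np.array([0.3, 0.4, 0.35, 0.3, 0.5])
v, k = se**2, len(y)

w = 1 / v
mean_fe = np.sum(w * y) / np.sum(w)
se_fe = np.sqrt(1 / np.sum(w))
Q = np.sum(w * (y - mean_fe) ** 2)

# Particle Data Group style: scale the error by sqrt(chi2/dof) when > 1.
scale = max(1.0, np.sqrt(Q / (k - 1)))
print(f"error-scaling: {mean_fe:.2f} +/- {se_fe * scale:.2f} (scale {scale:.2f})")

# DerSimonian-Laird random effects: add a between-study variance tau^2.
C = np.sum(w) - np.sum(w**2) / np.sum(w)
tau2 = max(0.0, (Q - (k - 1)) / C)
w_re = 1 / (v + tau2)
mean_re = np.sum(w_re * y) / np.sum(w_re)
se_re = np.sqrt(1 / np.sum(w_re))
print(f"random effects: {mean_re:.2f} +/- {se_re:.2f} (tau^2 {tau2:.3f})")
```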
Application of microarray analysis on computer cluster and cloud platforms.
Bernau, C; Boulesteix, A-L; Knaus, J
2013-01-01
Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
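The embarrassingly parallel structure mentioned above (independent permutation iterations per gene) can be sketched with Python's process pool; the same function could be dispatched to cluster nodes or cloud instances instead of local cores. The expression matrix and group labels are simulated placeholders.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

rng = np.random.default_rng(5)
# Simulated expression matrix: 1000 genes x (10 control + 10 treated) arrays.
expr = rng.normal(0, 1, (1000, 20))
labels = np.array([0] * 10 + [1] * 10)

def gene_perm_pvalue(args, n_perm=1000):
    """Permutation p-value for the difference in group means of one gene."""
    values, labs, seed = args
    local_rng = np.random.default_rng(seed)
    obs = values[labs == 1].mean() - values[labs == 0].mean()
    null = np.empty(n_perm)
    for i in range(n_perm):
        perm = local_rng.permutation(labs)
        null[i] = values[perm == 1].mean() - values[perm == 0].mean()
    return (np.abs(null) >= abs(obs)).mean()

if __name__ == "__main__":
    tasks = [(expr[g], labels, g) for g in range(expr.shape[0])]
    # Each gene's permutation test is independent, so the map parallelises cleanly.
    with ProcessPoolExecutor() as pool:
        pvalues = list(pool.map(gene_perm_pvalue, tasks, chunksize=50))
    print("smallest p-values:", sorted(pvalues)[:5])
```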
A scaling procedure for the response of an isolated system with high modal overlap factor
NASA Astrophysics Data System (ADS)
De Rosa, S.; Franco, F.
2008-10-01
The paper deals with a numerical approach that reduces some physical sizes of the solution domain to compute the dynamic response of an isolated system: it has been named Asymptotical Scaled Modal Analysis (ASMA). The proposed numerical procedure alters the input data needed to obtain the classic modal responses to increase the frequency band of validity of the discrete or continuous coordinates model through the definition of a proper scaling coefficient. It is demonstrated that the computational cost remains acceptable while the frequency range of analysis increases. Moreover, with reference to the flexural vibrations of a rectangular plate, the paper discusses the ASMA vs. the statistical energy analysis and the energy distribution approach. Some insights are also given about the limits of the scaling coefficient. Finally it is shown that the linear dynamic response, predicted with the scaling procedure, has the same quality and characteristics of the statistical energy analysis, but it can be useful when the system cannot be solved appropriately by the standard Statistical Energy Analysis (SEA).
USDA-ARS's Scientific Manuscript database
Nonresonant laser vaporization combined with high-resolution electrospray time-of-flight mass spectrometry enables analysis of a casing after discharge of a firearm revealing organic signature molecules including methyl centralite (MC), diphenylamine (DPA), N-nitrosodiphenylamine (N-NO-DPA), 4-nitro...
General Nature of Multicollinearity in Multiple Regression Analysis.
ERIC Educational Resources Information Center
Liu, Richard
1981-01-01
Discusses multiple regression, a very popular statistical technique in the field of education. One of the basic assumptions in regression analysis requires that independent variables in the equation should not be highly correlated. The problem of multicollinearity and some of the solutions to it are discussed. (Author)
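A standard way to screen for the problem discussed above is the variance inflation factor (VIF). The sketch below builds two deliberately collinear predictors on simulated data; the variable names and threshold are illustrative conventions, not prescriptions from the article.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(9)
n = 200
# Hypothetical predictors for a regression in education research:
# x2 is deliberately made nearly collinear with x1.
x1 = rng.normal(size=n)
x2 = x1 * 0.95 + rng.normal(scale=0.1, size=n)   # highly correlated with x1
x3 = rng.normal(size=n)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# Variance inflation factors; values far above about 10 are a common warning sign.
for i, name in enumerate(X.columns):
    if name != "const":
        print(name, round(variance_inflation_factor(X.values, i), 1))
```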
Harris, Alex H S; Reeder, Rachelle; Hyun, Jenny K
2009-10-01
Journal editors and statistical reviewers are often in the difficult position of catching serious problems in submitted manuscripts after the research is conducted and data have been analyzed. We sought to learn from editors and reviewers of major psychiatry journals what common statistical and design problems they most often find in submitted manuscripts and what they wished to communicate to authors regarding these issues. Our primary goal was to facilitate communication between journal editors/reviewers and researchers/authors and thereby improve the scientific and statistical quality of research and submitted manuscripts. Editors and statistical reviewers of 54 high-impact psychiatry journals were surveyed to learn what statistical or design problems they encounter most often in submitted manuscripts. Respondents completed the survey online. The authors analyzed survey text responses using content analysis procedures to identify major themes related to commonly encountered statistical or research design problems. Editors and reviewers (n=15) who handle manuscripts from 39 different high-impact psychiatry journals responded to the survey. The most commonly cited problems regarded failure to map statistical models onto research questions, improper handling of missing data, not controlling for multiple comparisons, not understanding the difference between equivalence and difference trials, and poor controls in quasi-experimental designs. The scientific quality of psychiatry research and submitted reports could be greatly improved if researchers became sensitive to, or sought consultation on frequently encountered methodological and analytic issues.
ERIC Educational Resources Information Center
Smith, Rachel A.; Levine, Timothy R.; Lachlan, Kenneth A.; Fediuk, Thomas A.
2002-01-01
Notes that the availability of statistical software packages has led to a sharp increase in use of complex research designs and complex statistical analyses in communication research. Reports a series of Monte Carlo simulations which demonstrate that this complexity may come at a heavier cost than many communication researchers realize. Warns…
NASA Astrophysics Data System (ADS)
Hendikawati, P.; Arifudin, R.; Zahid, M. Z.
2018-03-01
This study aims to design an android Statistics Data Analysis application that can be accessed through mobile devices to making it easier for users to access. The Statistics Data Analysis application includes various topics of basic statistical along with a parametric statistics data analysis application. The output of this application system is parametric statistics data analysis that can be used for students, lecturers, and users who need the results of statistical calculations quickly and easily understood. Android application development is created using Java programming language. The server programming language uses PHP with the Code Igniter framework, and the database used MySQL. The system development methodology used is the Waterfall methodology with the stages of analysis, design, coding, testing, and implementation and system maintenance. This statistical data analysis application is expected to support statistical lecturing activities and make students easier to understand the statistical analysis of mobile devices.
An Analysis of Operational Suitability for Test and Evaluation of Highly Reliable Systems
1994-03-04
Exposition," Journal of the American Statistical A iation-59: 353-375 (June 1964). 17. SYS 229, Test and Evaluation Management Coursebook , School of Systems...in hours, 0 is 2-5 the desired MTBCF in hours, R is the number of critical failures, and a is the P[type-I error] of the X2 statistic with 2*R+2...design of experiments (DOE) tables and the use of Bayesian statistics to increase the confidence level of the test results that will be obtained from
Sound transmission loss of composite sandwich panels
NASA Astrophysics Data System (ADS)
Zhou, Ran
Light composite sandwich panels are increasingly used in automobiles, ships and aircraft, because of the advantages they offer of high strength-to-weight ratios. However, the acoustical properties of these light and stiff structures can be less desirable than those of equivalent metal panels. These undesirable properties can lead to high interior noise levels. A number of researchers have studied the acoustical properties of honeycomb and foam sandwich panels. Not much work, however, has been carried out on foam-filled honeycomb sandwich panels. In this dissertation, governing equations for the forced vibration of asymmetric sandwich panels are developed. An analytical expression for modal densities of symmetric sandwich panels is derived from a sixth-order governing equation. A boundary element analysis model for the sound transmission loss of symmetric sandwich panels is proposed. Measurements of the modal density, total loss factor, radiation loss factor, and sound transmission loss of foam-filled honeycomb sandwich panels with different configurations and thicknesses are presented. Comparisons between the predicted sound transmission loss values obtained from wave impedance analysis, statistical energy analysis, boundary element analysis, and experimental values are presented. The wave impedance analysis model provides accurate predictions of sound transmission loss for the thin foam-filled honeycomb sandwich panels at frequencies above their first resonance frequencies. The predictions from the statistical energy analysis model are in better agreement with the experimental transmission loss values of the sandwich panels when the measured radiation loss factor values near coincidence are used instead of the theoretical values for single-layer panels. The proposed boundary element analysis model provides more accurate predictions of sound transmission loss for the thick foam-filled honeycomb sandwich panels than either the wave impedance analysis model or the statistical energy analysis model.
2014-09-01
The U.S. North Atlantic coast is subject to coastal flooding as a result of both severe extratropical storms (e.g., Nor’easters...Products and Services, excluding any kind of high-resolution hydrodynamic modeling. Tropical and extratropical storms were treated as a single...joint probability analysis and high-fidelity modeling of tropical and extratropical storms
Restoration of MRI Data for Field Nonuniformities using High Order Neighborhood Statistics
Hadjidemetriou, Stathis; Studholme, Colin; Mueller, Susanne; Weiner, Michael; Schuff, Norbert
2007-01-01
MRI at high magnetic fields (> 3.0 T ) is complicated by strong inhomogeneous radio-frequency fields, sometimes termed the “bias field”. These lead to nonuniformity of image intensity, greatly complicating further analysis such as registration and segmentation. Existing methods for bias field correction are effective for 1.5 T or 3.0 T MRI, but are not completely satisfactory for higher field data. This paper develops an effective bias field correction for high field MRI based on the assumption that the nonuniformity is smoothly varying in space. Also, nonuniformity is quantified and unmixed using high order neighborhood statistics of intensity cooccurrences. They are computed within spherical windows of limited size over the entire image. The restoration is iterative and makes use of a novel stable stopping criterion that depends on the scaled entropy of the cooccurrence statistics, which is a non monotonic function of the iterations; the Shannon entropy of the cooccurrence statistics normalized to the effective dynamic range of the image. The algorithm restores whole head data, is robust to intense nonuniformities present in high field acquisitions, and is robust to variations in anatomy. This algorithm significantly improves bias field correction in comparison to N3 on phantom 1.5 T head data and high field 4 T human head data. PMID:18193095
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kollias, Pavlos
2017-08-08
This is a multi-institutional, collaborative project using observations and modeling to study the evolution (e.g. formation and growth) of hydrometeors in continental convective clouds. Our contribution was in data analysis for the generation of high-value cloud and precipitation products and the derivation of cloud statistics for model validation. There are two areas of data analysis to which we contributed: i) the development of novel, state-of-the-art dual-wavelength radar algorithms for the retrieval of cloud microphysical properties and ii) the evaluation of large domain, high-resolution models using comprehensive multi-sensor observations. Our research group developed statistical summaries from numerous sensors and developed retrievals of vertical air motion in deep convection.
Effect of Embolization Material in the Calculation of Dose Deposition in Arteriovenous Malformations
DOE Office of Scientific and Technical Information (OSTI.GOV)
De la Cruz, O. O. Galvan; Moreno-Jimenez, S.; Larraga-Gutierrez, J. M.
2010-12-07
In this work, the impact of incorporating high Z materials (embolization material) in the dose calculation for stereotactic radiosurgery treatment of arteriovenous malformations is studied. A statistical analysis is done to establish the variables that may affect the dose calculation. To perform the comparison, pencil beam (PB) and Monte Carlo (MC) calculation algorithms were used. The comparison between both dose calculations shows that PB overestimates the dose deposited. The statistical analysis, for the number of patients in the study (20), shows that the variable that may affect the dose calculation is the volume of the high Z material in the arteriovenous malformation. Further studies have to be done to establish the clinical impact on the radiosurgery result.
An overview of data acquisition, signal coding and data analysis techniques for MST radars
NASA Technical Reports Server (NTRS)
Rastogi, P. K.
1986-01-01
An overview is given of the data acquisition, signal processing, and data analysis techniques that are currently in use with high power MST/ST (mesosphere stratosphere troposphere/stratosphere troposphere) radars. This review supplements the works of Rastogi (1983) and Farley (1984) presented at previous MAP workshops. A general description is given of data acquisition and signal processing operations and they are characterized on the basis of their disparate time scales. Then signal coding, a brief description of frequently used codes, and their limitations are discussed, and finally, several aspects of statistical data processing such as signal statistics, power spectrum and autocovariance analysis, outlier removal techniques are discussed.
NASA Astrophysics Data System (ADS)
Hsu, Kuo-Hsien
2012-11-01
Formosat-2 imagery is a kind of high-spatial-resolution (2 meters GSD) remote sensing satellite data, which includes one panchromatic band and four multispectral bands (Blue, Green, Red, near-infrared). An essential step in the daily processing of received Formosat-2 images is to estimate the cloud statistics of each image using the Automatic Cloud Coverage Assessment (ACCA) algorithm. The cloud statistics are subsequently recorded as important metadata for the image product catalog. In this paper, we propose an ACCA method with two consecutive stages: pre-processing and post-processing analysis. For pre-processing analysis, unsupervised K-means classification, Sobel's method, thresholding, non-cloudy pixel re-examination, and a cross-band filter method are implemented in sequence to determine the cloud statistics. For post-processing analysis, the box-counting fractal method is implemented. In other words, the cloud statistics are first determined via pre-processing analysis, and the correctness of the cloud statistics across the different spectral bands is then cross-examined qualitatively and quantitatively via post-processing analysis. The selection of an appropriate thresholding method is critical to the result of the ACCA method. Therefore, in this work, we first conduct a series of experiments on clustering-based and spatial thresholding methods, including Otsu's, Local Entropy (LE), Joint Entropy (JE), Global Entropy (GE), and Global Relative Entropy (GRE) methods, for performance comparison. The results show that Otsu's and GE methods both perform better than the others for Formosat-2 imagery. Additionally, our proposed ACCA method, with Otsu's method selected as the thresholding method, successfully extracts the cloudy pixels of Formosat-2 images for accurate cloud statistics estimation.
Advanced Gear Alloys for Ultra High Strength Applications
NASA Technical Reports Server (NTRS)
Shen, Tony; Krantz, Timothy; Sebastian, Jason
2011-01-01
Single tooth bending fatigue (STBF) test data of UHS Ferrium C61 and C64 alloys are presented in comparison with historical test data of conventional gear steels (9310 and Pyrowear 53) with comparable statistical analysis methods. Pitting and scoring tests of C61 and C64 are works in progress. Boeing statistical analysis of STBF test data for the four gear steels (C61, C64, 9310 and Pyrowear 53) indicates that the UHS grades exhibit increases in fatigue strength in the low cycle fatigue (LCF) regime. In the high cycle fatigue (HCF) regime, the UHS steels exhibit better mean fatigue strength endurance limit behavior (particularly as compared to Pyrowear 53). However, due to considerable scatter in the UHS test data, the anticipated overall benefits of the UHS grades in bending fatigue have not been fully demonstrated. Based on all the test data and on Boeing's analysis, C61 has been selected by Boeing as the gear steel for the final ERDS demonstrator test gearboxes. In terms of potential follow-up work, detailed physics-based, micromechanical analysis and modeling of the fatigue data would allow for a better understanding of the causes of the experimental scatter, and of the transition from high-stress LCF (surface-dominated) to low-stress HCF (subsurface-dominated) fatigue failure. Additional STBF test data and failure analysis work, particularly in the HCF regime and around the endurance limit stress, could allow for better statistical confidence and could reduce the observed effects of experimental test scatter. Finally, the need for further optimization of the residual compressive stress profiles of the UHS steels (resulting from carburization and peening) is noted, particularly for the case of the higher hardness C64 material.
Searching for hidden unexpected features in the SnIa data
NASA Astrophysics Data System (ADS)
Shafieloo, A.; Perivolaropoulos, L.
2010-06-01
It is known that the χ² statistic and likelihood analysis may not be sensitive to all the features of the data. Despite the fact that by using the χ² statistic we can measure the overall goodness of fit for a model confronted with a data set, some specific features of the data can stay undetectable. For instance, it has been pointed out that there is an unexpected brightness of the SnIa data at z > 1 in the Union compilation. We quantify this statement by constructing a new statistic, called the Binned Normalized Difference (BND) statistic, which is applicable directly to the Type Ia Supernova (SnIa) distance moduli. This statistic is designed to pick up systematic brightness trends of SnIa data points with respect to a best fit cosmological model at high redshifts. According to this statistic there is 2.2%, 5.3% and 12.6% consistency between the Gold06, Union08 and Constitution09 data and the spatially flat ΛCDM model when the real data are compared with many realizations of simulated Monte Carlo datasets. The corresponding realization probability in the context of a (w0,w1) = (-1.4,2) model is more than 30% for all mentioned datasets, indicating a much better consistency for this model with respect to the BND statistic. The unexpected high-z brightness of SnIa can be interpreted either as a trend towards more deceleration at high z than expected in the context of ΛCDM, or as a statistical fluctuation, or finally as a systematic effect, perhaps due to a mild SnIa evolution at high z.
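As a rough illustration of the idea behind a binned residual diagnostic of this kind (not the authors' BND implementation), the sketch below bins normalized distance-modulus residuals above a redshift cut and compares the observed binned statistic against Gaussian Monte Carlo realizations drawn from the quoted errors; the array names (z, mu, mu_err, mu_model) and the choice of four bins are assumptions.

```python
import numpy as np

def binned_residual_stat(z, mu, mu_err, mu_model, z_min=1.0, n_bins=4):
    """Mean normalized residual of distance moduli in redshift bins above z_min."""
    mask = z > z_min
    resid = (mu[mask] - mu_model[mask]) / mu_err[mask]
    bins = np.array_split(np.argsort(z[mask]), n_bins)
    return np.array([resid[idx].mean() for idx in bins])

def consistency_probability(z, mu, mu_err, mu_model, n_sim=10000, rng=None):
    """Fraction of Gaussian Monte Carlo realizations whose binned statistic is
    at least as extreme as the observed one (largest absolute bin value)."""
    rng = np.random.default_rng(rng)
    obs = np.abs(binned_residual_stat(z, mu, mu_err, mu_model)).max()
    count = 0
    for _ in range(n_sim):
        mu_sim = mu_model + mu_err * rng.standard_normal(mu.size)
        sim = np.abs(binned_residual_stat(z, mu_sim, mu_err, mu_model)).max()
        count += sim >= obs
    return count / n_sim
```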
HIV/AIDS information by African companies: an empirical analysis.
Barako, Dulacha G; Taplin, Ross H; Brown, Alistair M
2010-01-01
This article investigates the extent of Human Immunodeficiency Virus/Acquired Immune Deficiency Syndrome Disclosures (HIV/AIDSD) in online annual reports by 200 listed companies from 10 African countries for the year ending 2006. Descriptive statistics reveal a very low level of overall HIV/AIDSD practices with a mean of 6 per cent disclosure, with half (100 out of 200) of the African companies making no disclosures at all. Logistic regression analysis reveals that company size and country are highly significant predictors of any disclosure of HIV/AIDS in annual reports. Profitability is also statistically significantly associated with the extent of disclosure.
The use and misuse of statistical methodologies in pharmacology research.
Marino, Michael J
2014-01-01
Descriptive, exploratory, and inferential statistics are necessary components of hypothesis-driven biomedical research. Despite the ubiquitous need for these tools, the emphasis on statistical methods in pharmacology has become dominated by inferential methods often chosen more by the availability of user-friendly software than by any understanding of the data set or the critical assumptions of the statistical tests. Such frank misuse of statistical methodology and the quest to reach the mystical α<0.05 criterion have hampered research via the publication of incorrect analyses driven by rudimentary statistical training. Perhaps more critically, a poor understanding of statistical tools limits the conclusions that may be drawn from a study by divorcing the investigator from their own data. The net result is a decrease in quality and confidence in research findings, fueling recent controversies over the reproducibility of high-profile findings and effects that appear to diminish over time. The recent development of "omics" approaches leading to the production of massive higher-dimensional data sets has amplified these issues, making it clear that new approaches are needed to appropriately and effectively mine this type of data. Unfortunately, statistical education in the field has not kept pace. This commentary provides a foundation for an intuitive understanding of statistics that fosters an exploratory approach and an appreciation for the assumptions of various statistical tests that hopefully will increase the correct use of statistics, the application of exploratory data analysis, and the use of statistical study design, with the goal of increasing reproducibility and confidence in the literature. Copyright © 2013. Published by Elsevier Inc.
Lyons-Weiler, James; Pelikan, Richard; Zeh, Herbert J; Whitcomb, David C; Malehorn, David E; Bigbee, William L; Hauskrecht, Milos
2005-01-01
Peptide profiles generated using SELDI/MALDI time of flight mass spectrometry provide a promising source of patient-specific information with high potential impact on the early detection and classification of cancer and other diseases. The new profiling technology comes, however, with numerous challenges and concerns. Particularly important are concerns of reproducibility of classification results and their significance. In this work we describe a computational validation framework, called PACE (Permutation-Achieved Classification Error), that lets us assess, for a given classification model, the significance of the Achieved Classification Error (ACE) on the profile data. The framework compares the performance statistic of the classifier on true data samples and checks if these are consistent with the behavior of the classifier on the same data with randomly reassigned class labels. A statistically significant ACE increases our belief that a discriminative signal was found in the data. The advantage of PACE analysis is that it can be easily combined with any classification model and is relatively easy to interpret. PACE analysis does not protect researchers against confounding in the experimental design, or other sources of systematic or random error. We use PACE analysis to assess significance of classification results we have achieved on a number of published data sets. The results show that many of these datasets indeed possess a signal that leads to a statistically significant ACE.
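The PACE framework itself is not reproduced here, but the underlying idea, comparing a classifier's cross-validated performance on true labels against its performance on randomly permuted labels, can be sketched with scikit-learn's permutation_test_score; the synthetic data and the linear SVM classifier are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, permutation_test_score
from sklearn.svm import LinearSVC

# Synthetic stand-in for a peptide-profile matrix (samples x m/z features).
X, y = make_classification(n_samples=80, n_features=300, n_informative=10,
                           random_state=0)

clf = LinearSVC(dual=False)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Accuracy on true labels, the null distribution from label permutations,
# and the resulting permutation p-value.
score, perm_scores, p_value = permutation_test_score(
    clf, X, y, cv=cv, n_permutations=500, scoring="accuracy", random_state=0)

print(f"accuracy on true labels: {score:.2f}")
print(f"mean accuracy on permuted labels: {perm_scores.mean():.2f}")
print(f"permutation p-value: {p_value:.3f}")
```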
Sojoudi, Alireza; Goodyear, Bradley G
2016-12-01
Spontaneous fluctuations of blood-oxygenation level-dependent functional magnetic resonance imaging (BOLD fMRI) signals are highly synchronous between brain regions that serve similar functions. This provides a means to investigate functional networks; however, most analysis techniques assume functional connections are constant over time. This may be problematic in the case of neurological disease, where functional connections may be highly variable. Recently, several methods have been proposed to determine moment-to-moment changes in the strength of functional connections over an imaging session (so-called dynamic connectivity). Here a novel analysis framework based on a hierarchical observation modeling approach was proposed to permit statistical inference of the presence of dynamic connectivity. A two-level linear model composed of overlapping sliding windows of fMRI signals, incorporating the fact that overlapping windows are not independent, was described. To test this approach, datasets were synthesized whereby functional connectivity was either constant (significant or insignificant) or modulated by an external input. The method successfully determines the statistical significance of a functional connection in phase with the modulation, and it exhibits greater sensitivity and specificity in detecting regions with variable connectivity, when compared with sliding-window correlation analysis. For real data, this technique possesses greater reproducibility and provides a more discriminative estimate of dynamic connectivity than sliding-window correlation analysis. Hum Brain Mapp 37:4566-4580, 2016. © 2016 Wiley Periodicals, Inc.
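For context, the baseline method named in the abstract, sliding-window correlation, can be sketched as follows; the window length, the synthetic signals, and the coupling modulation are illustrative assumptions, and the paper's hierarchical observation model is not reproduced.

```python
import numpy as np

def sliding_window_correlation(x, y, window=30, step=1):
    """Pearson correlation between two BOLD time series in overlapping windows.

    Returns one correlation per window start; adjacent values are not
    independent because the windows overlap.
    """
    starts = range(0, len(x) - window + 1, step)
    return np.array([np.corrcoef(x[s:s + window], y[s:s + window])[0, 1]
                     for s in starts])

# Example: two synthetic signals whose coupling strengthens halfway through.
rng = np.random.default_rng(1)
t = np.arange(600)
shared = np.sin(2 * np.pi * t / 50.0)
x = shared + 0.5 * rng.standard_normal(t.size)
y = np.where(t < 300, 0.2, 1.0) * shared + 0.5 * rng.standard_normal(t.size)

r = sliding_window_correlation(x, y, window=60)
print(r[:5].round(2), r[-5:].round(2))
```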
Detector noise statistics in the non-linear regime
NASA Technical Reports Server (NTRS)
Shopbell, P. L.; Bland-Hawthorn, J.
1992-01-01
The statistical behavior of an idealized linear detector in the presence of threshold and saturation levels is examined. It is assumed that the noise is governed by the statistical fluctuations in the number of photons emitted by the source during an exposure. Since physical detectors cannot have infinite dynamic range, our model illustrates that all devices have non-linear regimes, particularly at high count rates. The primary effect is a decrease in the statistical variance about the mean signal due to a portion of the expected noise distribution being removed via clipping. Higher order statistical moments are also examined, in particular, skewness and kurtosis. In principle, the expected distortion in the detector noise characteristics can be calibrated using flatfield observations with count rates matched to the observations. For this purpose, some basic statistical methods that utilize Fourier analysis techniques are described.
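A minimal numerical illustration of the described clipping effect, assuming Poisson photon noise and hypothetical threshold and saturation levels, is sketched below; near saturation the variance falls below the Poisson expectation and the distribution becomes skewed.

```python
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(0)

def clipped_noise_moments(mean_counts, threshold, saturation, n=200000):
    """Moments of Poisson photon counts after an idealized detector clips them."""
    counts = rng.poisson(mean_counts, size=n)
    clipped = np.clip(counts, threshold, saturation)
    return clipped.mean(), clipped.var(), skew(clipped), kurtosis(clipped)

# Far from the limits the variance tracks the mean (Poisson behavior) ...
print(clipped_noise_moments(1000, threshold=0, saturation=65535))
# ... but near saturation the variance drops and higher moments are distorted.
print(clipped_noise_moments(65000, threshold=0, saturation=65535))
```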
Thompson, Cheryl Bagley
2009-01-01
This 13th article of the Basics of Research series is first in a short series on statistical analysis. These articles will discuss creating your statistical analysis plan, levels of measurement, descriptive statistics, probability theory, inferential statistics, and general considerations for interpretation of the results of a statistical analysis.
Creation of a virtual cutaneous tissue bank
NASA Astrophysics Data System (ADS)
LaFramboise, William A.; Shah, Sujal; Hoy, R. W.; Letbetter, D.; Petrosko, P.; Vennare, R.; Johnson, Peter C.
2000-04-01
Cellular and non-cellular constituents of skin contain fundamental morphometric features and structural patterns that correlate with tissue function. High-resolution digital image acquisition is performed using an automated system and proprietary software to assemble adjacent images and create a contiguous, lossless digital representation of individual microscope slide specimens. Serial extraction, evaluation, and statistical analysis of cutaneous features are performed utilizing an automated analysis system to derive normal cutaneous parameters comprising essential structural skin components. Automated digital cutaneous analysis allows for fast extraction of microanatomic data with accuracy approximating manual measurement. The process provides rapid assessment of features both within individual specimens and across sample populations. The images, component data, and statistical analysis comprise a bioinformatics database to serve as an architectural blueprint for skin tissue engineering and as a diagnostic standard of comparison for pathologic specimens.
Goodwin, Cody R; Sherrod, Stacy D; Marasco, Christina C; Bachmann, Brian O; Schramm-Sapyta, Nicole; Wikswo, John P; McLean, John A
2014-07-01
A metabolic system is composed of inherently interconnected metabolic precursors, intermediates, and products. The analysis of untargeted metabolomics data has conventionally been performed through the use of comparative statistics or multivariate statistical analysis-based approaches; however, each falls short in representing the related nature of metabolic perturbations. Herein, we describe a complementary method for the analysis of large metabolite inventories using a data-driven approach based upon a self-organizing map algorithm. This workflow allows for the unsupervised clustering, and subsequent prioritization of, correlated features through Gestalt comparisons of metabolic heat maps. We describe this methodology in detail, including a comparison to conventional metabolomics approaches, and demonstrate the application of this method to the analysis of the metabolic repercussions of prolonged cocaine exposure in rat sera profiles.
NASA Astrophysics Data System (ADS)
Zheng, Xu; Hao, Zhiyong; Wang, Xu; Mao, Jie
2016-06-01
High-speed-railway-train interior noise at low, medium, and high frequencies could be simulated by finite element analysis (FEA) or boundary element analysis (BEA), hybrid finite element analysis-statistical energy analysis (FEA-SEA) and statistical energy analysis (SEA), respectively. First, a new method named statistical acoustic energy flow (SAEF) is proposed, which can be applied to the full-spectrum HST interior noise simulation (including low, medium, and high frequencies) with only one model. In an SAEF model, the corresponding multi-physical-field coupling excitations are first fully considered and coupled to excite the interior noise. The interior noise attenuated by the sound insulation panels of the carriage is simulated through modeling the inflow acoustic energy from the exterior excitations into the interior acoustic cavities. Rigid multi-body dynamics, fast multi-pole BEA, and large-eddy simulation with indirect boundary element analysis are first employed to extract the multi-physical-field excitations, which include the wheel-rail interaction forces/secondary suspension forces, the wheel-rail rolling noise, and aerodynamic noise, respectively. All the peak values and their frequency bands of the simulated acoustic excitations are validated against those from the noise source identification test. Besides, the measured equipment noise inside the equipment compartment is used as one of the excitation sources which contribute to the interior noise. Second, a fully trimmed FE carriage model is constructed, and the simulated modal shapes and frequencies agree well with the measured ones, which has validated the global FE carriage model as well as the local FE models of the aluminum alloy-trim composite panel. Thus, the sound transmission loss model of any composite panel has indirectly been validated. Finally, the SAEF model of the carriage is constructed based on the accurate FE model and stimulated by the multi-physical-field excitations. The results show that the trend of the simulated 1/3 octave band sound pressure spectrum agrees well with that of the on-site-measured one. The deviation between the simulated and measured overall sound pressure level (SPL) is 2.6 dB(A) and well controlled below the engineering tolerance limit, which has validated the SAEF model in the full-spectrum analysis of the high-speed train interior noise.
NASA Astrophysics Data System (ADS)
Zhang, Ying; Moges, Semu; Block, Paul
2018-01-01
Prediction of seasonal precipitation can provide actionable information to guide management of various sectoral activities. For instance, it is often translated into hydrological forecasts for better water resources management. However, many studies assume homogeneity in precipitation across an entire study region, which may prove ineffective for operational and local-level decisions, particularly for locations with high spatial variability. This study proposes advancing local-level seasonal precipitation predictions by first conditioning on regional-level predictions, as defined through objective cluster analysis, for western Ethiopia. To our knowledge, this is the first study predicting seasonal precipitation at high resolution in this region, where lives and livelihoods are vulnerable to precipitation variability given the high reliance on rain-fed agriculture and limited water resources infrastructure. The combination of objective cluster analysis, spatially high-resolution prediction of seasonal precipitation, and a modeling structure spanning statistical and dynamical approaches makes clear advances in prediction skill and resolution, as compared with previous studies. The statistical model improves versus the non-clustered case or dynamical models for a number of specific clusters in northwestern Ethiopia, with clusters having regional average correlation and ranked probability skill score (RPSS) values of up to 0.5 and 33 %, respectively. The general skill (after bias correction) of the two best-performing dynamical models over the entire study region is superior to that of the statistical models, although the dynamical models issue predictions at a lower resolution and the raw predictions require bias correction to guarantee comparable skills.
NASA Astrophysics Data System (ADS)
Wu, Di; Torres, Elizabeth B.; Jose, Jorge V.
2015-03-01
ASD is a spectrum of neurodevelopmental disorders. The high heterogeneity of the symptoms associated with the disorder impedes efficient diagnoses based on human observations. Recent advances with high-resolution MEM wearable sensors enable accurate movement measurements that may escape the naked eye. This calls for objective metrics to extract physiologically relevant information from the rapidly accumulating data. In this talk we'll discuss the statistical analysis of movement data continuously collected with high-resolution sensors at 240 Hz. We calculated statistical properties of speed fluctuations within the millisecond time range that closely correlate with the subjects' cognitive abilities. We computed the periodicity and synchronicity of the speed fluctuations from their power spectrum and ensemble-averaged two-point cross-correlation function. We built a two-parameter phase space from the temporal statistical analyses of the nearest-neighbor fluctuations that provided a quantitative biomarker for ASD and normal adult subjects and further classified ASD severity. We also found age-related developmental statistical signatures and potential ASD parental links in our movement dynamics studies. Our results may have direct clinical applications.
Mocellin, Simone; Pasquali, Sandro; Rossi, Carlo R; Nitti, Donato
2010-04-07
Based on previous meta-analyses of randomized controlled trials (RCTs), the use of interferon alpha (IFN-alpha) in the adjuvant setting improves disease-free survival (DFS) in patients with high-risk cutaneous melanoma. However, RCTs have yielded conflicting data on the effect of IFN-alpha on overall survival (OS). We conducted a systematic review and meta-analysis to examine the effect of IFN-alpha on DFS and OS in patients with high-risk cutaneous melanoma. The systematic review was performed by searching MEDLINE, EMBASE, Cancerlit, Cochrane, ISI Web of Science, and ASCO databases. The meta-analysis was performed using time-to-event data from which hazard ratios (HRs) and 95% confidence intervals (CIs) of DFS and OS were estimated. Subgroup and meta-regression analyses to investigate the effect of dose and treatment duration were also performed. Statistical tests were two-sided. The meta-analysis included 14 RCTs, published between 1990 and 2008, and involved 8122 patients, of which 4362 patients were allocated to the IFN-alpha arm. IFN-alpha alone was compared with observation in 12 of the 14 trials, and 17 comparisons (IFN-alpha vs comparator) were generated in total. IFN-alpha treatment was associated with a statistically significant improvement in DFS in 10 of the 17 comparisons (HR for disease recurrence = 0.82, 95% CI = 0.77 to 0.87; P < .001) and improved OS in four of the 14 comparisons (HR for death = 0.89, 95% CI = 0.83 to 0.96; P = .002). No between-study heterogeneity in either DFS or OS was observed. No optimal IFN-alpha dose and/or treatment duration or a subset of patients more responsive to adjuvant therapy was identified using subgroup analysis and meta-regression. In patients with high-risk cutaneous melanoma, IFN-alpha adjuvant treatment showed statistically significant improvement in both DFS and OS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
G. Ostrouchov; W.E.Doll; D.A.Wolf
2003-07-01
Unexploded ordnance (UXO) surveys encompass large areas, and the cost of surveying these areas can be high. Enactment of earlier protocols for sampling UXO sites has shown the shortcomings of these procedures and led to a call for the development of scientifically defensible statistical procedures for survey design and analysis. This project is one of three funded by SERDP to address this need.
Supaporn, Pansuwan; Yeom, Sung Ho
2018-04-30
This study investigated the biological conversion of crude glycerol generated from a commercial biodiesel production plant as a by-product to 1,3-propanediol (1,3-PD). Statistical analysis was employed to derive a statistical model for the individual and interactive effects of glycerol, (NH4)2SO4, trace elements, pH, and cultivation time on the four objectives: 1,3-PD concentration, yield, selectivity, and productivity. Optimum conditions for each objective with its maximum value were predicted by statistical optimization, and experiments under the optimum conditions verified the predictions. In addition, by systematic analysis of the values of the four objectives, optimum conditions for 1,3-PD concentration (49.8 g/L initial glycerol, 4.0 g/L of (NH4)2SO4, 2.0 mL/L of trace elements, pH 7.5, and 11.2 h of cultivation time) were determined to be the global optimum culture conditions for 1,3-PD production. Under these conditions, we could achieve high 1,3-PD yield (47.4%), 1,3-PD selectivity (88.8%), and 1,3-PD productivity (2.1 g/L/h) as well as high 1,3-PD concentration (23.6 g/L).
Implementation of statistical process control for proteomic experiments via LC MS/MS.
Bereman, Michael S; Johnson, Richard; Bollinger, James; Boss, Yuval; Shulman, Nick; MacLean, Brendan; Hoofnagle, Andrew N; MacCoss, Michael J
2014-04-01
Statistical process control (SPC) is a robust set of tools that aids in the visualization, detection, and identification of assignable causes of variation in any process that creates products, services, or information. A tool termed Statistical Process Control in Proteomics (SProCoP) has been developed which implements aspects of SPC (e.g., control charts and Pareto analysis) into the Skyline proteomics software. It monitors five quality control metrics in a shotgun or targeted proteomic workflow. None of these metrics require peptide identification. The source code, written in the R statistical language, runs directly from the Skyline interface, which supports the use of raw data files from several of the mass spectrometry vendors. It provides real-time evaluation of the chromatographic performance (e.g., retention time reproducibility, peak asymmetry, and resolution) and mass spectrometric performance (targeted peptide ion intensity and mass measurement accuracy for high resolving power instruments) via control charts. Thresholds are experiment- and instrument-specific and are determined empirically from user-defined quality control standards that enable the separation of random noise and systematic error. Finally, Pareto analysis provides a summary of performance metrics and guides the user to metrics with high variance. The utility of these charts to evaluate proteomic experiments is illustrated in two case studies.
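SProCoP itself is R code run from Skyline; as a language-neutral sketch of the control-chart idea only, the example below derives Shewhart-style limits for one QC metric from user-defined baseline runs and flags later runs that fall outside them. The retention-time values are hypothetical.

```python
import numpy as np

def control_limits(baseline, n_sigma=3.0):
    """Center line and control limits estimated from baseline QC runs."""
    center = baseline.mean()
    sigma = baseline.std(ddof=1)
    return center, center - n_sigma * sigma, center + n_sigma * sigma

# Hypothetical retention times (minutes) of a QC peptide across runs.
baseline = np.array([21.02, 21.05, 20.98, 21.01, 21.04, 20.99, 21.03, 21.00])
new_runs = np.array([21.02, 21.06, 21.35])  # the last run has drifted

center, lcl, ucl = control_limits(baseline)
for i, value in enumerate(new_runs, start=1):
    flag = "ok" if lcl <= value <= ucl else "out of control"
    print(f"run {i}: {value:.2f} min ({flag})")
```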
O'Shannessy, Daniel J; Bendas, Katie; Schweizer, Charles; Wang, Wenquan; Albone, Earl; Somers, Elizabeth B; Weil, Susan; Meredith, Rhonda K; Wustner, Jason; Grasso, Luigi; Landers, Mark; Nicolaides, Nicholas C
2017-07-01
Farletuzumab (FAR) is a humanized monoclonal antibody (mAb) that binds to folate receptor alpha. A Ph3 trial in ovarian cancer patients treated with carboplatin/taxane plus FAR or placebo did not meet the primary statistical endpoint. Subgroup analysis demonstrated that subjects with high FAR exposure levels (Cmin > 57.6 μg/mL) showed statistically significant improvements in PFS and OS. The neonatal Fc receptor (fcgrt) plays a central role in albumin/IgG stasis and mAb pharmacokinetics (PK). Here we evaluated the fcgrt sequence and the association of its promoter variable number tandem repeats (VNTR) and coding single nucleotide variants (SNV) with albumin/IgG levels and FAR PK in the Ph3 patients. A statistical correlation existed between high FAR Cmin and AUC in patients with the highest quartile of albumin and lowest quartile of IgG1. Analysis of fcgrt identified 5 different VNTRs in the promoter region and 9 SNVs within the coding region, 4 of which are novel. Copyright © 2017. Published by Elsevier Inc.
SPReM: Sparse Projection Regression Model For High-dimensional Linear Regression
Sun, Qiang; Zhu, Hongtu; Liu, Yufeng; Ibrahim, Joseph G.
2014-01-01
The aim of this paper is to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. Our SPReM is devised to specifically address the low statistical power issue of many standard statistical approaches, such as Hotelling's T2 test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPReM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis have shown that SPReM outperforms other state-of-the-art methods. PMID:26527844
A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis
Lin, Johnny; Bentler, Peter M.
2012-01-01
Goodness of fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square; but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and Satorra-Bentler's mean scaling statistic were developed under the presumption of non-normality in the factors and errors. This paper finds new application to the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of Satorra-Bentler's statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic in order to improve its robustness under small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent and Bibby's study of students tested for their ability in five content areas that were either open or closed book were used to illustrate the real-world performance of this statistic. PMID:23144511
Colegrave, Nick
2017-01-01
A common approach to the analysis of experimental data across much of the biological sciences is test-qualified pooling. Here non-significant terms are dropped from a statistical model, effectively pooling the variation associated with each removed term with the error term used to test hypotheses (or estimate effect sizes). This pooling is only carried out if statistical testing on the basis of applying that data to a previous more complicated model provides motivation for this model simplification; hence the pooling is test-qualified. In pooling, the researcher increases the degrees of freedom of the error term with the aim of increasing statistical power to test their hypotheses of interest. Despite this approach being widely adopted and explicitly recommended by some of the most widely cited statistical textbooks aimed at biologists, here we argue that (except in highly specialized circumstances that we can identify) the hoped-for improvement in statistical power will be small or non-existent, and there is likely to be much reduced reliability of the statistical procedures through deviation of type I error rates from nominal levels. We thus call for greatly reduced use of test-qualified pooling across experimental biology, more careful justification of any use that continues, and a different philosophy for initial selection of statistical models in the light of this change in procedure. PMID:28330912
Exploring the Link Between Streamflow Trends and Climate Change in Indiana, USA
NASA Astrophysics Data System (ADS)
Kumar, S.; Kam, J.; Thurner, K.; Merwade, V.
2007-12-01
Streamflow trends in Indiana are evaluated for 85 USGS streamflow gaging stations that have continuous unregulated streamflow records varying from 10 to 80 years. The trends are analyzed by using the non-parametric Mann-Kendall test with prior trend-free pre-whitening to remove serial correlation in the data. A bootstrap method is used to establish field significance of the results. Trends are computed for 12 streamflow statistics covering low-, medium- (median and mean flow), and high-flow conditions on annual and seasonal time steps. The analysis is done for six study periods, ranging from 10 years to more than 65 years, all ending in 2003. The trends in annual average streamflow, for the 50-year study period, are compared with annual average precipitation trends from 14 National Climatic Data Center (NCDC) stations in Indiana that have 50 years of continuous daily record. The results show field-significant positive trends in annual low and medium streamflow statistics at a majority of gaging stations for study periods that include 40 or more years of records. In the seasonal analysis, all flow statistics in summer and fall (low-flow seasons), and only low-flow statistics in winter and spring (high-flow seasons), show positive trends. No field-significant trends in annual and seasonal flow statistics are observed for study periods that include 25 or fewer years of records, except for northern Indiana, where localized negative trends are observed in the 10- and 15-year study periods. Further, streamflow trends are found to be highly correlated with precipitation trends on an annual time step. No apparent climate change signal is observed in Indiana streamflow records.
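A hedged sketch of the core Mann-Kendall trend test used in this analysis (without the pre-whitening and bootstrap field-significance steps) might look like the following; the synthetic low-flow series is illustrative only, and ties are ignored in the variance formula for brevity.

```python
import numpy as np
from scipy.stats import norm

def mann_kendall(x):
    """Mann-Kendall trend test: returns the S statistic and a two-sided p-value."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # S counts concordant minus discordant pairs over all i < j.
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / np.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / np.sqrt(var_s)
    else:
        z = 0.0
    return s, 2 * (1 - norm.cdf(abs(z)))

# Example: a 50-year annual low-flow series with a weak upward trend plus noise.
rng = np.random.default_rng(0)
years = np.arange(1954, 2004)
flow = 10 + 0.03 * (years - years[0]) + rng.normal(0, 0.5, years.size)
print(mann_kendall(flow))
```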
Li, Huanjie; Nickerson, Lisa D; Nichols, Thomas E; Gao, Jia-Hong
2017-03-01
Two powerful methods for statistical inference on MRI brain images have been proposed recently: a non-stationary voxelation-corrected cluster-size test (CST) based on random field theory, and threshold-free cluster enhancement (TFCE) based on calculating the level of local support for a cluster, then using permutation testing for inference. Unlike other statistical approaches, these two methods do not rest on the assumptions of a uniform and high degree of spatial smoothness of the statistic image. Thus, they are strongly recommended for group-level fMRI analysis compared to other statistical methods. In this work, the non-stationary voxelation-corrected CST and TFCE methods for group-level analysis were evaluated for both stationary and non-stationary images under varying smoothness levels, degrees of freedom and signal to noise ratios. Our results suggest that both methods provide adequate control for the number of voxel-wise statistical tests being performed during inference on fMRI data and they are both superior to current CSTs implemented in popular MRI data analysis software packages. However, TFCE is more sensitive and stable for group-level analysis of VBM data. Thus, the voxelation-corrected CST approach may confer some advantages by being computationally less demanding for fMRI data analysis than TFCE with permutation testing and by also being applicable for single-subject fMRI analyses, while the TFCE approach is advantageous for VBM data. Hum Brain Mapp 38:1269-1280, 2017. © 2016 Wiley Periodicals, Inc.
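For reference, TFCE scores are conventionally defined as an integral of cluster extent over cluster-forming thresholds (Smith and Nichols, 2009); the exponents shown below are the commonly used defaults and are not necessarily the settings used in this study.

```latex
\mathrm{TFCE}(p) \;=\; \int_{h=0}^{h_p} e(h)^{E}\, h^{H}\, \mathrm{d}h,
\qquad E = 0.5, \; H = 2,
```

where h_p is the statistic value at voxel p and e(h) is the spatial extent of the cluster containing p at cluster-forming threshold h.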
USDA-ARS?s Scientific Manuscript database
Recent advances in technology have led to the collection of high-dimensional data not previously encountered in many scientific environments. As a result, scientists are often faced with the challenging task of including these high-dimensional data into statistical models. For example, data from sen...
Anima: Modular Workflow System for Comprehensive Image Data Analysis
Rantanen, Ville; Valori, Miko; Hautaniemi, Sampsa
2014-01-01
Modern microscopes produce vast amounts of image data, and computational methods are needed to analyze and interpret these data. Furthermore, a single image analysis project may require tens or hundreds of analysis steps starting from data import and pre-processing to segmentation and statistical analysis; and ending with visualization and reporting. To manage such large-scale image data analysis projects, we present here a modular workflow system called Anima. Anima is designed for comprehensive and efficient image data analysis development, and it contains several features that are crucial in high-throughput image data analysis: programming language independence, batch processing, easily customized data processing, interoperability with other software via application programming interfaces, and advanced multivariate statistical analysis. The utility of Anima is shown with two case studies focusing on testing different algorithms developed in different imaging platforms and an automated prediction of alive/dead C. elegans worms by integrating several analysis environments. Anima is a fully open source and available with documentation at www.anduril.org/anima. PMID:25126541
Papadia, Andrea; Bellati, Filippo; Bogani, Giorgio; Ditto, Antonino; Martinelli, Fabio; Lorusso, Domenica; Donfrancesco, Cristina; Gasparri, Maria Luisa; Raspagliesi, Francesco
2015-12-01
The aim of this study was to identify clinical variables that may predict the need for adjuvant radiotherapy after neoadjuvant chemotherapy (NACT) and radical surgery in locally advanced cervical cancer patients. A retrospective series of cervical cancer patients with International Federation of Gynecology and Obstetrics (FIGO) stages IB2-IIB treated with NACT followed by radical surgery was analyzed. Clinical predictors of persistence of intermediate- and/or high-risk factors at final pathological analysis were investigated. Statistical analysis was performed using univariate and multivariate analysis and using a model based on artificial intelligence known as artificial neuronal network (ANN) analysis. Overall, 101 patients were available for the analyses. Fifty-two (51 %) patients were considered at high risk secondary to parametrial, resection margin and/or lymph node involvement. When disease was confined to the cervix, four (4 %) patients were considered at intermediate risk. At univariate analysis, FIGO grade 3, stage IIB disease at diagnosis and the presence of enlarged nodes before NACT predicted the presence of intermediate- and/or high-risk factors at final pathological analysis. At multivariate analysis, only FIGO grade 3 and tumor diameter maintained statistical significance. The specificity of ANN models in evaluating predictive variables was slightly superior to conventional multivariable models. FIGO grade, stage, tumor diameter, and histology are associated with persistence of pathological intermediate- and/or high-risk factors after NACT and radical surgery. This information is useful in counseling patients at the time of treatment planning with regard to the probability of being subjected to pelvic radiotherapy after completion of the initially planned treatment.
Low energy peripheral scaling in nucleon-nucleon scattering and uncertainty quantification
NASA Astrophysics Data System (ADS)
Ruiz Simo, I.; Amaro, J. E.; Ruiz Arriola, E.; Navarro Pérez, R.
2018-03-01
We analyze the peripheral structure of the nucleon-nucleon interaction for LAB energies below 350 MeV. To this end we transform the scattering matrix into the impact parameter representation by analyzing the scaled phase shifts (L + 1/2)δ_JLS(p) and the scaled mixing parameters (L + 1/2)ε_JLS(p) in terms of the impact parameter b = (L + 1/2)/p. According to the eikonal approximation, at large angular momentum L these functions should become a universal function of b, independent of L. This allows us to discuss in a rather transparent way the role of statistical and systematic uncertainties in the different long range components of the two-body potential. Implications for peripheral waves obtained in chiral perturbation theory interactions to fifth order (N5LO) or from the large body of NN data considered in the SAID partial wave analysis are also drawn by comparing them with other phenomenological high-quality interactions, constructed to fit scattering data as well. We find that both the N5LO and SAID peripheral waves disagree by more than 5σ with the Granada-2013 statistical analysis, by more than 2σ with the 6 statistically equivalent potentials fitting the Granada-2013 database, and by about 1σ with the historical set of 13 high-quality potentials developed since the 1993 Nijmegen analysis.
Jenkinson, Garrett; Abante, Jordi; Feinberg, Andrew P; Goutsias, John
2018-03-07
DNA methylation is a stable form of epigenetic memory used by cells to control gene expression. Whole genome bisulfite sequencing (WGBS) has emerged as a gold-standard experimental technique for studying DNA methylation by producing high resolution genome-wide methylation profiles. Statistical modeling and analysis are employed to computationally extract and quantify information from these profiles in an effort to identify regions of the genome that demonstrate crucial or aberrant epigenetic behavior. However, the performance of most currently available methods for methylation analysis is hampered by their inability to directly account for statistical dependencies between neighboring methylation sites, thus ignoring significant information available in WGBS reads. We present a powerful information-theoretic approach for genome-wide modeling and analysis of WGBS data based on the 1D Ising model of statistical physics. This approach takes into account correlations in methylation by utilizing a joint probability model that encapsulates all information available in WGBS methylation reads and produces accurate results even when applied on single WGBS samples with low coverage. Using the Shannon entropy, our approach provides a rigorous quantification of methylation stochasticity in individual WGBS samples genome-wide. Furthermore, it utilizes the Jensen-Shannon distance to evaluate differences in methylation distributions between a test and a reference sample. Differential performance assessment using simulated and real human lung normal/cancer data demonstrates a clear superiority of our approach over DSS, a recently proposed method for WGBS data analysis. Critically, these results demonstrate that marginal methods become statistically invalid when correlations are present in the data. This contribution demonstrates clear benefits and the necessity of modeling joint probability distributions of methylation using the 1D Ising model of statistical physics and of quantifying methylation stochasticity using concepts from information theory. By employing this methodology, substantial improvement of DNA methylation analysis can be achieved by effectively taking into account the massive amount of statistical information available in WGBS data, which is largely ignored by existing methods.
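A minimal sketch of the distributional comparison step described above, using SciPy's entropy and Jensen-Shannon distance on hypothetical discretized methylation-level distributions for a reference and a test sample; the Ising-model joint probabilities that the method actually builds such distributions from are not reproduced here.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import entropy

# Hypothetical distributions of methylation level within a genomic region,
# discretized into bins (e.g., fraction of methylated reads per site).
reference = np.array([0.70, 0.15, 0.05, 0.04, 0.06])
test = np.array([0.30, 0.20, 0.15, 0.15, 0.20])

# Shannon entropy (nats) quantifies methylation stochasticity per sample.
print("entropy reference:", entropy(reference))
print("entropy test:     ", entropy(test))

# Jensen-Shannon distance (base-2, bounded in [0, 1]) quantifies the
# difference between the two methylation distributions.
print("JS distance:", jensenshannon(reference, test, base=2))
```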
Chida, Koji; Morohashi, Gembu; Fuji, Hitoshi; Magata, Fumihiko; Fujimura, Akiko; Hamada, Koki; Ikarashi, Dai; Yamamoto, Ryuichi
2014-01-01
Background and objective: While the secondary use of medical data has gained attention, its adoption has been constrained due to protection of patient privacy. Making medical data secure by de-identification can be problematic, especially when the data concerns rare diseases. We require rigorous security management measures. Materials and methods: Using secure computation, an approach from cryptography, our system can compute various statistics over encrypted medical records without decrypting them. An issue of secure computation is that the amount of processing time required is immense. We implemented a system that securely computes healthcare statistics from the statistical computing software ‘R’ by effectively combining secret-sharing-based secure computation with original computation. Results: Testing confirmed that our system could correctly complete computation of average and unbiased variance of approximately 50 000 records of dummy insurance claim data in a little over a second. Computation including conditional expressions and/or comparison of values, for example, t test and median, could also be correctly completed in several tens of seconds to a few minutes. Discussion: If medical records are simply encrypted, the risk of leaks exists because decryption is usually required during statistical analysis. Our system possesses high-level security because medical records remain in encrypted state even during statistical analysis. Also, our system can securely compute some basic statistics with conditional expressions using ‘R’, which works interactively, while secure computation protocols generally require a significant amount of processing time. Conclusions: We propose a secure statistical analysis system using ‘R’ for medical data that effectively integrates secret-sharing-based secure computation and original computation. PMID:24763677
Chida, Koji; Morohashi, Gembu; Fuji, Hitoshi; Magata, Fumihiko; Fujimura, Akiko; Hamada, Koki; Ikarashi, Dai; Yamamoto, Ryuichi
2014-10-01
While the secondary use of medical data has gained attention, its adoption has been constrained due to protection of patient privacy. Making medical data secure by de-identification can be problematic, especially when the data concerns rare diseases. We require rigorous security management measures. Using secure computation, an approach from cryptography, our system can compute various statistics over encrypted medical records without decrypting them. An issue of secure computation is that the amount of processing time required is immense. We implemented a system that securely computes healthcare statistics from the statistical computing software 'R' by effectively combining secret-sharing-based secure computation with original computation. Testing confirmed that our system could correctly complete computation of average and unbiased variance of approximately 50,000 records of dummy insurance claim data in a little over a second. Computation including conditional expressions and/or comparison of values, for example, t test and median, could also be correctly completed in several tens of seconds to a few minutes. If medical records are simply encrypted, the risk of leaks exists because decryption is usually required during statistical analysis. Our system possesses high-level security because medical records remain in encrypted state even during statistical analysis. Also, our system can securely compute some basic statistics with conditional expressions using 'R' that works interactively while secure computation protocols generally require a significant amount of processing time. We propose a secure statistical analysis system using 'R' for medical data that effectively integrates secret-sharing-based secure computation and original computation. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
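As a toy illustration of the secret-sharing idea only (not the authors' protocol or its 'R' integration), the sketch below additively shares each record among three parties modulo a large prime, so that a sum, and hence a mean, can be reconstructed from locally aggregated shares without any single party seeing the raw values.

```python
import secrets

PRIME = 2**61 - 1  # arithmetic is done modulo a large prime

def share(value, n_parties=3):
    """Split an integer into additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

# Hypothetical claim amounts; each party only ever holds one share per record.
records = [1200, 540, 980, 2310]
per_party = list(zip(*[share(v) for v in records]))

# Each party sums its own shares locally; only the share-sums are combined.
local_sums = [sum(p) % PRIME for p in per_party]
total = reconstruct(local_sums)
print("secure sum:", total, "mean:", total / len(records))
```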
Monitoring the soil degradation by Metastatistical Analysis
NASA Astrophysics Data System (ADS)
Oleschko, K.; Gaona, C.; Tarquis, A.
2009-04-01
The effectiveness of the fractal toolbox to capture the critical behavior of soil structural patterns during chemical and physical degradation was documented by our numerous experiments (Oleschko et al., 2008a; 2008b). The spatio-temporal dynamics of these patterns was measured and mapped with high precision in terms of fractal descriptors. All tested fractal techniques were able to detect the statistically significant differences in structure between the perfect spongy and massive patterns of uncultivated and sodium-saline agricultural soils, respectively. For instance, the Hurst exponent, extracted from Chernozem micromorphological images and from the time series of its physical and mechanical properties measured in situ, detected the roughness decrease (and therefore the increase in H from 0.17 to 0.30 for images) derived from the loss of original structure complexity. The combined use of different fractal descriptors brings statistical precision into the quantification of natural system degradation and provides a means for objective soil structure comparison (Oleschko et al., 2000). The ability of fractal parameters to capture critical behavior and phase transition was documented for contrasting situations, ranging from Andosol deforestation and erosion to the high fracturing and consolidation of Vertisols. The Hurst exponent is used to measure the type of persistence and degree of complexity of structure dynamics. We conclude that there is an urgent need to select and adopt a standardized toolbox for fractal analysis and complexity measures in Earth Sciences. We propose to use second-order (meta-) statistics as subtle measures of complexity (Atmanspacher et al., 1997). A high degree of correlation was documented between the fractal and high-order statistical descriptors (four central moments of a stochastic variable's distribution) used for the analysis of system heterogeneity and variability. We propose to call this combined fractal/statistical toolbox Metastatistical Analysis and recommend it for projects directed at soil degradation monitoring. References: 1. Oleschko, K., B.S. Figueroa, M.E. Miranda, M.A. Vuelvas and E.R. Solleiro, Soil & Till. Res. 55, 43 (2000). 2. Oleschko, K., Korvin, G., Figueroa, S.B., Vuelvas, M.A., Balankin, A., Flores, L., Carreño, D. Fractal radar scattering from soil. Physical Review E 67, 041403, 2003. 3. Zamora-Castro, S., Oleschko, K., Flores, L., Ventura, E. Jr., Parrot, J.-F., 2008. Fractal mapping of pore and solids attributes. Vadose Zone Journal, 7(2): 473-492. 4. Oleschko, K., Korvin, G., Muñoz, A., Velásquez, J., Miranda, M.E., Carreon, D., Flores, L., Martínez, M., Velásquez-Valle, M., Brambilla, F., Parrot, J.-F., Ronquillo, G., 2008. Fractal mapping of soil moisture content from remote sensed multi-scale data. Nonlinear Processes in Geophysics, 15: 711-725. 5. Atmanspacher, H., Räth, Ch., Wiedenmann, G., 1997. Statistics and meta-statistics in the concept of complexity. Physica A, 234: 819-829.
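One simple way to estimate the Hurst exponent mentioned above is rescaled-range (R/S) analysis; the sketch below is a generic estimator for a 1-D series, not the toolbox used in the study.

```python
import numpy as np

def hurst_rs(x, min_chunk=8):
    """Estimate the Hurst exponent of a 1-D series by rescaled-range analysis."""
    x = np.asarray(x, dtype=float)
    n = x.size
    sizes, rs_values = [], []
    size = min_chunk
    while size <= n // 2:
        rs = []
        for start in range(0, n - size + 1, size):
            chunk = x[start:start + size]
            dev = np.cumsum(chunk - chunk.mean())   # cumulative deviation from the mean
            r = dev.max() - dev.min()               # range of the deviation
            s = chunk.std(ddof=1)                   # standard deviation of the chunk
            if s > 0:
                rs.append(r / s)
        sizes.append(size)
        rs_values.append(np.mean(rs))
        size *= 2
    # The Hurst exponent is the slope of log(R/S) versus log(chunk size).
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_values), 1)
    return slope

rng = np.random.default_rng(0)
print(hurst_rs(rng.standard_normal(4096)))           # white noise: H near 0.5
print(hurst_rs(np.cumsum(rng.standard_normal(4096))))  # random walk: estimate near 1
```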
Jeste, Shafali S; Kirkham, Natasha; Senturk, Damla; Hasenstab, Kyle; Sugar, Catherine; Kupelian, Chloe; Baker, Elizabeth; Sanders, Andrew J; Shimizu, Christina; Norona, Amanda; Paparella, Tanya; Freeman, Stephanny F N; Johnson, Scott P
2015-01-01
Statistical learning is characterized by detection of regularities in one's environment without an awareness or intention to learn, and it may play a critical role in language and social behavior. Accordingly, in this study we investigated the electrophysiological correlates of visual statistical learning in young children with autism spectrum disorder (ASD) using an event-related potential shape learning paradigm, and we examined the relation between visual statistical learning and cognitive function. Compared to typically developing (TD) controls, the ASD group as a whole showed reduced evidence of learning as defined by N1 (early visual discrimination) and P300 (attention to novelty) components. Upon further analysis, in the ASD group there was a positive correlation between N1 amplitude difference and non-verbal IQ, and a positive correlation between P300 amplitude difference and adaptive social function. Children with ASD and a high non-verbal IQ and high adaptive social function demonstrated a distinctive pattern of learning. This is the first study to identify electrophysiological markers of visual statistical learning in children with ASD. Through this work we have demonstrated heterogeneity in statistical learning in ASD that maps onto non-verbal cognition and adaptive social function. © 2014 John Wiley & Sons Ltd.
Real-time movement detection and analysis for video surveillance applications
NASA Astrophysics Data System (ADS)
Hueber, Nicolas; Hennequin, Christophe; Raymond, Pierre; Moeglin, Jean-Pierre
2014-06-01
Pedestrian movement along critical infrastructures like pipes, railways or highways is of major interest in surveillance applications, as is pedestrian behavior in urban environments. The goal is to anticipate illicit or dangerous human activities. For this purpose, we propose an all-in-one small autonomous system which delivers high-level statistics and reports alerts in specific cases. This situational awareness project leads us to manage the scene efficiently by performing movement analysis. A dynamic background extraction algorithm is developed to reach the required degree of robustness against natural and urban environment perturbations and also to match the embedded implementation constraints. When changes are detected in the scene, specific patterns are applied to detect and highlight relevant movements. Depending on the application, specific descriptors can be extracted and fused in order to reach a high level of interpretation. In this paper, our approach is applied to two operational use cases: pedestrian urban statistics and railway surveillance. In the first case, a grid of prototypes is deployed over a city centre to collect pedestrian movement statistics up to a macroscopic level of analysis. The results demonstrate the relevance of the delivered information; in particular, the flow density map highlights pedestrian preferential paths along the streets. In the second case, one prototype is set next to high-speed train tracks to secure the area. The results exhibit a low false alarm rate and assess our approach of a large sensor network for delivering a precise operational picture without overwhelming a supervisor.
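The prototype's dynamic background extraction algorithm is not published in detail; the sketch below shows the common running-average background model that the description resembles, with the adaptation rate and motion threshold as assumed parameters.

```python
import numpy as np

class RunningAverageBackground:
    """Maintain an exponentially weighted background image and flag moving pixels."""

    def __init__(self, first_frame, alpha=0.02, threshold=25):
        self.background = first_frame.astype(np.float32)
        self.alpha = alpha          # adaptation rate of the background model
        self.threshold = threshold  # grey-level difference treated as motion

    def update(self, frame):
        frame = frame.astype(np.float32)
        diff = np.abs(frame - self.background)
        motion_mask = diff > self.threshold
        # Only background pixels update the model, so moving objects are not
        # absorbed into it too quickly.
        self.background[~motion_mask] = (
            (1 - self.alpha) * self.background[~motion_mask]
            + self.alpha * frame[~motion_mask])
        return motion_mask

# Usage with a hypothetical iterator of 8-bit greyscale frames:
# detector = RunningAverageBackground(next(frames))
# for frame in frames:
#     mask = detector.update(frame)   # boolean mask of moving pixels
```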
Robustness of S1 statistic with Hodges-Lehmann for skewed distributions
NASA Astrophysics Data System (ADS)
Ahad, Nor Aishah; Yahaya, Sharipah Soaad Syed; Yin, Lee Ping
2016-10-01
Analysis of variance (ANOVA) is a commonly used parametric method to test the differences in means for more than two groups when the populations are normally distributed. ANOVA is highly inefficient under non-normal and heteroscedastic settings. When the assumptions are violated, researchers look for alternatives such as the nonparametric Kruskal-Wallis test or robust methods. This study focused on a flexible method, the S1 statistic, for comparing groups using the median as the location estimator. The S1 statistic was modified by substituting the median with the Hodges-Lehmann estimator and the default scale estimator with the variance of Hodges-Lehmann and MADn, producing two different test statistics for comparing groups. The bootstrap method was used for testing the hypotheses since the sampling distributions of these modified S1 statistics are unknown. The performance of the proposed statistics in terms of Type I error was measured and compared against the original S1 statistic, ANOVA and Kruskal-Wallis. The proposed procedures show improvement over the original statistic, especially under extremely skewed distributions.
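A hedged sketch of the building blocks named above, the one-sample Hodges-Lehmann estimator (median of pairwise Walsh averages) and a percentile bootstrap comparison of two groups' centres, is shown below; it is an illustration only, not the modified S1 statistic from the study.

```python
import numpy as np

def hodges_lehmann(x):
    """One-sample Hodges-Lehmann estimator: median of all pairwise Walsh averages."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.size)          # pairs (i, j) with i <= j
    return np.median((x[i] + x[j]) / 2.0)

def bootstrap_difference(x, y, n_boot=5000, rng=None):
    """Percentile bootstrap interval for the difference of HL centres of two groups."""
    rng = np.random.default_rng(rng)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        xb = rng.choice(x, size=x.size, replace=True)
        yb = rng.choice(y, size=y.size, replace=True)
        diffs[b] = hodges_lehmann(xb) - hodges_lehmann(yb)
    return np.percentile(diffs, [2.5, 97.5])

rng = np.random.default_rng(0)
group1 = rng.exponential(scale=1.0, size=30)        # skewed data
group2 = rng.exponential(scale=1.0, size=30) + 0.8  # shifted by 0.8
print(hodges_lehmann(group1), hodges_lehmann(group2))
print(bootstrap_difference(group1, group2))
```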
AutoBayes Program Synthesis System Users Manual
NASA Technical Reports Server (NTRS)
Schumann, Johann; Jafari, Hamed; Pressburger, Tom; Denney, Ewen; Buntine, Wray; Fischer, Bernd
2008-01-01
Program synthesis is the systematic, automatic construction of efficient executable code from high-level declarative specifications. AutoBayes is a fully automatic program synthesis system for the statistical data analysis domain; in particular, it solves parameter estimation problems. It has seen many successful applications at NASA and is currently being used, for example, to analyze simulation results for Orion. The input to AutoBayes is a concise description of a data analysis problem composed of a parameterized statistical model and a goal that is a probability term involving parameters and input data. The output is optimized and fully documented C/C++ code computing the values for those parameters that maximize the probability term. AutoBayes can solve many subproblems symbolically rather than having to rely on numeric approximation algorithms, thus yielding effective, efficient, and compact code. Statistical analysis is faster and more reliable, because effort can be focused on model development and validation rather than manual development of solution algorithms and code.
Statistical principle and methodology in the NISAN system.
Asano, C
1979-01-01
The NISAN system is a new interactive statistical analysis program package constructed by an organization of Japanese statisticians. The package is widely applicable to both statistical situations, confirmatory analysis and exploratory analysis, and is intended to capture statistical wisdom and to help senior statisticians choose an optimal process of statistical analysis. PMID:540594
ERIC Educational Resources Information Center
Delacroix, Jacques; Ragin, Charles C.
1981-01-01
Presents a statistical analysis of dependency of developing nations on more highly developed and industrialized nations and relates this dependency to various degrees of economic development. The analysis is based on the structural blockage argument (one of several dependency arguments contained in many versions of dependency theory). Emphasizes…
Principal Component Analysis: Resources for an Essential Application of Linear Algebra
ERIC Educational Resources Information Center
Pankavich, Stephen; Swanson, Rebecca
2015-01-01
Principal Component Analysis (PCA) is a highly useful topic within an introductory Linear Algebra course, especially since it can be used to incorporate a number of applied projects. This method represents an essential application and extension of the Spectral Theorem and is commonly used within a variety of fields, including statistics,…
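As a worked illustration of the kind of linear-algebra exercise such a course project might include, here is a minimal PCA computed from the singular value decomposition. The data and variable names are illustrative, not taken from the article.

```python
# PCA via SVD: centre the data, take the right singular vectors as loadings.
import numpy as np

def pca(X, n_components=2):
    """Return principal component scores and explained-variance ratios."""
    Xc = X - X.mean(axis=0)                    # centre each column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt.T[:, :n_components]       # project onto leading right singular vectors
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 1] = 0.8 * X[:, 0] + 0.2 * X[:, 1]        # induce correlation so PC1 dominates
scores, ratios = pca(X)
print("explained variance ratios:", np.round(ratios, 3))
```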
Introducing Undergraduate Students to Metabolomics Using a NMR-Based Analysis of Coffee Beans
ERIC Educational Resources Information Center
Sandusky, Peter Olaf
2017-01-01
Metabolomics applies multivariate statistical analysis to sets of high-resolution spectra taken over a population of biologically derived samples. The objective is to distinguish subpopulations within the overall sample population, and possibly also to identify biomarkers. While metabolomics has become part of the standard analytical toolbox in…
Correlative weighted stacking for seismic data in the wavelet domain
Zhang, S.; Xu, Y.; Xia, J.; ,
2004-01-01
Horizontal stacking plays a crucial role in modern seismic data processing: it not only suppresses random noise and multiple reflections, but also provides foundational data for subsequent migration and inversion. However, a number of examples have shown that random noise in adjacent traces exhibits correlation and coherence. Average stacking and weighted stacking based on the conventional correlation function both produce false events caused by such noise. The wavelet transform and higher-order statistics are very useful tools in modern signal processing. Multiresolution analysis in wavelet theory can decompose a signal at different scales, and a higher-order correlation function can suppress correlated noise against which the conventional correlation function is useless. Based on the theory of the wavelet transform and higher-order statistics, a higher-order correlative weighted stacking (HOCWS) technique is presented in this paper. Its essence is to stack common-midpoint gathers after normal moveout correction using weights calculated from higher-order correlation statistics in the wavelet domain. Synthetic examples demonstrate its advantages in improving the signal-to-noise (S/N) ratio and suppressing correlated random noise.
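The following is a rough sketch of the general idea, not the authors' HOCWS implementation: each NMO-corrected trace is decomposed with a discrete wavelet transform, a per-trace weight is derived from a fourth-order cross moment with a pilot trace (my stand-in for the higher-order correlation weighting), and the weighted traces are stacked.

```python
# Weighted stacking in the wavelet domain with higher-order similarity weights (illustrative).
import numpy as np
import pywt

def weighted_wavelet_stack(gather, wavelet="db4", level=3):
    """gather: 2-D array (n_traces, n_samples) of NMO-corrected traces."""
    pilot = gather.mean(axis=0)                              # simple pilot (mean) trace
    c_pilot = np.concatenate(pywt.wavedec(pilot, wavelet, level=level))
    weights = []
    for trace in gather:
        c = np.concatenate(pywt.wavedec(trace, wavelet, level=level))
        num = np.mean((c * c_pilot) ** 2)                    # fourth-order cross moment
        den = np.sqrt(np.mean(c ** 4) * np.mean(c_pilot ** 4)) + 1e-12
        weights.append(num / den)
    weights = np.asarray(weights)
    weights /= weights.sum()
    return weights @ gather                                  # weighted stack of the traces

# Synthetic example: one common event plus noise on 24 traces
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 256)
signal = np.exp(-((t - 0.5) ** 2) / 0.001)
gather = np.array([signal + 0.5 * rng.normal(size=t.size) for _ in range(24)])
print("peak of stacked trace:", round(weighted_wavelet_stack(gather).max(), 3))
```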
Lukášová, Ivana; Muselík, Jan; Franc, Aleš; Goněc, Roman; Mika, Filip; Vetchý, David
2017-11-15
Warfarin is an intensively discussed drug with a narrow therapeutic range. There have been cases of bleeding attributed to varying content or altered quality of the active substance. Factor analysis is useful for finding suitable technological parameters leading to high content uniformity of tablets containing a low amount of active substance. The composition of the tabletting blend and the technological procedure were set based on factor analysis of previously published results. The correctness of the chosen parameters was checked by manufacturing and evaluating tablets containing 1-10 mg of warfarin sodium. The robustness of the suggested technology was checked using a "worst case scenario" and statistical evaluation against European Pharmacopoeia (EP) content uniformity limits with respect to the Bergum division and the process capability index (Cpk). To evaluate the quality of the active substance and the tablets, a dissolution method was developed (water; EP apparatus II; 25 rpm), allowing statistical comparison of dissolution profiles. The results prove the suitability of factor analysis for optimizing the composition with respect to previously manufactured batches, and thus the use of meta-analysis under industrial conditions is feasible. Copyright © 2017 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zubko, I. Yu., E-mail: zoubko@list.ru; Kochurov, V. I.
2015-10-27
For the purpose of crystal temperature control, a computational-statistical approach to studying the thermo-mechanical properties of finite-sized crystals is presented. The approach combines high-performance computational techniques with statistical analysis of the crystal response to external thermo-mechanical actions for specimens with a statistically small number of atoms (for instance, nanoparticles). The thermal motion of atoms is imitated within a statics approach by including independent degrees of freedom associated with the atomic oscillations. We find that under heating the response of the graphene material is nonsymmetric.
NASA Astrophysics Data System (ADS)
Lee, K. David; Wiesenfeld, Eric; Gelfand, Andrew
2007-04-01
One of the greatest challenges in modern combat is maintaining a high level of timely Situational Awareness (SA). In many situations, computational complexity and accuracy considerations make the development and deployment of real-time, high-level inference tools very difficult. An innovative hybrid framework that combines Bayesian inference, in the form of Bayesian Networks, and Possibility Theory, in the form of Fuzzy Logic systems, has recently been introduced to provide a rigorous framework for high-level inference. Previous research developed the theoretical basis and benefits of the hybrid approach. However, a concrete experimental comparison of the hybrid framework with traditional fusion methods, demonstrating and quantifying this benefit, has been lacking. The goal of this research, therefore, is to provide a statistical analysis comparing the accuracy and performance of hybrid network theory with pure Bayesian and Fuzzy systems and with an inexact Bayesian system approximated using Particle Filtering. To accomplish this task, domain-specific models will be developed under these different theoretical approaches and then evaluated, via Monte Carlo simulation, against situational ground truth to measure accuracy and fidelity. Following this, a rigorous statistical analysis of the performance results will be performed to quantify the benefit of hybrid inference over other fusion tools.
Petrovecki, Mladen; Rahelić, Dario; Bilić-Zulle, Lidija; Jelec, Vjekoslav
2003-02-01
To investigate whether and to what extent various parameters, such as individual characteristics, computer habits, situational factors, and pseudoscientific variables, influence the Medical Informatics examination grade, and how inadequate statistical analysis can lead to wrong conclusions. The study included a total of 382 second-year undergraduate students at the Rijeka University School of Medicine in the period from the 1996/97 to the 2000/01 academic year. After passing the Medical Informatics exam, students filled out an anonymous questionnaire about their attitude toward learning medical informatics. They were asked to grade the course organization and curriculum content, and to provide their date of birth; sex; study year; high school grades; Medical Informatics examination grade, type, and term; and a description of their computer habits. From these data, we determined their zodiac signs and biorhythms. Data were compared by use of the t-test, one-way ANOVA with Tukey's honestly significant difference test, and randomized complete block design ANOVA. Out of 21 variables analyzed, only 10 correlated with the average grade. Students taking the Medical Informatics examination in the 1998/99 academic year earned a lower average grade than any other generation. A significantly higher Medical Informatics exam grade was earned by students who had finished a grammar high school; owned and regularly used a computer, the Internet, and e-mail (p ≤ 0.002 for all items); passed an oral exam without taking a written test (p = 0.004); or did not repeat the exam (p < 0.001). Better high-school students and students with better grades from a high-school informatics course also scored significantly better (p = 0.032 and p < 0.001, respectively). Grade in high-school mathematics, student's sex, and the time of year when the examination was taken were not related to the grade, and neither were pseudoscientific parameters such as student zodiac sign, zodiac sign quality, or biorhythm cycles, except when intentionally inadequate statistics were used for data analysis. Medical Informatics examination grades correlated with the general learning capacity and computer habits of students, but showed no relation to the other investigated parameters, such as examination term or pseudoscientific parameters. Inadequate statistical analysis can always confirm false conclusions.
1992-01-09
Crystal Polymers (Tracy Reed, Geophysics Laboratory (GEO)); Analysis of Model Output Statistics Thunderstorm Prediction Model (Frank Lasley) ... four hours to twenty-four hours. It was predicted that the dogbones would turn brown once they reached the approximate annealing temperature. This was ... Hanscom AFB, Frank A. Lasley. Abstract: Model Output Statistics (MOS) Thunderstorm prediction information and Service A weather observations
Rapid analysis of pharmaceutical drugs using LIBS coupled with multivariate analysis.
Tiwari, P K; Awasthi, S; Kumar, R; Anand, R K; Rai, P K; Rai, A K
2018-02-01
Type 2 diabetes drug tablets of various brands containing voglibose at dose strengths of 0.2 and 0.3 mg have been examined using the laser-induced breakdown spectroscopy (LIBS) technique. Statistical methods such as principal component analysis (PCA) and partial least squares regression analysis (PLSR) have been employed on the LIBS spectral data for classifying drug samples and developing calibration models. We have developed a ratio-based calibration model applying PLSR, in which the relative spectral intensity ratios H/C, H/N and O/N are used. Further, the developed model has been employed to predict the relative concentration of elements in unknown drug samples. The experiment has been performed in air and in an argon atmosphere, and the obtained results have been compared. The present model provides a rapid spectroscopic method for drug analysis with high statistical significance for online control and measurement processes in a wide variety of pharmaceutical industrial applications.
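A minimal sketch of this kind of ratio-based PLSR calibration is shown below, using scikit-learn's PLSRegression. The intensity ratios, dose values and their relationship are synthetic placeholders, not data from the study.

```python
# PLSR calibration built from relative spectral intensity ratios (H/C, H/N, O/N).
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(0)
# Hypothetical training set: intensity ratios for tablets of known dose strength
ratios = rng.normal(loc=[1.2, 0.8, 1.5], scale=0.05, size=(20, 3))      # H/C, H/N, O/N
dose = 0.2 + 0.5 * (ratios[:, 0] - 1.2) + rng.normal(0, 0.005, 20)      # mg, synthetic link

model = PLSRegression(n_components=2).fit(ratios, dose)
unknown = rng.normal(loc=[1.25, 0.8, 1.5], scale=0.05, size=(5, 3))
print("predicted dose (mg):", model.predict(unknown).ravel().round(3))
```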
Cyber Risk Management for Critical Infrastructure: A Risk Analysis Model and Three Case Studies.
Paté-Cornell, M-Elisabeth; Kuypers, Marshall; Smith, Matthew; Keller, Philip
2018-02-01
Managing cyber security in an organization involves allocating the protection budget across a spectrum of possible options. This requires assessing the benefits and the costs of these options. The risk analyses presented here are statistical when relevant data are available, and system-based for high-consequence events that have not happened yet. This article presents, first, a general probabilistic risk analysis framework for cyber security in an organization to be specified. It then describes three examples of forward-looking analyses motivated by recent cyber attacks. The first one is the statistical analysis of an actual database, extended at the upper end of the loss distribution by a Bayesian analysis of possible, high-consequence attack scenarios that may happen in the future. The second is a systems analysis of cyber risks for a smart, connected electric grid, showing that there is an optimal level of connectivity. The third is an analysis of sequential decisions to upgrade the software of an existing cyber security system or to adopt a new one to stay ahead of adversaries trying to find their way in. The results are distributions of losses to cyber attacks, with and without some considered countermeasures in support of risk management decisions based both on past data and anticipated incidents. © 2017 Society for Risk Analysis.
Peronard, Jean-Paul
2013-01-01
This article is a comparative analysis between workers in health care with high and low degree of readiness for living technology such as robotics. To explore the differences among workers' readiness, statistical analysis was conducted in a data set obtained from 200 respondents. The results showed important differences between high- and low-readiness types on issues such as staff security, documentation, autonomy, and future challenges.
1998-01-01
Ferrography on High Performance Aircraft Engine Lubricating Oils. Allison M. Toms, Sharon O. Hem, Tim Yarborough, Joint Oil Analysis Program Technical...turbine engines by spectroscopy (AES and FT-IR) and direct reading and analytical ferrography. A statistical analysis of the data collected is...presented. Key Words: analytical ferrography; atomic emission spectroscopy; condition monitoring; direct reading ferrography; Fourier transform infrared
ERIC Educational Resources Information Center
Rushton, Gregory T.; Rosengrant, David; Dewar, Andrew; Shah, Lisa; Ray, Herman E.; Sheppard, Keith; Watanabe, Lynn
2017-01-01
Efforts to improve the number and quality of the high school physics teaching workforce have taken several forms, including those sponsored by professional organizations. Using a series of large-scale teacher demographic data sets from the National Center for Education Statistics (NCES), this study sought to investigate trends in teacher quality…
ERIC Educational Resources Information Center
Horne, Lela M.; Rachal, John R.; Shelley, Kyna
2012-01-01
A mixed methods framework utilized quantitative and qualitative data to determine whether statistically significant differences existed between high school and GED[R] student perceptions of credential value. An exploratory factor analysis (n=326) extracted four factors and then a MANOVA procedure was performed with a stratified quota sample…
Suárez, Susanna; Rubio, Arantxa; Sueiro, Rosa Ana; Garrido, Joaquín
2003-06-06
In some cities of the autonomous community of Extremadura (south-west of Spain), levels of simazine from 10 to 30 ppm were detected in tap water. To analyse the possible effect of this herbicide, two biomarkers, sister chromatid exchanges (SCE) and micronuclei (MN), were used in peripheral blood lymphocytes from males exposed to simazine through drinking water. SCE and MN analysis failed to detect any statistically significant increase in the people exposed to simazine when compared with the controls. With respect to high frequency cells (HFC), a statistically significant difference was detected between exposed and control groups.
Yan, Jianjun; Shen, Xiaojing; Wang, Yiqin; Li, Fufeng; Xia, Chunming; Guo, Rui; Chen, Chunfeng; Shen, Qingwei
2010-01-01
This study aims to utilise the Wavelet Packet Transform (WPT) and the Support Vector Machine (SVM) algorithm for objective analysis and quantitative research on auscultation in Traditional Chinese Medicine (TCM) diagnosis. First, Wavelet Packet Decomposition (WPD) at level 6 was employed to split the auscultation signals into finer frequency bands. Then statistical analysis was performed on the Wavelet Packet Energy (WPE) features extracted from the WPD coefficients. Furthermore, pattern recognition with SVM was used to distinguish the statistical feature values of mixed subject sample groups. Finally, the experimental results showed that the classification accuracies were at a high level.
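A minimal sketch of such a pipeline is given below: a level-6 wavelet packet decomposition, the energy of each terminal node as the feature vector, and an SVM classifier. The signals, wavelet choice and labels are synthetic assumptions for illustration only.

```python
# Wavelet Packet Energy features + SVM classification (illustrative).
import numpy as np
import pywt
from sklearn.svm import SVC

def wpe_features(signal, wavelet="db4", level=6):
    """Wavelet Packet Energy of every node at the given decomposition level."""
    wp = pywt.WaveletPacket(signal, wavelet=wavelet, maxlevel=level)
    return np.array([np.sum(node.data ** 2) for node in wp.get_level(level, "freq")])

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1024)
# Two synthetic signal classes differing in dominant frequency
class_a = [np.sin(2 * np.pi * 80 * t) + 0.3 * rng.normal(size=t.size) for _ in range(20)]
class_b = [np.sin(2 * np.pi * 200 * t) + 0.3 * rng.normal(size=t.size) for _ in range(20)]

X = np.array([wpe_features(s) for s in class_a + class_b])
y = np.array([0] * 20 + [1] * 20)
clf = SVC(kernel="rbf", gamma="scale").fit(X[::2], y[::2])     # train on every second sample
print("held-out accuracy:", clf.score(X[1::2], y[1::2]))
```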
Statistics of SU(5) D-brane models on a type II orientifold
NASA Astrophysics Data System (ADS)
Gmeiner, Florian; Stein, Maren
2006-06-01
We perform a statistical analysis of models with SU(5) and flipped SU(5) gauge group in a type II orientifold setup. We investigate the distribution and correlation of properties of these models, including the number of generations and the hidden sector gauge group. Compared to the recent analysis [F. Gmeiner, R. Blumenhagen, G. Honecker, D. Lüst, and T. Weigand, J. High Energy Phys. 01 (2006) 004; F. Gmeiner, Fortschr. Phys. 54, 391 (2006)] of models with a standard model-like gauge group, we find very similar results.
Properties of different selection signature statistics and a new strategy for combining them.
Ma, Y; Ding, X; Qanbari, S; Weigend, S; Zhang, Q; Simianer, H
2015-11-01
Identifying signatures of recent or ongoing selection is of high relevance in livestock population genomics. From a statistical perspective, determining a proper testing procedure and combining various test statistics is challenging. On the basis of extensive simulations in this study, we discuss the statistical properties of eight different established selection signature statistics. In the considered scenario, we show that a reasonable power to detect selection signatures is achieved with high marker density (>1 SNP/kb) as obtained from sequencing, while rather small sample sizes (~15 diploid individuals) appear to be sufficient. Most selection signature statistics such as the composite likelihood ratio and cross-population extended haplotype homozygosity have the highest power when fixation of the selected allele is reached, while the integrated haplotype score has the highest power when selection is ongoing. We suggest a novel strategy, called de-correlated composite of multiple signals (DCMS), to combine different statistics for detecting selection signatures while accounting for the correlation between the different selection signature statistics. When examined with simulated data, DCMS consistently has higher power than most of the single statistics and shows reliable positional resolution. We apply the new statistic to the established selective sweep around the lactase gene in human HapMap data, providing further evidence of its reliability, and then use it to scan for selection signatures in two chicken samples with diverse skin color. Our analysis suggests that a set of well-known genes such as BCO2, MC1R, ASIP and TYR were involved in the divergent selection for this trait.
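The sketch below illustrates the general idea of combining several per-locus statistics while down-weighting those that are strongly correlated with each other, in the spirit of DCMS; it is not the published DCMS formula, and the statistics and weighting scheme are illustrative assumptions.

```python
# Combine z-scored per-locus statistics with correlation-aware weights (illustrative).
import numpy as np

def decorrelated_composite(stats):
    """stats: array (n_loci, n_statistics); returns one composite score per locus."""
    z = (stats - stats.mean(axis=0)) / stats.std(axis=0)       # z-score each statistic
    r = np.corrcoef(z, rowvar=False)                           # correlation between statistics
    # weight_j = 1 / (1 + sum of |correlation| with the other statistics)
    weights = 1.0 / (1.0 + (np.abs(r).sum(axis=0) - 1.0))
    return z @ weights

rng = np.random.default_rng(0)
n_loci = 1000
base = rng.normal(size=n_loci)
stats = np.column_stack([
    base + 0.3 * rng.normal(size=n_loci),     # two strongly correlated statistics
    base + 0.3 * rng.normal(size=n_loci),
    rng.normal(size=n_loci),                  # one independent statistic
])
composite = decorrelated_composite(stats)
print("top 5 candidate loci:", np.argsort(composite)[-5:])
```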
Imaging mass spectrometry statistical analysis.
Jones, Emrys A; Deininger, Sören-Oliver; Hogendoorn, Pancras C W; Deelder, André M; McDonnell, Liam A
2012-08-30
Imaging mass spectrometry is increasingly used to identify new candidate biomarkers. This clinical application of imaging mass spectrometry is highly multidisciplinary: expertise in mass spectrometry is necessary to acquire high quality data, histology is required to accurately label the origin of each pixel's mass spectrum, disease biology is necessary to understand the potential meaning of the imaging mass spectrometry results, and statistics to assess the confidence of any findings. Imaging mass spectrometry data analysis is further complicated because of the unique nature of the data (within the mass spectrometry field); several of the assumptions implicit in the analysis of LC-MS/profiling datasets are not applicable to imaging. The very large size of imaging datasets and the reporting of many data analysis routines, combined with inadequate training and a lack of accessible reviews, have exacerbated this problem. In this paper we provide an accessible review of the nature of imaging data and the different strategies by which the data may be analyzed. Particular attention is paid to the assumptions of the data analysis routines to ensure that the reader is apprised of their correct usage in imaging mass spectrometry research. Copyright © 2012 Elsevier B.V. All rights reserved.
Sample size and power considerations in network meta-analysis
2012-01-01
Background Network meta-analysis is becoming increasingly popular for establishing comparative effectiveness among multiple interventions for the same disease. Network meta-analysis inherits all methodological challenges of standard pairwise meta-analysis, but with increased complexity due to the multitude of intervention comparisons. One issue that is now widely recognized in pairwise meta-analysis is the issue of sample size and statistical power. This issue, however, has so far only received little attention in network meta-analysis. To date, no approaches have been proposed for evaluating the adequacy of the sample size, and thus power, in a treatment network. Findings In this article, we develop easy-to-use flexible methods for estimating the ‘effective sample size’ in indirect comparison meta-analysis and network meta-analysis. The effective sample size for a particular treatment comparison can be interpreted as the number of patients in a pairwise meta-analysis that would provide the same degree and strength of evidence as that which is provided in the indirect comparison or network meta-analysis. We further develop methods for retrospectively estimating the statistical power for each comparison in a network meta-analysis. We illustrate the performance of the proposed methods for estimating effective sample size and statistical power using data from a network meta-analysis on interventions for smoking cessation including over 100 trials. Conclusion The proposed methods are easy to use and will be of high value to regulatory agencies and decision makers who must assess the strength of the evidence supporting comparative effectiveness estimates. PMID:22992327
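The following is a minimal sketch of the 'effective sample size' heuristic for an indirect comparison of A versus B through a common comparator C. The harmonic-style formula below is the commonly quoted combination and is presented as an illustration of the idea rather than a verbatim transcription of the article's methods.

```python
# Effective sample size for an indirect comparison A vs B via common comparator C.
def effective_sample_size(n_ac, n_bc):
    """Number of patients in a pairwise meta-analysis giving equivalent evidence strength."""
    return (n_ac * n_bc) / (n_ac + n_bc)

# Example: 1200 patients inform A vs C, 800 patients inform B vs C
print(round(effective_sample_size(1200, 800)))   # -> 480 'effective' patients for A vs B
```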
NASA Astrophysics Data System (ADS)
Ilyas, Muhammad; Salwah
2017-02-01
This research was experimental. The purpose of the study was to determine the difference in, and the quality of, students' learning achievement between students taught through the Realistic Mathematics Education (RME) approach and students taught through a problem-solving approach. The study was quasi-experimental with a non-equivalent group design. The population was all grade VII students of a junior high school in Palopo in the second semester of the 2015/2016 academic year. Two classes were selected purposively as the sample: class VII-5 (28 students) as experiment group I and class VII-6 (23 students) as experiment group II. Experiment group I was taught using the RME approach, whereas experiment group II was taught using the problem-solving approach. Data were collected by giving students a pretest and a posttest. The analysis used descriptive statistics and inferential statistics based on the t-test. The descriptive analysis showed that the average mathematics score of students taught using the problem-solving approach was similar to that of students taught using the RME approach, with both at the high category. In addition, it can be concluded that (1) there was no difference in mathematics learning achievement between students taught using the RME approach and students taught using the problem-solving approach, and (2) the quality of learning achievement under the two approaches was the same, both at the high category.
Lazo Gonzalez, Eduardo; Hilgenfeld, Tim; Kickingereder, Philipp; Bendszus, Martin; Heiland, Sabine; Ozga, Ann-Kathrin; Sommer, Andreas; Lux, Christopher J.; Zingler, Sebastian
2017-01-01
Objective The objective of this prospective study was to evaluate whether magnetic resonance imaging (MRI) is equivalent to lateral cephalometric radiographs (LCR, “gold standard”) in cephalometric analysis. Methods The applied MRI technique was optimized for short scanning time, high resolution, high contrast and geometric accuracy. Prior to orthodontic treatment, 20 patients (mean age ± SD, 13.95 years ± 5.34) received MRI and LCR. MRI datasets were postprocessed into lateral cephalograms. Cephalometric analysis was performed twice by two independent observers for both modalities with an interval of 4 weeks. Eight bilateral and 10 midsagittal landmarks were identified, and 24 widely used measurements (14 angles, 10 distances) were calculated. Statistical analysis was performed by using intraclass correlation coefficient (ICC), Bland-Altman analysis and two one-sided tests (TOST) within the predefined equivalence margin of ± 2°/mm. Results Geometric accuracy of the MRI technique was confirmed by phantom measurements. Mean intraobserver ICC were 0.977/0.975 for MRI and 0.975/0.961 for LCR. Average interobserver ICC were 0.980 for MRI and 0.929 for LCR. Bland-Altman analysis showed high levels of agreement between the two modalities, bias range (mean ± SD) was -0.66 to 0.61 mm (0.06 ± 0.44) for distances and -1.33 to 1.14° (0.06 ± 0.71) for angles. Except for the interincisal angle (p = 0.17) all measurements were statistically equivalent (p < 0.05). Conclusions This study demonstrates feasibility of orthodontic treatment planning without radiation exposure based on MRI. High-resolution isotropic MRI datasets can be transformed into lateral cephalograms allowing reliable measurements as applied in orthodontic routine with high concordance to the corresponding measurements on LCR. PMID:28334054
Heil, Alexander; Lazo Gonzalez, Eduardo; Hilgenfeld, Tim; Kickingereder, Philipp; Bendszus, Martin; Heiland, Sabine; Ozga, Ann-Kathrin; Sommer, Andreas; Lux, Christopher J; Zingler, Sebastian
2017-01-01
The objective of this prospective study was to evaluate whether magnetic resonance imaging (MRI) is equivalent to lateral cephalometric radiographs (LCR, "gold standard") in cephalometric analysis. The applied MRI technique was optimized for short scanning time, high resolution, high contrast and geometric accuracy. Prior to orthodontic treatment, 20 patients (mean age ± SD, 13.95 years ± 5.34) received MRI and LCR. MRI datasets were postprocessed into lateral cephalograms. Cephalometric analysis was performed twice by two independent observers for both modalities with an interval of 4 weeks. Eight bilateral and 10 midsagittal landmarks were identified, and 24 widely used measurements (14 angles, 10 distances) were calculated. Statistical analysis was performed by using intraclass correlation coefficient (ICC), Bland-Altman analysis and two one-sided tests (TOST) within the predefined equivalence margin of ± 2°/mm. Geometric accuracy of the MRI technique was confirmed by phantom measurements. Mean intraobserver ICC were 0.977/0.975 for MRI and 0.975/0.961 for LCR. Average interobserver ICC were 0.980 for MRI and 0.929 for LCR. Bland-Altman analysis showed high levels of agreement between the two modalities, bias range (mean ± SD) was -0.66 to 0.61 mm (0.06 ± 0.44) for distances and -1.33 to 1.14° (0.06 ± 0.71) for angles. Except for the interincisal angle (p = 0.17) all measurements were statistically equivalent (p < 0.05). This study demonstrates feasibility of orthodontic treatment planning without radiation exposure based on MRI. High-resolution isotropic MRI datasets can be transformed into lateral cephalograms allowing reliable measurements as applied in orthodontic routine with high concordance to the corresponding measurements on LCR.
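Two of the analyses named above, Bland-Altman agreement and the two one-sided tests (TOST) for equivalence, are sketched below on synthetic paired measurements with an assumed ±2 unit margin; the numbers are placeholders, not the study data.

```python
# Bland-Altman bias/limits of agreement and a TOST equivalence test for paired modalities.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
lcr = rng.normal(130, 5, 20)                 # e.g. an angle measured on radiographs
mri = lcr + rng.normal(0.1, 0.6, 20)         # the same angle measured on MRI

diff = mri - lcr
bias, sd = diff.mean(), diff.std(ddof=1)
loa = (bias - 1.96 * sd, bias + 1.96 * sd)   # Bland-Altman 95% limits of agreement

margin = 2.0                                  # predefined equivalence margin (± 2 deg/mm)
n = len(diff)
se = sd / np.sqrt(n)
t_low = (bias + margin) / se                  # test against H0: bias <= -margin
t_high = (bias - margin) / se                 # test against H0: bias >= +margin
p_tost = max(1 - stats.t.cdf(t_low, n - 1), stats.t.cdf(t_high, n - 1))

print(f"bias={bias:.2f}, limits of agreement={loa[0]:.2f}..{loa[1]:.2f}, TOST p={p_tost:.4f}")
```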
Statistical Analysis of Research Data | Center for Cancer Research
Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Analysis of Research Data (SARD) course will be held on April 5-6, 2018 from 9 a.m.-5 p.m. at the National Institutes of Health's Natcher Conference Center, Balcony C on the Bethesda Campus. SARD is designed to provide an overview on the general principles of statistical analysis of research data. The first day will feature univariate data analysis, including descriptive statistics, probability distributions, one- and two-sample inferential statistics.
Shaikh, Masood Ali
2017-09-01
Assessment of research articles in terms of study designs used, statistical tests applied and the use of statistical analysis programmes help determine research activity profile and trends in the country. In this descriptive study, all original articles published by Journal of Pakistan Medical Association (JPMA) and Journal of the College of Physicians and Surgeons Pakistan (JCPSP), in the year 2015 were reviewed in terms of study designs used, application of statistical tests, and the use of statistical analysis programmes. JPMA and JCPSP published 192 and 128 original articles, respectively, in the year 2015. Results of this study indicate that cross-sectional study design, bivariate inferential statistical analysis entailing comparison between two variables/groups, and use of statistical software programme SPSS to be the most common study design, inferential statistical analysis, and statistical analysis software programmes, respectively. These results echo previously published assessment of these two journals for the year 2014.
A statistical analysis of cervical auscultation signals from adults with unsafe airway protection.
Dudik, Joshua M; Kurosu, Atsuko; Coyle, James L; Sejdić, Ervin
2016-01-22
Aspiration, where food or liquid is allowed to enter the larynx during a swallow, is recognized as the most clinically salient feature of oropharyngeal dysphagia. This event can lead to short-term harm via airway obstruction or more long-term effects such as pneumonia. In order to non-invasively identify this event using high resolution cervical auscultation, there is a need to characterize cervical auscultation signals from subjects with dysphagia who aspirate. In this study, we collected swallowing sound and vibration data from 76 adults (50 men, 26 women, mean age 62) who underwent a routine videofluoroscopy swallowing examination. The analysis was limited to swallows of liquid with either thin (<5 cps) or viscous (≈300 cps) consistency and was divided into those with deep laryngeal penetration or aspiration (unsafe airway protection) and those with either shallow or no laryngeal penetration (safe airway protection), using a standardized scale. After calculating a selection of time, frequency, and time-frequency features for each swallow, the safe and unsafe categories were compared using Wilcoxon rank-sum statistical tests. Our analysis found that few of our chosen features varied in magnitude between safe and unsafe swallows, with thin swallows demonstrating no statistical variation. We also supported our past findings with regard to the effects of sex and the presence or absence of stroke on cervical auscultation signals, but noticed certain discrepancies with regard to bolus viscosity. Overall, our results support the necessity of using multiple statistical features concurrently to identify laryngeal penetration of swallowed boluses in future work with high resolution cervical auscultation.
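A minimal sketch of the feature-by-feature comparison described, using Wilcoxon rank-sum tests, is shown below; the feature names and values are random placeholders rather than the study's signals.

```python
# Compare each extracted feature between 'safe' and 'unsafe' swallows with a rank-sum test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
features = ["duration", "peak_frequency", "wavelet_entropy"]          # hypothetical features
safe = {f: rng.normal(0.0, 1.0, 60) for f in features}                # placeholder values
unsafe = {f: rng.normal(0.3, 1.0, 25) for f in features}

for f in features:
    stat, p = stats.ranksums(safe[f], unsafe[f])
    print(f"{f}: rank-sum statistic={stat:.2f}, p={p:.3f}")
```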
Wang, Fang-Xu; Yuan, Jian-Chao; Kang, Li-Ping; Pang, Xu; Yan, Ren-Yi; Zhao, Yang; Zhang, Jie; Sun, Xin-Guang; Ma, Bai-Ping
2016-09-10
An ultra high-performance liquid chromatography quadrupole time-of-flight tandem mass spectrometry approach coupled with multivariate statistical analysis was established and applied to rapidly distinguish the chemical differences between fibrous root and rhizome of Anemarrhena asphodeloides. The datasets of tR-m/z pairs, ion intensity and sample code were processed by principal component analysis and orthogonal partial least squares discriminant analysis. Chemical markers could be identified based on their exact mass data, fragmentation characteristics, and retention times. And the new compounds among chemical markers could be isolated rapidly guided by the ultra high-performance liquid chromatography quadrupole time-of-flight tandem mass spectrometry and their definitive structures would be further elucidated by NMR spectra. Using this approach, twenty-four markers were identified on line including nine new saponins and five new steroidal saponins of them were obtained in pure form. The study validated this proposed approach as a suitable method for identification of the chemical differences between various medicinal parts in order to expand medicinal parts and increase the utilization rate of resources. Copyright © 2016 Elsevier B.V. All rights reserved.
Fuchs, Tobias A; Fiechter, Michael; Gebhard, Cathérine; Stehli, Julia; Ghadri, Jelena R; Kazakauskaite, Egle; Herzog, Bernhard A; Husmann, Lars; Gaemperli, Oliver; Kaufmann, Philipp A
2013-03-01
To assess the impact of adaptive statistical iterative reconstruction (ASIR) on coronary plaque volume and composition analysis as well as on stenosis quantification in high definition coronary computed tomography angiography (CCTA). We included 50 plaques in 29 consecutive patients who were referred for the assessment of known or suspected coronary artery disease (CAD) with contrast-enhanced CCTA on a 64-slice high definition CT scanner (Discovery HD 750, GE Healthcare). CCTA scans were reconstructed with standard filtered back projection (FBP) with no ASIR (0 %) or with increasing contributions of ASIR, i.e. 20, 40, 60, 80 and 100 % (no FBP). Plaque analysis (volume, components and stenosis degree) was performed using a previously validated automated software. Mean values for minimal diameter and minimal area as well as degree of stenosis did not change significantly using different ASIR reconstructions. There was virtually no impact of reconstruction algorithms on mean plaque volume or plaque composition (e.g. soft, intermediate and calcified component). However, with increasing ASIR contribution, the percentage of plaque volume component between 401 and 500 HU decreased significantly (p < 0.05). Modern image reconstruction algorithms such as ASIR, which has been developed for noise reduction in latest high resolution CCTA scans, can be used reliably without interfering with the plaque analysis and stenosis severity assessment.
Statistical tools for transgene copy number estimation based on real-time PCR.
Yuan, Joshua S; Burris, Jason; Stewart, Nathan R; Mentewab, Ayalew; Stewart, C Neal
2007-11-01
As compared with traditional transgene copy number detection technologies such as Southern blot analysis, real-time PCR provides a fast, inexpensive and high-throughput alternative. However, real-time PCR based transgene copy number estimation tends to be ambiguous and subjective, stemming from the lack of proper statistical analysis and data quality control needed to render a reliable estimate of copy number with a prediction value. Despite recent progress in the statistical analysis of real-time PCR, few publications have integrated these advancements into real-time PCR based transgene copy number determination. Three experimental designs and four data quality control integrated statistical models are presented. For the first method, external calibration curves are established for the transgene based on serially diluted templates. The Ct numbers from a control transgenic event and a putative transgenic event are compared to derive the transgene copy number or zygosity estimation. Simple linear regression and two-group t-test procedures were combined to model the data from this design. For the second experimental design, standard curves were generated for both an internal reference gene and the transgene, and the copy number of the transgene was compared with that of the internal reference gene. Multiple regression models and ANOVA models can be employed to analyze the data and perform quality control for this approach. In the third experimental design, transgene copy number is compared with the reference gene without a standard curve, based directly on fluorescence data. Two different multiple regression models were proposed to analyze the data, based on two different approaches to amplification efficiency integration. Our results highlight the importance of proper statistical treatment and quality control integration in real-time PCR-based transgene copy number determination. These statistical methods allow real-time PCR-based transgene copy number estimation to be more reliable and precise, and proper confidence intervals are necessary for unambiguous prediction of transgene copy number. The four different statistical methods are compared for their advantages and disadvantages. Moreover, the statistical methods can also be applied to other real-time PCR-based quantification assays, including transfection efficiency analysis and pathogen quantification.
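A minimal sketch of the first design is given below: a standard curve fitted by simple linear regression of Ct against log2 template amount, followed by a two-group t-test comparing replicate Ct values of a known single-copy control and a putative event. The Ct values are synthetic and the conversion of the Ct difference to a copy number via the fitted slope is an illustrative simplification.

```python
# Standard-curve regression plus two-group t-test for qPCR copy number estimation.
import numpy as np
from scipy import stats

# Standard curve: Ct drops ~1 cycle per 2-fold increase in template
log2_amount = np.array([0, 1, 2, 3, 4, 5], dtype=float)
ct_curve = 30.0 - 1.0 * log2_amount + np.random.default_rng(0).normal(0, 0.1, 6)
slope, intercept, r, p, se = stats.linregress(log2_amount, ct_curve)
print(f"standard curve: slope={slope:.2f}, R^2={r**2:.3f}")

# Replicate Ct values for a single-copy control event and a putative event
ct_control = np.array([25.1, 25.0, 25.2, 24.9])
ct_event = np.array([24.1, 24.0, 24.2, 23.9])          # ~1 cycle earlier -> ~2 copies
t, p_val = stats.ttest_ind(ct_control, ct_event)
copies = 2 ** ((ct_control.mean() - ct_event.mean()) / -slope)   # convert delta-Ct via slope
print(f"estimated copy number ~ {copies:.1f} (t-test p={p_val:.4f})")
```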
Hamel, Jean-Francois; Saulnier, Patrick; Pe, Madeline; Zikos, Efstathios; Musoro, Jammbe; Coens, Corneel; Bottomley, Andrew
2017-09-01
Over the last decades, Health-related Quality of Life (HRQoL) end-points have become an important outcome of the randomised controlled trials (RCTs). HRQoL methodology in RCTs has improved following international consensus recommendations. However, no international recommendations exist concerning the statistical analysis of such data. The aim of our study was to identify and characterise the quality of the statistical methods commonly used for analysing HRQoL data in cancer RCTs. Building on our recently published systematic review, we analysed a total of 33 published RCTs studying the HRQoL methods reported in RCTs since 1991. We focussed on the ability of the methods to deal with the three major problems commonly encountered when analysing HRQoL data: their multidimensional and longitudinal structure and the commonly high rate of missing data. All studies reported HRQoL being assessed repeatedly over time for a period ranging from 2 to 36 months. Missing data were common, with compliance rates ranging from 45% to 90%. From the 33 studies considered, 12 different statistical methods were identified. Twenty-nine studies analysed each of the questionnaire sub-dimensions without type I error adjustment. Thirteen studies repeated the HRQoL analysis at each assessment time again without type I error adjustment. Only 8 studies used methods suitable for repeated measurements. Our findings show a lack of consistency in statistical methods for analysing HRQoL data. Problems related to multiple comparisons were rarely considered leading to a high risk of false positive results. It is therefore critical that international recommendations for improving such statistical practices are developed. Copyright © 2017. Published by Elsevier Ltd.
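One of the issues flagged above, testing many sub-dimensions without type I error adjustment, has a simple remedy that can be sketched as follows; the p-values are illustrative and the Holm method is just one of several admissible corrections.

```python
# Holm adjustment across multiple HRQoL sub-dimension comparisons.
from statsmodels.stats.multitest import multipletests

p_values = [0.012, 0.030, 0.048, 0.20, 0.004]          # one p-value per sub-dimension (made up)
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="holm")
print(list(zip(p_values, p_adjusted.round(3), reject)))
```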
Reese, Sarah E; Archer, Kellie J; Therneau, Terry M; Atkinson, Elizabeth J; Vachon, Celine M; de Andrade, Mariza; Kocher, Jean-Pierre A; Eckel-Passow, Jeanette E
2013-11-15
Batch effects are due to probe-specific systematic variation between groups of samples (batches) resulting from experimental features that are not of biological interest. Principal component analysis (PCA) is commonly used as a visual tool to determine whether batch effects exist after applying a global normalization method. However, PCA yields linear combinations of the variables that contribute maximum variance and thus will not necessarily detect batch effects if they are not the largest source of variability in the data. We present an extension of PCA to quantify the existence of batch effects, called guided PCA (gPCA). We describe a test statistic that uses gPCA to test whether a batch effect exists. We apply our proposed test statistic derived using gPCA to simulated data and to two copy number variation case studies: the first study consisted of 614 samples from a breast cancer family study using Illumina Human 660 bead-chip arrays, whereas the second case study consisted of 703 samples from a family blood pressure study that used Affymetrix SNP Array 6.0. We demonstrate that our statistic has good statistical properties and is able to identify significant batch effects in two copy number variation case studies. We developed a new statistic that uses gPCA to identify whether batch effects exist in high-throughput genomic data. Although our examples pertain to copy number data, gPCA is general and can be used on other data types as well. The gPCA R package (Available via CRAN) provides functionality and data to perform the methods in this article. reesese@vcu.edu
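The sketch below follows my reading of the guided PCA idea: compare the variance of the data projected onto the first batch-guided loading vector (from the SVD of the batch-indicator matrix times the data) with the variance along the ordinary first principal component, and obtain a p-value by permuting batch labels. It is an illustrative reconstruction, not a transcription of the gPCA R package.

```python
# A guided-PCA style batch-effect statistic with a permutation p-value (illustrative).
import numpy as np

def gpca_delta(Y, batch):
    Yc = Y - Y.mean(axis=0)
    X = np.eye(len(np.unique(batch)))[batch]                     # n x b batch indicator matrix
    _, _, Vt_g = np.linalg.svd(X.T @ Yc, full_matrices=False)    # batch-guided loadings
    _, _, Vt_u = np.linalg.svd(Yc, full_matrices=False)          # unguided (ordinary PCA) loadings
    return np.var(Yc @ Vt_g[0]) / np.var(Yc @ Vt_u[0])

rng = np.random.default_rng(0)
Y = rng.normal(size=(60, 200))
batch = np.repeat([0, 1], 30)
Y[batch == 1] += 0.5                                             # inject a batch shift
observed = gpca_delta(Y, batch)
perm = np.array([gpca_delta(Y, rng.permutation(batch)) for _ in range(500)])
print(f"delta = {observed:.3f}, permutation p = {np.mean(perm >= observed):.3f}")
```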
Vasudevan, Rama K; Tselev, Alexander; Baddorf, Arthur P; Kalinin, Sergei V
2014-10-28
Reflection high energy electron diffraction (RHEED) has by now become a standard tool for in situ monitoring of film growth by pulsed laser deposition and molecular beam epitaxy. Yet despite the widespread adoption and wealth of information in RHEED images, most applications are limited to observing intensity oscillations of the specular spot, and much additional information on growth is discarded. With ease of data acquisition and increased computation speeds, statistical methods to rapidly mine the data set are now feasible. Here, we develop such an approach to the analysis of the fundamental growth processes through multivariate statistical analysis of a RHEED image sequence. This approach is illustrated for growth of La(x)Ca(1-x)MnO(3) films grown on etched (001) SrTiO(3) substrates, but is universal. The multivariate methods including principal component analysis and k-means clustering provide insight into the relevant behaviors, the timing and nature of a disordered to ordered growth change, and highlight statistically significant patterns. Fourier analysis yields the harmonic components of the signal and allows separation of the relevant components and baselines, isolating the asymmetric nature of the step density function and the transmission spots from the imperfect layer-by-layer (LBL) growth. These studies show the promise of big data approaches to obtaining more insight into film properties during and after epitaxial film growth. Furthermore, these studies open the pathway to use forward prediction methods to potentially allow significantly more control over growth process and hence final film quality.
Proposal for a recovery prediction method for patients affected by acute mediastinitis
2012-01-01
Background An attempt to find a method for predicting death risk in patients affected by acute mediastinitis. No such tool is described in the available literature for this serious disease. Methods The study comprised 44 consecutive cases of acute mediastinitis. General anamnesis and biochemical data were included. Factor analysis was used to extract the risk characteristics of the patients. The most valuable results were obtained for 8 parameters, which were selected for further statistical analysis (all collected within a few hours after admission). Three factors reached an Eigenvalue >1. The clinical interpretations of these combined statistical factors are: Factor 1 - proteinic status (serum total protein, albumin, and hemoglobin level), Factor 2 - inflammatory status (white blood cells, CRP, procalcitonin), and Factor 3 - general risk (age, number of coexisting diseases). Threshold values of the prediction factors were estimated by means of statistical analysis (factor analysis, Statgraphics Centurion XVI). Results The final prediction for a patient is constructed as a simultaneous evaluation of all factor scores. A high probability of death should be predicted if the Factor 1 value decreases with a simultaneous increase of Factors 2 and 3. The diagnostic power of the proposed method was high [sensitivity = 90%, specificity = 64%]; for Factor 1 [SNC = 87%, SPC = 79%], for Factor 2 [SNC = 87%, SPC = 50%] and for Factor 3 [SNC = 73%, SPC = 71%]. Conclusion The proposed prediction method seems a useful emergency signal for monitoring acute mediastinitis in affected patients. PMID:22574625
NASA Astrophysics Data System (ADS)
Palozzi, Jason; Pantopoulos, George; Maravelis, Angelos G.; Nordsvan, Adam; Zelilidis, Avraam
2018-02-01
This investigation presents an outcrop-based integrated study of internal division analysis and statistical treatment of turbidite bed thickness applied to a Carboniferous deep-water channel-levee complex in the Myall Trough, southeast Australia. Turbidite beds of the studied succession are characterized by a range of sedimentary structures grouped into two main associations, a thick-bedded and a thin-bedded one, that reflect channel-fill and overbank/levee deposits, respectively. Three vertically stacked channel-levee cycles have been identified. Results of the statistical analysis of bed thickness, grain size and internal division patterns applied to the studied channel-levee succession indicate that the turbidite bed thickness data are well characterized by a bimodal lognormal distribution, which possibly reflects the difference between deposition from lower-density flows (in a levee/overbank setting) and very high-density flows (in a channel-fill setting). Power law and exponential distributions were observed to hold only for the thick-bedded parts of the succession and cannot characterize the whole bed thickness range of the studied sediments. The succession also exhibits non-random clustering of bed thickness and grain-size measurements. The studied sediments are further characterized by the presence of statistically detected fining-upward sandstone packets, and a novel quantitative approach (change-point analysis) is proposed for the detection of those packets. Markov permutation statistics also revealed order in the alternation of internal divisions in the succession, expressed by an optimal internal division cycle reflecting two main types of gravity flow events deposited within both the thick-bedded conglomeratic and thin-bedded sandstone associations. The analytical methods presented in this study can be used as additional tools for quantitative analysis and recognition of depositional environments in hydrocarbon-bearing research of ancient deep-water channel-levee settings.
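Two of the treatments mentioned above can be sketched on synthetic thicknesses: fitting a lognormal distribution to bed thickness and locating a single change-point in mean log-thickness by scanning all split positions. The data are made up and the single-split scan is an illustrative stand-in for a full change-point analysis.

```python
# Lognormal fit of bed thickness and a simple single change-point scan on log-thickness.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# A fining-upward packet: thick beds followed by thinner beds (metres)
thickness = np.concatenate([rng.lognormal(0.5, 0.4, 40), rng.lognormal(-0.5, 0.4, 60)])

shape, loc, scale = stats.lognorm.fit(thickness, floc=0)
print(f"lognormal fit: sigma={shape:.2f}, median={scale:.2f} m")

logt = np.log(thickness)

def split_cost(k):
    """Total within-segment sum of squares if the series is split before index k."""
    a, b = logt[:k], logt[k:]
    return ((a - a.mean()) ** 2).sum() + ((b - b.mean()) ** 2).sum()

best = min(range(5, len(logt) - 5), key=split_cost)
print("estimated change-point at bed #", best)
```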
Zhu, Yuerong; Zhu, Yuelin; Xu, Wei
2008-01-01
Background Though microarray experiments are very popular in life science research, managing and analyzing microarray data are still challenging tasks for many biologists. Most microarray programs require users to have sophisticated knowledge of mathematics, statistics and computer skills for usage. With accumulating microarray data deposited in public databases, easy-to-use programs to re-analyze previously published microarray data are in high demand. Results EzArray is a web-based Affymetrix expression array data management and analysis system for researchers who need to organize microarray data efficiently and get data analyzed instantly. EzArray organizes microarray data into projects that can be analyzed online with predefined or custom procedures. EzArray performs data preprocessing and detection of differentially expressed genes with statistical methods. All analysis procedures are optimized and highly automated so that even novice users with limited pre-knowledge of microarray data analysis can complete initial analysis quickly. Since all input files, analysis parameters, and executed scripts can be downloaded, EzArray provides maximum reproducibility for each analysis. In addition, EzArray integrates with Gene Expression Omnibus (GEO) and allows instantaneous re-analysis of published array data. Conclusion EzArray is a novel Affymetrix expression array data analysis and sharing system. EzArray provides easy-to-use tools for re-analyzing published microarray data and will help both novice and experienced users perform initial analysis of their microarray data from the location of data storage. We believe EzArray will be a useful system for facilities with microarray services and laboratories with multiple members involved in microarray data analysis. EzArray is freely available from . PMID:18218103
ERIC Educational Resources Information Center
Hatami, Gissou; Motamed, Niloofar; Ashrafzadeh, Mahshid
2010-01-01
Validity and reliability of Persian adaptation of MSLSS in the 12-18 years, middle and high school students (430 students in grades 6-12 in Bushehr port, Iran) using confirmatory factor analysis by means of LISREL statistical package were checked. Internal consistency reliability estimates (Cronbach's coefficient [alpha]) were all above the…
Number of Black Children in Extreme Poverty Hits Record High. Analysis Background.
ERIC Educational Resources Information Center
Children's Defense Fund, Washington, DC.
To examine the experiences of black children and poverty, researchers conducted a computer analysis of data from the U.S. Census Bureau's Current Population Survey, the source of official government poverty statistics. The data are through 2001. Results indicated that nearly 1 million black children were living in extreme poverty, with after-tax…
Performance Reports: Mirror alignment system performance prediction comparison between SAO and EKC
NASA Technical Reports Server (NTRS)
Tananbaum, H. D.; Zhang, J. P.
1994-01-01
The objective of this study is to perform an independent analysis of the residual high resolution mirror assembly (HRMA) mirror distortions caused by force and moment errors in the mirror alignment system (MAS) to statistically predict the HRMA performance. These performance predictions are then compared with those performed by Kodak to verify their analysis results.
The Essential Genome of Escherichia coli K-12.
Goodall, Emily C A; Robinson, Ashley; Johnston, Iain G; Jabbari, Sara; Turner, Keith A; Cunningham, Adam F; Lund, Peter A; Cole, Jeffrey A; Henderson, Ian R
2018-02-20
Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. IMPORTANCE: Incentives to define lists of genes that are essential for bacterial survival include the identification of potential targets for antibacterial drug development, genes required for rapid growth for exploitation in biotechnology, and discovery of new biochemical pathways. To identify essential genes in Escherichia coli, we constructed a transposon mutant library of unprecedented density. Initial automated analysis of the resulting data revealed many discrepancies compared to the literature. We now report more extensive statistical analysis supported by both literature searches and detailed inspection of high-density TraDIS sequencing data for each putative essential gene for the E. coli model laboratory organism. This paper is important because it provides a better understanding of the essential genes of E. coli, reveals the limitations of relying on automated analysis alone, and provides a new standard for the analysis of TraDIS data. Copyright © 2018 Goodall et al.
Turgeon, Ricky D; Wilby, Kyle J; Ensom, Mary H H
2015-06-01
We conducted a systematic review with meta-analysis to evaluate the efficacy of antiviral agents on complete recovery of Bell's palsy. We searched CENTRAL, Embase, MEDLINE, International Pharmaceutical Abstracts, and sources of unpublished literature to November 1, 2014. Primary and secondary outcomes were complete and satisfactory recovery, respectively. To evaluate statistical heterogeneity, we performed subgroup analysis of baseline severity of Bell's palsy and between-study sensitivity analyses based on risk of allocation and detection bias. The 10 included randomized controlled trials (2419 patients; 807 with severe Bell's palsy at onset) had variable risk of bias, with 9 trials having a high risk of bias in at least 1 domain. Complete recovery was not statistically significantly greater with antiviral use versus no antiviral use in the random-effects meta-analysis of 6 trials (relative risk, 1.06; 95% confidence interval, 0.97-1.16; I(2) = 65%). Conversely, random-effects meta-analysis of 9 trials showed a statistically significant difference in satisfactory recovery (relative risk, 1.10; 95% confidence interval, 1.02-1.18; I(2) = 63%). Response to antiviral agents did not differ visually or statistically between patients with severe symptoms at baseline and those with milder disease (test for interaction, P = .11). Sensitivity analyses did not show a clear effect of bias on outcomes. Antiviral agents are not efficacious in increasing the proportion of patients with Bell's palsy who achieved complete recovery, regardless of baseline symptom severity. Copyright © 2015 Elsevier Inc. All rights reserved.
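The pooled relative risks and I² values reported in such meta-analyses come from calculations like the DerSimonian-Laird random-effects pooling sketched below; the trial counts are made up for illustration.

```python
# DerSimonian-Laird random-effects pooling of log relative risks.
import numpy as np

# events / totals in antiviral and control arms for a few hypothetical trials
a_events, a_total = np.array([80, 45, 60]), np.array([100, 60, 90])
c_events, c_total = np.array([75, 44, 55]), np.array([100, 60, 90])

log_rr = np.log((a_events / a_total) / (c_events / c_total))
var = 1/a_events - 1/a_total + 1/c_events - 1/c_total           # variance of each log RR

w = 1 / var                                                     # fixed-effect weights
q = np.sum(w * (log_rr - np.sum(w * log_rr) / np.sum(w)) ** 2)  # Cochran's Q
df = len(log_rr) - 1
tau2 = max(0.0, (q - df) / (np.sum(w) - np.sum(w**2) / np.sum(w)))   # DL between-trial variance
w_re = 1 / (var + tau2)                                         # random-effects weights

pooled = np.sum(w_re * log_rr) / np.sum(w_re)
se = np.sqrt(1 / np.sum(w_re))
ci = np.exp([pooled - 1.96 * se, pooled + 1.96 * se])
i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
print(f"RR = {np.exp(pooled):.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f}), I^2 = {i2:.0f}%")
```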
Open Source Tools for Seismicity Analysis
NASA Astrophysics Data System (ADS)
Powers, P.
2010-12-01
The spatio-temporal analysis of seismicity plays an important role in earthquake forecasting and is integral to research on earthquake interactions and triggering. For instance, the third version of the Uniform California Earthquake Rupture Forecast (UCERF), currently under development, will use Epidemic Type Aftershock Sequences (ETAS) as a model for earthquake triggering. UCERF will be a "living" model and therefore requires robust, tested, and well-documented ETAS algorithms to ensure transparency and reproducibility. Likewise, as earthquake aftershock sequences unfold, real-time access to high quality hypocenter data makes it possible to monitor the temporal variability of statistical properties such as the parameters of the Omori Law and the Gutenberg Richter b-value. Such statistical properties are valuable as they provide a measure of how much a particular sequence deviates from expected behavior and can be used when assigning probabilities of aftershock occurrence. To address these demands and provide public access to standard methods employed in statistical seismology, we present well-documented, open-source JavaScript and Java software libraries for the on- and off-line analysis of seismicity. The Javascript classes facilitate web-based asynchronous access to earthquake catalog data and provide a framework for in-browser display, analysis, and manipulation of catalog statistics; implementations of this framework will be made available on the USGS Earthquake Hazards website. The Java classes, in addition to providing tools for seismicity analysis, provide tools for modeling seismicity and generating synthetic catalogs. These tools are extensible and will be released as part of the open-source OpenSHA Commons library.
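One of the statistical properties mentioned, the Gutenberg-Richter b-value, has a standard maximum-likelihood estimator (Aki's formula) that can be sketched in a few lines; the catalog below is synthetic and the completeness magnitude is assumed.

```python
# Maximum-likelihood Gutenberg-Richter b-value for events above a completeness magnitude Mc.
import numpy as np

def b_value(magnitudes, mc, dm=0.0):
    """Aki's estimator; set dm to the magnitude bin width for binned catalogs (Utsu correction)."""
    m = np.asarray(magnitudes)
    m = m[m >= mc]
    return np.log10(np.e) / (m.mean() - (mc - dm / 2.0))

rng = np.random.default_rng(0)
mags = 2.0 + rng.exponential(scale=1.0 / np.log(10), size=2000)   # b ~ 1 by construction
print("estimated b-value:", round(b_value(mags, mc=2.0), 2))
```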
Image encryption based on a delayed fractional-order chaotic logistic system
NASA Astrophysics Data System (ADS)
Wang, Zhen; Huang, Xia; Li, Ning; Song, Xiao-Na
2012-05-01
A new image encryption scheme is proposed based on a delayed fractional-order chaotic logistic system. In the process of generating a key stream, the time-varying delay and fractional derivative are embedded in the proposed scheme to improve the security. Such a scheme is described in detail with security analyses including correlation analysis, information entropy analysis, run statistic analysis, mean-variance gray value analysis, and key sensitivity analysis. Experimental results show that the newly proposed image encryption scheme possesses high security.
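The key-stream idea can be sketched as follows: a chaotic logistic map generates a pseudo-random byte sequence that is XORed with the image pixels. For brevity this uses the plain logistic map, not the delayed fractional-order variant proposed in the paper, and the parameters are illustrative.

```python
# XOR image encryption with a logistic-map key stream (plain map, illustrative only).
import numpy as np

def logistic_keystream(n, x0=0.3456, r=3.99, burn_in=1000):
    """Generate n key bytes from the logistic map x <- r*x*(1-x)."""
    x = x0
    for _ in range(burn_in):            # discard the transient
        x = r * x * (1 - x)
    out = np.empty(n, dtype=np.uint8)
    for i in range(n):
        x = r * x * (1 - x)
        out[i] = int(x * 256) % 256
    return out

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
key = logistic_keystream(image.size).reshape(image.shape)
cipher = image ^ key                     # encryption
recovered = cipher ^ key                 # decryption with the same key stream
print("lossless recovery:", np.array_equal(recovered, image))
print("cipher/plain correlation:", round(np.corrcoef(image.ravel(), cipher.ravel())[0, 1], 4))
```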
Investigating spousal concordance of diabetes through statistical analysis and data mining.
Wang, Jong-Yi; Liu, Chiu-Shong; Lung, Chi-Hsuan; Yang, Ya-Tun; Lin, Ming-Hung
2017-01-01
Spousal clustering of diabetes merits attention. Whether old-age vulnerability or a shared family environment determines the concordance of diabetes is also uncertain. This study investigated the spousal concordance of diabetes and compared the risk of diabetes concordance between couples and noncouples by using nationally representative data. A total of 22,572 individuals identified from the 2002-2013 National Health Insurance Research Database of Taiwan constituted 5,643 couples and 5,643 noncouples through 1:1 dual propensity score matching (PSM). Factors associated with concordance in both spouses with diabetes were analyzed at the individual level. The risk of diabetes concordance between couples and noncouples was compared at the couple level. Logistic regression was the main statistical method. Statistical data were analyzed using SAS 9.4. C&RT and Apriori of data mining conducted in IBM SPSS Modeler 13 served as a supplement to statistics. High odds of the spousal concordance of diabetes were associated with old age, middle levels of urbanization, and high comorbidities (all P < 0.05). The dual PSM analysis revealed that the risk of diabetes concordance was significantly higher in couples (5.19%) than in noncouples (0.09%; OR = 61.743, P < 0.0001). A high concordance rate of diabetes in couples may indicate the influences of assortative mating and shared environment. Diabetes in a spouse implicates its risk in the partner. Family-based diabetes care that emphasizes the screening of couples at risk of diabetes by using the identified risk factors is suggested in prospective clinical practice interventions.
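A minimal sketch of two core steps described, a propensity model for 'being part of a couple' followed by 1:1 nearest-neighbour matching on the propensity score, is given below using scikit-learn on simulated records. The covariates and the greedy matcher are illustrative assumptions, not the SAS 9.4 procedure used in the study.

```python
# Propensity score estimation and greedy 1:1 nearest-neighbour matching (illustrative).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
age = rng.normal(55, 10, n)
comorbid = rng.poisson(1.5, n)
couple = rng.binomial(1, 1 / (1 + np.exp(-(0.03 * (age - 55) + 0.2 * (comorbid - 1.5)))))

X = np.column_stack([age, comorbid])
ps = LogisticRegression().fit(X, couple).predict_proba(X)[:, 1]   # propensity scores

treated = np.where(couple == 1)[0]
controls = np.where(couple == 0)[0]
used, pairs = set(), []
for t in treated:                                  # greedy 1:1 nearest-neighbour matching
    j = min((c for c in controls if c not in used),
            key=lambda c: abs(ps[c] - ps[t]), default=None)
    if j is not None:
        used.add(j)
        pairs.append((t, j))
print(f"matched {len(pairs)} couple/non-couple pairs on propensity score")
```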
Investigating spousal concordance of diabetes through statistical analysis and data mining
Liu, Chiu-Shong; Lung, Chi-Hsuan; Yang, Ya-Tun; Lin, Ming-Hung
2017-01-01
Objective Spousal clustering of diabetes merits attention. Whether old-age vulnerability or a shared family environment determines the concordance of diabetes is also uncertain. This study investigated the spousal concordance of diabetes and compared the risk of diabetes concordance between couples and noncouples by using nationally representative data. Methods A total of 22,572 individuals identified from the 2002–2013 National Health Insurance Research Database of Taiwan constituted 5,643 couples and 5,643 noncouples through 1:1 dual propensity score matching (PSM). Factors associated with concordance in both spouses with diabetes were analyzed at the individual level. The risk of diabetes concordance between couples and noncouples was compared at the couple level. Logistic regression was the main statistical method. Statistical data were analyzed using SAS 9.4. The C&RT and Apriori data-mining algorithms, run in IBM SPSS Modeler 13, served as a supplement to the statistical analysis. Results High odds of the spousal concordance of diabetes were associated with old age, middle levels of urbanization, and high comorbidities (all P < 0.05). The dual PSM analysis revealed that the risk of diabetes concordance was significantly higher in couples (5.19%) than in noncouples (0.09%; OR = 61.743, P < 0.0001). Conclusions A high concordance rate of diabetes in couples may indicate the influences of assortative mating and shared environment. Diabetes in a spouse implicates its risk in the partner. Family-based diabetes care that emphasizes the screening of couples at risk of diabetes by using the identified risk factors is suggested in prospective clinical practice interventions. PMID:28817654
2011-01-01
Background Geographic Information Systems (GIS) combined with spatial analytical methods could be helpful in examining patterns of drug use. Little attention has been paid to geographic variation of cardiovascular prescription use in Taiwan. The main objective was to use local spatial association statistics to test whether or not the cardiovascular medication-prescribing pattern is homogenous across 352 townships in Taiwan. Methods The statistical methods used were the global measures of Moran's I and Local Indicators of Spatial Association (LISA). While Moran's I provides information on the overall spatial distribution of the data, LISA provides information on types of spatial association at the local level. LISA statistics can also be used to identify influential locations in spatial association analysis. The major classes of prescription cardiovascular drugs were taken from Taiwan's National Health Insurance Research Database (NHIRD), which has a coverage rate of over 97%. The dosage of each prescription was converted into defined daily doses to measure the consumption of each class of drugs. Data were analyzed with ArcGIS and GeoDa at the township level. Results The LISA statistics showed an unusual use of cardiovascular medications in the southern townships with high local variation. Patterns of drug use also showed more low-low spatial clusters (cold spots) than high-high spatial clusters (hot spots), and those low-low associations were clustered in the rural areas. Conclusions The cardiovascular drug prescribing patterns were heterogeneous across Taiwan. In particular, a clear pattern of north-south disparity exists. Such spatial clustering helps prioritize the target areas that require better education concerning drug use. PMID:21609462
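The global and local Moran statistics used in the study are short formulas that are easy to compute directly; the numpy sketch below is a minimal, self-contained version using a hypothetical contiguity weights matrix for five townships. A real analysis would derive the weights from GIS boundaries, as done with ArcGIS/GeoDa in the paper.

```python
import numpy as np

def morans_i(y, w):
    """Global Moran's I: y holds one value per areal unit, w is a spatial weights
    matrix with w[i, j] > 0 for neighbors and a zero diagonal."""
    z = np.asarray(y, dtype=float) - np.mean(y)
    return (len(z) / w.sum()) * (z @ w @ z) / (z @ z)

def local_morans(y, w):
    """Local Moran (LISA) values, one per areal unit."""
    z = np.asarray(y, dtype=float) - np.mean(y)
    m2 = (z @ z) / len(z)
    return (z / m2) * (w @ z)

# Hypothetical example: five townships along a line, rook contiguity weights.
w = np.array([[0, 1, 0, 0, 0],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
ddd = np.array([10.0, 11.0, 12.0, 30.0, 31.0])   # defined daily doses per capita
print(morans_i(ddd, w), local_morans(ddd, w))
```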
Hypothesis testing for band size detection of high-dimensional banded precision matrices.
An, Baiguo; Guo, Jianhua; Liu, Yufeng
2014-06-01
Many statistical analysis procedures require a good estimator for a high-dimensional covariance matrix or its inverse, the precision matrix. When the precision matrix is banded, the Cholesky-based method often yields a good estimator of the precision matrix. One important aspect of this method is determination of the band size of the precision matrix. In practice, crossvalidation is commonly used; however, we show that crossvalidation not only is computationally intensive but can be very unstable. In this paper, we propose a new hypothesis testing procedure to determine the band size in high dimensions. Our proposed test statistic is shown to be asymptotically normal under the null hypothesis, and its theoretical power is studied. Numerical examples demonstrate the effectiveness of our testing procedure.
NASA Astrophysics Data System (ADS)
Jiao, Yi; Duan, Zhe
2017-01-01
In a diffraction-limited storage ring, half integer resonances can have strong effects on the beam dynamics, associated with the large detuning terms from the strong focusing and strong sextupoles as required for an ultralow emittance. In this study, the limitation of half integer resonances on the available momentum acceptance (MA) was statistically analyzed based on one design of the High Energy Photon Source (HEPS). It was found that the probability of MA reduction due to crossing of half integer resonances is closely correlated with the level of beta beats at the nominal tunes, but independent of the error sources. The analysis indicated that for the presented HEPS lattice design, the rms amplitude of beta beats should be kept below 1.5% horizontally and 2.5% vertically to reach a small MA reduction probability of about 1%.
Multivariate meta-analysis: a robust approach based on the theory of U-statistic.
Ma, Yan; Mazumdar, Madhu
2011-10-30
Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for a meta-analysis with a small number of component studies. The use of REML also requires iterative estimation between parameters, which requires moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform as well as REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistics and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect of non-normal data distributions on the REML estimates is marginal and that the estimates from the MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. The easy implementation of all three methods is illustrated by their application to data from two published meta-analyses in the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistics for testing the significance of between-study heterogeneity and for extending the work to the meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.
The statistical analysis of circadian phase and amplitude in constant-routine core-temperature data
NASA Technical Reports Server (NTRS)
Brown, E. N.; Czeisler, C. A.
1992-01-01
Accurate estimation of the phases and amplitude of the endogenous circadian pacemaker from constant-routine core-temperature series is crucial for making inferences about the properties of the human biological clock from data collected under this protocol. This paper presents a set of statistical methods based on a harmonic-regression-plus-correlated-noise model for estimating the phases and the amplitude of the endogenous circadian pacemaker from constant-routine core-temperature data. The methods include a Bayesian Monte Carlo procedure for computing the uncertainty in these circadian functions. We illustrate the techniques with a detailed study of a single subject's core-temperature series and describe their relationship to other statistical methods for circadian data analysis. In our laboratory, these methods have been successfully used to analyze more than 300 constant routines and provide a highly reliable means of extracting phase and amplitude information from core-temperature data.
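The phase and amplitude extraction at the heart of such a harmonic-regression model can be illustrated with an ordinary least-squares fit at a fixed circadian period; the Python sketch below does exactly that on a synthetic temperature series. It deliberately ignores the correlated-noise component and the Bayesian Monte Carlo uncertainty computation described in the paper, and the 24.2-hour period and data are assumptions.

```python
import numpy as np

def fit_harmonic(t_hours, temp, period=24.2):
    """Fit temp ~ mu + a*cos(w t) + b*sin(w t) by least squares and return the
    fitted amplitude and acrophase (time of the fitted maximum)."""
    w = 2.0 * np.pi / period
    X = np.column_stack([np.ones_like(t_hours),
                         np.cos(w * t_hours),
                         np.sin(w * t_hours)])
    mu, a, b = np.linalg.lstsq(X, temp, rcond=None)[0]
    amplitude = np.hypot(a, b)
    acrophase = (np.arctan2(b, a) / w) % period   # hours after t = 0
    return mu, amplitude, acrophase

# Synthetic constant-routine series: 40 hours of data sampled every 6 minutes.
t = np.arange(0, 40, 0.1)
rng = np.random.default_rng(2)
temp = 37.0 + 0.4 * np.cos(2 * np.pi / 24.2 * (t - 5.0)) + rng.normal(0, 0.05, t.size)
print(fit_harmonic(t, temp))   # amplitude ~0.4 degC, acrophase ~5 h
```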
A note about high blood pressure in childhood
NASA Astrophysics Data System (ADS)
Teodoro, M. Filomena; Simão, Carla
2017-06-01
In the medical, behavioral, and social sciences it is common for outcomes to be binary. In the present work, information was collected in which some of the outcomes are binary variables (1 = 'yes' / 0 = 'no'). In [14], a preliminary study of caregivers' perception of pediatric hypertension was introduced. An experimental questionnaire was designed to be answered by the caregivers of routine pediatric consultation attendees at Santa Maria Hospital (HSM). The collected data were statistically analyzed: a descriptive analysis and a predictive model were performed, and significant relations between some socio-demographic variables and the assessed knowledge were obtained. The analysis in [14] used only part of the questionnaire data. The present article completes that statistical approach by estimating models for the relevant remaining questionnaire items using Generalized Linear Models (GLM). To explore the binary-outcome issue further, we intend to extend this approach using Generalized Linear Mixed Models (GLMM); that work is still ongoing.
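For readers who want to see what such a GLM fit looks like in practice, here is a minimal, hypothetical sketch using statsmodels: a binary questionnaire item is modeled with a binomial (logistic-link) GLM against two made-up socio-demographic covariates. The GLMM extension mentioned at the end of the abstract is not shown.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical caregiver data: one binary knowledge item vs. socio-demographics.
rng = np.random.default_rng(3)
n = 300
df = pd.DataFrame({
    "age": rng.normal(38, 8, n),
    "higher_education": rng.integers(0, 2, n),
})
logit = -4.0 + 0.08 * df["age"] + 0.9 * df["higher_education"]
df["knows_htn_risk"] = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

X = sm.add_constant(df[["age", "higher_education"]])
model = sm.GLM(df["knows_htn_risk"], X, family=sm.families.Binomial()).fit()
print(model.summary())
print("odds ratios:", np.exp(model.params))
```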
Shrout, Patrick E; Rodgers, Joseph L
2018-01-04
Psychology advances knowledge by testing statistical hypotheses using empirical observations and data. The expectation is that most statistically significant findings can be replicated in new data and in new laboratories, but in practice many findings have replicated less often than expected, leading to claims of a replication crisis. We review recent methodological literature on questionable research practices, meta-analysis, and power analysis to explain the apparently high rates of failure to replicate. Psychologists can improve research practices to advance knowledge in ways that improve replicability. We recommend that researchers adopt open science conventions of preregistration and full disclosure and that replication efforts be based on multiple studies rather than on a single replication attempt. We call for more sophisticated power analyses, careful consideration of the various influences on effect sizes, and more complete disclosure of nonsignificant as well as statistically significant findings.
Analysis of Variance: What Is Your Statistical Software Actually Doing?
ERIC Educational Resources Information Center
Li, Jian; Lomax, Richard G.
2011-01-01
Users assume statistical software packages produce accurate results. In this article, the authors systematically examined Statistical Package for the Social Sciences (SPSS) and Statistical Analysis System (SAS) for 3 analysis of variance (ANOVA) designs: mixed-effects ANOVA, fixed-effects analysis of covariance (ANCOVA), and nested ANOVA. For each…
Equilibrium statistical-thermal models in high-energy physics
NASA Astrophysics Data System (ADS)
Tawfik, Abdel Nasser
2014-05-01
We review some recent highlights from the applications of statistical-thermal models to different experimental measurements and lattice QCD thermodynamics that have been made during the last decade. We start with a short review of the historical milestones on the path of constructing statistical-thermal models for heavy-ion physics. We found that, in 1948, Heinz Koppe formulated an almost complete recipe for the statistical-thermal models. In 1950, Enrico Fermi generalized this statistical approach: he started with a general cross-section formula and inserted into it simplifying assumptions about the matrix element of the interaction process that likely reflect many features of high-energy reactions dominated by the density in the phase space of final states. In 1964, Hagedorn systematically analyzed the high-energy phenomena using all tools of statistical physics and introduced the concept of limiting temperature based on the statistical bootstrap model. It turns out that many-particle systems can quite often be studied with the help of statistical-thermal methods. The analysis of yield multiplicities in high-energy collisions gives overwhelming evidence for chemical equilibrium in the final state. The strange particles might be an exception, as they are suppressed at lower beam energies; however, their relative yields fulfill statistical equilibrium as well. We review the equilibrium statistical-thermal models for particle production, fluctuations and collective flow in heavy-ion experiments. We also review their reproduction of the lattice QCD thermodynamics at vanishing and finite chemical potential. During the last decade, five conditions have been suggested to describe the universal behavior of the chemical freeze-out parameters. The higher-order moments of multiplicity have also been discussed; they offer deep insights into particle production and critical fluctuations. Therefore, we use them to describe the freeze-out parameters and suggest the location of the QCD critical endpoint. Various extensions have been proposed in order to take into consideration possible deviations from the ideal hadron gas. We highlight various types of interactions, dissipative properties and location dependences (spatial rapidity). Furthermore, we review three models combining hadronic with partonic phases: the quasi-particle model, the linear sigma model with Polyakov potentials and the compressible bag model.
Vibroacoustic optimization using a statistical energy analysis model
NASA Astrophysics Data System (ADS)
Culla, Antonio; D`Ambrogio, Walter; Fregolent, Annalisa; Milana, Silvia
2016-08-01
In this paper, an optimization technique for medium-high frequency dynamic problems based on the Statistical Energy Analysis (SEA) method is presented. Using a SEA model, the subsystem energies are controlled by internal loss factors (ILF) and coupling loss factors (CLF), which in turn depend on the physical parameters of the subsystems. A preliminary sensitivity analysis of the subsystem energies to the CLFs is performed to select the CLFs that most strongly affect the subsystem energies. Since the injected power depends not only on the external loads but on the physical parameters of the subsystems as well, it must be taken into account under certain conditions. This is accomplished in the optimization procedure, where approximate relationships between CLFs, injected power and physical parameters are derived. The approach is applied to a typical aeronautical structure: the cabin of a helicopter.
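The backbone of such an SEA model is the steady-state power balance, in which the injected powers and the subsystem energies are linked through the internal and coupling loss factors. The sketch below solves that balance for a hypothetical two-subsystem case; all loss factors, the frequency, and the injected power are made-up values, and the optimization and sensitivity steps of the paper are not reproduced.

```python
import numpy as np

def sea_energies(omega, eta_int, eta_cpl, p_in):
    """Solve the steady-state SEA power balance P = omega * L * E for the subsystem
    energies E. eta_int[i] are internal loss factors; eta_cpl[i][j] are coupling
    loss factors from subsystem i to j (zero diagonal)."""
    eta_int = np.asarray(eta_int, dtype=float)
    eta_cpl = np.asarray(eta_cpl, dtype=float)
    n = len(eta_int)
    L = np.zeros((n, n))
    for i in range(n):
        L[i, i] = eta_int[i] + eta_cpl[i].sum()      # power lost by subsystem i
        for j in range(n):
            if j != i:
                L[i, j] = -eta_cpl[j, i]             # power received from subsystem j
    return np.linalg.solve(omega * L, np.asarray(p_in, dtype=float))

# Hypothetical plate/cavity pair at a 1 kHz band center.
omega = 2 * np.pi * 1000.0
E = sea_energies(omega,
                 eta_int=[1e-2, 5e-3],
                 eta_cpl=[[0.0, 2e-3], [1e-3, 0.0]],
                 p_in=[1.0, 0.0])                    # 1 W injected into subsystem 1
print(E)   # subsystem energies in joules
```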
Machine learning for neuroimaging with scikit-learn.
Abraham, Alexandre; Pedregosa, Fabian; Eickenberg, Michael; Gervais, Philippe; Mueller, Andreas; Kossaifi, Jean; Gramfort, Alexandre; Thirion, Bertrand; Varoquaux, Gaël
2014-01-01
Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g., multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g., resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain.
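As a minimal, hypothetical example of the "decoding" use case described above, the following scikit-learn sketch cross-validates a linear SVM that predicts a binary label from synthetic high-dimensional data standing in for voxel features; a real neuroimaging pipeline would add masking, confound removal, and dimensionality reduction.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for voxel data: 100 "subjects" x 2000 "voxels".
rng = np.random.default_rng(4)
X = rng.normal(size=(100, 2000))
y = rng.integers(0, 2, 100)                # behavioral / clinical label
X[y == 1, :50] += 0.5                      # weak signal in the first 50 features

decoder = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=5000))
scores = cross_val_score(decoder, X, y, cv=5)
print("decoding accuracy: %.2f +/- %.2f" % (scores.mean(), scores.std()))
```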
Machine learning for neuroimaging with scikit-learn
Abraham, Alexandre; Pedregosa, Fabian; Eickenberg, Michael; Gervais, Philippe; Mueller, Andreas; Kossaifi, Jean; Gramfort, Alexandre; Thirion, Bertrand; Varoquaux, Gaël
2014-01-01
Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g., multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learning can uncover hidden structures in sets of images (e.g., resting state functional MRI) or find sub-populations in large cohorts. By considering different functional neuroimaging applications, we illustrate how scikit-learn, a Python machine learning library, can be used to perform some key analysis steps. Scikit-learn contains a very large set of statistical learning algorithms, both supervised and unsupervised, and its application to neuroimaging data provides a versatile tool to study the brain. PMID:24600388
Analysis of counting data: Development of the SATLAS Python package
NASA Astrophysics Data System (ADS)
Gins, W.; de Groote, R. P.; Bissell, M. L.; Granados Buitrago, C.; Ferrer, R.; Lynch, K. M.; Neyens, G.; Sels, S.
2018-01-01
For the analysis of low-statistics counting experiments, a traditional nonlinear least squares minimization routine may not always provide correct parameter and uncertainty estimates due to the assumptions inherent in the algorithm(s). In response to this, a user-friendly Python package (SATLAS) was written to provide an easy interface between the data and a variety of minimization algorithms which are suited for analyzing low- as well as high-statistics data. The advantage of this package is that it allows the user to define their own model function and then compare different minimization routines to determine the optimal parameter values and their respective (correlated) errors. Experimental validation of the different approaches in the package is done through analysis of hyperfine structure data of 203Fr gathered by the CRIS experiment at ISOLDE, CERN.
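The motivation for such a package can be illustrated with a small, self-contained comparison (not SATLAS itself): fitting the same peak model to low-count data by ordinary least squares and by Poisson maximum likelihood, using scipy. The model, starting values, and data below are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def peak(x, amp, mu, sigma, bg):
    """Gaussian peak on a flat background."""
    return bg + amp * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

def fit(x, counts, p0, poisson=True):
    """Fit the peak model by Poisson maximum likelihood or Gaussian least squares."""
    def cost(p):
        m = np.clip(peak(x, *p), 1e-12, None)
        if poisson:
            return np.sum(m - counts * np.log(m))   # Poisson negative log-likelihood
        return np.sum((counts - m) ** 2)            # ordinary least squares
    return minimize(cost, p0, method="Nelder-Mead").x

# Low-statistics synthetic spectrum (a few counts per channel).
rng = np.random.default_rng(5)
x = np.linspace(-10, 10, 81)
truth = (6.0, 1.5, 2.0, 0.5)                        # amplitude, centroid, width, background
counts = rng.poisson(peak(x, *truth))

p0 = (4.0, 0.0, 3.0, 1.0)
print("least squares:", fit(x, counts, p0, poisson=False))
print("Poisson MLE:  ", fit(x, counts, p0, poisson=True))
```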
Do climate extreme events foster violent civil conflicts? A coincidence analysis
NASA Astrophysics Data System (ADS)
Schleussner, Carl-Friedrich; Donges, Jonathan F.; Donner, Reik V.
2014-05-01
Civil conflicts promoted by adverse environmental conditions represent one of the most important potential feedbacks in the global socio-environmental nexus. While the role of climate extremes as a triggering factor is often discussed, no consensus has yet been reached about the cause-and-effect relation in the observed data record. Here we present results of a rigorous statistical coincidence analysis based on the Munich Re Inc. extreme events database and the Uppsala conflict data program. We report evidence for statistically significant synchronicity between climate extremes with high economic impact and violent conflicts for various regions, although no coherent global signal emerges from our analysis. Our results indicate the importance of regional vulnerability and might aid in identifying hot-spot regions for potential climate-triggered violent social conflicts.
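A bare-bones version of such a coincidence analysis can be sketched as a precursor coincidence rate with a permutation null, as below; this is an illustrative simplification rather than the authors' exact procedure, and the event times, window, and study period are synthetic assumptions.

```python
import numpy as np

def precursor_rate(conflicts, extremes, delta_t):
    """Fraction of conflict onsets preceded by at least one extreme within delta_t."""
    conflicts = np.asarray(conflicts, dtype=float)
    extremes = np.asarray(extremes, dtype=float)
    hits = [np.any((extremes <= c) & (extremes > c - delta_t)) for c in conflicts]
    return np.mean(hits)

def coincidence_pvalue(conflicts, extremes, delta_t, t_span, n_perm=2000, seed=0):
    """Permutation test: redraw extreme-event times uniformly over the study period."""
    rng = np.random.default_rng(seed)
    observed = precursor_rate(conflicts, extremes, delta_t)
    null = [precursor_rate(conflicts, rng.uniform(0, t_span, len(extremes)), delta_t)
            for _ in range(n_perm)]
    return observed, np.mean(np.asarray(null) >= observed)

# Synthetic example: event times in months over a 240-month window.
rng = np.random.default_rng(6)
extremes = np.sort(rng.uniform(0, 240, 30))
conflicts = np.sort(np.concatenate([extremes[:10] + rng.uniform(0, 2, 10),
                                    rng.uniform(0, 240, 10)]))
print(coincidence_pvalue(conflicts, extremes, delta_t=3.0, t_span=240.0))
```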
Lü, Yiran; Hao, Shuxin; Zhang, Guoqing; Liu, Jie; Liu, Yue; Xu, Dongqun
2018-01-01
The objective was to implement an online statistical analysis function in the information system for air pollution and health impact monitoring, so that data analysis results are available in real time. Descriptive statistics, time-series analysis, and multivariate regression analysis were implemented online on top of the database software using SQL and visual tools. The system generates basic statistical tables and summary tables of air pollution exposure and health impact data online, generates trend charts for each data component with interactive connection to the database, and generates export sheets that can be loaded directly into R, SAS, and SPSS. The information system for air pollution and health impact monitoring thus implements the statistical analysis function online and can provide real-time analysis results to its users.
The changing landscape of astrostatistics and astroinformatics
NASA Astrophysics Data System (ADS)
Feigelson, Eric D.
2017-06-01
The history and current status of the cross-disciplinary fields of astrostatistics and astroinformatics are reviewed. Astronomers need a wide range of statistical methods for both data reduction and science analysis. With the proliferation of high-throughput telescopes, efficient large scale computational methods are also becoming essential. However, astronomers receive only weak training in these fields during their formal education. Interest in the fields is rapidly growing with conferences organized by scholarly societies, textbooks and tutorial workshops, and research studies pushing the frontiers of methodology. R, the premier language of statistical computing, can provide an important software environment for the incorporation of advanced statistical and computational methodology into the astronomical community.
Statistical science: a grammar for research.
Cox, David R
2017-06-01
I greatly appreciate the invitation to give this lecture with its century-long history. The title is a warning that the lecture is rather discursive and not highly focused and technical. The theme is simple: statistical thinking provides a unifying set of general ideas and specific methods relevant whenever appreciable natural variation is present. To be most fruitful, these ideas should merge seamlessly with subject-matter considerations. By contrast, there is sometimes a temptation to regard formal statistical analysis as a ritual to be added after the serious work has been done, a ritual to satisfy convention, referees, and regulatory agencies. I want implicitly to refute that idea.
Rosen, G D
2006-06-01
Meta-analysis is a vague descriptor used to encompass very diverse methods of data collection and analysis, ranging from simple averages to more complex statistical methods. Holo-analysis is a fully comprehensive statistical analysis of all available data and all available variables in a specified topic, with results expressed in a holistic factual empirical model. The objectives and applications of holo-analysis include software production for prediction of responses with confidence limits, translation of research conditions to praxis (field) circumstances, exposure of key missing variables, discovery of theoretically unpredictable variables and interactions, and planning future research. Holo-analyses are cited as examples of the effects on broiler feed intake and live weight gain of exogenous phytases, which account for 70% of variation in responses in terms of 20 highly significant chronological, dietary, environmental, genetic, managemental, and nutrient variables. Even better future accountancy of variation will be facilitated if and when authors of papers routinely provide key data for currently neglected variables, such as temperatures, complete feed formulations, and mortalities.
Counting statistics of chaotic resonances at optical frequencies: Theory and experiments
NASA Astrophysics Data System (ADS)
Lippolis, Domenico; Wang, Li; Xiao, Yun-Feng
2017-07-01
A deformed dielectric microcavity is used as an experimental platform for the analysis of the statistics of chaotic resonances, in the perspective of testing fractal Weyl laws at optical frequencies. In order to surmount the difficulties that arise from reading strongly overlapping spectra, we exploit the mixed nature of the phase space at hand, and only count the high-Q whispering-gallery modes (WGMs) directly. That enables us to draw statistical information on the more lossy chaotic resonances, coupled to the high-Q regular modes via dynamical tunneling. Three different models [classical, Random-Matrix-Theory (RMT) based, semiclassical] to interpret the experimental data are discussed. On the basis of least-squares analysis, theoretical estimates of Ehrenfest time, and independent measurements, we find that a semiclassically modified RMT-based expression best describes the experiment in all its realizations, particularly when the resonator is coupled to visible light, while RMT alone still works quite well in the infrared. In this work we reexamine and substantially extend the results of a short paper published earlier [L. Wang et al., Phys. Rev. E 93, 040201(R) (2016), 10.1103/PhysRevE.93.040201].
NASA Astrophysics Data System (ADS)
Sarkar, Arnab; Karki, Vijay; Aggarwal, Suresh K.; Maurya, Gulab S.; Kumar, Rohit; Rai, Awadhesh K.; Mao, Xianglei; Russo, Richard E.
2015-06-01
Laser induced breakdown spectroscopy (LIBS) was applied for elemental characterization of high alloy steel using partial least squares regression (PLSR), with the objective of evaluating the analytical performance of this multivariate approach. The optimization of the number of principal components for minimizing error in the PLSR algorithm was investigated. The effect of different pre-treatment procedures on the raw spectral data before PLSR analysis was evaluated based on several statistical parameters (standard error of prediction, percentage relative error of prediction, etc.). The pre-treatment with the "NORM" parameter gave the optimum statistical results. The analytical performance of the PLSR model improved by increasing the number of laser pulses accumulated per spectrum as well as by truncating the spectrum to an appropriate wavelength region. It was found that the statistical benefit of truncating the spectrum can also be accomplished by increasing the number of laser pulses per accumulation without spectral truncation. The constituents (Co and Mo) present in hundreds of ppm were determined with a relative precision of 4-9% (2σ), whereas the major constituents Cr and Ni (present at a few percent levels) were determined with a relative precision of ~2% (2σ).
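A minimal PLSR calibration loop of the kind evaluated in the paper can be sketched with scikit-learn, including cross-validated selection of the number of latent components; the synthetic "spectra" and concentrations below are stand-ins for real LIBS data, and the preprocessing ("NORM", spectral truncation) steps are not reproduced.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score
from sklearn.metrics import r2_score

# Synthetic "spectra": 60 samples x 500 wavelength channels, one analyte.
rng = np.random.default_rng(7)
conc = rng.uniform(0.1, 5.0, 60)                         # e.g., wt% of one element
line = np.exp(-0.5 * ((np.arange(500) - 230) / 3.0) ** 2)
spectra = np.outer(conc, line) + rng.normal(0, 0.05, (60, 500))

# Choose the number of latent components by cross-validated R^2.
cv_r2 = {k: cross_val_score(PLSRegression(n_components=k), spectra, conc,
                            cv=5, scoring="r2").mean()
         for k in range(1, 11)}
best_k = max(cv_r2, key=cv_r2.get)
model = PLSRegression(n_components=best_k).fit(spectra, conc)
print("components:", best_k,
      "training R^2:", r2_score(conc, model.predict(spectra).ravel()))
```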
McDermott, Jason E.; Wang, Jing; Mitchell, Hugh; Webb-Robertson, Bobbie-Jo; Hafen, Ryan; Ramey, John; Rodland, Karin D.
2012-01-01
Introduction The advent of high throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful molecular signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities for more sophisticated approaches to integrating purely statistical and expert knowledge-based approaches. Areas covered In this review we will present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered in deriving valid and useful signatures of disease. We will then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. Expert opinion Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to identify predictive signatures of disease are key to future success in the biomarker field. We will describe our recommendations for possible approaches to this problem including metrics for the evaluation of biomarkers. PMID:23335946
Log-Normality and Multifractal Analysis of Flame Surface Statistics
NASA Astrophysics Data System (ADS)
Saha, Abhishek; Chaudhuri, Swetaprovo; Law, Chung K.
2013-11-01
The turbulent flame surface is typically highly wrinkled and folded at a multitude of scales controlled by various flame properties. It is useful if the information contained in this complex geometry can be projected onto a simpler regular geometry for the use of spectral, wavelet or multifractal analyses. Here we investigate local flame surface statistics of a turbulent flame expanding under constant pressure. First, the statistics of the local length ratio are experimentally obtained from high-speed Mie scattering images. For a spherically expanding flame, the length ratio on the measurement plane, at predefined equiangular sectors, is defined as the ratio of the actual flame length to the length of a circular arc of radius equal to the average radius of the flame. Assuming an isotropic distribution of such flame segments, we convolute suitable forms of the length-ratio probability distribution functions (pdfs) to arrive at the corresponding area-ratio pdfs. Both pdfs are found to be nearly log-normally distributed and show self-similar behavior with increasing radius. The near log-normality and rather intermittent behavior of the flame-length ratio suggest similarity with dissipation-rate quantities, which motivates multifractal analysis. Currently at Indian Institute of Science, India.
Machine learning patterns for neuroimaging-genetic studies in the cloud.
Da Mota, Benoit; Tudoran, Radu; Costan, Alexandru; Varoquaux, Gaël; Brasche, Goetz; Conrod, Patricia; Lemaitre, Herve; Paus, Tomas; Rietschel, Marcella; Frouin, Vincent; Poline, Jean-Baptiste; Antoniu, Gabriel; Thirion, Bertrand
2014-01-01
Brain imaging is a natural intermediate phenotype to understand the link between genetic information and behavior or brain pathologies risk factors. Massive efforts have been made in the last few years to acquire high-dimensional neuroimaging and genetic data on large cohorts of subjects. The statistical analysis of such data is carried out with increasingly sophisticated techniques and represents a great computational challenge. Fortunately, increasing computational power in distributed architectures can be harnessed, if new neuroinformatics infrastructures are designed and training to use these new tools is provided. Combining a MapReduce framework (TomusBLOB) with machine learning algorithms (Scikit-learn library), we design a scalable analysis tool that can deal with non-parametric statistics on high-dimensional data. End-users describe the statistical procedure to perform and can then test the model on their own computers before running the very same code in the cloud at a larger scale. We illustrate the potential of our approach on real data with an experiment showing how the functional signal in subcortical brain regions can be significantly fit with genome-wide genotypes. This experiment demonstrates the scalability and the reliability of our framework in the cloud with a 2 weeks deployment on hundreds of virtual machines.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McDermott, Jason E.; Wang, Jing; Mitchell, Hugh D.
2013-01-01
The advent of high throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities both for purely statistical and expert knowledge-based approaches and would benefit from improved integration of the two. Areas covered In this review we will present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered. We will then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. Expert opinion Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to biomarker discovery and characterization are key to future success in the biomarker field. We will describe our recommendations of possible approaches to this problem including metrics for the evaluation of biomarkers.
The remarkable geographical pattern of gastric cancer mortality in Ecuador.
Montero-Oleas, Nadia; Núñez-González, Solange; Simancas-Racines, Daniel
2017-12-01
This study aimed to describe the gastric cancer mortality trend, and to analyze the spatial distribution of gastric cancer mortality in Ecuador, between 2004 and 2015. Data were collected from the National Institute of Statistics and Census (INEC) database. Crude gastric cancer mortality rates, standardized mortality ratios (SMRs) and indirect standardized mortality rates (ISMRs) were calculated per 100,000 persons. For time trend analysis, joinpoint regression was used. The annual percent change (APC) and the average annual percent change (AAPC) were computed for each province. Spatial age-adjusted analysis was used to detect high risk clusters of gastric cancer mortality, from 2010 to 2015, using Kulldorff spatial scan statistics. In Ecuador, between 2004 and 2015, gastric cancer caused a total of 19,115 deaths: 10,679 in men and 8436 in women. When crude rates were analyzed, a significant decline was detected (AAPC: -1.8%; p<0.001). ISMR also decreased, but this change was not statistically significant (APC: -0.53%; p=0.36). From 2004 to 2007 and from 2008 to 2011 the province with the highest ISMR was Carchi; from 2012 to 2015, it was Cotopaxi. The most likely high occurrence cluster included Bolívar, Los Ríos, Chimborazo, Tungurahua, and Cotopaxi provinces, with a relative risk of 1.34 (p<0.001). There is a substantial geographic variation in gastric cancer mortality rates among Ecuadorian provinces. The spatial analysis indicates the presence of high occurrence clusters throughout the Andes Mountains. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Optimizing human activity patterns using global sensitivity analysis.
Fairchild, Geoffrey; Hickmann, Kyle S; Mniszewski, Susan M; Del Valle, Sara Y; Hyman, James M
2014-12-01
Implementing realistic activity patterns for a population is crucial for modeling, for example, disease spread, supply and demand, and disaster response. Using the dynamic activity simulation engine, DASim, we generate schedules for a population that capture regular (e.g., working, eating, and sleeping) and irregular activities (e.g., shopping or going to the doctor). We use the sample entropy (SampEn) statistic to quantify a schedule's regularity for a population. We show how to tune an activity's regularity by adjusting SampEn, thereby making it possible to realistically design activities when creating a schedule. The tuning process sets up a computationally intractable high-dimensional optimization problem. To reduce the computational demand, we use Bayesian Gaussian process regression to compute global sensitivity indices and identify the parameters that have the greatest effect on the variance of SampEn. We use the harmony search (HS) global optimization algorithm to locate global optima. Our results show that HS combined with global sensitivity analysis can efficiently tune the SampEn statistic with few search iterations. We demonstrate how global sensitivity analysis can guide statistical emulation and global optimization algorithms to efficiently tune activities and generate realistic activity patterns. Though our tuning methods are applied to dynamic activity schedule generation, they are general and represent a significant step in the direction of automated tuning and optimization of high-dimensional computer simulations.
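Since the whole tuning procedure revolves around the SampEn statistic, a direct (quadratic-time) implementation is shown below for reference; the template length m = 2 and tolerance r = 0.2 standard deviations are common defaults and are assumptions here, not values taken from the paper.

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """SampEn(m, r) = -ln(A/B), where B counts template matches of length m and
    A counts matches of length m+1 (Chebyshev distance <= r, self-matches excluded)."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()
    n_templates = len(x) - m                      # same template count for both lengths

    def count_matches(length):
        t = np.array([x[i:i + length] for i in range(n_templates)])
        total = 0
        for i in range(n_templates):
            d = np.max(np.abs(t - t[i]), axis=1)
            total += np.sum(d <= r) - 1           # exclude the self-match
        return total

    B, A = count_matches(m), count_matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf

rng = np.random.default_rng(8)
regular = np.sin(np.linspace(0, 20 * np.pi, 500))
irregular = rng.normal(size=500)
print(sample_entropy(regular), sample_entropy(irregular))   # low vs. high SampEn
```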
Optimizing human activity patterns using global sensitivity analysis
Hickmann, Kyle S.; Mniszewski, Susan M.; Del Valle, Sara Y.; Hyman, James M.
2014-01-01
Implementing realistic activity patterns for a population is crucial for modeling, for example, disease spread, supply and demand, and disaster response. Using the dynamic activity simulation engine, DASim, we generate schedules for a population that capture regular (e.g., working, eating, and sleeping) and irregular activities (e.g., shopping or going to the doctor). We use the sample entropy (SampEn) statistic to quantify a schedule’s regularity for a population. We show how to tune an activity’s regularity by adjusting SampEn, thereby making it possible to realistically design activities when creating a schedule. The tuning process sets up a computationally intractable high-dimensional optimization problem. To reduce the computational demand, we use Bayesian Gaussian process regression to compute global sensitivity indices and identify the parameters that have the greatest effect on the variance of SampEn. We use the harmony search (HS) global optimization algorithm to locate global optima. Our results show that HS combined with global sensitivity analysis can efficiently tune the SampEn statistic with few search iterations. We demonstrate how global sensitivity analysis can guide statistical emulation and global optimization algorithms to efficiently tune activities and generate realistic activity patterns. Though our tuning methods are applied to dynamic activity schedule generation, they are general and represent a significant step in the direction of automated tuning and optimization of high-dimensional computer simulations. PMID:25580080
Carmichael, Mary C.; St. Clair, Candace; Edwards, Andrea M.; Barrett, Peter; McFerrin, Harris; Davenport, Ian; Awad, Mohamed; Kundu, Anup; Ireland, Shubha Kale
2016-01-01
Xavier University of Louisiana leads the nation in awarding BS degrees in the biological sciences to African-American students. In this multiyear study with ∼5500 participants, data-driven interventions were adopted to improve student academic performance in a freshman-level general biology course. The three hour-long exams were common and administered concurrently to all students. New exam questions were developed using Bloom’s taxonomy, and exam results were analyzed statistically with validated assessment tools. All but the comprehensive final exam were returned to students for self-evaluation and remediation. Among other approaches, course rigor was monitored by using an identical set of 60 questions on the final exam across 10 semesters. Analysis of the identical sets of 60 final exam questions revealed that overall averages increased from 72.9% (2010) to 83.5% (2015). Regression analysis demonstrated a statistically significant correlation between high-risk students and their averages on the 60 questions. Additional analysis demonstrated statistically significant improvements for at least one letter grade from midterm to final and a 20% increase in the course pass rates over time, also for the high-risk population. These results support the hypothesis that our data-driven interventions and assessment techniques are successful in improving student retention, particularly for our academically at-risk students. PMID:27543637
Optimizing human activity patterns using global sensitivity analysis
Fairchild, Geoffrey; Hickmann, Kyle S.; Mniszewski, Susan M.; ...
2013-12-10
Implementing realistic activity patterns for a population is crucial for modeling, for example, disease spread, supply and demand, and disaster response. Using the dynamic activity simulation engine, DASim, we generate schedules for a population that capture regular (e.g., working, eating, and sleeping) and irregular activities (e.g., shopping or going to the doctor). We use the sample entropy (SampEn) statistic to quantify a schedule’s regularity for a population. We show how to tune an activity’s regularity by adjusting SampEn, thereby making it possible to realistically design activities when creating a schedule. The tuning process sets up a computationally intractable high-dimensional optimization problem. To reduce the computational demand, we use Bayesian Gaussian process regression to compute global sensitivity indices and identify the parameters that have the greatest effect on the variance of SampEn. Here we use the harmony search (HS) global optimization algorithm to locate global optima. Our results show that HS combined with global sensitivity analysis can efficiently tune the SampEn statistic with few search iterations. We demonstrate how global sensitivity analysis can guide statistical emulation and global optimization algorithms to efficiently tune activities and generate realistic activity patterns. Finally, though our tuning methods are applied to dynamic activity schedule generation, they are general and represent a significant step in the direction of automated tuning and optimization of high-dimensional computer simulations.
Enrichment analysis in high-throughput genomics - accounting for dependency in the NULL.
Gold, David L; Coombes, Kevin R; Wang, Jing; Mallick, Bani
2007-03-01
Translating the overwhelming amount of data generated in high-throughput genomics experiments into biologically meaningful evidence, which may for example point to a series of biomarkers or hint at a relevant pathway, is a matter of great interest in bioinformatics these days. Genes showing similar experimental profiles, it is hypothesized, share biological mechanisms that if understood could provide clues to the molecular processes leading to pathological events. It is the topic of further study to learn if or how a priori information about the known genes may serve to explain coexpression. One popular method of knowledge discovery in high-throughput genomics experiments, enrichment analysis (EA), seeks to infer if an interesting collection of genes is 'enriched' for a particular set of a priori Gene Ontology Consortium (GO) classes. For the purposes of statistical testing, the conventional methods offered in EA software implicitly assume independence between the GO classes. Genes may be annotated for more than one biological classification, and therefore the resulting test statistics of enrichment between GO classes can be highly dependent if the overlapping gene sets are relatively large. There is a need to formally determine if conventional EA results are robust to the independence assumption. We derive the exact null distribution for testing enrichment of GO classes by relaxing the independence assumption using well-known statistical theory. In applications with publicly available data sets, our test results are similar to the conventional approach which assumes independence. We argue that the independence assumption is not detrimental.
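For contrast with the exact dependent-null distribution derived in the paper, the conventional per-class EA test is typically a one-sided hypergeometric (Fisher) test computed independently for each GO class; a minimal sketch with hypothetical counts follows.

```python
from scipy.stats import hypergeom

def enrichment_pvalue(n_genome, n_class, n_selected, n_overlap):
    """One-sided hypergeometric P(X >= n_overlap): probability that a random gene
    list of size n_selected hits the GO class at least n_overlap times."""
    return hypergeom.sf(n_overlap - 1, n_genome, n_class, n_selected)

# Hypothetical counts: 20,000 annotated genes, a GO class of 150 genes,
# 400 differentially expressed genes, 12 of which fall in the class.
print(enrichment_pvalue(20000, 150, 400, 12))
```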
Climate drivers on malaria transmission in Arunachal Pradesh, India.
Upadhyayula, Suryanaryana Murty; Mutheneni, Srinivasa Rao; Chenna, Sumana; Parasaram, Vaideesh; Kadiri, Madhusudhan Rao
2015-01-01
The present study was conducted during the years 2006 to 2012 and provides information on the prevalence of malaria and how it is influenced by various climatic factors in the East Siang district of Arunachal Pradesh, India. Correlation analysis, Principal Component Analysis, and Hotelling's T² statistic are adopted to understand the effect of weather variables on malaria transmission. The epidemiological study shows that the prevalence of malaria is mostly caused by the parasite Plasmodium vivax, followed by Plasmodium falciparum. It is noted that the incidence of malaria declined gradually from 2006 to 2012. Malaria transmission was higher during the rainy season than in the summer and winter seasons. Further, the data analysis with Principal Component Analysis and Hotelling's T² statistic revealed that climatic variables such as temperature and rainfall are the most influential factors behind the high rate of malaria transmission in the East Siang district of Arunachal Pradesh.
NASA Technical Reports Server (NTRS)
Belmont, A. D.
1979-01-01
The problem of preventing cabin ozone from exceeding a given standard was investigated. Statistical analysis of the vertical distribution of ozone is summarized. The cost, logistics, maintenance, ability to forecast ozone, and avoidance of high ozone concentrations are presented. Filtering approaches and the requirements to remove ozone toxicity are discussed.
High School and Beyond. 1980 Sophomore Cohort. First Follow-Up (1982). [machine-readable data file].
ERIC Educational Resources Information Center
National Center for Education Statistics (ED), Washington, DC.
The High School and Beyond 1980 Sophomore Cohort First Follow-Up (1982) data file is presented. The First Follow-Up Sophomore Cohort data tape consists of four related data files: (1) the student data file (including data availability flags, weights, questionnaire data, and composite variables); (2) Statistical Analysis System (SAS) control cards…
Discipline and Order in American High Schools. Contractor Report.
ERIC Educational Resources Information Center
DiPrete, Thomas A.; And Others
Discipline and misbehavior in American high schools are the focus of this analysis of data from the first wave (1980) of a longitudinal study of over 30,000 sophomores and over 28,000 seniors. A summary of the findings shows that differences between urban and other schools are usually statistically insignificant when other school and student…
A Handbook of Sound and Vibration Parameters
1978-09-18
fixed in space. (Reference 1.) no motion at a node Static Divergence: (See Divergence.) Statistical Energy Analysis (SEA): Statistical energy analysis is...parameters of the circuits come from statistics of the vibrational characteristics of the structure. Statistical energy analysis is uniquely successful
NASA Technical Reports Server (NTRS)
Dominick, Wayne D. (Editor); Bassari, Jinous; Triantafyllopoulos, Spiros
1984-01-01
The University of Southwestern Louisiana (USL) NASA PC R and D statistical analysis support package is designed to be a three-level package to allow statistical analysis for a variety of applications within the USL Data Base Management System (DBMS) contract work. The design addresses usage of the statistical facilities as a library package, as an interactive statistical analysis system, and as a batch processing package.
Association analysis of multiple traits by an approach of combining P values.
Chen, Lili; Wang, Yong; Zhou, Yajing
2018-03-01
Increasing evidence shows that one variant can affect multiple traits, a widespread phenomenon in complex diseases. Joint analysis of multiple traits can increase the statistical power of association analysis and uncover the underlying genetic mechanism. Although there are many statistical methods to analyse multiple traits, most of them are suitable mainly for detecting common variants associated with multiple traits. However, because of the low minor allele frequencies of rare variants, these methods are not optimal for rare-variant association analysis. In this paper, we extend an adaptive combination of P values method (termed ADA), originally developed for a single trait, to test the association between multiple traits and rare variants in a given region. For a given region, we use a reverse regression model to test each rare variant for association with multiple traits and obtain the P value of the single-variant test. Further, we take the weighted combination of these P values as the test statistic. Extensive simulation studies show that our approach is more powerful than several other comparison methods in most cases and is robust to the inclusion of a high proportion of neutral variants and to different directions of effects of causal variants.
Gis-Based Spatial Statistical Analysis of College Graduates Employment
NASA Astrophysics Data System (ADS)
Tang, R.
2012-07-01
Awareness of the distribution and employment status of college graduates is urgently needed for the proper allocation of human resources and the overall planning of strategic industries. This study provides empirical evidence regarding the use of geocoding and spatial analysis in examining the distribution and employment status of college graduates, based on 2004-2008 data from the Wuhan Municipal Human Resources and Social Security Bureau, China. The spatio-temporal distribution of employment units was analyzed with geocoding in ArcGIS, and stepwise multiple linear regression in SPSS was used to predict employment and to identify spatially associated enterprise and professional demand in the future. The results show that the number of enterprises in the Wuhan East Lake High and New Technology Development Zone increased dramatically from 2004 to 2008 and tended to be distributed southeastward. Furthermore, the models built by statistical analysis suggest that a graduate's major has an important impact on the number employed and on the number of graduates engaging in pillar industries. In conclusion, the combination of GIS and statistical analysis, which helps to simulate the spatial distribution of employment status, is a potential tool for human resource development research.
Luster measurements of lips treated with lipstick formulations.
Yadav, Santosh; Issa, Nevine; Streuli, David; McMullen, Roger; Fares, Hani
2011-01-01
In this study, digital photography in combination with image analysis was used to measure the luster of several lipstick formulations containing varying amounts and types of polymers. A weighed amount of lipstick was applied to a mannequin's lips and the mannequin was illuminated by a uniform beam of a white light source. Digital images of the mannequin were captured with a high-resolution camera and the images were analyzed using image analysis software. Luster analysis was performed using Stamm (L(Stamm)) and Reich-Robbins (L(R-R)) luster parameters. Statistical analysis was performed on each luster parameter (L(Stamm) and L(R-R)), peak height, and peak width. Peak heights for the lipstick formulations containing 11% and 5% VP/eicosene copolymer were statistically different from those of the control. The L(Stamm) and L(R-R) parameters for the treatment containing 11% VP/eicosene copolymer were statistically different from those of the control. Based on the results obtained in this study, we are able to determine whether a polymer is a good pigment dispersant and contributes to the visually detected shine of a lipstick upon application. The methodology presented in this paper could serve as a tool for investigators to screen their ingredients for shine in lipstick formulations.
Water quality analysis of the Rapur area, Andhra Pradesh, South India using multivariate techniques
NASA Astrophysics Data System (ADS)
Nagaraju, A.; Sreedhar, Y.; Thejaswi, A.; Sayadi, Mohammad Hossein
2017-10-01
Groundwater samples from the Rapur area were collected from different sites to evaluate the major-ion chemistry. Such a large volume of data can lead to difficulties in the integration, interpretation, and representation of the results. Two multivariate statistical methods, hierarchical cluster analysis (HCA) and factor analysis (FA), were applied to evaluate their usefulness in classifying the samples and identifying the geochemical processes controlling groundwater geochemistry. Four statistically significant clusters were obtained from 30 sampling stations. This resulted in two important clusters, viz. cluster 1 (pH, Si, CO3, Mg, SO4, Ca, K, HCO3, alkalinity, Na, Na + K, Cl, and hardness) and cluster 2 (EC and TDS), which enter the study area from different sources. The application of different multivariate statistical techniques, such as principal component analysis (PCA), assists in the interpretation of complex data matrices for a better understanding of the water quality of a study area. From the PCA, it is clear that the first factor (factor 1), which accounted for 36.2% of the total variance, had high positive loadings on EC, Mg, Cl, TDS, and hardness. Based on the PCA scores, four significant cluster groups of sampling locations were detected on the basis of similarity of their water quality.
Fotina, I; Lütgendorf-Caucig, C; Stock, M; Pötter, R; Georg, D
2012-02-01
Inter-observer studies represent a valid method for the evaluation of target definition uncertainties and contouring guidelines. However, data from the literature do not yet give clear guidelines for reporting contouring variability. Thus, the purpose of this work was to compare and discuss various methods to determine variability on the basis of clinical cases and a literature review. In this study, 7 prostate and 8 lung cases were contoured on CT images by 8 experienced observers. Analysis of variability included descriptive statistics, calculation of overlap measures, and statistical measures of agreement. Cross tables with ratios and correlations were established for overlap parameters. It was shown that the minimal set of parameters to be reported should include at least one of three volume overlap measures (i.e., generalized conformity index, Jaccard coefficient, or conformation number). High correlation between these parameters and scatter of the results was observed. A combination of descriptive statistics, overlap measure, and statistical measure of agreement or reliability analysis is required to fully report the interrater variability in delineation.
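Two of the volume-overlap measures named above are simple set ratios on binary delineation masks; the sketch below computes a pairwise Jaccard coefficient and a pairwise-pooled conformity index on synthetic contours. The exact definition of the generalized conformity index used in the study may differ in detail, so treat this as an assumption-laden illustration.

```python
import numpy as np
from itertools import combinations

def jaccard(a, b):
    """Jaccard coefficient of two binary masks: overlap volume / union volume."""
    a, b = a.astype(bool), b.astype(bool)
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()

def pooled_conformity(masks):
    """Pairwise-pooled conformity: sum of pairwise intersections over sum of unions."""
    inter = sum(np.logical_and(a, b).sum() for a, b in combinations(masks, 2))
    union = sum(np.logical_or(a, b).sum() for a, b in combinations(masks, 2))
    return inter / union

# Synthetic delineations: three observers contouring a cube with small offsets.
vol = np.zeros((3, 40, 40, 40), dtype=bool)
vol[0, 10:30, 10:30, 10:30] = True
vol[1, 11:31, 10:30, 10:30] = True
vol[2, 10:30, 12:32, 10:30] = True
print(jaccard(vol[0], vol[1]), pooled_conformity(list(vol)))
```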
Statistical Tutorial | Center for Cancer Research
Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. ST is designed as a follow-up to Statistical Analysis of Research Data (SARD) held in April 2018. The tutorial will apply the general principles of statistical analysis of research data including descriptive statistics, z- and t-tests of means and mean
Reconnection properties in Kelvin-Helmholtz instabilities
NASA Astrophysics Data System (ADS)
Vernisse, Y.; Lavraud, B.; Eriksson, S.; Gershman, D. J.; Dorelli, J.; Pollock, C. J.; Giles, B. L.; Aunai, N.; Avanov, L. A.; Burch, J.; Chandler, M. O.; Coffey, V. N.; Dargent, J.; Ergun, R.; Farrugia, C. J.; Genot, V. N.; Graham, D.; Hasegawa, H.; Jacquey, C.; Kacem, I.; Khotyaintsev, Y. V.; Li, W.; Magnes, W.; Marchaudon, A.; Moore, T. E.; Paterson, W. R.; Penou, E.; Phan, T.; Retino, A.; Schwartz, S. J.; Saito, Y.; Sauvaud, J. A.; Schiff, C.; Torbert, R. B.; Wilder, F. D.; Yokota, S.
2017-12-01
Kelvin-Helmholtz instabilities provide a distinctive laboratory for studying strong guide-field reconnection processes. In particular, unlike the usual dayside magnetopause, the conditions across the magnetopause in KH vortices are quasi-symmetric, with small differences in beta and magnetic shear angle. We study these properties by means of a statistical analysis of the high-resolution data of the Magnetospheric Multiscale mission. Several Kelvin-Helmholtz instability events past the terminator plane and a long-lasting dayside instability event were used to produce this statistical analysis. Early results show consistency between the data and the theory. In addition, the results emphasize the importance of the thickness of the magnetopause as a driver of magnetic reconnection in low-magnetic-shear events.
Paranjpe, Madhav G; Denton, Melissa D; Vidmar, Tom; Elbekai, Reem H
2014-01-01
Carcinogenicity studies have been performed in conventional 2-year rodent studies for at least 3 decades, whereas the short-term carcinogenicity studies in transgenic mice, such as Tg.rasH2, have only been performed over the last decade. In the 2-year conventional rodent studies, interlinked problems, such as increasing trends in the initial body weights, increased body weight gains, high incidence of spontaneous tumors, and low survival, that complicate the interpretation of findings have been well established. However, these end points have not been evaluated in the short-term carcinogenicity studies involving the Tg.rasH2 mice. In this article, we present retrospective analysis of data obtained from control groups in 26-week carcinogenicity studies conducted in Tg.rasH2 mice since 2004. Our analysis showed statistically significant decreasing trends in initial body weights of both sexes. Although the terminal body weights did not show any significant trends, there was a statistically significant increasing trend toward body weight gains, more so in males than in females, which correlated with increasing trends in the food consumption. There were no statistically significant alterations in mortality trends. In addition, the incidence of all common spontaneous tumors remained fairly constant with no statistically significant differences in trends. © The Author(s) 2014.
Colizzo, J M; Clayton, S B; Richter, J E
2016-08-01
The aim of this investigation was to determine the motility patterns of the inflammatory and fibrostenotic phenotypes of eosinophilic esophagitis (EoE) utilizing high-resolution manometry (HRM). Twenty-nine patients with a confirmed diagnosis of EoE according to clinicopathological criteria, currently being managed at the Joy McCann Culverhouse Swallowing Center at the University of South Florida, were included in the retrospective analysis. Only patients who completed HRM studies were included in the analysis. Patients were classified into inflammatory or fibrostenotic subtypes based on baseline endoscopic evidence. Their baseline HRM studies prior to therapy were analyzed. Manometric data including distal contractile integral, integrated relaxation pressure, and intrabolus pressure (IBP) values were recorded. HRM results were interpreted according to the Chicago Classification system. Statistical analysis was performed with SPSS software (Version 22, IBM Co., Armonk, NY, USA). Data were compared utilizing Student's t-test, χ(2) test, Pearson correlation, and Spearman correlation tests. Statistical significance was set at P < 0.05. A total of 29 patients with EoE were included in the retrospective analysis. The overall average age among patients was 40 years. Male patients comprised 62% of the overall population. Both groups were similar in age, gender, and overall clinical presentation. Seventeen patients (58%) had fibrostenotic disease, and 12 (42%) displayed inflammatory disease. The average IBP was 18.6 ± 6.0 mmHg in the fibrostenotic group and 12.6 ± 3.5 mmHg in the inflammatory group (P < 0.05). Strictures were only seen in the fibrostenotic group. Of the fibrostenotic group, 6 (35%) demonstrated proximal esophageal strictures, 7 (41%) had distal strictures, 3 (18%) had mid-esophageal strictures, and 1 (6%) patient had pan-esophageal strictures. There was no statistically significant correlation between the level of esophageal stricture and degree of IBP. Integrated relaxation pressure, distal contractile integral, and other HRM metrics did not differ significantly between the two subtypes. There was also no statistically significant correlation between patient demographics and esophageal metrics. Patients with the fibrostenotic phenotype of EoE demonstrated an IBP that was significantly higher than that of the inflammatory group. © 2015 International Society for Diseases of the Esophagus.
NASA Astrophysics Data System (ADS)
Voegel, Phillip D.; Quashnock, Kathryn A.; Heil, Katrina M.
2004-05-01
The Student-to-Student Chemistry Initiative is an outreach program started in the fall of 2001 at Midwestern State University (MSU). The on-campus program trains high school science students to perform a series of chemistry demonstrations and subsequently provides kits containing the necessary supplies and reagents for the high school students to perform demonstration programs at elementary schools. The program focuses on improving student perception of science. The program's impact on high school student perception is evaluated through statistical analysis of paired preparticipation and postparticipation surveys. The surveys focus on four areas of student perception: general attitude toward science, interest in careers in science, science awareness, and interest in attending MSU for postsecondary education. Increased scores were observed in all evaluation areas, including a statistically significant increase in science awareness following participation.
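A minimal sketch of the paired pre/post comparison described above, using a paired t-test on matched survey scores; the Likert-scale values, sample size, and the use of SciPy are illustrative assumptions rather than the study's actual analysis.

    import numpy as np
    from scipy import stats

    # Hypothetical paired survey scores (mean of Likert items) for the same ten students
    pre  = np.array([3.1, 2.8, 3.5, 3.0, 2.9, 3.3, 3.6, 2.7, 3.2, 3.4])
    post = np.array([3.4, 3.0, 3.6, 3.5, 3.1, 3.2, 3.9, 3.0, 3.5, 3.6])

    t_stat, p_value = stats.ttest_rel(post, pre)  # paired t-test on pre/post differences
    print(f"mean change = {np.mean(post - pre):.2f}, t = {t_stat:.2f}, p = {p_value:.3f}")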
Statistical and Spatial Analysis of Bathymetric Data for the St. Clair River, 1971-2007
Bennion, David
2009-01-01
To address questions concerning ongoing geomorphic processes in the St. Clair River, selected bathymetric datasets spanning 36 years were analyzed. Comparisons of recent high-resolution datasets covering the upper river indicate a highly variable, active environment. Although statistical and spatial comparisons of the datasets show that some changes to the channel size and shape have taken place during the study period, uncertainty associated with the various survey methods and interpolation processes limits the conclusions that can be drawn with statistical certainty. The methods used to spatially compare the datasets are sensitive to small variations in position and depth that are within the range of uncertainty associated with the datasets. Characteristics of the data, such as the density of measured points and the range of values surveyed, can also influence the results of spatial comparison. With due consideration of these limitations, apparently active and ongoing areas of elevation change in the river are mapped and discussed.
A survey of work engagement and psychological capital levels.
Bonner, Lynda
2016-08-11
To evaluate the relationship between work engagement and psychological capital (PsyCap) levels reported by registered nurses. PsyCap is a developable human resource. Research on PsyCap as an antecedent to work engagement in nurses is needed. A convenience sample of 137 registered nurses participated in this quantitative cross-sectional survey. Questionnaires measured self-reported levels of work engagement and psychological capital. Descriptive and inferential statistics were used for data analysis. There was a statistically significant correlation between work engagement and PsyCap scores (r=0.633, p<0.01). Nurses working at band 5 level reported statistically significantly lower PsyCap scores compared with nurses working at band 6 and 7 levels. Nurses reporting high levels of work engagement also reported high levels of PsyCap. Band 5 nurses might benefit most from interventions to increase their PsyCap. This study supports PsyCap as an antecedent to work engagement.
Johnson, Gregory R.; Kangas, Joshua D.; Dovzhenko, Alexander; Trojok, Rüdiger; Voigt, Karsten; Majarian, Timothy D.; Palme, Klaus; Murphy, Robert F.
2017-01-01
Quantitative image analysis procedures are necessary for the automated discovery of effects of drug treatment in large collections of fluorescent micrographs. When compared to their mammalian counterparts, the effects of drug conditions on protein localization in plant species are poorly understood and underexplored. To investigate this relationship, we generated a large collection of images of single plant cells after various drug treatments. For this, protoplasts were isolated from six transgenic lines of A. thaliana expressing fluorescently tagged proteins. Nine drugs at three concentrations were applied to protoplast cultures followed by automated image acquisition. For image analysis, we developed a cell segmentation protocol for detecting drug effects using a Hough-transform based region of interest detector and a novel cross-channel texture feature descriptor. In order to determine treatment effects, we summarized differences between treated and untreated experiments with an L1 Cramér-von Mises statistic. The distribution of these statistics across all pairs of treated and untreated replicates was compared to the variation within control replicates to determine the statistical significance of observed effects. Using this pipeline, we report the dose dependent drug effects in the first high-content Arabidopsis thaliana drug screen of its kind. These results can function as a baseline for comparison to other protein organization modeling approaches in plant cells. PMID:28245335
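The treated-versus-untreated comparison of feature distributions can be sketched with a two-sample Cramér-von Mises test. Note that SciPy's cramervonmises_2samp is the standard two-sample statistic, used here purely for illustration rather than the L1 variant the authors describe, and the feature values are simulated.

    import numpy as np
    from scipy.stats import cramervonmises_2samp

    rng = np.random.default_rng(0)
    untreated = rng.normal(loc=0.0, scale=1.0, size=500)  # a texture feature in control cells
    treated   = rng.normal(loc=0.3, scale=1.1, size=500)  # the same feature after drug treatment

    res = cramervonmises_2samp(treated, untreated)
    print(f"CvM statistic = {res.statistic:.3f}, p = {res.pvalue:.4f}")
    # In the spirit of the study, the observed statistic would be compared with the
    # spread of the same statistic computed between pairs of control replicates.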
Abécassis, V; Pompon, D; Truan, G
2000-10-15
The design of a family shuffling strategy (CLERY: Combinatorial Libraries Enhanced by Recombination in Yeast) associating PCR-based and in vivo recombination and expression in yeast is described. This strategy was tested using human cytochrome P450 CYP1A1 and CYP1A2 as templates, which share 74% nucleotide sequence identity. Construction of highly shuffled libraries of mosaic structures and reduction of parental gene contamination were two major goals. Library characterization involved multiprobe hybridization on DNA macro-arrays. The statistical analysis of randomly selected clones revealed a high proportion of chimeric genes (86%) and a homogeneous representation of the parental contribution among the sequences (55.8 +/- 2.5% for parental sequence 1A2). A microtiter plate screening system was designed to achieve colorimetric detection of polycyclic hydrocarbon hydroxylation by transformed yeast cells. Full sequences of five randomly picked and five functionally selected clones were analyzed. Results confirmed the shuffling efficiency and allowed calculation of the average length of sequence exchange and mutation rates. The efficient and statistically representative generation of mosaic structures by this type of family shuffling in a yeast expression system constitutes a novel and promising tool for structure-function studies and tuning enzymatic activities of multicomponent eucaryote complexes involving non-soluble enzymes.
Highly Efficient Design-of-Experiments Methods for Combining CFD Analysis and Experimental Data
NASA Technical Reports Server (NTRS)
Anderson, Bernhard H.; Haller, Harold S.
2009-01-01
It is the purpose of this study to examine the impact of "highly efficient" Design-of-Experiments (DOE) methods for combining sets of CFD-generated analysis data with smaller sets of experimental test data in order to accurately predict performance results where experimental test data were not obtained. The study examines the impact of micro-ramp flow control on the shock wave boundary layer (SWBL) interaction, where a complete paired set of data exists from both CFD analysis and experimental measurements. By combining the complete set of CFD analysis data, composed of fifteen (15) cases, with a smaller subset of experimental test data containing four/five (4/5) cases, compound data sets (CFD/EXP) were generated which allow the prediction of the complete set of experimental results. No statistical differences were found to exist between the combined (CFD/EXP) generated data sets and the complete experimental data set composed of fifteen (15) cases. The same optimal micro-ramp configuration was obtained using the (CFD/EXP) generated data as with the complete set of experimental data, and the DOE response surfaces generated by the two data sets were also not statistically different.
[Study on ecological suitability regionalization of Eucommia ulmoides in Guizhou].
Kang, Chuan-Zhi; Wang, Qing-Qing; Zhou, Tao; Jiang, Wei-Ke; Xiao, Cheng-Hong; Xie, Yu
2014-05-01
The aim of this work was to study the ecological suitability regionalization of Eucommia ulmoides in order to support the selection of artificial planting bases and high-quality industrial raw-material purchase areas for the herb in Guizhou. Based on an investigation of 14 Eucommia ulmoides producing areas, pinoresinol diglucoside content and ecological factors were obtained. A spatial analysis method was used to carry out the ecological suitability regionalization. In addition, combining the pinoresinol diglucoside content, the correlations between the major active components and environmental factors were examined by statistical analysis. The most suitable planting area of Eucommia ulmoides was the northwest of Guizhou. The distribution of Eucommia ulmoides was mainly affected by soil type, soil pH, and monthly precipitation. The major active components of Eucommia ulmoides were randomly distributed in global space but showed one aggregation point with a high positive correlation in local space. The major active components of Eucommia ulmoides had no correlation with altitude, longitude, or latitude. Using spatial analysis and statistical analysis based on environmental factors and pinoresinol diglucoside content, the ecological suitability regionalization of Eucommia ulmoides can provide a reference for the selection of suitable planting areas and artificial planting bases and for directing production layout.
The Essential Genome of Escherichia coli K-12
2018-01-01
ABSTRACT Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. PMID:29463657
Zhu, Wensheng; Yuan, Ying; Zhang, Jingwen; Zhou, Fan; Knickmeyer, Rebecca C; Zhu, Hongtu
2017-02-01
The aim of this paper is to systematically evaluate a biased sampling issue associated with genome-wide association analysis (GWAS) of imaging phenotypes for most imaging genetic studies, including the Alzheimer's Disease Neuroimaging Initiative (ADNI). Specifically, the original sampling scheme of these imaging genetic studies is primarily the retrospective case-control design, whereas most existing statistical analyses of these studies ignore this sampling scheme by directly correlating imaging phenotypes (called the secondary traits) with genotype. Although it has been well documented in genetic epidemiology that ignoring the case-control sampling scheme can produce highly biased estimates, and subsequently lead to misleading results and suspicious associations, such findings are not well documented in imaging genetics. We use extensive simulations and a large-scale imaging genetic data analysis of the Alzheimer's Disease Neuroimaging Initiative (ADNI) data to evaluate the effects of the case-control sampling scheme on GWAS results based on some standard statistical methods, such as linear regression methods, while comparing them with several advanced statistical methods that appropriately adjust for the case-control sampling scheme. Copyright © 2016 Elsevier Inc. All rights reserved.
Statistical analysis of time transfer data from Timation 2. [US Naval Observatory and Australia
NASA Technical Reports Server (NTRS)
Luck, J. M.; Morgan, P.
1974-01-01
Between July 1973 and January 1974, three time transfer experiments using the Timation 2 satellite were conducted to measure time differences between the U.S. Naval Observatory and Australia. Statistical tests showed that the results are unaffected by the satellite's position with respect to the sunrise/sunset line or by its closest approach azimuth at the Australian station. Further tests revealed that forward predictions of time scale differences, based on the measurements, can be made with high confidence.
Statistical Analysis of High-Cycle Fatigue Behavior of Friction Stir Welded AA5083-H321
2011-01-01
High-cycle fatigue properties of friction stir-welded (FSW) joints of AA5083-H321 (a solid-solution-strengthened and strain-hardened/stabilized Al-Mg-Mn alloy) are characterized by a relatively large statistical scatter; FSW is already used in serial production of aluminum-alloy ferryboat deck structures in Finland.
2015-08-01
The Statistical Package for the Social Sciences (SPSS) was used to conduct statistical analysis on the sample of responses to the nine questions. SPSS was again used in a second stage, in which factor analysis was conducted on the questionnaire constructs.
Nilsson, Björn; Håkansson, Petra; Johansson, Mikael; Nelander, Sven; Fioretos, Thoas
2007-01-01
Ontological analysis facilitates the interpretation of microarray data. Here we describe new ontological analysis methods which, unlike existing approaches, are threshold-free and statistically powerful. We perform extensive evaluations and introduce a new concept, detection spectra, to characterize methods. We show that different ontological analysis methods exhibit distinct detection spectra, and that it is critical to account for this diversity. Our results argue strongly against the continued use of existing methods, and provide directions towards an enhanced approach. PMID:17488501
Bradshaw, Elizabeth J; Keogh, Justin W L; Hume, Patria A; Maulder, Peter S; Nortje, Jacques; Marnewick, Michel
2009-06-01
The purpose of this study was to examine the role of neuromotor noise on golf swing performance in high- and low-handicap players. Selected two-dimensional kinematic measures of 20 male golfers (n=10 per high- or low-handicap group) performing 10 golf swings with a 5-iron club were obtained through video analysis. Neuromotor noise was calculated by deducting the standard error of the measurement from the coefficient of variation obtained from intra-individual analysis. Statistical methods included linear regression analysis and one-way analysis of variance using SPSS. Absolute invariance in the key technical positions (e.g., at the top of the backswing) of the golf swing appears to be a more favorable technique for skilled performance.
Qiao, Zhi; Li, Xiang; Liu, Haifeng; Zhang, Lei; Cao, Junyang; Xie, Guotong; Qin, Nan; Jiang, Hui; Lin, Haocheng
2017-01-01
The prevalence of erectile dysfunction (ED) has been extensively studied worldwide. Erectile dysfunction drugs have shown great efficacy in preventing male erectile dysfunction. In order to help doctors understand patients' drug-taking preferences and prescribe more appropriately, it is crucial to analyze who actually takes erectile dysfunction drugs and the relation between sexual behaviors and drug use. Existing clinical studies usually used descriptive statistics and regression analysis based on small volumes of data. In this paper, based on a large volume of data (48,630 questionnaires), we use data mining approaches in addition to statistics and regression analysis to comprehensively analyze the relation between male sexual behaviors and the use of erectile dysfunction drugs, in order to characterize the patients who take them. We first analyze the impact of multiple sexual behavior factors on whether erectile dysfunction drugs are used. Then, we mine decision rules for stratification to discover the patients who are more likely to take the drugs. Based on the decision rules, the patients can be partitioned into four potential groups for use of erectile dysfunction drugs: a high-potential group, intermediate potential-1 and potential-2 groups, and a low-potential group. Experimental results show that 1) the sexual behavior factors erectile hardness and preparation time (how long the patient prepares for sexual activity ahead of time) have the largest impacts, both in the correlation analysis and in discovering potential drug-taking patients; and 2) the odds ratio between patients identified as low potential and high potential was 6.098 (95% confidence interval, 5.159-7.209), with statistically significant differences in drug-taking potential detected between all potential groups.
NASA Astrophysics Data System (ADS)
Lopez, S. R.; Hogue, T. S.
2011-12-01
Global climate models (GCMs) are primarily used to generate historical and future large-scale circulation patterns at a coarse resolution (typically on the order of 50,000 km2) and fail to capture climate variability at the ground level due to localized surface influences (i.e., topography, marine layer, land cover, etc.). Their inability to accurately resolve these processes has led to the development of numerous 'downscaling' techniques. The goal of this study is to enhance statistical downscaling of daily precipitation and temperature for regions with heterogeneous land cover and topography. Our analysis was divided into two periods, historical (1961-2000) and contemporary (1980-2000), and tested using sixteen predictand combinations from four GCMs (GFDL CM2.0, GFDL CM2.1, CNRM-CM3, and MRI-CGCM2 3.2a). The Southern California area was separated into five county regions: Santa Barbara, Ventura, Los Angeles, Orange, and San Diego. Principal component analysis (PCA) was performed on ground-based observations in order to (1) reduce the number of redundant gauges and minimize dimensionality and (2) cluster gauges that behave statistically similarly for post-analysis. Post-PCA analysis included extensive testing of predictor-predictand relationships using an enhanced canonical correlation analysis (ECCA). The ECCA includes obtaining the optimal predictand sets for all models within each spatial domain (county) as governed by daily and monthly overall statistics. Results show all models maintain mean annual and monthly behavior within each county, and daily statistics are improved. The level of improvement depends strongly on the vegetation extent within each county and the land-to-ocean ratio within the GCM spatial grid. The utilization of the entire historical period also leads to better statistical representation of observed daily precipitation. The validated ECCA technique is being applied to future climate scenarios distributed by the IPCC in order to provide forcing data for regional hydrologic models and assess future water resources in the Southern California region.
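A minimal sketch of the PCA step described above, reducing correlated gauge records to a few components whose loadings can then be used to group statistically similar gauges; the simulated precipitation matrix and the use of scikit-learn are assumptions for illustration.

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(1)
    # Hypothetical daily precipitation at 12 gauges over 1000 days (rows = days)
    regional_modes = rng.gamma(2.0, 1.0, size=(1000, 3))        # three shared regional signals
    loadings = rng.uniform(0.2, 1.0, size=(3, 12))
    gauges = regional_modes @ loadings + 0.1 * rng.normal(size=(1000, 12))

    pca = PCA(n_components=3).fit(gauges)
    print("explained variance ratios:", np.round(pca.explained_variance_ratio_, 3))
    # Gauges with similar loadings on the leading components behave statistically
    # alike and can be clustered together for the downscaling post-analysis.
    print("component loadings shape:", pca.components_.shape)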
Topographic ERP analyses: a step-by-step tutorial review.
Murray, Micah M; Brunet, Denis; Michel, Christoph M
2008-06-01
In this tutorial review, we detail both the rationale for as well as the implementation of a set of analyses of surface-recorded event-related potentials (ERPs) that uses the reference-free spatial (i.e. topographic) information available from high-density electrode montages to render statistical information concerning modulations in response strength, latency, and topography both between and within experimental conditions. In these and other ways these topographic analysis methods allow the experimenter to glean additional information and neurophysiologic interpretability beyond what is available from canonical waveform analyses. In this tutorial we present the example of somatosensory evoked potentials (SEPs) in response to stimulation of each hand to illustrate these points. For each step of these analyses, we provide the reader with both a conceptual and mathematical description of how the analysis is carried out, what it yields, and how to interpret its statistical outcome. We show that these topographic analysis methods are intuitive and easy-to-use approaches that can remove much of the guesswork often confronting ERP researchers and also assist in identifying the information contained within high-density ERP datasets.
NASA Astrophysics Data System (ADS)
Vouterakos, P. A.; Moustris, K. P.; Bartzokas, A.; Ziomas, I. C.; Nastos, P. T.; Paliatsos, A. G.
2012-12-01
In this work, artificial neural networks (ANNs) were developed and applied in order to forecast the discomfort levels due to the combination of high temperature and air humidity during the hot season of the year, in eight different regions within the Greater Athens area (GAA), Greece. For the selection of the best type and architecture of the ANN forecasting models, the multiple criteria analysis (MCA) technique was applied. Three different types of ANNs were developed and tested with the MCA method. Concretely, multilayer perceptrons, generalized feed-forward networks (GFFN), and time-lag recurrent networks were developed and tested. Results showed that the best performance was achieved by the GFFN model for the prediction of discomfort levels due to high temperature and air humidity within the GAA. For the evaluation of the constructed ANNs, appropriate statistical indices were used. The analysis proved that the forecasting ability of the developed ANN models is very satisfactory at a statistically significant level of p < 0.01.
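As a loose illustration of a feed-forward ANN predicting thermal discomfort from temperature and humidity, the sketch below trains a small multilayer perceptron with scikit-learn; the synthetic inputs, the use of Thom's discomfort index purely as a toy target, and the network size are assumptions and do not reproduce the GFFN models of the study.

    import numpy as np
    from sklearn.neural_network import MLPRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    # Hypothetical hot-season air temperature (deg C) and relative humidity (%)
    T  = rng.uniform(25, 42, 2000)
    RH = rng.uniform(20, 90, 2000)
    DI = T - 0.55 * (1 - RH / 100.0) * (T - 14.5)   # Thom's discomfort index as a toy target

    X = np.column_stack([T, RH])
    X_tr, X_te, y_tr, y_te = train_test_split(X, DI, random_state=0)
    model = MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=2000,
                         random_state=0).fit(X_tr, y_tr)
    print("R^2 on held-out data:", round(model.score(X_te, y_te), 3))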
Traumatic injury among drywall installers, 1992 to 1995.
Chiou, S S; Pan, C S; Keane, P
2000-11-01
This study examined the traumatic-injury characteristics associated with one of the high-risk occupations in the construction industry--drywall installers--through an analysis of the traumatic-injury data obtained from the Bureau of Labor Statistics. An additional objective was to demonstrate a feasible and economic approach to identify risk factors associated with a specific occupation by using an existing database. An analysis of nonfatal traumatic injuries with days away from work among wage-and-salary drywall installers was performed for 1992 through 1995 using the Occupational Injury and Illness Survey conducted by the Bureau of Labor Statistics. Results from this study indicate that drywall installers are at a high risk of overexertion and falls to a lower level. More than 40% of the injured drywall installers suffered sprains, strains, and/or tears. The most frequently injured body part was the trunk. More than one-third of the trunk injuries occurred while handling solid building materials, mainly drywall. In addition, the database analysis used in this study is valid in identifying overall risk factors for specific occupations.
Wang, X; Hopkins, C
2016-10-01
Advanced Statistical Energy Analysis (ASEA) is used to predict vibration transmission across coupled beams which support multiple wave types up to high frequencies where Timoshenko theory is valid. Bending-longitudinal and bending-torsional models are considered for an L-junction and rectangular beam frame. Comparisons are made with measurements, Finite Element Methods (FEM) and Statistical Energy Analysis (SEA). When beams support at least two local modes for each wave type in a frequency band and the modal overlap factor is at least 0.1, measurements and FEM have relatively smooth curves. Agreement between measurements, FEM, and ASEA demonstrates that ASEA is able to predict high propagation losses which are not accounted for with SEA. These propagation losses tend to become more important at high frequencies with relatively high internal loss factors and can occur when there is more than one wave type. At such high frequencies, Timoshenko theory, rather than Euler-Bernoulli theory, is often required. Timoshenko theory is incorporated in ASEA and SEA using wave theory transmission coefficients derived assuming Euler-Bernoulli theory, but using Timoshenko group velocity when calculating coupling loss factors. The changeover between theories is appropriate above the frequency where there is a 26% difference between Euler-Bernoulli and Timoshenko group velocities.
NASA Technical Reports Server (NTRS)
1980-01-01
MATHPAC image-analysis library is a collection of general-purpose mathematical and statistical routines and special-purpose data-analysis and pattern-recognition routines for image analysis. The MATHPAC library consists of Linear Algebra, Optimization, Statistical-Summary, Densities and Distribution, Regression, and Statistical-Test packages.
Comparing Visual and Statistical Analysis of Multiple Baseline Design Graphs.
Wolfe, Katie; Dickenson, Tammiee S; Miller, Bridget; McGrath, Kathleen V
2018-04-01
A growing number of statistical analyses are being developed for single-case research. One important factor in evaluating these methods is the extent to which each corresponds to visual analysis. Few studies have compared statistical and visual analysis, and information about more recently developed statistics is scarce. Therefore, our purpose was to evaluate the agreement between visual analysis and four statistical analyses: improvement rate difference (IRD); Tau-U; Hedges, Pustejovsky, Shadish (HPS) effect size; and between-case standardized mean difference (BC-SMD). Results indicate that IRD and BC-SMD had the strongest overall agreement with visual analysis. Although Tau-U had strong agreement with visual analysis on raw values, it had poorer agreement when those values were dichotomized to represent the presence or absence of a functional relation. Overall, visual analysis appeared to be more conservative than statistical analysis, but further research is needed to evaluate the nature of these disagreements.
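As a concrete example of one of the metrics compared above, the sketch below computes a simplified improvement rate difference (IRD) for one baseline/intervention pair, counting an intervention point as 'improved' when it exceeds every baseline point; this shortcut matches the formal IRD only when overlap is minimal, and the data are invented.

    import numpy as np

    def simple_ird(baseline, treatment, increase_is_improvement=True):
        """Simplified IRD: improvement rate in treatment minus improvement rate in baseline."""
        b, t = np.asarray(baseline, float), np.asarray(treatment, float)
        if not increase_is_improvement:
            b, t = -b, -t
        ir_treatment = np.mean(t > b.max())   # treatment points above all baseline points
        ir_baseline = np.mean(b > t.max())    # baseline points above all treatment points
        return ir_treatment - ir_baseline

    baseline = [2, 3, 2, 4, 3]
    treatment = [5, 6, 7, 6, 8, 7]
    print("IRD =", simple_ird(baseline, treatment))  # 1.0 for fully non-overlapping phases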
Liem, Franziskus; Mérillat, Susan; Bezzola, Ladina; Hirsiger, Sarah; Philipp, Michel; Madhyastha, Tara; Jäncke, Lutz
2015-03-01
FreeSurfer is a tool to quantify cortical and subcortical brain anatomy automatically and noninvasively. Previous studies have reported reliability and statistical power analyses in relatively small samples or only selected one aspect of brain anatomy. Here, we investigated reliability and statistical power of cortical thickness, surface area, volume, and the volume of subcortical structures in a large sample (N=189) of healthy elderly subjects (64+ years). Reliability (intraclass correlation coefficient) of cortical and subcortical parameters is generally high (cortical: ICCs>0.87, subcortical: ICCs>0.95). Surface-based smoothing increases reliability of cortical thickness maps, while it decreases reliability of cortical surface area and volume. Nevertheless, statistical power of all measures benefits from smoothing. When aiming to detect a 10% difference between groups, the number of subjects required to test effects with sufficient power over the entire cortex varies between cortical measures (cortical thickness: N=39, surface area: N=21, volume: N=81; 10mm smoothing, power=0.8, α=0.05). For subcortical regions this number is between 16 and 76 subjects, depending on the region. We also demonstrate the advantage of within-subject designs over between-subject designs. Furthermore, we publicly provide a tool that allows researchers to perform a priori power analysis and sensitivity analysis to help evaluate previously published studies and to design future studies with sufficient statistical power. Copyright © 2014 Elsevier Inc. All rights reserved.
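The a priori power calculation described above can be sketched with statsmodels; the mean, standard deviation, and the two-group design are invented numbers for illustration and are not the FreeSurfer values reported by the authors.

    from statsmodels.stats.power import TTestIndPower

    # Hypothetical goal: detect a 10% between-group difference in mean cortical thickness
    mean_thickness, sd_thickness = 2.5, 0.15               # mm, assumed values
    effect_size = (0.10 * mean_thickness) / sd_thickness   # Cohen's d

    n_per_group = TTestIndPower().solve_power(effect_size=effect_size,
                                              power=0.8, alpha=0.05)
    print(f"Cohen's d = {effect_size:.2f}, required n per group = {n_per_group:.1f}")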
Harkness, Mark; Fisher, Angela; Lee, Michael D; Mack, E Erin; Payne, Jo Ann; Dworatzek, Sandra; Roberts, Jeff; Acheson, Carolyn; Herrmann, Ronald; Possolo, Antonio
2012-04-01
A large, multi-laboratory microcosm study was performed to select amendments for supporting reductive dechlorination of high levels of trichloroethylene (TCE) found at an industrial site in the United Kingdom (UK) containing dense non-aqueous phase liquid (DNAPL) TCE. The study was designed as a fractional factorial experiment involving 177 bottles distributed between four industrial laboratories and was used to assess the impact of six electron donors, bioaugmentation, addition of supplemental nutrients, and two TCE levels (0.57 and 1.90 mM or 75 and 250 mg/L in the aqueous phase) on TCE dechlorination. Performance was assessed based on the concentration changes of TCE and reductive dechlorination degradation products. The chemical data was evaluated using analysis of variance (ANOVA) and survival analysis techniques to determine both main effects and important interactions for all the experimental variables during the 203-day study. The statistically based design and analysis provided powerful tools that aided decision-making for field application of this technology. The analysis showed that emulsified vegetable oil (EVO), lactate, and methanol were the most effective electron donors, promoting rapid and complete dechlorination of TCE to ethene. Bioaugmentation and nutrient addition also had a statistically significant positive impact on TCE dechlorination. In addition, the microbial community was measured using phospholipid fatty acid analysis (PLFA) for quantification of total biomass and characterization of the community structure and quantitative polymerase chain reaction (qPCR) for enumeration of Dehalococcoides organisms (Dhc) and the vinyl chloride reductase (vcrA) gene. The highest increase in levels of total biomass and Dhc was observed in the EVO microcosms, which correlated well with the dechlorination results. Copyright © 2012 Elsevier B.V. All rights reserved.
Grantz, Erin; Haggard, Brian; Scott, J Thad
2018-06-12
We calculated four median datasets for each of chlorophyll a (Chl a), total phosphorus (TP), and transparency using multiple approaches to handling censored observations, including substituting fractions of the quantification limit (QL; dataset 1 = 1QL, dataset 2 = 0.5QL) and statistical methods for censored datasets (datasets 3-4), for approximately 100 Texas, USA reservoirs. Trend analyses of differences between dataset 1 and 3 medians indicated that percent difference increased linearly above thresholds in percent censored data (%Cen). This relationship was extrapolated to estimate medians for site-parameter combinations with %Cen > 80%, which were combined with dataset 3 as dataset 4. Changepoint analysis of Chl a- and transparency-TP relationships indicated threshold differences up to 50% between datasets. Recursive analysis identified secondary thresholds in dataset 4. Threshold differences show that information introduced via substitution, or missing due to limitations of the statistical methods, biased values, underestimated error, and inflated the strength of TP thresholds identified in datasets 1-3. Analysis of covariance identified differences in linear regression models relating transparency to TP between datasets 1, 2, and the more statistically robust datasets 3-4. Study findings identify high-risk scenarios for biased analytical outcomes when using substitution. These include a high probability of median overestimation when %Cen > 50-60% for a single QL, or when %Cen is as low as 16% for multiple QLs. Changepoint analysis was uniquely vulnerable to substitution effects when using medians from sites with %Cen > 50%. Linear regression analysis was less sensitive to substitution and missing-data effects, but differences in model parameters for transparency cannot be discounted and could be magnified by log-transformation of the variables.
Giménez-Forcada, Elena; Vega-Alegre, Marisol; Timón-Sánchez, Susana
2017-09-01
Naturally occurring arsenic in groundwater exceeding the limit for potability has been reported along the southern edge of the Cenozoic Duero Basin (CDB) near its contact with the Spanish Central System (SCS). In this area, spatial variability of arsenic is high, peaking at 241 μg/L. Forty-seven percent of samples collected contained arsenic above the maximum allowable concentration for drinking water (10 μg/L). Correlations of As with other hydrochemical variables were investigated using multivariate statistical analysis (Hierarchical Cluster Analysis, HCA, and Principal Component Analysis, PCA). It was found that As, V, Cr, and pH are closely related and that there were also close correlations with temperature and Na+. The highest concentrations of arsenic and other associated Potentially Toxic Geogenic Trace Elements (PTGTE) are linked to alkaline NaHCO3 waters (pH ≈ 9), moderately oxic conditions, and temperatures of around 18-19 °C. The most plausible hypothesis to explain the high arsenic concentrations is the contribution of deeper regional flows with a significant hydrothermal component (cold-hydrothermal waters) flowing through faults in the basement rock. Water mixing and water-rock interactions occur both in the fissured aquifer media (igneous and metasedimentary bedrock) and in the sedimentary environment of the CDB, where agricultural pollution phenomena are also active. A combination of multivariate statistical tools and hydrochemical analysis enabled the distribution pattern of dissolved As and other PTGTE in groundwaters in the study area to be interpreted, and their most likely origin to be established. This methodology could be applied to other sedimentary areas with similar characteristics and problems. Copyright © 2017 Elsevier B.V. All rights reserved.
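A minimal sketch of the hierarchical cluster analysis step on standardized hydrochemical variables, using SciPy's agglomerative clustering; the variables, sample values, and choice of Ward linkage are assumptions and do not reproduce the study's dataset.

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.stats import zscore

    rng = np.random.default_rng(3)
    # Hypothetical samples x variables matrix: As, V, Cr, pH, temperature, Na
    X = rng.normal(size=(40, 6))
    X[:15, :] += 2.0                      # a group of high-As, alkaline samples
    Xz = zscore(X, axis=0)                # standardize variables before clustering

    Z = linkage(Xz, method="ward")        # Ward's agglomerative clustering
    groups = fcluster(Z, t=2, criterion="maxclust")
    print("cluster sizes:", np.bincount(groups)[1:])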
MacBride-Stewart, Sean; Marwick, Charis; Houston, Neil; Watt, Iain; Patton, Andrea; Guthrie, Bruce
2017-01-01
Background: It is uncertain whether improvements in primary care high-risk prescribing seen in research trials can be realised in the real-world setting. Aim: To evaluate the impact of a 1-year system-wide phase IV prescribing safety improvement initiative, which included education, feedback, support to identify patients to review, and small financial incentives. Design and setting: An interrupted time series analysis of targeted high-risk prescribing in all 56 general practices in NHS Forth Valley, Scotland, was performed. In 2013–2014, this focused on high-risk non-steroidal anti-inflammatory drugs (NSAIDs) in older people and NSAIDs with oral anticoagulants; in 2014–2015, it focused on antipsychotics in older people. Method: The primary analysis used segmented regression analysis to estimate impact at the end of the intervention, and 12 months later. The secondary analysis used difference-in-difference methods to compare Forth Valley changes with those in NHS Greater Glasgow and Clyde (GGC). Results: In the primary analysis, downward trends for all three NSAID measures that were existent before the intervention statistically significantly steepened following implementation of the intervention. At the end of the intervention period, 1221 fewer patients than expected were prescribed a high-risk NSAID. In contrast, antipsychotic prescribing in older people increased slowly over time, with no intervention-associated change. In the secondary analysis, reductions at the end of the intervention period in all three NSAID measures were statistically significantly greater in NHS Forth Valley than in NHS GGC, but only significantly greater for two of these measures 12 months after the intervention finished. Conclusion: There were substantial and sustained reductions in the high-risk prescribing of NSAIDs, although with some waning of effect 12 months after the intervention ceased. The same intervention had no effect on antipsychotic prescribing in older people. PMID:28347986
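A minimal sketch of a segmented (interrupted time series) regression of the kind described, with an immediate level-change term and a slope-change term starting at the intervention; the monthly counts and the statsmodels formula are illustrative assumptions, not the study's data or model.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(4)
    months = np.arange(36)                        # 24 pre-intervention + 12 intervention months
    post = (months >= 24).astype(int)             # level-change indicator
    t_after = np.clip(months - 24, 0, None)       # slope-change term
    # Hypothetical monthly count of patients prescribed a high-risk NSAID
    y = 500 - 2 * months - 15 * post - 5 * t_after + rng.normal(0, 8, months.size)

    df = pd.DataFrame({"y": y, "t": months, "post": post, "t_after": t_after})
    model = smf.ols("y ~ t + post + t_after", data=df).fit()
    print(model.params)   # pre-intervention trend, immediate level change, slope change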
Fan, Yannan; Siklenka, Keith; Arora, Simran K.; Ribeiro, Paula; Kimmins, Sarah; Xia, Jianguo
2016-01-01
MicroRNAs (miRNAs) can regulate nearly all biological processes and their dysregulation is implicated in various complex diseases and pathological conditions. Recent years have seen a growing number of functional studies of miRNAs using high-throughput experimental technologies, which have produced a large amount of high-quality data regarding miRNA target genes and their interactions with small molecules, long non-coding RNAs, epigenetic modifiers, disease associations, etc. These rich sets of information have enabled the creation of comprehensive networks linking miRNAs with various biologically important entities to shed light on their collective functions and regulatory mechanisms. Here, we introduce miRNet, an easy-to-use web-based tool that offers statistical, visual and network-based approaches to help researchers understand miRNAs functions and regulatory mechanisms. The key features of miRNet include: (i) a comprehensive knowledge base integrating high-quality miRNA-target interaction data from 11 databases; (ii) support for differential expression analysis of data from microarray, RNA-seq and quantitative PCR; (iii) implementation of a flexible interface for data filtering, refinement and customization during network creation; (iv) a powerful fully featured network visualization system coupled with enrichment analysis. miRNet offers a comprehensive tool suite to enable statistical analysis and functional interpretation of various data generated from current miRNA studies. miRNet is freely available at http://www.mirnet.ca. PMID:27105848
Geostatistical applications in environmental remediation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stewart, R.N.; Purucker, S.T.; Lyon, B.F.
1995-02-01
Geostatistical analysis refers to a collection of statistical methods for addressing data that vary in space. By incorporating spatial information into the analysis, geostatistics has advantages over traditional statistical analysis for problems with a spatial context. Geostatistics has a history of success in earth science applications, and its popularity is increasing in other areas, including environmental remediation. Due to recent advances in computer technology, geostatistical algorithms can be executed at a speed comparable to many standard statistical software packages. When used responsibly, geostatistics is a systematic and defensible tool that can be used in various decision frameworks, such as the Data Quality Objectives (DQO) process. At every point in the site, geostatistics can estimate both the concentration level and the probability or risk of exceeding a given value. Using these probability maps can assist in identifying clean-up zones. Given any decision threshold and an acceptable level of risk, the probability maps identify those areas that are estimated to be above or below the acceptable risk. Those areas that are above the threshold are of the most concern with regard to remediation. In addition to estimating clean-up zones, geostatistics can assist in designing cost-effective secondary sampling schemes. Those areas of the probability map with high levels of estimated uncertainty are areas where more secondary sampling should occur. In addition, geostatistics has the ability to incorporate soft data directly into the analysis. These data include historical records, a highly correlated secondary contaminant, or expert judgment. In environmental remediation, geostatistics is a tool that, in conjunction with other methods, can provide a common forum for building consensus.
A note on generalized Genome Scan Meta-Analysis statistics
Koziol, James A; Feng, Anne C
2005-01-01
Background: Wise et al. introduced a rank-based statistical technique for meta-analysis of genome scans, the Genome Scan Meta-Analysis (GSMA) method. Levinson et al. recently described two generalizations of the GSMA statistic: (i) a weighted version of the GSMA statistic, so that different studies could be ascribed different weights for analysis; and (ii) an order statistic approach, reflecting the fact that a GSMA statistic can be computed for each chromosomal region or bin width across the various genome scan studies. Results: We provide an Edgeworth approximation to the null distribution of the weighted GSMA statistic, and we examine the limiting distribution of the GSMA statistics under the order statistic formulation and quantify the relevance of the pairwise correlations of the GSMA statistics across different bins on this limiting distribution. We also remark on aggregate criteria and multiple testing for determining significance of GSMA results. Conclusion: Theoretical considerations detailed herein can lead to clarification and simplification of testing criteria for generalizations of the GSMA statistic. PMID:15717930
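To make the rank-based construction concrete, the sketch below computes a weighted summed rank per bin across studies and a permutation p-value for the top-ranked bin; the number of studies, bins, the weights, and the permutation approach are illustrative assumptions, not the Edgeworth approximation derived in the paper.

    import numpy as np

    rng = np.random.default_rng(5)
    n_studies, n_bins = 8, 120
    weights = rng.uniform(0.5, 1.5, n_studies)    # per-study weights

    # Hypothetical within-study ranks of bins (1 = weakest evidence, n_bins = strongest)
    ranks = np.array([rng.permutation(n_bins) + 1 for _ in range(n_studies)])
    gsma = weights @ ranks                        # weighted summed rank for each bin
    observed = gsma.max()

    # Null distribution of the maximum statistic: permute ranks within each study
    n_perm, exceed = 2000, 0
    for _ in range(n_perm):
        perm = np.array([rng.permutation(n_bins) + 1 for _ in range(n_studies)])
        exceed += (weights @ perm).max() >= observed
    print("permutation p-value for the top bin:", exceed / n_perm)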
NASA Astrophysics Data System (ADS)
Ngan Nguyen, Thi To; Liu, Cheng-Chien
2013-04-01
Researchers have long asked how landslides occur and which factors trigger and accelerate them, and many investigations around the world have sought methods to predict and prevent landslide damage. The Chen-Yu-Lan River watershed is regarded as a hot spot of landslide research in Taiwan because of its complicated geological structure, with significant tectonic fault systems and steep mountainous terrain. Besides high annual precipitation concentration and abrupt slopes, natural disasters such as typhoons (Sinlaku in 2008, Kalmaegi in 2008, and Morakot in 2009) and the Chi-Chi earthquake in 1999 have also triggered landslides with serious damage in this area. This research presents quantitative approaches to generate a landslide susceptibility map for the Chen-Yu-Lan watershed, a mountainous area in central Taiwan. Landslide inventory data, detected from Formosat-2 imagery over the eight years from 2004 to 2011, were used to carry out landslide susceptibility mapping. Bivariate and multivariate statistical analyses were applied to calculate landslide susceptibility indices, with the weights of the parameters computed from the 2004-2011 landslide data. To validate the influence of each factor on landslide occurrence, several multivariate models were built and their results compared with observed landslides. In addition, the historical landslide data were used to assess and classify landslide susceptibility levels, and the relation between susceptibility levels and landslide recurrence was established from the long-term record. The results demonstrated the differing influence of potential factors, such as slope gradient, drainage density, lithology, and land use, on landslide occurrence, and showed a logical relationship between the weights and the characteristics of the factor classes. These results can help planners localize high-risk landslide areas, or areas safe for building and other human activities.
Pendse, Salil N; Maertens, Alexandra; Rosenberg, Michael; Roy, Dipanwita; Fasani, Rick A; Vantangoli, Marguerite M; Madnick, Samantha J; Boekelheide, Kim; Fornace, Albert J; Odwin, Shelly-Ann; Yager, James D; Hartung, Thomas; Andersen, Melvin E; McMullen, Patrick D
2017-04-01
The twenty-first century vision for toxicology involves a transition away from high-dose animal studies to in vitro and computational models (NRC in Toxicity testing in the 21st century: a vision and a strategy, The National Academies Press, Washington, DC, 2007). This transition requires mapping pathways of toxicity by understanding how in vitro systems respond to chemical perturbation. Uncovering transcription factors/signaling networks responsible for gene expression patterns is essential for defining pathways of toxicity, and ultimately, for determining the chemical modes of action through which a toxicant acts. Traditionally, transcription factor identification is achieved via chromatin immunoprecipitation studies and summarized by calculating which transcription factors are statistically associated with up- and downregulated genes. These lists are commonly determined via statistical or fold-change cutoffs, a procedure that is sensitive to statistical power and may not be as useful for determining transcription factor associations. To move away from an arbitrary statistical or fold-change-based cutoff, we developed, in the context of the Mapping the Human Toxome project, an enrichment paradigm called information-dependent enrichment analysis (IDEA) to guide identification of the transcription factor network. We used a test case of activation in MCF-7 cells by 17β estradiol (E2). Using this new approach, we established a time course for transcriptional and functional responses to E2. ERα and ERβ were associated with short-term transcriptional changes in response to E2. Sustained exposure led to recruitment of additional transcription factors and alteration of cell cycle machinery. TFAP2C and SOX2 were the transcription factors most highly correlated with dose. E2F7, E2F1, and Foxm1, which are involved in cell proliferation, were enriched only at 24 h. IDEA should be useful for identifying candidate pathways of toxicity. IDEA outperforms gene set enrichment analysis (GSEA) and provides similar results to weighted gene correlation network analysis, a platform that helps to identify genes not annotated to pathways.
Is fertility falling in Zimbabwe?
Udjo, E O
1996-01-01
With a contraceptive prevalence rate of 43% among currently married women, unequalled in sub-Saharan Africa, the Central Statistical Office (1989) observed that fertility in Zimbabwe has declined sharply in recent years. Using data from several surveys on Zimbabwe, especially the birth histories of the Zimbabwe Demographic and Health Survey, this study examines fertility trends in Zimbabwe. The results show that the fertility decline in Zimbabwe is modest and that the decline is concentrated among high order births. Multivariate analysis did not show a statistically significant effect of contraception on fertility, partly because a high proportion of Zimbabwean women in the reproductive age group never use contraception due to prevailing pronatalist attitudes in the country.
Collaborative classification of hyperspectral and visible images with convolutional neural network
NASA Astrophysics Data System (ADS)
Zhang, Mengmeng; Li, Wei; Du, Qian
2017-10-01
Recent advances in remote sensing technology have made multisensor data available for the same area, and it is well-known that remote sensing data processing and analysis often benefit from multisource data fusion. Specifically, low spatial resolution of hyperspectral imagery (HSI) degrades the quality of the subsequent classification task while using visible (VIS) images with high spatial resolution enables high-fidelity spatial analysis. A collaborative classification framework is proposed to fuse HSI and VIS images for finer classification. First, the convolutional neural network model is employed to extract deep spectral features for HSI classification. Second, effective binarized statistical image features are learned as contextual basis vectors for the high-resolution VIS image, followed by a classifier. The proposed approach employs diversified data in a decision fusion, leading to an integration of the rich spectral information, spatial information, and statistical representation information. In particular, the proposed approach eliminates the potential problems of the curse of dimensionality and excessive computation time. The experiments evaluated on two standard data sets demonstrate better classification performance offered by this framework.
NASA Astrophysics Data System (ADS)
Shan, HuanYuan; Liu, Xiangkun; Hildebrandt, Hendrik; Pan, Chuzhong; Martinet, Nicolas; Fan, Zuhui; Schneider, Peter; Asgari, Marika; Harnois-Déraps, Joachim; Hoekstra, Henk; Wright, Angus; Dietrich, Jörg P.; Erben, Thomas; Getman, Fedor; Grado, Aniello; Heymans, Catherine; Klaes, Dominik; Kuijken, Konrad; Merten, Julian; Puddu, Emanuella; Radovich, Mario; Wang, Qiao
2018-02-01
This paper is the first of a series of papers constraining cosmological parameters with weak lensing peak statistics using ~450 deg² of imaging data from the Kilo Degree Survey (KiDS-450). We measure high signal-to-noise ratio (SNR: ν) weak lensing convergence peaks in the range 3 < ν < 5, and employ theoretical models to derive expected values. These models are validated using a suite of simulations. We take into account two major systematic effects, the boost factor and the effect of baryons on the mass-concentration relation of dark matter haloes. In addition, we investigate the impacts of other potential astrophysical systematics, including the projection effects of large-scale structures, intrinsic galaxy alignments, as well as residual measurement uncertainties in the shear and redshift calibration. Assuming a flat Λ cold dark matter model, we find constraints of S8 = σ8(Ωm/0.3)^0.5 = 0.746 (+0.046, -0.107) along the degeneracy direction of the cosmic shear analysis and Σ8 = σ8(Ωm/0.3)^0.38 = 0.696 (+0.048, -0.050) along the derived degeneracy direction of our high-SNR peak statistics. The difference between the power indices of S8 and Σ8 indicates that combining cosmic shear with peak statistics has the potential to break the degeneracy between σ8 and Ωm. Our results are consistent with the cosmic shear tomographic correlation analysis of the same data set and are ~2σ lower than the Planck 2016 results.
Lim, Kyungjae; Kwon, Heejin; Cho, Jinhan; Oh, Jongyoung; Yoon, Seongkuk; Kang, Myungjin; Ha, Dongho; Lee, Jinhwa; Kang, Eunju
2015-01-01
The purpose of this study was to assess the image quality of a novel advanced iterative reconstruction (IR) method called "adaptive statistical IR V" (ASIR-V) by comparing its image noise, contrast-to-noise ratio (CNR), and spatial resolution with those of filtered back projection (FBP) and adaptive statistical IR (ASIR) on computed tomography (CT) phantom images. We performed CT scans at 5 different tube currents (50, 70, 100, 150, and 200 mA) using 3 types of CT phantoms. Scanned images were subsequently reconstructed with 7 different scan settings: FBP, and 3 levels each of ASIR and ASIR-V (30%, 50%, and 70%). The image noise was measured in the first study using a body phantom. The CNR was measured in the second study using a contrast phantom, and the spatial resolutions were measured in the third study using a high-resolution phantom. We compared the image noise, CNR, and spatial resolution among the 7 reconstruction settings to determine whether noise reduction, high CNR, and high spatial resolution could be achieved with ASIR-V. In the quantitative analysis of the first and second studies, the images reconstructed using ASIR-V had reduced image noise and improved CNR compared with those of FBP and ASIR (P < 0.001). In the qualitative analysis of the third study, the images reconstructed using ASIR-V also had significantly improved spatial resolution compared with those of FBP and ASIR (P < 0.001). Our phantom studies showed that ASIR-V provides a significant reduction in image noise and a significant improvement in CNR as well as spatial resolution. Therefore, this technique has the potential to reduce the radiation dose further without compromising image quality.
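The noise and CNR measurements described above reduce to simple region-of-interest statistics on a reconstructed image array, as in the sketch below; the synthetic phantom image and ROI positions are assumptions for illustration.

    import numpy as np

    rng = np.random.default_rng(6)
    image = rng.normal(40, 12, size=(256, 256))      # uniform background (HU) with noise
    image[100:140, 100:140] += 60                    # a contrast insert

    roi_bg = image[20:60, 20:60]                     # background ROI
    roi_obj = image[105:135, 105:135]                # ROI inside the insert

    noise = roi_bg.std()                             # image noise = SD in a uniform ROI
    cnr = (roi_obj.mean() - roi_bg.mean()) / noise   # contrast-to-noise ratio
    print(f"noise = {noise:.1f} HU, CNR = {cnr:.2f}")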
An Analysis of High School Students' Performance on Five Integrated Science Process Skills
NASA Astrophysics Data System (ADS)
Beaumont-Walters, Yvonne; Soyibo, Kola
2001-02-01
This study determined Jamaican high school students' level of performance on five integrated science process skills and whether there were statistically significant differences in their performance linked to their gender, grade level, school location, school type, student type, and socio-economic background (SEB). The 305 subjects comprised 133 males, 172 females, 146 ninth graders, 159 10th graders, 150 traditional and 155 comprehensive high school students, 164 students from the Reform of Secondary Education (ROSE) project and 141 non-ROSE students, 166 urban and 139 rural students, and 110 students from a high SEB and 195 from a low SEB. Data were collected with the authors' constructed integrated science process skills test. The results indicated that the subjects' mean score was low and unsatisfactory; their performance in decreasing order was: interpreting data, recording data, generalising, formulating hypotheses, and identifying variables. There were statistically significant differences in their performance based on their grade level, school type, student type, and SEB, in favour of the 10th graders, traditional high school students, ROSE students, and students from a high SEB. There was a positive, statistically significant, and fairly strong relationship between their performance and school type, but weak relationships among their student type, grade level, and SEB and performance.
NASA Astrophysics Data System (ADS)
Wang, Dong
2016-03-01
Gears are the most commonly used components in mechanical transmission systems. Their failures may cause transmission system breakdown and result in economic loss. Identification of different gear crack levels is important to prevent any unexpected gear failure, because gear cracks lead to gear tooth breakage. Signal processing based methods mainly require expertise to explain gear fault signatures, which is usually not easy for ordinary users. In order to automatically identify different gear crack levels, intelligent gear crack identification methods should be developed. Previous case studies experimentally proved that K-nearest neighbors based methods exhibit high prediction accuracies for identification of 3 different gear crack levels under different motor speeds and loads. In this short communication, to further enhance prediction accuracies of existing K-nearest neighbors based methods and extend identification of 3 different gear crack levels to identification of 5 different gear crack levels, redundant statistical features are constructed by using the Daubechies 44 (db44) binary wavelet packet transform at different wavelet decomposition levels, prior to the use of a K-nearest neighbors method. The dimensionality of the redundant statistical features is 620, which provides richer gear fault signatures. Since many of these statistical features are redundant and highly correlated with each other, dimensionality reduction of the redundant statistical features is conducted to obtain new significant statistical features. Finally, the K-nearest neighbors method is used to identify 5 different gear crack levels under different motor speeds and loads. A case study including 3 experiments is investigated to demonstrate that the developed method provides higher prediction accuracies than the existing K-nearest neighbors based methods for recognizing different gear crack levels under different motor speeds and loads. Based on the new significant statistical features, some other popular statistical models, including linear discriminant analysis, quadratic discriminant analysis, classification and regression trees and the naive Bayes classifier, are compared with the developed method. The results show that the developed method has the highest prediction accuracies among these statistical models. Additionally, selection of the number of new significant features and parameter selection of K-nearest neighbors are thoroughly investigated.
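The pipeline described above (statistical features from wavelet packet coefficients, dimensionality reduction, then K-nearest neighbors) can be sketched with standard tools. The following is a minimal illustration, not the authors' implementation: the wavelet-packet feature extraction is assumed to have been done elsewhere, the file names are hypothetical, and PCA with 20 components and k = 5 are arbitrary placeholder choices.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# X: (n_samples, 620) redundant statistical features computed from wavelet packet
# coefficients; y: gear crack level labels (0-4). File names are hypothetical.
X = np.load("gear_features.npy")
y = np.load("gear_labels.npy")

# Standardize, reduce the correlated 620-dimensional feature set, then classify with KNN.
model = make_pipeline(StandardScaler(),
                      PCA(n_components=20),
                      KNeighborsClassifier(n_neighbors=5))
scores = cross_val_score(model, X, y, cv=5)
print("cross-validated accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```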
Meta-analysis and The Cochrane Collaboration: 20 years of the Cochrane Statistical Methods Group
2013-01-01
The Statistical Methods Group has played a pivotal role in The Cochrane Collaboration over the past 20 years. The Statistical Methods Group has determined the direction of statistical methods used within Cochrane reviews, developed guidance for these methods, provided training, and continued to discuss and consider new and controversial issues in meta-analysis. The contribution of Statistical Methods Group members to the meta-analysis literature has been extensive and has helped to shape the wider meta-analysis landscape. In this paper, marking the 20th anniversary of The Cochrane Collaboration, we reflect on the history of the Statistical Methods Group, beginning in 1993 with the identification of aspects of statistical synthesis for which consensus was lacking about the best approach. We highlight some landmark methodological developments that Statistical Methods Group members have contributed to in the field of meta-analysis. We discuss how the Group implements and disseminates statistical methods within The Cochrane Collaboration. Finally, we consider the importance of robust statistical methodology for Cochrane systematic reviews, note research gaps, and reflect on the challenges that the Statistical Methods Group faces in its future direction. PMID:24280020
Title I ESEA, High School; English as a Second Language: 1979-1980. OEE Evaluation Report.
ERIC Educational Resources Information Center
New York City Board of Education, Brooklyn, NY. Office of Educational Evaluation.
The report is an evaluation of the 1979-80 High School Title I English as a Second Language Program. Two types of information are presented: (1) a narrative description of the program which provides qualitative data regarding the program, and (2) a statistical analysis of test results which consists of quantitative, city-wide data. By integrating…
ERIC Educational Resources Information Center
Cahalan, Margaret W.; Ingels, Steven J.; Burns, Laura J.; Planty, Michael; Daniel, Bruce
2006-01-01
This report presents information on similarities and differences between U.S. high school sophomores as studied at three points in time over the past 22 years, with a focus on cohort demographics, academic programs and performance, extracurricular activities, life values, and educational/occupational aspirations. It provides an update to the…
ERIC Educational Resources Information Center
Strolin-Goltzman, Jessica
2008-01-01
This comparison study analyzes the commonalities, similarities, and differences in supervisory and organizational factors between a group of high turnover systems and a group of low turnover systems. Significant differences on organizational factors, but not on supervisory factors, emerged from the statistical analysis. Additionally, this study…
High School and Beyond. 1980 Senior Cohort. First Follow-Up (1982). [machine-readable data file].
ERIC Educational Resources Information Center
National Center for Education Statistics (ED), Washington, DC.
The High School and Beyond 1980 Senior Cohort First Follow-Up (1982) Data File is presented. The First Follow-Up Senior Cohort data tape consists of four related data files: (1) the student data file (including data availability flags, weights, questionnaire data, and composite variables); (2) Statistical Analysis System (SAS) control cards for…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mahfuz, H.; Maniruzzaman, M.; Vaidya, U.
1997-04-01
Monotonic tensile and fatigue response of continuous silicon carbide fiber reinforced silicon nitride (SiC_f/Si_3N_4) composites has been investigated. The monotonic tensile tests have been performed at room and elevated temperatures. Fatigue tests have been conducted at room temperature (RT), at a stress ratio R = 0.1 and a frequency of 5 Hz. It is observed during the monotonic tests that the composites retain only 30% of their room-temperature strength at 1,600 C, suggesting a substantial chemical degradation of the matrix at that temperature. The softening of the matrix at elevated temperature also causes a reduction in tensile modulus, and the total reduction in modulus is around 45%. Fatigue data have been generated at three load levels, and the fatigue strength of the composite has been found to be considerably high, about 75% of its ultimate room-temperature strength. Extensive statistical analysis has been performed to understand the degree of scatter in the fatigue as well as in the static test data. Weibull shape factors and characteristic values have been determined for each set of tests, and their relationship with the response of the composites has been discussed. A statistical fatigue life prediction method developed from the Weibull distribution is also presented. A maximum likelihood estimator with censoring techniques and data pooling schemes has been employed to determine the distribution parameters for the statistical analysis. These parameters have been used to generate the S-N diagram with the desired level of reliability. Details of the statistical analysis and the discussion of the static and fatigue behavior of the composites are presented in this paper.
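As a concrete illustration of the Weibull/maximum-likelihood approach with censoring mentioned above, the sketch below fits a two-parameter Weibull distribution to fatigue lives where unfailed specimens (runouts) enter the likelihood through the survival function. It is a generic sketch with made-up cycle counts, not the authors' data or code.

```python
import numpy as np
from scipy.optimize import minimize

# cycles to failure (or to test suspension) and an event flag: 1 = failure, 0 = runout (censored)
cycles = np.array([1.2e5, 2.5e5, 3.1e5, 4.0e5, 7.8e5, 1.0e6, 1.0e6])  # illustrative values
event = np.array([1, 1, 1, 1, 1, 0, 0])

def neg_log_likelihood(params):
    """Weibull log-likelihood: failures contribute the density, runouts the survival function."""
    shape, scale = params
    if shape <= 0 or scale <= 0:
        return np.inf
    z = (cycles / scale) ** shape
    log_pdf = np.log(shape / scale) + (shape - 1) * np.log(cycles / scale) - z
    log_surv = -z
    return -np.sum(event * log_pdf + (1 - event) * log_surv)

result = minimize(neg_log_likelihood, x0=[1.5, np.median(cycles)], method="Nelder-Mead")
shape_hat, scale_hat = result.x
print("Weibull shape %.2f, characteristic life %.3g cycles" % (shape_hat, scale_hat))
```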
Performance of RVGui sensor and Kodak Ektaspeed Plus film for proximal caries detection.
Abreu, M; Mol, A; Ludlow, J B
2001-03-01
A high-resolution charge-coupled device was used to compare the diagnostic performances obtained with Trophy's new RVGui sensor and Kodak Ektaspeed Plus film with respect to caries detection. Three acquisition modes of the Trophy RVGui sensor were compared with Kodak Ektaspeed Plus film. Images of the proximal surfaces of 40 extracted posterior teeth were evaluated by 6 observers. The presence or absence of caries was scored by means of a 5-point confidence scale. The actual caries status of each surface was determined through ground-section histology. Responses were evaluated by means of receiver operating characteristic analysis. Areas under receiver operating characteristic curves (A(Z)) were assessed through analysis of variance. The mean A(Z) scores were 0.85 for film, 0.84 for the high-resolution caries mode, and 0.82 for both the low resolution caries mode and the high-resolution periodontal mode. These differences were not statistically significant (P =.70). The differences among observers also were not statistically significant (P =.23). The performance of the RVGui sensor in high- and low-resolution modes for proximal caries detection is comparable to that of Ektaspeed Plus film.
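For readers unfamiliar with the receiver operating characteristic (ROC) workflow used in the study, the snippet below shows how confidence ratings scored against a histological gold standard yield an area under the curve (an A_z analogue); the data here are synthetic stand-ins, not the study's ratings.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(0)
# histological caries status per surface (0 = sound, 1 = carious) and an observer's
# 5-point confidence rating; both are synthetic stand-ins for the study data
truth = rng.integers(0, 2, size=80)
rating = np.clip(np.round(2.5 + 1.2 * truth + rng.normal(0, 1.0, truth.size)), 1, 5)

auc = roc_auc_score(truth, rating)               # area under the ROC curve
fpr, tpr, thresholds = roc_curve(truth, rating)  # operating points across the rating scale
print("AUC = %.2f" % auc)
```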
Analysis of TCE Fate and Transport in Karst Groundwater Systems Using Statistical Mixed Models
NASA Astrophysics Data System (ADS)
Anaya, A. A.; Padilla, I. Y.
2012-12-01
Karst groundwater systems are highly productive and provide an important fresh water resource for human development and ecological integrity. Their high productivity is often associated with conduit flow and high matrix permeability. The same characteristics that make these aquifers productive also make them highly vulnerable to contamination and a likely route for contaminant exposure. Of particular interest are trichloroethylene (TCE) and di-(2-ethylhexyl) phthalate (DEHP). These chemicals have been identified as potential precursors of pre-term birth, a leading cause of neonatal complications with a significant health and societal cost. Both of these contaminants have been found in the karst groundwater formations in this area of the island. The general objectives of this work are to: (1) develop fundamental knowledge and determine the processes controlling the release, mobility, persistence, and possible pathways of contaminants in karst groundwater systems, and (2) characterize transport processes in conduit and diffusion-dominated flow under base flow and storm flow conditions. The work presented herein focuses on the use of geo-hydro statistical tools to characterize flow and transport processes under different flow regimes, and their application in the analysis of fate and transport of TCE. Multidimensional, laboratory-scale Geo-Hydrobed models (GHM) were used for this purpose. The models consist of stainless-steel tanks containing karstified limestone blocks collected from the karst aquifer formation of northern Puerto Rico. The models integrate a network of sampling wells to monitor flow, pressure, and solute concentrations temporally and spatially. Experimental work entails injecting dissolved CaCl2 tracers and TCE in the upstream boundary of the GHM while monitoring TCE and tracer concentrations spatially and temporally in the limestone under different groundwater flow regimes. Analysis of the temporal and spatial concentration distributions of solutes indicates a highly heterogeneous system resulting in large preferential flow components. The distributions are highly correlated with statistically-developed spatial flow models. A high degree of tailing in the breakthrough curves indicates significant mass transfer limitations, particularly in diffuse flow regions. Higher flow rates in the system result in increasing preferential flow region volumes, but lower mass transfer limitations. Future work will involve experiments with non-aqueous phase liquid TCE, DEHP, and a mixture of these, and geo-temporal statistical modeling. This work is supported by the U.S. Department of Energy, Savannah River (Grant Award No. DE-FG09-07SR22571), and the National Institute of Environmental Health Sciences (NIEHS, Grant Award No. P42ES017198).
Reliability analysis of single crystal NiAl turbine blades
NASA Technical Reports Server (NTRS)
Salem, Jonathan; Noebe, Ronald; Wheeler, Donald R.; Holland, Fred; Palko, Joseph; Duffy, Stephen; Wright, P. Kennard
1995-01-01
As part of a co-operative agreement with General Electric Aircraft Engines (GEAE), NASA LeRC is modifying and validating the Ceramic Analysis and Reliability Evaluation of Structures algorithm for use in design of components made of high strength NiAl based intermetallic materials. NiAl single crystal alloys are being actively investigated by GEAE as a replacement for Ni-based single crystal superalloys for use in high pressure turbine blades and vanes. The driving force for this research lies in the numerous property advantages offered by NiAl alloys over their superalloy counterparts. These include a reduction of density by as much as a third without significantly sacrificing strength, higher melting point, greater thermal conductivity, better oxidation resistance, and a better response to thermal barrier coatings. The current drawback to high strength NiAl single crystals is their limited ductility. Consequently, significant efforts including the work agreement with GEAE are underway to develop testing and design methodologies for these materials. The approach to validation and component analysis involves the following steps: determination of the statistical nature and source of fracture in a high strength, NiAl single crystal turbine blade material; measurement of the failure strength envelope of the material; coding of statistically based reliability models; verification of the code and model; and modeling of turbine blades and vanes for rig testing.
NASA Astrophysics Data System (ADS)
Martin, Lynn A.
The purpose of this study was to examine the relationship between teachers' self-reported preparedness for teaching science content and their instructional practices and the science achievement of eighth grade science students in the United States as demonstrated by TIMSS 2007. Six hundred eighty-seven eighth grade science teachers in the United States, representing 7,377 students, responded to the TIMSS 2007 questionnaire about their instructional preparedness and their instructional practices. Quantitative data were reported. Through correlation analysis, the researcher found that statistically significant positive relationships emerged between eighth grade science teachers' main area of study and their self-reported beliefs about their preparedness to teach that same content area. Another correlation analysis found that a statistically significant negative relationship existed between teachers' self-reported use of inquiry-based instruction and preparedness to teach chemistry, physics and earth science. A further correlation analysis found that a statistically significant positive relationship existed between physics preparedness and student science achievement. Finally, a correlation analysis found that a statistically significant positive relationship existed between science teachers' self-reported implementation of inquiry-based instructional practices and student achievement. The data findings support the conclusion that teachers who feel prepared to teach science content and implement more inquiry-based instruction and less didactic instruction produce high-achieving science students. As science teachers obtain the appropriate knowledge in science content and pedagogy, they will feel prepared and will implement inquiry-based instruction in science classrooms.
Representation of Probability Density Functions from Orbit Determination using the Particle Filter
NASA Technical Reports Server (NTRS)
Mashiku, Alinda K.; Garrison, James; Carpenter, J. Russell
2012-01-01
Statistical orbit determination enables us to obtain estimates of the state and the statistical information of its region of uncertainty. In order to obtain an accurate representation of the probability density function (PDF) that incorporates higher order statistical information, we propose the use of nonlinear estimation methods such as the Particle Filter. The Particle Filter (PF) is capable of providing a PDF representation of the state estimates whose accuracy is dependent on the number of particles or samples used. For this method to be applicable to real case scenarios, we need a way of accurately representing the PDF in a compressed manner with little information loss. Hence we propose using Independent Component Analysis (ICA) as a non-Gaussian dimensional reduction method that is capable of maintaining higher order statistical information obtained using the PF. Methods such as Principal Component Analysis (PCA) are based on utilizing up to second order statistics, and hence will not suffice in maintaining maximum information content. Both the PCA and the ICA are applied to two scenarios that involve a highly eccentric orbit with a lower a priori uncertainty covariance and a less eccentric orbit with a higher a priori uncertainty covariance, to illustrate the capability of the ICA in relation to the PCA.
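To make the PCA/ICA contrast concrete, the sketch below compresses a cloud of particle-filter samples with both methods and compares how much non-Gaussian structure (here, excess kurtosis) each compressed representation retains. It is an illustrative sketch on a heavy-tailed synthetic sample, not the authors' orbit-determination code.

```python
import numpy as np
from scipy.stats import kurtosis
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)
# synthetic stand-in for particle-filter samples: 5000 particles of a 6-dimensional state
# with heavy-tailed (non-Gaussian) uncertainty
particles = rng.standard_t(df=3, size=(5000, 6))

z_pca = PCA(n_components=3).fit_transform(particles)
z_ica = FastICA(n_components=3, random_state=0).fit_transform(particles)

# PCA only matches second-order statistics; ICA seeks maximally non-Gaussian components,
# so the compressed ICA coordinates tend to preserve more higher-order structure.
print("excess kurtosis, PCA components:", np.round(kurtosis(z_pca), 2))
print("excess kurtosis, ICA components:", np.round(kurtosis(z_ica), 2))
```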
Low statistical power in biomedical science: a review of three human research domains.
Dumas-Mallet, Estelle; Button, Katherine S; Boraud, Thomas; Gonon, Francois; Munafò, Marcus R
2017-02-01
Studies with low statistical power increase the likelihood that a statistically significant finding represents a false positive result. We conducted a review of meta-analyses of studies investigating the association of biological, environmental or cognitive parameters with neurological, psychiatric and somatic diseases, excluding treatment studies, in order to estimate the average statistical power across these domains. Taking the effect size indicated by a meta-analysis as the best estimate of the likely true effect size, and assuming a threshold for declaring statistical significance of 5%, we found that approximately 50% of studies have statistical power in the 0-10% or 11-20% range, well below the minimum of 80% that is often considered conventional. Studies with low statistical power appear to be common in the biomedical sciences, at least in the specific subject areas captured by our search strategy. However, we also observe evidence that this depends in part on research methodology, with candidate gene studies showing very low average power and studies using cognitive/behavioural measures showing high average power. This warrants further investigation.
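The power figures discussed above can be reproduced for simple designs with standard tools. The following sketch, with arbitrary example numbers, computes the achieved power of a two-sample t-test for a small effect and the per-group sample size needed to reach the conventional 80% threshold.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# achieved power for a small standardized effect (Cohen's d = 0.2) with 30 subjects per group
print("power:", round(analysis.power(effect_size=0.2, nobs1=30, alpha=0.05), 3))

# per-group sample size required to reach 80% power for the same effect
print("n per group for 80% power:",
      round(analysis.solve_power(effect_size=0.2, power=0.8, alpha=0.05)))
```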
Ferrarini, Luca; Veer, Ilya M; van Lew, Baldur; Oei, Nicole Y L; van Buchem, Mark A; Reiber, Johan H C; Rombouts, Serge A R B; Milles, J
2011-06-01
In recent years, graph theory has been successfully applied to study functional and anatomical connectivity networks in the human brain. Most of these networks have shown small-world topological characteristics: high efficiency in long distance communication between nodes, combined with highly interconnected local clusters of nodes. Moreover, functional studies performed at high resolutions have presented convincing evidence that resting-state functional connectivity networks exhibit (exponentially truncated) scale-free behavior. Such evidence, however, was mostly presented qualitatively, in terms of linear regressions of the degree distributions on log-log plots. Even when quantitative measures were given, these were usually limited to the r² correlation coefficient. However, the r² statistic is not an optimal estimator of explained variance when dealing with (truncated) power-law models. Recent developments in statistics have introduced new non-parametric approaches, based on the Kolmogorov-Smirnov test, for the problem of model selection. In this work, we have built on this idea to statistically tackle the issue of model selection for the degree distribution of functional connectivity at rest. The analysis, performed at voxel level and in a subject-specific fashion, confirmed the superiority of a truncated power-law model, showing high consistency across subjects. Moreover, the most highly connected voxels were found to be consistently part of the default mode network. Our results provide statistically sound support to the evidence previously presented in literature for a truncated power-law model of resting-state functional connectivity. Copyright © 2010 Elsevier Inc. All rights reserved.
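A minimal sketch of this style of degree-distribution model selection is shown below, using the third-party `powerlaw` package (Alstott et al.), which selects the lower cut-off by minimizing the Kolmogorov-Smirnov distance and supports likelihood-ratio comparisons between candidate models. The synthetic Zipf sample stands in for voxel degrees; this is not the pipeline used in the paper.

```python
import numpy as np
import powerlaw  # third-party package; assumed to be installed

rng = np.random.default_rng(0)
degrees = rng.zipf(2.5, size=5000)  # synthetic stand-in for voxel degree values

# the lower cut-off x_min is chosen by minimizing the Kolmogorov-Smirnov distance
fit = powerlaw.Fit(degrees, discrete=True)
print("alpha = %.2f, x_min = %d" % (fit.power_law.alpha, fit.xmin))

# likelihood-ratio test: positive R with small p favours the truncated power law
R, p = fit.distribution_compare('truncated_power_law', 'power_law', nested=True)
print("R = %.2f, p = %.3g" % (R, p))
```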
NASA Astrophysics Data System (ADS)
Vitali, Lina; Righini, Gaia; Piersanti, Antonio; Cremona, Giuseppe; Pace, Giandomenico; Ciancarella, Luisella
2017-12-01
Air backward trajectory calculations are commonly used in a variety of atmospheric analyses, in particular for source attribution evaluation. The accuracy of backward trajectory analysis is mainly determined by the quality and the spatial and temporal resolution of the underlying meteorological data set, especially in cases of complex terrain. This work describes a new tool for the calculation and the statistical elaboration of backward trajectories. To take advantage of the high-resolution meteorological database of the Italian national air quality model MINNI, a dedicated set of procedures was implemented under the name of M-TraCE (MINNI module for Trajectories Calculation and statistical Elaboration) to calculate and process the backward trajectories of air masses reaching a site of interest. Some outcomes from the application of the developed methodology to the Italian Network of Special Purpose Monitoring Stations are shown to assess its strengths for the meteorological characterization of air quality monitoring stations. M-TraCE has demonstrated its capabilities to provide a detailed statistical assessment of transport patterns and the region of influence of the site under investigation, which is fundamental for correctly interpreting pollutant measurements and ascertaining the official classification of the monitoring site based on meta-data information. Moreover, M-TraCE has shown its usefulness in supporting other assessments, e.g., the spatial representativeness of a monitoring site, focussing specifically on the analysis of the effects due to meteorological variables.
Statistical Downscaling of WRF-Chem Model: An Air Quality Analysis over Bogota, Colombia
NASA Astrophysics Data System (ADS)
Kumar, Anikender; Rojas, Nestor
2015-04-01
Statistical downscaling is a technique that is used to extract high-resolution information from regional scale variables produced by coarse resolution models such as Chemical Transport Models (CTMs). The fully coupled WRF-Chem (Weather Research and Forecasting with Chemistry) model is used to simulate air quality over Bogota. Bogota is a tropical Andean megacity located over a high-altitude plateau in the middle of very complex terrain. The WRF-Chem model was adopted for simulating the hourly ozone concentrations. The computational domains consisted of 120x120x32, 121x121x32 and 121x121x32 grid points with horizontal resolutions of 27, 9 and 3 km, respectively. The model was initialized with real boundary conditions using NCAR-NCEP's Final Analysis (FNL) and a 1°x1° (~111 km x 111 km) resolution. Boundary conditions were updated every 6 hours using reanalysis data. The emission rates were obtained from global inventories, namely the REanalysis of the TROpospheric (RETRO) chemical composition and the Emission Database for Global Atmospheric Research (EDGAR). Multiple linear regression and artificial neural network techniques are used to downscale the model output at each monitoring station. The results confirm that the statistically downscaled outputs reduce simulated errors by up to 25%. This study provides a general overview of statistical downscaling of chemical transport models and can constitute a reference for future air quality modeling exercises over Bogota and other Colombian cities.
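The downscaling step itself amounts to learning a station-specific correction from coarse model output to local observations. The sketch below illustrates that idea with both a multiple linear regression and a small neural network on synthetic data; the predictor set and network size are arbitrary placeholders, not the configuration used in the study.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# synthetic stand-in: WRF-Chem predictors at the grid cell nearest a station
# (e.g. modeled O3, temperature, wind speed) and the observed hourly O3 at that station
X = rng.normal(size=(2000, 3))
y = 0.7 * X[:, 0] + 0.2 * X[:, 1] - 0.1 * X[:, 2] + rng.normal(0, 0.3, 2000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
mlr = LinearRegression().fit(X_train, y_train)
ann = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X_train, y_train)

print("MLR R^2: %.2f   ANN R^2: %.2f" % (mlr.score(X_test, y_test), ann.score(X_test, y_test)))
```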
Statistical Tutorial | Center for Cancer Research
Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. ST is designed as a follow up to Statistical Analysis of Research Data (SARD) held in April 2018. The tutorial will apply the general principles of statistical analysis of research data including descriptive statistics, z- and t-tests of means and mean differences, simple and multiple linear regression, ANOVA tests, and Chi-Squared distribution.
NASA Astrophysics Data System (ADS)
Dennison, Andrew G.
Classification of the seafloor substrate can be done with a variety of methods, including visual (dives, drop cameras), mechanical (cores, grab samples), and acoustic (statistical analysis of echosounder returns) approaches. Acoustic methods offer a more powerful and efficient means of collecting useful information about the bottom type. Due to the nature of an acoustic survey, larger areas can be sampled, and combining the collected data with visual and mechanical survey methods provides greater confidence in the classification of a mapped region. During a multibeam sonar survey, both bathymetric and backscatter data are collected. It is well documented that the statistical characteristics of a sonar backscatter mosaic are dependent on bottom type. While classifying the bottom type on the basis of backscatter alone can accurately predict and map bottom type, e.g., distinguishing a muddy area from a rocky area, it lacks the ability to resolve and capture fine textural details, an important factor in many habitat mapping studies. Statistical processing of high-resolution multibeam data can capture pertinent details about the bottom type that are rich in textural information. Further multivariate statistical processing can then isolate characteristic features and provide the basis for an accurate classification scheme. The development of a new classification method is described here. It is based upon the analysis of textural features in conjunction with ground truth sampling. The processing and classification results for two geologically distinct areas in nearshore regions of Lake Superior, off the Lester River, MN, and the Amnicon River, WI, are presented here, using the Minnesota Supercomputer Institute's Mesabi computing cluster for initial processing. Processed data are then calibrated using ground truth samples to conduct an accuracy assessment of the surveyed areas. From the analysis of high-resolution bathymetry data collected at both survey sites, it was possible to calculate a series of measures that describe textural information about the lake floor. Further processing suggests that the calculated features capture a significant amount of statistical information about the lake floor terrain as well. Two sources of error, an anomalous heave and a refraction error, significantly degraded the quality of the processed data and the resulting validation results. Ground truth samples used to validate the classification methods at both survey sites yielded accuracy values ranging from 5-30 percent at the Amnicon River and 60-70 percent at the Lester River. The final results suggest that this new processing methodology adequately captures textural information about the lake floor and provides an acceptable classification in the absence of significant data quality issues.
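The core idea of turning a backscatter or bathymetry mosaic into textural features and then grouping them can be sketched very simply. The example below computes basic per-window statistics (a stand-in for the richer texture measures described above) and clusters them with k-means; the window size, feature set and number of clusters are arbitrary assumptions, not the study's method.

```python
import numpy as np
from sklearn.cluster import KMeans

def windowed_texture_features(mosaic, w=16):
    """Simple per-window texture statistics (mean, standard deviation, range)."""
    feats = []
    for i in range(0, mosaic.shape[0] - w + 1, w):
        for j in range(0, mosaic.shape[1] - w + 1, w):
            patch = mosaic[i:i + w, j:j + w]
            feats.append([patch.mean(), patch.std(), patch.max() - patch.min()])
    return np.array(feats)

rng = np.random.default_rng(0)
mosaic = rng.random((256, 256))          # synthetic stand-in for a backscatter mosaic
features = windowed_texture_features(mosaic)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
print(np.bincount(labels))               # number of windows assigned to each substrate class
```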
High Agreement and High Prevalence: The Paradox of Cohen's Kappa.
Zec, Slavica; Soriani, Nicola; Comoretto, Rosanna; Baldi, Ileana
2017-01-01
Cohen's Kappa is the most used agreement statistic in the literature. However, under certain conditions, it is affected by a paradox which returns biased estimates of the statistic itself. The aim of the study is to provide sufficient information which allows the reader to make an informed choice of the correct agreement measure, by underlining some optimal properties of Gwet's AC1 in comparison to Cohen's Kappa, using a real data example. During the process of literature review, we have asked a panel of three evaluators to come up with a judgment on the quality of 57 randomized controlled trials, assigning a score to each trial using the Jadad scale. The quality was evaluated according to the following dimensions: adopted design, randomization unit, type of primary endpoint. With respect to each of the above described features, the agreement between the three evaluators has been calculated using Cohen's Kappa statistic and Gwet's AC1 statistic and, finally, the values have been compared with the observed agreement. The values of the Cohen's Kappa statistic would lead one to believe that the agreement levels for the variables Unit, Design and Primary Endpoints are totally unsatisfactory. The AC1 statistic, on the contrary, shows plausible values which are in line with the respective values of the observed concordance. We conclude that it would always be appropriate to adopt the AC1 statistic, thus bypassing any risk of incurring the paradox and drawing wrong conclusions about the results of agreement analysis.
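For readers who want to see the two statistics side by side, here is a minimal two-rater implementation (the study itself used three evaluators, and multi-rater generalizations exist); the ratings below are made-up labels, and the formulas follow the standard definitions of Cohen's Kappa and Gwet's AC1.

```python
import numpy as np

def kappa_and_ac1(r1, r2):
    """Cohen's Kappa and Gwet's AC1 for two raters assigning categorical labels."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    categories = np.union1d(r1, r2)
    po = np.mean(r1 == r2)                                  # observed agreement
    p1 = np.array([np.mean(r1 == c) for c in categories])   # category proportions, rater 1
    p2 = np.array([np.mean(r2 == c) for c in categories])   # category proportions, rater 2
    pe_kappa = np.sum(p1 * p2)                               # chance agreement (Kappa)
    pi = (p1 + p2) / 2.0
    pe_ac1 = np.sum(pi * (1 - pi)) / (len(categories) - 1)   # chance agreement (AC1)
    return (po - pe_kappa) / (1 - pe_kappa), (po - pe_ac1) / (1 - pe_ac1)

# high observed agreement with a very skewed category distribution: the classic paradox setting
rater1 = ["yes"] * 45 + ["no"] * 5
rater2 = ["yes"] * 44 + ["no"] * 1 + ["yes"] * 2 + ["no"] * 3
print(kappa_and_ac1(rater1, rater2))  # Kappa is much lower than AC1 despite 94% agreement
```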
Toffan, Adam; Alexander, Marion J L; Peeler, Jason
2017-07-28
The purpose of the study was to compare the most effective joint movements, segment velocities and body positions to perform the fastest and most accurate pass of high school and university football quarterbacks. Secondary purposes were to develop a quarterback throwing test to assess skill level, to determine which kinematic variables were different between high school and university athletes, and to determine which variables were significant predictors of quarterback throwing test performance. Ten high school and ten university athletes were filmed for the study, performing nine passes at a target and two passes for maximum distance. Thirty variables were measured using the Dartfish Team Pro 4.5.2 video analysis system, and Microsoft Excel was used for statistical analysis. University athletes scored slightly higher than the high school athletes on the throwing test; however, this result was not statistically significant. Correlation analysis and forward stepwise multiple regression analysis were performed on both the high school players and the university players in order to determine which variables were significant predictors of throwing test score. Ball velocity was determined to have the strongest predictive effect on throwing test score (r = 0.900) for the high school athletes; for the university group, the position of the back foot at release was also determined to be important (r = 0.661). Several significant differences in throwing technique between groups were noted during the pass; however, body position at release showed the greatest differences between the two groups. High school players could benefit from more complete weight transfer and decreased throw time to increase throwing test score. University athletes could benefit from increased throw time and greater range of motion in external shoulder rotation and trunk rotation to increase their throwing test score. Coaches and practitioners will be able to use the findings of this research to help improve these and related throwing variables in their high school and university quarterbacks.
NASA Astrophysics Data System (ADS)
Kunert, Anna Theresa; Scheel, Jan Frederik; Helleis, Frank; Klimach, Thomas; Pöschl, Ulrich; Fröhlich-Nowoisky, Janine
2016-04-01
Freezing of water above the homogeneous freezing temperature is catalyzed by ice nucleation active (INA) particles called ice nuclei (IN), which can be of various inorganic or biological origin. The freezing temperatures reach up to -1 °C for some biological samples and depend on the chemical composition of the IN. The standard method to analyze IN in solution is the droplet freezing assay (DFA) established by Gabor Vali in 1970. Several modifications and improvements have been made within the last decades, but they are still limited by either small droplet numbers, large droplet volumes, or inadequate separation of the individual droplets, resulting in mutual interference and therefore improper measurements. The probability that miscellaneous IN are concentrated together in one droplet increases with the volume of the droplet, which can be described by the Poisson distribution. At a given concentration, partitioning a droplet into several smaller droplets leads to finely dispersed IN, resulting in better statistics and therefore in a better resolution of the nucleation spectrum. We designed a new customized high-performance droplet freezing assay (HP-DFA), which represents an upgrade of the previously existing DFAs in terms of temperature range and statistics. The need to observe freezing events at temperatures lower than homogeneous freezing, due to freezing point depression, requires high-performance thermostats combined with optimal insulation. Furthermore, we developed a cooling setup which allows both large and very small temperature changes within a short period of time. In addition, the new DFA provides the analysis of more than 750 droplets per run with a small droplet volume of 5 μL. This enables a fast and more precise analysis of biological samples with complex IN composition as well as better statistics for every sample at the same time.
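The Poisson argument and the resulting cumulative nucleation spectrum (in the sense of Vali's evaluation formula) can be written down in a few lines. The numbers below are purely illustrative; only the 5 μL droplet volume is taken from the abstract.

```python
import numpy as np

V = 5e-3  # droplet volume in mL (5 microliters)

# illustrative freezing spectrum: fraction of droplets still unfrozen at each temperature
T = np.array([-5.0, -10.0, -15.0, -20.0])          # degrees Celsius
f_unfrozen = np.array([0.95, 0.60, 0.20, 0.02])

# cumulative ice nuclei concentration per mL of sample (Vali-type evaluation)
K = -np.log(f_unfrozen) / V

# Poisson reasoning: probability that a droplet contains at least one IN active above T;
# smaller droplets lower this probability and spread the IN over more droplets
p_at_least_one = 1.0 - np.exp(-K * V)

for t, k, p in zip(T, K, p_at_least_one):
    print("T = %5.1f C   K(T) = %8.1f IN/mL   P(droplet holds >= 1 active IN) = %.2f" % (t, k, p))
```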
A Network-Based Method to Assess the Statistical Significance of Mild Co-Regulation Effects
Horvát, Emőke-Ágnes; Zhang, Jitao David; Uhlmann, Stefan; Sahin, Özgür; Zweig, Katharina Anna
2013-01-01
Recent development of high-throughput, multiplexing technology has initiated projects that systematically investigate interactions between two types of components in biological networks, for instance transcription factors and promoter sequences, or microRNAs (miRNAs) and mRNAs. In terms of network biology, such screening approaches primarily attempt to elucidate relations between biological components of two distinct types, which can be represented as edges between nodes in a bipartite graph. However, it is often desirable not only to determine regulatory relationships between nodes of different types, but also to understand the connection patterns of nodes of the same type. Especially interesting is the co-occurrence of two nodes of the same type, i.e., the number of their common neighbours, which current high-throughput screening analysis fails to address. The co-occurrence gives the number of circumstances under which both of the biological components are influenced in the same way. Here we present SICORE, a novel network-based method to detect pairs of nodes with a statistically significant co-occurrence. We first show the stability of the proposed method on artificial data sets: when randomly adding and deleting observations we obtain reliable results even with noise exceeding the expected level in large-scale experiments. Subsequently, we illustrate the viability of the method based on the analysis of a proteomic screening data set to reveal regulatory patterns of human microRNAs targeting proteins in the EGFR-driven cell cycle signalling system. Since statistically significant co-occurrence may indicate functional synergy and the mechanisms underlying canalization, and thus hold promise in drug target identification and therapeutic development, we provide a platform-independent implementation of SICORE with a graphical user interface as a novel tool in the arsenal of high-throughput screening analysis. PMID:24039936
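As a simplified illustration of testing whether two nodes of the same type share more neighbours than expected, the sketch below uses a fixed-degree hypergeometric null for a single pair; this is a common shortcut and not the SICORE null model or implementation.

```python
import numpy as np
from scipy.stats import hypergeom

rng = np.random.default_rng(0)
# synthetic biadjacency matrix: rows = miRNAs, columns = target proteins
B = (rng.random((40, 300)) < 0.08).astype(int)

def cooccurrence_pvalue(B, i, j):
    """Observed shared targets of rows i and j, and the probability of seeing at least
    that many shared targets if both rows chose their targets at random (degrees fixed)."""
    n_targets = B.shape[1]
    d_i, d_j = int(B[i].sum()), int(B[j].sum())
    shared = int(np.sum(B[i] & B[j]))
    p = hypergeom.sf(shared - 1, n_targets, d_i, d_j)
    return shared, p

print(cooccurrence_pvalue(B, 0, 1))
```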
Statistical validation of a solar wind propagation model from 1 to 10 AU
NASA Astrophysics Data System (ADS)
Zieger, Bertalan; Hansen, Kenneth C.
2008-08-01
A one-dimensional (1-D) numerical magnetohydrodynamic (MHD) code is applied to propagate the solar wind from 1 AU through 10 AU, i.e., beyond the heliocentric distance of Saturn's orbit, in a non-rotating frame of reference. The time-varying boundary conditions at 1 AU are obtained from hourly solar wind data observed near the Earth. Although similar MHD simulations have been carried out and used by several authors, very little work has been done to validate the statistical accuracy of such solar wind predictions. In this paper, we present an extensive analysis of the prediction efficiency, using 12 selected years of solar wind data from the major heliospheric missions Pioneer, Voyager, and Ulysses. We map the numerical solution to each spacecraft in space and time, and validate the simulation, comparing the propagated solar wind parameters with in-situ observations. We do not restrict our statistical analysis to the times of spacecraft alignment, as most of the earlier case studies do. Our superposed epoch analysis suggests that the prediction efficiency is significantly higher during periods with a high recurrence index of solar wind speed, typically in the late declining phase of the solar cycle. Among the solar wind variables, the solar wind speed can be predicted to the highest accuracy, with a linear correlation of 0.75 on average close to the time of opposition. We estimate the accuracy of shock arrival times to be as high as 10-15 hours within ±75 d from apparent opposition during years with a high recurrence index. During solar activity maximum, there is a clear bias for the model to predict shocks arriving later than observed in the data, suggesting that during these periods there is an additional acceleration mechanism in the solar wind that is not included in the model.
Pohl, Lydia; Kölbl, Angelika; Werner, Florian; Mueller, Carsten W; Höschen, Carmen; Häusler, Werner; Kögel-Knabner, Ingrid
2018-04-30
Aluminium (Al)-substituted goethite is ubiquitous in soils and sediments. The extent of Al-substitution affects the physicochemical properties of the mineral and influences its macroscale properties. Bulk analysis only provides total Al/Fe ratios without providing information with respect to the Al-substitution of single minerals. Here, we demonstrate that nanoscale secondary ion mass spectrometry (NanoSIMS) enables the precise determination of Al-content in single minerals, while simultaneously visualising the variation of the Al/Fe ratio. Al-substituted goethite samples were synthesized with increasing Al concentrations of 0.1, 3, and 7 % and analysed by NanoSIMS in combination with established bulk spectroscopic methods (XRD, FTIR, Mössbauer spectroscopy). The high spatial resolution (50-150 nm) of NanoSIMS is accompanied by a high number of single-point measurements. We statistically evaluated the Al/Fe ratios derived from NanoSIMS, while maintaining the spatial information and reassigning it to its original localization. XRD analyses confirmed increasing concentration of incorporated Al within the goethite structure. Mössbauer spectroscopy revealed 11 % of the goethite samples generated at high Al concentrations consisted of hematite. The NanoSIMS data show that the Al/Fe ratios are in agreement with bulk data derived from total digestion and demonstrated small spatial variability between single-point measurements. More advantageously, statistical analysis and reassignment of single-point measurements allowed us to identify distinct spots with significantly higher or lower Al/Fe ratios. NanoSIMS measurements confirmed the capacity to produce images, which indicated the uniform increase in Al-concentrations in goethite. Using a combination of statistical analysis with information from complementary spectroscopic techniques (XRD, FTIR and Mössbauer spectroscopy) we were further able to reveal spots with lower Al/Fe ratios as hematite. Copyright © 2018 John Wiley & Sons, Ltd.
Interfaces between statistical analysis packages and the ESRI geographic information system
NASA Technical Reports Server (NTRS)
Masuoka, E.
1980-01-01
Interfaces between ESRI's geographic information system (GIS) data files and real valued data files written to facilitate statistical analysis and display of spatially referenced multivariable data are described. An example of data analysis which utilized the GIS and the statistical analysis system is presented to illustrate the utility of combining the analytic capability of a statistical package with the data management and display features of the GIS.
NASA Astrophysics Data System (ADS)
Lamb, Derek A.
2016-10-01
While sunspots follow a well-defined pattern of emergence in space and time, small-scale flux emergence is assumed to occur randomly at all times in the quiet Sun. HMI's full-disk coverage, high cadence, spatial resolution, and duty cycle allow us to probe that basic assumption. Some case studies of emergence suggest that temporal clustering on spatial scales of 50-150 Mm may occur. If clustering is present, it could serve as a diagnostic of large-scale subsurface magnetic field structures. We present the results of a manual survey of small-scale flux emergence events over a short time period, and a statistical analysis addressing the question of whether these events show spatio-temporal behavior that is anything other than random.
Zone trends for three metropolitan statistical areas in North Carolina
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oommen, R.G.; Aneja, V.P.; Riordan, A.J.
1996-12-31
As part of an effort by the state of North Carolina to develop a State Implementation Plan (SIP) for ozone control, a network of ozone stations was established to monitor ozone concentrations across the state. Approximately twenty-five ozone stations made continuous measurements surrounding the three major Metropolitan Statistical Areas (MSAs) between 1993-1995: Raleigh/Durham (RDU), Charlotte/Mecklenburg (CLT), and Greensboro/Winston-Salem/High Point (GSO). Statistical averages of the ozone data were computed for each MSA to study trends and/or relationships on high ozone days. It was found that the three MSAs were not significantly different from each other, indicating that they fall under the same synoptic weather patterns. Transport and local production from biogenic sources of VOCs and NOx appear to play an important role in high ozone downwind of RDU, while mobile sources of these precursor gases contribute to the high ozone downwind of CLT and GSO. A ΔO3 analysis (the difference between the O3 measured at an upwind and a downwind site) suggested that long-range transport of the precursors was a significant contributor to ozone problems at the three MSAs. 15 refs., 2 figs., 2 tabs.
Barakat-Haddad, Caroline
2013-01-01
This study examined the prevalence of high blood pressure, heart disease, and medical diagnoses in relation to blood disorders, among 6,329 adolescent students (age 15 to 18 years) who reside in the United Arab Emirates (UAE). Findings indicated that the overall prevalence of high blood pressure and heart disease was 1.8% and 1.3%, respectively. Overall, the prevalence for thalassemia, sickle-cell anemia, and iron-deficiency anemia was 0.9%, 1.6%, and 5%, respectively. Bivariate analysis revealed statistically significant differences in the prevalence of high blood pressure among the local and expatriate adolescent population in the Emirate of Sharjah. Similarly, statistically significant differences in the prevalence of iron-deficiency anemia were observed among the local and expatriate population in Abu Dhabi city, the western region of Abu Dhabi, and Al-Ain. Multivariate analysis revealed the following significant predictors of high blood pressure: residing in proximity to industry, nonconventional substance abuse, and age when smoking or exposure to smoking began. Ethnicity was a significant predictor of heart disease, thalassemia, sickle-cell anemia, and iron-deficiency anemia. In addition, predictors of thalassemia included gender (female) and participating in physical activity. Participants diagnosed with sickle-cell anemia and iron-deficiency anemia were more likely to experience different physical activities. PMID:23606864
Common pitfalls in statistical analysis: Clinical versus statistical significance
Ranganathan, Priya; Pramesh, C. S.; Buyse, Marc
2015-01-01
In clinical research, study results, which are statistically significant are often interpreted as being clinically important. While statistical significance indicates the reliability of the study results, clinical significance reflects its impact on clinical practice. The third article in this series exploring pitfalls in statistical analysis clarifies the importance of differentiating between statistical significance and clinical significance. PMID:26229754
Bermúdez, Valmore; Aparicio, Daniel; Rojas, Edward; Peñaranda, Lianny; Finol, Freddy; Acosta, Luis; Mengual, Edgardo; Rojas, Joselyn; Arráiz, Nailet; Toledo, Alexandra; Colmenares, Carlos; Urribarí, Jesica; Sanchez, Wireynis; Pineda, Carlos; Rodriguez, Dalia; Faria, Judith; Añez, Roberto; Cano, Raquel; Cano, Clímaco; Sorell, Luis; Velasco, Manuel
2010-01-01
Coronary artery disease is the main cause of death worldwide. Lipoprotein(a) [Lp(a)] is an independent risk factor for coronary artery disease whose concentrations are genetically regulated. Contradictory results have been published about the influence of physical activity on Lp(a) concentration. This research aimed to determine associations between different physical activity levels and Lp(a) concentration. A descriptive and cross-sectional study was made in 1340 randomly selected subjects (males = 598; females = 712) to whom a complete clinical history, the International Physical Activity Questionnaire, and Lp(a) level determination were administered. Statistical analysis was carried out to assess relationships between qualitative variables by chi-square (χ2) tests and differences between means by one-way analysis of variance, considering a P value <0.05 as statistically significant. Results are shown as absolute frequencies, percentages, and mean +/- standard deviation according to case. Physical activity levels were ordinally classified as follows: low activity with 24.3% (n = 318), moderate activity with 35.0% (n = 458), and high physical activity with 40.8% (n = 534). The Lp(a) concentration in the studied sample was 26.28 +/- 12.64 (CI: 25.59-26.96) mg/dL. Lp(a) concentrations for the low, moderate, and high physical activity levels were 29.22 +/- 13.74, 26.27 +/- 12.91, and 24.53 +/- 11.35 mg/dL, respectively, with statistically significant differences between the low and moderate levels (P = 0.004) and the low and high levels (P < 0.001). A strong association (χ2 = 9.771; P = 0.002) was observed between a high physical activity level and a normal concentration of Lp(a) (less than 30 mg/dL). A lifestyle characterized by high physical activity is associated with normal Lp(a) levels.
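The two tests named in the abstract (chi-square for categorical associations and one-way ANOVA for group mean differences) map directly onto standard SciPy calls. The counts and distributions below are illustrative stand-ins loosely shaped like the reported summary statistics, not the study data.

```python
import numpy as np
from scipy.stats import chi2_contingency, f_oneway

rng = np.random.default_rng(0)

# chi-square test: activity level (rows) vs normal / elevated Lp(a) (columns), illustrative counts
table = np.array([[250, 68],
                  [390, 68],
                  [480, 54]])
chi2, p_chi, dof, expected = chi2_contingency(table)

# one-way ANOVA: Lp(a) concentration by activity level, illustrative samples
low = rng.normal(29.2, 12.0, 318)
moderate = rng.normal(26.3, 12.0, 458)
high = rng.normal(24.5, 12.0, 534)
F, p_anova = f_oneway(low, moderate, high)

print("chi2 = %.2f (p = %.3f), F = %.2f (p = %.3g)" % (chi2, p_chi, F, p_anova))
```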
Redmond, Shelagh M.; Alexander-Kisslig, Karin; Woodhall, Sarah C.; van den Broek, Ingrid V. F.; van Bergen, Jan; Ward, Helen; Uusküla, Anneli; Herrmann, Björn; Andersen, Berit; Götz, Hannelore M.; Sfetcu, Otilia; Low, Nicola
2015-01-01
Background Accurate information about the prevalence of Chlamydia trachomatis is needed to assess national prevention and control measures. Methods We systematically reviewed population-based cross-sectional studies that estimated chlamydia prevalence in European Union/European Economic Area (EU/EEA) Member States and non-European high income countries from January 1990 to August 2012. We examined results in forest plots, explored heterogeneity using the I² statistic, and conducted random effects meta-analysis if appropriate. Meta-regression was used to examine the relationship between study characteristics and chlamydia prevalence estimates. Results We included 25 population-based studies from 11 EU/EEA countries and 14 studies from five other high income countries. Four EU/EEA Member States reported on nationally representative surveys of sexually experienced adults aged 18–26 years (response rates 52–71%). In women, chlamydia point prevalence estimates ranged from 3.0–5.3%; the pooled average of these estimates was 3.6% (95% CI 2.4, 4.8, I² 0%). In men, estimates ranged from 2.4–7.3% (pooled average 3.5%; 95% CI 1.9, 5.2, I² 27%). Estimates in EU/EEA Member States were statistically consistent with those in other high income countries (I² 0% for women, 6% for men). There was statistical evidence of an association between survey response rate and estimated chlamydia prevalence; estimates were higher in surveys with lower response rates (p = 0.003 in women, 0.018 in men). Conclusions Population-based surveys that estimate chlamydia prevalence are at risk of participation bias owing to low response rates. Estimates obtained in nationally representative samples of the general population of EU/EEA Member States are similar to estimates from other high income countries. PMID:25615574
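The pooling and heterogeneity figures quoted above follow the standard random-effects recipe. The sketch below implements a DerSimonian-Laird pooled estimate with τ² and I² on raw proportions for brevity (in practice, prevalence is often transformed, e.g. to logits, before pooling); the four surveys are invented numbers, not those from the review.

```python
import numpy as np

def dersimonian_laird(y, v):
    """Random-effects pooling of estimates y with within-study variances v.
    Returns the pooled estimate, a 95% CI, tau^2 and the I^2 heterogeneity statistic."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    w = 1.0 / v
    y_fixed = np.sum(w * y) / np.sum(w)
    Q = np.sum(w * (y - y_fixed) ** 2)
    df = len(y) - 1
    C = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (Q - df) / C)
    i2 = 100.0 * max(0.0, (Q - df) / Q) if Q > 0 else 0.0
    w_re = 1.0 / (v + tau2)
    pooled = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2, i2

# invented prevalence estimates (proportions) and binomial variances p(1-p)/n from four surveys
p = np.array([0.030, 0.041, 0.053, 0.032])
n = np.array([1200, 950, 800, 1500])
print(dersimonian_laird(p, p * (1 - p) / n))
```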
Targeting regional pediatric congenital hearing loss using a spatial scan statistic.
Bush, Matthew L; Christian, Warren Jay; Bianchi, Kristin; Lester, Cathy; Schoenberg, Nancy
2015-01-01
Congenital hearing loss is a common problem, and timely identification and intervention are paramount for language development. Patients from rural regions may have many barriers to timely diagnosis and intervention. The purpose of this study was to examine the spatial and hospital-based distribution of failed infant hearing screening testing and pediatric congenital hearing loss throughout Kentucky. Data on live births and audiological reporting of infant hearing loss results in Kentucky from 2009 to 2011 were analyzed. The authors used spatial scan statistics to identify high-rate clusters of failed newborn screening tests and permanent congenital hearing loss (PCHL), based on the total number of live births per county. The authors conducted further analyses on PCHL and failed newborn hearing screening tests, based on birth hospital data and method of screening. The authors observed four statistically significant (p < 0.05) high-rate clusters with failed newborn hearing screenings in Kentucky, including two in the Appalachian region. Hospitals using two-stage otoacoustic emission testing demonstrated higher rates of failed screening (p = 0.009) than those using two-stage automated auditory brainstem response testing. A significant cluster of high rate of PCHL was observed in Western Kentucky. Five of the 54 birthing hospitals were found to have higher relative risk of PCHL, and two of those hospitals are located in a very rural region of Western Kentucky within the cluster. This spatial analysis in children in Kentucky has identified specific regions throughout the state with high rates of congenital hearing loss and failed newborn hearing screening tests. Further investigation regarding causative factors is warranted. This method of analysis can be useful in the setting of hearing health disparities to focus efforts on regions facing high incidence of congenital hearing loss.
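For readers unfamiliar with spatial scan statistics, the sketch below shows the core of a Kulldorff-style Poisson scan: circular windows grown around each area centroid and scored with a log-likelihood ratio of observed versus expected cases. Coordinates and counts are synthetic; a real analysis (e.g. with SaTScan) would add Monte Carlo replicates of the case counts to obtain p-values.

```python
import numpy as np

def poisson_llr(c, e, C):
    """Kulldorff log-likelihood ratio for a window with c observed and e expected cases,
    out of C total cases; zero unless the window has an elevated rate."""
    if c <= e or e <= 0:
        return 0.0
    return c * np.log(c / e) + (C - c) * np.log((C - c) / (C - e))

rng = np.random.default_rng(0)
n_areas = 120
births = rng.integers(200, 5000, size=n_areas)          # exposure (live births) per area
cases = rng.binomial(births, 0.0017)                     # hearing-loss cases per area
xy = rng.random((n_areas, 2)) * 400.0                    # synthetic area centroids (km)

C = cases.sum()
expected = births * C / births.sum()                     # expected cases under uniform risk

best_llr, best_window = 0.0, np.array([], dtype=int)
for i in range(n_areas):
    order = np.argsort(np.hypot(*(xy - xy[i]).T))        # areas sorted by distance from centre i
    for k in range(1, n_areas // 2):                     # windows containing up to half the areas
        idx = order[:k]
        llr = poisson_llr(cases[idx].sum(), expected[idx].sum(), C)
        if llr > best_llr:
            best_llr, best_window = llr, idx
print("most likely cluster: %d areas, LLR = %.2f" % (len(best_window), best_llr))
```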
Transit safety & security statistics & analysis 2003 annual report (formerly SAMIS)
DOT National Transportation Integrated Search
2005-12-01
The Transit Safety & Security Statistics & Analysis 2003 Annual Report (formerly SAMIS) is a compilation and analysis of mass transit accident, casualty, and crime statistics reported under the Federal Transit Administration's (FTA's) National Tr...
Transit safety & security statistics & analysis 2002 annual report (formerly SAMIS)
DOT National Transportation Integrated Search
2004-12-01
The Transit Safety & Security Statistics & Analysis 2002 Annual Report (formerly SAMIS) is a compilation and analysis of mass transit accident, casualty, and crime statistics reported under the Federal Transit Administration's (FTA's) National Tr...
ROOT: A C++ framework for petabyte data storage, statistical analysis and visualization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Antcheva, I.; /CERN; Ballintijn, M.
2009-01-01
ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored into a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web or a number of different shared file systems. In order to analyze this data, the user can choose from a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, the RooFit package allows the user to perform complex data modeling and fitting while the RooStats library provides abstractions and implementations for advanced statistical tools. Multivariate classification methods based on machine learning techniques are available via the TMVA package. A central piece of these analysis tools is the set of histogram classes, which provide binning of one- and multi-dimensional data. Results can be saved in high-quality graphical formats like Postscript and PDF or in bitmap formats like JPG or GIF. The result can also be stored into ROOT macros that allow a full recreation and rework of the graphics. Users typically create their analysis macros step by step, making use of the interactive C++ interpreter CINT, while running over small data samples. Once the development is finished, they can run these macros at full compiled speed over large data sets, using on-the-fly compilation, or by creating a stand-alone batch program. Finally, if processing farms are available, the user can reduce the execution time of intrinsically parallel tasks - e.g. data mining in HEP - by using PROOF, which will take care of optimally distributing the work over the available resources in a transparent way.
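The histogramming and fitting workflow described above can also be exercised from Python through ROOT's PyROOT bindings (assuming a ROOT installation with Python support); the tiny example below fills a histogram with random Gaussian values and fits it.

```python
import ROOT  # PyROOT bindings shipped with ROOT; assumes ROOT is installed

# book a 1-D histogram, fill it with 10,000 Gaussian-distributed values, and fit a Gaussian
h = ROOT.TH1F("h", "Gaussian sample;x;entries", 100, -5.0, 5.0)
h.FillRandom("gaus", 10000)
h.Fit("gaus", "Q")  # "Q" = quiet fit

# draw and save the result
c = ROOT.TCanvas("c", "demo", 800, 600)
h.Draw()
c.SaveAs("gaussian_fit.png")
```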
Incipient fault detection study for advanced spacecraft systems
NASA Technical Reports Server (NTRS)
Milner, G. Martin; Black, Michael C.; Hovenga, J. Mike; Mcclure, Paul F.
1986-01-01
A feasibility study to investigate the application of vibration monitoring to the rotating machinery of planned NASA advanced spacecraft components is described. Factors investigated include: (1) special problems associated with small, high RPM machines; (2) application across multiple component types; (3) microgravity; (4) multiple fault types; (5) eight different analysis techniques including signature analysis, high frequency demodulation, cepstrum, clustering, amplitude analysis, and pattern recognition are compared; and (6) small sample statistical analysis is used to compare performance by computation of probability of detection and false alarm for an ensemble of repeated baseline and faulted tests. Both detection and classification performance are quantified. Vibration monitoring is shown to be an effective means of detecting the most important problem types for small, high RPM fans and pumps typical of those planned for the advanced spacecraft. A preliminary monitoring system design and implementation plan is presented.
A flexible, interpretable framework for assessing sensitivity to unmeasured confounding.
Dorie, Vincent; Harada, Masataka; Carnegie, Nicole Bohme; Hill, Jennifer
2016-09-10
When estimating causal effects, unmeasured confounding and model misspecification are both potential sources of bias. We propose a method to simultaneously address both issues in the form of a semi-parametric sensitivity analysis. In particular, our approach incorporates Bayesian Additive Regression Trees into a two-parameter sensitivity analysis strategy that assesses sensitivity of posterior distributions of treatment effects to choices of sensitivity parameters. This results in an easily interpretable framework for testing for the impact of an unmeasured confounder that also limits the number of modeling assumptions. We evaluate our approach in a large-scale simulation setting and with high blood pressure data taken from the Third National Health and Nutrition Examination Survey. The model is implemented as open-source software, integrated into the treatSens package for the R statistical programming language. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
A Flexible Approach for the Statistical Visualization of Ensemble Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Potter, K.; Wilson, A.; Bremer, P.
2009-09-29
Scientists are increasingly moving towards ensemble data sets to explore relationships present in dynamic systems. Ensemble data sets combine spatio-temporal simulation results generated using multiple numerical models, sampled input conditions and perturbed parameters. While ensemble data sets are a powerful tool for mitigating uncertainty, they pose significant visualization and analysis challenges due to their complexity. We present a collection of overview and statistical displays linked through a high level of interactivity to provide a framework for gaining key scientific insight into the distribution of the simulation results as well as the uncertainty associated with the data. In contrast to methods that present large amounts of diverse information in a single display, we argue that combining multiple linked statistical displays yields a clearer presentation of the data and facilitates a greater level of visual data analysis. We demonstrate this approach using driving problems from climate modeling and meteorology and discuss generalizations to other fields.
Synchronization from Second Order Network Connectivity Statistics
Zhao, Liqiong; Beverlin, Bryce; Netoff, Theoden; Nykamp, Duane Q.
2011-01-01
We investigate how network structure can influence the tendency for a neuronal network to synchronize, or its synchronizability, independent of the dynamical model for each neuron. The synchrony analysis takes advantage of the framework of second order networks, which defines four second order connectivity statistics based on the relative frequency of two-connection network motifs. The analysis identifies two of these statistics, convergent connections and chain connections, as highly influencing the synchrony. Simulations verify that synchrony decreases with the frequency of convergent connections and increases with the frequency of chain connections. These trends persist with simulations of multiple models for the neuron dynamics and for different types of networks. Surprisingly, divergent connections, which determine the fraction of shared inputs, do not strongly influence the synchrony. The critical role of chains, rather than divergent connections, in influencing synchrony can be explained by their increasing the effective coupling strength. The decrease of synchrony with convergent connections is primarily due to the resulting heterogeneity in firing rates. PMID:21779239
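The motif statistics described above can be illustrated with a small adjacency-matrix computation. This is a toy count of two-edge motifs (convergent, divergent, chain, reciprocal), not the authors' second-order-network estimator.

```python
import numpy as np

def two_edge_motif_counts(A):
    """Count two-edge motifs in a directed graph.

    A[i, j] = 1 if there is an edge from neuron i to neuron j.
    Returns unordered counts of convergent, divergent, chain, and reciprocal motifs.
    """
    A = np.asarray(A, dtype=float)
    np.fill_diagonal(A, 0)                       # ignore self-loops

    common_targets = A @ A.T                     # shared postsynaptic targets
    common_sources = A.T @ A                     # shared presynaptic sources
    two_paths = A @ A                            # paths i -> j -> k

    convergent = (common_targets.sum() - np.trace(common_targets)) / 2
    divergent  = (common_sources.sum() - np.trace(common_sources)) / 2
    chains     = two_paths.sum() - np.trace(two_paths)    # exclude i -> j -> i
    reciprocal = np.trace(two_paths) / 2                   # i -> j and j -> i

    return dict(convergent=convergent, divergent=divergent,
                chains=chains, reciprocal=reciprocal)

# Toy example: random directed network with 10% connection probability.
rng = np.random.default_rng(0)
A = (rng.random((200, 200)) < 0.1).astype(int)
print(two_edge_motif_counts(A))
```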
Spatial statistical analysis of tree deaths using airborne digital imagery
NASA Astrophysics Data System (ADS)
Chang, Ya-Mei; Baddeley, Adrian; Wallace, Jeremy; Canci, Michael
2013-04-01
High resolution digital airborne imagery offers unprecedented opportunities for observation and monitoring of vegetation, providing the potential to identify, locate and track individual vegetation objects over time. Analytical tools are required to quantify relevant information. In this paper, locations of trees over a large area of native woodland vegetation were identified using morphological image analysis techniques. Methods of spatial point process statistics were then applied to estimate the spatially-varying tree death risk, and to show that it is significantly non-uniform. [Tree deaths over the area were detected in our previous work (Wallace et al., 2008).] The study area is a major source of ground water for the city of Perth, and the work was motivated by the need to understand and quantify vegetation changes in the context of water extraction and drying climate. The influence of hydrological variables on tree death risk was investigated using spatial statistics (graphical exploratory methods, spatial point pattern modelling and diagnostics).
Teusch, V I; Wohlgemuth, W A; Piehler, A P; Jung, E M
2014-01-01
The aim of our pilot study was to apply contrast-enhanced color-coded ultrasound perfusion analysis in patients with vascular malformations to quantify microcirculatory alterations. 28 patients (16 female, 12 male, mean age 24.9 years) with high-flow (n = 6) or slow-flow (n = 22) malformations were analyzed before intervention. An experienced examiner performed color-coded Doppler sonography (CCDS) and Power Doppler as well as contrast-enhanced ultrasound after intravenous bolus injection of 1 - 2.4 ml of a second-generation ultrasound contrast medium (SonoVue®, Bracco, Milan). The contrast-enhanced examination was documented as a cine sequence over 60 s. The quantitative analysis based on color-coded contrast-enhanced ultrasound (CEUS) images included percentage peak enhancement (%peak), time to peak (TTP), area under the curve (AUC), and mean transit time (MTT). No side effects occurred after intravenous contrast injection. The mean %peak in arteriovenous malformations was almost twice as high as in slow-flow malformations. The area under the curve was 4 times higher in arteriovenous malformations compared with the mean value of the other malformations. The mean transit time was 1.4 times higher in high-flow malformations compared with slow-flow malformations. There was no difference in time to peak between the different malformation types. The comparison between all vascular malformations and surrounding tissue showed statistically significant differences for all analyzed data (%peak, TTP, AUC, MTT; p < 0.01). High-flow and slow-flow vascular malformations had statistically significant differences in %peak (p < 0.01), AUC analysis (p < 0.01), and MTT (p < 0.05). Color-coded perfusion analysis of CEUS seems to be a promising technique for the dynamic assessment of microvasculature in vascular malformations.
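For illustration, the curve metrics named above (percentage peak enhancement, TTP, AUC, and a first-moment MTT) can be computed from a time-intensity curve along the following lines; the synthetic curve and the moment-based MTT definition are assumptions, not the vendor software's exact algorithm.

```python
import numpy as np

def perfusion_metrics(t, intensity, baseline=None):
    """Simple bolus time-intensity curve metrics (illustrative definitions)."""
    baseline = intensity[0] if baseline is None else baseline
    enh = np.clip(intensity - baseline, 0, None)         # contrast enhancement
    peak_pct = 100.0 * enh.max() / max(baseline, 1e-9)   # % peak enhancement
    ttp = t[np.argmax(enh)]                               # time to peak
    auc = np.trapz(enh, t)                                # area under the curve
    mtt = np.trapz(t * enh, t) / max(auc, 1e-9)           # first-moment transit time
    return peak_pct, ttp, auc, mtt

# Synthetic 60 s cine sequence sampled at 1 Hz (gamma-variate-like bolus).
t = np.linspace(0, 60, 61)
curve = 20 + 80 * (t / 12) ** 2 * np.exp(-(t / 12))
print(perfusion_metrics(t, curve))
```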
Wang, Ming; Long, Qi
2016-09-01
Prediction models for disease risk and prognosis play an important role in biomedical research, and evaluating their predictive accuracy in the presence of censored data is of substantial interest. The standard concordance (c) statistic has been extended to provide a summary measure of predictive accuracy for survival models. Motivated by a prostate cancer study, we address several issues associated with evaluating survival prediction models based on c-statistic with a focus on estimators using the technique of inverse probability of censoring weighting (IPCW). Compared to the existing work, we provide complete results on the asymptotic properties of the IPCW estimators under the assumption of coarsening at random (CAR), and propose a sensitivity analysis under the mechanism of noncoarsening at random (NCAR). In addition, we extend the IPCW approach as well as the sensitivity analysis to high-dimensional settings. The predictive accuracy of prediction models for cancer recurrence after prostatectomy is assessed by applying the proposed approaches. We find that the estimated predictive accuracy for the models in consideration is sensitive to NCAR assumption, and thus identify the best predictive model. Finally, we further evaluate the performance of the proposed methods in both settings of low-dimensional and high-dimensional data under CAR and NCAR through simulations. © 2016, The International Biometric Society.
NASA Astrophysics Data System (ADS)
Kim, Ji-Soo; Han, Soo-Hyung; Ryang, Woo-Hun
2001-12-01
Electrical resistivity mapping was conducted to delineate the boundaries and architecture of the Cretaceous Eumsung Basin. Basin boundaries are effectively clarified in electrical dipole-dipole resistivity sections as high-resistivity-contrast bands. High resistivities most likely originate from the basement of Jurassic granite and Precambrian gneiss, contrasting with the lower resistivities from the infilled sedimentary rocks. The electrical properties of basin-margin boundaries are compatible with the results of vertical electrical soundings and very-low-frequency electromagnetic surveys. A statistical analysis of the resistivity sections, in terms of standard deviation, is found to be an effective scheme for the subsurface reconstruction of basin architecture as well as the surface demarcation of basin-margin faults and brittle fracture zones, which are characterized by much higher standard deviation. The pseudo three-dimensional architecture of the basin is delineated by integrating the composite resistivity structure information from two cross-basin E-W magnetotelluric lines and the dipole-dipole resistivity lines. Based on the statistical analysis, the maximum depth of the basin varies from about 1 km in the northern part to 3 km or more in the middle part. This strong variation supports the view that the basin experienced pull-apart opening with rapid subsidence of the central blocks and asymmetric cross-basinal extension.
40 CFR 796.2750 - Sediment and soil adsorption isotherm.
Code of Federal Regulations, 2014 CFR
2014-07-01
... are highly reproducible. The test provides excellent quantitative data readily amenable to statistical... combination of methods suitable for the identification and quantitative detection of the parent test chemical... quantitative analysis of the parent chemical. (3) Amount of parent test chemical applied, the amount recovered...
40 CFR 796.2750 - Sediment and soil adsorption isotherm.
Code of Federal Regulations, 2013 CFR
2013-07-01
... highly reproducible. The test provides excellent quantitative data readily amenable to statistical... combination of methods suitable for the identification and quantitative detection of the parent test chemical... quantitative analysis of the parent chemical. (3) Amount of parent test chemical applied, the amount recovered...
40 CFR 796.2750 - Sediment and soil adsorption isotherm.
Code of Federal Regulations, 2012 CFR
2012-07-01
... highly reproducible. The test provides excellent quantitative data readily amenable to statistical... combination of methods suitable for the identification and quantitative detection of the parent test chemical... quantitative analysis of the parent chemical. (3) Amount of parent test chemical applied, the amount recovered...
NASA Astrophysics Data System (ADS)
Suzuki, Masahiro; Nakade, Koji; Ido, Atsushi
As the maximum speed of high-speed trains increases, flow-induced vibration of trains in tunnels has become a subject of discussion in Japan. In this paper, we report the results of a study on modifying train shapes as a countermeasure for reducing the unsteady aerodynamic force, based on on-track tests and a wind tunnel test. First, we conduct a statistical analysis of on-track test data to identify the exterior parts of a train that cause the unsteady aerodynamic force. Next, we carry out a wind tunnel test to measure the unsteady aerodynamic force acting on a train in a tunnel and examine train shapes with a particular emphasis on the exterior parts identified by the statistical analysis. The wind tunnel test shows that fins under the car body are effective in reducing the unsteady aerodynamic force. Finally, we test the fins in an on-track test and confirm their effectiveness.
ERIC Educational Resources Information Center
Petocz, Agnes; Newbery, Glenn
2010-01-01
Statistics education in psychology often falls disappointingly short of its goals. The increasing use of qualitative approaches in statistics education research has extended and enriched our understanding of statistical cognition processes, and thus facilitated improvements in statistical education and practices. Yet conceptual analysis, a…
Time Series Analysis Based on Running Mann Whitney Z Statistics
USDA-ARS?s Scientific Manuscript database
A sensitive and objective time series analysis method based on the calculation of Mann Whitney U statistics is described. This method samples data rankings over moving time windows, converts those samples to Mann-Whitney U statistics, and then normalizes the U statistics to Z statistics using Monte-...
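A sketch of one plausible implementation of the running Mann-Whitney Z idea: within each moving window the first and second halves are compared with a Mann-Whitney U test, and U is converted to a Z score with the large-sample normal approximation. The windowing scheme and the normal (rather than Monte Carlo) normalization are assumptions; the manuscript may define the comparison samples differently.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def running_mw_z(series, window=40):
    """Slide a window over the series; compare its two halves with Mann-Whitney U
    and normalize U to a Z statistic using the large-sample approximation."""
    series = np.asarray(series, dtype=float)
    half = window // 2
    z = np.full(series.size, np.nan)
    for start in range(series.size - window + 1):
        a = series[start:start + half]
        b = series[start + half:start + window]
        u = mannwhitneyu(a, b, alternative="two-sided").statistic
        mu = half * half / 2.0
        sigma = np.sqrt(half * half * (2 * half + 1) / 12.0)
        z[start + half] = (u - mu) / sigma        # centered on the window midpoint
    return z

# Example: a step change at t = 100 shows up as a run of large |Z| values.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(1.5, 1, 100)])
print(np.nanmax(np.abs(running_mw_z(x))))
```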
Analyzing Immunoglobulin Repertoires
Chaudhary, Neha; Wesemann, Duane R.
2018-01-01
Somatic assembly of T cell receptor and B cell receptor (BCR) genes produces a vast diversity of lymphocyte antigen recognition capacity. The advent of efficient high-throughput sequencing of lymphocyte antigen receptor genes has recently generated unprecedented opportunities for exploration of adaptive immune responses. With these opportunities have come significant challenges in understanding the analysis techniques that most accurately reflect underlying biological phenomena. In this regard, sample preparation and sequence analysis techniques, which have largely been borrowed and adapted from other fields, continue to evolve. Here, we review current methods and challenges of library preparation, sequencing and statistical analysis of lymphocyte receptor repertoire studies. We discuss the general steps in the process of immune repertoire generation including sample preparation, platforms available for sequencing, processing of sequencing data, measurable features of the immune repertoire, and the statistical tools that can be used for analysis and interpretation of the data. Because BCR analysis harbors additional complexities, such as immunoglobulin (Ig) (i.e., antibody) gene somatic hypermutation and class switch recombination, the emphasis of this review is on Ig/BCR sequence analysis. PMID:29593723
Gene- and pathway-based association tests for multiple traits with GWAS summary statistics.
Kwak, Il-Youp; Pan, Wei
2017-01-01
To identify novel genetic variants associated with complex traits and to shed new insights on underlying biology, in addition to the most popular single SNP-single trait association analysis, it would be useful to explore multiple correlated (intermediate) traits at the gene- or pathway-level by mining existing single GWAS or meta-analyzed GWAS data. For this purpose, we present an adaptive gene-based test and a pathway-based test for association analysis of multiple traits with GWAS summary statistics. The proposed tests are adaptive at both the SNP- and trait-levels; that is, they account for possibly varying association patterns (e.g. signal sparsity levels) across SNPs and traits, thus maintaining high power across a wide range of situations. Furthermore, the proposed methods are general: they can be applied to mixed types of traits, and to Z-statistics or P-values as summary statistics obtained from either a single GWAS or a meta-analysis of multiple GWAS. Our numerical studies with simulated and real data demonstrated the promising performance of the proposed methods. The methods are implemented in R package aSPU, freely and publicly available at: https://cran.r-project.org/web/packages/aSPU/ CONTACT: weip@biostat.umn.eduSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Morphological texture assessment of oral bone as a screening tool for osteoporosis
NASA Astrophysics Data System (ADS)
Analoui, Mostafa; Eggertsson, Hafsteinn; Eckert, George
2001-07-01
Three classes of texture analysis approaches have been employed to assess the textural characteristics of oral bone. A set of linear structuring elements was used to compute granulometric features of trabecular bone. Multifractal analysis was also used to compute the fractal dimension of the corresponding tissues. In addition, some statistical features and histomorphometric parameters were computed. To assess the proposed approach we acquired digital intraoral radiographs of 47 subjects (14 males and 33 females). All radiographs were captured at 12 bits/pixel. Images were converted to binary form through a sliding locally adaptive thresholding approach. Each subject was scanned by DEXA for bone densitometry. Subjects were classified into one of the following three categories according to the World Health Organization (WHO) standard: (1) healthy, (2) osteopenia, and (3) osteoporosis. In this study the fractal dimension showed very low correlation with bone mineral density (BMD) measurements, which did not reach a level of statistical significance (p<0.5). However, entropy of pattern spectrum (EPS), along with the statistical features and histomorphometric parameters, showed correlation coefficients ranging from low to high, with statistical significance for both males and females. The results of this study indicate the utility of this approach for bone texture analysis. It is conjectured that designing a 2-D structuring element, specially tuned to trabecular bone texture, will increase the efficacy of the proposed method.
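A minimal box-counting estimate of the fractal dimension of a binary (thresholded) texture image; the study used multifractal analysis and granulometry, so this is only a simplified mono-fractal sketch on synthetic data.

```python
import numpy as np

def box_counting_dimension(binary_image, box_sizes=(2, 4, 8, 16, 32)):
    """Estimate fractal dimension as the slope of log N(s) versus log(1/s),
    where N(s) is the number of s x s boxes containing foreground pixels."""
    img = np.asarray(binary_image, dtype=bool)
    counts = []
    for s in box_sizes:
        h = (img.shape[0] // s) * s
        w = (img.shape[1] // s) * s
        blocks = img[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())
    slope, _ = np.polyfit(np.log(1.0 / np.array(box_sizes)), np.log(counts), 1)
    return slope

# Toy binary texture standing in for a thresholded radiograph.
rng = np.random.default_rng(2)
img = rng.random((256, 256)) < 0.3
print(box_counting_dimension(img))
```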
NASA Astrophysics Data System (ADS)
Druzgalski, Clara; Mani, Ali
2016-11-01
We investigate electroconvection and its impact on ion transport in a model system comprised of an ion-selective membrane, an aqueous electrolyte, and an external electric field applied normal to the membrane. We develop a direct numerical simulation code to solve the governing Poisson-Nernst-Planck and Navier-Stokes equations in three dimensions using a specialized parallel numerical algorithm and sufficient resolution to capture the high frequency and high wavenumber physics. We show a comprehensive statistical analysis of the transport phenomena in the highly chaotic regime. Qualitative and quantitative comparisons of two-dimensional (2D) and 3D simulations include prediction of the mean concentration fields as well as the spectra of concentration, charge density, and velocity signals. Our analyses reveal a significant quantitative difference between 2D and 3D electroconvection. Furthermore, we show that high-intensity yet short-lived current density hot spots appear randomly on the membrane surface, contributing significantly to the mean current density. By examining cross correlations between current density on the membrane and other field quantities we explore the physical mechanisms leading to current hot spots. We also present analysis of transport fluxes in the context of ensemble-averaged equations. Our analysis reveals that in the highly chaotic regime the mixing layer (ML), which spans the majority of the domain extent, is governed by advective fluctuations. Furthermore, we show that in the ML the mean electromigration fluxes cancel out for positive and negative ions, indicating that the mean transport of total salt content within the ML can be represented via the electroneutral approximation. Finally, we present an assessment of the importance of different length scales in enhancing transport by computing the cross covariance of concentration and velocity fluctuations in the wavenumber space. Our analysis indicates that in the majority of the domain the large scales contribute most significantly to transport, while the effects of small scales become more appreciable in regions very near the membrane.
Data series embedding and scale invariant statistics.
Michieli, I; Medved, B; Ristov, S
2010-06-01
Data sequences acquired from bio-systems such as human gait data, heart rate interbeat data, or DNA sequences exhibit complex dynamics that is frequently described by a long-memory or power-law decay of the autocorrelation function. One way of characterizing that dynamics is through scale-invariant statistics or "fractal-like" behavior. Several methods have been proposed for quantifying scale-invariant parameters of physiological signals. Among them the most common are detrended fluctuation analysis, sample mean variance analyses, power spectral density analysis, R/S analysis, and recently, in the realm of the multifractal approach, wavelet analysis. In this paper it is demonstrated that embedding the time series data in a high-dimensional pseudo-phase space reveals scale-invariant statistics in a simple fashion. The procedure is applied to different stride interval data sets from human gait measurement time series (Physio-Bank data library). Results show that the introduced mapping adequately separates long-memory from random behavior. Smaller gait data sets were analyzed and scale-free trends for limited scale intervals were successfully detected. The method was verified on artificially produced time series with known scaling behavior and with varying content of noise. The possibility for the method to falsely detect long-range dependence in artificially generated short-range-dependence series was investigated. (c) 2009 Elsevier B.V. All rights reserved.
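For comparison, here is a compact sketch of detrended fluctuation analysis (DFA), one of the standard scale-invariance methods cited above; it is not the authors' pseudo-phase-space embedding procedure, and the scale choices are arbitrary.

```python
import numpy as np

def dfa(x, scales=(8, 16, 32, 64, 128)):
    """Return the DFA scaling exponent alpha of a one-dimensional series."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())            # integrated, mean-removed series
    flucts = []
    for s in scales:
        n_seg = profile.size // s
        rms = []
        for i in range(n_seg):
            seg = profile[i * s:(i + 1) * s]
            t = np.arange(s)
            trend = np.polyval(np.polyfit(t, seg, 1), t)   # local linear detrend
            rms.append(np.sqrt(np.mean((seg - trend) ** 2)))
        flucts.append(np.mean(rms))
    alpha, _ = np.polyfit(np.log(scales), np.log(flucts), 1)
    return alpha

rng = np.random.default_rng(3)
print(dfa(rng.normal(size=4096)))   # white noise: alpha close to 0.5
```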
Lee, Ellen E; Della Selva, Megan P; Liu, Anson; Himelhoch, Seth
2015-01-01
Given the significant disability, morbidity and mortality associated with depression, the promising recent trials of ketamine highlight a novel intervention. A meta-analysis was conducted to assess the efficacy of ketamine in comparison with placebo for the reduction of depressive symptoms in patients who meet criteria for a major depressive episode. Two electronic databases were searched in September 2013 for English-language studies that were randomized placebo-controlled trials of ketamine treatment for patients with major depressive disorder or bipolar depression and utilized a standardized rating scale. Studies including participants receiving electroconvulsive therapy and adolescent/child participants were excluded. Five studies were included in the quantitative meta-analysis. The quantitative meta-analysis showed that ketamine significantly reduced depressive symptoms. The overall effect size at day 1 was large and statistically significant with an overall standardized mean difference of 1.01 (95% confidence interval 0.69-1.34) (P<.001), with the effects sustained at 7 days postinfusion. The heterogeneity of the studies was low and not statistically significant, and the funnel plot showed no publication bias. The large and statistically significant effect of ketamine on depressive symptoms supports a promising, new and effective pharmacotherapy with rapid onset, high efficacy and good tolerability. Copyright © 2015. Published by Elsevier Inc.
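The pooling behind such a result can be illustrated with a fixed-effect, inverse-variance combination of standardized mean differences; the per-study numbers below are made up for illustration and are not the trial data from the meta-analysis.

```python
import numpy as np

def pooled_smd(m1, sd1, n1, m2, sd2, n2):
    """Fixed-effect inverse-variance pooling of standardized mean differences
    (Cohen's d with the usual large-sample variance approximation)."""
    m1, sd1, n1, m2, sd2, n2 = map(np.asarray, (m1, sd1, n1, m2, sd2, n2))
    sp = np.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp                                   # per-study effect size
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    w = 1.0 / var_d                                      # inverse-variance weights
    pooled = np.sum(w * d) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

# Hypothetical per-study summaries (treatment vs placebo symptom reduction).
print(pooled_smd(m1=[12, 10, 14], sd1=[6, 5, 7], n1=[15, 20, 12],
                 m2=[5, 4, 6],    sd2=[6, 5, 7], n2=[15, 20, 12]))
```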
[Road Extraction in Remote Sensing Images Based on Spectral and Edge Analysis].
Zhao, Wen-zhi; Luo, Li-qun; Guo, Zhou; Yue, Jun; Yu, Xue-ying; Liu, Hui; Wei, Jing
2015-10-01
Roads are typically man-made objects in urban areas. Road extraction from high-resolution images has important applications for urban planning and transportation development. However, due to the confusion of spectral characteristic, it is difficult to distinguish roads from other objects by merely using traditional classification methods that mainly depend on spectral information. Edge is an important feature for the identification of linear objects (e. g. , roads). The distribution patterns of edges vary greatly among different objects. It is crucial to merge edge statistical information into spectral ones. In this study, a new method that combines spectral information and edge statistical features has been proposed. First, edge detection is conducted by using self-adaptive mean-shift algorithm on the panchromatic band, which can greatly reduce pseudo-edges and noise effects. Then, edge statistical features are obtained from the edge statistical model, which measures the length and angle distribution of edges. Finally, by integrating the spectral and edge statistical features, SVM algorithm is used to classify the image and roads are ultimately extracted. A series of experiments are conducted and the results show that the overall accuracy of proposed method is 93% comparing with only 78% overall accuracy of the traditional. The results demonstrate that the proposed method is efficient and valuable for road extraction, especially on high-resolution images.
Lu, Qiongshi; Li, Boyang; Ou, Derek; Erlendsdottir, Margret; Powles, Ryan L; Jiang, Tony; Hu, Yiming; Chang, David; Jin, Chentian; Dai, Wei; He, Qidu; Liu, Zefeng; Mukherjee, Shubhabrata; Crane, Paul K; Zhao, Hongyu
2017-12-07
Despite the success of large-scale genome-wide association studies (GWASs) on complex traits, our understanding of their genetic architecture is far from complete. Jointly modeling multiple traits' genetic profiles has provided insights into the shared genetic basis of many complex traits. However, large-scale inference sets a high bar for both statistical power and biological interpretability. Here we introduce a principled framework to estimate annotation-stratified genetic covariance between traits using GWAS summary statistics. Through theoretical and numerical analyses, we demonstrate that our method provides accurate covariance estimates, thereby enabling researchers to dissect both the shared and distinct genetic architecture across traits to better understand their etiologies. Among 50 complex traits with publicly accessible GWAS summary statistics (N total ≈ 4.5 million), we identified more than 170 pairs with statistically significant genetic covariance. In particular, we found strong genetic covariance between late-onset Alzheimer disease (LOAD) and amyotrophic lateral sclerosis (ALS), two major neurodegenerative diseases, in single-nucleotide polymorphisms (SNPs) with high minor allele frequencies and in SNPs located in the predicted functional genome. Joint analysis of LOAD, ALS, and other traits highlights LOAD's correlation with cognitive traits and hints at an autoimmune component for ALS. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Psychosocial work environment and mental health--a meta-analytic review.
Stansfeld, Stephen; Candy, Bridget
2006-12-01
To clarify the associations between psychosocial work stressors and mental ill health, a meta-analysis of psychosocial work stressors and common mental disorders was undertaken using longitudinal studies identified through a systematic literature review. The review used a standardized search strategy and strict inclusion and quality criteria in seven databases in 1994-2005. Papers were identified from 24,939 citations covering social determinants of health, 50 relevant papers were identified, 38 fulfilled inclusion criteria, and 11 were suitable for a meta-analysis. The Comprehensive Meta-analysis Programme was used for decision authority, decision latitude, psychological demands, and work social support, components of the job-strain and iso-strain models, and the combination of effort and reward that makes up the effort-reward imbalance model and job insecurity. Cochran's Q statistic assessed the heterogeneity of the results, and the I2 statistic determined any inconsistency between studies. Job strain, low decision latitude, low social support, high psychological demands, effort-reward imbalance, and high job insecurity predicted common mental disorders despite the heterogeneity for psychological demands and social support among men. The strongest effects were found for job strain and effort-reward imbalance. This meta-analysis provides robust consistent evidence that (combinations of) high demands and low decision latitude and (combinations of) high efforts and low rewards are prospective risk factors for common mental disorders and suggests that the psychosocial work environment is important for mental health. The associations are not merely explained by response bias. The impact of work stressors on common mental disorders differs for women and men.
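The heterogeneity statistics mentioned above (Cochran's Q and I2) can be computed from per-study effect estimates and standard errors as follows; the inputs are illustrative, not the reviewed studies.

```python
import numpy as np

def heterogeneity(effects, std_errors):
    """Cochran's Q and the I^2 inconsistency statistic for a set of study effects."""
    effects = np.asarray(effects, dtype=float)
    w = 1.0 / np.asarray(std_errors, dtype=float) ** 2   # inverse-variance weights
    pooled = np.sum(w * effects) / np.sum(w)             # fixed-effect estimate
    q = np.sum(w * (effects - pooled) ** 2)
    df = effects.size - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, i2

# Hypothetical log odds ratios and standard errors from six cohort studies.
print(heterogeneity([0.30, 0.45, 0.20, 0.55, 0.35, 0.40],
                    [0.10, 0.12, 0.15, 0.11, 0.09, 0.14]))
```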
ERIC Educational Resources Information Center
Ba, Harouna; Meade, Terri; Pierson, Elizabeth; Ferguson, Camille; Roy, Amanda; Williams, Hakim
2009-01-01
Forrest County Agricultural High School (FCAHS) is located in Brooklyn, a small rural town in southern Mississippi and part of the Hattiesburg Metropolitan Statistical Area. Unlike the other schools that participated in the Cisco 21S initiative, FCAHS is not part of a larger school district. Therefore, the unit of analysis throughout this summary…
Gal-Nadasan, Norbert; Gal-Nadasan, Emanuela Georgiana; Stoicu-Tivadar, Vasile; Poenaru, Dan V; Popa-Andrei, Diana
2017-01-01
This paper suggests using the Microsoft Kinect to detect the onset of scoliosis in high school students due to incorrect sitting positions. The measurement is done by assessing the overall posture in the orthostatic position using the Microsoft Kinect. During the measuring process several key points of the human body, such as the hips and shoulders, are tracked to form the postural data. The test was done on 30 high school students who spend 6 to 7 hours per day at school desks. The postural data were statistically processed with IBM Watson Analytics. The statistical analysis showed that a prolonged sitting position at such a young age negatively affects the spine and facilitates the appearance of harmful postures such as scoliosis and lordosis.
Nitrifying biomass characterization and monitoring during bioaugmentation in a membrane bioreactor.
D'Anteo, Sibilla; Mannucci, Alberto; Meliani, Matteo; Verni, Franco; Petroni, Giulio; Munz, Giulio; Lubello, Claudio; Mori, Gualtiero; Vannini, Claudia
2015-01-01
A membrane bioreactor (MBR), fed with domestic wastewater, was bioaugmented with nitrifying biomass selected in a side-stream MBR fed with a synthetic, high nitrogen-loaded influent. The evolution of the microbial communities was monitored and comparatively analysed through an extensive bio-molecular investigation (16S rRNA gene library construction and terminal-restriction fragment length polymorphism techniques) followed by statistical analyses. As expected, a highly specialized nitrifying biomass was selected in the side-stream reactor fed with high-strength ammonia synthetic wastewater. The bioaugmentation process caused an increase of nitrifying bacteria of the genera Nitrosomonas (up to more than 30%) and Nitrobacter in the inoculated MBR reactor. The overall structure of the microbial community changed in the mainstream MBR as a result of bioaugmentation. The effect of bioaugmentation on the shift of the microbial community was also verified through statistical analysis.
Clinical competence of Guatemalan and Mexican physicians for family dysfunction management.
Cabrera-Pivaral, Carlos Enrique; Orozco-Valerio, María de Jesús; Celis-de la Rosa, Alfredo; Covarrubias-Bermúdez, María de Los Ángeles; Zavala-González, Marco Antonio
2017-01-01
To evaluate the clinical competence of Mexican and Guatemalan physicians in managing family dysfunction. Cross-comparative study in four first-level care units in Guadalajara, Mexico, and four in Guatemala, Guatemala, based on purposeful sampling, involving 117 and 100 physicians, respectively. Clinical competence was evaluated with a validated instrument comprising 187 items. Non-parametric descriptive and inferential statistical analysis was performed. The percentage of Mexican physicians with high clinical competence was 13.7%, medium 53%, low 24.8%, and explained by chance 8.5%. For the Guatemalan physicians, 14% was high, 63% medium, and 23% explained by chance. There were no statistically significant differences between the countries' healthcare units, but there was a difference between the medians of the Mexican (0.55) and Guatemalan (0.55) physicians (p = 0.02). The proportion of Mexican physicians with high clinical competence was similar to that of the Guatemalans.
High Dimensional Classification Using Features Annealed Independence Rules.
Fan, Jianqing; Fan, Yingying
2008-01-01
Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classification is still poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra, and they propose to use the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as badly as random guessing. Thus, it is paramount to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistics, is proposed based on an upper bound on the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.
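A stripped-down sketch of the idea: rank features by the absolute two-sample t-statistic, keep the top m, and classify with an independence (diagonal, nearest-centroid) rule. This illustrates the principle only; it is not the published FAIR implementation or its data-driven threshold rule.

```python
import numpy as np

def fair_like_classifier(X, y, m):
    """Select m features by |two-sample t| and return a diagonal nearest-centroid rule."""
    X0, X1 = X[y == 0], X[y == 1]
    n0, n1 = len(X0), len(X1)
    mu0, mu1 = X0.mean(0), X1.mean(0)
    s2 = X0.var(0, ddof=1) / n0 + X1.var(0, ddof=1) / n1
    t = (mu1 - mu0) / np.sqrt(s2 + 1e-12)
    keep = np.argsort(-np.abs(t))[:m]                          # top-m features
    pooled_var = (X0.var(0, ddof=1) * (n0 - 1) + X1.var(0, ddof=1) * (n1 - 1)) / (n0 + n1 - 2)

    def predict(Xnew):
        d0 = ((Xnew[:, keep] - mu0[keep]) ** 2 / pooled_var[keep]).sum(1)
        d1 = ((Xnew[:, keep] - mu1[keep]) ** 2 / pooled_var[keep]).sum(1)
        return (d1 < d0).astype(int)

    return predict

# Toy high-dimensional data: 2000 features, only the first 20 informative.
rng = np.random.default_rng(4)
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 2000))
X[y == 1, :20] += 1.0
predict = fair_like_classifier(X, y, m=20)
print((predict(X) == y).mean())      # training accuracy of the selected-feature rule
```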
Semistochastic approach to many electron systems
NASA Astrophysics Data System (ADS)
Grossjean, M. K.; Grossjean, M. F.; Schulten, K.; Tavan, P.
1992-08-01
A Pariser-Parr-Pople (PPP) Hamiltonian of the 8π-electron system of the molecule octatetraene, represented in a configuration-interaction basis (CI basis), is analyzed with respect to the statistical properties of its matrix elements. Based on this analysis we develop an effective Hamiltonian, which represents virtual excitations by a Gaussian orthogonal ensemble (GOE). We also examine numerical approaches which replace the original Hamiltonian by a semistochastically generated CI matrix. In that CI matrix, the matrix elements of high-energy excitations are chosen randomly according to distributions reflecting the statistics of the original CI matrix.
Avalappampatty Sivasamy, Aneetha; Sundan, Bose
2015-01-01
The ever expanding communication requirements in today's world demand extensive and efficient network systems with equally efficient and reliable security features integrated for safe, confident, and secured communication and data transfer. Providing effective security protocols for any network environment, therefore, assumes paramount importance. Attempts are made continuously for designing more efficient and dynamic network intrusion detection models. In this work, an approach based on Hotelling's T2 method, a multivariate statistical analysis technique, has been employed for intrusion detection, especially in network environments. Components such as preprocessing, multivariate statistical analysis, and attack detection have been incorporated in developing the multivariate Hotelling's T2 statistical model and necessary profiles have been generated based on the T-square distance metrics. With a threshold range obtained using the central limit theorem, observed traffic profiles have been classified either as normal or attack types. Performance of the model, as evaluated through validation and testing using KDD Cup'99 dataset, has shown very high detection rates for all classes with low false alarm rates. Accuracy of the model presented in this work, in comparison with the existing models, has been found to be much better. PMID:26357668
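A minimal sketch of the Hotelling's T2 detection step described above: a profile (mean and covariance) is built from normal training traffic, and new observations are flagged when their T2 distance exceeds a threshold. The chi-square threshold used here is a common simplification; the paper derives its threshold range from the central limit theorem, and the feature preprocessing is omitted.

```python
import numpy as np
from scipy.stats import chi2

class HotellingT2Detector:
    """Flag multivariate observations whose Hotelling's T^2 exceeds a threshold."""

    def fit(self, X_normal, alpha=0.01):
        X = np.asarray(X_normal, dtype=float)
        self.mean_ = X.mean(axis=0)
        cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])   # regularized
        self.cov_inv_ = np.linalg.inv(cov)
        # Simple chi-square threshold on the quadratic-form statistic.
        self.threshold_ = chi2.ppf(1 - alpha, df=X.shape[1])
        return self

    def t2(self, X):
        d = np.asarray(X, dtype=float) - self.mean_
        return np.einsum("ij,jk,ik->i", d, self.cov_inv_, d)

    def predict(self, X):
        return (self.t2(X) > self.threshold_).astype(int)   # 1 = attack/anomaly

# Toy traffic features: normal training data plus a shifted "attack" batch.
rng = np.random.default_rng(5)
normal = rng.normal(0, 1, size=(1000, 8))
attack = rng.normal(3, 1, size=(50, 8))
det = HotellingT2Detector().fit(normal)
print(det.predict(attack).mean(), det.predict(normal).mean())
```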
Statistical Quality Control of Moisture Data in GEOS DAS
NASA Technical Reports Server (NTRS)
Dee, D. P.; Rukhovets, L.; Todling, R.
1999-01-01
A new statistical quality control algorithm was recently implemented in the Goddard Earth Observing System Data Assimilation System (GEOS DAS). The final step in the algorithm consists of an adaptive buddy check that either accepts or rejects outlier observations based on a local statistical analysis of nearby data. A basic assumption in any such test is that the observed field is spatially coherent, in the sense that nearby data can be expected to confirm each other. However, the buddy check resulted in excessive rejection of moisture data, especially during the Northern Hemisphere summer. The analysis moisture variable in GEOS DAS is water vapor mixing ratio. Observational evidence shows that the distribution of mixing ratio errors is far from normal. Furthermore, spatial correlations among mixing ratio errors are highly anisotropic and difficult to identify. Both factors contribute to the poor performance of the statistical quality control algorithm. To alleviate the problem, we applied the buddy check to relative humidity data instead. This variable explicitly depends on temperature and therefore exhibits a much greater spatial coherence. As a result, reject rates of moisture data are much more reasonable and homogeneous in time and space.
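A simplified illustration of a buddy check of the kind described: each observation is compared with the mean of its spatial neighbours and rejected when the discrepancy exceeds a tolerance scaled by the local spread. The adaptive aspects of the GEOS DAS algorithm (background-error statistics, iterative re-checking) are omitted, and the radius and tolerance below are arbitrary.

```python
import numpy as np

def buddy_check(lon, lat, values, radius_deg=2.0, tol=3.0):
    """Return a boolean mask of accepted observations (simplified buddy check)."""
    lon, lat, values = map(np.asarray, (lon, lat, values))
    accept = np.ones(values.size, dtype=bool)
    for i in range(values.size):
        dist = np.hypot(lon - lon[i], lat - lat[i])
        buddies = (dist > 0) & (dist < radius_deg)
        if buddies.sum() < 3:
            continue                                    # too few neighbours to judge
        spread = values[buddies].std(ddof=1)
        if spread == 0:
            continue
        if abs(values[i] - values[buddies].mean()) > tol * spread:
            accept[i] = False                           # outlier relative to its buddies
    return accept

rng = np.random.default_rng(6)
lon, lat = rng.uniform(0, 10, 200), rng.uniform(0, 10, 200)
rh = rng.normal(60, 5, 200)                             # relative humidity field (%)
rh[0] = 5.0                                             # gross outlier
print(buddy_check(lon, lat, rh)[0])                     # expected: False
```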
auf dem Keller, Ulrich; Prudova, Anna; Gioia, Magda; Butler, Georgina S.; Overall, Christopher M.
2010-01-01
Terminal amine isotopic labeling of substrates (TAILS), our recently introduced platform for quantitative N-terminome analysis, enables wide dynamic range identification of original mature protein N-termini and protease cleavage products. Modifying TAILS by use of isobaric tag for relative and absolute quantification (iTRAQ)-like labels for quantification together with a robust statistical classifier derived from experimental protease cleavage data, we report reliable and statistically valid identification of proteolytic events in complex biological systems in MS2 mode. The statistical classifier is supported by a novel parameter evaluating ion intensity-dependent quantification confidences of single peptide quantifications, the quantification confidence factor (QCF). Furthermore, the isoform assignment score (IAS) is introduced, a new scoring system for the evaluation of single peptide-to-protein assignments based on high confidence protein identifications in the same sample prior to negative selection enrichment of N-terminal peptides. By these approaches, we identified and validated, in addition to known substrates, low abundance novel bioactive MMP-2 targets including the plasminogen receptor S100A10 (p11) and the proinflammatory cytokine proEMAP/p43 that were previously undescribed. PMID:20305283
Adaptive filtering in biological signal processing.
Iyer, V K; Ploysongsang, Y; Ramamoorthy, P A
1990-01-01
The high dependence of conventional optimal filtering methods on a priori knowledge of the signal and noise statistics renders them ineffective in dealing with signals whose statistics cannot be predetermined accurately. Adaptive filtering methods offer a better alternative, since the a priori knowledge of statistics is less critical, real-time processing is possible, and the computations are less expensive for this approach. Adaptive filtering methods compute the filter coefficients "on-line", converging to the optimal values in the least-mean-square (LMS) error sense. Adaptive filtering is therefore apt for dealing with the "unknown" statistics situation and has been applied extensively in areas like communication, speech, radar, sonar, seismology, and biological signal processing and analysis for channel equalization, interference and echo canceling, line enhancement, signal detection, system identification, spectral analysis, beamforming, modeling, control, etc. In this review article adaptive filtering in the context of biological signals is reviewed. An intuitive approach to the underlying theory of adaptive filters and its applicability are presented. Applications of the principles in biological signal processing are discussed in a manner that brings out the key ideas involved. Current and potential future directions in adaptive biological signal processing are also discussed.
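A compact LMS noise-cancellation sketch of the kind described: the filter adapts its coefficients on-line to minimize the mean-square error between a primary (signal plus noise) channel and a filtered reference-noise channel. The signal, interference, and step size are illustrative choices.

```python
import numpy as np

def lms_cancel(primary, reference, n_taps=16, mu=0.01):
    """Adaptive noise cancellation with the least-mean-square (LMS) algorithm."""
    w = np.zeros(n_taps)                     # filter coefficients, adapted on-line
    out = np.zeros_like(primary)
    for n in range(n_taps, len(primary)):
        x = reference[n - n_taps:n][::-1]    # most recent reference samples
        y = w @ x                            # estimate of the noise in the primary
        e = primary[n] - y                   # error = cleaned signal sample
        w += 2 * mu * e * x                  # LMS coefficient update
        out[n] = e
    return out

# Example: recover a 1 Hz tone corrupted by delayed 50 Hz interference.
fs = 500
t = np.arange(0, 4, 1 / fs)
signal = np.sin(2 * np.pi * 1 * t)
interference = np.sin(2 * np.pi * 50 * t)
primary = signal + 0.8 * np.roll(interference, 3)       # noise reaches primary with a lag
cleaned = lms_cancel(primary, interference)
print(np.mean((cleaned[1000:] - signal[1000:]) ** 2))   # small residual error
```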
Impact of Damping Uncertainty on SEA Model Response Variance
NASA Technical Reports Server (NTRS)
Schiller, Noah; Cabell, Randolph; Grosveld, Ferdinand
2010-01-01
Statistical Energy Analysis (SEA) is commonly used to predict high-frequency vibroacoustic levels. This statistical approach provides the mean response over an ensemble of random subsystems that share the same gross system properties such as density, size, and damping. Recently, techniques have been developed to predict the ensemble variance as well as the mean response. However these techniques do not account for uncertainties in the system properties. In the present paper uncertainty in the damping loss factor is propagated through SEA to obtain more realistic prediction bounds that account for both ensemble and damping variance. The analysis is performed on a floor-equipped cylindrical test article that resembles an aircraft fuselage. Realistic bounds on the damping loss factor are determined from measurements acquired on the sidewall of the test article. The analysis demonstrates that uncertainties in damping have the potential to significantly impact the mean and variance of the predicted response.
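To illustrate the propagation idea on the simplest possible SEA relation, the single-subsystem power balance E = P_in / (omega * eta) can be sampled over an assumed damping-loss-factor distribution. This is a toy Monte Carlo sketch; the actual analysis used the full multi-subsystem SEA model of the test article, and the loss-factor spread below is an assumption.

```python
import numpy as np

# Monte Carlo propagation of damping-loss-factor uncertainty through the
# single-subsystem SEA power balance E = P_in / (omega * eta).
rng = np.random.default_rng(7)

p_in = 1.0                      # input power, W (illustrative)
omega = 2 * np.pi * 1000.0      # band centre frequency, rad/s
eta_nominal = 0.02              # nominal damping loss factor
eta_cov = 0.3                   # assumed 30% coefficient of variation

# A lognormal spread keeps the loss factor positive with the nominal mean.
sigma = np.sqrt(np.log(1 + eta_cov**2))
eta = rng.lognormal(np.log(eta_nominal) - 0.5 * sigma**2, sigma, size=100_000)

energy = p_in / (omega * eta)
level_db = 10 * np.log10(energy / energy.mean())
print(energy.mean(), energy.std(), level_db.std())   # mean, spread, spread in dB
```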
UNCERTAINTY ON RADIATION DOSES ESTIMATED BY BIOLOGICAL AND RETROSPECTIVE PHYSICAL METHODS.
Ainsbury, Elizabeth A; Samaga, Daniel; Della Monaca, Sara; Marrale, Maurizio; Bassinet, Celine; Burbidge, Christopher I; Correcher, Virgilio; Discher, Michael; Eakins, Jon; Fattibene, Paola; Güçlü, Inci; Higueras, Manuel; Lund, Eva; Maltar-Strmecki, Nadica; McKeever, Stephen; Rääf, Christopher L; Sholom, Sergey; Veronese, Ivan; Wieser, Albrecht; Woda, Clemens; Trompier, Francois
2018-03-01
Biological and physical retrospective dosimetry are recognised as key techniques to provide individual estimates of dose following unplanned exposures to ionising radiation. Whilst there has been a relatively large amount of recent development in the biological and physical procedures, development of statistical analysis techniques has failed to keep pace. The aim of this paper is to review the current state of the art in uncertainty analysis techniques across the 'EURADOS Working Group 10-Retrospective dosimetry' members, to give concrete examples of implementation of the techniques recommended in the international standards, and to further promote the use of Monte Carlo techniques to support characterisation of uncertainties. It is concluded that sufficient techniques are available and in use by most laboratories for acute, whole body exposures to highly penetrating radiation, but further work will be required to ensure that statistical analysis is always wholly sufficient for the more complex exposure scenarios.
Agin, Patricia Poh; Edmonds, Susan H
2002-08-01
The goals of this study were (i) to demonstrate that existing and widely used sun protection factor (SPF) test methodologies can produce accurate and reproducible results for high SPF formulations and (ii) to provide data on the number of test-subjects needed, the variability of the data, and the appropriate exposure increments needed for testing high SPF formulations. Three high SPF formulations were tested, according to the Food and Drug Administration's (FDA) 1993 tentative final monograph (TFM) 'very water resistant' test method and/or the 1978 proposed monograph 'waterproof' test method, within one laboratory. A fourth high SPF formulation was tested at four independent SPF testing laboratories, using the 1978 waterproof SPF test method. All laboratories utilized xenon arc solar simulators. The data illustrate that the testing conducted within one laboratory, following either the 1978 proposed or the 1993 TFM SPF test method, was able to reproducibly determine the SPFs of the formulations tested, using either the statistical analysis method in the proposed monograph or the statistical method described in the TFM. When one formulation was tested at four different laboratories, the anticipated variation in the data owing to the equipment and other operational differences was minimized through the use of the statistical method described in the 1993 monograph. The data illustrate that either the 1978 proposed monograph SPF test method or the 1993 TFM SPF test method can provide accurate and reproducible results for high SPF formulations. Further, these results can be achieved with panels of 20-25 subjects with an acceptable level of variability. Utilization of the statistical controls from the 1993 sunscreen monograph can help to minimize lab-to-lab variability for well-formulated products.
Verzotto, Davide; M Teo, Audrey S; Hillmer, Axel M; Nagarajan, Niranjan
2016-01-01
Resolution of complex repeat structures and rearrangements in the assembly and analysis of large eukaryotic genomes is often aided by a combination of high-throughput sequencing and genome-mapping technologies (for example, optical restriction mapping). In particular, mapping technologies can generate sparse maps of large DNA fragments (150 kilo base pairs (kbp) to 2 Mbp) and thus provide a unique source of information for disambiguating complex rearrangements in cancer genomes. Despite their utility, combining high-throughput sequencing and mapping technologies has been challenging because of the lack of efficient and sensitive map-alignment algorithms for robustly aligning error-prone maps to sequences. We introduce a novel seed-and-extend glocal (short for global-local) alignment method, OPTIMA (and a sliding-window extension for overlap alignment, OPTIMA-Overlap), which is the first to create indexes for continuous-valued mapping data while accounting for mapping errors. We also present a novel statistical model, agnostic with respect to technology-dependent error rates, for conservatively evaluating the significance of alignments without relying on expensive permutation-based tests. We show that OPTIMA and OPTIMA-Overlap outperform other state-of-the-art approaches (1.6-2 times more sensitive) and are more efficient (170-200 %) and precise in their alignments (nearly 99 % precision). These advantages are independent of the quality of the data, suggesting that our indexing approach and statistical evaluation are robust, provide improved sensitivity and guarantee high precision.
Visualizing statistical significance of disease clusters using cartograms.
Kronenfeld, Barry J; Wong, David W S
2017-05-15
Health officials and epidemiological researchers often use maps of disease rates to identify potential disease clusters. Because these maps exaggerate the prominence of low-density districts and hide potential clusters in urban (high-density) areas, many researchers have used density-equalizing maps (cartograms) as a basis for epidemiological mapping. However, we do not have existing guidelines for visual assessment of statistical uncertainty. To address this shortcoming, we develop techniques for visual determination of the statistical significance of clusters spanning one or more districts on a cartogram. We developed the techniques within a geovisual analytics framework that does not rely on automated significance testing, and can therefore facilitate visual analysis to detect clusters that automated techniques might miss. On a cartogram of the at-risk population, the statistical significance of a disease cluster can be determined from the rate, area and shape of the cluster under standard hypothesis testing scenarios. We develop formulae to determine, for a given rate, the area required for statistical significance of a priori and a posteriori designated regions under certain test assumptions. Uniquely, our approach enables dynamic inference of aggregate regions formed by combining individual districts. The method is implemented in interactive tools that provide choropleth mapping, automated legend construction and dynamic search tools to facilitate cluster detection and assessment of the validity of tested assumptions. A case study of leukemia incidence analysis in California demonstrates the ability to visually distinguish between statistically significant and insignificant regions. The proposed geovisual analytics approach enables intuitive visual assessment of statistical significance of arbitrarily defined regions on a cartogram. Our research prompts a broader discussion of the role of geovisual exploratory analyses in disease mapping and the appropriate framework for visually assessing the statistical significance of spatial clusters.
NASA Astrophysics Data System (ADS)
Bruña, Ricardo; Poza, Jesús; Gómez, Carlos; García, María; Fernández, Alberto; Hornero, Roberto
2012-06-01
Alzheimer's disease (AD) is the most common cause of dementia. Over the last few years, a considerable effort has been devoted to exploring new biomarkers. Nevertheless, a better understanding of brain dynamics is still required to optimize therapeutic strategies. In this regard, the characterization of mild cognitive impairment (MCI) is crucial, due to the high conversion rate from MCI to AD. However, only a few studies have focused on the analysis of magnetoencephalographic (MEG) rhythms to characterize AD and MCI. In this study, we assess the ability of several parameters derived from information theory to describe spontaneous MEG activity from 36 AD patients, 18 MCI subjects and 26 controls. Three entropies (Shannon, Tsallis and Rényi entropies), one disequilibrium measure (based on the Euclidean distance, ED) and three statistical complexities (based on the Lopez-Ruiz-Mancini-Calbet complexity, LMC) were used to estimate the irregularity and statistical complexity of MEG activity. Statistically significant differences between AD patients and controls were obtained with all parameters (p < 0.01). In addition, statistically significant differences between MCI subjects and controls were achieved by ED and LMC (p < 0.05). In order to assess the diagnostic ability of the parameters, a linear discriminant analysis with a leave-one-out cross-validation procedure was applied. The accuracies reached 83.9% and 65.9% to discriminate AD and MCI subjects from controls, respectively. Our findings suggest that MCI subjects exhibit an intermediate pattern of abnormalities between normal aging and AD. Furthermore, the proposed parameters provide a new description of brain dynamics in AD and MCI.
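The irregularity and complexity measures named above can be written down directly for a discrete distribution (for example, a normalized power spectrum of an MEG epoch). Normalization conventions vary across papers, so the specific definitions below are assumptions rather than the study's exact formulas.

```python
import numpy as np

def entropy_measures(p, q=2.0):
    """Shannon, Tsallis and Renyi entropies, Euclidean disequilibrium,
    and an LMC-style statistical complexity for a discrete distribution p."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    nz = p[p > 0]
    n = p.size

    shannon = -np.sum(nz * np.log(nz))
    tsallis = (1.0 - np.sum(nz ** q)) / (q - 1.0)
    renyi = np.log(np.sum(nz ** q)) / (1.0 - q)
    diseq = np.sum((p - 1.0 / n) ** 2)            # Euclidean distance to uniformity
    lmc = (shannon / np.log(n)) * diseq           # normalized entropy times disequilibrium

    return dict(shannon=shannon, tsallis=tsallis, renyi=renyi,
                disequilibrium=diseq, lmc_complexity=lmc)

# Example: normalized power spectrum of a toy signal epoch.
rng = np.random.default_rng(8)
spectrum = np.abs(np.fft.rfft(rng.normal(size=512))) ** 2
print(entropy_measures(spectrum))
```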
NASA Astrophysics Data System (ADS)
Jordan, Phil; Melland, Alice; Shore, Mairead; Mellander, Per-Erik; Shortle, Ger; Ryan, David; Crockford, Lucy; Macintosh, Katrina; Campbell, Julie; Arnscheidt, Joerg; Cassidy, Rachel
2014-05-01
A complete appraisal of material fluxes in flowing waters is really only possible with high time resolution data synchronous with measurements of discharge. Described by Kirchner et al. (2004; Hydrological Processes, 18/7) as the high-frequency wave of the future, and with regard to disentangling signal noise from process pattern, this challenge has been met in terms of nutrient flux monitoring by automated bankside analysis. In Ireland over a ten-year period, time-series nutrient data collected on a sub-hourly basis in rivers have been used to distinguish fluxes from different catchment sources and pathways and to provide more certain temporal pictures of flux for the comparative definition of catchment nutrient dynamics. In catchments where nutrient fluxes are particularly high and exhibit a mix of extreme diffuse and point source influences, high time resolution data analysis indicates that there are no satisfactory statistical proxies for seasonal or annual flux predictions that use coarse datasets, or at least it exposes the limits of statistical approaches at the catchment scale and in relation to hydrological response. This has profound implications for catchment monitoring programmes that rely on modelled relationships. However, using high resolution monitoring for long term assessments of catchment mitigation measures comes with further challenges. Sustaining continuous wet chemistry analysis at river stations is resource intensive in terms of capital, maintenance and quality assurance. Furthermore, big data capture requires investment in data management systems and analysis. These two institutional challenges are magnified when considering the extended time period required to identify the influences of land-based nutrient control measures on water-based systems. Separating the 'climate signal' from the 'source signal' in river nutrient flux data is a major analysis challenge, more so when tackled with anything but higher resolution data. Nevertheless, there is scope to lower costs in bankside analysis through technology development, and the scientific advantages of these data are clear and exciting. When integrating its use with policy appraisal, it must be made clear that the advances in river process understanding from high resolution monitoring data capture come as a package with the ability to make more informed decisions through an investment in better information.
IVHS Countermeasures for Rear-End Collisions, Task 1; Vol. II: Statistical Analysis
DOT National Transportation Integrated Search
1994-02-25
This report is from the NHTSA sponsored program, "IVHS Countermeasures for Rear-End Collisions". This Volume, Volume II, Statistical Analysis, presents the statistical analysis of rear-end collision accident data that characterizes the accidents with...
Notes on numerical reliability of several statistical analysis programs
Landwehr, J.M.; Tasker, Gary D.
1999-01-01
This report presents a benchmark analysis of several statistical analysis programs currently in use in the USGS. The benchmark consists of a comparison between the values provided by a statistical analysis program for variables in the reference data set ANASTY and their known or calculated theoretical values. The ANASTY data set is an amendment of the Wilkinson NASTY data set that has been used in the statistical literature to assess the reliability (computational correctness) of calculated analytical results.
Computer Automated Ultrasonic Inspection System
1985-02-06
Only table-of-contents fragments are available for this report. Recoverable section headings include: Statistical Analysis Capability; Nondestructive Evaluation Terminal Hardware; Nondestructive Evaluation Terminal Vendor; Create a Hold Tape; System Status; Statistical Analysis (Statistical Analysis Data Extraction; Statistical Analysis Report and Display Generation); Quality Assurance Reports; and Nondestructive Inspection.
Agile Airmen: Developing the Capacity to Quickly Create Innovative Ideas
2011-03-23
...economic growth. In contrast, a 2008 statistical analysis finds a high correlation to economic growth. Eric Hanushek and Ludger Woessmann studied... The Hanushek and Woessmann analysis identified STEM leaders as vital to America's long-term prosperity, but having quality teachers who can teach STEM... (Eric A. Hanushek and Ludger Woessmann, "Do Better Schools Lead to More Growth? Cognitive Skills, Economic Outcomes...")
Method for factor analysis of GC/MS data
Van Benthem, Mark H; Kotula, Paul G; Keenan, Michael R
2012-09-11
The method of the present invention provides a fast, robust, and automated multivariate statistical analysis of gas chromatography/mass spectroscopy (GC/MS) data sets. The method can involve systematic elimination of undesired, saturated peak masses to yield data that follow a linear, additive model. The cleaned data can then be subjected to a combination of PCA and orthogonal factor rotation followed by refinement with MCR-ALS to yield highly interpretable results.
Detailed Uncertainty Analysis of the ZEM-3 Measurement System
NASA Technical Reports Server (NTRS)
Mackey, Jon; Sehirlioglu, Alp; Dynys, Fred
2014-01-01
The measurement of the Seebeck coefficient and electrical resistivity is critical to the investigation of all thermoelectric systems. It therefore stands to reason that the measurement uncertainty must be well understood in order to report ZT values which are accurate and trustworthy. A detailed uncertainty analysis of the ZEM-3 measurement system has been performed. The uncertainty analysis calculates error in the electrical resistivity measurement as a result of sample geometry tolerance, probe geometry tolerance, statistical error, and multi-meter uncertainty. The uncertainty on the Seebeck coefficient includes probe wire correction factors, statistical error, multi-meter uncertainty, and most importantly the cold-finger effect. The cold-finger effect plagues all potentiometric (four-probe) Seebeck measurement systems, as heat parasitically transfers through the thermocouple probes. The effect leads to an asymmetric over-estimation of the Seebeck coefficient. A thermal finite element analysis allows for quantification of the phenomenon and provides an estimate of the uncertainty on the Seebeck coefficient. The thermoelectric power factor has been found to have an uncertainty of +9-14 at high temperature and 9 near room temperature.
Escalante, Yolanda; Saavedra, Jose M; Tella, Victor; Mansilla, Mirella; García-Hermoso, Antonio; Domínguez, Ana M
2013-04-01
The aims of this study were (a) to compare water polo game-related statistics by context (winning and losing teams) and phase (preliminary, classification, and semifinal/bronze medal/gold medal), and (b) identify characteristics that discriminate performances for each phase. The game-related statistics of the 230 men's matches played in World Championships (2007, 2009, and 2011) and European Championships (2008 and 2010) were analyzed. Differences between contexts (winning or losing teams) in each phase (preliminary, classification, and semifinal/bronze medal/gold medal) were determined using the chi-squared statistic, also calculating the effect sizes of the differences. A discriminant analysis was then performed after the sample-splitting method according to context (winning and losing teams) in each of the 3 phases. It was found that the game-related statistics differentiate the winning from the losing teams in each phase of an international championship. The differentiating variables are both offensive and defensive, including action shots, sprints, goalkeeper-blocked shots, and goalkeeper-blocked action shots. However, the number of discriminatory variables decreases as the phase becomes more demanding and the teams become more equally matched. The discriminant analysis showed the game-related statistics to discriminate performance in all phases (preliminary, classificatory, and semifinal/bronze medal/gold medal phase) with high percentages (91, 90, and 73%, respectively). Again, the model selected both defensive and offensive variables.
Statistical wind analysis for near-space applications
NASA Astrophysics Data System (ADS)
Roney, Jason A.
2007-09-01
Statistical wind models were developed based on the existing observational wind data for near-space altitudes between 60 000 and 100 000 ft (18-30 km) above ground level (AGL) at two locations, Akron, OH, USA, and White Sands, NM, USA. These two sites are envisioned as playing a crucial role in the first flights of high-altitude airships. The analysis shown in this paper has not been previously applied to this region of the stratosphere for such an application. Standard statistics were compiled for these data, such as mean, median, maximum wind speed, and standard deviation, and the data were modeled with Weibull distributions. These statistics indicated that, on a yearly average, there is a lull or a "knee" in the wind between 65 000 and 72 000 ft AGL (20-22 km). From the standard statistics, trends at both locations indicated substantial seasonal variation in the mean wind speed at these heights. The yearly and monthly statistical modeling indicated that Weibull distributions were a reasonable model for the data. Forecasts and hindcasts were done by using a Weibull model based on 2004 data and comparing the model with the 2003 and 2005 data. The 2004 distribution was also a reasonable model for these years. Lastly, the Weibull distribution and cumulative distribution function were used to predict the 50%, 95%, and 99% winds, which are directly related to the expected power requirements of a near-space station-keeping airship. These values indicated that using only the standard deviation of the mean may underestimate the operational conditions.
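A short sketch of the Weibull modeling and percentile-wind step described above, using SciPy; the synthetic wind speeds and the shape/scale values are placeholders, not the Akron or White Sands observations.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
wind = stats.weibull_min.rvs(c=2.0, scale=12.0, size=1000, random_state=rng)  # synthetic wind speeds, m/s

# Standard statistics
print(f"mean {wind.mean():.1f}, median {np.median(wind):.1f}, max {wind.max():.1f}, std {wind.std(ddof=1):.1f} m/s")

# Two-parameter Weibull fit (location fixed at zero), then the 50/95/99 % design winds
c, loc, scale = stats.weibull_min.fit(wind, floc=0)
for q in (0.50, 0.95, 0.99):
    print(f"{q:.0%} wind: {stats.weibull_min.ppf(q, c, loc=loc, scale=scale):.1f} m/s")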
Interpretation of correlations in clinical research.
Hung, Man; Bounsanga, Jerry; Voss, Maren Wright
2017-11-01
Critically analyzing research is a key skill in evidence-based practice and requires knowledge of research methods, results interpretation, and applications, all of which rely on a foundation based in statistics. Evidence-based practice makes high demands on trained medical professionals to interpret an ever-expanding array of research evidence. As clinical training emphasizes medical care rather than statistics, it is useful to review the basics of statistical methods and what they mean for interpreting clinical studies. We reviewed the basic concepts of correlational associations, violations of normality, unobserved variable bias, sample size, and alpha inflation. The foundations of causal inference were discussed and sound statistical analyses were examined. We discuss four ways in which correlational analysis is misused, including causal inference overreach, over-reliance on significance, alpha inflation, and sample size bias. Recent published studies in the medical field provide evidence of causal assertion overreach drawn from correlational findings. The findings present a primer on the assumptions and nature of correlational methods of analysis and urge clinicians to exercise appropriate caution as they critically analyze the evidence before them and evaluate evidence that supports practice. Critically analyzing new evidence requires statistical knowledge in addition to clinical knowledge. Studies can overstate relationships, expressing causal assertions when only correlational evidence is available. Failure to account for the effect of sample size in the analyses tends to overstate the importance of predictive variables. It is important not to overemphasize the statistical significance without consideration of effect size and whether differences could be considered clinically meaningful.
High precision mass measurements for wine metabolomics
Roullier-Gall, Chloé; Witting, Michael; Gougeon, Régis D.; Schmitt-Kopplin, Philippe
2014-01-01
An overview of the critical steps for the non-targeted Ultra-High Performance Liquid Chromatography coupled with Quadrupole Time-of-Flight Mass Spectrometry (UPLC-Q-ToF-MS) analysis of wine chemistry is given, ranging from the study design, data preprocessing and statistical analyses, to markers identification. UPLC-Q-ToF-MS data was enhanced by the alignment of exact mass data from FTICR-MS, and marker peaks were identified using UPLC-Q-ToF-MS2. In combination with multivariate statistical tools and the annotation of peaks with metabolites from relevant databases, this analytical process provides a fine description of the chemical complexity of wines, as exemplified in the case of red (Pinot noir) and white (Chardonnay) wines from various geographic origins in Burgundy. PMID:25431760
High precision mass measurements for wine metabolomics
NASA Astrophysics Data System (ADS)
Roullier-Gall, Chloé; Witting, Michael; Gougeon, Régis; Schmitt-Kopplin, Philippe
2014-11-01
An overview of the critical steps for the non-targeted Ultra-High Performance Liquid Chromatography coupled with Quadrupole Time-of-Flight Mass Spectrometry (UPLC-Q-ToF-MS) analysis of wine chemistry is given, ranging from the study design, data preprocessing and statistical analyses, to markers identification. UPLC-Q-ToF-MS data was enhanced by the alignment of exact mass data from FTICR-MS, and marker peaks were identified using UPLC-Q-ToF-MS². In combination with multivariate statistical tools and the annotation of peaks with metabolites from relevant databases, this analytical process provides a fine description of the chemical complexity of wines, as exemplified in the case of red (Pinot noir) and white (Chardonnay) wines from various geographic origins in Burgundy.
Results of a joint NOAA/NASA sounder simulation study
NASA Technical Reports Server (NTRS)
Phillips, N.; Susskind, Joel; Mcmillin, L.
1988-01-01
This paper presents the results of a joint NOAA and NASA sounder simulation study in which the accuracies of atmospheric temperature profiles and surface skin temperature measurements retrieved from two sounders were compared: (1) the currently used IR temperature sounder HIRS2 (High-resolution Infrared Radiation Sounder 2); and (2) the recently proposed high-spectral-resolution IR sounder AMTS (Advanced Moisture and Temperature Sounder). Simulations were conducted for both clear and partial cloud conditions. Data were analyzed at NASA using a physical inversion technique and at NOAA using a statistical technique. Results show significant improvement of AMTS compared to HIRS2 for both clear and cloudy conditions. The improvements are indicated by both methods of data analysis, but the physical retrievals outperform the statistical retrievals.
Climate Considerations Of The Electricity Supply Systems In Industries
NASA Astrophysics Data System (ADS)
Asset, Khabdullin; Zauresh, Khabdullina
2014-12-01
The study focuses on the climate considerations of electricity supply systems in the pellet industry. The developed analysis model consists of two modules: a statistical module for evaluating active power losses and a module for evaluating climate aspects. The statistical module is built on a universal mathematical model of electrical systems and industrial load components, and forms the basis for detailed accounting of power losses across voltage levels. On the basis of this universal model, a set of programs was designed to perform the calculations and experimental research, yielding the statistical characteristics of the power losses and loads of the electricity supply systems and defining how these characteristics change. Within the module, several methods and algorithms were developed for calculating the parameters of equivalent circuits of low- and high-voltage ADC and SD machines with a massive smooth rotor with laminated poles. The climate module includes an analysis of experimental data from the power supply system in pellet production. It allows identification of GHG emission reduction parameters: operating hours, type of electrical motors, load factor values, and deviation from the standard voltage value.
NASA Astrophysics Data System (ADS)
Roy, P. K.; Pal, S.; Banerjee, G.; Biswas Roy, M.; Ray, D.; Majumder, A.
2014-12-01
Rivers are considered one of the main sources of freshwater all over the world, so the analysis and maintenance of this water resource is globally regarded as a matter of major concern. This paper deals with the assessment of the surface water quality of the Ichamati river using multivariate statistical techniques. Eight distinct surface water quality observation stations were established and samples were collected. Statistical techniques were applied to the physico-chemical parameters and siltation depth of the collected samples. Cluster analysis was performed to determine the relations between surface water quality and siltation depth of the Ichamati river. Multiple regression and mathematical equation modeling were used to characterize the surface water quality of the Ichamati river on the basis of the physico-chemical parameters. It was found that the surface water quality downstream differed from that upstream. The analysis of the water quality parameters of the Ichamati river clearly indicates a high pollution load on the river water, which can be attributed to agricultural discharge, tidal effects and soil erosion. The results further reveal that water quality degraded as the depth of siltation increased.
Patterson, Megan S; Goodson, Patricia
2017-05-01
Compulsive exercise, a form of unhealthy exercise often associated with prioritizing exercise and feeling guilty when exercise is missed, is a common precursor to and symptom of eating disorders. College-aged women are at high risk of exercising compulsively compared with other groups. Social network analysis (SNA) is a theoretical perspective and methodology allowing researchers to observe the effects of relational dynamics on the behaviors of people. SNA was used to assess the relationship between compulsive exercise and body dissatisfaction, physical activity, and network variables. Descriptive statistics were conducted using SPSS, and quadratic assignment procedure (QAP) analyses were conducted using UCINET. QAP regression analysis revealed a statistically significant model (R 2 = .375, P < .0001) predicting compulsive exercise behavior. Physical activity, body dissatisfaction, and network variables were statistically significant predictor variables in the QAP regression model. In our sample, women who are connected to "important" or "powerful" people in their network are likely to have higher compulsive exercise scores. This result provides healthcare practitioners key target points for intervention within similar groups of women. For scholars researching eating disorders and associated behaviors, this study supports looking into group dynamics and network structure in conjunction with body dissatisfaction and exercise frequency.
An operational definition of a statistically meaningful trend.
Bryhn, Andreas C; Dimberg, Peter H
2011-04-28
Linear trend analysis of time series is standard procedure in many scientific disciplines. If the number of data is large, a trend may be statistically significant even if data are scattered far from the trend line. This study introduces and tests a quality criterion for time trends referred to as statistical meaningfulness, which is a stricter quality criterion for trends than high statistical significance. The time series is divided into intervals and interval mean values are calculated. Thereafter, r² and p values are calculated from regressions concerning time and interval mean values. If r² ≥ 0.65 at p ≤ 0.05 in any of these regressions, then the trend is regarded as statistically meaningful. Out of ten investigated time series from different scientific disciplines, five displayed statistically meaningful trends. A Microsoft Excel application (add-in) was developed which can perform statistical meaningfulness tests and which may increase the operationality of the test. The presented method for distinguishing statistically meaningful trends should be reasonably uncomplicated for researchers with basic statistics skills and may thus be useful for determining which trends are worth analysing further, for instance with respect to causal factors. The method can also be used for determining which segments of a time trend may be particularly worthwhile to focus on.
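A hedged sketch of the criterion as described (interval means regressed against time, with r² ≥ 0.65 at p ≤ 0.05); for brevity it tests a single interval division rather than the several divisions used in the paper, and the time series is synthetic.

import numpy as np
from scipy import stats

def statistically_meaningful(t, y, n_intervals=5, r2_min=0.65, p_max=0.05):
    """Flag a trend as 'statistically meaningful': regress interval means on
    interval mid-times and require r^2 >= r2_min at p <= p_max."""
    t, y = np.asarray(t, float), np.asarray(y, float)
    edges = np.linspace(t.min(), t.max(), n_intervals + 1)
    mids, means = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (t >= lo) & (t <= hi)
        if mask.any():
            mids.append(t[mask].mean())
            means.append(y[mask].mean())
    fit = stats.linregress(mids, means)
    return (fit.rvalue ** 2 >= r2_min) and (fit.pvalue <= p_max), fit

rng = np.random.default_rng(0)
time = np.arange(360) / 12.0                       # 30 years of monthly data (synthetic)
series = 0.05 * time + rng.normal(scale=1.0, size=time.size)
meaningful, fit = statistically_meaningful(time, series)
print(meaningful, round(fit.rvalue ** 2, 2), round(fit.pvalue, 4))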
Sampling and Data Analysis for Environmental Microbiology
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murray, Christopher J.
2001-06-01
A brief review of the literature indicates the importance of statistical analysis in applied and environmental microbiology. Sampling designs are particularly important for successful studies, and it is highly recommended that researchers review their sampling design before heading to the laboratory or the field. Most statisticians have numerous stories of scientists who approached them after their study was complete only to have to tell them that the data they gathered could not be used to test the hypothesis they wanted to address. Once the data are gathered, a large and complex body of statistical techniques is available for analysis of the data. Those methods include both numerical and graphical techniques for exploratory characterization of the data. Hypothesis testing and analysis of variance (ANOVA) are techniques that can be used to compare the mean and variance of two or more groups of samples. Regression can be used to examine the relationships between sets of variables and is often used to examine the dependence of microbiological populations on microbiological parameters. Multivariate statistics provides several methods that can be used for interpretation of datasets with a large number of variables and to partition samples into similar groups, a task that is very common in taxonomy, but also has applications in other fields of microbiology. Geostatistics and other techniques have been used to examine the spatial distribution of microorganisms. The objectives of this chapter are to provide a brief survey of some of the statistical techniques that can be used for sample design and data analysis of microbiological data in environmental studies, and to provide some examples of their use from the literature.
Sarode, D; Bari, D A; Cain, A C; Syed, M I; Williams, A T
2017-04-01
To critically evaluate the evidence comparing success rates of endonasal dacryocystorhinostomy (EN-DCR) with and without silicone tubing and thus to determine whether silicone intubation is beneficial in primary EN-DCR. Systematic review and meta-analysis. A literature search was performed on AMED, EMBASE, HMIC, MEDLINE, PsycINFO, BNI, CINAHL, HEALTH BUSINESS ELITE, CENTRAL and the Cochrane Ear, Nose and Throat Disorders Group trials register using a combination of various MeSH terms. The date of last search was January 2016. This review was limited to randomised controlled trials (RCTs) in the English language. Risk of bias was assessed using the Cochrane Collaboration's risk of bias tool. Chi-square and I² statistics were calculated to determine the presence and extent of statistical heterogeneity. Study selection, data extraction and risk of bias scoring were performed independently by two authors in concordance with the PRISMA statement. Five RCTs (447 primary EN-DCR procedures in 426 patients) were included for analysis. Moderate interstudy statistical heterogeneity was demonstrated (Chi² = 6.18; d.f. = 4; I² = 35%). Bicanalicular silicone stents were used in 229 and not used in 218 procedures. The overall success rate of EN-DCR was 92.8% (415/447). The success rate of EN-DCR was 93.4% (214/229) with silicone tubing and 92.2% (201/218) without silicone tubing. Meta-analysis using a random-effects model showed no statistically significant difference in outcomes between the two groups (P = 0.63; RR = 0.79; 95% CI = 0.3-2.06). Our review and meta-analysis did not demonstrate an additional advantage of silicone stenting. A high-quality, well-powered, prospective multicentre RCT is needed to further clarify the benefit of silicone stents. © 2016 John Wiley & Sons Ltd.
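A small illustration of random-effects pooling of risk ratios of the kind reported above, using the DerSimonian-Laird estimator; the per-study log risk ratios and variances are invented placeholders, not the values from the five RCTs.

import numpy as np

def random_effects_pool(log_rr, var_log_rr):
    """DerSimonian-Laird random-effects pooling of per-study log risk ratios.
    Returns the pooled RR, its 95% CI, and the I^2 heterogeneity statistic."""
    y, v = np.asarray(log_rr, float), np.asarray(var_log_rr, float)
    w = 1.0 / v
    fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - fixed) ** 2)                    # Cochran's Q
    df = len(y) - 1
    tau2 = max(0.0, (q - df) / (np.sum(w) - np.sum(w ** 2) / np.sum(w)))
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0
    w_star = 1.0 / (v + tau2)
    pooled = np.sum(w_star * y) / np.sum(w_star)
    se = np.sqrt(1.0 / np.sum(w_star))
    return np.exp(pooled), np.exp(pooled - 1.96 * se), np.exp(pooled + 1.96 * se), i2

# Invented per-study estimates for illustration (not the five RCTs above)
log_rr = np.log([0.7, 1.1, 0.6, 0.9, 0.8])
var_log_rr = np.array([0.20, 0.15, 0.30, 0.25, 0.18])
rr, lo, hi, i2 = random_effects_pool(log_rr, var_log_rr)
print(f"pooled RR {rr:.2f} (95% CI {lo:.2f}-{hi:.2f}), I^2 = {i2:.0f}%")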
Analysis of Information Content in High-Spectral Resolution Sounders using Subset Selection Analysis
NASA Technical Reports Server (NTRS)
Velez-Reyes, Miguel; Joiner, Joanna
1998-01-01
In this paper, we summarize the results of the sensitivity analysis and data reduction carried out to determine the information content of AIRS and IASI channels. The analysis and data reduction was based on the use of subset selection techniques developed in the linear algebra and statistical community to study linear dependencies in high dimensional data sets. We applied the subset selection method to study dependency among channels by studying the dependency among their weighting functions. Also, we applied the technique to study the information provided by the different levels in which the atmosphere is discretized for retrievals and analysis. Results from the method correlate well with intuition in many respects and point out to possible modifications for band selection in sensor design and number and location of levels in the analysis process.
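One common subset-selection tool from the linear-algebra literature the abstract points to is rank-revealing QR with column pivoting; the sketch below applies it to a synthetic, nearly rank-deficient matrix standing in for channel weighting functions, and the rank threshold is an arbitrary illustrative choice.

import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
levels, channels = 60, 300
# Synthetic, nearly rank-25 matrix standing in for channel weighting functions
W = rng.normal(size=(levels, 25)) @ rng.normal(size=(25, channels))
W += 1e-6 * rng.normal(size=(levels, channels))

# Rank-revealing QR with column pivoting: the pivot order ranks channels by how much
# independent information they add; the diagonal of R decays once the remaining
# columns are (nearly) linearly dependent on those already chosen.
Q, R, piv = linalg.qr(W, mode="economic", pivoting=True)
diag = np.abs(np.diag(R))
numerical_rank = int(np.sum(diag > 1e-3 * diag[0]))    # illustrative threshold
selected_channels = piv[:numerical_rank]
print(numerical_rank, selected_channels[:10])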
NASA Technical Reports Server (NTRS)
Marchenko, V. I.
1974-01-01
During periods of high solar activity fibrinolysis and fibrinogenolysis are increased. A direct correlative relationship is established between the indices of fibrinolysis, fibrinogenolysis and solar flares which were recorded two days before the blood was collected for analysis.
Capturing Positive Transgressive Variation From Wild And Exotic Germplasm Resources
USDA-ARS?s Scientific Manuscript database
Only a small fraction of the naturally occurring genetic diversity available in rice germplasm repositories around the world has been explored to date. This is beginning to change with the advent of affordable, high throughput genotyping approaches coupled with robust statistical analysis methods th...
Contrast-Enhanced Ultrasonography in Differential Diagnosis of Benign and Malignant Ovarian Tumors
Qiao, Jing-Jing; Yu, Jing; Yu, Zhe; Li, Na; Song, Chen; Li, Man
2015-01-01
Objective To evaluate the accuracy of contrast-enhanced ultrasonography (CEUS) in differential diagnosis of benign and malignant ovarian tumors. Methods The scientific literature databases PubMed, Cochrane Library and CNKI were comprehensively searched for studies relevant to the use of CEUS technique for differential diagnosis of benign and malignant ovarian cancer. Pooled summary statistics for specificity (Spe), sensitivity (Sen), positive and negative likelihood ratios (LR+/LR−), and diagnostic odds ratio (DOR) and their 95%CIs were calculated. Software for statistical analysis included STATA version 12.0 (Stata Corp, College Station, TX, USA) and Meta-Disc version 1.4 (Universidad Complutense, Madrid, Spain). Results Following a stringent selection process, seven high quality clinical trials were found suitable for inclusion in the present meta-analysis. The 7 studies contained a combined total of 375 ovarian cancer patients (198 malignant and 177 benign). Statistical analysis revealed that CEUS was associated with the following performance measures in differential diagnosis of ovarian tumors: pooled Sen was 0.96 (95%CI = 0.92∼0.98); the summary Spe was 0.91 (95%CI = 0.86∼0.94); the pooled LR+ was 10.63 (95%CI = 6.59∼17.17); the pooled LR− was 0.04 (95%CI = 0.02∼0.09); and the pooled DOR was 241.04 (95% CI = 92.61∼627.37). The area under the SROC curve was 0.98 (95% CI = 0.20∼1.00). Lastly, publication bias was not detected (t = −0.52, P = 0.626) in the meta-analysis. Conclusions Our results revealed the high clinical value of CEUS in differential diagnosis of benign and malignant ovarian tumors. Further, CEUS may also prove to be useful in differential diagnosis at early stages of this disease. PMID:25764442
Bigdely-Shamlo, Nima; Mullen, Tim; Kreutz-Delgado, Kenneth; Makeig, Scott
2013-01-01
A crucial question for the analysis of multi-subject and/or multi-session electroencephalographic (EEG) data is how to combine information across multiple recordings from different subjects and/or sessions, each associated with its own set of source processes and scalp projections. Here we introduce a novel statistical method for characterizing the spatial consistency of EEG dynamics across a set of data records. Measure Projection Analysis (MPA) first finds voxels in a common template brain space at which a given dynamic measure is consistent across nearby source locations, then computes local-mean EEG measure values for this voxel subspace using a statistical model of source localization error and between-subject anatomical variation. Finally, clustering the mean measure voxel values in this locally consistent brain subspace finds brain spatial domains exhibiting distinguishable measure features and provides 3-D maps plus statistical significance estimates for each EEG measure of interest. Applied to sufficient high-quality data, the scalp projections of many maximally independent component (IC) processes contributing to recorded high-density EEG data closely match the projection of a single equivalent dipole located in or near brain cortex. We demonstrate the application of MPA to a multi-subject EEG study decomposed using independent component analysis (ICA), compare the results to k-means IC clustering in EEGLAB (sccn.ucsd.edu/eeglab), and use surrogate data to test MPA robustness. A Measure Projection Toolbox (MPT) plug-in for EEGLAB is available for download (sccn.ucsd.edu/wiki/MPT). Together, MPA and ICA allow use of EEG as a 3-D cortical imaging modality with near-cm scale spatial resolution. PMID:23370059
VoxelStats: A MATLAB Package for Multi-Modal Voxel-Wise Brain Image Analysis.
Mathotaarachchi, Sulantha; Wang, Seqian; Shin, Monica; Pascoal, Tharick A; Benedet, Andrea L; Kang, Min Su; Beaudry, Thomas; Fonov, Vladimir S; Gauthier, Serge; Labbe, Aurélie; Rosa-Neto, Pedro
2016-01-01
In healthy individuals, behavioral outcomes are highly associated with the variability on brain regional structure or neurochemical phenotypes. Similarly, in the context of neurodegenerative conditions, neuroimaging reveals that cognitive decline is linked to the magnitude of atrophy, neurochemical declines, or concentrations of abnormal protein aggregates across brain regions. However, modeling the effects of multiple regional abnormalities as determinants of cognitive decline at the voxel level remains largely unexplored by multimodal imaging research, given the high computational cost of estimating regression models for every single voxel from various imaging modalities. VoxelStats is a voxel-wise computational framework to overcome these computational limitations and to perform statistical operations on multiple scalar variables and imaging modalities at the voxel level. VoxelStats package has been developed in Matlab(®) and supports imaging formats such as Nifti-1, ANALYZE, and MINC v2. Prebuilt functions in VoxelStats enable the user to perform voxel-wise general and generalized linear models and mixed effect models with multiple volumetric covariates. Importantly, VoxelStats can recognize scalar values or image volumes as response variables and can accommodate volumetric statistical covariates as well as their interaction effects with other variables. Furthermore, this package includes built-in functionality to perform voxel-wise receiver operating characteristic analysis and paired and unpaired group contrast analysis. Validation of VoxelStats was conducted by comparing the linear regression functionality with existing toolboxes such as glim_image and RMINC. The validation results were identical to existing methods and the additional functionality was demonstrated by generating feature case assessments (t-statistics, odds ratio, and true positive rate maps). In summary, VoxelStats expands the current methods for multimodal imaging analysis by allowing the estimation of advanced regional association metrics at the voxel level.
Sparse approximation of currents for statistics on curves and surfaces.
Durrleman, Stanley; Pennec, Xavier; Trouvé, Alain; Ayache, Nicholas
2008-01-01
Computing, processing, and visualizing statistics on shapes such as curves or surfaces is a real challenge, with applications ranging from medical image analysis to computational geometry. Modelling such geometrical primitives with currents avoids both feature-based approaches and point-correspondence methods. This framework has proved powerful for registering brain surfaces and measuring geometrical invariants. However, although state-of-the-art methods perform pairwise registrations efficiently, new numerical schemes are required to process groupwise statistics, whose complexity increases as the size of the database grows. Statistics such as the mean and principal modes of a set of shapes often have a heavy and highly redundant representation. We therefore propose to find an adapted basis on which the mean and principal modes have a sparse decomposition. Besides the computational improvement, this sparse representation offers a way to visualize and interpret statistics on currents. Experiments show the relevance of the approach on 34 sets of 70 sulcal lines and on 50 sets of 10 meshes of deep brain structures.
Ogneva-Himmelberger, Yelena; Dahlberg, Tyler; Kelly, Kristen; Simas, Tiffany A. Moore
2015-01-01
The study uses geographic information science (GIS) and statistics to find out if there are statistical differences between full term and preterm births to non-Hispanic white, non-Hispanic Black, and Hispanic mothers in their exposure to air pollution and access to environmental amenities (green space and vendors of healthy food) in the second largest city in New England, Worcester, Massachusetts. Proximity to a Toxic Release Inventory site has a statistically significant effect on preterm birth regardless of race. The air-pollution hazard score from the Risk Screening Environmental Indicators Model is also a statistically significant factor when preterm births are categorized into three groups based on the degree of prematurity. Proximity to green space and to a healthy food vendor did not have an effect on preterm births. The study also used cluster analysis and found statistically significant spatial clusters of high preterm birth volume for non-Hispanic white, non-Hispanic Black, and Hispanic mothers. PMID:29546120
Ogneva-Himmelberger, Yelena; Dahlberg, Tyler; Kelly, Kristen; Simas, Tiffany A Moore
2015-01-01
The study uses geographic information science (GIS) and statistics to find out if there are statistical differences between full term and preterm births to non-Hispanic white, non-Hispanic Black, and Hispanic mothers in their exposure to air pollution and access to environmental amenities (green space and vendors of healthy food) in the second largest city in New England, Worcester, Massachusetts. Proximity to a Toxic Release Inventory site has a statistically significant effect on preterm birth regardless of race. The air-pollution hazard score from the Risk Screening Environmental Indicators Model is also a statistically significant factor when preterm births are categorized into three groups based on the degree of prematurity. Proximity to green space and to a healthy food vendor did not have an effect on preterm births. The study also used cluster analysis and found statistically significant spatial clusters of high preterm birth volume for non-Hispanic white, non-Hispanic Black, and Hispanic mothers.
Computers as an Instrument for Data Analysis. Technical Report No. 11.
ERIC Educational Resources Information Center
Muller, Mervin E.
A review of statistical data analysis involving computers as a multi-dimensional problem provides the perspective for consideration of the use of computers in statistical analysis and the problems associated with large data files. An overall description of STATJOB, a particular system for doing statistical data analysis on a digital computer,…
Shock and Vibration Symposium (59th) Held in Albuquerque, New Mexico on 18-20 October 1988. Volume 1
1988-10-01
Partial contents: The Quest for Omega = sqrt(K/M) -- Notes on the development of vibration analysis; An overview of statistical energy analysis; Its ...; ... and in-plane vibration transmission in statistical energy analysis; Vibroacoustic response using the finite element method and statistical energy analysis; Helium ...
Babu, Giridhara R; Murthy, G V S; Ana, Yamuna; Patel, Prital; Deepa, R; Neelon, Sara E Benjamin; Kinra, Sanjay; Reddy, K Srinath
2018-01-01
AIM To perform a meta-analysis of the association of obesity with hypertension and type 2 diabetes mellitus (T2DM) in India among adults. METHODS To conduct meta-analysis, we performed comprehensive, electronic literature search in the PubMed, CINAHL Plus, and Google Scholar. We restricted the analysis to studies with documentation of some measure of obesity namely; body mass index, waist-hip ratio, waist circumference and diagnosis of hypertension or diagnosis of T2DM. By obtaining summary estimates of all included studies, the meta-analysis was performed using both RevMan version 5 and “metan” command STATA version 11. Heterogeneity was measured by I2 statistic. Funnel plot analysis has been done to assess the study publication bias. RESULTS Of the 956 studies screened, 18 met the eligibility criteria. The pooled odds ratio between obesity and hypertension was 3.82 (95%CI: 3.39 to 4.25). The heterogeneity around this estimate (I2 statistic) was 0%, indicating low variability. The pooled odds ratio from the included studies showed a statistically significant association between obesity and T2DM (OR = 1.14, 95%CI: 1.04 to 1.24) with a high degree of variability. CONCLUSION Despite methodological differences, obesity showed significant, potentially plausible association with hypertension and T2DM in studies conducted in India. Being a modifiable risk factor, our study informs setting policy priority and intervention efforts to prevent debilitating complications. PMID:29359028
Babu, Giridhara R; Murthy, G V S; Ana, Yamuna; Patel, Prital; Deepa, R; Neelon, Sara E Benjamin; Kinra, Sanjay; Reddy, K Srinath
2018-01-15
To perform a meta-analysis of the association of obesity with hypertension and type 2 diabetes mellitus (T2DM) in India among adults. To conduct meta-analysis, we performed comprehensive, electronic literature search in the PubMed, CINAHL Plus, and Google Scholar. We restricted the analysis to studies with documentation of some measure of obesity namely; body mass index, waist-hip ratio, waist circumference and diagnosis of hypertension or diagnosis of T2DM. By obtaining summary estimates of all included studies, the meta-analysis was performed using both RevMan version 5 and "metan" command STATA version 11. Heterogeneity was measured by I 2 statistic. Funnel plot analysis has been done to assess the study publication bias. Of the 956 studies screened, 18 met the eligibility criteria. The pooled odds ratio between obesity and hypertension was 3.82 (95%CI: 3.39 to 4.25). The heterogeneity around this estimate (I2 statistic) was 0%, indicating low variability. The pooled odds ratio from the included studies showed a statistically significant association between obesity and T2DM (OR = 1.14, 95%CI: 1.04 to 1.24) with a high degree of variability. Despite methodological differences, obesity showed significant, potentially plausible association with hypertension and T2DM in studies conducted in India. Being a modifiable risk factor, our study informs setting policy priority and intervention efforts to prevent debilitating complications.
Malvisi, Lucio; Troisi, Catherine L; Selwyn, Beatrice J
2018-06-23
The risk of malaria infection displays spatial and temporal variability that is likely due to interaction between the physical environment and the human population. In this study, we performed a spatial analysis at three different time points, corresponding to three cross-sectional surveys conducted as part of an insecticide-treated bed nets efficacy study, to reveal patterns of malaria incidence distribution in an area of Northern Guatemala characterized by low malaria endemicity. A thorough understanding of the spatial and temporal patterns of malaria distribution is essential for targeted malaria control programs. Two methods, the local Moran's I and the Getis-Ord G * (d), were used for the analysis, providing two different statistical approaches and allowing for a comparison of results. A distance band of 3.5 km was considered to be the most appropriate distance for the analysis of data based on epidemiological and entomological factors. Incidence rates were higher at the first cross-sectional survey conducted prior to the intervention compared to the following two surveys. Clusters or hot spots of malaria incidence exhibited high spatial and temporal variations. Findings from the two statistics were similar, though the G * (d) detected cold spots using a higher distance band (5.5 km). The high spatial and temporal variability in the distribution of clusters of high malaria incidence seems to be consistent with an area of unstable malaria transmission. In such a context, a strong surveillance system and the use of spatial analysis may be crucial for targeted malaria control activities.
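A hedged sketch of the two local statistics named above, assuming the PySAL packages libpysal and esda are installed; the coordinates, incidence values, and use of the 3.5 km threshold are illustrative, and the attribute names (p_sim, q, Zs) follow the esda documentation as the author understands it, not the study's own code.

import numpy as np
import libpysal
from esda.moran import Moran_Local
from esda.getisord import G_Local

rng = np.random.default_rng(0)
coords = rng.uniform(0, 30, size=(120, 2))             # hypothetical village locations, km
incidence = rng.poisson(5, size=120).astype(float)     # synthetic incidence values

# Distance-band spatial weights with a 3.5 km threshold, as in the study design
w = libpysal.weights.DistanceBand(coords, threshold=3.5, binary=True)

lisa = Moran_Local(incidence, w, permutations=999)            # local Moran's I
gstar = G_Local(incidence, w, star=True, permutations=999)    # Getis-Ord G*(d)

hot_lisa = np.where((lisa.p_sim < 0.05) & (lisa.q == 1))[0]   # significant high-high clusters
hot_g = np.where((gstar.p_sim < 0.05) & (gstar.Zs > 0))[0]    # significant hot spots
print(len(hot_lisa), "LISA hot spots;", len(hot_g), "G* hot spots")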
Bianchi, Bernardo; Ferri, Andrea; Ferrari, Silvano; Copelli, Chiara; Sesenna, Enrico
2011-04-01
The purpose of this article was to analyze the efficacy of the facelift incision, the sternocleidomastoid muscle flap, and the superficial musculoaponeurotic system flap for improving the esthetic results in patients undergoing partial parotidectomy for benign parotid tumor resection. The usefulness of partial parotidectomy is discussed, and a statistical evaluation of the esthetic results was performed. From January 1, 1996, to January 1, 2007, 274 patients treated for benign parotid tumors were studied. Of these, 172 underwent partial parotidectomy. The 172 patients were divided into 4 groups: partial parotidectomy with classic or modified Blair incision without reconstruction (group 1), partial parotidectomy with facelift incision and without reconstruction (group 2), partial parotidectomy with facelift incision associated with sternocleidomastoid muscle flap (group 3), and partial parotidectomy with facelift incision associated with superficial musculoaponeurotic system flap (group 4). Patients were considered, after a follow-up of at least 18 months, for functional and esthetic evaluation. The functional outcome was assessed considering facial nerve function, Frey syndrome, and recurrence. The esthetic evaluation was performed by inviting the patients and a blind panel of 1 surgeon and 2 secretaries of the department to give a score of 1 to 10 to assess the final cosmetic outcome. The statistical analysis was performed using the Mann-Whitney U test for nonparametric data to compare the results of the different groups. P < .05 was considered significant. No recurrence developed in any of the 4 groups or in any of the 274 patients during the follow-up period. The statistical analysis comparing group 1 with the other groups revealed a highly statistically significant difference (P < .0001) for all groups. When group 2 was compared with groups 3 and 4, the differences were also highly statistically significant (P = .0018 for group 3 and P = .0005 for group 4). Finally, when groups 3 and 4 were compared, the difference was not statistically significant (P = .3467). Partial parotidectomy is the real key point for improving esthetic results in benign parotid surgery. The evaluation of functional complications and the recurrence rate in this series of patients confirmed that this technique can be safely used for benign parotid tumor resection. The use of a facelift incision alone led to a highly statistically significant improvement in the esthetic outcome. When the facelift incision was combined with reconstructive techniques, such as the sternocleidomastoid muscle flap or the superficial musculoaponeurotic system flap, the esthetic results improved further. Finally, no statistically significant difference resulted from comparing the superficial musculoaponeurotic system flap and the sternocleidomastoid muscle flap. Copyright © 2011 American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.
Simulation on a car interior aerodynamic noise control based on statistical energy analysis
NASA Astrophysics Data System (ADS)
Chen, Xin; Wang, Dengfeng; Ma, Zhengdong
2012-09-01
How to simulate interior aerodynamic noise accurately is an important question in car interior noise reduction. The unsteady aerodynamic pressure on body surfaces proves to be the key factor governing car interior aerodynamic noise at high frequencies and high speed. In this paper, a detailed statistical energy analysis (SEA) model is built, and the vibro-acoustic power inputs are applied to the model to obtain valid results for car interior noise analysis. The model provides a solid foundation for further optimization of car interior noise control. Once a comprehensive SEA analysis has identified the subsystems whose power contributions to the interior noise are most sensitive, the sound pressure level of the interior aerodynamic noise can be reduced by improving the sound and damping characteristics of those subsystems. Further vehicle testing shows that the interior acoustic performance can be improved by using the detailed SEA model, comprising more than 80 subsystems, together with the calculation of unsteady aerodynamic pressure on body surfaces and improved sound and damping properties of the materials. Reductions of more than 2 dB are achieved at centre frequencies above 800 Hz in the spectrum. The proposed optimization method, combining the detailed SEA model with unsteady computational fluid dynamics (CFD) and a sensitivity analysis of acoustic contributions, can serve as a reference for car interior aerodynamic noise control.
A new image encryption algorithm based on the fractional-order hyperchaotic Lorenz system
NASA Astrophysics Data System (ADS)
Wang, Zhen; Huang, Xia; Li, Yu-Xia; Song, Xiao-Na
2013-01-01
We propose a new image encryption algorithm on the basis of the fractional-order hyperchaotic Lorenz system. While in the process of generating a key stream, the system parameters and the derivative order are embedded in the proposed algorithm to enhance the security. Such an algorithm is detailed in terms of security analyses, including correlation analysis, information entropy analysis, run statistic analysis, mean-variance gray value analysis, and key sensitivity analysis. The experimental results demonstrate that the proposed image encryption scheme has the advantages of large key space and high security for practical image encryption.
The added value of ordinal analysis in clinical trials: an example in traumatic brain injury.
Roozenbeek, Bob; Lingsma, Hester F; Perel, Pablo; Edwards, Phil; Roberts, Ian; Murray, Gordon D; Maas, Andrew Ir; Steyerberg, Ewout W
2011-01-01
In clinical trials, ordinal outcome measures are often dichotomized into two categories. In traumatic brain injury (TBI) the 5-point Glasgow outcome scale (GOS) is collapsed into unfavourable versus favourable outcome. Simulation studies have shown that exploiting the ordinal nature of the GOS increases chances of detecting treatment effects. The objective of this study is to quantify the benefits of ordinal analysis in the real-life situation of a large TBI trial. We used data from the CRASH trial that investigated the efficacy of corticosteroids in TBI patients (n = 9,554). We applied two techniques for ordinal analysis: proportional odds analysis and the sliding dichotomy approach, where the GOS is dichotomized at different cut-offs according to baseline prognostic risk. These approaches were compared to dichotomous analysis. The information density in each analysis was indicated by a Wald statistic. All analyses were adjusted for baseline characteristics. Dichotomous analysis of the six-month GOS showed a non-significant treatment effect (OR = 1.09, 95% CI 0.98 to 1.21, P = 0.096). Ordinal analysis with proportional odds regression or sliding dichotomy showed highly statistically significant treatment effects (OR 1.15, 95% CI 1.06 to 1.25, P = 0.0007 and 1.19, 95% CI 1.08 to 1.30, P = 0.0002), with 2.05-fold and 2.56-fold higher information density compared to the dichotomous approach respectively. Analysis of the CRASH trial data confirmed that ordinal analysis of outcome substantially increases statistical power. We expect these results to hold for other fields of critical care medicine that use ordinal outcome measures and recommend that future trials adopt ordinal analyses. This will permit detection of smaller treatment effects.
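A rough illustration of the proportional-odds versus dichotomized comparison, assuming statsmodels (>= 0.13) provides OrderedModel; the synthetic five-level outcome and treatment effect are placeholders, not the CRASH trial data.

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(0)
n = 2000
treat = rng.integers(0, 2, n)                                  # hypothetical treatment indicator
latent = 0.15 * treat + rng.logistic(size=n)                   # latent outcome with a small benefit
gos_values = np.digitize(latent, [-1.5, -0.5, 0.5, 1.5]) + 1   # 5-point ordinal scale

# Proportional odds (ordinal logistic) analysis of the full 5-point scale
gos = pd.Series(gos_values).astype(pd.CategoricalDtype(categories=[1, 2, 3, 4, 5], ordered=True))
ordinal_fit = OrderedModel(gos, treat.reshape(-1, 1), distr="logit").fit(method="bfgs", disp=False)
print(ordinal_fit.params)

# Conventional dichotomized analysis: "favourable" (4-5) versus "unfavourable" (1-3)
favourable = (gos_values >= 4).astype(int)
binary_fit = sm.Logit(favourable, sm.add_constant(treat)).fit(disp=False)
print(binary_fit.params)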
Linear regression models and k-means clustering for statistical analysis of fNIRS data.
Bonomini, Viola; Zucchelli, Lucia; Re, Rebecca; Ieva, Francesca; Spinelli, Lorenzo; Contini, Davide; Paganoni, Anna; Torricelli, Alessandro
2015-02-01
We propose a new algorithm, based on a linear regression model, to statistically estimate the hemodynamic activations in fNIRS data sets. The main concern guiding the algorithm development was the minimization of assumptions and approximations made on the data set for the application of statistical tests. Further, we propose a K-means method to cluster fNIRS data (i.e. channels) as activated or not activated. The methods were validated both on simulated and in vivo fNIRS data. A time domain (TD) fNIRS technique was preferred because of its high performances in discriminating cortical activation and superficial physiological changes. However, the proposed method is also applicable to continuous wave or frequency domain fNIRS data sets.
Linear regression models and k-means clustering for statistical analysis of fNIRS data
Bonomini, Viola; Zucchelli, Lucia; Re, Rebecca; Ieva, Francesca; Spinelli, Lorenzo; Contini, Davide; Paganoni, Anna; Torricelli, Alessandro
2015-01-01
We propose a new algorithm, based on a linear regression model, to statistically estimate the hemodynamic activations in fNIRS data sets. The main concern guiding the algorithm development was the minimization of assumptions and approximations made on the data set for the application of statistical tests. Further, we propose a K-means method to cluster fNIRS data (i.e. channels) as activated or not activated. The methods were validated both on simulated and in vivo fNIRS data. A time domain (TD) fNIRS technique was preferred because of its high performances in discriminating cortical activation and superficial physiological changes. However, the proposed method is also applicable to continuous wave or frequency domain fNIRS data sets. PMID:25780751
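A minimal sketch, not the authors' implementation, of the clustering step described in the two entries above: channels are labelled activated or not with k-means; the per-channel features (a fitted activation amplitude and its t-statistic) and all values are synthetic assumptions.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_channels = 40
# Hypothetical per-channel features from a prior linear-regression step:
# a fitted activation amplitude (beta) and its t-statistic.
beta = np.concatenate([rng.normal(0.0, 0.2, 25), rng.normal(1.5, 0.3, 15)])
tval = beta / 0.25 + rng.normal(0.0, 0.5, n_channels)
features = np.column_stack([beta, tval])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
activated_cluster = labels[np.argmax(beta)]            # cluster containing the strongest response
activated_channels = np.where(labels == activated_cluster)[0]
print("channels flagged as activated:", activated_channels.tolist())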
Cho, Hyun-Deok; Kim, Unyong; Suh, Joon Hyuk; Eom, Han Young; Kim, Junghyun; Lee, Seul Gi; Choi, Yong Seok; Han, Sang Beom
2016-04-01
Analytical methods using high-performance liquid chromatography with diode array and tandem mass spectrometry detection were developed for the discrimination of the rhizomes of four Atractylodes medicinal plants: A. japonica, A. macrocephala, A. chinensis, and A. lancea. A quantitative study was performed, selecting five bioactive components, including atractylenolide I, II, III, eudesma-4(14),7(11)-dien-8-one and atractylodin, on twenty-six Atractylodes samples of various origins. Sample extraction was optimized to sonication with 80% methanol for 40 min at room temperature. High-performance liquid chromatography with diode array detection was established using a C18 column with a water/acetonitrile gradient system at a flow rate of 1.0 mL/min, and the detection wavelength was set at 236 nm. Liquid chromatography with tandem mass spectrometry was applied to certify the reliability of the quantitative results. The developed methods were validated by ensuring specificity, linearity, limit of quantification, accuracy, precision, recovery, robustness, and stability. Results showed that cangzhu contained higher amounts of atractylenolide I and atractylodin than baizhu, and especially atractylodin contents showed the greatest variation between baizhu and cangzhu. Multivariate statistical analysis, such as principal component analysis and hierarchical cluster analysis, were also employed for further classification of the Atractylodes plants. The established method was suitable for quality control of the Atractylodes plants. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Prognostic value of stromal decorin expression in patients with breast cancer: a meta-analysis.
Li, Shuang-Jiang; Chen, Da-Li; Zhang, Wen-Biao; Shen, Cheng; Che, Guo-Wei
2015-11-01
A number of studies have investigated the biological functions of decorin (DCN) in oncogenesis, tumor progression, angiogenesis and metastasis. Although many of them aim to highlight the prognostic value of stromal DCN expression in breast cancer, some controversial results still exist and a consensus has not been reached until now. Therefore, our meta-analysis aims to determine the prognostic significance of stromal DCN expression in breast cancer patients. PubMed, EMBASE, the Web of Science and China National Knowledge Infrastructure (CNKI) databases were searched for full-text articles that met our inclusion criteria. We applied the hazard ratio (HR) with 95% confidence interval (CI) as the appropriate summarized statistic. The Q-test and I² statistic were employed to estimate the level of heterogeneity across the included studies. Sensitivity analysis was conducted to further identify the possible origins of heterogeneity. Publication bias was detected by Begg's test and Egger's test. Three English-language articles (involving 6 studies) were included in our meta-analysis. On the one hand, the summarized outcomes based on both univariate analysis (HR: 0.513; 95% CI: 0.406-0.648; P<0.001) and multivariate analysis (HR: 0.544; 95% CI: 0.388-0.763; P<0.001) indicated that stromal DCN expression predicted high cancer-specific survival (CSS) of breast cancer patients. On the other hand, the summarized outcomes based on both univariate analysis (HR: 0.504; 95% CI: 0.389-0.651; P<0.001) and multivariate analysis (HR: 0.568; 95% CI: 0.400-0.806; P=0.002) also indicated that stromal DCN expression was positively associated with high disease-free survival (DFS) of breast cancer patients. No significant heterogeneity or publication bias was observed within this meta-analysis. The present evidence indicates that high stromal DCN expression significantly predicts a good prognosis in patients with breast cancer. The findings from our meta-analysis should be confirmed in an updated review pooling more relevant investigations in the future.
Algorithm for computing descriptive statistics for very large data sets and the exa-scale era
NASA Astrophysics Data System (ADS)
Beekman, Izaak
2017-11-01
An algorithm for Single-point, Parallel, Online, Converging Statistics (SPOCS) is presented. It is suited to in situ analysis that would traditionally be relegated to post-processing, and can be used to monitor statistical convergence and to estimate the error or residual in the quantity, which is also useful for uncertainty quantification. Today, data may be generated at an overwhelming rate by numerical simulations and by proliferating sensing apparatus in experiments and engineering applications. Monitoring descriptive statistics in real time lets costly computations and experiments be gracefully aborted if an error has occurred, and monitoring the level of statistical convergence allows them to be run for the shortest time required to obtain good results. This algorithm extends work by Pébay (Sandia Report SAND2008-6212). Pébay's algorithms are recast into a converging delta formulation with provably favorable properties. The mean, variance, covariances and arbitrary higher-order statistical moments are computed in one pass. The algorithm is tested using Sillero, Jiménez, & Moser's (2013, 2014) publicly available UPM high-Reynolds-number turbulent boundary layer data set, demonstrating numerical robustness, efficiency and other favorable properties.
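A compact sketch of the single-pass, mergeable statistics this family of algorithms relies on (Welford's update with the Chan/Pébay pairwise merge); it covers only the mean and variance, whereas SPOCS also handles covariances and higher moments.

import numpy as np

class RunningStats:
    """Single-pass running mean/variance (Welford) with a pairwise merge step,
    so partial results from different processes or time windows can be combined
    without revisiting the data."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def push(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other):
        n = self.n + other.n
        delta = other.mean - self.mean
        out = RunningStats()
        out.n = n
        out.mean = self.mean + delta * other.n / n
        out.m2 = self.m2 + other.m2 + delta ** 2 * self.n * other.n / n
        return out

    @property
    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")

# Two "ranks" process disjoint chunks, then merge their partial statistics:
rng = np.random.default_rng(0)
data = rng.normal(3.0, 2.0, 10000)
a, b = RunningStats(), RunningStats()
for x in data[:6000]:
    a.push(x)
for x in data[6000:]:
    b.push(x)
c = a.merge(b)
print(c.mean, c.variance, data.mean(), data.var(ddof=1))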
Capturing rogue waves by multi-point statistics
NASA Astrophysics Data System (ADS)
Hadjihosseini, A.; Wächter, Matthias; Hoffmann, N. P.; Peinke, J.
2016-01-01
As an example of a complex system with extreme events, we investigate ocean wave states exhibiting rogue waves. We present a statistical method of data analysis based on multi-point statistics which for the first time allows the grasping of extreme rogue wave events in a highly satisfactory statistical manner. The key to the success of the approach is mapping the complexity of multi-point data onto the statistics of hierarchically ordered height increments for different time scales, for which we can show that a stochastic cascade process with Markov properties is governed by a Fokker-Planck equation. Conditional probabilities as well as the Fokker-Planck equation itself can be estimated directly from the available observational data. With this stochastic description surrogate data sets can in turn be generated, which makes it possible to work out arbitrary statistical features of the complex sea state in general, and extreme rogue wave events in particular. The results also open up new perspectives for forecasting the occurrence probability of extreme rogue wave events, and even for forecasting the occurrence of individual rogue waves based on precursory dynamics.
Application of multivariate statistical techniques in microbial ecology
Paliy, O.; Shankar, V.
2016-01-01
Recent advances in high-throughput methods of molecular analyses have led to an explosion of studies generating large scale ecological datasets. Especially noticeable effect has been attained in the field of microbial ecology, where new experimental approaches provided in-depth assessments of the composition, functions, and dynamic changes of complex microbial communities. Because even a single high-throughput experiment produces large amounts of data, powerful statistical techniques of multivariate analysis are well suited to analyze and interpret these datasets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular dataset. In this review we describe and compare the most widely used multivariate statistical techniques including exploratory, interpretive, and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and dataset structure. PMID:26786791
NASA Astrophysics Data System (ADS)
Hassanzadeh, S.; Hosseinibalam, F.; Omidvari, M.
2008-04-01
Data on seven meteorological variables (relative humidity, wet temperature, dry temperature, maximum temperature, minimum temperature, ground temperature and sun radiation time) and ozone values have been used for statistical analysis. Meteorological variables and ozone values were analyzed using both multiple linear regression and principal component methods. Data for the period 1999-2004 are analyzed jointly using both methods. For all periods, temperature-dependent variables were highly correlated, but were all negatively correlated with relative humidity. Multiple regression analysis was used to fit the meteorological variables using the meteorological variables as predictors. A variable selection method based on high loadings of varimax-rotated principal components was used to obtain subsets of the predictor variables to be included in the linear regression model of the meteorological variables. In 1999, 2001 and 2002, one of the meteorological variables was influenced, though only weakly, predominantly by the ozone concentrations. For the year 2000, however, the model did not indicate a predominant influence of the ozone concentrations on the meteorological variables, which points to variation in sun radiation time. This could be due to other factors that were not explicitly considered in this study.
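A hedged sketch of variable selection from high varimax loadings followed by multiple regression, assuming scikit-learn >= 0.24 for FactorAnalysis(rotation="varimax"); the synthetic variables, the 0.5 loading cutoff, and the choice to predict ozone from the selected variables are illustrative assumptions, not the study's exact procedure.

import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500
# Synthetic correlated "meteorological" variables generated from three latent factors
latent = rng.normal(size=(n, 3))
loadings_true = np.array([[0.9, 0.8, 0.0, 0.0, 0.0, 0.0, 0.1],
                          [0.0, 0.0, 0.9, 0.7, 0.0, 0.0, 0.1],
                          [0.0, 0.0, 0.0, 0.0, 0.9, 0.8, 0.1]])
met = latent @ loadings_true + 0.3 * rng.normal(size=(n, 7))
ozone = 0.8 * met[:, 0] - 0.5 * met[:, 3] + rng.normal(scale=0.5, size=n)

# Varimax-rotated factor analysis; keep variables with at least one high loading
fa = FactorAnalysis(n_components=3, rotation="varimax").fit(met)
selected = np.unique(np.where(np.abs(fa.components_) > 0.5)[1])
print("selected variable indices:", selected.tolist())

# Multiple linear regression of ozone on the selected variables
reg = LinearRegression().fit(met[:, selected], ozone)
print("R^2:", round(reg.score(met[:, selected], ozone), 3))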
Comparative inter-institutional study of stress among dentists.
Pozos-Radillo, Blanca E; Galván-Ramírez, Ma Luz; Pando, Manuel; Carrión, Ma De los Angeles; González, Guillermo J
2010-01-01
Dentistry is considered a stressful profession due to different work-related factors, which represent a threat to dentists' health. The objectives of this work were to identify and compare chronic stress in dentists across different health institutions and to examine the association of stress with risk factors. The study is observational, cross-sectional and comparative; 256 dentists were included, distributed among five public health institutions in the city of Guadalajara, Jalisco, Mexico, namely the Mexican Institute of Social Security (IMSS), the Ministry of Health (SS), the Integral Development of the Family (DIF), the Social Security Services Institute for the Workers (ISSSTE) and the University of Guadalajara (U. de G). Data were obtained by means of the census technique. Stress was identified using the Stress Symptoms Inventory, and the statistical analysis was performed using the odds ratio (O.R.) and the chi-square statistic. Of the total population studied, 219 subjects presented high levels of chronic stress and 37 presented low levels. In the comparative analysis, significant differences were found between IMSS and U. de G and likewise between IMSS and SS. However, in the analysis of association, only U. de G was found to be associated with a high level of chronic stress.
Furbish, David; Schmeeckle, Mark; Schumer, Rina; Fathel, Siobhan
2016-01-01
We describe the most likely forms of the probability distributions of bed load particle velocities, accelerations, hop distances, and travel times, in a manner that formally appeals to inferential statistics while honoring mechanical and kinematic constraints imposed by equilibrium transport conditions. The analysis is based on E. Jaynes's elaboration of the implications of the similarity between the Gibbs entropy in statistical mechanics and the Shannon entropy in information theory. By maximizing the information entropy of a distribution subject to known constraints on its moments, our choice of the form of the distribution is unbiased. The analysis suggests that particle velocities and travel times are exponentially distributed and that particle accelerations follow a Laplace distribution with zero mean. Particle hop distances, viewed alone, ought to be distributed exponentially. However, the covariance between hop distances and travel times precludes this result. Instead, the covariance structure suggests that hop distances follow a Weibull distribution. These distributions are consistent with high-resolution measurements obtained from high-speed imaging of bed load particle motions. The analysis brings us closer to choosing distributions based on our mechanical insight.
NASA Astrophysics Data System (ADS)
Wan, Tao; Naoe, Takashi; Futakawa, Masatoshi
2016-01-01
A double-wall structure mercury target will be installed at the high-power pulsed spallation neutron source in the Japan Proton Accelerator Research Complex (J-PARC). Cavitation damage on the inner wall is an important factor governing the lifetime of the target vessel. To monitor the structural integrity of the target vessel, the displacement velocity at a point on the outer surface of the target vessel is measured using a laser Doppler vibrometer (LDV). The measured signals can be used to evaluate the damage inside the target vessel caused by cyclic loading and by cavitation bubble collapse induced by pulsed-beam pressure waves. Wavelet differential analysis (WDA) was applied to reveal the effects of the damage on the vibrational cycling. To reduce the influence of noise superimposed on the vibration signals on the WDA results, the statistical methods analysis of variance (ANOVA) and analysis of covariance (ANCOVA) were applied. Results from laboratory experiments, numerical simulations with random noise added, and target vessel field data were analyzed by the WDA and the statistical methods. The analyses demonstrated that the established in-situ diagnostic technique can be used to effectively evaluate the structural response of the target vessel.
Restoration of MRI data for intensity non-uniformities using local high order intensity statistics
Hadjidemetriou, Stathis; Studholme, Colin; Mueller, Susanne; Weiner, Michael; Schuff, Norbert
2008-01-01
MRI at high magnetic fields (>3.0 T) is complicated by strong inhomogeneous radio-frequency fields, sometimes termed the "bias field". These lead to non-biological intensity non-uniformities across the image. They can complicate subsequent image analysis such as registration and tissue segmentation. Existing methods for intensity uniformity restoration have been optimized for 1.5 T, but they are less effective for 3.0 T MRI and not at all satisfactory for higher fields. Also, many of the existing restoration algorithms require a brain template or use a prior atlas, which can restrict their practicality. In this study an effective intensity uniformity restoration algorithm has been developed based on non-parametric statistics of high-order local intensity co-occurrences. These statistics are restored with a non-stationary Wiener filter. The algorithm also assumes a smooth non-uniformity and is stable. It does not require a prior atlas and is robust to variations in anatomy. In geriatric brain imaging it is robust to variations such as enlarged ventricles and low contrast-to-noise ratio. The co-occurrence statistics improve robustness for whole-head images with the pronounced non-uniformities present in high-field acquisitions. Its significantly improved performance and lower time requirements have been demonstrated by comparing it to the very commonly used N3 algorithm on BrainWeb MR simulator images as well as on real 4 T human head images. PMID:18621568
Eagle Plus Air Superiority into the 21st Century
1996-04-01
Contents excerpt: Data Collection Method, 18; Statistical Trend Analysis, 19; Statistical Readiness Analysis, 20; Aging Aircraft... "...generated by Mr. Jeff Hill served as the foundation of our statistical analysis. Special thanks go out to Mrs. Betsy Mullis, LFLL branch chief, and to"
Statistical Models for the Analysis of Zero-Inflated Pain Intensity Numeric Rating Scale Data.
Goulet, Joseph L; Buta, Eugenia; Bathulapalli, Harini; Gueorguieva, Ralitza; Brandt, Cynthia A
2017-03-01
Pain intensity is often measured in clinical and research settings using the 0 to 10 numeric rating scale (NRS). NRS scores are recorded as discrete values, and in some samples they may display a high proportion of zeroes and a right-skewed distribution. Despite this, statistical methods for normally distributed data are frequently used in the analysis of NRS data. We present results from an observational cross-sectional study examining the association of NRS scores with patient characteristics using data collected from a large cohort of 18,935 veterans in Department of Veterans Affairs care diagnosed with a potentially painful musculoskeletal disorder. The mean (variance) NRS pain was 3.0 (7.5), and 34% of patients reported no pain (NRS = 0). We compared the following statistical models for analyzing NRS scores: linear regression, generalized linear models (Poisson and negative binomial), zero-inflated and hurdle models for data with an excess of zeroes, and a cumulative logit model for ordinal data. We examined model fit, interpretability of results, and whether conclusions about the predictor effects changed across models. In this study, models that accommodate zero inflation provided a better fit than the other models. These models should be considered for the analysis of NRS data with a large proportion of zeroes. We examined and analyzed pain data from a large cohort of veterans with musculoskeletal disorders. We found that many reported no current pain on the NRS on the diagnosis date. We present several alternative statistical methods for the analysis of pain intensity data with a large proportion of zeroes. Published by Elsevier Inc.
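The modelling comparison described above can be sketched along the following lines; the simulated scores, covariate, and zero-inflation fraction are hypothetical and serve only to show how a zero-inflated fit might be set up and compared with a plain Poisson fit in statsmodels.

```python
# Compare a plain Poisson fit and a zero-inflated Poisson fit on simulated
# 0-10 NRS-like scores with an excess of zeroes.
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedPoisson

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
X = sm.add_constant(x)
structural_zero = rng.random(n) < 0.34                 # excess zeroes, as in the cohort
counts = rng.poisson(np.exp(1.0 + 0.3 * x))
y = np.where(structural_zero, 0, np.clip(counts, 0, 10))

poisson_fit = sm.Poisson(y, X).fit(disp=False)
zip_fit = ZeroInflatedPoisson(y, X, exog_infl=np.ones((n, 1))).fit(disp=False)
print("AIC Poisson:", round(poisson_fit.aic, 1), " AIC ZIP:", round(zip_fit.aic, 1))
```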
Guyot, Patricia; Ades, A E; Ouwens, Mario J N M; Welton, Nicky J
2012-02-01
The results of randomized controlled trials (RCTs) on time-to-event outcomes are usually reported as the median time to event and the Cox hazard ratio. These do not constitute the sufficient statistics required for meta-analysis or cost-effectiveness analysis, and their use in secondary analyses requires strong assumptions that may not have been adequately tested. In order to enhance the quality of secondary data analyses, we propose a method which derives from the published Kaplan-Meier survival curves a close approximation to the original individual patient time-to-event data from which they were generated. We develop an algorithm that maps from digitised curves back to KM data by finding numerical solutions to the inverted KM equations, using, where available, information on the number of events and numbers at risk. The reproducibility and accuracy of survival probabilities, median survival times and hazard ratios based on reconstructed KM data were assessed by comparing published statistics (survival probabilities, medians and hazard ratios) with statistics based on repeated reconstructions by multiple observers. The validation exercise established that there was no material systematic error and that there was a high degree of reproducibility for all statistics. Accuracy was excellent for survival probabilities and medians; for hazard ratios, reasonable accuracy can only be obtained if at least the numbers at risk or the total number of events are reported. The algorithm is a reliable tool for meta-analysis and cost-effectiveness analyses of RCTs reporting time-to-event data. It is recommended that all RCTs report information on numbers at risk and the total number of events alongside KM curves.
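A heavily simplified fragment of the underlying idea is sketched below: given survival probabilities read off a digitised Kaplan-Meier curve and the reported numbers at risk, the KM relation can be inverted to recover approximate event counts per interval. The numbers are invented, and the published algorithm additionally handles censoring within intervals and the absence of at-risk tables.

```python
# Back out implied events per interval from S(t_{i+1}) = S(t_i) * (1 - d_i / n_i).
surv = [1.00, 0.82, 0.61, 0.45]       # digitised survival at interval boundaries
n_at_risk = [100, 80, 55]             # reported numbers at risk at interval starts

events = []
for i, n in enumerate(n_at_risk):
    d = round(n * (1.0 - surv[i + 1] / surv[i]))
    events.append(d)
print("implied events per interval:", events)   # e.g. [18, 20, 14]
```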
NASA Astrophysics Data System (ADS)
Vigan, A.; Chauvin, G.; Bonavita, M.; Desidera, S.; Bonnefoy, M.; Mesa, D.; Beuzit, J.-L.; Augereau, J.-C.; Biller, B.; Boccaletti, A.; Brugaletta, E.; Buenzli, E.; Carson, J.; Covino, E.; Delorme, P.; Eggenberger, A.; Feldt, M.; Hagelberg, J.; Henning, T.; Lagrange, A.-M.; Lanzafame, A.; Ménard, F.; Messina, S.; Meyer, M.; Montagnier, G.; Mordasini, C.; Mouillet, D.; Moutou, C.; Mugnier, L.; Quanz, S. P.; Reggiani, M.; Ségransan, D.; Thalmann, C.; Waters, R.; Zurlo, A.
2014-01-01
Over the past decade, a growing number of deep imaging surveys have started to provide meaningful constraints on the population of extrasolar giant planets at large orbital separation. Primary targets for these surveys have been carefully selected based on their age, distance and spectral type, and often on their membership in young nearby associations where all stars share common kinematic, photometric and spectroscopic properties. The next step is a wider statistical analysis of the frequency and properties of low-mass companions as a function of stellar mass and orbital separation. In late 2009, we initiated a coordinated European Large Program using angular differential imaging in the H band (1.66 μm) with NaCo at the VLT. Our aim is to provide a comprehensive and statistically significant study of the occurrence of extrasolar giant planets and brown dwarfs at large (5-500 AU) orbital separations around ~150 young, nearby stars, a large fraction of which have never been observed at very deep contrast. The survey has now been completed and we present the data analysis and detection limits for the observed sample, for which we reach the planetary-mass domain at separations of >~50 AU on average. We also present the results of the statistical analysis performed over the 75 targets newly observed at high contrast. We discuss the details of the statistical analysis and the physical constraints that our survey provides for the frequency and formation scenario of planetary-mass companions at large separation.
An application of principal component analysis to the clavicle and clavicle fixation devices.
Daruwalla, Zubin J; Courtis, Patrick; Fitzpatrick, Clare; Fitzpatrick, David; Mullett, Hannan
2010-03-26
Principal component analysis (PCA) enables the building of statistical shape models of bones and joints. This has been used in conjunction with computer assisted surgery in the past. However, PCA of the clavicle has not been performed. Using PCA, we present a novel method that examines the major modes of size and three-dimensional shape variation in male and female clavicles and suggests a method of grouping the clavicle into size and shape categories. Twenty-one high-resolution computerized tomography scans of the clavicle were reconstructed and analyzed using a specifically developed statistical software package. After performing statistical shape analysis, PCA was applied to study the factors that account for anatomical variation. The first principal component, representing size, accounted for 70.5 percent of anatomical variation. The addition of a further three principal components accounted for almost 87 percent. Using statistical shape analysis, clavicles in males have a greater lateral depth and are longer, wider and thicker than in females. However, the sternal angle in females is larger than in males. PCA confirmed these differences between genders but also noted that men exhibit greater variance, and classified clavicles into five morphological groups. This unique approach is the first that standardizes a clavicular orientation. It provides information that is useful to both the biomedical engineer and the clinician. Other applications include implant design with regard to modifying current or designing future clavicle fixation devices. Our findings support the need for further development of clavicle fixation devices and the questioning of whether gender-specific devices are necessary.
Majid, Kamran; Crowder, Terence; Baker, Erin; Baker, Kevin; Koueiter, Denise; Shields, Edward; Herkowitz, Harry N
2011-12-01
One hundred eighteen retrieved 316L stainless steel thoracolumbar plates of 3 different designs, used for fusion in 60 patients, were examined for evidence of corrosion. A medical record review and statistical analysis were also carried out. This study aims to identify types of corrosion and to examine preferential metal ion release and the possibility of statistical correlation to clinical effects. Earlier studies have found that stainless steel spine devices showed evidence of mild-to-severe corrosion; fretting and crevice corrosion were the most commonly reported types. Studies have also shown the toxicity of metal ions released from stainless steel corrosion and how the ions may adversely affect bone formation and/or induce granulomatous foreign body responses. The retrieved plates were visually inspected and graded based on the degree of corrosion. The plates were then analyzed with optical microscopy, scanning electron microscopy, and energy dispersive x-ray spectroscopy. A retrospective medical record review was performed and statistical analysis was carried out to determine any correlations between experimental findings and patient data. More than 70% of the plates exhibited some degree of corrosion. Both fretting and crevice corrosion mechanisms were observed, primarily at the screw-plate interface. Energy dispersive x-ray spectroscopy analysis indicated reductions in nickel content in corroded areas, suggestive of nickel ion release to the surrounding biological environment. The incidence and severity of corrosion were significantly correlated with the design of the implant. Stainless steel thoracolumbar plates show a high incidence of corrosion, with statistical dependence on device design.
Association of Glycemic Status with Bone Turnover Markers in Type 2 Diabetes Mellitus.
Kulkarni, Sweta Vilas; Meenatchi, Suruthi; Reeta, R; Ramesh, Ramasamy; Srinivasan, A R; Lenin, C
2017-01-01
Type 2 diabetes mellitus has profound implications for the skeleton. Even though bone mineral density is increased in type 2 diabetes mellitus patients, they are more prone to fractures. The weakening of bone tissue in type 2 diabetes mellitus can be due to uncontrolled blood sugar levels leading to high levels of bone turnover markers in blood. The aim of this study was to find the association between glycemic status and bone turnover markers in type 2 diabetes mellitus. This case-control study was carried out in a tertiary health care hospital. Fifty clinically diagnosed type 2 diabetes mellitus patients in the age group between 30 and 50 years were included as cases. Fifty age- and gender-matched healthy nondiabetics were included as controls. Patients with complications and chronic illness were excluded from the study. Depending on glycated hemoglobin (HbA1c) levels, patients were grouped into uncontrolled (HbA1c >7%, n = 36) and controlled (HbA1c <7%, n = 14) diabetics. Based on duration of diabetes, patients were grouped into newly diagnosed, 1-2 years, 3-5 years, and >5 years. Serum osteocalcin (OC), bone alkaline phosphatase (BAP), acid phosphatase (ACP), and HbA1c levels were estimated. OC/BAP and OC/ACP ratios were calculated. Student's t-test, analysis of variance, and Chi-square tests were used for analysis. Receiver operating characteristic (ROC) curve analysis was done for the OC/BAP and OC/ACP ratios. Serum OC, HbA1c, and the OC/BAP ratio were increased in cases when compared to controls, and the differences were statistically significant (P < 0.001). The OC/ACP ratio was decreased in type 2 diabetes mellitus and the difference was statistically significant (P = 0.01). In patients with >5-year duration of diabetes, the HbA1c level was high and the difference was statistically significant (P < 0.042). BAP levels were high in uncontrolled diabetics, but the difference was not statistically significant. ROC curve analysis showed the OC/BAP ratio to be a better marker than the OC/ACP ratio. Uncontrolled type 2 diabetes mellitus affects bone tissue, resulting in variations in bone turnover markers. Bone turnover markers are better at predicting recent changes in bone morphology and are cost-effective.
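The ROC comparison of the two ratios could be reproduced on one's own data along the lines of the sketch below; the simulated case and control distributions are assumptions for illustration only.

```python
# Compare AUCs of two candidate ratio markers with scikit-learn.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
y = np.r_[np.ones(50), np.zeros(50)]                    # 50 cases, 50 controls
oc_bap = np.r_[rng.normal(1.4, 0.3, 50), rng.normal(1.0, 0.3, 50)]
oc_acp = np.r_[rng.normal(0.9, 0.3, 50), rng.normal(1.0, 0.3, 50)]

print("AUC OC/BAP:", round(roc_auc_score(y, oc_bap), 2))
# OC/ACP is lower in cases, so its direction is flipped before computing the AUC
print("AUC OC/ACP:", round(roc_auc_score(y, -oc_acp), 2))
```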
Misgna, Haftom Gebrehiwot; Gebru, Haftu Berhe; Birhanu, Mulugeta Molla
2016-06-21
Around the world, more than three million newborns die in their first month of life every year. In Ethiopia, during the last five-year period, neonatal mortality was 37 deaths per 1000 live births. Even though this is an improvement compared with the previous five years, home delivery remains high (90 %) and neonatal mortality remains above the Millennium Development Goal target for Ethiopia of less than 32/1000 live births. The purpose of this study is to assess maternal knowledge, practice and associated factors of essential newborn care at home in Gulomekada District, Eastern Tigray, Ethiopia. A community-based cross-sectional study was conducted among 296 mothers from Gulomekada District selected by a simple random sampling technique. Data entry and analysis were carried out using the Statistical Package for Social Sciences, version 20. The magnitude of the association between the different variables and the outcome variable was measured by odds ratios with 95 % confidence intervals. A binary logistic regression analysis was performed to obtain odds ratios and the confidence intervals of statistical associations. Goodness of fit was tested by the Hosmer-Lemeshow statistic, with a P-value greater than 0.05 indicating adequate fit of the multivariate model. Variables with P < 0.2 in the bivariate analysis were included in the final model, and statistical significance was declared at P < 0.05. Eighty percent (80.4 %) of study participants had good knowledge of essential newborn care and 92.9 % had good practice of essential newborn care. About 60 % of mothers applied butter or oil to the cord stump of their last baby. Marital status and education were significantly associated with knowledge, whereas urban residence, good knowledge of essential newborn care and employment were significantly associated with mothers' practice of essential newborn care. Almost all mothers knew and practiced essential newborn care correctly, except that applying oil or butter to the cord stump is still highly prevalent and should be avoided. Only marital status and educational status were significantly associated with mothers' knowledge.
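The analysis strategy described above (bivariate screening at P < 0.2 followed by a multivariable logistic model reported as odds ratios with 95 % confidence intervals) can be sketched as follows; the variable names and simulated data are hypothetical.

```python
# Bivariate screening then multivariable logistic regression with ORs and 95% CIs.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
df = pd.DataFrame({
    "good_practice": rng.integers(0, 2, 296),
    "urban": rng.integers(0, 2, 296),
    "employed": rng.integers(0, 2, 296),
    "good_knowledge": rng.integers(0, 2, 296),
})

candidates = ["urban", "employed", "good_knowledge"]
keep = [v for v in candidates
        if sm.Logit(df["good_practice"], sm.add_constant(df[v])).fit(disp=False).pvalues[v] < 0.2]

if keep:
    fit = sm.Logit(df["good_practice"], sm.add_constant(df[keep])).fit(disp=False)
    or_ci = np.exp(fit.conf_int())        # exponentiate CI bounds of the coefficients
    or_ci["OR"] = np.exp(fit.params)      # odds ratios
    print(or_ci.loc[keep])
```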
Spatial Ensemble Postprocessing of Precipitation Forecasts Using High Resolution Analyses
NASA Astrophysics Data System (ADS)
Lang, Moritz N.; Schicker, Irene; Kann, Alexander; Wang, Yong
2017-04-01
Ensemble prediction systems are designed to account for errors or uncertainties in the initial and boundary conditions, imperfect parameterizations, etc. However, due to sampling errors and underestimation of the model errors, these ensemble forecasts tend to be underdispersive and to lack both reliability and sharpness. To overcome such limitations, statistical postprocessing methods are commonly applied to these forecasts. In this study, a full-distributional spatial postprocessing method is applied to short-range precipitation forecasts over Austria using Standardized Anomaly Model Output Statistics (SAMOS). Following Stauffer et al. (2016), observation and forecast fields are transformed into standardized anomalies by subtracting a site-specific climatological mean and dividing by the climatological standard deviation. Because only a single regression model needs to be fitted for the whole domain, the SAMOS framework provides a computationally inexpensive method to create operationally calibrated probabilistic forecasts for any arbitrary location or for all grid points in the domain simultaneously. Taking advantage of the INCA system (Integrated Nowcasting through Comprehensive Analysis), high-resolution analyses are used for the computation of the observed climatology and for model training. The INCA system operationally combines station measurements and remote sensing data into real-time objective analysis fields at 1 km horizontal resolution and 1 h temporal resolution. The precipitation forecast used in this study is obtained from a limited-area model ensemble prediction system also operated by ZAMG. The so-called ALADIN-LAEF provides, using a multi-physics approach, a 17-member forecast at a horizontal resolution of 10.9 km and a temporal resolution of 1 h. The SAMOS approach statistically combines the in-house developed high-resolution analysis and ensemble prediction system. The station-based validation of 6-hour precipitation sums shows a mean improvement of more than 40% in CRPS when compared to bilinearly interpolated uncalibrated ensemble forecasts. The validation on randomly selected grid points, representing the true height distribution over Austria, still indicates a mean improvement of 35%. The applied statistical model is currently set up for 6-hourly and daily accumulation periods, but will be extended to a temporal resolution of 1-3 hours within a new probabilistic nowcasting system operated by ZAMG.
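The standardized-anomaly idea at the core of SAMOS can be illustrated with a few lines of code; the synthetic climatologies, forecasts, and the simple least-squares regression below stand in for the full SAMOS regression framework.

```python
# Transform observations and forecasts to standardized anomalies relative to a
# per-site climatology, fit one regression for the whole domain, then back-transform.
import numpy as np

rng = np.random.default_rng(4)
n_sites, n_days = 200, 90
clim_mean = rng.gamma(2.0, 1.5, size=n_sites)          # per-site climatological mean
clim_sd = 0.5 + rng.random(n_sites)                    # per-site climatological std dev

obs = clim_mean[:, None] + clim_sd[:, None] * rng.normal(size=(n_sites, n_days))
fc = obs + rng.normal(0.3, 0.8, size=obs.shape)        # biased, noisy ensemble mean

obs_anom = (obs - clim_mean[:, None]) / clim_sd[:, None]
fc_anom = (fc - clim_mean[:, None]) / clim_sd[:, None]

# one regression for all sites at once (the "single model for the whole domain")
slope, intercept = np.polyfit(fc_anom.ravel(), obs_anom.ravel(), 1)
corrected = (intercept + slope * fc_anom) * clim_sd[:, None] + clim_mean[:, None]
print("fitted anomaly regression:", round(intercept, 3), round(slope, 3))
```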
NASA Astrophysics Data System (ADS)
McNamara, Mark W.
The study tested the hypothesis that self-efficacy, stress, and acculturation are useful predictors of academic achievement in first year university science, independent of high school GPA and SAT scores, in a sample of Latino students at a South Texas Hispanic serving institution of higher education. The correlational study employed a mixed methods explanatory sequential model. The non-probability sample consisted of 98 university science and engineering students. The study participants had high science self-efficacy, low number of stressors, and were slightly Anglo-oriented bicultural to strongly Anglo-oriented. As expected, the control variables of SAT score and high school GPA were statistically significant predictors of the outcome measures. Together, they accounted for 19.80% of the variation in first year GPA, 13.80% of the variation in earned credit hours, and 11.30% of the variation in intent to remain in the science major. After controlling for SAT scores and high school GPAs, self-efficacy was a statistically significant predictor of credit hours earned and accounted for 5.60% of the variation; its unique contribution in explaining the variation in first year GPA and intent to remain in the science major was not statistically significant. Stress and acculturation were not statistically significant predictors of any of the outcome measures. Analysis of the qualitative data resulted in six themes (a) high science self-efficacy, (b) stressors, (c) positive role of stress, (d) Anglo-oriented, (e) bicultural, and (f) family. The quantitative and qualitative results were synthesized and practical implications were discussed.
Multifractal analysis of geophysical time series in the urban lake of Créteil (France).
NASA Astrophysics Data System (ADS)
Mezemate, Yacine; Tchiguirinskaia, Ioulia; Bonhomme, Celine; Schertzer, Daniel; Lemaire, Bruno Jacques; Vinçon leite, Brigitte; Lovejoy, Shaun
2013-04-01
Urban water bodies contribute to the environmental quality of cities. They regulate heat, contribute to the beauty of the landscape and provide space for leisure activities (aquatic sports, swimming). As they are often artificial, they are only a few meters deep, which confers some specific properties on them. Indeed, they are particularly sensitive to global environmental changes, including climate change, eutrophication and contamination by micro-pollutants due to the urbanization of the watershed. Monitoring their quality has become a major challenge for urban areas. The need for a tool for predicting short-term proliferation of potentially toxic phytoplankton therefore arises. In lakes, the behavior of biological and physical (temperature) fields is mainly driven by the turbulence regime in the water. Turbulence is highly nonlinear, nonstationary and intermittent. This is why statistical tools are needed to characterize the evolution of the fields. Knowledge of all the statistical moments of a given field is necessary to fully characterize its probability distribution. This possibility is offered by multifractal analysis, based on the assumption of scale invariance. To investigate the effect of space-time variability of temperature, chlorophyll and dissolved oxygen on cyanobacteria proliferation in the urban lake of Creteil (France), a spectral analysis is first performed on each time series (or on subsamples) to obtain an overall estimate of their scaling behavior. Then a multifractal analysis (Trace Moment, Double Trace Moment) estimates the statistical moments of different orders. This analysis is adapted to the specific properties of the studied time series, i.e., the presence of large-scale gradients. The nonlinear behavior of the scaling functions K(q) confirms that the investigated aquatic time series are indeed multifractal and highly intermittent. Knowledge of the universal multifractal parameters is the key to calculating the different statistical moments and thus making predictions about the fields. In conclusion, the relationships between the fields are highlighted, with a discussion of the cross-predictability of the different fields. This opens a prospect for the use of this kind of time series analysis in the field of limnology. The authors acknowledge the financial support from the R2DS-PLUMMME and Climate-KIC BlueGreenDream projects.
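A rough sketch of a trace-moment estimate of the scaling moment function K(q) is given below; the synthetic series and the dyadic coarse-graining scheme are illustrative assumptions rather than the authors' implementation.

```python
# Trace-moment estimate of K(q): coarse-grain a nonnegative, unit-mean series over
# dyadic scales, compute q-th moments at each scale, and take the slope of
# log(moment) versus log(scale ratio).
import numpy as np

rng = np.random.default_rng(5)
series = rng.lognormal(mean=0.0, sigma=0.7, size=4096)     # stand-in for a measured field
flux = series / series.mean()                              # normalize to unit mean

def K_of_q(flux, q, n_scales=8):
    lam, logm = [], []
    for j in range(n_scales):
        box = 2 ** j                                        # averaging window length
        coarse = flux[: len(flux) // box * box].reshape(-1, box).mean(axis=1)
        lam.append(len(flux) / box)                         # scale ratio lambda
        logm.append(np.log(np.mean(coarse ** q)))
    return np.polyfit(np.log(lam), logm, 1)[0]              # slope = K(q)

for q in (0.5, 1.0, 1.5, 2.0):
    print(f"K({q}) ~ {K_of_q(flux, q):.3f}")
```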
Ranking metrics in gene set enrichment analysis: do they matter?
Zyla, Joanna; Marczyk, Michal; Weiner, January; Polanska, Joanna
2017-05-12
There exist many methods for describing the complex relation between changes of gene expression in molecular pathways or gene ontologies under different experimental conditions. Among them, Gene Set Enrichment Analysis seems to be one of the most commonly used (over 10,000 citations). An important parameter, which can affect the final result, is the choice of a metric for the ranking of genes. Applying a default ranking metric may lead to poor results. In this work 28 benchmark data sets were used to evaluate the sensitivity and false positive rate of gene set analysis for 16 different ranking metrics, including new proposals. Furthermore, the robustness of the chosen methods to sample size was tested. Using the k-means clustering algorithm, a group of four metrics with the highest performance in terms of overall sensitivity, overall false positive rate and computational load was established: the absolute value of the Moderated Welch Test statistic, the Minimum Significant Difference, the absolute value of the Signal-to-Noise ratio and the Baumgartner-Weiss-Schindler test statistic. For false positive rate estimation, all selected ranking metrics were robust with respect to sample size. For sensitivity, the absolute value of the Moderated Welch Test statistic and the absolute value of the Signal-to-Noise ratio gave stable results, while the Baumgartner-Weiss-Schindler test and the Minimum Significant Difference showed better results for larger sample sizes. Finally, the Gene Set Enrichment Analysis method with all tested ranking metrics was parallelised and implemented in MATLAB, and is available at https://github.com/ZAEDPolSl/MrGSEA. Choosing a ranking metric in Gene Set Enrichment Analysis has a critical impact on the results of pathway enrichment analysis. The absolute value of the Moderated Welch Test has the best overall sensitivity and the Minimum Significant Difference has the best overall specificity of gene set analysis. When the number of non-normally distributed genes is high, using the Baumgartner-Weiss-Schindler test statistic gives better outcomes. It also finds more enriched pathways than the other tested metrics, which may lead to new biological discoveries.
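One of the simpler ranking metrics discussed above, the absolute signal-to-noise ratio, can be computed per gene as in the sketch below; the expression matrix and group labels are simulated placeholders.

```python
# Per-gene signal-to-noise ratio between two phenotype groups, then rank by |S2N|.
import numpy as np

rng = np.random.default_rng(6)
expr = rng.normal(size=(1000, 20))                 # 1000 genes x 20 samples
groups = np.array([0] * 10 + [1] * 10)             # two phenotypes

a, b = expr[:, groups == 0], expr[:, groups == 1]
s2n = (a.mean(axis=1) - b.mean(axis=1)) / (a.std(axis=1, ddof=1) + b.std(axis=1, ddof=1))
ranking = np.argsort(-np.abs(s2n))                 # genes ordered by |signal-to-noise|
print("top 5 genes by |S2N|:", ranking[:5])
```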
NASA Astrophysics Data System (ADS)
Vinh, T.
1980-08-01
There is a need for better and more effective lightning protection for transmission and switching substations. In the past, a number of empirical methods were utilized to design systems to protect substations and transmission lines from direct lightning strokes. The need exists for convenient analytical lightning models adequate for engineering usage. In this study, analytical lightning models were developed along with a method for improved analysis of the physical properties of lightning through their use. This method of analysis is based upon the most recent statistical field data. The result is an improved method for predicting the occurrence of shielding failure and for designing more effective protection of high and extra-high voltage substations from direct strokes.
NASA Technical Reports Server (NTRS)
Zimmerman, G. A.; Olsen, E. T.
1992-01-01
Noise power estimation in the High-Resolution Microwave Survey (HRMS) sky survey element is considered as an example of a constant false alarm rate (CFAR) signal detection problem. Order-statistic-based noise power estimators for CFAR detection are considered in terms of required estimator accuracy and estimator dynamic range. By limiting the dynamic range of the value to be estimated, the performance of an order-statistic estimator can be achieved by simpler techniques requiring only a single pass of the data. Simple threshold-and-count techniques are examined, and it is shown how several parallel threshold-and-count estimation devices can be used to expand the dynamic range to meet HRMS system requirements with minimal hardware complexity. An input/output (I/O) efficient limited-precision order-statistic estimator with wide but limited dynamic range is also examined.
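The contrast between an order-statistic noise-power estimate and a single-pass threshold-and-count estimate can be illustrated as follows; the exponential noise model and the trial thresholds are assumptions made for the example, not the HRMS design values.

```python
# Order-statistic estimate (median-based) versus threshold-and-count estimate of
# the scale of exponentially distributed noise power samples.
import numpy as np

rng = np.random.default_rng(7)
power = rng.exponential(scale=2.0, size=100_000)    # noise power samples, true scale 2.0

# order-statistic estimate: the median of exponential noise equals scale * ln(2)
median_est = np.median(power) / np.log(2)

# threshold-and-count: for a trial threshold T, the exceedance fraction of
# exponential noise is exp(-T / scale), so each exceedance count yields a scale estimate
trial_thresholds = np.array([1.0, 2.0, 4.0])
fractions = np.array([(power > t).mean() for t in trial_thresholds])
count_est = np.mean(-trial_thresholds / np.log(fractions))

print(f"median-based estimate {median_est:.3f}, threshold-and-count estimate {count_est:.3f}")
```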
NASA Astrophysics Data System (ADS)
Ducariu, A.; Constantin, G. C.; Puscas, N. N.
2005-08-01
In the small-gain approximation and the unsaturated regime, we report in this paper some original results concerning the evaluation of the Fano factor, statistical fluctuation and spontaneous emission factor, which characterize the photon statistics, as functions of the number of excited modes, dopant concentration and pump power in single- and double-pass Er3+-doped LiNbO3 straight waveguide amplifiers pumped near 1484 nm, using erfc, Gaussian and constant profiles of the Er3+ ions in the LiNbO3 crystal. We demonstrate that for 50 mW input pump power the Poisson photon statistics are maintained in the above-mentioned amplifiers for Er ion concentrations smaller than 10^26 m^-3, and that high gains and low noise figures are also achievable. The obtained results can be used for the design of optoelectronic integrated circuits.
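For reference, the Fano factor itself is simply the variance of the photon counts divided by their mean; the short sketch below illustrates it on simulated Poisson counts, not amplifier data.

```python
# Fano factor = variance / mean of photon counts; equals ~1 for Poisson statistics.
import numpy as np

rng = np.random.default_rng(8)
counts = rng.poisson(lam=50.0, size=10_000)        # photon counts per detection window
fano = counts.var(ddof=1) / counts.mean()
print(f"Fano factor ~ {fano:.3f} (Poisson light gives ~1)")
```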
The Content of Statistical Requirements for Authors in Biomedical Research Journals
Liu, Tian-Yi; Cai, Si-Yu; Nie, Xiao-Lu; Lyu, Ya-Qi; Peng, Xiao-Xia; Feng, Guo-Shuang
2016-01-01
Background: Robust statistical design, sound statistical analysis, and standardized presentation are important to enhance the quality and transparency of biomedical research. This systematic review was conducted to summarize the statistical reporting requirements introduced by biomedical research journals with an impact factor of 10 or above, so that researchers are able to give statistical issues serious consideration not only at the stage of data analysis but also at the stage of methodological design. Methods: Detailed statistical instructions for authors were downloaded from the homepage of each of the included journals or obtained directly from the editors via email. We then described the types and numbers of statistical guidelines introduced by different press groups. Items of the statistical reporting guidelines as well as particular requirements were summarized by frequency and grouped into design, method of analysis, and presentation, respectively. Finally, updated statistical guidelines and particular requirements for improvement were summed up. Results: In total, 21 of 23 press groups introduced at least one statistical guideline. More than half of the press groups update their statistical instructions for authors as new statistical reporting guidelines are issued. In addition, 16 press groups, covering 44 journals, address particular statistical requirements. Most of the particular requirements focused on the performance of statistical analysis and transparency in statistical reporting, including "address issues relevant to research design, including participant flow diagram, eligibility criteria, and sample size estimation," and "statistical methods and the reasons." Conclusions: Statistical requirements for authors are becoming increasingly refined. They remind researchers to give sufficient consideration not only to statistical methods during research design but also to standardized statistical reporting, which would be beneficial in providing stronger evidence and in making critical appraisal of evidence more accessible. PMID:27748343
Tu, Huakang; Sun, Liping; Dong, Xiao; Gong, Yuehua; Xu, Qian; Jing, Jingjing; Bostick, Roberd M; Wu, Xifeng; Yuan, Yuan
2017-05-01
We aimed to assess a serological biopsy using five stomach-specific circulating biomarkers-pepsinogen I (PGI), PGII, PGI/II ratio, anti-Helicobacter pylori (H. pylori) antibody, and gastrin-17 (G-17)-for identifying high-risk individuals and predicting risk of developing gastric cancer (GC). Among 12,112 participants with prospective follow-up from an ongoing population-based screening program using both serology and gastroscopy in China, we conducted a multi-phase study involving a cross-sectional analysis, a follow-up analysis, and an integrative risk prediction modeling analysis. In the cross-sectional analysis, the five biomarkers (especially PGII, the PGI/II ratio, and H. pylori sero-positivity) were associated with the presence of precancerous gastric lesions or GC at enrollment. In the follow-up analysis, low PGI levels and PGI/II ratios were associated with higher risk of developing GC, and both low (<0.5 pmol/l) and high (>4.7 pmol/l) G-17 levels were associated with higher risk of developing GC, suggesting a J-shaped association. In the risk prediction modeling analysis, the five biomarkers combined yielded a C statistic of 0.803 (95% confidence interval (CI)=0.789-0.816) and improved prediction beyond traditional risk factors (C statistic from 0.580 to 0.811, P<0.001) for identifying precancerous lesions at enrollment, and higher serological biopsy scores based on the five biomarkers at enrollment were associated with higher risk of developing GC during follow-up (P for trend <0.001). A serological biopsy composed of the five stomach-specific circulating biomarkers could be used to identify high-risk individuals for further diagnostic gastroscopy, and to stratify individuals' risk of developing GC and thus to guide targeted screening and precision prevention.
Joint principal trend analysis for longitudinal high-dimensional data.
Zhang, Yuping; Ouyang, Zhengqing
2018-06-01
We consider a research scenario motivated by integrating multiple sources of information for better knowledge discovery in diverse dynamic biological processes. Given two longitudinal high-dimensional datasets for a group of subjects, we want to extract shared latent trends and identify relevant features. To solve this problem, we present a new statistical method named as joint principal trend analysis (JPTA). We demonstrate the utility of JPTA through simulations and applications to gene expression data of the mammalian cell cycle and longitudinal transcriptional profiling data in response to influenza viral infections. © 2017, The International Biometric Society.
Analysis of Yb3+/Er3+-codoped microring resonator cross-grid matrices
NASA Astrophysics Data System (ADS)
Vallés, Juan A.; Gǎlǎtuş, Ramona
2014-09-01
An analytic model of the scattering response of a highly Yb3+/Er3+-codoped phosphate glass microring resonator matrix is considered to obtain the transfer functions of an M x N cross-grid microring resonator structure. A detailed model is then used to calculate the pump and signal propagation, including a microscopic statistical formalism to describe the high-concentration-induced energy-transfer mechanisms, and passive and active features are combined to realistically simulate the performance as a wavelength-selective amplifier or laser. This analysis allows the optimization of these structures for telecom or sensing applications.
LES, DNS and RANS for the analysis of high-speed turbulent reacting flows
NASA Technical Reports Server (NTRS)
Adumitroaie, V.; Colucci, P. J.; Taulbee, D. B.; Givi, P.
1995-01-01
The purpose of this research is to continue our efforts in advancing the state of knowledge in large eddy simulation (LES), direct numerical simulation (DNS), and Reynolds averaged Navier Stokes (RANS) methods for the computational analysis of high-speed reacting turbulent flows. In the second phase of this work, covering the period 1 Aug. 1994 - 31 Jul. 1995, we have focused our efforts on two programs: (1) developments of explicit algebraic moment closures for statistical descriptions of compressible reacting flows and (2) development of Monte Carlo numerical methods for LES of chemically reacting flows.
NASA Astrophysics Data System (ADS)
Kucharik, C.; Roth, J.
2002-12-01
The threat of global climate change has provoked policy-makers to consider plausible strategies to slow the accumulation of greenhouse gases, especially carbon dioxide, in the atmosphere. One such idea involves the sequestration of atmospheric carbon (C) in degraded agricultural soils as part of the Conservation Reserve Program (CRP). While the potential for significant C sequestration in CRP grassland ecosystems has been demonstrated, the paired-site sampling approach traditionally used to quantify soil C changes has not been evaluated with robust statistical analysis. In this study, 14 paired CRP (> 8 years old) and cropland sites in Dane County, Wisconsin (WI) were used to assess whether a paired-site sampling design could detect statistically significant differences (ANOVA) in mean soil organic C and total nitrogen (N) storage. We compared surface (0 to 10 cm) bulk density, and sampled soils (0 to 5, 5 to 10, and 10 to 25 cm) for textural differences and chemical analysis of organic matter (OM), soil organic C (SOC), total N, and pH. The CRP contributed to lowering soil bulk density by 13% (p < 0.0001) and increased SOC and OM storage (kg m-2) by 13 to 17% in the 0 to 5 cm layer (p = 0.1). We tested the statistical power associated with ANOVA for measured soil properties, and calculated minimum detectable differences (MDD). We concluded that 40 to 65 paired sites and soil sampling in 5 cm increments near the surface were needed to achieve an 80% confidence level (α = 0.05; β = 0.20) in soil C and N sequestration rates. Because soil C and total N storage was highly variable among these sites (CVs > 20%), only a 23 to 29% change in existing total organic C and N pools could be reliably detected. While C and N sequestration (247 kg C ha-1 yr-1 and 17 kg N ha-1 yr-1) may be occurring and confined to the surface 5 cm as part of the WI CRP, our sampling design did not statistically support the desired 80% confidence level. We conclude that the use of statistical power analysis is essential to ensure a high level of confidence in soil C and N sequestration rates quantified using paired plots.
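The kind of prospective power calculation advocated above can be sketched with statsmodels; the coefficient of variation, detectable change, and paired t-test framing below are illustrative assumptions, not the study's measured values.

```python
# Number of paired sites needed to detect a given relative change in soil C
# with alpha = 0.05 and power = 0.80, assuming a paired t-test.
from statsmodels.stats.power import TTestPower

cv = 0.25                    # assumed between-site coefficient of variation (~20-30%)
relative_change = 0.15       # change in soil C we would like to detect (15%)
effect_size = relative_change / cv

n_pairs = TTestPower().solve_power(effect_size=effect_size, alpha=0.05,
                                   power=0.80, alternative="two-sided")
print(f"paired sites needed: {n_pairs:.1f}")   # roughly 24 under these assumptions
```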
Kakourou, Alexia; Vach, Werner; Nicolardi, Simone; van der Burgt, Yuri; Mertens, Bart
2016-10-01
Mass spectrometry based clinical proteomics has emerged as a powerful tool for high-throughput protein profiling and biomarker discovery. Recent improvements in mass spectrometry technology have boosted the potential of proteomic studies in biomedical research. However, the complexity of the proteomic expression introduces new statistical challenges in summarizing and analyzing the acquired data. Statistical methods for optimally processing proteomic data are currently a growing field of research. In this paper we present simple, yet appropriate methods to preprocess, summarize and analyze high-throughput MALDI-FTICR mass spectrometry data, collected in a case-control fashion, while dealing with the statistical challenges that accompany such data. The known statistical properties of the isotopic distribution of the peptide molecules are used to preprocess the spectra and translate the proteomic expression into a condensed data set. Information on either the intensity level or the shape of the identified isotopic clusters is used to derive summary measures on which diagnostic rules for disease status allocation will be based. Results indicate that both the shape of the identified isotopic clusters and the overall intensity level carry information on the class outcome and can be used to predict the presence or absence of the disease.
NASA Astrophysics Data System (ADS)
Li, Jun-Wei; Cao, Jun-Wei
2010-04-01
One challenge in large-scale scientific data analysis is to monitor data in real time in a distributed environment. For the LIGO (Laser Interferometer Gravitational-wave Observatory) project, a dedicated suite of data monitoring tools (DMT) has been developed, yielding good extensibility to new data types and high flexibility in a distributed environment. Several services are provided, including visualization of data information in various forms and file output of monitoring results. In this work, a DMT monitor, OmegaMon, is developed for tracking statistics of gravitational-wave (GW) burst triggers that are generated from a specific GW burst data analysis pipeline, the Omega Pipeline. Such results can provide diagnostic information as a reference for trigger post-processing and interferometer maintenance.
Analysis of pediatric blood lead levels in New York City for 1970-1976.
Billick, I H; Curran, A S; Shier, D R
1979-01-01
A study was completed of more than 170,000 records of pediatric venous blood lead levels and supporting demographic information collected in New York City during 1970-1976. The geometric mean (GM) blood lead level shows a consistent cyclical variation superimposed on an overall decreasing trend with time for all ages and ethnic groups studied. The GM blood lead levels for blacks are significantly greater than those for either Hispanics or whites. Regression analysis indicates a significant statistical association between GM blood lead level and ambient air lead level, after appropriate adjustments are made for age and ethnic group. These highly significant statistical relationships provide extremely strong incentives and directions for research into causal factors related to blood lead levels in children. PMID:499123
Friedman, David B
2012-01-01
All quantitative proteomics experiments measure variation between samples. When performing large-scale experiments that involve multiple conditions or treatments, the experimental design should include the appropriate number of individual biological replicates from each condition so that a relevant biological signal can be distinguished from technical noise. Multivariate statistical analyses, such as principal component analysis (PCA), provide a global perspective on experimental variation, thereby enabling an assessment of whether the variation describes the expected biological signal or the unanticipated technical/biological noise inherent in the system. Examples will be shown from high-resolution multivariable DIGE experiments in which PCA was instrumental in demonstrating biologically significant variation as well as sample outliers, fouled samples, and overriding technical variation that would not be readily observed using standard univariate tests.
Vadalà, Rossella; Mottese, Antonio F.; Bua, Giuseppe D.; Salvo, Andrea; Mallamace, Domenico; Corsaro, Carmelo; Vasi, Sebastiano; Giofrè, Salvatore V.; Alfa, Maria; Cicero, Nicola; Dugo, Giacomo
2016-01-01
We performed a statistical analysis of the concentration of mineral elements, by means of inductively coupled plasma mass spectrometry (ICP-MS), in different varieties of garlic from Spain, Tunisia, and Italy. Nubia Red Garlic (Sicily) is one of the best-known Italian varieties and belongs to the traditional Italian food products (P.A.T.) of the Ministry of Agriculture, Food, and Forestry. The obtained results suggest that the concentrations of the considered elements may serve as geographical indicators for discriminating the origin of the different samples. In particular, we found a relatively high content of selenium in the variety known as Nubia red garlic, which could be of value as an anticarcinogenic agent. PMID:28231115
Use of model calibration to achieve high accuracy in analysis of computer networks
Frogner, Bjorn; Guarro, Sergio; Scharf, Guy
2004-05-11
A system and method are provided for creating a network performance prediction model, and calibrating the prediction model, through application of network load statistical analyses. The method includes characterizing the measured load on the network, which may include background load data obtained over time, and may further include directed load data representative of a transaction-level event. Probabilistic representations of load data are derived to characterize the statistical persistence of the network performance variability and to determine delays throughout the network. The probabilistic representations are applied to the network performance prediction model to adapt the model for accurate prediction of network performance. Certain embodiments of the method and system may be used for analysis of the performance of a distributed application characterized as data packet streams.
Application of Statistics in Engineering Technology Programs
ERIC Educational Resources Information Center
Zhan, Wei; Fink, Rainer; Fang, Alex
2010-01-01
Statistics is a critical tool for robustness analysis, measurement system error analysis, test data analysis, probabilistic risk assessment, and many other fields in the engineering world. Traditionally, however, statistics is not extensively used in undergraduate engineering technology (ET) programs, resulting in a major disconnect from industry…
ERIC Educational Resources Information Center
Osler, James Edward
2014-01-01
This monograph provides an epistemological rationale for the design of a novel post hoc statistical measure called "Tri-Center Analysis". This new statistic is designed to analyze the post hoc outcomes of the Tri-Squared Test. In Tri-Center Analysis, trichotomous parametric inferential statistical measures are calculated from…
Fast fMRI provides high statistical power in the analysis of epileptic networks.
Jacobs, Julia; Stich, Julia; Zahneisen, Benjamin; Assländer, Jakob; Ramantani, Georgia; Schulze-Bonhage, Andreas; Korinthenberg, Rudolph; Hennig, Jürgen; LeVan, Pierre
2014-03-01
EEG-fMRI is a unique method to combine the high temporal resolution of EEG with the high spatial resolution of MRI to study generators of intrinsic brain signals such as sleep grapho-elements or epileptic spikes. While the standard EPI sequence in fMRI experiments has a temporal resolution of around 2.5-3 s, a newly established fast fMRI sequence called MREG (Magnetic-Resonance-Encephalography) provides a temporal resolution of around 100 ms. This technical novelty promises to improve statistics, facilitate correction of physiological artifacts and improve the understanding of epileptic networks in fMRI. The present study compares simultaneous EEG-EPI and EEG-MREG analyzing epileptic spikes to determine the yield of fast MRI in the analysis of intrinsic brain signals. Patients with frequent interictal spikes (>3/20 min) underwent EEG-MREG and EEG-EPI (3 T, 20 min each, voxel size 3×3×3 mm, EPI TR=2.61 s, MREG TR=0.1 s). Timings of the spikes were used in an event-related analysis to generate activation maps of t-statistics (FMRISTAT, |t|>3.5, cluster size: 7 voxels, p<0.05 corrected). For both sequences, the amplitude and location of significant BOLD activations were compared with the spike topography. 13 patients were recorded and 33 different spike types could be analyzed. Peak t-values were significantly higher in MREG than in EPI (p<0.0001). Positive BOLD effects correlating with the spike topography were found in 8/29 spike types using the EPI and in 22/33 spike types using the MREG sequence. Negative BOLD responses in the default mode network could be observed in 3/29 spike types with the EPI and in 19/33 with the MREG sequence. With the latter method, BOLD changes were observed even when few spikes occurred during the investigation. Simultaneous EEG-MREG is thus possible with good EEG quality and shows higher sensitivity with regard to the localization of spike-related BOLD responses than EEG-EPI. The development of new methods of analysis for this sequence, such as modeling of physiological noise, temporal analysis of the BOLD signal and defining appropriate thresholds, is required to fully profit from its high temporal resolution. © 2013.