NASA Astrophysics Data System (ADS)
Saputra, K. V. I.; Cahyadi, L.; Sembiring, U. A.
2018-01-01
In this paper, we assess our traditional elementary statistics education and introduce an elementary statistics course built on simulation-based inference. To assess our statistics class, we adapt the well-known CAOS (Comprehensive Assessment of Outcomes in Statistics) test, which serves as an external measure of students' basic statistical literacy and is generally accepted as such a measure. We also introduce a new teaching method in the elementary statistics class: unlike the traditional course, we use a simulation-based inference method to conduct hypothesis testing. The literature has shown that this new teaching method works very well in increasing students' understanding of statistics.
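Simulation-based inference replaces formula-based tests with direct resampling. As a minimal sketch of the idea (the scores and group labels below are invented, not from the paper), a two-sample permutation test of a difference in means can be written as:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical exam scores under two teaching methods
traditional = np.array([62, 71, 58, 65, 70, 60, 68, 64])
simulation = np.array([70, 75, 66, 72, 78, 69, 74, 71])

observed = simulation.mean() - traditional.mean()
pooled = np.concatenate([traditional, simulation])

n_sim = 10_000
count = 0
for _ in range(n_sim):
    rng.shuffle(pooled)
    # Reassign group labels at random and recompute the statistic
    diff = pooled[len(traditional):].mean() - pooled[:len(traditional)].mean()
    if abs(diff) >= abs(observed):
        count += 1

print(f"observed difference = {observed:.2f}, permutation p = {count / n_sim:.4f}")
```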
Statistical Learning Analysis in Neuroscience: Aiming for Transparency
Hanke, Michael; Halchenko, Yaroslav O.; Haxby, James V.; Pollmann, Stefan
2009-01-01
Encouraged by a rise of reciprocal interest between the machine learning and neuroscience communities, several recent studies have demonstrated the explanatory power of statistical learning techniques for the analysis of neural data. In order to facilitate a wider adoption of these methods, neuroscientific research needs to ensure a maximum of transparency to allow for comprehensive evaluation of the employed procedures. We argue that such transparency requires “neuroscience-aware” technology for the performance of multivariate pattern analyses of neural data that can be documented in a comprehensive, yet comprehensible way. Recently, we introduced PyMVPA, a specialized Python framework for machine learning based data analysis that addresses this demand. Here, we review its features and applicability to various neural data modalities. PMID:20582270
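PyMVPA itself provides dataset handling, classifiers, and cross-validation tailored to neural data; the sketch below only illustrates the general shape of a cross-validated multivariate pattern analysis, using scikit-learn rather than PyMVPA's own API, on synthetic data:

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score, StratifiedKFold

rng = np.random.default_rng(0)

# Synthetic "fMRI" data: 120 trials x 500 voxels, two stimulus conditions
X = rng.normal(size=(120, 500))
y = np.repeat([0, 1], 60)
X[y == 1, :20] += 0.5  # weak multivariate signal in 20 voxels

# Cross-validated classification accuracy as the MVPA sensitivity measure
clf = LinearSVC()
scores = cross_val_score(clf, X, y, cv=StratifiedKFold(n_splits=6))
print(f"mean cross-validated accuracy: {scores.mean():.2f}")
```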
Animal movement: Statistical models for telemetry data
Hooten, Mevin B.; Johnson, Devin S.; McClintock, Brett T.; Morales, Juan M.
2017-01-01
The study of animal movement has always been a key element in ecological science, because it is inherently linked to critical processes that scale from individuals to populations and communities to ecosystems. Rapid improvements in biotelemetry data collection and processing technology have given rise to a variety of statistical methods for characterizing animal movement. This book serves as a comprehensive reference for the types of statistical models used to study individual-based animal movement.
Scene-based nonuniformity correction and enhancement: pixel statistics and subpixel motion.
Zhao, Wenyi; Zhang, Chao
2008-07-01
We propose a framework for scene-based nonuniformity correction (NUC) and nonuniformity correction and enhancement (NUCE) that is required by focal-plane-array-like sensors to obtain clean, enhanced-quality images. The core of the proposed framework is a novel registration-based nonuniformity correction super-resolution (NUCSR) method that is bootstrapped by statistical scene-based NUC methods. Based on a comprehensive imaging model and accurate parametric motion estimation, we are able to remove severe/structured nonuniformity and, in the presence of subpixel motion, to simultaneously improve image resolution. One important feature of our NUCSR method is the adoption of a parametric motion model that allows us to (1) handle many practical scenarios where parametric motions are present and (2) carry out, in principle, perfect super-resolution by exploiting available subpixel motions. Experiments with real data demonstrate the efficiency of the proposed NUCE framework and the effectiveness of the NUCSR method.
Wu, Yazhou; Zhou, Liang; Li, Gaoming; Yi, Dali; Wu, Xiaojiao; Liu, Xiaoyu; Zhang, Yanqi; Liu, Ling; Yi, Dong
2015-01-01
Although a substantial number of studies focus on the teaching and application of medical statistics in China, few studies comprehensively evaluate the recognition of and demand for medical statistics. In addition, the results of these various studies differ and are insufficiently comprehensive and systematic. This investigation aimed to evaluate the general cognition of and demand for medical statistics by undergraduates, graduates, and medical staff in China. We performed a comprehensive database search related to the cognition of and demand for medical statistics from January 2007 to July 2014 and conducted a meta-analysis of non-controlled studies with sub-group analysis for undergraduates, graduates, and medical staff. There are substantial differences with respect to the cognition of theory in medical statistics among undergraduates (73.5%), graduates (60.7%), and medical staff (39.6%). The demand for theory in medical statistics is high among graduates (94.6%), undergraduates (86.1%), and medical staff (88.3%). Regarding specific statistical methods, the cognition of basic statistical methods is higher than that of advanced statistical methods. The demand for certain advanced statistical methods, including (but not limited to) multiple analysis of variance (ANOVA), multiple linear regression, and logistic regression, is higher than that for basic statistical methods. The use rates of the Statistical Package for the Social Sciences (SPSS) software and statistical analysis software (SAS) are only 55% and 15%, respectively. The overall statistical competence of undergraduates, graduates, and medical staff is insufficient, and their ability to practically apply their statistical knowledge is limited, which constitutes an unsatisfactory state of affairs for medical statistics education. Because the demand for skills in this area is increasing, the need to reform medical statistics education in China has become urgent.
SSD for R: A Comprehensive Statistical Package to Analyze Single-System Data
ERIC Educational Resources Information Center
Auerbach, Charles; Schudrich, Wendy Zeitlin
2013-01-01
The need for statistical analysis in single-subject designs presents a challenge, as analytical methods that are applied to group comparison studies are often not appropriate in single-subject research. "SSD for R" is a robust set of statistical functions with wide applicability to single-subject research. It is a comprehensive package…
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of the acute myocardial infarction (AMI) incidence rate from 1999 to 2013 in Tianjin with the Cochran-Armitage trend (CAT) test and linear regression analysis, and the results were compared. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend P value
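For illustration, a minimal Cochran-Armitage trend test over yearly case counts might look as follows; the counts below are made up, and the formulas are the standard score-based trend statistic, not necessarily the exact variant used in the study:

```python
import numpy as np
from scipy.stats import norm

def cochran_armitage(cases, totals, scores):
    """Two-sided Cochran-Armitage test for trend in proportions
    across ordered groups (e.g., calendar years)."""
    cases, totals, scores = map(np.asarray, (cases, totals, scores))
    p = cases.sum() / totals.sum()
    t = np.sum(scores * (cases - totals * p))
    var = p * (1 - p) * (np.sum(totals * scores**2)
                         - np.sum(totals * scores)**2 / totals.sum())
    z = t / np.sqrt(var)
    return z, 2 * norm.sf(abs(z))

# Hypothetical AMI cases and population by year
years = np.arange(1999, 2014)
population = np.full(years.size, 1_000_000)
cases = 300 + 8 * (years - 1999) + np.random.default_rng(1).integers(-20, 20, years.size)

z, p_value = cochran_armitage(cases, population, years - 1999)
print(f"Z = {z:.2f}, P = {p_value:.3g}")
```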
Weir, Christopher J; Butcher, Isabella; Assi, Valentina; Lewis, Stephanie C; Murray, Gordon D; Langhorne, Peter; Brady, Marian C
2018-03-07
Rigorous, informative meta-analyses rely on availability of appropriate summary statistics or individual participant data. For continuous outcomes, especially those with naturally skewed distributions, summary information on the mean or variability often goes unreported. While full reporting of original trial data is the ideal, we sought to identify methods for handling unreported mean or variability summary statistics in meta-analysis. We undertook two systematic literature reviews to identify methodological approaches used to deal with missing mean or variability summary statistics. Five electronic databases were searched, in addition to the Cochrane Colloquium abstract books and the Cochrane Statistics Methods Group mailing list archive. We also conducted cited reference searching and emailed topic experts to identify recent methodological developments. Details recorded included the description of the method, the information required to implement the method, any underlying assumptions and whether the method could be readily applied in standard statistical software. We provided a summary description of the methods identified, illustrating selected methods in example meta-analysis scenarios. For missing standard deviations (SDs), following screening of 503 articles, fifteen methods were identified in addition to those reported in a previous review. These included Bayesian hierarchical modelling at the meta-analysis level; summary statistic level imputation based on observed SD values from other trials in the meta-analysis; a practical approximation based on the range; and algebraic estimation of the SD based on other summary statistics. Following screening of 1124 articles for methods estimating the mean, one approximate Bayesian computation approach and three papers based on alternative summary statistics were identified. Illustrative meta-analyses showed that when replacing a missing SD the approximation using the range minimised loss of precision and generally performed better than omitting trials. When estimating missing means, a formula using the median, lower quartile and upper quartile performed best in preserving the precision of the meta-analysis findings, although in some scenarios, omitting trials gave superior results. Methods based on summary statistics (minimum, maximum, lower quartile, upper quartile, median) reported in the literature facilitate more comprehensive inclusion of randomised controlled trials with missing mean or variability summary statistics within meta-analyses.
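As a concrete illustration of the kinds of approximations reviewed, the sketch below implements two widely cited rules of thumb (the range/4 rule for a missing SD, and a quartile-based formula for a missing mean); these are generic textbook approximations in the spirit of the methods surveyed, not necessarily the exact formulas the review recommends:

```python
def sd_from_range(minimum, maximum):
    # Rough rule of thumb: for moderate sample sizes the range
    # spans roughly four standard deviations.
    return (maximum - minimum) / 4

def mean_from_quartiles(q1, median, q3):
    # Quartile-based estimate of the mean for skewed data
    # (a Wan et al. 2014 style approximation).
    return (q1 + median + q3) / 3

# Hypothetical trial reporting only range and quartiles of length of stay
print(sd_from_range(2, 30))           # -> 7.0
print(mean_from_quartiles(4, 7, 12))  # -> 7.67
```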
Ensemble-based prediction of RNA secondary structures.
Aghaeepour, Nima; Hoos, Holger H
2013-04-24
Accurate structure prediction methods play an important role in the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that has been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach. In addition, AveRNA allows an intuitive and effective control of the trade-off between false negative and false positive base pair predictions. Finally, AveRNA can make use of arbitrary sets of secondary structure prediction procedures and can therefore be used to leverage improvements in prediction accuracy offered by algorithms and energy models developed in the future. Our data, MATLAB software and a web-based version of AveRNA are publicly available at http://www.cs.ubc.ca/labs/beta/Software/AveRNA.
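AveRNA's actual combination scheme is described in the paper; as a simplified illustration of the ensemble idea only, several predicted secondary structures can be combined by voting on individual base pairs, with the vote threshold controlling the trade-off between false positive and false negative pairs (the structures and threshold below are hypothetical):

```python
from collections import Counter

def ensemble_pairs(predictions, threshold):
    """Keep base pairs predicted by at least `threshold` fraction
    of the component methods.

    predictions: list of sets of (i, j) base-pair index tuples,
                 one set per prediction method.
    """
    votes = Counter(pair for struct in predictions for pair in struct)
    n = len(predictions)
    return {pair for pair, v in votes.items() if v / n >= threshold}

# Three hypothetical predictions for the same sequence
m1 = {(1, 20), (2, 19), (3, 18), (7, 12)}
m2 = {(1, 20), (2, 19), (4, 17)}
m3 = {(1, 20), (2, 19), (3, 18)}

# Lower threshold -> more pairs kept (fewer false negatives, more false positives)
print(ensemble_pairs([m1, m2, m3], threshold=0.5))  # majority vote
print(ensemble_pairs([m1, m2, m3], threshold=1.0))  # unanimous pairs only
```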
Practice-based evidence study design for comparative effectiveness research.
Horn, Susan D; Gassaway, Julie
2007-10-01
To describe a new, rigorous, comprehensive practice-based evidence for clinical practice improvement (PBE-CPI) study methodology, and compare its features, advantages, and disadvantages to those of randomized controlled trials and sophisticated statistical methods for comparative effectiveness research. PBE-CPI incorporates natural variation within data from routine clinical practice to determine what works, for whom, when, and at what cost. It uses the knowledge of front-line caregivers, who develop study questions and define variables as part of a transdisciplinary team. Its comprehensive measurement framework provides a basis for analyses of significant bivariate and multivariate associations between treatments and outcomes, controlling for patient differences, such as severity of illness. PBE-CPI studies can uncover better practices more quickly than randomized controlled trials or sophisticated statistical methods, while achieving many of the same advantages. We present examples of actionable findings from PBE-CPI studies in postacute care settings related to comparative effectiveness of medications, nutritional support approaches, incontinence products, physical therapy activities, and other services. Outcomes improved when practices associated with better outcomes in PBE-CPI analyses were adopted in practice.
Gu, Hai Ting; Xie, Ping; Sang, Yan Fang; Wu, Zi Yi
2018-04-01
Abrupt change is an important manifestation of dramatic variation in hydrological processes in the context of global climate change, and its accurate recognition has great significance for understanding hydrological process changes and for carrying out practical hydrology and water resources work. Traditional methods are unreliable near both ends of a sample series, and different methods often give inconsistent results. To solve this problem, we proposed a comprehensive weighted recognition method for hydrological abrupt change, built by comparing and weighting 12 commonly used change-point tests. The reliability of the method was verified by Monte Carlo statistical tests. The results showed that the efficiency of the 12 methods was influenced by factors including the coefficient of variation (Cv), the deviation coefficient (Cs) before the change point, the mean value difference coefficient, the Cv difference coefficient, and the Cs difference coefficient, but had no significant relationship with the mean value of the sequence. Based on the performance of each method, the weight of each test was assigned according to the statistical test results. The sliding rank sum test and the sliding run test had the highest weights, whereas the RS test had the lowest weight. By this means, the change point with the largest comprehensive weight can be selected as the final result when the results of the different methods are inconsistent. This method was used to analyze the maximum-flow series (1-day, 3-day, 5-day, 7-day and 1-month) of Jiajiu station in the lower reaches of the Lancang River. The results showed that each series had an obvious jump in 2004, which was in agreement with the physical causes of hydrological process change and water conservancy construction, verifying the rationality and reliability of the proposed method.
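As a toy illustration of the weighting idea (the weights and detected years below are invented; real use would first run the 12 tests on the series itself), candidate change points can be scored by summing the weights of the methods that detect them:

```python
def comprehensive_change_point(detections, weights):
    """detections: mapping method name -> detected change year
    weights: mapping method name -> reliability weight
    Returns the candidate year with the largest total weight."""
    score = {}
    for method, year in detections.items():
        score[year] = score.get(year, 0.0) + weights[method]
    return max(score, key=score.get), score

detections = {"sliding_rank_sum": 2004, "sliding_run": 2004,
              "pettitt": 2004, "mann_kendall": 1998, "rs_test": 1987}
weights = {"sliding_rank_sum": 0.14, "sliding_run": 0.13,
           "pettitt": 0.11, "mann_kendall": 0.10, "rs_test": 0.04}

best, scores = comprehensive_change_point(detections, weights)
print(best, scores)  # -> 2004, plus the aggregated weight per candidate
```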
Ozaki, Vitor A.; Ghosh, Sujit K.; Goodwin, Barry K.; Shirota, Ricardo
2009-01-01
This article presents a statistical model of agricultural yield data based on a set of hierarchical Bayesian models that allow joint modeling of temporal and spatial autocorrelation. This method captures a comprehensive range of the various uncertainties involved in predicting crop insurance premium rates as opposed to the more traditional ad hoc, two-stage methods that are typically based on independent estimation and prediction. A panel data set of county-average yield data was analyzed for 290 counties in the State of Paraná (Brazil) for the period of 1990 through 2002. Posterior predictive criteria are used to evaluate different model specifications. This article provides substantial improvements in the statistical and actuarial methods often applied to the calculation of insurance premium rates. These improvements are especially relevant to situations where data are limited. PMID:19890450
Trends in study design and the statistical methods employed in a leading general medicine journal.
Gosho, M; Sato, Y; Nagashima, K; Takahashi, S
2018-02-01
Study design and statistical methods have become core components of medical research, and the methodology has become more multifaceted and complicated over time. The study of the comprehensive details and current trends of study design and statistical methods is required to support the future implementation of well-planned clinical studies providing information about evidence-based medicine. Our purpose was to illustrate the study designs and statistical methods employed in recent medical literature. This was an extension study of Sato et al. (N Engl J Med 2017; 376: 1086-1087), which reviewed 238 articles published in 2015 in the New England Journal of Medicine (NEJM) and briefly summarized the statistical methods employed in NEJM. Using the same database, we performed a new investigation of the detailed trends in study design and individual statistical methods that were not reported in the Sato study. Under the CONSORT statement, prespecification and justification of sample size are obligatory in planning intervention studies. Although standard survival methods (e.g. the Kaplan-Meier estimator and the Cox regression model) were most frequently applied, the Gray test and the Fine-Gray proportional hazard model for considering competing risks were sometimes used for a more valid statistical inference. With respect to handling missing data, model-based methods, which are valid for missing-at-random data, were more frequently used than single imputation methods. These methods are not recommended as a primary analysis, but they have been applied in many clinical trials. Group sequential design with interim analyses was one of the standard designs, and novel designs, such as adaptive dose selection and sample size re-estimation, were sometimes employed in NEJM. Model-based approaches for handling missing data should replace single imputation methods for primary analysis in the light of the information found in some publications. Use of adaptive designs with interim analyses is increasing after the presentation of the FDA guidance for adaptive design. © 2017 John Wiley & Sons Ltd.
ProUCL version 4.1.00 Documentation Downloads
ProUCL version 4.1.00 is a comprehensive statistical software package equipped with the statistical methods and graphical tools needed to address many environmental sampling and statistical issues, as described in various guidance documents.
Image object recognition based on the Zernike moment and neural networks
NASA Astrophysics Data System (ADS)
Wan, Jianwei; Wang, Ling; Huang, Fukan; Zhou, Liangzhu
1998-03-01
This paper first gives a comprehensive discussion of the concept of artificial neural networks, their research methods, and their relation to information processing. On the basis of this discussion, we expound the mathematical similarity of artificial neural networks and information processing. The paper then presents a new method of image recognition based on invariant features and neural networks, using the image Zernike transform. The method not only is invariant to rotation, shift, and scaling of the image object, but also has good fault tolerance and robustness. It is also compared with a statistical classifier and an invariant-moments recognition method.
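A minimal modern sketch of this pipeline (not the paper's original implementation) can be built from the mahotas implementation of Zernike moments and a scikit-learn classifier; the image paths, radius, degree, and thresholding step here are all placeholders:

```python
import numpy as np
import mahotas
from sklearn.neural_network import MLPClassifier

def zernike_features(image_path, radius=64, degree=8):
    """Rotation-invariant Zernike moment magnitudes of a shape image."""
    img = mahotas.imread(image_path)   # any image loader would do here
    if img.ndim == 3:
        img = img.mean(axis=2)         # collapse RGB to grayscale
    binary = img > img.mean()          # crude threshold as a placeholder
    return mahotas.features.zernike_moments(binary, radius, degree=degree)

# Hypothetical training set: image paths and integer class labels
paths = ["obj_a_01.png", "obj_a_02.png", "obj_b_01.png", "obj_b_02.png"]
labels = [0, 0, 1, 1]

X = np.array([zernike_features(p) for p in paths])
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000).fit(X, labels)
print(clf.predict([zernike_features("obj_unknown.png")]))
```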
Time-variant random interval natural frequency analysis of structures
NASA Astrophysics Data System (ADS)
Wu, Binhua; Wu, Di; Gao, Wei; Song, Chongmin
2018-02-01
This paper presents a new robust method, namely the unified interval Chebyshev-based random perturbation method, to tackle the hybrid random-interval structural natural frequency problem. In the proposed approach, the random perturbation method is implemented to furnish the statistical features (i.e., mean and standard deviation), and a Chebyshev surrogate model strategy is incorporated to formulate the statistical information of the natural frequency with regard to the interval inputs. The comprehensive analysis framework combines the strengths of both methods in a way that dramatically reduces the computational cost. The presented method is thus capable of investigating the day-to-day time-variant natural frequency of structures accurately and efficiently under concrete intrinsic creep effects with probabilistic and interval uncertain variables. The extreme bounds of the mean and standard deviation of the natural frequency are captured through an optimization strategy embedded within the analysis procedure. Three numerical examples, with a progressive relationship in terms of both structure type and uncertainty variables, are demonstrated to justify the computational applicability, accuracy and efficiency of the proposed method.
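The flavor of the approach can be shown on a toy single-degree-of-freedom system, where the stiffness k is an interval variable and the mass m is random; everything below is a made-up illustration, not the paper's structural model. Sample the random variable at Chebyshev nodes of the interval variable, fit Chebyshev surrogates for the mean and standard deviation of the natural frequency, then bound them over the interval:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

rng = np.random.default_rng(0)

K_LO, K_HI = 0.9e6, 1.1e6    # interval stiffness [N/m]
M_MEAN, M_SD = 100.0, 5.0    # random mass [kg], assumed normal

def freq(k, m):
    return np.sqrt(k / m) / (2 * np.pi)  # natural frequency [Hz]

# Chebyshev nodes over the stiffness interval
i = np.arange(9)
nodes = (K_LO + K_HI) / 2 + (K_HI - K_LO) / 2 * np.cos(np.pi * (i + 0.5) / 9)

# Random perturbation step stand-in: Monte Carlo statistics of f at each node
m_samples = rng.normal(M_MEAN, M_SD, 20_000)
means = np.array([freq(k, m_samples).mean() for k in nodes])
sds = np.array([freq(k, m_samples).std() for k in nodes])

# Chebyshev surrogates mapping stiffness -> mean / sd of frequency
x = (2 * nodes - (K_LO + K_HI)) / (K_HI - K_LO)  # map interval to [-1, 1]
mean_fit, sd_fit = C.chebfit(x, means, 4), C.chebfit(x, sds, 4)

# Bound the statistics over the interval by dense evaluation of the surrogate
grid = np.linspace(-1, 1, 1001)
mu, sd = C.chebval(grid, mean_fit), C.chebval(grid, sd_fit)
print(f"mean freq bounds [Hz]: [{mu.min():.3f}, {mu.max():.3f}]")
print(f"sd   freq bounds [Hz]: [{sd.min():.3f}, {sd.max():.3f}]")
```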
Wu, Hao
2018-05-01
In structural equation modelling (SEM), a robust adjustment to the test statistic or to its reference distribution is needed when its null distribution deviates from a χ² distribution, which usually arises when data do not follow a multivariate normal distribution. Unfortunately, existing studies on this issue typically focus on only a few methods and neglect the majority of alternative methods in statistics. Existing simulation studies typically consider only non-normal distributions of data that either satisfy asymptotic robustness or lead to an asymptotic scaled χ² distribution. In this work we conduct a comprehensive study that involves both typical methods in SEM and less well-known methods from the statistics literature. We also propose the use of several novel non-normal data distributions that are qualitatively different from the non-normal distributions widely used in existing studies. We found that several under-studied methods give the best performance under specific conditions, but the Satorra-Bentler method remains the most viable method for most situations. © 2017 The British Psychological Society.
Functional annotation of regulatory pathways.
Pandey, Jayesh; Koyutürk, Mehmet; Kim, Yohan; Szpankowski, Wojciech; Subramaniam, Shankar; Grama, Ananth
2007-07-01
Standardized annotations of biomolecules in interaction networks (e.g. Gene Ontology) provide comprehensive understanding of the function of individual molecules. Extending such annotations to pathways is a critical component of functional characterization of cellular signaling at the systems level. We propose a framework for projecting gene regulatory networks onto the space of functional attributes using multigraph models, with the objective of deriving statistically significant pathway annotations. We first demonstrate that annotations of pairwise interactions do not generalize to indirect relationships between processes. Motivated by this result, we formalize the problem of identifying statistically overrepresented pathways of functional attributes. We establish the hardness of this problem by demonstrating the non-monotonicity of common statistical significance measures. We propose a statistical model that emphasizes the modularity of a pathway, evaluating its significance based on the coupling of its building blocks. We complement the statistical model by an efficient algorithm and software, Narada, for computing significant pathways in large regulatory networks. Comprehensive results from our methods applied to the Escherichia coli transcription network demonstrate that our approach is effective in identifying known, as well as novel biological pathway annotations. Narada is implemented in Java and is available at http://www.cs.purdue.edu/homes/jpandey/narada/.
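Narada's modularity-based significance model is its own contribution; for background only, the generic notion of a statistically overrepresented functional attribute is often scored with a hypergeometric tail probability, as in this sketch (the gene counts are invented):

```python
from scipy.stats import hypergeom

# Of N annotated genes, K carry a given functional attribute.
# A pathway of n genes contains k genes with that attribute.
N, K, n, k = 4000, 120, 25, 6

# P(X >= k): chance of drawing this many attribute genes at random
p_value = hypergeom.sf(k - 1, N, K, n)
print(f"enrichment p-value = {p_value:.3g}")
```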
NASA Astrophysics Data System (ADS)
Eum, H. I.; Cannon, A. J.
2015-12-01
Climate models are a key tool for investigating the impacts of projected future climate conditions on regional hydrologic systems. However, there is a considerable mismatch in spatial resolution between GCMs and regional applications, particularly for a region characterized by complex terrain such as the Korean peninsula. A downscaling procedure is therefore essential for assessing regional impacts of climate change. Numerous statistical downscaling methods have been used, mainly because of their computational efficiency and simplicity. In this study, four statistical downscaling methods [Bias-Correction/Spatial Disaggregation (BCSD), Bias-Correction/Constructed Analogue (BCCA), Multivariate Adaptive Constructed Analogs (MACA), and Bias-Correction/Climate Imprint (BCCI)] are applied to downscale the latest Climate Forecast System Reanalysis data to stations for precipitation, maximum temperature, and minimum temperature over South Korea. Using a split-sampling scheme, all methods are calibrated with observational station data for the 19 years from 1973 to 1991 and tested on the recent 19 years from 1992 to 2010. To assess the skill of the downscaling methods, we construct a comprehensive suite of performance metrics that measure the ability to reproduce temporal correlation, distributions, spatial correlation, and extreme events. In addition, we employ the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to identify robust statistical downscaling methods based on the performance metrics for each season. The results show that downscaling skill is considerably affected by the skill of CFSR, and all methods lead to large improvements in all performance metrics. According to the seasonal performance metrics, when TOPSIS is applied, MACA is identified as the most reliable and robust method for all variables and seasons. Note that this result is derived from CFSR output, which is treated as near-perfect climate data in climate studies; the ranking may therefore change when various GCMs are downscaled and evaluated. Nevertheless, it should help end-users (i.e., modelers or water resources managers) understand and select the downscaling methods best suited to their regional applications.
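TOPSIS itself is a short computation; a generic implementation, with a made-up metric matrix in which rows are downscaling methods and columns are benefit-type performance metrics with equal weights, looks like:

```python
import numpy as np

def topsis(matrix, weights):
    """Rank alternatives (rows) by closeness to the ideal solution.
    Assumes all criteria are benefit-type (larger is better)."""
    m = matrix / np.linalg.norm(matrix, axis=0)  # vector-normalize criteria
    v = m * weights
    ideal, anti = v.max(axis=0), v.min(axis=0)
    d_plus = np.linalg.norm(v - ideal, axis=1)
    d_minus = np.linalg.norm(v - anti, axis=1)
    return d_minus / (d_plus + d_minus)          # closeness in [0, 1]

# Rows: BCSD, BCCA, MACA, BCCI; columns: hypothetical skill scores
scores = np.array([[0.82, 0.74, 0.70, 0.61],
                   [0.78, 0.80, 0.66, 0.65],
                   [0.85, 0.79, 0.75, 0.72],
                   [0.80, 0.76, 0.71, 0.63]])
closeness = topsis(scores, np.full(4, 0.25))
for name, c in zip(["BCSD", "BCCA", "MACA", "BCCI"], closeness):
    print(f"{name}: {c:.3f}")
```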
Kowalska, Joanna Beata; Mazurek, Ryszard; Gąsiorek, Michał; Zaleski, Tomasz
2018-04-05
The paper provides a complex, critical assessment of heavy metal soil pollution using different indices. Pollution indices are widely considered a useful tool for the comprehensive evaluation of the degree of contamination. Moreover, they can have a great importance in the assessment of soil quality and the prediction of future ecosystem sustainability, especially in the case of farmlands. Eighteen indices previously described by several authors (Igeo, PI, EF, Cf, PIsum, PINemerow, PLI, PIave, PIVector, PIN, MEC, CSI, MERMQ, Cdeg, RI, mCd and ExF) as well as the newly published Biogeochemical Index (BGI) were compared. The content, as determined by other authors, of the most widely investigated heavy metals (Cd, Pb and Zn) in farmland, forest and urban soils was used as a database for the calculation of all of the presented indices, and this shows, based on statistical methods, the similarities and differences between them. The indices were initially divided into two groups: individual and complex. In order to achieve a more precise classification, our study attempted to further split indices based on their purpose and method of calculation. The strengths and weaknesses of each index were assessed; in addition, a comprehensive method for pollution index choice is presented, in order to best interpret pollution in different soils (farmland, forest and urban). This critical review also contains an evaluation of various geochemical backgrounds (GBs) used in heavy metal soil pollution assessments. The authors propose a comprehensive method in order to assess soil quality, based on the application of local and reference GB.
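Two of the simplest individual indices can be written directly from their standard definitions; a small sketch with hypothetical topsoil concentrations and a hypothetical local geochemical background is:

```python
import numpy as np

def single_pollution_index(c, background):
    """PI: measured concentration over geochemical background."""
    return c / background

def geoaccumulation_index(c, background):
    """Igeo = log2(c / (1.5 * background)); 1.5 buffers natural variation."""
    return np.log2(c / (1.5 * background))

# Hypothetical topsoil concentrations (mg/kg) and local background values
metals = {"Cd": (1.2, 0.3), "Pb": (85.0, 25.0), "Zn": (210.0, 70.0)}
for metal, (c, bg) in metals.items():
    print(f"{metal}: PI = {single_pollution_index(c, bg):.2f}, "
          f"Igeo = {geoaccumulation_index(c, bg):.2f}")
```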
NASA Astrophysics Data System (ADS)
York, Kathleen Christine
This mixed method study explored the relationship between metacomprehension strategy awareness and reading comprehension performance with narrative and science texts. Participants, 132 eighth-grade, predominately African American students, attending one middle school in a southeastern state, were administered a narrative and science version of the Metacomprehension Strategy Index (MSI) and asked to identify helpful strategic behaviors from six clustered subcategories (predicting and verifying; previewing; purpose setting; self-questioning; drawing from background knowledge; and summarizing and applying fix-up strategies). Participants also read and answered comprehension questions about narrative and science passages. Findings revealed no statistically significant differences in overall metacomprehension awareness with narrative and science texts. Statistically significant (p<.05) differences were found for two of the six subcategories, indicating students preview and set purpose more often with science than narrative texts. Findings also indicated overall narrative and science metacomprehension awareness and comprehension performance scores were statistically significantly (p<.01) related. Specifically, the category of summarizing and applying fix-up strategies was the strongest predictor of comprehension performance for both narrative and science texts. The qualitative phase of this study explored the relationship between metacomprehension awareness with narrative and science texts and the comprehension performance of six middle school students, three of whom scored high overall on the narrative and science text comprehension assessments in phase one of the study, and three of whom scored low. A qualitative analysis of multiple sources of data, including video-taped interviews and think-alouds, revealed the three high scoring participants engaged in competent school-based, metacognitive conversations infused with goal, self, and narrative talk and demonstrated multi-strategic engagements with narrative and science texts. In stark contrast, the three low scoring participants engaged in dissonant school-based talk infused with disclaimers, over-generalized, decontextualized, and literalized answers and demonstrated robotic, limited (primarily rereading and restating), and frustrated strategic acts when interacting with both narrative and science texts. The educational implications are discussed. This dissertation was funded by the Office of Special Education Programs, Federal Office Grant Award No. 324E031501.
Allen, Peter J.; Dorozenko, Kate P.; Roberts, Lynne D.
2016-01-01
Quantitative research methods are essential to the development of professional competence in psychology. They are also an area of weakness for many students. In particular, students are known to struggle with the skill of selecting quantitative analytical strategies appropriate for common research questions, hypotheses and data types. To begin understanding this apparent deficit, we presented nine psychology undergraduates (who had all completed at least one quantitative methods course) with brief research vignettes, and asked them to explicate the process they would follow to identify an appropriate statistical technique for each. Thematic analysis revealed that all participants found this task challenging, and even those who had completed several research methods courses struggled to articulate how they would approach the vignettes on more than a very superficial and intuitive level. While some students recognized that there is a systematic decision making process that can be followed, none could describe it clearly or completely. We then presented the same vignettes to 10 psychology academics with particular expertise in conducting research and/or research methods instruction. Predictably, these “experts” were able to describe a far more systematic, comprehensive, flexible, and nuanced approach to statistical decision making, which begins early in the research process, and pays consideration to multiple contextual factors. They were sensitive to the challenges that students experience when making statistical decisions, which they attributed partially to how research methods and statistics are commonly taught. This sensitivity was reflected in their pedagogic practices. When asked to consider the format and features of an aid that could facilitate the statistical decision making process, both groups expressed a preference for an accessible, comprehensive and reputable resource that follows a basic decision tree logic. For the academics in particular, this aid should function as a teaching tool, which engages the user with each choice-point in the decision making process, rather than simply providing an “answer.” Based on these findings, we offer suggestions for tools and strategies that could be deployed in the research methods classroom to facilitate and strengthen students' statistical decision making abilities. PMID:26909064
Whole-Range Assessment: A Simple Method for Analysing Allelopathic Dose-Response Data
An, Min; Pratley, J. E.; Haig, T.; Liu, D.L.
2005-01-01
Based on the typical biological responses of an organism to allelochemicals (hormesis), concepts of whole-range assessment and inhibition index were developed for improved analysis of allelopathic data. Examples of their application are presented using data drawn from the literature. The method is concise and comprehensive, and makes data grouping and multiple comparisons simple, logical, and possible. It improves data interpretation, enhances research outcomes, and is a statistically efficient summary of the plant response profiles. PMID:19330165
Uncertainty-based Optimization Algorithms in Designing Fractionated Spacecraft
Ning, Xin; Yuan, Jianping; Yue, Xiaokui
2016-01-01
A fractionated spacecraft is an innovative application of a distributed space system. To fully understand the impact of various uncertainties on its development, launch and in-orbit operation, we use the stochastic mission-cycle cost to comprehensively evaluate the survivability, flexibility, reliability and economy of the ways of dividing the various modules of the different configurations of fractionated spacecraft. We systematically describe the concept, analyze the evaluation and optimal design methods developed in recent years, and propose the stochastic mission-cycle cost for comprehensive evaluation. We also establish models of the costs, such as module development, launch and deployment, and of the impacts of their respective uncertainties. Finally, we carry out Monte Carlo simulations of the complete mission-cycle costs of various configurations of the fractionated spacecraft under various uncertainties, and give and compare the probability density distributions and statistical characteristics of the stochastic mission-cycle cost under the two strategies of timed and non-timed module replacement. The simulation results verify the effectiveness of the comprehensive evaluation method and show that it can evaluate the adaptability of the fractionated spacecraft under different technical and mission conditions. PMID:26964755
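The underlying Monte Carlo pattern is straightforward; the cost model below is a deliberately crude stand-in, with all distributions and figures invented, for the paper's module development, launch and replacement cost models:

```python
import numpy as np

rng = np.random.default_rng(7)
N = 100_000  # Monte Carlo runs

def mission_cycle_cost(timed_replacement: bool):
    dev = rng.normal(80.0, 10.0, N)               # module development cost, $M
    launch = rng.lognormal(np.log(30.0), 0.2, N)  # launch & deployment, $M
    failure_t = rng.exponential(6.0, N)           # module failure time, years
    mission_len = 10.0
    if timed_replacement:
        # Replace on a fixed 5-year schedule regardless of failures
        replacements = np.full(N, mission_len // 5)
    else:
        # Replace only when a failure occurs within the mission
        replacements = (failure_t < mission_len).astype(float)
    return dev + launch + replacements * 25.0     # replacement cost per event, $M

for strategy, timed in [("timed", True), ("non-timed", False)]:
    cost = mission_cycle_cost(timed)
    print(f"{strategy:9s}: mean = {cost.mean():6.1f} $M, "
          f"std = {cost.std():5.1f}, 95th pct = {np.percentile(cost, 95):6.1f}")
```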
Swetha, Jonnalagadda Laxmi; Arpita, Ramisetti; Srikanth, Chintalapani; Nutalapati, Rajasekhar
2014-01-01
Background: Biostatistics is an integral part of research protocols. In any field of inquiry or investigation, the data obtained are subsequently classified, analyzed and tested for accuracy by statistical methods. Statistical analysis of collected data thus forms the basis for all evidence-based conclusions. Aim: The aim of this study is to evaluate the cognition, comprehension and application of biostatistics in research among postgraduate students in Periodontics in India. Materials and Methods: A total of 391 postgraduate students registered for a master's course in periodontics at various dental colleges across India were included in the survey. Data regarding the level of knowledge and understanding of biostatistics and its application in the design and conduct of research protocols were collected using a dichotomous questionnaire. Descriptive statistics were used for data analysis. Results: Nearly 79.2% of the students were aware of the importance of biostatistics in research, 55-65% were familiar with the MS-EXCEL spreadsheet for graphical representation of data and with the statistical software available on the internet, 26.0% had biostatistics as a mandatory subject in their curriculum, 9.5% had tried to perform statistical analysis on their own, while 3.0% were successful in performing the statistical analysis of their studies on their own. Conclusion: Biostatistics should play a central role in the planning, conduct, interim analysis, final analysis and reporting of periodontal research, especially by postgraduate students. Indian postgraduate students in periodontics are aware of the importance of biostatistics in research, but their level of understanding and application is still basic and needs to be addressed. PMID:24744547
Nour-Eldein, Hebatallah
2016-01-01
Background: Given the limited statistical knowledge of most physicians, it is not uncommon to find statistical errors in research articles. Objectives: To determine the statistical methods used and to assess the statistical errors in family medicine (FM) research articles that were published between 2010 and 2014. Methods: This was a cross-sectional study. All 66 FM research articles that were published over 5 years by FM authors with affiliation to Suez Canal University were screened by the researcher between May and August 2015. Types and frequencies of statistical methods were reviewed in all 66 FM articles. All 60 articles with identified inferential statistics were examined for statistical errors and deficiencies. A comprehensive 58-item checklist based on statistical guidelines was used to evaluate the statistical quality of the FM articles. Results: Inferential methods were recorded in 62/66 (93.9%) of the FM articles. Advanced analyses were used in 29/66 (43.9%). Contingency tables 38/66 (57.6%), regression (logistic, linear) 26/66 (39.4%), and t-tests 17/66 (25.8%) were the most commonly used inferential tests. Within the 60 FM articles with identified inferential statistics, the deficiencies were: no prior sample size calculation 19/60 (31.7%), application of the wrong statistical tests 17/60 (28.3%), incomplete documentation of statistics 59/60 (98.3%), reporting a P value without the test statistic 32/60 (53.3%), no reporting of confidence intervals with effect size measures 12/60 (20.0%), use of the mean (standard deviation) to describe ordinal/non-normal data 8/60 (13.3%), and errors related to interpretation, mainly conclusions unsupported by the study data 5/60 (8.3%). Conclusion: Inferential statistics were used in the majority of FM articles. Data analysis and the reporting of statistics are areas for improvement in FM research articles. PMID:27453839
ERIC Educational Resources Information Center
Shintani, Natsuko; Li, Shaofeng; Ellis, Rod
2013-01-01
This article reports a meta-analysis of studies that investigated the relative effectiveness of comprehension-based instruction (CBI) and production-based instruction (PBI). The meta-analysis only included studies that featured a direct comparison of CBI and PBI in order to ensure methodological and statistical robustness. A total of 35 research…
Application of Ontology Technology in Health Statistic Data Analysis.
Guo, Minjiang; Hu, Hongpu; Lei, Xingyun
2017-01-01
Research Purpose: to establish a health management ontology for the analysis of health statistics data. Proposed Methods: this paper established a health management ontology based on an analysis of the concepts in the China Health Statistics Yearbook, and used Protégé to define the syntactic and semantic structure of health statistical data. Six classes of top-level ontology concepts and their subclasses were extracted, and object properties and data properties were defined to establish the construction of these classes. By ontology instantiation, we can integrate multi-source heterogeneous data and enable administrators to have an overall understanding and analysis of the health statistics data. Ontology technology provides a comprehensive and unified information integration structure for the health management domain and lays a foundation for the efficient analysis of multi-source, heterogeneous health system management data and the enhancement of management efficiency.
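As a minimal sketch of the instantiation step, with owlready2 standing in for Protégé's OWL output and with class and property names that are hypothetical echoes of yearbook concepts:

```python
from owlready2 import Thing, DataProperty, get_ontology

onto = get_ontology("http://example.org/health_stats.owl")

with onto:
    class HealthInstitution(Thing): pass
    class Hospital(HealthInstitution): pass
    class bedCount(DataProperty):
        domain = [HealthInstitution]
        range = [int]

# Instantiate with a (hypothetical) yearbook record
h = Hospital("ExampleGeneralHospital")
h.bedCount = [2200]

onto.save(file="health_stats.owl")
```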
USING STATISTICAL METHODS FOR WATER QUALITY MANAGEMENT: ISSUES, PROBLEMS AND SOLUTIONS
This book is readable, comprehensible and, I anticipate, usable. The author has an enthusiasm which comes out in the text. Statistics is presented as a living, breathing subject, still being debated, defined, and refined. This statistics book actually has examples in the field...
Prediction of protein secondary structure content for the twilight zone sequences.
Homaeian, Leila; Kurgan, Lukasz A; Ruan, Jishou; Cios, Krzysztof J; Chen, Ke
2007-11-15
Secondary protein structure carries information about local structural arrangements, which include three major conformations: alpha-helices, beta-strands, and coils. A significant majority of successful methods for predicting the secondary structure are based on multiple sequence alignment. However, multiple alignment fails to provide accurate results when a sequence comes from the twilight zone, that is, when it is characterized by low (<30%) homology. To this end, we propose a novel method for the prediction of secondary structure content through comprehensive sequence representation, called PSSC-core. The method uses a multiple linear regression model and introduces a comprehensive feature-based sequence representation to predict the amounts of helices and strands for sequences from the twilight zone. The PSSC-core method was tested and compared with two other state-of-the-art prediction methods on a set of 2187 twilight-zone sequences. The results indicate that our method provides better predictions for both helix and strand content. PSSC-core is shown to provide statistically significantly better results than the competing methods, reducing the prediction error by 5-7% for helix and 7-9% for strand content predictions. The proposed feature-based sequence representation uses a comprehensive set of physicochemical properties that are custom-designed for each of the helix and strand content predictions. It includes composition and composition moment vectors, the frequency of tetra-peptides associated with helical and strand conformations, various property-based groups such as exchange groups, chemical groups of the side chains and a hydrophobic group, auto-correlations based on hydrophobicity, side-chain masses and hydropathy, and conformational patterns for beta-sheets. The PSSC-core method provides an alternative for predicting secondary structure content that can be used to validate and constrain the results of other structure prediction methods. At the same time, it also provides useful insight into the design of successful protein sequence representations that can be used in developing new methods for predicting other aspects of secondary protein structure. (c) 2007 Wiley-Liss, Inc.
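The regression backbone of such a method is easy to sketch: represent each sequence by its amino acid composition (just one slice of the paper's much richer feature set) and fit a multiple linear regression for helix content. The training sequences and helix fractions below are random placeholders:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """20-dimensional amino acid composition vector."""
    seq = seq.upper()
    return np.array([seq.count(a) / len(seq) for a in AMINO_ACIDS])

# Placeholder training data: sequences with hypothetical helix fractions
train_seqs = ["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", "GSHMTTPSHLSDRYELGEILGFGGMSEVH"]
train_helix = [0.55, 0.20]

X = np.array([composition(s) for s in train_seqs])
model = LinearRegression().fit(X, train_helix)

query = "MSTNPKPQRKTKRNTNRRPQDVKFPGG"
print(f"predicted helix content: {model.predict([composition(query)])[0]:.2f}")
```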
Murphy, Thomas; Schwedock, Julie; Nguyen, Kham; Mills, Anna; Jones, David
2015-01-01
New recommendations for the validation of rapid microbiological methods have been included in the revised Technical Report 33 release from the PDA. The changes include a more comprehensive review of the statistical methods to be used to analyze data obtained during validation. This case study applies those statistical methods to accuracy, precision, ruggedness, and equivalence data obtained using a rapid microbiological methods system being evaluated for water bioburden testing. Results presented demonstrate that the statistical methods described in the PDA Technical Report 33 chapter can all be successfully applied to the rapid microbiological method data sets and gave the same interpretation for equivalence to the standard method. The rapid microbiological method was in general able to pass the requirements of PDA Technical Report 33, though the study shows that there can be occasional outlying results and that caution should be used when applying statistical methods to low average colony-forming unit values. Prior to use in a quality-controlled environment, any new method or technology has to be shown to work as designed by the manufacturer for the purpose required. For new rapid microbiological methods that detect and enumerate contaminating microorganisms, additional recommendations have been provided in the revised PDA Technical Report No. 33. The changes include a more comprehensive review of the statistical methods to be used to analyze data obtained during validation. This paper applies those statistical methods to analyze accuracy, precision, ruggedness, and equivalence data obtained using a rapid microbiological method system being validated for water bioburden testing. The case study demonstrates that the statistical methods described in the PDA Technical Report No. 33 chapter can be successfully applied to rapid microbiological method data sets and give the same comparability results for similarity or difference as the standard method. © PDA, Inc. 2015.
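Equivalence between a rapid method and the standard method is often assessed with two one-sided tests (TOST); a generic sketch using statsmodels, with invented counts and an assumed equivalence margin, is:

```python
import numpy as np
from statsmodels.stats.weightstats import ttost_ind

rng = np.random.default_rng(3)

# Hypothetical log10 CFU recoveries from split water samples
standard = rng.normal(2.00, 0.15, 30)  # compendial method
rapid = rng.normal(2.03, 0.15, 30)     # rapid microbiological method

# Assumed equivalence margin of +/- 0.3 log10 units
p_value, lower_test, upper_test = ttost_ind(rapid, standard, -0.3, 0.3)
print(f"TOST p-value = {p_value:.4f}")  # small p -> equivalent within the margin
```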
ERIC Educational Resources Information Center
Freng, Scott; Webber, David; Blatter, Jamin; Wing, Ashley; Scott, Walter D.
2011-01-01
Comprehension of statistics and research methods is crucial to understanding psychology as a science (APA, 2007). However, psychology majors sometimes approach methodology courses with derision or anxiety (Onwuegbuzie & Wilson, 2003; Rajecki, Appleby, Williams, Johnson, & Jeschke, 2005); consequently, students may postpone…
A peaking-regulation-balance-based method for wind & PV power integrated accommodation
NASA Astrophysics Data System (ADS)
Zhang, Jinfang; Li, Nan; Liu, Jun
2018-02-01
The rapid development of China's new energy, now and in the future, should focus on the cooperation of wind and PV power. Based on an analysis of the system peaking balance, combined with the statistical features of wind and PV power output characteristics, a method for the comprehensive integrated accommodation analysis of wind and PV power is put forward. The wind power installed capacity is first determined from the electric power balance during the night peak-load period of a typical day; the PV power installed capacity can then be determined from the midday peak-load hours. This effectively solves the uncertainty problem that arises when traditional methods try to determine the combination of wind and solar power simultaneously. The simulation results have validated the effectiveness of the proposed method.
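A toy version of the two-step sizing logic, with all loads, capacity credits, and output factors invented for illustration, can be expressed as:

```python
# Step 1: size wind from the night peak balance. Only a fraction of wind
# capacity (its "capacity credit") can be counted on at the night peak.
night_peak_load = 9_000.0        # MW
conventional_at_night = 7_800.0  # MW dispatchable at the night peak
wind_capacity_credit = 0.15      # assumed firm fraction of wind rating at night

wind_capacity = (night_peak_load - conventional_at_night) / wind_capacity_credit

# Step 2: size PV from the midday peak-load hours, when PV output is strongest.
midday_peak_load = 10_500.0      # MW
conventional_at_noon = 8_000.0   # MW
pv_output_factor = 0.70          # assumed PV output fraction at the midday peak

pv_capacity = (midday_peak_load - conventional_at_noon
               - wind_capacity_credit * wind_capacity) / pv_output_factor

print(f"wind: {wind_capacity:.0f} MW, PV: {pv_capacity:.0f} MW")
```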
A Multi-level Fuzzy Evaluation Method for Smart Distribution Network Based on Entropy Weight
NASA Astrophysics Data System (ADS)
Li, Jianfang; Song, Xiaohui; Gao, Fei; Zhang, Yu
2017-05-01
Smart distribution networks are considered the future trend of distribution networks. In order to comprehensively evaluate the construction level of smart distribution networks and to guide the practice of smart distribution construction, a multi-level fuzzy evaluation method based on entropy weight is proposed. Firstly, focusing on both the conventional characteristics of distribution networks and new characteristics of smart distribution networks, such as self-healing and interaction, a multi-level evaluation index system covering power supply capability, power quality, economy, reliability and interaction is established. Then, a combination weighting method based on the Delphi method and the entropy weight method is put forward, which takes into account not only the importance of each evaluation index in the experts' subjective view, but also the objective, differentiating information in the index values. Thirdly, a multi-level evaluation method based on fuzzy theory is put forward. Lastly, an example is conducted based on the statistical data of some cities' distribution networks, and the evaluation method is proved effective and rational.
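The objective half of the combined weighting, the entropy weight method, follows a standard recipe; a generic implementation, with a hypothetical index matrix, is:

```python
import numpy as np

def entropy_weights(matrix):
    """Entropy weight method: rows = alternatives, columns = indices.
    Assumes benefit-type indices with positive values."""
    m = matrix / matrix.sum(axis=0)           # column-normalize to proportions
    k = 1.0 / np.log(matrix.shape[0])
    plogp = np.where(m > 0, m * np.log(m), 0.0)
    e = -k * plogp.sum(axis=0)                # entropy per index
    d = 1.0 - e                               # degree of diversification
    return d / d.sum()                        # entropy weights

# Hypothetical scores of 4 distribution networks on 5 indices
# (supply capability, power quality, economy, reliability, interaction)
scores = np.array([[0.82, 0.91, 0.64, 0.88, 0.42],
                   [0.78, 0.87, 0.70, 0.90, 0.55],
                   [0.85, 0.89, 0.58, 0.86, 0.35],
                   [0.80, 0.90, 0.66, 0.91, 0.60]])
print(entropy_weights(scores).round(3))
```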
Quantifying the indirect impacts of climate on agriculture: an inter-method comparison
NASA Astrophysics Data System (ADS)
Calvin, Kate; Fisher-Vanden, Karen
2017-11-01
Climate change and increases in CO2 concentration affect the productivity of land, with implications for land use, land cover, and agricultural production. Much of the literature on the effect of climate on agriculture has focused on linking projections of changes in climate to process-based or statistical crop models. However, the changes in productivity have broader economic implications that cannot be quantified in crop models alone. How important are these socio-economic feedbacks to a comprehensive assessment of the impacts of climate change on agriculture? In this paper, we attempt to measure the importance of these interaction effects through an inter-method comparison between process models, statistical models, and integrated assessment model (IAMs). We find the impacts on crop yields vary widely between these three modeling approaches. Yield impacts generated by the IAMs are 20%-40% higher than the yield impacts generated by process-based or statistical crop models, with indirect climate effects adjusting yields by between -12% and +15% (e.g. input substitution and crop switching). The remaining effects are due to technological change.
NASA Astrophysics Data System (ADS)
Shi, Liehang; Ling, Tonghui; Zhang, Jianguo
2016-03-01
Radiologists currently use a variety of terminologies and standards in most hospitals in China; in some departments, different sections even use different terminologies. In this presentation, we introduce a medical semantic comprehension system (MedSCS) that extracts semantic information about clinical findings and conclusions from free-text radiology reports, so that the reports can be classified correctly against medical term indexing standards such as RadLex or SNOMED CT. Our system (MedSCS) combines rule-based and statistics-based methods, which improves both the performance and the scalability of MedSCS. To evaluate the overall system and measure the accuracy of its outcomes, we developed computational methods to calculate precision, recall, F-score, and exact confidence intervals.
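The evaluation metrics named at the end of this record are standard and easy to reproduce. Below is a minimal Python sketch computing precision, recall, F-score, and a Clopper-Pearson exact confidence interval; the confusion-matrix counts are invented for illustration, since the record does not include them:

```python
from scipy.stats import beta

def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson exact confidence interval for a proportion k/n."""
    lo = 0.0 if k == 0 else beta.ppf(alpha / 2, k, n - k + 1)
    hi = 1.0 if k == n else beta.ppf(1 - alpha / 2, k + 1, n - k)
    return lo, hi

# Hypothetical confusion-matrix counts for one report class.
tp, fp, fn = 412, 38, 25

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f_score = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.3f}, exact CI={exact_ci(tp, tp + fp)}")
print(f"recall={recall:.3f},    exact CI={exact_ci(tp, tp + fn)}")
print(f"F-score={f_score:.3f}")
```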
Barnes, Stephen; Benton, H. Paul; Casazza, Krista; Cooper, Sara; Cui, Xiangqin; Du, Xiuxia; Engler, Jeffrey; Kabarowski, Janusz H.; Li, Shuzhao; Pathmasiri, Wimal; Prasain, Jeevan K.; Renfrow, Matthew B.; Tiwari, Hemant K.
2017-01-01
Metabolomics, a systems biology discipline representing analysis of known and unknown pathways of metabolism, has grown tremendously over the past 20 years. Because of its comprehensive nature, metabolomics requires careful consideration of the question(s) being asked, the scale needed to answer the question(s), collection and storage of the sample specimens, methods for extraction of the metabolites from biological matrices, the analytical method(s) to be employed and the quality control of the analyses, how collected data are correlated, the statistical methods to determine metabolites undergoing significant change, putative identification of metabolites, and the use of stable isotopes to aid in verifying metabolite identity and establishing pathway connections and fluxes. This second part of a comprehensive description of the methods of metabolomics focuses on data analysis, emerging methods in metabolomics and the future of this discipline. PMID:28239968
Aryee, Martin J.; Jaffe, Andrew E.; Corrada-Bravo, Hector; Ladd-Acosta, Christine; Feinberg, Andrew P.; Hansen, Kasper D.; Irizarry, Rafael A.
2014-01-01
Motivation: The recently released Infinium HumanMethylation450 array (the ‘450k’ array) provides a high-throughput assay to quantify DNA methylation (DNAm) at ∼450 000 loci across a range of genomic features. Although less comprehensive than high-throughput sequencing-based techniques, this product is more cost-effective and promises to be the most widely used DNAm high-throughput measurement technology over the next several years. Results: Here we describe a suite of computational tools that incorporate state-of-the-art statistical techniques for the analysis of DNAm data. The software is structured to easily adapt to future versions of the technology. We include methods for preprocessing, quality assessment and detection of differentially methylated regions from the kilobase to the megabase scale. We show how our software provides a powerful and flexible development platform for future methods. We also illustrate how our methods empower the technology to make discoveries previously thought to be possible only with sequencing-based methods. Availability and implementation: http://bioconductor.org/packages/release/bioc/html/minfi.html. Contact: khansen@jhsph.edu; rafa@jimmy.harvard.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24478339
[Dermatoglyphics in the prognostication of constitutional and physical traits in humans].
Mazur, E S; Sidorenko, A G
2009-01-01
The present study was designed to elucidate the relationship between palmar and digital dermatoglyphic patterns and descriptive signs of human appearance, based on the results of a comprehensive anthropometric examination of 2620 men and 380 women. A battery of different statistical methods was used to treat the results of the dactyloscopic records. These analyses demonstrated correlations between skin patterns and external body features that can be used to construct diagnostic models for the purpose of personality identification.
Swetha, Jonnalagadda Laxmi; Arpita, Ramisetti; Srikanth, Chintalapani; Nutalapati, Rajasekhar
2014-01-01
Biostatistics is an integral part of research protocols. In any field of inquiry or investigation, the data obtained are subsequently classified, analyzed and tested for accuracy by statistical methods. Statistical analysis of collected data thus forms the basis for all evidence-based conclusions. The aim of this study was to evaluate the cognition, comprehension and application of biostatistics in research among postgraduate students in Periodontics in India. A total of 391 postgraduate students registered for a master's course in periodontics at various dental colleges across India were included in the survey. Data regarding the level of knowledge, understanding and application of biostatistics in the design and conduct of research protocols were collected using a dichotomous questionnaire. Descriptive statistics were used for data analysis. Of the students, 79.2% were aware of the importance of biostatistics in research, 55-65% were familiar with the MS-EXCEL spreadsheet for graphical representation of data and with the statistical software packages available on the internet, 26.0% had biostatistics as a mandatory subject in their curriculum, 9.5% had tried to perform statistical analysis on their own, and only 3.0% had succeeded in performing the statistical analysis of their studies themselves. Biostatistics should play a central role in the planning, conduct, interim analysis, final analysis and reporting of periodontal research, especially by postgraduate students. Indian postgraduate students in periodontics are aware of the importance of biostatistics in research, but their level of understanding and application is still basic and needs to be addressed.
A Principal Component Analysis/Fuzzy Comprehensive Evaluation for Rockburst Potential in Kimberlite
NASA Astrophysics Data System (ADS)
Pu, Yuanyuan; Apel, Derek; Xu, Huawei
2018-02-01
Kimberlite is an igneous rock which sometimes bears diamonds. Most of the diamonds mined in the world today are found in kimberlite ores. Burst potential in kimberlite has not been investigated, because kimberlite is mostly mined using open-pit methods, which pose very little threat of rock bursting. However, as mining depth keeps increasing, mines convert to underground mining methods, which can pose a threat of rock bursting in kimberlite. This paper focuses on the burst potential of kimberlite at a diamond mine in northern Canada. A combined model using the methods of principal component analysis (PCA) and fuzzy comprehensive evaluation (FCE) is developed to process data from 12 different locations in kimberlite pipes. Based on the 12 calculated fuzzy evaluation vectors, 8 locations show a moderate burst potential, 2 locations show no burst potential, and 2 locations show strong and violent burst potential, respectively. Using statistical principles, a Mahalanobis distance is adopted to build a comprehensive fuzzy evaluation vector for the whole mine; the final evaluation of burst potential is moderate, which is verified by an actual rockbursting situation at the mine site.
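The PCA-plus-FCE pipeline can be illustrated compactly. The following is a hedged Python sketch assuming triangular membership functions, illustrative grade cut-offs, and variance-based weights; the paper's actual indicators, membership shapes, and weights are not reproduced here:

```python
import numpy as np

def triangular(x, a, b, c):
    """Triangular membership function peaking at b on the interval [a, c]."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Hypothetical rock-mechanics indicators for 12 locations (rows) x 6 factors.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 6))

# PCA: keep enough components to cover ~90% of the variance.
Xs = (X - X.mean(0)) / X.std(0)
U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
k = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), 0.90) + 1
scores = Xs @ Vt[:k].T

# Fuzzy comprehensive evaluation: membership of each PC score in
# "none", "moderate", "strong" burst-potential grades (illustrative cut-offs).
grades = [(-3, -2, 0), (-1, 0, 1), (0, 2, 3)]
weights = (s[:k] ** 2) / np.sum(s[:k] ** 2)            # variance-based weights
members = np.stack([triangular(scores, *g) for g in grades], axis=-1)
fuzzy_vec = np.einsum("k,nkg->ng", weights, members)   # 12 x 3 evaluation vectors
print(fuzzy_vec.argmax(axis=1))                        # grade with max membership
```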
A weighted U-statistic for genetic association analyses of sequencing data.
Wei, Changshuai; Li, Ming; He, Zihuai; Vsevolozhskaya, Olga; Schaid, Daniel J; Lu, Qing
2014-12-01
With advancements in next-generation sequencing technology, a massive amount of sequencing data is generated, which offers a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, the high-dimensional sequencing data poses a great challenge for statistical analysis. Association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a Weighted U Sequencing test, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a nonparametric U-statistic, WU-SEQ makes no assumption about the underlying disease model and phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed a commonly used sequence kernel association test (SKAT) method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL4 and very low density lipoprotein cholesterol. © 2014 WILEY PERIODICALS, INC.
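The core idea of a weighted U-statistic for sequencing data, contrasting genotype similarity with phenotype similarity over all subject pairs, can be sketched generically. The kernels and weights below are illustrative choices, not WU-SEQ's exact specification:

```python
import numpy as np

def weighted_u(geno, pheno, weights):
    """Generic weighted U-statistic: sum over subject pairs of a
    genotype-similarity kernel times a phenotype-similarity kernel."""
    n = len(pheno)
    # Rank-based phenotype similarity avoids distributional assumptions.
    r = np.argsort(np.argsort(pheno)) / (n - 1)
    u = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            gsim = np.sum(weights * (geno[i] == geno[j]))  # weighted allele match
            psim = 1.0 - abs(r[i] - r[j])
            u += gsim * psim
    return u / (n * (n - 1) / 2)

# Toy data: 50 subjects, 20 rare variants; weights up-weight rare alleles.
rng = np.random.default_rng(1)
geno = rng.binomial(2, 0.02, size=(50, 20))
pheno = rng.standard_t(df=3, size=50)            # heavy-tailed phenotype
maf = geno.mean(axis=0) / 2 + 1e-6
weights = 1.0 / np.sqrt(maf * (1 - maf))
print(weighted_u(geno, pheno, weights))
```

Significance would then be assessed by permutation or by the asymptotic distribution of the statistic, rather than from the raw value printed here.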
Nookaew, Intawat; Papini, Marta; Pornputtapong, Natapol; Scalcinati, Gionata; Fagerberg, Linn; Uhlén, Matthias; Nielsen, Jens
2012-01-01
RNA-seq has recently become an attractive method of choice in studies of transcriptomes, promising several advantages compared with microarrays. In this study, we sought to assess the contribution of the different analytical steps involved in the analysis of RNA-seq data generated with the Illumina platform, and to perform a cross-platform comparison with results obtained through Affymetrix microarrays. As a case study for our work, we used the Saccharomyces cerevisiae strain CEN.PK 113-7D, grown under two different conditions (batch and chemostat). Here, we assess the influence of genetic variation on the estimation of gene expression levels using three different aligners for read mapping (Gsnap, Stampy and TopHat) against the S288c genome, evaluate the capabilities of five different statistical methods to detect differential gene expression (baySeq, Cuffdiff, DESeq, edgeR and NOISeq), and explore the consistency between RNA-seq analyses using a reference genome and a de novo assembly approach. High reproducibility among biological replicates (correlation ≥0.99) and high consistency between the two platforms for analysis of gene expression levels (correlation ≥0.91) are reported. The results of differential gene expression identification derived from the different statistical methods, as well as their integrated analysis results based on gene ontology annotation, are in good agreement. Overall, our study provides a useful and comprehensive comparison between the two platforms (RNA-seq and microarrays) for gene expression analysis and addresses the contribution of the different steps involved in the analysis of RNA-seq data. PMID:22965124
NASA Astrophysics Data System (ADS)
Chang, Anteng; Li, Huajun; Wang, Shuqing; Du, Junfeng
2017-08-01
Both wave-frequency (WF) and low-frequency (LF) components of mooring tension are in principle non-Gaussian due to nonlinearities in the dynamic system. This paper conducts a comprehensive investigation of applicable probability density functions (PDFs) of mooring tension amplitudes used to assess mooring-line fatigue damage via the spectral method. Short-term statistical characteristics of mooring-line tension responses are firstly investigated, in which the discrepancy arising from Gaussian approximation is revealed by comparing kurtosis and skewness coefficients. Several distribution functions based on present analytical spectral methods are selected to express the statistical distribution of the mooring-line tension amplitudes. Results indicate that the Gamma-type distribution and a linear combination of Dirlik and Tovo-Benasciutti formulas are suitable for separate WF and LF mooring tension components. A novel parametric method based on nonlinear transformations and stochastic optimization is then proposed to increase the effectiveness of mooring-line fatigue assessment due to non-Gaussian bimodal tension responses. Using time domain simulation as a benchmark, its accuracy is further validated using a numerical case study of a moored semi-submersible platform.
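The Gaussianity check described above (comparing skewness and kurtosis against their Gaussian values of zero) is simple to reproduce. A minimal sketch, with a simulated series standing in for a model-generated tension response:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Stand-in for a simulated mooring-tension time series: a Gaussian WF part
# plus a squared (hence skewed, leptokurtic) LF drift component.
wf = rng.normal(0.0, 1.0, 200_000)
lf = 0.5 * rng.normal(0.0, 1.0, 200_000) ** 2
tension = wf + lf

print("skewness:", stats.skew(tension))             # 0 for a Gaussian process
print("excess kurtosis:", stats.kurtosis(tension))  # 0 for a Gaussian process
```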
GIS-based bivariate statistical techniques for groundwater potential analysis (an example of Iran)
NASA Astrophysics Data System (ADS)
Haghizadeh, Ali; Moghaddam, Davoud Davoudi; Pourghasemi, Hamid Reza
2017-12-01
Groundwater potential analysis provides a better comprehension of the hydrological settings of different regions. This study shows the potential of two GIS-based, data-driven bivariate techniques, namely the statistical index (SI) and Dempster-Shafer theory (DST), to analyze groundwater potential in the Broujerd region of Iran. The research was done using 11 groundwater conditioning factors and 496 spring positions. Based on the groundwater potential maps (GPMs) of the SI and DST methods, 24.22% and 23.74% of the study area is covered by the poor zone of groundwater potential, and 43.93% and 36.3% of the Broujerd region is covered by the good and very good potential zones, respectively. Validation of the outcomes showed that the areas under the curve (AUC) of the SI and DST techniques are 81.23% and 79.41%, respectively, indicating that the SI method performs slightly better than the DST technique. Therefore, the SI and DST methods are advantageous for analyzing groundwater capacity and scrutinizing the complicated relation between groundwater occurrence and groundwater conditioning factors, permitting investigation of both systemic and stochastic uncertainty. Finally, these techniques are very beneficial for groundwater potential analysis and can be practical for water-resource management experts.
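The statistical index itself is a one-line computation on class-wise spring densities. A minimal sketch, assuming the conditioning-factor classes and spring locations have already been overlaid in the GIS; all counts are invented:

```python
import numpy as np

def statistical_index(springs_in_class, pixels_in_class):
    """SI for each class of one conditioning factor:
    ln(spring density within the class / overall spring density)."""
    springs_in_class = np.asarray(springs_in_class, dtype=float)
    pixels_in_class = np.asarray(pixels_in_class, dtype=float)
    class_density = springs_in_class / pixels_in_class
    overall_density = springs_in_class.sum() / pixels_in_class.sum()
    # Classes with zero springs would need a small-count correction before the log.
    return np.log(class_density / overall_density)

# Hypothetical counts for five slope classes of one conditioning factor.
si = statistical_index([120, 200, 100, 50, 26],
                       [40_000, 55_000, 30_000, 25_000, 10_000])
print(si)  # positive values flag classes favorable to groundwater occurrence
```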
Hosseininasab, Abufazel; Mohammadi, Mohammadreza; Jouzi, Samira; Esmaeilinasab, Maryam; Delavar, Ali
2016-01-01
Objective: This study aimed to provide a normative study documenting how 114 non-patient Iranian children aged 5-7 years respond to the Rorschach test. We compared this particular sample with international normative reference values for the Comprehensive System (CS). Method: One hundred fourteen 5- to 7-year-old non-patient Iranian children were recruited from public schools. Using five child and adolescent samples from five countries, we compared the Iranian normative reference data based on the reference means and standard deviations for each sample. Results: The findings revealed how the scores in each sample were distributed and how the samples compared across variables in eight Rorschach Comprehensive System (CS) clusters. We report all descriptive statistics, such as the reference mean and standard deviation, for all variables. Conclusion: Iranian clinicians could rely on country-specific or “local” norms when assessing children. We discourage Iranian clinicians from using many CS scores to make nomothetic, score-based inferences about psychopathology in children and adolescents. PMID:27928247
Liu, Huiling; Xia, Bingbing; Yi, Dehui
2016-01-01
We propose a new feature extraction method for liver pathological images based on multispatial mapping and statistical properties. For liver pathological images with hematoxylin-eosin staining, the R- and B-channel images reflect the sensitivity of liver pathological images better, while the entropy space and Local Binary Pattern (LBP) space reflect the texture features of the image better. To obtain more comprehensive information, we map the liver pathological images into the entropy, LBP, R, and B spaces. Because the traditional Higher Order Local Autocorrelation Coefficients (HLAC) cannot reflect the overall information of the image, we propose an average-corrected HLAC feature: we compute the average gray value of the pathological image and update each pixel to the absolute value of the difference between its gray value and the average, which makes the feature more sensitive to gray-value changes in pathological images. Lastly, the HLAC template is applied to compute the features of the updated image. The experimental results show that the improved multispatial-mapping features achieve better classification performance for liver cancer. PMID:27022407
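The average-correction step is a one-line transform, and the multispatial mapping can be sketched with standard tools. A hedged Python sketch using scikit-image's LBP; the paper's HLAC templates and entropy mapping are not reproduced:

```python
import numpy as np
from skimage.feature import local_binary_pattern

def average_corrected(gray):
    """Replace each pixel by |pixel - image mean| before HLAC extraction."""
    return np.abs(gray - gray.mean())

# Hypothetical stained-image tile: channels-last RGB array in [0, 255].
rng = np.random.default_rng(3)
rgb = rng.integers(0, 256, size=(64, 64, 3)).astype(float)

r_space = rgb[..., 0]                               # R-channel mapping
b_space = rgb[..., 2]                               # B-channel mapping
gray = rgb.mean(axis=-1)
lbp_space = local_binary_pattern(gray, P=8, R=1.0)  # texture mapping

corrected = average_corrected(gray)                 # input to the HLAC templates
print(corrected.mean(), lbp_space.max())
```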
Missing Value Imputation Approach for Mass Spectrometry-based Metabolomics Data.
Wei, Runmin; Wang, Jingye; Su, Mingming; Jia, Erik; Chen, Shaoqiu; Chen, Tianlu; Ni, Yan
2018-01-12
Missing values exist widely in mass spectrometry (MS)-based metabolomics data. Various methods have been applied to handle missing values, but the choice of method can significantly affect subsequent data analyses. Typically, there are three types of missing values: missing not at random (MNAR), missing at random (MAR), and missing completely at random (MCAR). Our study comprehensively compared eight imputation methods (zero, half minimum (HM), mean, median, random forest (RF), singular value decomposition (SVD), k-nearest neighbors (kNN), and quantile regression imputation of left-censored data (QRILC)) for the different types of missing values using four metabolomics datasets. Normalized root mean squared error (NRMSE) and the NRMSE-based sum of ranks (SOR) were applied to evaluate imputation accuracy. Principal component analysis (PCA)/partial least squares (PLS)-Procrustes analysis was used to evaluate the overall sample distribution. Student's t-test followed by correlation analysis was conducted to evaluate the effects on univariate statistics. Our findings demonstrated that RF performed best for MCAR/MAR and that QRILC was favored for left-censored MNAR. Finally, we proposed a comprehensive strategy and developed a publicly accessible web tool for the application of missing value imputation in metabolomics ( https://metabolomics.cc.hawaii.edu/software/MetImp/ ).
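The evaluation loop the authors describe, i.e. mask known entries, impute, and score by NRMSE, is easy to reproduce. A minimal sketch covering three of the eight compared methods on synthetic MCAR data (the paper used four real datasets):

```python
import numpy as np
from sklearn.impute import KNNImputer, SimpleImputer

rng = np.random.default_rng(5)
X = np.exp(rng.normal(size=(100, 40)))            # toy metabolite intensities

# Mask 10% of entries completely at random (MCAR) to create a truth set.
mask = rng.random(X.shape) < 0.10
X_missing = X.copy()
X_missing[mask] = np.nan

def nrmse(imputed):
    """Root mean squared error on masked entries, normalized by their variance."""
    err = (imputed[mask] - X[mask]) ** 2
    return np.sqrt(err.mean() / X[mask].var())

imputers = {
    "mean": SimpleImputer(strategy="mean"),
    "median": SimpleImputer(strategy="median"),
    "kNN": KNNImputer(n_neighbors=10),
}
for name, imp in imputers.items():
    print(name, round(nrmse(imp.fit_transform(X_missing)), 3))
```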
2017-01-01
Cytochrome P450 aromatase (CYP19A1) plays a key role in the development of estrogen dependent breast cancer, and aromatase inhibitors have been at the front line of treatment for the past three decades. The development of potent, selective and safer inhibitors is ongoing with in silico screening methods playing a more prominent role in the search for promising lead compounds in bioactivity-relevant chemical space. Here we present a set of comprehensive binding affinity prediction models for CYP19A1 using our automated Linear Interaction Energy (LIE) based workflow on a set of 132 putative and structurally diverse aromatase inhibitors obtained from a typical industrial screening study. We extended the workflow with machine learning methods to automatically cluster training and test compounds in order to maximize the number of explained compounds in one or more predictive LIE models. The method uses protein–ligand interaction profiles obtained from Molecular Dynamics (MD) trajectories to help model search and define the applicability domain of the resolved models. Our method was successful in accounting for 86% of the data set in 3 robust models that show high correlation between calculated and observed values for ligand-binding free energies (RMSE < 2.5 kJ mol⁻¹), with good cross-validation statistics. PMID:28776988
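LIE models binding free energy as a linear combination of averaged van der Waals and electrostatic interaction-energy differences obtained from MD, in the standard form ΔG ≈ α⟨ΔV_vdW⟩ + β⟨ΔV_el⟩ + γ. A hedged sketch of fitting the parameters by least squares; the energy values are invented placeholders, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 40                                        # training ligands
dv_vdw = rng.normal(-30, 8, n)                # <V_vdW>_bound - <V_vdW>_free
dv_el = rng.normal(-15, 6, n)                 # <V_el>_bound - <V_el>_free
dg_obs = 0.18 * dv_vdw + 0.33 * dv_el + 2.0 + rng.normal(0, 1.5, n)

# Fit alpha, beta, gamma of the LIE functional form by ordinary least squares.
A = np.column_stack([dv_vdw, dv_el, np.ones(n)])
(alpha, beta_, gamma), *_ = np.linalg.lstsq(A, dg_obs, rcond=None)

pred = A @ np.array([alpha, beta_, gamma])
rmse = np.sqrt(np.mean((pred - dg_obs) ** 2))
print(f"alpha={alpha:.2f} beta={beta_:.2f} gamma={gamma:.2f} RMSE={rmse:.2f} kJ/mol")
```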
The Empirical Review of Meta-Analysis Published in Korea
ERIC Educational Resources Information Center
Park, Sunyoung; Hong, Sehee
2016-01-01
Meta-analysis is a statistical method that is increasingly utilized to combine and compare the results of previous primary studies. However, because of the lack of comprehensive guidelines for how to use meta-analysis, many meta-analysis studies have failed to consider important aspects, such as statistical programs, power analysis, publication…
This analysis updates EPA's standard VSL estimate by using a more comprehensive collection of VSL studies that include studies published between 1992 and 2000, as well as applying a more appropriate statistical method. We provide a pooled effect VSL estimate by applying the empi...
Du Mont, Janice; Macdonald, Sheila; Kosa, Daisy; Elliot, Shannon; Spencer, Charmaine; Yaffe, Mark
2015-01-01
Introduction Elder abuse, a universal human rights problem, is associated with many negative consequences. In most jurisdictions, however, there are no comprehensive hospital-based interventions for elder abuse that address the totality of needs of abused older adults: psychological, physical, legal, and social. As the first step towards the development of such an intervention, we undertook a systematic scoping review. Objectives Our primary objective was to systematically extract and synthesize actionable and applicable recommendations for components of a multidisciplinary intersectoral hospital-based elder abuse intervention. A secondary objective was to summarize the characteristics of the responses reviewed, including methods of development and validation. Methods The grey and scholarly literatures were systematically searched, with two independent reviewers conducting the title, abstract and full text screening. Documents were considered eligible for inclusion if they: 1) addressed a response (e.g., an intervention) to elder abuse, 2) contained recommendations for responding to abused older adults with potential relevance to a multidisciplinary and intersectoral hospital-based elder abuse intervention; and 3) were available in English. Analysis The extracted recommendations for care were collated, coded, categorized into themes, and further reviewed for relevancy to a comprehensive hospital-based response. Characteristics of the responses were summarized using descriptive statistics. Results 649 recommendations were extracted from 68 distinct elder abuse responses, 149 of which were deemed relevant and were categorized into 5 themes: Initial contact; Capacity and consent; Interview with older adult, caregiver, collateral contacts, and/or suspected abuser; Assessment: physical/forensic, mental, psychosocial, and environmental/functional; and care plan. Only 6 responses had been evaluated, suggesting a significant gap between development and implementation of recommendations. Discussion To address the lack of evidence to support the recommendations extracted in this review, in a future study, a group of experts will formally evaluate each recommendation for its inclusion in a comprehensive hospital-based response. PMID:25938414
Involvement of the right hemisphere in reading comprehension: a DTI study
Horowitz-Kraus, Tzipi; Wang, Yingying; Plante, Elena; Holland, Scott K.
2014-01-01
The Simple View of reading emphasizes the critical role of two factors in normal reading skills: word recognition and reading comprehension. The current study aims to identify the anatomical support for aspects of reading performance that fall within these two components. Fractional anisotropy (FA) values were obtained from Diffusion Tensor images in twenty-one typical adolescents and young adults using the Tract Based Spatial Statistics (TBSS) method. We focused on the Arcuate Fasciculus (AF) and Inferior Longitudinal Fasciculus (ILF) as fiber tracts that connect regions already implicated in the distributed cortical network for reading. Our results demonstrate dissociation between word-level and narrative-level reading skills: the FA values for both left and right ILF were correlated with measures of word reading, while only the left ILF correlated with reading comprehension scores. FA in the AF, however, correlated only with reading comprehension scores, bilaterally. Correlations with the right AF were particularly robust, emphasizing the contribution of the right hemisphere, especially the frontal lobe, to reading comprehension performance on the particular passage comprehension test used in this study. The anatomical dissociation between these reading skills is supported by the Simple View theory and may shed light on why these two skills dissociate in those with reading disorders. PMID:24909792
Yang, Weichao; Xu, Kui; Lian, Jijian; Bin, Lingling; Ma, Chao
2018-05-01
Flooding is a serious challenge that increasingly affects residents as well as policymakers, and flood vulnerability assessment is becoming increasingly relevant worldwide. The purpose of this study is to develop an approach that reveals the relationship between exposure, sensitivity and adaptive capacity for better flood vulnerability assessment, based on the fuzzy comprehensive evaluation method (FCEM) and the coordinated development degree model (CDDM). The approach is organized into three parts: establishment of the index system; assessment of exposure, sensitivity and adaptive capacity; and multiple flood vulnerability assessment. A hydrodynamic model and statistical data are employed to establish the index system; FCEM is used to evaluate exposure, sensitivity and adaptive capacity; and CDDM is applied to express the relationship among the three components of vulnerability. Six multiple flood vulnerability types and four levels are proposed to assess flood vulnerability from multiple perspectives. The approach is then applied to assess the spatial pattern of flood vulnerability in the eastern area of Hainan, China. Based on the results of the multiple flood vulnerability assessment, a decision-making process for the rational allocation of limited resources is proposed and applied to the study area. The study shows that multiple flood vulnerability assessment can evaluate vulnerability more completely and help decision makers obtain more comprehensive information for decision making. In summary, this study provides a new way to assess flood vulnerability and support disaster prevention decisions. Copyright © 2018 Elsevier Ltd. All rights reserved.
Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg
2015-03-01
Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid-phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and to describe the development of volatiles during postharvest storage, respectively. Hereby, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest-ripened for 6 days after simulated sea-freight export when the PCA considered only two principal components. However, considering the third principal component as well allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.
A ricin forensic profiling approach based on a complex set of biomarkers.
Fredriksson, Sten-Åke; Wunschel, David S; Lindström, Susanne Wiklund; Nilsson, Calle; Wahl, Karen; Åstot, Crister
2018-08-15
A forensic method for the retrospective determination of preparation methods used for illicit ricin toxin production was developed. The method was based on a complex set of biomarkers, including carbohydrates, fatty acids, seed storage proteins, in combination with data on ricin and Ricinus communis agglutinin. The analyses were performed on samples prepared from four castor bean plant (R. communis) cultivars by four different sample preparation methods (PM1-PM4) ranging from simple disintegration of the castor beans to multi-step preparation methods including different protein precipitation methods. Comprehensive analytical data was collected by use of a range of analytical methods and robust orthogonal partial least squares-discriminant analysis- models (OPLS-DA) were constructed based on the calibration set. By the use of a decision tree and two OPLS-DA models, the sample preparation methods of test set samples were determined. The model statistics of the two models were good and a 100% rate of correct predictions of the test set was achieved. Copyright © 2018 Elsevier B.V. All rights reserved.
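The discriminant step can be approximated with ordinary PLS-DA, since scikit-learn does not ship OPLS-DA; regressing a one-hot class matrix with PLSRegression is a common stand-in. A toy sketch with invented biomarker data:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(13)
# Toy calibration set: 48 samples x 60 biomarker features, 4 preparation methods.
y_class = np.repeat([0, 1, 2, 3], 12)
X = rng.normal(size=(48, 60)) + 0.8 * y_class[:, None]

Xs = StandardScaler().fit_transform(X)
Y = np.eye(4)[y_class]                        # one-hot class matrix for PLS-DA
pls = PLSRegression(n_components=3).fit(Xs, Y)

pred = pls.predict(Xs).argmax(axis=1)         # assign to the highest-scoring class
print("training accuracy:", (pred == y_class).mean())
```

In practice the model would be validated on a held-out test set, as the authors do with their decision tree plus two OPLS-DA models.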
Methods of Improving Speech Intelligibility for Listeners with Hearing Resolution Deficit
2012-01-01
Methods developed for real-time time-scale modification (TSM) of the speech signal are presented. They are based on the non-uniform, speech-rate-dependent SOLA algorithm (Synchronous Overlap and Add). The influence of the proposed methods on the intelligibility of speech was investigated for two separate groups of listeners, i.e., hearing-impaired children and elderly listeners. It was shown that for speech with an average rate equal to or higher than 6.48 vowels/s, all of the proposed methods have a statistically significant impact on the improvement of speech intelligibility for hearing-impaired children with reduced hearing resolution, and one of the proposed methods significantly improves comprehension of speech in the group of elderly listeners with reduced hearing resolution. PMID:23009662
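The SOLA family of algorithms pastes each analysis frame near a scaled synthesis position, shifting it to the lag of maximum correlation before crossfading. A simplified, uniform-rate Python sketch (the paper's method additionally varies the scale factor with the local speech rate):

```python
import numpy as np

def sola(x, alpha, frame=1024, overlap=128, search=64, sa=512):
    """Time-scale a signal by factor alpha (>1 slows it down) with SOLA:
    paste each analysis frame near its nominal synthesis position, shifted
    within +/-search samples to maximize correlation, then crossfade."""
    ss = int(round(sa * alpha))               # synthesis hop
    assert ss + overlap + search <= frame     # keep overlap region available
    out = x[:frame].astype(float)
    n_frames = (len(x) - frame) // sa
    fade = np.linspace(0.0, 1.0, overlap)
    for m in range(1, n_frames):
        seg = x[m * sa : m * sa + frame].astype(float)
        pos = m * ss
        best_k, best_c = 0, -np.inf
        for k in range(-search, search + 1):
            p = pos + k
            if p < 0 or p + overlap > len(out):
                continue
            tail = out[p : p + overlap]
            norm = np.linalg.norm(tail) * np.linalg.norm(seg[:overlap])
            c = tail @ seg[:overlap] / norm if norm > 0 else -np.inf
            if c > best_c:
                best_c, best_k = c, k
        p = pos + best_k
        head = out[p : p + overlap] * (1 - fade) + seg[:overlap] * fade
        out = np.concatenate([out[:p], head, seg[overlap:]])
    return out

# 0.5 s toy "voiced" signal at 16 kHz, slowed down by 30%.
t = np.arange(8000) / 16000.0
speech = np.sin(2 * np.pi * 120 * t) * (1 + 0.3 * np.sin(2 * np.pi * 3 * t))
print(len(speech), len(sola(speech, alpha=1.3)))
```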
NASA Astrophysics Data System (ADS)
Wu, Z.; Luo, Z.; Zhang, Y.; Guo, F.; He, L.
2018-04-01
A Modulation Transfer Function (MTF)-based fuzzy comprehensive evaluation method is proposed in this paper for evaluating high-resolution satellite image quality. To establish the factor set, two MTF features and seven radiometric features were extracted from the knife-edge region of each image patch: Nyquist frequency, MTF0.5, entropy, peak signal-to-noise ratio (PSNR), average difference, edge intensity, average gradient, contrast, and ground sample distance (GSD). After analyzing the statistical distribution of the above features, a fuzzy evaluation threshold table and fuzzy evaluation membership functions were established. Experiments on comprehensive quality assessment of different natural and artificial objects were done with GF2 image patches. The results showed that the calibration-field image has the highest quality scores; the water image has the closest quality to the calibration-field image, and the quality of the building image is slightly poorer than that of the water image but much higher than that of the farmland image. To test the influence of different features on quality evaluation, experiments with different weights were performed on GF2 and SPOT7 images. The results showed that different weights yield different evaluation outcomes: when the weights emphasize edge features and GSD, the image quality of GF2 is better than that of SPOT7; however, when MTF and PSNR are set as the main factors, the image quality of SPOT7 is better than that of GF2.
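Several of the radiometric features listed are textbook quantities. A minimal sketch for gray-level entropy and average gradient; the exact definitions used in the paper may differ in detail:

```python
import numpy as np

def gray_entropy(img):
    """Shannon entropy of the 8-bit gray-level histogram, in bits."""
    hist = np.bincount(img.astype(np.uint8).ravel(), minlength=256)
    p = hist[hist > 0] / hist.sum()
    return -np.sum(p * np.log2(p))

def average_gradient(img):
    """Mean magnitude of horizontal/vertical intensity differences."""
    gx = np.diff(img.astype(float), axis=1)[:-1, :]
    gy = np.diff(img.astype(float), axis=0)[:, :-1]
    return np.mean(np.sqrt((gx**2 + gy**2) / 2))

rng = np.random.default_rng(9)
patch = rng.integers(0, 256, size=(128, 128))   # stand-in for a knife-edge patch
print(f"entropy={gray_entropy(patch):.2f} bits, "
      f"avg-gradient={average_gradient(patch):.2f}")
```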
Cost-Effectiveness Analysis: a proposal of new reporting standards in statistical analysis
Bang, Heejung; Zhao, Hongwei
2014-01-01
Cost-effectiveness analysis (CEA) is a method for evaluating the outcomes and costs of competing strategies designed to improve health, and has been applied to a variety of different scientific fields. Yet, there are inherent complexities in cost estimation and CEA from statistical perspectives (e.g., skewness, bi-dimensionality, and censoring). The incremental cost-effectiveness ratio that represents the additional cost per one unit of outcome gained by a new strategy has served as the most widely accepted methodology in the CEA. In this article, we call for expanded perspectives and reporting standards reflecting a more comprehensive analysis that can elucidate different aspects of available data. Specifically, we propose that mean and median-based incremental cost-effectiveness ratios and average cost-effectiveness ratios be reported together, along with relevant summary and inferential statistics as complementary measures for informed decision making. PMID:24605979
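The ratio-based quantities proposed here reduce to a few lines of arithmetic. A hedged sketch computing mean-based and median-based ICERs with a bootstrap percentile interval, on simulated skewed cost data:

```python
import numpy as np

rng = np.random.default_rng(17)
# Simulated per-patient costs (skewed) and effects for control vs. new strategy.
cost0, cost1 = rng.lognormal(8.0, 0.5, 300), rng.lognormal(8.2, 0.5, 300)
eff0, eff1 = rng.normal(1.0, 0.3, 300), rng.normal(1.2, 0.3, 300)

icer_mean = (cost1.mean() - cost0.mean()) / (eff1.mean() - eff0.mean())
icer_median = (np.median(cost1) - np.median(cost0)) / (np.median(eff1) - np.median(eff0))

# Nonparametric bootstrap for the mean-based ICER.
boot = []
for _ in range(2000):
    i = rng.integers(0, 300, 300)
    j = rng.integers(0, 300, 300)
    boot.append((cost1[i].mean() - cost0[j].mean()) / (eff1[i].mean() - eff0[j].mean()))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"mean ICER={icer_mean:.0f}, median ICER={icer_median:.0f}, "
      f"95% CI=({lo:.0f}, {hi:.0f})")
```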
Evaluation of risk communication in a mammography patient decision aid
Klein, Krystal A.; Watson, Lindsey; Ash, Joan S.; Eden, Karen B.
2016-01-01
Objectives We characterized patients’ comprehension, memory, and impressions of risk communication messages in a patient decision aid (PtDA), Mammopad, and clarified the perceived importance of numeric risk information in medical decision making. Methods Participants were 75 women in their forties with average risk factors for breast cancer. We used mixed methods, comprising a risk estimation problem administered within a pretest–posttest design and semi-structured qualitative interviews with a subsample of 21 women. Results Participants’ estimates of the positive predictive value of screening mammography improved after using Mammopad. Although risk information was only briefly memorable, through content analysis we identified themes describing why participants value quantitative risk information, as well as obstacles to understanding. We describe ways in which the most complicated graphic was incompletely comprehended. Conclusions Comprehension of risk information following Mammopad use could be improved. Patients valued receiving numeric statistical information, particularly in pictograph format. Obstacles to understanding risk information, including the potential for confusion between statistics, should be identified and mitigated in PtDA design. Practice implications Using simple pictographs accompanied by text, PtDAs may enhance shared decision-making discussions. PtDA designers and providers should be aware of the benefits and limitations of graphical risk presentations. Incorporating comprehension checks could help identify and correct misapprehensions of graphically presented statistics. PMID:26965020
Replicate This! Creating Individual-Level Data from Summary Statistics Using R
ERIC Educational Resources Information Center
Morse, Brendan J.
2013-01-01
Incorporating realistic data and research examples into quantitative (e.g., statistics and research methods) courses has been widely recommended for enhancing student engagement and comprehension. One way to achieve these ends is to use a data generator to emulate the data in published research articles. "MorseGen" is a free data generator that…
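Emulating individual-level data from published summary statistics is, in the simplest case, a multivariate normal draw parameterized by the reported means, SDs, and correlations. A minimal sketch (the numbers are invented; MorseGen itself is not used):

```python
import numpy as np

rng = np.random.default_rng(21)

# Published summary statistics: means, SDs, and correlation of two scales.
means = np.array([25.3, 14.8])
sds = np.array([5.1, 3.2])
r = 0.42

# Build the covariance matrix from SDs and the correlation, then sample.
cov = np.outer(sds, sds) * np.array([[1.0, r], [r, 1.0]])
sample = rng.multivariate_normal(means, cov, size=250)

print(sample.mean(axis=0))                 # approximately [25.3, 14.8]
print(np.corrcoef(sample.T)[0, 1])         # approximately 0.42
```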
Bahlmann, Claus; Burkhardt, Hans
2004-03-01
In this paper, we give a comprehensive description of our writer-independent online handwriting recognition system frog on hand. The focus of this work concerns the presentation of the classification/training approach, which we call cluster generative statistical dynamic time warping (CSDTW). CSDTW is a general, scalable, HMM-based method for variable-sized, sequential data that holistically combines cluster analysis and statistical sequence modeling. It can handle general classification problems that rely on this sequential type of data, e.g., speech recognition, genome processing, robotics, etc. Contrary to previous attempts, clustering and statistical sequence modeling are embedded in a single feature space and use a closely related distance measure. We show character recognition experiments of frog on hand using CSDTW on the UNIPEN online handwriting database. The recognition accuracy is significantly higher than reported results of other handwriting recognition systems. Finally, we describe the real-time implementation of frog on hand on a Linux Compaq iPAQ embedded device.
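At the heart of CSDTW is a DTW-style alignment between variable-length trajectories. The following is a generic dynamic-programming DTW in Python, not the paper's statistical variant, which replaces the Euclidean local cost with cluster-dependent emission probabilities:

```python
import numpy as np

def dtw(a, b):
    """Classic DTW distance between two feature sequences (rows = frames)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Two toy pen trajectories: (x, y) samples of similar shapes, different lengths.
t1, t2 = np.linspace(0, np.pi, 30), np.linspace(0, np.pi, 45)
stroke_a = np.column_stack([np.cos(t1), np.sin(t1)])
stroke_b = np.column_stack([np.cos(t2), np.sin(t2)]) + 0.02
print(f"DTW distance: {dtw(stroke_a, stroke_b):.3f}")
```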
Environmental Health Practice: Statistically Based Performance Measurement
Enander, Richard T.; Gagnon, Ronald N.; Hanumara, R. Choudary; Park, Eugene; Armstrong, Thomas; Gute, David M.
2007-01-01
Objectives. State environmental and health protection agencies have traditionally relied on a facility-by-facility inspection-enforcement paradigm to achieve compliance with government regulations. We evaluated the effectiveness of a new approach that uses a self-certification random sampling design. Methods. Comprehensive environmental and occupational health data from a 3-year statewide industry self-certification initiative were collected from representative automotive refinishing facilities located in Rhode Island. Statistical comparisons between baseline and postintervention data facilitated a quantitative evaluation of statewide performance. Results. The analysis of field data collected from 82 randomly selected automotive refinishing facilities showed statistically significant improvements (P<.05, Fisher exact test) in 4 major performance categories: occupational health and safety, air pollution control, hazardous waste management, and wastewater discharge. Statistical significance was also shown when a modified Bonferroni adjustment for multiple comparisons was performed. Conclusions. Our findings suggest that the new self-certification approach to environmental and worker protection is effective and can be used as an adjunct to further enhance state and federal enforcement programs. PMID:17267709
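The baseline-versus-postintervention comparison maps onto a 2x2 Fisher exact test per performance category, with an adjustment for multiple comparisons. A small sketch with invented compliance counts (the paper used a modified Bonferroni procedure; plain Bonferroni is shown here):

```python
from scipy.stats import fisher_exact

# Hypothetical (compliant, noncompliant) counts at baseline vs. post-intervention.
categories = {
    "occupational health": ((45, 37), (68, 14)),
    "air pollution":       ((50, 32), (66, 16)),
    "hazardous waste":     ((40, 42), (60, 22)),
    "wastewater":          ((55, 27), (70, 12)),
}
alpha = 0.05 / len(categories)          # Bonferroni-style adjustment
for name, (base, post) in categories.items():
    _, p = fisher_exact([list(base), list(post)])
    verdict = "significant" if p < alpha else "n.s."
    print(f"{name}: p={p:.4f} ({verdict} at adjusted alpha)")
```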
2015-06-18
Engineering Effectiveness Survey. CMU/SEI-2012-SR-009. Carnegie Mellon University. November 2012. Field, Andy. Discovering Statistics Using SPSS, 3rd ... enough into the survey to begin answering questions on risk practices. All of the statistical analysis of the data will be performed using SPSS. Prior to ... probabilistically using distributions for likelihood and impact. Statistical methods like Monte Carlo can more comprehensively evaluate the cost and
Robustly detecting differential expression in RNA sequencing data using observation weights
Zhou, Xiaobei; Lindsay, Helen; Robinson, Mark D.
2014-01-01
A popular approach for comparing gene expression levels between (replicated) conditions of RNA sequencing data relies on counting reads that map to features of interest. Within such count-based methods, many flexible and advanced statistical approaches now exist and offer the ability to adjust for covariates (e.g. batch effects). Often, these methods include some sort of ‘sharing of information’ across features to improve inferences in small samples. It is important to achieve an appropriate tradeoff between statistical power and protection against outliers. Here, we study the robustness of existing approaches for count-based differential expression analysis and propose a new strategy based on observation weights that can be used within existing frameworks. The results suggest that outliers can have a global effect on differential analyses. We demonstrate the effectiveness of our new approach with real data and simulated data that reflects properties of real datasets (e.g. dispersion-mean trend) and develop an extensible framework for comprehensive testing of current and future methods. In addition, we explore the origin of such outliers, in some cases highlighting additional biological or technical factors within the experiment. Further details can be downloaded from the project website: http://imlspenticton.uzh.ch/robinson_lab/edgeR_robust/. PMID:24753412
ERIC Educational Resources Information Center
Rahim, Syed A.
Based in part on a list developed by the United Nations Educational, Scientific, and Cultural Organization (UNESCO) for use in Afghanistan, this document presents a comprehensive checklist of items of statistical and descriptive data required for planning a national communication system. It is noted that such a system provides the vital…
Grey Comprehensive Evaluation of Biomass Power Generation Project Based on Group Judgement
NASA Astrophysics Data System (ADS)
Xia, Huicong; Niu, Dongxiao
2017-06-01
The comprehensive evaluation of benefits is an important task at all stages of biomass power generation projects. This paper proposes an improved grey comprehensive evaluation method based on the triangular whitenization weight function. To improve the objectivity of the weights calculated by the reference-comparison judgment method alone, group judgment is introduced into the weighting process. In the grey comprehensive evaluation process, a number of experts were invited to estimate the benefit level of projects, and the basic estimations were optimized according to the minimum variance principle to improve the accuracy of the evaluation result. Taking a biomass power generation project as an example, the grey comprehensive evaluation showed that the benefit level of this project was good. This example demonstrates the feasibility of the grey comprehensive evaluation method based on group judgment for the benefit evaluation of biomass power generation projects.
A Guideline to Univariate Statistical Analysis for LC/MS-Based Untargeted Metabolomics-Derived Data
Vinaixa, Maria; Samino, Sara; Saez, Isabel; Duran, Jordi; Guinovart, Joan J.; Yanes, Oscar
2012-01-01
Several metabolomic software programs provide methods for peak picking, retention time alignment and quantification of metabolite features in LC/MS-based metabolomics. Statistical analysis, however, is needed in order to discover those features significantly altered between samples. By comparing the retention time and MS/MS data of a model compound to that from the altered feature of interest in the research sample, metabolites can be then unequivocally identified. This paper reports on a comprehensive overview of a workflow for statistical analysis to rank relevant metabolite features that will be selected for further MS/MS experiments. We focus on univariate data analysis applied in parallel on all detected features. Characteristics and challenges of this analysis are discussed and illustrated using four different real LC/MS untargeted metabolomic datasets. We demonstrate the influence of considering or violating mathematical assumptions on which univariate statistical test rely, using high-dimensional LC/MS datasets. Issues in data analysis such as determination of sample size, analytical variation, assumption of normality and homocedasticity, or correction for multiple testing are discussed and illustrated in the context of our four untargeted LC/MS working examples. PMID:24957762
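The workflow can be condensed to: check distributional assumptions per feature, pick the test accordingly, and correct for multiple testing. A hedged sketch using Shapiro-Wilk to choose between Welch's t-test and Mann-Whitney, with Benjamini-Hochberg correction (one of several corrections the paper discusses):

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(23)
n_feat = 500
group_a = np.exp(rng.normal(0.0, 1.0, size=(20, n_feat)))
group_b = np.exp(rng.normal(0.1, 1.0, size=(20, n_feat)))   # small global shift

pvals = np.empty(n_feat)
for i in range(n_feat):
    a, b = group_a[:, i], group_b[:, i]
    normal = stats.shapiro(a).pvalue > 0.05 and stats.shapiro(b).pvalue > 0.05
    if normal:
        pvals[i] = stats.ttest_ind(a, b, equal_var=False).pvalue  # Welch's t-test
    else:
        pvals[i] = stats.mannwhitneyu(a, b, alternative="two-sided").pvalue

reject, qvals, *_ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print(f"{reject.sum()} features significant after BH correction")
```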
TREATMENT SWITCHING: STATISTICAL AND DECISION-MAKING CHALLENGES AND APPROACHES.
Latimer, Nicholas R; Henshall, Chris; Siebert, Uwe; Bell, Helen
2016-01-01
Treatment switching refers to the situation in a randomized controlled trial where patients switch from their randomly assigned treatment onto an alternative. Often, switching is from the control group onto the experimental treatment. In this instance, a standard intention-to-treat analysis does not identify the true comparative effectiveness of the treatments under investigation. We aim to describe statistical methods for adjusting for treatment switching in a comprehensible way for nonstatisticians, and to summarize views on these methods expressed by stakeholders at the 2014 Adelaide International Workshop on Treatment Switching in Clinical Trials. We describe three statistical methods used to adjust for treatment switching: marginal structural models, two-stage adjustment, and rank preserving structural failure time models. We draw upon discussion heard at the Adelaide International Workshop to explore the views of stakeholders on the acceptability of these methods. Stakeholders noted that adjustment methods are based on assumptions, the validity of which may often be questionable. There was disagreement on the acceptability of adjustment methods, but consensus that when these are used, they should be justified rigorously. The utility of adjustment methods depends upon the decision being made and the processes used by the decision-maker. Treatment switching makes estimating the true comparative effect of a new treatment challenging. However, many decision-makers have reservations with adjustment methods. These, and how they affect the utility of adjustment methods, require further exploration. Further technical work is required to develop adjustment methods to meet real world needs, to enhance their acceptability to decision-makers.
How language production shapes language form and comprehension
MacDonald, Maryellen C.
2012-01-01
Language production processes can provide insight into how language comprehension works and language typology—why languages tend to have certain characteristics more often than others. Drawing on work in memory retrieval, motor planning, and serial order in action planning, the Production-Distribution-Comprehension (PDC) account links work in the fields of language production, typology, and comprehension: (1) faced with substantial computational burdens of planning and producing utterances, language producers implicitly follow three biases in utterance planning that promote word order choices that reduce these burdens, thereby improving production fluency. (2) These choices, repeated over many utterances and individuals, shape the distributions of utterance forms in language. The claim that language form stems in large degree from producers' attempts to mitigate utterance planning difficulty is contrasted with alternative accounts in which form is driven by language use more broadly, language acquisition processes, or producers' attempts to create language forms that are easily understood by comprehenders. (3) Language perceivers implicitly learn the statistical regularities in their linguistic input, and they use this prior experience to guide comprehension of subsequent language. In particular, they learn to predict the sequential structure of linguistic signals, based on the statistics of previously-encountered input. Thus, key aspects of comprehension behavior are tied to lexico-syntactic statistics in the language, which in turn derive from utterance planning biases promoting production of comparatively easy utterance forms over more difficult ones. This approach contrasts with classic theories in which comprehension behaviors are attributed to innate design features of the language comprehension system and associated working memory. The PDC instead links basic features of comprehension to a different source: production processes that shape language form. PMID:23637689
Comprehensive Optimization of LC-MS Metabolomics Methods Using Design of Experiments (COLMeD)
Rhoades, Seth D.
2017-01-01
Introduction Both reverse-phase and HILIC chemistries are deployed for liquid-chromatography mass spectrometry (LC-MS) metabolomics analyses, however HILIC methods lag behind reverse-phase methods in reproducibility and versatility. Comprehensive metabolomics analysis is additionally complicated by the physiochemical diversity of metabolites and array of tunable analytical parameters. Objective Our aim was to rationally and efficiently design complementary HILIC-based polar metabolomics methods on multiple instruments using Design of Experiments (DoE). Methods We iteratively tuned LC and MS conditions on ion-switching triple quadrupole (QqQ) and quadrupole-time-of-flight (qTOF) mass spectrometers through multiple rounds of a workflow we term COLMeD (Comprehensive optimization of LC-MS metabolomics methods using design of experiments). Multivariate statistical analysis guided our decision process in the method optimizations. Results LC-MS/MS tuning for the QqQ method on serum metabolites yielded a median response increase of 161.5% (p<0.0001) over initial conditions with a 13.3% increase in metabolite coverage. The COLMeD output was benchmarked against two widely used polar metabolomics methods, demonstrating total ion current increases of 105.8% and 57.3%, with median metabolite response increases of 106.1% and 10.3% (p<0.0001 and p<0.05 respectively). For our optimized qTOF method, 22 solvent systems were compared on a standard mix of physiochemically diverse metabolites, followed by COLMeD optimization, yielding a median 29.8% response increase (p<0.0001) over initial conditions. Conclusions The COLMeD process elucidated response tradeoffs, facilitating improved chromatography and MS response without compromising separation of isobars. COLMeD is efficient, requiring no more than 20 injections in a given DoE round, and flexible, capable of class-specific optimization as demonstrated through acylcarnitine optimization within the QqQ method. PMID:28348510
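A single optimization round can be organized as a design matrix over tunable parameters, each run scored by instrument response. A toy full-factorial sketch (COLMeD itself uses fractional DoE designs with multivariate analysis; parameter names and levels here are invented):

```python
import itertools
import numpy as np

rng = np.random.default_rng(29)

# Candidate levels for three hypothetical tunable LC-MS parameters.
design_space = {
    "column_temp_C": [25, 35, 45],
    "gradient_min": [10, 20],
    "capillary_kV": [2.5, 3.5],
}
runs = list(itertools.product(*design_space.values()))

def measure_response(run):
    """Stand-in for an instrument run: returns a noisy total ion response."""
    temp, grad, kv = run
    signal = np.exp(-((temp - 35) ** 2) / 200 + 0.02 * grad + 0.1 * kv)
    return 1e6 * signal * rng.normal(1.0, 0.05)

scored = sorted(runs, key=measure_response, reverse=True)
print("best condition this round:", dict(zip(design_space, scored[0])))
```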
Raja, Muhammad Asif Zahoor; Khan, Junaid Ali; Ahmad, Siraj-ul-Islam; Qureshi, Ijaz Mansoor
2012-01-01
A methodology for the solution of the Painlevé equation I is presented, using a computational intelligence technique based on neural networks and particle swarm optimization hybridized with an active set algorithm. The mathematical model of the equation is developed with the help of a linear combination of feed-forward artificial neural networks that defines the unsupervised error of the model. This error is minimized subject to the availability of appropriate weights of the networks. The learning of the weights is carried out using a particle swarm optimization algorithm, employed as a viable global search method, hybridized with an active set algorithm for rapid local convergence. The accuracy, convergence rate, and computational complexity of the scheme are analyzed based on a large number of independent runs and their comprehensive statistical analysis. Comparative studies of the results obtained are made with MATHEMATICA solutions, as well as with the variational iteration method and the homotopy perturbation method. PMID:22919371
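The unsupervised-error formulation is easy to reproduce: an ANN trial solution is substituted into the residual of Painlevé I, y'' = 6y² + x, plus penalty terms for the conditions at x = 0. A hedged sketch using finite differences and scipy's BFGS in place of the paper's PSO/active-set hybrid; the initial conditions are illustrative:

```python
import numpy as np
from scipy.optimize import minimize

# Painlevé I: y'' = 6 y^2 + x, with illustrative conditions y(0)=0, y'(0)=0.
x = np.linspace(0.0, 1.0, 40)

def net(params, x):
    """Small feed-forward ANN: sum_i a_i * tanh(w_i * x + b_i)."""
    a, w, b = params.reshape(3, -1)
    return np.tanh(np.outer(x, w) + b) @ a

def unsupervised_error(params, h=1e-4):
    y = net(params, x)
    ypp = (net(params, x + h) - 2 * y + net(params, x - h)) / h**2  # y'' by FD
    residual = ypp - 6 * y**2 - x
    yp0 = (net(params, np.array([h])) - net(params, np.array([-h])))[0] / (2 * h)
    ic = net(params, np.array([0.0]))[0] ** 2 + yp0**2               # y(0), y'(0)
    return np.mean(residual**2) + ic

rng = np.random.default_rng(31)
res = minimize(unsupervised_error, rng.normal(scale=0.5, size=30), method="BFGS")
print("final unsupervised error:", res.fun)
```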
Antal, Péter; Kiszel, Petra Sz.; Gézsi, András; Hadadi, Éva; Virág, Viktor; Hajós, Gergely; Millinghoffer, András; Nagy, Adrienne; Kiss, András; Semsei, Ágnes F.; Temesi, Gergely; Melegh, Béla; Kisfali, Péter; Széll, Márta; Bikov, András; Gálffy, Gabriella; Tamási, Lilla; Falus, András; Szalai, Csaba
2012-01-01
Genetic studies indicate a high number of potential factors related to asthma. Based on earlier linkage analyses, we selected the 11q13 and 14q22 asthma susceptibility regions, for which we designed a partial genome screening study using 145 SNPs in 1201 individuals (436 asthmatic children and 765 controls). The results were evaluated with traditional frequentist methods, and we also applied a new statistical method, called Bayesian network based Bayesian multilevel analysis of relevance (BN-BMLA). This method uses a Bayesian network representation to provide a detailed characterization of the relevance of factors, such as joint significance, the type of dependency, and multi-target aspects. We estimated posteriors for these relations within the Bayesian statistical framework, in order to assess whether a variable is directly relevant or whether its association is only mediated. With frequentist methods, one SNP (rs3751464 in the FRMD6 gene) provided evidence for an association with asthma (OR = 1.43(1.2–1.8); p = 3×10−4). The possible role of the FRMD6 gene in asthma was also confirmed in an animal model and in human asthmatics. In the BN-BMLA analysis, altogether 5 SNPs in 4 genes were found relevant in connection with the asthma phenotype: PRPF19 on chromosome 11, and FRMD6, PTGER2 and PTGDR on chromosome 14. In a subsequent step, a partial dataset containing rhinitis and further clinical parameters was used, which allowed the analysis of the relevance of SNPs for asthma and multiple targets. These analyses suggested that SNPs in the AHNAK and MS4A2 genes were indirectly associated with asthma. This paper indicates that BN-BMLA explores the relevant factors more comprehensively than traditional statistical methods, and it extends the scope of strong-relevance-based methods to include partial relevance, global characterization of relevance, and multi-target relevance. PMID:22432035
Chapter 11. Community analysis-based methods
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cao, Y.; Wu, C.H.; Andersen, G.L.
2010-05-01
Microbial communities are each a composite of populations whose presence and relative abundance in water or other environmental samples are a direct manifestation of environmental conditions, including the introduction of microbe-rich fecal material and factors promoting persistence of the microbes therein. As shown by culture-independent methods, different animal-host fecal microbial communities appear distinctive, suggesting that their community profiles can be used to differentiate fecal samples and to potentially reveal the presence of host fecal material in environmental waters. Cross-comparisons of microbial communities from different hosts also reveal relative abundances of genetic groups that can be used to distinguish sources. In increasing order of their information richness, several community analysis methods hold promise for MST applications: phospholipid fatty acid (PLFA) analysis, denaturing gradient gel electrophoresis (DGGE), terminal restriction fragment length polymorphism (TRFLP), cloning/sequencing, and PhyloChip. Specific case studies involving TRFLP and PhyloChip approaches demonstrate the ability of community-based analyses of contaminated waters to confirm a diagnosis of water quality based on host-specific marker(s). The success of community-based MST for comprehensively confirming fecal sources relies extensively upon using appropriate multivariate statistical approaches. While community-based MST is still under evaluation and development as a primary diagnostic tool, results presented herein demonstrate its promise. Coupled with its inherently comprehensive ability to capture an unprecedented amount of microbiological data that is relevant to water quality, the tools for microbial community analysis are increasingly accessible, and community-based approaches have unparalleled potential for translation into rapid, perhaps real-time, monitoring platforms.
McDermott, Jason E.; Wang, Jing; Mitchell, Hugh; Webb-Robertson, Bobbie-Jo; Hafen, Ryan; Ramey, John; Rodland, Karin D.
2012-01-01
Introduction The advent of high throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful molecular signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities for more sophisticated approaches to integrating purely statistical and expert knowledge-based approaches. Areas covered In this review we will present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered in deriving valid and useful signatures of disease. We will then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. Expert opinion Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to identify predictive signatures of disease are key to future success in the biomarker field. We will describe our recommendations for possible approaches to this problem including metrics for the evaluation of biomarkers. PMID:23335946
DOE Office of Scientific and Technical Information (OSTI.GOV)
McDermott, Jason E.; Wang, Jing; Mitchell, Hugh D.
2013-01-01
Introduction The advent of high throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities both for purely statistical and expert knowledge-based approaches and would benefit from improved integration of the two. Areas covered In this review we will present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered. We will then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. Expert opinion Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to biomarker discovery and characterization are key to future success in the biomarker field. We will describe our recommendations of possible approaches to this problem including metrics for the evaluation of biomarkers.
Comprehensive Optimization of LC-MS Metabolomics Methods Using Design of Experiments (COLMeD).
Rhoades, Seth D; Weljie, Aalim M
2016-12-01
Both reverse-phase and HILIC chemistries are deployed for liquid-chromatography mass spectrometry (LC-MS) metabolomics analyses; however, HILIC methods lag behind reverse-phase methods in reproducibility and versatility. Comprehensive metabolomics analysis is additionally complicated by the physicochemical diversity of metabolites and the array of tunable analytical parameters. Our aim was to rationally and efficiently design complementary HILIC-based polar metabolomics methods on multiple instruments using Design of Experiments (DoE). We iteratively tuned LC and MS conditions on ion-switching triple quadrupole (QqQ) and quadrupole-time-of-flight (qTOF) mass spectrometers through multiple rounds of a workflow we term COLMeD (Comprehensive optimization of LC-MS metabolomics methods using design of experiments). Multivariate statistical analysis guided our decision process in the method optimizations. LC-MS/MS tuning for the QqQ method on serum metabolites yielded a median response increase of 161.5% (p<0.0001) over initial conditions with a 13.3% increase in metabolite coverage. The COLMeD output was benchmarked against two widely used polar metabolomics methods, demonstrating total ion current increases of 105.8% and 57.3%, with median metabolite response increases of 106.1% and 10.3% (p<0.0001 and p<0.05, respectively). For our optimized qTOF method, 22 solvent systems were compared on a standard mix of physicochemically diverse metabolites, followed by COLMeD optimization, yielding a median 29.8% response increase (p<0.0001) over initial conditions. The COLMeD process elucidated response tradeoffs, facilitating improved chromatography and MS response without compromising separation of isobars. COLMeD is efficient, requiring no more than 20 injections in a given DoE round, and flexible, capable of class-specific optimization as demonstrated through acylcarnitine optimization within the QqQ method.
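The following sketch illustrates the shape of a single DoE round such as COLMeD iterates over; it is not the authors' implementation. It builds a two-level full-factorial design over three hypothetical LC-MS factors (names and responses are invented for illustration), fits a main-effects linear model by least squares, and ranks the factor effects that would guide the next round.

```python
import itertools
import numpy as np

# Hypothetical two-level factors for one round (coded -1/+1); the real factor
# set, levels, and responses would come from the instrument.
factors = ["column_temp", "gradient_slope", "esi_voltage"]
design = np.array(list(itertools.product([-1, 1], repeat=len(factors))))  # 8 runs

# Placeholder responses, e.g. median metabolite peak area per injection.
rng = np.random.default_rng(1)
response = 100.0 + design @ np.array([12.0, -5.0, 2.0]) + rng.normal(0, 1, len(design))

# Main-effects model y = b0 + sum_i b_i x_i, fitted by least squares.
X = np.column_stack([np.ones(len(design)), design])
coef, *_ = np.linalg.lstsq(X, response, rcond=None)

for name, b in sorted(zip(factors, coef[1:]), key=lambda t: -abs(t[1])):
    print(f"{name:15s} estimated effect per coded unit: {b:+.2f}")
```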
Graph-based structural change detection for rotating machinery monitoring
NASA Astrophysics Data System (ADS)
Lu, Guoliang; Liu, Jie; Yan, Peng
2018-01-01
Detection of structural changes is critically important in the operational monitoring of a rotating machine. This paper presents a novel framework for this purpose, where a graph model for data modeling is adopted to represent and capture statistical dynamics in machine operations. Meanwhile, we develop a numerical method for computing temporal anomalies in the constructed graphs. The martingale-test method is employed for change detection when making decisions on possible structural changes, and excellent performance is demonstrated, outperforming existing methods such as the autoregressive integrated moving average (ARIMA) model. Comprehensive experimental results indicate the good potential of the proposed algorithm in various engineering applications. This work is an extension of a recent result (Lu et al., 2017).
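A minimal sketch of the martingale-test idea (not the authors' graph pipeline, which scores anomalies on graphs rather than raw samples): each observation receives a strangeness score, a randomized conformal p-value, and a power-martingale update, with an alarm when the martingale exceeds a threshold. The strangeness function, epsilon, and threshold below are illustrative choices.

```python
import numpy as np

def martingale_alarms(x, eps=0.92, threshold=20.0):
    rng = np.random.default_rng(0)
    history, scores, log_m, alarms = [], [], 0.0, []
    for t, xt in enumerate(x):
        mu = np.mean(history) if history else 0.0
        s = abs(xt - mu)                       # strangeness: distance to running mean
        scores.append(s)
        arr = np.asarray(scores)
        # randomized conformal p-value of the newest strangeness score
        p = ((arr > s).sum() + rng.random() * (arr == s).sum()) / len(arr)
        log_m += np.log(eps) + (eps - 1.0) * np.log(max(p, 1e-12))
        history.append(xt)
        if np.exp(log_m) > threshold:          # alarm: evidence of a change point
            alarms.append(t)
            history, scores, log_m = [], [], 0.0
    return alarms

rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(0, 1, 200), rng.normal(4, 1, 200)])
print("alarms at indices:", martingale_alarms(data))
```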
Podsakoff, Philip M; MacKenzie, Scott B; Lee, Jeong-Yeon; Podsakoff, Nathan P
2003-10-01
Interest in the problem of method biases has a long history in the behavioral sciences. Despite this, a comprehensive summary of the potential sources of method biases and how to control for them does not exist. Therefore, the purpose of this article is to examine the extent to which method biases influence behavioral research results, identify potential sources of method biases, discuss the cognitive processes through which method biases influence responses to measures, evaluate the many different procedural and statistical techniques that can be used to control method biases, and provide recommendations for how to select appropriate procedural and statistical remedies for different types of research settings.
Liu, Dong-jun; Li, Li
2015-01-01
For the issue of haze-fog, PM2.5 is the main factor influencing haze-fog pollution in China. In this study, the trend of PM2.5 concentration was analyzed from a qualitative point of view based on mathematical models and simulation. The comprehensive forecasting model (CFM) was developed based on combination forecasting ideas. The Autoregressive Integrated Moving Average (ARIMA) model, Artificial Neural Networks (ANNs) model and Exponential Smoothing Method (ESM) were used to predict the time series data of PM2.5 concentration. The results of the comprehensive forecasting model were obtained by combining the results of the three methods based on weights from the Entropy Weighting Method. The trend of PM2.5 concentration in Guangzhou, China was quantitatively forecasted using the comprehensive forecasting model. The results were compared with those of the three single models, and PM2.5 concentration values in the next ten days were predicted. The comprehensive forecasting model balanced the deviations of each single prediction method and had better applicability. It offers a new prediction method for the air quality forecasting field. PMID:26110332
Liu, Dong-jun; Li, Li
2015-06-23
For the issue of haze-fog, PM2.5 is the main factor influencing haze-fog pollution in China. In this study, the trend of PM2.5 concentration was analyzed from a qualitative point of view based on mathematical models and simulation. The comprehensive forecasting model (CFM) was developed based on combination forecasting ideas. The Autoregressive Integrated Moving Average (ARIMA) model, Artificial Neural Networks (ANNs) model and Exponential Smoothing Method (ESM) were used to predict the time series data of PM2.5 concentration. The results of the comprehensive forecasting model were obtained by combining the results of the three methods based on weights from the Entropy Weighting Method. The trend of PM2.5 concentration in Guangzhou, China was quantitatively forecasted using the comprehensive forecasting model. The results were compared with those of the three single models, and PM2.5 concentration values in the next ten days were predicted. The comprehensive forecasting model balanced the deviations of each single prediction method and had better applicability. It offers a new prediction method for the air quality forecasting field.
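The entropy weighting step that combines the three single-model forecasts can be sketched as follows; all numbers are invented for illustration, and the paper's exact indicator construction may differ. Accuracy-type indicators are derived from absolute forecast errors, column entropies measure how informative each model's record is, and the resulting weights combine the next-day forecasts.

```python
import numpy as np

# Rows: recent days; columns: absolute forecast errors of ARIMA, ANN, ESM.
errors = np.array([[4.2, 3.1, 5.0],
                   [3.8, 2.9, 4.4],
                   [5.1, 3.5, 4.9],
                   [4.0, 3.0, 5.2]])           # |forecast - observed|, ug/m^3 (hypothetical)

acc = 1.0 / errors                             # benefit-type indicator: higher is better
p = acc / acc.sum(axis=0)                      # column-wise proportions
k = 1.0 / np.log(len(acc))
entropy = -k * (p * np.log(p)).sum(axis=0)     # e_j in [0, 1]
d = 1.0 - entropy                              # divergence degree per model
w = d / d.sum()                                # entropy weights

forecasts = np.array([61.3, 58.9, 64.1])       # next-day PM2.5 forecast per model
print("weights:", np.round(w, 3), "-> combined forecast:", round(float(w @ forecasts), 1))
```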
Moshtagh-Khorasani, Majid; Akbarzadeh-T, Mohammad-R; Jahangiri, Nader; Khoobdel, Mehdi
2009-01-01
BACKGROUND: Aphasia diagnosis is particularly challenging due to linguistic uncertainty and vagueness, inconsistencies in the definition of aphasic syndromes, a large number of measurements with imprecision, and natural diversity and subjectivity in test subjects as well as in the opinions of experts who diagnose the disease. METHODS: Fuzzy probability is proposed here as the basic framework for handling the uncertainties in medical diagnosis and particularly aphasia diagnosis. To efficiently construct this fuzzy probabilistic mapping, statistical analysis is performed that constructs input membership functions as well as determines an effective set of input features. RESULTS: Considering the high sensitivity of performance measures to different distributions of testing/training sets, a statistical t-test of significance is applied to compare the fuzzy approach results with NN results as well as with the authors' earlier work using fuzzy logic. The proposed fuzzy probability estimator approach clearly provides better diagnosis for both classes of data sets. Specifically, for the first and second types of fuzzy probability classifiers, i.e. the spontaneous speech and comprehensive models, P-values are 2.24E-08 and 0.0059, respectively, strongly rejecting the null hypothesis. CONCLUSIONS: The technique is applied and compared on both comprehensive and spontaneous speech test data for diagnosis of four aphasia types: Anomic, Broca, Global and Wernicke. Statistical analysis confirms that the proposed approach can significantly improve accuracy using fewer aphasia features. PMID:21772867
Pezzuti, L; Nacinovich, R; Oggiano, S; Bomba, M; Ferri, R; La Stella, A; Rossetti, S; Orsini, A
2018-07-01
Individuals with Down syndrome generally show a floor effect on Wechsler Scales that is manifested by flat profiles and with many or all of the weighted scores on the subtests equal to 1. The main aim of the present paper is to use the statistical Hessl method and the extended statistical method of Orsini, Pezzuti and Hulbert with a sample of individuals with Down syndrome (n = 128; 72 boys and 56 girls), to underline the variability of performance on Wechsler Intelligence Scale for Children-Fourth Edition subtests and indices, highlighting any strengths and weaknesses of this population that otherwise appear to be flattened. Based on results using traditional transformation of raw scores into weighted scores, a very high percentage of subtests with weighted score of 1 occurred in the Down syndrome sample, with a floor effect and without any statistically significant difference between four core Wechsler Intelligence Scale for Children-Fourth Edition indices. The results, using traditional transformation, confirm a deep cognitive impairment of those with Down syndrome. Conversely, using the new statistical method, it is immediately apparent that the variability of the scores, both on subtests and indices, is wider with respect to the traditional method. Children with Down syndrome show a greater ability in the Verbal Comprehension Index than in the Working Memory Index. © 2018 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
The choice of statistical methods for comparisons of dosimetric data in radiotherapy.
Chaikh, Abdulhamid; Giraud, Jean-Yves; Perrin, Emmanuel; Bresciani, Jean-Pierre; Balosso, Jacques
2014-09-18
Novel irradiation techniques are continuously introduced in radiotherapy to optimize the accuracy, the security and the clinical outcome of treatments. These changes could raise the question of discontinuity in dosimetric presentation and the subsequent need for practice adjustments in case of significant modifications. This study proposes a comprehensive approach to compare different techniques and to test whether their respective dose calculation algorithms give rise to statistically significant differences in the treatment doses for the patient. Statistical investigation principles are presented in the framework of a clinical example based on 62 fields of radiotherapy for lung cancer. The delivered doses in monitor units were calculated using three different dose calculation methods: the reference method computes the dose without tissue density corrections using the Pencil Beam Convolution (PBC) algorithm, whereas the new methods calculate the dose with tissue density corrections in 1D and 3D using the Modified Batho (MB) method and the Equivalent Tissue Air Ratio (ETAR) method, respectively. The normality of the data and the homogeneity of variance between groups were tested using the Shapiro-Wilk and Levene tests, respectively; then non-parametric statistical tests were performed. Specifically, the dose means estimated by the different calculation methods were compared using Friedman's test and the Wilcoxon signed-rank test. In addition, the correlation between the doses calculated by the three methods was assessed using Spearman's rank and Kendall's rank tests. Friedman's test showed a significant effect of the calculation method on the delivered dose for lung cancer patients (p <0.001). The density correction methods yielded lower doses compared to PBC, by on average (-5 ± 4.4 SD) for MB and (-4.7 ± 5 SD) for ETAR. Post-hoc Wilcoxon signed-rank tests of paired comparisons indicated that the delivered dose was significantly reduced using density-corrected methods as compared to the reference method. Spearman's and Kendall's rank tests indicated a positive correlation between the doses calculated with the different methods. This paper illustrates and justifies the use of statistical tests and graphical representations for dosimetric comparisons in radiotherapy. The statistical analysis shows the significance of dose differences resulting from two or more techniques in radiotherapy.
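The test battery described above maps directly onto standard library calls; the sketch below runs it on simulated per-field doses (illustrative numbers, not the study's data) using scipy.stats.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
pbc = rng.normal(100, 10, 62)                  # reference: no density correction
mb = pbc - 5.0 + rng.normal(0, 4.4, 62)        # 1D density correction
etar = pbc - 4.7 + rng.normal(0, 5.0, 62)      # 3D density correction

# Normality per method, then homogeneity of variance across methods.
for name, dose in [("PBC", pbc), ("MB", mb), ("ETAR", etar)]:
    w_stat, p_norm = stats.shapiro(dose)
    print(f"Shapiro-Wilk {name}: p={p_norm:.3f}")
_, p_lev = stats.levene(pbc, mb, etar)

# Non-parametric comparisons of the paired dose sets.
_, p_fried = stats.friedmanchisquare(pbc, mb, etar)
_, p_wil = stats.wilcoxon(pbc, mb)

# Rank correlations between methods.
rho, _ = stats.spearmanr(pbc, mb)
tau, _ = stats.kendalltau(pbc, mb)

print(f"Levene p={p_lev:.3f}  Friedman p={p_fried:.2e}  Wilcoxon PBC vs MB p={p_wil:.2e}")
print(f"Spearman rho={rho:.2f}  Kendall tau={tau:.2f}")
```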
NASA Astrophysics Data System (ADS)
Hartmann, Alexander K.; Weigt, Martin
2005-10-01
A concise, comprehensive introduction to the topic of statistical physics of combinatorial optimization, bringing together theoretical concepts and algorithms from computer science with analytical methods from physics. The result bridges the gap between statistical physics and combinatorial optimization, investigating problems taken from theoretical computing, such as the vertex-cover problem, with the concepts and methods of theoretical physics. The authors cover rapid developments and analytical methods that are both extremely complex and spread by word-of-mouth, providing all the necessary basics in required detail. Throughout, the algorithms are shown with examples and calculations, while the proofs are given in a way suitable for graduate students, post-docs, and researchers. Ideal for newcomers to this young, multidisciplinary field.
NASA Astrophysics Data System (ADS)
Shi, Aiye; Wang, Chao; Shen, Shaohong; Huang, Fengchen; Ma, Zhenli
2016-10-01
Chi-squared transform (CST), as a statistical method, can describe the degree of difference between vectors. CST-based methods operate directly on information stored in the difference image and are simple and effective methods for detecting changes in remotely sensed images that have been registered and aligned. However, the technique does not take spatial information into consideration, which leads to much noise in the change detection result. An improved unsupervised change detection method is proposed based on spatially constrained CST (SCCST) in combination with a Markov random field (MRF) model. First, the mean and covariance matrix of the difference image of the bitemporal images are estimated by an iterative trimming method. In each iteration, spatial information is injected to reduce scattered changed points (also known as "salt and pepper" noise). To determine the key parameter of the SCCST method, the confidence level, a pseudotraining dataset is constructed to estimate its optimal value. Then the result of SCCST, as an initial solution of change detection, is further improved by the MRF model. Experiments on simulated and real multitemporal and multispectral images indicate that the proposed method performs well on comprehensive indices compared with other methods.
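A minimal sketch of the plain chi-squared transform on a band-difference image (synthetic data; the spatial constraint and MRF refinement of SCCST are omitted): under the no-change hypothesis the CST statistic follows a chi-squared distribution with one degree of freedom per band, so the confidence level sets the detection threshold.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
bands, h, w = 4, 64, 64
diff = rng.normal(0, 1, (h, w, bands))        # difference of two registered images
diff[20:30, 20:30] += 3.0                     # injected "change" region

x = diff.reshape(-1, bands)
mu = x.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(x, rowvar=False))
d = x - mu
y = np.einsum("ij,jk,ik->i", d, cov_inv, d)   # Mahalanobis-type CST statistic

# Under "no change", y ~ chi-squared with `bands` degrees of freedom.
threshold = stats.chi2.ppf(0.99, df=bands)    # the confidence level is the key parameter
change_mask = (y > threshold).reshape(h, w)
print("flagged pixels:", int(change_mask.sum()))
```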
ERIC Educational Resources Information Center
Yung-Kuan, Chan; Hsieh, Ming-Yuan; Lee, Chin-Feng; Huang, Chih-Cheng; Ho, Li-Chih
2017-01-01
Under the hyper-dynamic education situation, this research, in order to comprehensively explore the interplays between Teacher Competence Demands (TCD) and Learning Organization Requests (LOR), cross-employs the data refinement methods of Descriptive Statistics (DS), Analysis of Variance (ANOVA), and Principal Components Analysis (PCA)…
Karzmark, Peter; Deutsch, Gayle K
2018-01-01
This investigation was designed to determine the predictive accuracy of a comprehensive neuropsychological and a brief neuropsychological test battery with regard to the capacity to perform instrumental activities of daily living (IADLs). Accuracy statistics that included measures of sensitivity, specificity, positive and negative predictive power, and positive likelihood ratio were calculated for both types of batteries. The sample was drawn from a general neurological group of adults (n = 117) that included a number of older participants (age >55; n = 38). Standardized neuropsychological assessments were administered to all participants and comprised the Halstead Reitan Battery and portions of the Wechsler Adult Intelligence Scale-III. A comprehensive test battery yielded a moderate increase over base rate in predictive accuracy that generalized to older individuals. There was only limited support for using a brief battery, for although sensitivity was high, specificity was low. We found that a comprehensive neuropsychological test battery provided good classification accuracy for predicting IADL capacity.
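The reported accuracy statistics all derive from a 2x2 confusion table of test classification against criterion IADL capacity; a short sketch with illustrative counts:

```python
# Hypothetical counts, not the study's data: battery result vs. criterion IADL capacity.
tp, fn, fp, tn = 30, 5, 12, 70

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)                       # positive predictive power
npv = tn / (tn + fn)                       # negative predictive power
lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio

print(f"Se={sensitivity:.2f} Sp={specificity:.2f} "
      f"PPV={ppv:.2f} NPV={npv:.2f} LR+={lr_pos:.2f}")
```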
Does money matter in inflation forecasting?
NASA Astrophysics Data System (ADS)
Binner, J. M.; Tino, P.; Tepper, J.; Anderson, R.; Jones, B.; Kendall, G.
2010-11-01
This paper provides the most comprehensive evidence to date on whether or not monetary aggregates are valuable for forecasting US inflation in the early to mid 2000s. We explore a wide range of different definitions of money, including different methods of aggregation and different collections of included monetary assets. In our forecasting experiment we use two nonlinear techniques, namely recurrent neural networks and kernel recursive least squares regression, techniques that are new to macroeconomics. Recurrent neural networks operate with potentially unbounded input memory, while the kernel regression technique is a finite memory predictor. The two methodologies compete to find the best fitting US inflation forecasting models and are then compared to forecasts from a naïve random walk model. The best models were nonlinear autoregressive models based on kernel methods. Our findings do not provide much support for the usefulness of monetary aggregates in forecasting inflation. Beyond its economic findings, our study is in the tradition of physicists' long-standing interest in the interconnections among statistical mechanics, neural networks, and related nonparametric statistical methods, and suggests potential avenues of extension for such studies.
Statistical analysis of fNIRS data: a comprehensive review.
Tak, Sungho; Ye, Jong Chul
2014-01-15
Functional near-infrared spectroscopy (fNIRS) is a non-invasive method to measure brain activity using the changes of optical absorption in the brain through the intact skull. fNIRS has many advantages over other neuroimaging modalities such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), or magnetoencephalography (MEG), since it can directly measure blood oxygenation level changes related to neural activation with high temporal resolution. However, fNIRS signals are highly corrupted by measurement noises and physiology-based systemic interference. Careful statistical analyses are therefore required to extract neuronal activity-related signals from fNIRS data. In this paper, we provide an extensive review of the historical development of statistical analyses of the fNIRS signal, including motion artifact correction, short source-detector separation correction, principal component analysis (PCA)/independent component analysis (ICA), false discovery rate (FDR), serially-correlated errors, as well as inference techniques such as the standard t-test, F-test, analysis of variance (ANOVA), and the statistical parametric mapping (SPM) framework. In addition, to provide a unified view of various existing inference techniques, we explain a linear mixed effects model with restricted maximum likelihood (ReML) variance estimation, and show that most of the existing inference methods for fNIRS analysis can be derived as special cases. Some open issues in statistical analysis are also described. Copyright © 2013 Elsevier Inc. All rights reserved.
Turewicz, Michael; Kohl, Michael; Ahrens, Maike; Mayer, Gerhard; Uszkoreit, Julian; Naboulsi, Wael; Bracht, Thilo; Megger, Dominik A; Sitek, Barbara; Marcus, Katrin; Eisenacher, Martin
2017-11-10
The analysis of high-throughput mass spectrometry-based proteomics data must address the specific challenges of this technology. To this end, the comprehensive proteomics workflow offered by the de.NBI service center BioInfra.Prot provides indispensable components for the computational and statistical analysis of this kind of data. These components include tools and methods for spectrum identification and protein inference, protein quantification, expression analysis, as well as data standardization and data publication. All particular methods of the workflow which address these tasks are state-of-the-art or cutting edge. As has been shown in previous publications, each of these methods is adequate to solve its specific task and gives competitive results. However, the methods included in the workflow are continuously reviewed, updated and improved to adapt to new scientific developments. All of these particular components and methods are available as stand-alone BioInfra.Prot services or as a complete workflow. Since BioInfra.Prot provides manifold fast communication channels to get access to all components of the workflow (e.g., via the BioInfra.Prot ticket system: bioinfraprot@rub.de), users can easily benefit from this service and get support from experts. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
Climate Change Impacts at Department of Defense
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kotamarthi, Rao; Wang, Jiali; Zoebel, Zach
This project is aimed at providing the U.S. Department of Defense (DoD) with a comprehensive analysis of the uncertainty associated with generating climate projections at the regional scale that can be used by stakeholders and decision makers to quantify and plan for the impacts of future climate change at specific locations. The merits and limitations of commonly used downscaling models, ranging from simple to complex, are compared, and their appropriateness for application at installation scales is evaluated. Downscaled climate projections are generated at selected DoD installations using dynamic and statistical methods, with an emphasis on generating probability distributions of climate variables and their associated uncertainties. The selection of sites, variables, and parameters for downscaling was based on a comprehensive understanding of the current and projected roles that weather and climate play in operating, maintaining, and planning DoD facilities and installations.
A comprehensive study on pavement edge line implementation.
DOT National Transportation Integrated Search
2014-04-01
The previous 2011 study, Safety Improvement from Edge Lines on Rural Two-Lane Highways, analyzed the crash data of three years before and one year after edge line implementation by using the latest safety analysis statistical method. It concl...
State-of-the-art in asphalt pavement specifications
DOT National Transportation Integrated Search
1984-07-01
The great increase in highway construction beginning in the 1950's made evident the need for better control of materials and construction. A comprehensive research and development program was begun to use statistical methods for quality assurance in...
Analysis of high-throughput biological data using their rank values.
Dembélé, Doulaye
2018-01-01
High-throughput biological technologies are routinely used to generate gene expression profiling or cytogenetics data. To achieve high performance, methods available in the literature have become more specialized and often require high computational resources. Here, we propose a new versatile method based on the data-ordering rank values. We use linear algebra and the Perron-Frobenius theorem, and also extend a method presented earlier for searching for differentially expressed genes to the detection of recurrent copy number aberrations. A result derived from the proposed method is a one-sample Student's t-test based on rank values. The proposed method is, to our knowledge, the only one that applies to both gene expression profiling and cytogenetics data sets. This new method is fast, deterministic, and requires a low computational load. Probabilities are associated with genes to allow a statistically significant subset selection in the data set. Stability scores are also introduced as quality parameters. The performance and comparative analyses were carried out using real data sets. The proposed method can be accessed through an R package available from the CRAN (Comprehensive R Archive Network) website: https://cran.r-project.org/web/packages/fcros.
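A sketch in the spirit of the rank-based approach (illustrative, not the fcros implementation): fold changes are ranked within each pairwise comparison, each gene's ranks are averaged across comparisons, and a one-sample t-test asks whether the mean rank departs from the expected mean rank under no change.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
genes, pairs = 1000, 12
log_fc = rng.normal(0, 1, (genes, pairs))      # log fold changes per pairwise comparison
log_fc[:20] += 2.5                             # 20 truly up-regulated genes (synthetic)

# Rank genes within each comparison (1 = lowest fold change).
ranks = np.argsort(np.argsort(log_fc, axis=0), axis=0) + 1

# One-sample t-test per gene: does its mean rank differ from the expected mean?
expected = (genes + 1) / 2.0
t_stat, p_val = stats.ttest_1samp(ranks.T, popmean=expected)

top = np.argsort(p_val)[:5]
print("top genes by rank-based t-test:", top, np.round(p_val[top], 6))
```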
DAnTE: a statistical tool for quantitative analysis of –omics data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Polpitiya, Ashoka D.; Qian, Weijun; Jaitly, Navdeep
2008-05-03
DAnTE (Data Analysis Tool Extension) is a statistical tool designed to address challenges unique to quantitative bottom-up, shotgun proteomics data. This tool has also been demonstrated for microarray data and can easily be extended to other high-throughput data types. DAnTE features selected normalization methods, missing value imputation algorithms, peptide to protein rollup methods, an extensive array of plotting functions, and a comprehensive ANOVA scheme that can handle unbalanced data and random effects. The Graphical User Interface (GUI) is designed to be very intuitive and user friendly.
NASA Astrophysics Data System (ADS)
Han, Shenchao; Yang, Yanchun; Liu, Yude; Zhang, Peng; Li, Siwei
2018-01-01
Changing to a distributed heat supply system is an effective way to reduce haze in winter. Thus, studies on a comprehensive index system and a scientific evaluation method for distributed heat supply projects are essential. Firstly, the influencing factors of heating modes were researched, and an index system with multiple dimensions, covering economic, environmental, risk and flexibility aspects, was built, with all indexes quantified. Secondly, a comprehensive evaluation method based on AHP was put forward to analyze the proposed multiple and comprehensive index system. Lastly, the case study suggested that supplying heat with electricity has great advantages and promotional value. The comprehensive index system for distributed heat supply projects and the evaluation method in this paper can evaluate distributed heat supply projects effectively and provide scientific support for choosing the distributed heating project.
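The AHP weighting step can be sketched briefly; the pairwise judgments below over the four index dimensions are hypothetical, not the paper's. Weights come from the principal eigenvector of the comparison matrix, and the consistency ratio checks the judgments.

```python
import numpy as np

# Hypothetical pairwise judgments over the four index dimensions
# (economic, environmental, risk, flexibility); not the paper's values.
A = np.array([[1.0,  2.0, 5.0, 4.0],
              [0.5,  1.0, 3.0, 2.0],
              [0.2,  1/3, 1.0, 0.5],
              [0.25, 0.5, 2.0, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                                   # priority weights

n = A.shape[0]
ci = (eigvals[k].real - n) / (n - 1)           # consistency index
ri = {3: 0.58, 4: 0.90, 5: 1.12}[n]            # random index (Saaty)
print("weights:", np.round(w, 3), " CR:", round(ci / ri, 3), "(CR < 0.1 acceptable)")
```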
Bhavnani, Suresh K.; Chen, Tianlong; Ayyaswamy, Archana; Visweswaran, Shyam; Bellala, Gowtham; Divekar, Rohit; Bassler, Kevin E.
2017-01-01
A primary goal of precision medicine is to identify patient subgroups based on their characteristics (e.g., comorbidities or genes) with the goal of designing more targeted interventions. While network visualization methods such as Fruchterman-Reingold have been used to successfully identify such patient subgroups in small to medium sized data sets, they often fail to reveal comprehensible visual patterns in large and dense networks despite having significant clustering. We therefore developed an algorithm called ExplodeLayout, which exploits the existence of significant clusters in bipartite networks to automatically “explode” a traditional network layout with the goal of separating overlapping clusters, while at the same time preserving key network topological properties that are critical for the comprehension of patient subgroups. We demonstrate the utility of ExplodeLayout by visualizing a large dataset extracted from Medicare consisting of readmitted hip-fracture patients and their comorbidities, demonstrate its statistically significant improvement over a traditional layout algorithm, and discuss how the resulting network visualization enabled clinicians to infer mechanisms precipitating hospital readmission in specific patient subgroups. PMID:28815099
Green, Colin; Shearer, James; Ritchie, Craig W; Zajicek, John P
2011-01-01
To consider the methods available to model Alzheimer's disease (AD) progression over time to inform on the structure and development of model-based evaluations, and the future direction of modelling methods in AD. A systematic search of the health care literature was undertaken to identify methods to model disease progression in AD. Modelling methods are presented in a descriptive review. The literature search identified 42 studies presenting methods or applications of methods to model AD progression over time. The review identified 10 general modelling frameworks available to empirically model the progression of AD as part of a model-based evaluation. Seven of these general models are statistical models predicting progression of AD using a measure of cognitive function. The main concerns with models are on model structure, around the limited characterization of disease progression, and on the use of a limited number of health states to capture events related to disease progression over time. None of the available models have been able to present a comprehensive model of the natural history of AD. Although helpful, there are serious limitations in the methods available to model progression of AD over time. Advances are needed to better model the progression of AD and the effects of the disease on peoples' lives. Recent evidence supports the need for a multivariable approach to the modelling of AD progression, and indicates that a latent variable analytic approach to characterising AD progression is a promising avenue for advances in the statistical development of modelling methods. Copyright © 2011 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Protein and gene model inference based on statistical modeling in k-partite graphs.
Gerster, Sarah; Qeli, Ermir; Ahrens, Christian H; Bühlmann, Peter
2010-07-06
One of the major goals of proteomics is the comprehensive and accurate description of a proteome. Shotgun proteomics, the method of choice for the analysis of complex protein mixtures, requires that experimentally observed peptides are mapped back to the proteins they were derived from. This process is also known as protein inference. We present Markovian Inference of Proteins and Gene Models (MIPGEM), a statistical model based on clearly stated assumptions to address the problem of protein and gene model inference for shotgun proteomics data. In particular, we are dealing with dependencies among peptides and proteins using a Markovian assumption on k-partite graphs. We are also addressing the problems of shared peptides and ambiguous proteins by scoring the encoding gene models. Empirical results on two control datasets with synthetic mixtures of proteins and on complex protein samples of Saccharomyces cerevisiae, Drosophila melanogaster, and Arabidopsis thaliana suggest that the results with MIPGEM are competitive with existing tools for protein inference.
NASA Astrophysics Data System (ADS)
Cheng, Jie; Qian, Zhaogang; Irani, Keki B.; Etemad, Hossein; Elta, Michael E.
1991-03-01
To meet the ever-increasing demand of the rapidly growing semiconductor manufacturing industry, it is critical to have a comprehensive methodology integrating techniques for process optimization, real-time monitoring, and adaptive process control. To this end we have accomplished an integrated knowledge-based approach combining the latest expert system technology, machine learning methods, and traditional statistical process control (SPC) techniques. This knowledge-based approach is advantageous in that it makes it possible for the task of process optimization and adaptive control to be performed consistently and predictably. Furthermore, this approach can be used to construct high-level and qualitative descriptions of processes and thus make the process behavior easy to monitor, predict and control. Two software packages, RIST (Rule Induction and Statistical Testing) and KARSM (Knowledge Acquisition from Response Surface Methodology), have been developed and incorporated with two commercially available packages: G2 (a real-time expert system) and ULTRAMAX (a tool for sequential process optimization).
Hasegawa, Takanori; Yamaguchi, Rui; Nagasaki, Masao; Miyano, Satoru; Imoto, Seiya
2014-01-01
Comprehensive understanding of gene regulatory networks (GRNs) is a major challenge in the field of systems biology. Currently, there are two main approaches in GRN analysis using time-course observation data, namely an ordinary differential equation (ODE)-based approach and a statistical model-based approach. The ODE-based approach can generate complex dynamics of GRNs according to biologically validated nonlinear models. However, it cannot be applied to ten or more genes to simultaneously estimate system dynamics and regulatory relationships due to the computational difficulties. The statistical model-based approach uses highly abstract models to simply describe biological systems and to infer relationships among several hundreds of genes from the data. However, the high abstraction generates false regulations that are not permitted biologically. Thus, when dealing with several tens of genes of which the relationships are partially known, a method that can infer regulatory relationships based on a model with low abstraction and that can emulate the dynamics of ODE-based models while incorporating prior knowledge is urgently required. To accomplish this, we propose a method for inference of GRNs using a state space representation of a vector auto-regressive (VAR) model with L1 regularization. This method can estimate the dynamic behavior of genes based on linear time-series modeling constructed from an ODE-based model and can infer the regulatory structure among several tens of genes maximizing prediction ability for the observational data. Furthermore, the method is capable of incorporating various types of existing biological knowledge, e.g., drug kinetics and literature-recorded pathways. The effectiveness of the proposed method is shown through a comparison of simulation studies with several previous methods. For an application example, we evaluated mRNA expression profiles over time upon corticosteroid stimulation in rats, thus incorporating corticosteroid kinetics/dynamics, literature-recorded pathways and transcription factor (TF) information. PMID:25162401
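The structural-inference core, an L1-regularized VAR fitted gene by gene, can be sketched as follows on synthetic data; the state space embedding and the biological prior knowledge that distinguish the proposed method are omitted.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_genes, T = 10, 120
A_true = np.zeros((n_genes, n_genes))          # A[i, j]: gene j regulates gene i
A_true[1, 0], A_true[2, 1], A_true[3, 1] = 0.8, -0.6, 0.5

# Simulate a stable first-order VAR: x_t = 0.5 x_{t-1} + A x_{t-1} + noise.
x = np.zeros((T, n_genes))
x[0] = rng.normal(0, 1, n_genes)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + A_true @ x[t - 1] + rng.normal(0, 1.0, n_genes)

# One lasso regression per gene; the L1 penalty prunes spurious regulators.
A_hat = np.zeros_like(A_true)
for g in range(n_genes):
    A_hat[g] = Lasso(alpha=0.05).fit(x[:-1], x[1:, g]).coef_

# Read nonzero off-diagonal coefficients as candidate regulatory edges.
edges = [(int(j), int(i)) for i, j in np.argwhere(np.abs(A_hat) > 0.1) if i != j]
print("inferred regulator -> target edges:", edges)
```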
The Health Impact Assessment (HIA) Resource and Tool ...
Health Impact Assessment (HIA) is a relatively new and rapidly emerging field in the U.S. An inventory of available HIA resources and tools was conducted, with a primary focus on resources developed in the U.S. The resources and tools available to HIA practitioners in the conduct of their work were identified through multiple methods and compiled into a comprehensive list. The compilation includes tools and resources related to the HIA process itself and those that can be used to collect and analyze data, establish a baseline profile, assess potential health impacts, and establish benchmarks and indicators for monitoring and evaluation. These resources include literature and evidence bases, data and statistics, guidelines, benchmarks, decision and economic analysis tools, scientific models, methods, frameworks, indices, mapping, and various data collection tools. Understanding the data, tools, models, methods, and other resources available to perform HIAs will help to advance the HIA community of practice in the U.S., improve the quality and rigor of assessments upon which stakeholder and policy decisions are based, and potentially improve the overall effectiveness of HIA to promote healthy and sustainable communities. The Health Impact Assessment (HIA) Resource and Tool Compilation is a comprehensive list of resources and tools that can be utilized by HIA practitioners with all levels of HIA experience to guide them throughout the HIA process. The HIA Resource
NASA Astrophysics Data System (ADS)
Kassinopoulos, Michalis; Dong, Jing; Tearney, Guillermo J.; Pitris, Costas
2018-02-01
Catheter-based Optical Coherence Tomography (OCT) devices allow real-time and comprehensive imaging of the human esophagus. Hence, they provide the potential to overcome some of the limitations of endoscopy and biopsy, allowing earlier diagnosis and better prognosis for esophageal adenocarcinoma patients. However, the large number of images produced during every scan makes manual evaluation of the data exceedingly difficult. In this study, we propose a fully automated tissue characterization algorithm, capable of discriminating normal tissue from Barrett's Esophagus (BE) and dysplasia through entire three-dimensional (3D) data sets acquired in vivo. The method is based both on the estimation of the scatterer size of the esophageal epithelial cells, using the bandwidth of the correlation of the derivative (COD) method, and on intensity-based characteristics. The COD method can effectively estimate the scatterer size of the esophageal epithelium cells in good agreement with the literature. As expected, both the mean scatterer size and its standard deviation increase with increasing severity of disease (i.e. from normal to BE to dysplasia). The differences in the distribution of scatterer size for each tissue type are statistically significant, with a p value of < 0.0001. However, the scatterer size by itself cannot be used to accurately classify the various tissues. With the addition of intensity-based statistics, the correct classification rates for all three tissue types range from 83 to 100% depending on the lesion size.
Non-ad-hoc decision rule for the Dempster-Shafer method of evidential reasoning
NASA Astrophysics Data System (ADS)
Cheaito, Ali; Lecours, Michael; Bosse, Eloi
1998-03-01
This paper is concerned with the fusion of identity information through the use of statistical analysis rooted in the Dempster-Shafer theory of evidence to provide automatic identification aboard a platform. An identity information process for a baseline Multi-Source Data Fusion (MSDF) system is defined. The MSDF system is applied to information sources which include a number of radars, IFF systems, an ESM system, and a remote track source. We use a comprehensive Platform Data Base (PDB) containing all the possible identity values that the potential target may take, and we use fuzzy logic strategies which enable the fusion of subjective attribute information from the sensors and the PDB to make the derivation of target identity quicker and more precise, with statistically quantifiable measures of confidence. The conventional Dempster-Shafer method lacks a formal basis upon which decisions can be made in the face of ambiguity. We define a non-ad-hoc decision rule based on the expected utility interval for pruning the 'unessential' propositions which would otherwise overload real-time data fusion systems. An example has been selected to demonstrate the implementation of our modified Dempster-Shafer method of evidential reasoning.
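A minimal sketch of the evidential-reasoning machinery involved: Dempster's rule combines mass functions from two sources over a small identity frame, and belief/plausibility intervals per hypothesis are the raw material for an expected-utility-style decision rule. The frame, masses, and source names are illustrative, not the paper's system.

```python
from itertools import product

# Focal elements are frozensets; FRAME plays the role of total ignorance.
FRAME = frozenset({"friend", "hostile", "neutral"})

def combine(m1, m2):
    """Dempster's rule: multiply masses on intersections, renormalize by conflict."""
    out, conflict = {}, 0.0
    for (a, w1), (b, w2) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            out[inter] = out.get(inter, 0.0) + w1 * w2
        else:
            conflict += w1 * w2
    return {k: v / (1.0 - conflict) for k, v in out.items()}

radar = {frozenset({"friend"}): 0.6, FRAME: 0.4}             # IFF-like source (illustrative)
esm = {frozenset({"friend", "neutral"}): 0.7, FRAME: 0.3}    # ESM-like source (illustrative)
fused = combine(radar, esm)

# Belief/plausibility interval per singleton hypothesis; a decision rule can
# prune propositions whose plausibility is dominated by another's belief.
for h in sorted(FRAME):
    bel = sum(v for k, v in fused.items() if k <= {h})
    pl = sum(v for k, v in fused.items() if h in k)
    print(f"{h:8s} belief={bel:.2f} plausibility={pl:.2f}")
```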
ERIC Educational Resources Information Center
Wine, Jennifer; Bryan, Michael; Siegel, Peter
2013-01-01
The National Postsecondary Student Aid Study (NPSAS) helps fulfill the U.S. Department of Education's National Center for Education Statistics (NCES) mandate to collect, analyze, and publish statistics related to education. The purpose of NPSAS is to compile a comprehensive research dataset, based on student-level records, on financial aid…
Factors associated with comprehensive dental care following an initial emergency dental visit.
Johnson, Jeffrey T; Turner, Erwin G; Novak, Karen F; Kaplan, Alan L
2005-01-01
The purpose of this study was to characterize the utilization of a dental home by the patient population, grouped by: (1) age; (2) sex; and (3) payment method. A retrospective chart review of 1,020 patients who initially presented for an emergency visit was performed. From the original data pool, 2 groups were delineated: (1) patients who returned for comprehensive dental care; and (2) those who did not return for comprehensive dental care. Patients with private dental insurance or Medicaid dental benefits were statistically more likely to return for comprehensive oral health care than those with no form of dental insurance. Younger patients (< or =3 years of age) were least likely to return for comprehensive dental care. Socioeconomic factors play a crucial role in care-seeking behaviors. These obstacles are often a barrier to preventive and comprehensive oral health care.
A novel hybrid meta-heuristic technique applied to the well-known benchmark optimization problems
NASA Astrophysics Data System (ADS)
Abtahi, Amir-Reza; Bijari, Afsane
2017-03-01
In this paper, a hybrid meta-heuristic algorithm, based on imperialistic competition algorithm (ICA), harmony search (HS), and simulated annealing (SA) is presented. The body of the proposed hybrid algorithm is based on ICA. The proposed hybrid algorithm inherits the advantages of the process of harmony creation in HS algorithm to improve the exploitation phase of the ICA algorithm. In addition, the proposed hybrid algorithm uses SA to make a balance between exploration and exploitation phases. The proposed hybrid algorithm is compared with several meta-heuristic methods, including genetic algorithm (GA), HS, and ICA on several well-known benchmark instances. The comprehensive experiments and statistical analysis on standard benchmark functions certify the superiority of the proposed method over the other algorithms. The efficacy of the proposed hybrid algorithm is promising and can be used in several real-life engineering and management problems.
Neil, Jordan M.; Strekalova, Yulia A.; Sarge, Melanie A.
2017-01-01
Abstract Background: Improving informed consent to participate in randomized clinical trials (RCTs) is a key challenge in cancer communication. The current study examines strategies for enhancing randomization comprehension among patients with diverse levels of health literacy and identifies cognitive and affective predictors of intentions to participate in cancer RCTs. Methods: Using a post-test-only experimental design, cancer patients (n = 500) were randomly assigned to receive one of three message conditions for explaining randomization (ie, plain language condition, gambling metaphor, benign metaphor) or a control message. All statistical tests were two-sided. Results: Health literacy was a statistically significant moderator of randomization comprehension (P = .03). Among participants with the lowest levels of health literacy, the benign metaphor resulted in greater comprehension of randomization as compared with plain language (P = .04) and control (P = .004) messages. Among participants with the highest levels of health literacy, the gambling metaphor resulted in greater randomization comprehension as compared with the benign metaphor (P = .04). A serial mediation model showed a statistically significant negative indirect effect of comprehension on behavioral intention through personal relevance of RCTs and anxiety associated with participation in RCTs (P < .001). Conclusions: The effectiveness of metaphors for explaining randomization depends on health literacy, with a benign metaphor being particularly effective for patients at the lower end of the health literacy spectrum. The theoretical model demonstrates the cognitive and affective predictors of behavioral intention to participate in cancer RCTs and offers guidance on how future research should employ communication strategies to improve the informed consent processes. PMID:27794035
Paré, Pierre; Math, Joanna Lee M; Hawes, Ian A
2010-01-01
OBJECTIVE: To determine whether strategies to counsel and empower patients with heartburn-predominant dyspepsia could improve health-related quality of life. METHODS: Using a cluster randomized, parallel group, multicentre design, nine centres were assigned to provide either basic or comprehensive counselling to patients (age range 18 to 50 years) presenting with heartburn-predominant upper gastrointestinal symptoms, who would be considered for drug therapy without further investigation. Patients were treated for four weeks with esomeprazole 40 mg once daily, followed by six months of treatment that was at the physician's discretion. The primary end point was the change from baseline in Quality of Life in Reflux and Dyspepsia (QOLRAD) questionnaire score. RESULTS: A total of 135 patients from nine centres were included in the intention-to-treat analysis. There was a statistically significant improvement from baseline in all domains of the QOLRAD questionnaire in both study arms at four and seven months (P<0.0001). After four months, the overall mean change in QOLRAD score appeared greater in the comprehensive counselling group than in the basic counselling group (1.77 versus 1.47, respectively); however, this difference was not statistically significant (P=0.07). After seven months, the difference in overall mean change from baseline in QOLRAD score between the comprehensive and basic counselling groups was not statistically significant (1.69 versus 1.56, respectively; P=0.63). CONCLUSIONS: A standardized, comprehensive counselling intervention showed a positive initial trend toward improving quality of life in patients with heartburn-predominant uninvestigated dyspepsia. Further investigation is needed to confirm the potential benefits of providing patients with comprehensive counselling regarding disease management. PMID:20352148
Zhang, Lin; Vranckx, Katleen; Janssens, Koen; Sandrin, Todd R.
2015-01-01
MALDI-TOF mass spectrometry has been shown to be a rapid and reliable tool for identification of bacteria at the genus and species, and in some cases, strain levels. Commercially available and open source software tools have been developed to facilitate identification; however, no universal/standardized data analysis pipeline has been described in the literature. Here, we provide a comprehensive and detailed demonstration of bacterial identification procedures using a MALDI-TOF mass spectrometer. Mass spectra were collected from 15 diverse bacteria isolated from Kartchner Caverns, AZ, USA, and identified by 16S rDNA sequencing. Databases were constructed in BioNumerics 7.1. Follow-up analyses of mass spectra were performed, including cluster analyses, peak matching, and statistical analyses. Identification was performed using blind-coded samples randomly selected from these 15 bacteria. Two identification methods are presented: similarity coefficient-based and biomarker-based methods. Results show that both identification methods can identify the bacteria to the species level. PMID:25590854
Zhang, Lin; Vranckx, Katleen; Janssens, Koen; Sandrin, Todd R
2015-01-02
MALDI-TOF mass spectrometry has been shown to be a rapid and reliable tool for identification of bacteria at the genus and species, and in some cases, strain levels. Commercially available and open source software tools have been developed to facilitate identification; however, no universal/standardized data analysis pipeline has been described in the literature. Here, we provide a comprehensive and detailed demonstration of bacterial identification procedures using a MALDI-TOF mass spectrometer. Mass spectra were collected from 15 diverse bacteria isolated from Kartchner Caverns, AZ, USA, and identified by 16S rDNA sequencing. Databases were constructed in BioNumerics 7.1. Follow-up analyses of mass spectra were performed, including cluster analyses, peak matching, and statistical analyses. Identification was performed using blind-coded samples randomly selected from these 15 bacteria. Two identification methods are presented: similarity coefficient-based and biomarker-based methods. Results show that both identification methods can identify the bacteria to the species level.
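The similarity-coefficient-based identification route can be sketched independently of any particular software: spectra are binned to a common m/z grid and an unknown is assigned to the library entry with the highest cosine similarity. The spectra, peak lists, and library names below are synthetic placeholders, not the BioNumerics workflow.

```python
import numpy as np

rng = np.random.default_rng(0)
mz_grid = np.arange(2000, 20000, 5)            # common m/z bins (Da)

def synth_spectrum(peaks, noise=0.02):
    """Place peak intensities on the grid and add low-level noise."""
    s = np.zeros_like(mz_grid, dtype=float)
    for mz, intensity in peaks:
        s[np.abs(mz_grid - mz).argmin()] = intensity
    return s + rng.random(len(mz_grid)) * noise

# Hypothetical reference library (isolate names invented for illustration).
library = {
    "Bacillus sp. KC-A": synth_spectrum([(4348, 1.0), (6712, 0.7), (9632, 0.4)]),
    "Pseudomonas sp. KC-B": synth_spectrum([(4365, 0.9), (5731, 0.8), (7244, 0.5)]),
}
unknown = synth_spectrum([(4348, 0.9), (6712, 0.8), (9632, 0.3)])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

best = max(library, key=lambda name: cosine(unknown, library[name]))
print("best match:", best, round(cosine(unknown, library[best]), 3))
```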
Research on Bidding Decision-making of International Public-Private Partnership Projects
NASA Astrophysics Data System (ADS)
Hu, Zhen Yu; Zhang, Shui Bo; Liu, Xin Yan
2018-06-01
In order to select the optimal quasi-bidding project for an investment enterprise, a bidding decision-making model for international PPP projects was established in this paper. Firstly, the literature frequency statistics method was adopted to select the bidding decision-making indexes, and accordingly the bidding decision-making index system for international PPP projects was constructed. Then, the group decision-making characteristic root method, the entropy weight method, and an optimization model based on the least squares method were used to set the decision-making index weights. The optimal quasi-bidding project was then determined by calculating the consistent effect measure of each decision-making index value and the comprehensive effect measure of each quasi-bidding project. Finally, the bidding decision-making model for international PPP projects was further illustrated by a hypothetical case. This model can effectively serve as a theoretical foundation and technical support for the bidding decision-making of international PPP projects.
Statistical Methods Applied to Gamma-ray Spectroscopy Algorithms in Nuclear Security Missions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fagan, Deborah K.; Robinson, Sean M.; Runkle, Robert C.
2012-10-01
In a wide range of nuclear security missions, gamma-ray spectroscopy is a critical research and development priority. One particularly relevant challenge is the interdiction of special nuclear material, for which gamma-ray spectroscopy supports the goals of detecting and identifying gamma-ray sources. This manuscript examines the existing set of spectroscopy methods, attempts to categorize them by the statistical methods on which they rely, and identifies methods that have yet to be considered. Our examination shows that current methods effectively estimate the effect of counting uncertainty but in many cases do not address larger sources of decision uncertainty, ones that are significantly more complex. We thus explore the premise that significantly improving algorithm performance requires greater coupling between the problem physics that drives data acquisition and the statistical methods that analyze such data. Untapped statistical methods, such as Bayesian model averaging and hierarchical and empirical Bayes methods, have the potential to reduce decision uncertainty by more rigorously and comprehensively incorporating all sources of uncertainty. We expect that application of such methods will demonstrate progress in meeting the needs of nuclear security missions by improving on the existing numerical infrastructure for which these analyses have not been conducted.
Computational methods to extract meaning from text and advance theories of human cognition.
McNamara, Danielle S
2011-01-01
Over the past two decades, researchers have made great advances in the area of computational methods for extracting meaning from text. This research has to a large extent been spurred by the development of latent semantic analysis (LSA), a method for extracting and representing the meaning of words using statistical computations applied to large corpora of text. Since the advent of LSA, researchers have developed and tested alternative statistical methods designed to detect and analyze meaning in text corpora. This research exemplifies how statistical models of semantics play an important role in our understanding of cognition and contribute to the field of cognitive science. Importantly, these models afford large-scale representations of human knowledge and allow researchers to explore various questions regarding knowledge, discourse processing, text comprehension, and language. This topic includes the latest progress by the leading researchers in the endeavor to go beyond LSA. Copyright © 2010 Cognitive Science Society, Inc.
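The LSA step itself is compact: a term-document count matrix is factored with a truncated SVD and word meanings are compared in the reduced space. The toy corpus below is illustrative; real LSA uses large corpora and term weighting such as log-entropy.

```python
import numpy as np

# Tiny term-document count matrix (rows: terms, columns: documents).
terms = ["doctor", "nurse", "hospital", "car", "engine"]
docs = np.array([[2, 0, 1, 0],    # doctor
                 [1, 1, 1, 0],    # nurse
                 [1, 2, 0, 0],    # hospital
                 [0, 0, 1, 2],    # car
                 [0, 0, 0, 2]])   # engine

# Truncated SVD: keep the k strongest latent dimensions.
U, S, Vt = np.linalg.svd(docs.astype(float), full_matrices=False)
k = 2
word_vecs = U[:, :k] * S[:k]      # rank-k latent word representations

def sim(w1, w2):
    """Cosine similarity of two words in the latent space."""
    a, b = word_vecs[terms.index(w1)], word_vecs[terms.index(w2)]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print("doctor~nurse:", round(sim("doctor", "nurse"), 2),
      " doctor~engine:", round(sim("doctor", "engine"), 2))
```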
Irrigated areas of India derived using MODIS 500 m time series for the years 2001-2003
Dheeravath, V.; Thenkabail, P.S.; Chandrakantha, G.; Noojipady, P.; Reddy, G.P.O.; Biradar, C.M.; Gumma, M.K.; Velpuri, M.
2010-01-01
The overarching goal of this research was to develop methods and protocols for mapping irrigated areas using a Moderate Resolution Imaging Spectroradiometer (MODIS) 500 m time series, to generate irrigated area statistics, and to compare these with ground- and census-based statistics. The primary mega-file data-cube (MFDC), comparable to a hyper-spectral data cube, used in this study consisted of 952 bands of data in a single file that were derived from MODIS 500 m, 7-band reflectance data acquired every 8 days during 2001-2003. The methods consisted of (a) segmenting the 952-band MFDC based not only on elevation-precipitation-temperature zones but also on major and minor irrigated command area boundaries obtained from India's Central Board of Irrigation and Power (CBIP), (b) developing a large ideal spectral data bank (ISDB) of irrigated areas for India, (c) adopting quantitative spectral matching techniques (SMTs) such as the spectral correlation similarity (SCS) R2-value, (d) establishing a comprehensive set of protocols for class identification and labeling, and (e) comparing the results with the National Census data of India and field-plot data gathered during this project for determining accuracies, uncertainties and errors. The study produced irrigated area maps and statistics of India at the national and the subnational (e.g., state, district) levels based on MODIS data from 2001-2003. The Total Area Available for Irrigation (TAAI) and Annualized Irrigated Areas (AIAs) were 113 and 147 million hectares (MHa), respectively. The TAAI does not consider the intensity of irrigation, and its nearest equivalent is the net irrigated areas in the Indian National Statistics. The AIA considers intensity of irrigation and is the equivalent of "irrigated potential utilized (IPU)" reported by India's Ministry of Water Resources (MoWR). The field-plot data collected during this project showed that the accuracy of TAAI classes was 88%, with a 12% error of omission and a 32% error of commission. Comparisons between the AIA and IPU produced an R2-value of 0.84. However, AIA was consistently higher than IPU. The causes for the differences lay in both the traditional approaches and remote sensing. The causes of uncertainties unique to traditional approaches were (a) inadequate accounting of minor irrigation (groundwater, small reservoirs and tanks), (b) unwillingness to share irrigated area statistics by the individual Indian states because of their stakes, (c) absence of comprehensive statistical analyses of reported data, and (d) subjectivity involved in the observation-based data collection process. The causes of uncertainties unique to remote sensing approaches were (a) irrigated area fraction estimates and related sub-pixel area computations and (b) the resolution of the imagery. The causes of uncertainty common to both traditional and remote sensing approaches were definitions and methodological issues. © 2009 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS).
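The accuracy figures above can be reproduced from confusion counts; the sketch below uses hypothetical counts chosen to be consistent with the reported 12% omission and roughly 32% commission errors, plus illustrative district totals for the R2 comparison.

    import numpy as np

    # Hypothetical confusion counts for the TAAI class from field plots:
    tp, fn, fp = 88, 12, 41  # mapped-and-irrigated, missed, falsely mapped

    producer_acc = tp / (tp + fn)        # 1 - omission error
    user_acc = tp / (tp + fp)            # 1 - commission error
    print(f"omission error: {1 - producer_acc:.0%}, "
          f"commission error: {1 - user_acc:.0%}")

    # R2-style agreement between remote-sensing (AIA) and census (IPU)
    # district totals, both in million hectares (illustrative values).
    aia = np.array([4.1, 2.7, 6.3, 1.9, 5.4])
    ipu = np.array([3.6, 2.5, 5.5, 1.8, 4.7])
    r = np.corrcoef(aia, ipu)[0, 1]
    print(f"R^2 = {r**2:.2f}")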
Innovative intelligent technology of distance learning for visually impaired people
NASA Astrophysics Data System (ADS)
Samigulina, Galina; Shayakhmetova, Assem; Nuysuppov, Adlet
2017-12-01
The aim of the study is to develop innovative intelligent technology and information systems of distance education for people with impaired vision (PIV). To solve this problem, a comprehensive approach has been proposed that combines artificial intelligence methods with statistical analysis. Creating an accessible learning environment and identifying the intellectual, physiological, and psychophysiological characteristics of how this category of people perceives and assimilates information are based on a cognitive approach. On the basis of fuzzy logic, an individually oriented learning path for PIV is constructed with the aim of providing high-quality engineering education with modern equipment in joint-use laboratories.
Rao, Goutham; Lopez-Jimenez, Francisco; Boyd, Jack; D'Amico, Frank; Durant, Nefertiti H; Hlatky, Mark A; Howard, George; Kirley, Katherine; Masi, Christopher; Powell-Wiley, Tiffany M; Solomonides, Anthony E; West, Colin P; Wessel, Jennifer
2017-09-05
Meta-analyses are becoming increasingly popular, especially in the fields of cardiovascular disease prevention and treatment. They are often considered to be a reliable source of evidence for making healthcare decisions. Unfortunately, problems among meta-analyses such as the misapplication and misinterpretation of statistical methods and tests are long-standing and widespread. The purposes of this statement are to review key steps in the development of a meta-analysis and to provide recommendations that will be useful for carrying out meta-analyses and for readers and journal editors, who must interpret the findings and gauge methodological quality. To make the statement practical and accessible, detailed descriptions of statistical methods have been omitted. Based on a survey of cardiovascular meta-analyses, published literature on methodology, expert consultation, and consensus among the writing group, key recommendations are provided. Recommendations reinforce several current practices, including protocol registration; comprehensive search strategies; methods for data extraction and abstraction; methods for identifying, measuring, and dealing with heterogeneity; and statistical methods for pooling results. Other practices should be discontinued, including the use of levels of evidence and evidence hierarchies to gauge the value and impact of different study designs (including meta-analyses) and the use of structured tools to assess the quality of studies to be included in a meta-analysis. We also recommend choosing a pooling model for conventional meta-analyses (fixed effect or random effects) on the basis of clinical and methodological similarities among studies to be included, rather than the results of a test for statistical heterogeneity. © 2017 American Heart Association, Inc.
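A compact sketch of the fixed-effect versus random-effects pooling choice discussed above, using inverse-variance weights and the DerSimonian-Laird estimate of between-study variance; the effect sizes and variances are hypothetical.

    import numpy as np

    # Hypothetical per-study effect sizes (log risk ratios) and variances.
    y = np.array([-0.25, -0.10, -0.40, -0.05])
    v = np.array([0.010, 0.020, 0.015, 0.008])

    # Fixed-effect pooling: inverse-variance weights.
    w = 1 / v
    fixed = (w * y).sum() / w.sum()

    # DerSimonian-Laird estimate of between-study variance tau^2,
    # then random-effects weights 1 / (v_i + tau^2).
    q = (w * (y - fixed) ** 2).sum()
    c = w.sum() - (w ** 2).sum() / w.sum()
    tau2 = max(0.0, (q - (len(y) - 1)) / c)
    w_re = 1 / (v + tau2)
    random_eff = (w_re * y).sum() / w_re.sum()

    print(f"fixed: {fixed:.3f}, random: {random_eff:.3f}, tau^2: {tau2:.4f}")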
Chan, Robin F.; Shabalin, Andrey A.; Xie, Lin Y.; Adkins, Daniel E.; Zhao, Min; Turecki, Gustavo; Clark, Shaunna L.; Aberg, Karolina A.
2017-01-01
Abstract Methylome-wide association studies are typically performed using microarray technologies that only assay a very small fraction of the CG methylome and entirely miss two forms of methylation that are common in brain and likely of particular relevance for neuroscience and psychiatric disorders. The alternative is to use whole genome bisulfite (WGB) sequencing but this approach is not yet practically feasible with sample sizes required for adequate statistical power. We argue for revisiting methylation enrichment methods that, provided optimal protocols are used, enable comprehensive, adequately powered and cost-effective genome-wide investigations of the brain methylome. To support our claim we use data showing that enrichment methods approximate the sensitivity obtained with WGB methods and with slightly better specificity. However, this performance is achieved at <5% of the reagent costs. Furthermore, because many more samples can be sequenced simultaneously, projects can be completed about 15 times faster. Currently the only viable option available for comprehensive brain methylome studies, enrichment methods may be critical for moving the field forward. PMID:28334972
New generation of hydraulic pedotransfer functions for Europe
Tóth, B; Weynants, M; Nemes, A; Makó, A; Bilas, G; Tóth, G
2015-01-01
A range of continental-scale soil datasets exists in Europe with different spatial representation and based on different principles. We developed comprehensive pedotransfer functions (PTFs) for applications principally on spatial datasets with continental coverage. The PTF development included the prediction of soil water retention at various matric potentials and prediction of parameters to characterize soil moisture retention and the hydraulic conductivity curve (MRC and HCC) of European soils. We developed PTFs with a hierarchical approach, determined by the input requirements. The PTFs were derived by using three statistical methods: (i) linear regression where there were quantitative input variables, (ii) a regression tree for qualitative, quantitative and mixed types of information and (iii) mean statistics of developer-defined soil groups (class PTF) when only qualitative input parameters were available. Data of the recently established European Hydropedological Data Inventory (EU-HYDI), which holds the most comprehensive geographical and thematic coverage of hydro-pedological data in Europe, were used to train and test the PTFs. The applied modelling techniques and the EU-HYDI allowed the development of hydraulic PTFs that are more reliable and applicable for a greater variety of input parameters than those previously available for Europe. Therefore the new set of PTFs offers tailored advanced tools for a wide range of applications in the continent. PMID:25866465
Measuring Speech Comprehensibility in Students with Down Syndrome
Woynaroski, Tiffany; Camarata, Stephen
2016-01-01
Purpose There is an ongoing need to develop assessments of spontaneous speech that focus on whether the child's utterances are comprehensible to listeners. This study sought to identify the attributes of a stable ratings-based measure of speech comprehensibility, which enabled examining the criterion-related validity of an orthography-based measure of the comprehensibility of conversational speech in students with Down syndrome. Method Participants were 10 elementary school students with Down syndrome and 4 unfamiliar adult raters. Averaged across-observer Likert ratings of speech comprehensibility were called a ratings-based measure of speech comprehensibility. The proportion of utterance attempts fully glossed constituted an orthography-based measure of speech comprehensibility. Results Averaging across 4 raters on four 5-min segments produced a reliable (G = .83) ratings-based measure of speech comprehensibility. The ratings-based measure was strongly (r > .80) correlated with the orthography-based measure for both the same and different conversational samples. Conclusion Reliable and valid measures of speech comprehensibility are achievable with the resources available to many researchers and some clinicians. PMID:27299989
Martins, Cátia; Brandão, Tiago; Almeida, Adelaide; Rocha, Sílvia M
2017-05-01
Saccharomyces spp. are widely used in the food and beverages industries. Their cellular excreted metabolites are important for the general quality of products and can contribute to product differentiation. This exploratory study presents a metabolomics strategy for the comprehensive mapping of cellular metabolites of two yeast species, Saccharomyces cerevisiae and S. pastorianus (both collected in an industrial context), through a multidimensional chromatography platform. Solid-phase microextraction was used as the sample preparation method. The yeast viability, a specific technological quality parameter, was also assessed. This untargeted analysis allowed the putative identification of 525 analytes, distributed over 14 chemical families, the origin of which may be explained through the pathway network associated with yeast metabolism. The expression of the different metabolic pathways was similar for both species, a pattern that seems to be yeast-genus dependent. Nevertheless, these species showed different growth rates, which led to statistically different metabolite contents. This was the first in-depth approach to characterize the headspace content of S. cerevisiae and S. pastorianus species cultures. The combination of a sample preparation method capable of providing released volatile metabolites directly from the yeast culture headspace with comprehensive two-dimensional gas chromatography was successful in uncovering a specific metabolomic pattern for each species. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A comprehensive simulation study on classification of RNA-Seq data.
Zararsız, Gökmen; Goksuluk, Dincer; Korkmaz, Selcuk; Eldem, Vahap; Zararsiz, Gozde Erturk; Duru, Izzet Parug; Ozturk, Ahmet
2017-01-01
RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging powerful method for diagnosis, disease classification and monitoring at the molecular level, as well as providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (e.g., microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data, since such data violate both the data-structure and the distributional assumptions. However, it is possible to apply these algorithms to RNA-Seq data with appropriate modifications. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis (PLDA) and negative binomial linear discriminant analysis (NBLDA). Another way is to bring the data closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers, including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performance. A comprehensive simulation study was conducted and the results were compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size and differential-expression rate and decreasing the dispersion parameter and number of groups lead to an increase in classification accuracy. As with differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion. We conclude that, as a count-based classifier, the power-transformed PLDA and, as microarray-based classifiers, vst- or rlog-transformed RF and SVM may be good choices for classification. An R/BIOCONDUCTOR package, MLSeq, is freely available at https://www.bioconductor.org/packages/release/bioc/html/MLSeq.html.
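A minimal sketch of the "bring the data closer to microarrays" route described above: overdispersed counts are simulated from a gamma-Poisson (negative binomial) model, log-transformed, and classified with RF and SVM. All simulation settings are illustrative, and the log2 transform stands in for the vst/rlog transforms used in the study.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)

    # Simulate overdispersed RNA-Seq-like counts for two classes:
    # negative binomial draws with class-specific means for a subset of genes.
    n, p, de = 60, 200, 20          # samples, genes, differentially expressed
    y = np.repeat([0, 1], n // 2)
    mu = np.full(p, 50.0)
    X = np.empty((n, p))
    for i in range(n):
        m = mu.copy()
        if y[i] == 1:
            m[:de] *= 2.0           # 2-fold change in the DE genes
        # NB via gamma-Poisson mixture, dispersion phi = 0.25
        lam = rng.gamma(shape=1 / 0.25, scale=m * 0.25)
        X[i] = rng.poisson(lam)

    # Bring counts closer to microarray scale before generic classifiers.
    Xt = np.log2(X + 1)

    for clf in (RandomForestClassifier(random_state=0), SVC()):
        acc = cross_val_score(clf, Xt, y, cv=5).mean()
        print(type(clf).__name__, round(acc, 2))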
Fan, Yannan; Siklenka, Keith; Arora, Simran K.; Ribeiro, Paula; Kimmins, Sarah; Xia, Jianguo
2016-01-01
MicroRNAs (miRNAs) can regulate nearly all biological processes and their dysregulation is implicated in various complex diseases and pathological conditions. Recent years have seen a growing number of functional studies of miRNAs using high-throughput experimental technologies, which have produced a large amount of high-quality data regarding miRNA target genes and their interactions with small molecules, long non-coding RNAs, epigenetic modifiers, disease associations, etc. These rich sets of information have enabled the creation of comprehensive networks linking miRNAs with various biologically important entities to shed light on their collective functions and regulatory mechanisms. Here, we introduce miRNet, an easy-to-use web-based tool that offers statistical, visual and network-based approaches to help researchers understand miRNA functions and regulatory mechanisms. The key features of miRNet include: (i) a comprehensive knowledge base integrating high-quality miRNA-target interaction data from 11 databases; (ii) support for differential expression analysis of data from microarray, RNA-seq and quantitative PCR; (iii) implementation of a flexible interface for data filtering, refinement and customization during network creation; (iv) a powerful, fully featured network visualization system coupled with enrichment analysis. miRNet offers a comprehensive tool suite to enable statistical analysis and functional interpretation of various data generated from current miRNA studies. miRNet is freely available at http://www.mirnet.ca. PMID:27105848
Assessment of data pre-processing methods for LC-MS/MS-based metabolomics of uterine cervix cancer.
Chen, Yanhua; Xu, Jing; Zhang, Ruiping; Shen, Guoqing; Song, Yongmei; Sun, Jianghao; He, Jiuming; Zhan, Qimin; Abliz, Zeper
2013-05-07
A metabolomics strategy based on rapid resolution liquid chromatography/tandem mass spectrometry (RRLC-MS/MS) and multivariate statistics has been implemented to identify potential biomarkers in uterine cervix cancer. Due to the importance of the data pre-processing method, three popular software packages have been compared. Then they have been used to acquire respective data matrices from the same LC-MS/MS data. Multivariate statistics was subsequently used to identify significantly changed biomarkers for uterine cervix cancer from the resulting data matrices. The reliabilities of the identified discriminated metabolites have been further validated on the basis of manually extracted data and ROC curves. Nine potential biomarkers have been identified as having a close relationship with uterine cervix cancer. Considering these in combination as a biomarker group, the AUC amounted to 0.997, with a sensitivity of 92.9% and a specificity of 95.6%. The prediction accuracy was 96.6%. Among these potential biomarkers, the amounts of four purine derivatives were greatly decreased, which might be related to a P2 receptor that might lead to a decrease in cell number through apoptosis. Moreover, only two of them were identified simultaneously by all of the pre-processing tools. The results have demonstrated that the data pre-processing method could seriously bias the metabolomics results. Therefore, application of two or more data pre-processing methods would reveal a more comprehensive set of potential biomarkers in non-targeted metabolomics, before a further validation with LC-MS/MS based targeted metabolomics in MRM mode could be conducted.
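A minimal sketch of evaluating a biomarker panel's discriminative power, as in the AUC analysis above; the marker intensities are synthetic, the logistic model is one common way to combine markers into a single score, and the AUC here is computed in-sample for brevity.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(1)

    # Hypothetical intensities of a small biomarker panel:
    # cases show decreased purine derivatives relative to controls.
    n = 60
    controls = rng.normal(1.0, 0.2, size=(n, 4))
    cases = rng.normal(0.6, 0.2, size=(n, 4))
    X = np.vstack([controls, cases])
    y = np.array([0] * n + [1] * n)

    # Combine the markers into one score and evaluate the panel's AUC.
    model = LogisticRegression().fit(X, y)
    score = model.predict_proba(X)[:, 1]
    print(f"AUC = {roc_auc_score(y, score):.3f}")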
Kling, Teresia; Johansson, Patrik; Sanchez, José; Marinescu, Voichita D.; Jörnsten, Rebecka; Nelander, Sven
2015-01-01
Statistical network modeling techniques are increasingly important tools to analyze cancer genomics data. However, current tools and resources are not designed to work across multiple diagnoses and technical platforms, thus limiting their applicability to comprehensive pan-cancer datasets such as The Cancer Genome Atlas (TCGA). To address this, we describe a new data driven modeling method, based on generalized Sparse Inverse Covariance Selection (SICS). The method integrates genetic, epigenetic and transcriptional data from multiple cancers, to define links that are present in multiple cancers, a subset of cancers, or a single cancer. It is shown to be statistically robust and effective at detecting direct pathway links in data from TCGA. To facilitate interpretation of the results, we introduce a publicly accessible tool (cancerlandscapes.org), in which the derived networks are explored as interactive web content, linked to several pathway and pharmacological databases. To evaluate the performance of the method, we constructed a model for eight TCGA cancers, using data from 3900 patients. The model rediscovered known mechanisms and contained interesting predictions. Possible applications include prediction of regulatory relationships, comparison of network modules across multiple forms of cancer and identification of drug targets. PMID:25953855
Stratification of Recanalization for Patients with Endovascular Treatment of Intracranial Aneurysms
Ogilvy, Christopher S.; Chua, Michelle H.; Fusco, Matthew R.; Reddy, Arra S.; Thomas, Ajith J.
2015-01-01
Background With increasing utilization of endovascular techniques in the treatment of both ruptured and unruptured intracranial aneurysms, the issue of obliteration efficacy has become increasingly important. Objective Our goal was to systematically develop a comprehensive model for predicting retreatment with various types of endovascular treatment. Methods We retrospectively reviewed medical records that were prospectively collected for 305 patients who received endovascular treatment for intracranial aneurysms from 2007 to 2013. Multivariable logistic regression was performed on candidate predictors identified by univariable screening analysis to detect independent predictors of retreatment. A composite risk score was constructed based on the proportional contribution of independent predictors in the multivariable model. Results Size (>10 mm), aneurysm rupture, stent assistance, and post-treatment degree of aneurysm occlusion were independently associated with retreatment while intraluminal thrombosis and flow diversion demonstrated a trend towards retreatment. The Aneurysm Recanalization Stratification Scale was constructed by assigning the following weights to statistically and clinically significant predictors. Aneurysm-specific factors: Size (>10 mm), 2 points; rupture, 2 points; presence of thrombus, 2 points. Treatment-related factors: Stent assistance, -1 point; flow diversion, -2 points; Raymond Roy 2 occlusion, 1 point; Raymond Roy 3 occlusion, 2 points. This scale demonstrated good discrimination with a C-statistic of 0.799. Conclusion Surgical decision-making and patient-centered informed consent require comprehensive and accessible information on treatment efficacy. We have constructed the Aneurysm Recanalization Stratification Scale to enhance this decision-making process. This is the first comprehensive model that has been developed to quantitatively predict the risk of retreatment following endovascular therapy. PMID:25621984
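A minimal sketch of scoring patients with the point weights listed above and checking discrimination via the C-statistic (the AUC of the score against observed retreatment); the patient vectors are hypothetical, and the handful of cases is for illustration only.

    from sklearn.metrics import roc_auc_score

    # Point weights from the scale described above (treatment-related
    # factors can subtract points); patient vectors are hypothetical.
    weights = {"size>10mm": 2, "rupture": 2, "thrombus": 2,
               "stent": -1, "flow_diversion": -2, "rr2": 1, "rr3": 2}

    patients = [
        {"size>10mm": 1, "rupture": 1, "thrombus": 0, "stent": 0,
         "flow_diversion": 0, "rr2": 1, "rr3": 0, "retreated": 1},
        {"size>10mm": 0, "rupture": 0, "thrombus": 0, "stent": 1,
         "flow_diversion": 0, "rr2": 0, "rr3": 0, "retreated": 0},
        {"size>10mm": 1, "rupture": 0, "thrombus": 1, "stent": 0,
         "flow_diversion": 1, "rr2": 0, "rr3": 1, "retreated": 1},
        {"size>10mm": 0, "rupture": 1, "thrombus": 0, "stent": 0,
         "flow_diversion": 0, "rr2": 0, "rr3": 0, "retreated": 0},
    ]

    scores = [sum(weights[k] * p[k] for k in weights) for p in patients]
    outcome = [p["retreated"] for p in patients]

    # The C-statistic of the score is its AUC against observed retreatment.
    print("scores:", scores, " C =", roc_auc_score(outcome, scores))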
NASA Astrophysics Data System (ADS)
Yu, Jianbo
2017-01-01
This study proposes an adaptive-learning-based method for machine fault detection and health degradation monitoring. The kernel of the proposed method is an "evolving" model that uses an unsupervised online learning scheme, in which an adaptive hidden Markov model (AHMM) is used for online learning of the dynamic health changes of machines over their full life. A statistical index is developed for recognizing new health states in the machines. Those new health states are then described online by adding new hidden states to the AHMM. Furthermore, the health degradations in machines are quantified online by an AHMM-based health index (HI) that measures the similarity between two density distributions that describe the historic and current health states, respectively. When necessary, the proposed method characterizes the distinct operating modes of the machine and can learn both abrupt and gradual health changes online. Our method overcomes some drawbacks of HIs (e.g., relatively low comprehensibility and applicability) based on fixed monitoring models constructed in the offline phase. Results from its application in a bearing life test reveal that the proposed method is effective in online detection and adaptive assessment of machine health degradation. This study provides a useful guide for developing a condition-based maintenance (CBM) system that uses an online learning method without considerable human intervention.
Load Model Verification, Validation and Calibration Framework by Statistical Analysis on Field Data
NASA Astrophysics Data System (ADS)
Jiao, Xiangqing; Liao, Yuan; Nguyen, Thai
2017-11-01
Accurate load models are critical for power system analysis and operation. A large amount of research work has been done on load modeling. Most of the existing research focuses on developing load models, while little has been done on developing formal load model verification and validation (V&V) methodologies or procedures. Most of the existing load model validation is based on qualitative rather than quantitative analysis. In addition, not all aspects of the model V&V problem have been addressed by the existing approaches. To complement the existing methods, this paper proposes a novel load model verification and validation framework that can systematically and more comprehensively examine a load model's effectiveness and accuracy. Statistical analysis, instead of visual checks, quantifies the load model's accuracy and provides a confidence level of the developed load model for model users. The analysis results can also be used to calibrate load models. The proposed framework can be used as guidance for utility engineers and researchers to systematically examine load models. The proposed method is demonstrated through analysis of field measurements collected from a utility system.
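A minimal sketch of the kind of quantitative check such a framework calls for: an error statistic plus a confidence interval, instead of a visual comparison. The measured and modeled values are hypothetical, and the specific statistics shown are illustrative choices rather than the paper's prescribed set.

    import numpy as np
    from scipy import stats

    # Hypothetical field measurements vs. load-model response (MW).
    measured = np.array([102.1, 98.7, 105.3, 110.2, 95.4, 101.8])
    modeled = np.array([100.5, 99.9, 103.8, 108.0, 97.1, 100.2])

    errors = modeled - measured
    rmse = np.sqrt(np.mean(errors ** 2))

    # 95% confidence interval for the mean error: if it covers zero,
    # there is no statistically detectable bias at this sample size.
    ci = stats.t.interval(0.95, len(errors) - 1,
                          loc=errors.mean(), scale=stats.sem(errors))
    print(f"RMSE = {rmse:.2f} MW, mean-error 95% CI = "
          f"({ci[0]:.2f}, {ci[1]:.2f})")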
Selecting the optimum plot size for a California design-based stream and wetland mapping program.
Lackey, Leila G; Stein, Eric D
2014-04-01
Accurate estimates of the extent and distribution of wetlands and streams are the foundation of wetland monitoring, management, restoration, and regulatory programs. Traditionally, these estimates have relied on comprehensive mapping. However, this approach is prohibitively resource-intensive over large areas, making it both impractical and statistically unreliable. Probabilistic (design-based) approaches to evaluating status and trends provide a more cost-effective alternative because, compared with comprehensive mapping, overall extent is inferred from mapping a statistically representative, randomly selected subset of the target area. In this type of design, the size of sample plots has a significant impact on program costs and on statistical precision and accuracy; however, no consensus exists on the appropriate plot size for remote monitoring of stream and wetland extent. This study utilized simulated sampling to assess the performance of four plot sizes (1, 4, 9, and 16 km²) for three geographic regions of California. Simulation results showed smaller plot sizes (1 and 4 km²) were most efficient for achieving desired levels of statistical accuracy and precision. However, larger plot sizes were more likely to contain rare and spatially limited wetland subtypes. Balancing these considerations led to selection of 4 km² for the California status and trends program.
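A minimal sketch of the simulated-sampling idea: randomly placed square plots of different sizes estimate the total wetland/stream extent of a synthetic landscape, and the spread across repetitions compares their precision; the landscape, plot counts, and repetition counts are all illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(2)

    # Synthetic landscape: a 120 x 120 km grid of 1-km cells where each
    # cell holds a stream/wetland area fraction (sparse, patchy).
    grid = rng.random((120, 120)) * (rng.random((120, 120)) < 0.05)
    true_total = grid.sum()

    def estimate_total(plot_km, n_plots=50):
        """Estimate the landscape total from randomly placed square plots."""
        side = grid.shape[0]
        means = []
        for _ in range(n_plots):
            r = rng.integers(0, side - plot_km + 1)
            c = rng.integers(0, side - plot_km + 1)
            means.append(grid[r:r + plot_km, c:c + plot_km].mean())
        # Scale the mean per-cell density back up to the full landscape.
        return np.mean(means) * side * side

    for plot in (1, 2, 3, 4):       # plot sides -> 1, 4, 9, 16 km^2 plots
        reps = [estimate_total(plot) for _ in range(200)]
        print(f"{plot * plot:>2} km^2 plots: "
              f"bias={np.mean(reps) - true_total:+.0f}, sd={np.std(reps):.0f}")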
NASA Astrophysics Data System (ADS)
He, Honghui; Dong, Yang; Zhou, Jialing; Ma, Hui
2017-03-01
As one of the salient features of light, polarization contains abundant structural and optical information of media. Recently, as a comprehensive description of polarization property, the Mueller matrix polarimetry has been applied to various biomedical studies such as cancerous tissues detections. In previous works, it has been found that the structural information encoded in the 2D Mueller matrix images can be presented by other transformed parameters with more explicit relationship to certain microstructural features. In this paper, we present a statistical analyzing method to transform the 2D Mueller matrix images into frequency distribution histograms (FDHs) and their central moments to reveal the dominant structural features of samples quantitatively. The experimental results of porcine heart, intestine, stomach, and liver tissues demonstrate that the transformation parameters and central moments based on the statistical analysis of Mueller matrix elements have simple relationships to the dominant microstructural properties of biomedical samples, including the density and orientation of fibrous structures, the depolarization power, diattenuation and absorption abilities. It is shown in this paper that the statistical analysis of 2D images of Mueller matrix elements may provide quantitative or semi-quantitative criteria for biomedical diagnosis.
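A minimal sketch of the statistical transformation described above: building the frequency distribution histogram (FDH) of a 2D Mueller matrix element image and computing its central moments; the image here is synthetic, not measured tissue data.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)

    # Stand-in for one 2D Mueller matrix element image (e.g., m22):
    # values here are synthetic, not measured tissue data.
    m22 = rng.beta(2, 5, size=(256, 256))

    # Frequency distribution histogram (FDH) of the element values.
    counts, edges = np.histogram(m22, bins=64, range=(0, 1), density=True)
    mode = edges[np.argmax(counts)]  # location of the FDH peak

    # Central moments used as quantitative structure descriptors.
    vals = m22.ravel()
    print(f"mode={mode:.3f}  mean={vals.mean():.3f}  var={vals.var():.4f}  "
          f"skew={stats.skew(vals):.3f}  kurtosis={stats.kurtosis(vals):.3f}")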
Evaluation of risk communication in a mammography patient decision aid.
Klein, Krystal A; Watson, Lindsey; Ash, Joan S; Eden, Karen B
2016-07-01
We characterized patients' comprehension, memory, and impressions of risk communication messages in a patient decision aid (PtDA), Mammopad, and clarified perceived importance of numeric risk information in medical decision making. Participants were 75 women in their forties with average risk factors for breast cancer. We used mixed methods, comprising a risk estimation problem administered within a pretest-posttest design, and semi-structured qualitative interviews with a subsample of 21 women. Participants' positive predictive value estimates of screening mammography improved after using Mammopad. Although risk information was only briefly memorable, through content analysis, we identified themes describing why participants value quantitative risk information, and obstacles to understanding. We describe ways the most complicated graphic was incompletely comprehended. Comprehension of risk information following Mammopad use could be improved. Patients valued receiving numeric statistical information, particularly in pictograph format. Obstacles to understanding risk information, including potential for confusion between statistics, should be identified and mitigated in PtDA design. Using simple pictographs accompanied by text, PtDAs may enhance a shared decision-making discussion. PtDA designers and providers should be aware of benefits and limitations of graphical risk presentations. Incorporating comprehension checks could help identify and correct misapprehensions of graphically presented statistics. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Tips, Tropes, and Trivia: Ideas for Teaching Educational Research.
ERIC Educational Resources Information Center
Stallings, William M.; And Others
The collective experience of more than 50 years has led to the development of approaches that have enhanced student comprehension in the teaching of educational research methods, statistics, and measurement. Tips for teachers include using illustrative problems with one-digit numbers, using common situations and everyday objects to illustrate…
Computer Simulation of Classic Studies in Psychology.
ERIC Educational Resources Information Center
Bradley, Drake R.
This paper describes DATASIM, a comprehensive software package which generates simulated data for actual or hypothetical research designs. DATASIM is primarily intended for use in statistics and research methods courses, where it is used to generate "individualized" datasets for students to analyze, and later to correct their answers.…
The Perseus computational platform for comprehensive analysis of (prote)omics data.
Tyanova, Stefka; Temu, Tikira; Sinitcyn, Pavel; Carlson, Arthur; Hein, Marco Y; Geiger, Tamar; Mann, Matthias; Cox, Jürgen
2016-09-01
A main bottleneck in proteomics is the downstream biological analysis of highly multivariate quantitative protein abundance data generated using mass-spectrometry-based analysis. We developed the Perseus software platform (http://www.perseus-framework.org) to support biological and biomedical researchers in interpreting protein quantification, interaction and post-translational modification data. Perseus contains a comprehensive portfolio of statistical tools for high-dimensional omics data analysis covering normalization, pattern recognition, time-series analysis, cross-omics comparisons and multiple-hypothesis testing. A machine learning module supports the classification and validation of patient groups for diagnosis and prognosis, and it also detects predictive protein signatures. Central to Perseus is a user-friendly, interactive workflow environment that provides complete documentation of computational methods used in a publication. All activities in Perseus are realized as plugins, and users can extend the software by programming their own, which can be shared through a plugin store. We anticipate that Perseus's arsenal of algorithms and its intuitive usability will empower interdisciplinary analysis of complex large data sets.
Renewable Energy Zones for the Africa Clean Energy Corridor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Grace C.; Deshmukh, Ranjit; Ndhlukula, Kudakwashe
Multi-criteria Analysis for Planning Renewable Energy (MapRE) is a study approach developed by the Lawrence Berkeley National Laboratory with the support of the International Renewable Energy Agency (IRENA). The approach combines geospatial, statistical, energy engineering, and economic methods to comprehensively identify and value high-quality wind, solar PV, and solar CSP resources for grid integration based on techno-economic criteria, generation profiles (for wind), and socio-environmental impacts. The Renewable Energy Zones for the Africa Clean Energy Corridor study sought to identify and comprehensively value high-quality wind, solar photovoltaic (PV), and concentrating solar power (CSP) resources in 21 countries in the East and Southern Africa Power Pools to support the prioritization of areas for development through a multi-criteria planning process. These countries include Angola, Botswana, Burundi, Djibouti, Democratic Republic of Congo, Egypt, Ethiopia, Kenya, Lesotho, Libya, Malawi, Mozambique, Namibia, Rwanda, South Africa, Sudan, Swaziland, Tanzania, Uganda, Zambia, and Zimbabwe. The study includes the methodology and the key results, including renewable energy potential for each region.
Consistent approach to describing aircraft HIRF protection
NASA Technical Reports Server (NTRS)
Rimbey, P. R.; Walen, D. B.
1995-01-01
The high intensity radiated fields (HIRF) certification process as currently implemented comprises an inconsistent combination of factors that tend to emphasize worst-case scenarios in assessing commercial airplane certification requirements. By examining these factors, which include the process definition, the external HIRF environment, the aircraft coupling and corresponding internal fields, and methods of measuring equipment susceptibilities, activities leading to an approach for appraising airplane vulnerability to HIRF are proposed. This approach utilizes technically based criteria to evaluate the nature of the threat, including the probability of encountering the external HIRF environment. No single test or analytic method comprehensively addresses the full HIRF threat frequency spectrum. Additional tools such as statistical methods must be adopted to arrive at more realistic requirements that reflect commercial aircraft vulnerability to the HIRF threat. Test and analytic data are provided to support the conclusions of this report. This work was performed under NASA contract NAS1-19360, Task 52.
Combined magnetic and gravity analysis
NASA Technical Reports Server (NTRS)
Hinze, W. J.; Braile, L. W.; Chandler, V. W.; Mazella, F. E.
1975-01-01
Efforts are made to identify methods of decreasing magnetic interpretation ambiguity by combined gravity and magnetic analysis, to evaluate these techniques in a preliminary manner, to consider the geologic and geophysical implications of correlation, and to recommend a course of action to evaluate methods of correlating gravity and magnetic anomalies. The major thrust of the study was a search and review of the literature. The literature of geophysics, geology, geography, and statistics was searched for articles dealing with spatial correlation of independent variables. An annotated bibliography referencing the germane articles and books is presented. The methods of combined gravity and magnetic analysis are identified and reviewed. A more comprehensive evaluation of two types of techniques is presented: internal correspondence of anomaly amplitudes is examined, and a combined analysis is done utilizing Poisson's theorem. The geologic and geophysical implications of gravity and magnetic correlation based on both theoretical and empirical relationships are discussed.
Chen, Jiabo; Li, Fayun; Fan, Zhiping; Wang, Yanjie
2016-01-01
Source apportionment of river water pollution is critical in water resource management and aquatic conservation. Comprehensive application of various GIS-based multivariate statistical methods was performed to analyze datasets (2009–2011) on water quality in the Liao River system (China). Cluster analysis (CA) classified the 12 months of the year into three groups (May–October, February–April and November–January) and the 66 sampling sites into three groups (groups A, B and C) based on similarities in water quality characteristics. Discriminant analysis (DA) determined that temperature, dissolved oxygen (DO), pH, chemical oxygen demand (CODMn), 5-day biochemical oxygen demand (BOD5), NH4+–N, total phosphorus (TP) and volatile phenols were significant variables affecting temporal variations, with 81.2% correct assignments. Principal component analysis (PCA) and positive matrix factorization (PMF) identified eight potential pollution factors for each part of the data structure, explaining more than 61% of the total variance. Oxygen-consuming organics from cropland and woodland runoff were the main latent pollution factor for group A. For group B, the main pollutants were oxygen-consuming organics, oil, nutrients and fecal matter. For group C, the evaluated pollutants primarily included oxygen-consuming organics, oil and toxic organics. PMID:27775679
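A minimal sketch of the PCA step in such an analysis, standardizing a site-by-variable water-quality matrix and reading off the cumulative explained variance; the data matrix is synthetic and the variable list is only suggestive of those named above.

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(4)

    # Hypothetical water-quality matrix: rows = the 66 sampling sites,
    # columns = variables (e.g., DO, pH, CODMn, BOD5, NH4-N, TP, phenols);
    # the matrix product induces correlated columns, as in real monitoring data.
    X = rng.normal(size=(66, 7)) @ rng.normal(size=(7, 7))

    # Standardize, then extract principal components; the cumulative
    # explained variance indicates how many latent pollution factors
    # are needed to summarize the dataset.
    Z = StandardScaler().fit_transform(X)
    pca = PCA().fit(Z)
    print(np.round(np.cumsum(pca.explained_variance_ratio_), 2))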
Meda, Shashwath A.; Giuliani, Nicole R.; Calhoun, Vince D.; Jagannathan, Kanchana; Schretlen, David J.; Pulver, Anne; Cascella, Nicola; Keshavan, Matcheri; Kates, Wendy; Buchanan, Robert; Sharma, Tonmoy; Pearlson, Godfrey D.
2008-01-01
Background Many studies have employed voxel-based morphometry (VBM) of MRI images as an automated method of investigating cortical gray matter differences in schizophrenia. However, results from these studies vary widely, likely due to different methodological or statistical approaches. Objective To use VBM to investigate gray matter differences in schizophrenia in a sample significantly larger than any published to date, and to increase statistical power sufficiently to reveal differences missed in smaller analyses. Methods Magnetic resonance whole brain images were acquired from four geographic sites, all using the same model 1.5T scanner and software version, and combined to form a sample of 200 patients with both first episode and chronic schizophrenia and 200 healthy controls, matched for age, gender and scanner location. Gray matter concentration was assessed and compared using optimized VBM. Results Compared to the healthy controls, schizophrenia patients showed significantly less gray matter concentration in multiple cortical and subcortical regions, some previously unreported. Overall, we found lower concentrations of gray matter in regions identified in prior studies, most of which reported only subsets of the affected areas. Conclusions Gray matter differences in schizophrenia are most comprehensively elucidated using a large, diverse and representative sample. PMID:18378428
Bryant, Fred B
2016-12-01
This paper introduces a special section of the current issue of the Journal of Evaluation in Clinical Practice that includes a set of 6 empirical articles showcasing a versatile, new machine-learning statistical method, known as optimal data (or discriminant) analysis (ODA), specifically designed to produce statistical models that maximize predictive accuracy. As this set of papers clearly illustrates, ODA offers numerous important advantages over traditional statistical methods, advantages that enhance the validity and reproducibility of statistical conclusions in empirical research. This issue of the journal also includes a review of a recently published book that provides a comprehensive introduction to the logic, theory, and application of ODA in empirical research. It is argued that researchers have much to gain by using ODA to analyze their data. © 2016 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Lazar, Dora; Ihasz, Istvan
2013-04-01
Short- and medium-range operational forecasting and the warning and alarm services for severe weather are among the most important activities of the Hungarian Meteorological Service. Our study provides a comprehensive summary of newly developed methods based on ECMWF ensemble forecasts that support successful prediction of convective weather situations. In the first part of the study, a brief overview is given of the components of atmospheric convection: lifting, convergence, and vertical wind shear. Atmospheric instability is often characterized by so-called instability indexes; one of the most popular and widely used is the convective available potential energy (CAPE). Heavy convective events such as intense storms, supercells, and tornadoes require vertical instability, adequate moisture, and vertical wind shear. As a first step, various statistical analyses of these three parameters were performed on a nine-year time series of the 51-member ensemble forecasting model for the convective summer period. The relationship between the ratio of convective to total precipitation and the above three parameters was studied with different statistical methods. Four visualization methods were applied to support successful forecasts of severe weather. Two of the four, the ensemble meteogram and the ensemble vertical profiles, had been available at the beginning of our work; both show the probability of meteorological parameters for a selected location. Two additional methods have been newly developed. The first provides a probability map of an event exceeding predefined values, so that the spatial uncertainty of the event is made explicit; because convective events often occur rhapsodically in space, the expected event area can be delineated so that the ensemble forecasts give very good support. The second shows the time evolution of multiple predefined thresholds in graphical form for any selected location; with this tool, the severity of dangerous weather conditions can be well estimated, and intensive convective periods are clearly marked within the forecasting period. Developments were done with the MAGICS++ software under the UNIX operating system. In the third part of the study, the usefulness of these tools is demonstrated in three interesting case studies from last summer.
Zhang, Yiming; Jin, Quan; Wang, Shuting; Ren, Ren
2011-05-01
The mobile behavior of 1481 peptides in ion mobility spectrometry (IMS), which are generated by protease digestion of the Drosophila melanogaster proteome, is modeled and predicted based on two different types of characterization methods, i.e. a sequence-based approach and a structure-based approach. In this procedure, the sequence-based approach considers both the amino acid composition of a peptide and the local environment profile of each amino acid in the peptide; the structure-based approach is performed with the CODESSA protocol, which regards a peptide as a common organic compound and generates more than 200 statistically significant variables to characterize the whole structure profile of a peptide molecule. Subsequently, nonlinear support vector machine (SVM) and Gaussian process (GP) regression as well as linear partial least squares (PLS) regression are employed to correlate the structural parameters of the characterizations with the IMS drift times of these peptides. The obtained quantitative structure-spectrum relationship (QSSR) models are evaluated rigorously and investigated systematically via both one-deep and two-deep cross-validations as well as the rigorous Monte Carlo cross-validation (MCCV). We also give a comprehensive comparison of the resulting statistics arising from the different combinations of variable types with modeling methods and find that the sequence-based approach gives QSSR models with better fitting ability and predictive power but worse interpretability than the structure-based approach. In addition, because the sequence-based approach does not require preparing energy-minimized structures of the peptides before modeling, it is considerably more efficient than the structure-based approach. Copyright © 2011 Elsevier Ltd. All rights reserved.
Classroom Simulation to Prepare Teachers to Use Evidence-Based Comprehension Practices
ERIC Educational Resources Information Center
Ely, Emily; Alves, Kat D.; Dolenc, Nathan R.; Sebolt, Stephanie; Walton, Emily A.
2018-01-01
Reading comprehension is an area of weakness for many students, including those with disabilities. Innovative technology methods may play a role in improving teacher readiness to use evidence-based comprehension practices for all students. In this experimental study, researchers examined a classroom simulation (TLE TeachLivE™) to improve…
Samuelson, David B; Divaris, Kimon; De Kok, Ingeborg J
2017-04-01
This study compared the acceptability and relative effectiveness of case-based learning (CBL) versus traditional lecture-based (LB) instruction in a preclinical removable prosthodontics course in the University of North Carolina at Chapel Hill School of Dentistry DDS curriculum. The entire second-year class (N=82) comprised this crossover study's sample. Assessments of baseline comprehension and confidence in removable partial denture (RPD) treatment planning were conducted at the beginning of the course. Near the end of the course, half of the class received CBL and LB instruction in an RPD module in alternating sequence, with students serving as their own control group. Assessments of perceived RPD treatment planning efficacy, comprehension, and instruction method preference were administered directly after students completed the RPD module and six months later. Analyses of variance accounting for period, carryover, and sequence effects were used to determine the relative effects of each approach using a p<0.05 statistical significance threshold. The results showed that the students preferred CBL (81%) over LB instruction (9%), a pattern that remained unchanged after a six-month period. Despite notable period and carryover effects, CBL was also associated with higher gains in RPD treatment planning comprehension (p=0.04) and perceived efficacy (p=0.01) compared to LB instruction. These gains diminished six months after the course-a finding based on a 49% follow-up response rate. Overall, the students overwhelmingly preferred CBL to LB instruction, and the findings suggest small albeit measurable educational benefits associated with CBL. This study's findings support the introduction and further testing of CBL in the preclinical dental curriculum, in anticipation of possible future benefits evident during clinical training.
voomDDA: discovery of diagnostic biomarkers and classification of RNA-seq data.
Zararsiz, Gokmen; Goksuluk, Dincer; Klaus, Bernd; Korkmaz, Selcuk; Eldem, Vahap; Karabulut, Erdem; Ozturk, Ahmet
2017-01-01
RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers particularly for cancer diseases. Microarray based classifiers are not directly applicable to RNA-Seq data due to its discrete nature. Overdispersion is another problem that requires careful modeling of mean and variance relationship of the RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and the diagonal discriminant classifiers. VoomNSC is one of these classifiers and brings voom and NSC approaches together for the purpose of gene-expression based classification. For this purpose, we propose weighted statistics and put these weighted statistics into the NSC algorithm. The VoomNSC is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom's precision weights into the NSC classifier via weighted statistics. A comprehensive simulation study was designed and four real datasets are used for performance assessment. The overall results indicate that voomNSC performs as the sparsest classifier. It also provides the most accurate results together with power-transformed Poisson linear discriminant analysis, rlog transformed support vector machines and random forests algorithms. In addition to prediction purposes, the voomNSC classifier can be used to identify the potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.
NASA Astrophysics Data System (ADS)
Lutz, Norbert W.; Bernard, Monique
2018-02-01
We recently suggested a new paradigm for statistical analysis of thermal heterogeneity in (semi-)aqueous materials by 1H NMR spectroscopy, using water as a temperature probe. Here, we present a comprehensive in silico and in vitro validation that demonstrates the ability of this new technique to provide accurate quantitative parameters characterizing the statistical distribution of temperature values in a volume of (semi-)aqueous matter. First, line shape parameters of numerically simulated water 1H NMR spectra are systematically varied to study a range of mathematically well-defined temperature distributions. Then, corresponding models based on measured 1H NMR spectra of agarose gel are analyzed. In addition, dedicated samples based on hydrogels or biological tissue are designed to produce temperature gradients changing over time, and dynamic NMR spectroscopy is employed to analyze the resulting temperature profiles at sub-second temporal resolution. Accuracy and consistency of the previously introduced statistical descriptors of temperature heterogeneity are determined: weighted median and mean temperature, standard deviation, temperature range, temperature mode(s), kurtosis, skewness, entropy, and relative areas under temperature curves. Potential and limitations of this method for quantitative analysis of thermal heterogeneity in (semi-)aqueous materials are discussed in view of prospective applications in materials science as well as biology and medicine.
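A minimal sketch of computing the statistical descriptors listed above from a temperature distribution, treating spectral intensities as weights on a temperature axis; the line shape and temperature calibration are synthetic stand-ins for a water 1H NMR spectrum.

    import numpy as np

    # Hypothetical temperature axis (derived from the water 1H chemical
    # shift) and spectral intensities serving as statistical weights.
    T = np.linspace(30.0, 45.0, 301)                    # deg C
    w = np.exp(-0.5 * ((T - 37.0) / 1.5) ** 2)          # synthetic line shape
    w /= w.sum()

    mean = (w * T).sum()
    var = (w * (T - mean) ** 2).sum()
    std = np.sqrt(var)
    skew = (w * ((T - mean) / std) ** 3).sum()
    kurt = (w * ((T - mean) / std) ** 4).sum() - 3.0    # excess kurtosis

    # Weighted median: temperature where the cumulative weight crosses 0.5.
    median = T[np.searchsorted(np.cumsum(w), 0.5)]

    # Shannon entropy of the normalized profile (ignoring zero weights).
    nz = w[w > 0]
    entropy = -(nz * np.log(nz)).sum()

    print(f"median={median:.2f}  mean={mean:.2f}  sd={std:.2f}  "
          f"skew={skew:.2f}  kurtosis={kurt:.2f}  entropy={entropy:.2f}")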
Reinforcing Sampling Distributions through a Randomization-Based Activity for Introducing ANOVA
ERIC Educational Resources Information Center
Taylor, Laura; Doehler, Kirsten
2015-01-01
This paper examines the use of a randomization-based activity to introduce the ANOVA F-test to students. The two main goals of this activity are to successfully teach students to comprehend ANOVA F-tests and to increase student comprehension of sampling distributions. Four sections of students in an advanced introductory statistics course…
ERIC Educational Resources Information Center
May, Julian; Roberts, Benjamin
2005-01-01
Increasingly national statistical agencies are being called upon to provide high quality data on a regular basis, to be used by governments for evidence-based policy development. Poverty Reduction Strategy Papers (PRSPs) give impetus to this, and bring a prerequisite for comprehensive "poverty diagnosis." Often the data that are required…
Predictors of Nutrition Information Comprehension in Adulthood
Miller, Lisa M. Soederberg; Gibson, Tanja N.; Applegate, Elizabeth A.
2009-01-01
Objective The goal of the present study was to examine relationships among several predictors of nutrition comprehension. We were particularly interested in exploring whether nutrition knowledge or motivation moderated the effects of attention on comprehension across a wide age range of adults. Methods Ninety-three participants, ages 18 to 80, completed measures of nutrition knowledge and motivation and then read nutrition information (from which attention allocation was derived) and answered comprehension questions. Results In general, predictor variables were highly intercorrelated. However, knowledge, but not motivation, had direct effects on comprehension accuracy. In contrast, motivation influenced attention, which in turn influenced accuracy. Results also showed that comprehension accuracy decreased, and knowledge increased, with age. When knowledge was statistically controlled, age declines in comprehension increased. Conclusion Knowledge is an important predictor of nutrition information comprehension and its role increases in later life. Motivation is also important; however, its effects on comprehension differ from knowledge. Practice Implications Health educators and clinicians should consider cognitive skills such as knowledge as well as motivation and age of patients when deciding how to best convey health information. The increased role of knowledge among older adults suggests that lifelong educational efforts may have important payoffs in later life. PMID:19854605
Barden-O'Fallon, Janine
2017-05-08
Faith-based organizations (FBOs) have a long history of providing health services in developing countries and are important contributors to healthcare systems. Support for the wellbeing of women, children, and families is evidenced through active participation in the field of family planning (FP). However, there is little quantitative evidence on the availability or quality of FP services by FBOs. The descriptive analysis uses facility-level data collected through recent Service Provision Assessments in Malawi (2013-14), Kenya (2010), and Haiti (2012) to examine 11 indicators of FP service and method availability and nine indicators of comprehensive and quality counseling. The indicators include measures of FP service provision, method mix, method stock, the provision of accurate information, and the discussion of reproductive intentions, clients' questions/concerns, prevention of sexually transmitted infections, and return visits, among others. Pearson's Chi-square test is used to assess the selected indicators by managing authority (FBO, public, and other private sector) to determine statistical equivalence. Results show that FBOs are less likely to offer FP services than other managing authorities (p < 0.05). For example, 69% of FBOs in Kenya offer FP services compared to 97% of public facilities and 83% of other private facilities. Offering long-acting or permanent methods in faith-based facilities is especially low (43% in Malawi, 29% in Kenya and 39% in Haiti). There were few statistically significant differences between the managing authorities in comprehensive and quality counseling indicators. Interestingly, Haitian FBOs often perform as well as or better than public sector health facilities on counseling indicators, such as discussion of a return visit (79% of FBO providers vs. 68% of public sector providers) and discussion of client concerns/questions (52% vs. 49%, respectively). Results from this analysis indicate that there is room for improvement in the availability of FP services by FBOs in these countries. Quality of counseling should be improved by all managing authorities in the three countries, as indicated by low overall coverage for practices such as ensuring confidentiality (22% in Malawi, 47% in Kenya and 12% in Haiti), discussion of sexually transmitted infections (18%, 25%, 17%, respectively), and providing services to youth (53%, 27%, 32%, respectively).
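A minimal sketch of the Pearson chi-square comparison across managing authorities; the contingency counts are hypothetical, loosely echoing the Kenya percentages quoted above, and scipy's chi2_contingency performs the test.

    import numpy as np
    from scipy.stats import chi2_contingency

    # Hypothetical facility counts (offering FP vs. not) by managing
    # authority, loosely echoing the Kenya percentages cited above.
    #                 offers FP  does not
    table = np.array([[69, 31],   # faith-based
                      [97,  3],   # public
                      [83, 17]])  # other private

    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.2g}")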
ERIC Educational Resources Information Center
Kong, Na
2011-01-01
Based on the current contradiction between the grammar-translation method and the communicative teaching method in English teaching, this paper, starting with clarifying the task of comprehensive English as well as the definition of the two teaching methods, objectively analyzes their advantages and disadvantages and proposes establishing a new…
Assessment of sustainable urban transport development based on entropy and unascertained measure.
Li, Yancang; Yang, Jing; Shi, Huawang; Li, Yijie
2017-01-01
To find a more effective method for the assessment of sustainable urban transport development, the comprehensive assessment model of sustainable urban transport development was established based on the unascertained measure. On the basis of considering the factors influencing urban transport development, the comprehensive assessment indexes were selected, including urban economical development, transport demand, environment quality and energy consumption, and the assessment system of sustainable urban transport development was proposed. In view of different influencing factors of urban transport development, the index weight was calculated through the entropy weight coefficient method. Qualitative and quantitative analyses were conducted according to the actual condition. Then, the grade was obtained by using the credible degree recognition criterion from which the urban transport development level can be determined. Finally, a comprehensive assessment method for urban transport development was introduced. The application practice showed that the method can be used reasonably and effectively for the comprehensive assessment of urban transport development.
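A minimal sketch of the grade assignment via the credible degree recognition criterion described above; the membership matrix, index weights, and credibility threshold are illustrative assumptions.

    import numpy as np

    # Hypothetical membership matrix: for one city, the unascertained
    # measure of each index (rows) falling into each grade (columns,
    # ordered best -> worst), plus entropy-derived index weights.
    mu = np.array([
        [0.6, 0.3, 0.1],   # economic development
        [0.2, 0.5, 0.3],   # transport demand
        [0.1, 0.4, 0.5],   # environment quality
        [0.3, 0.4, 0.3],   # energy consumption
    ])
    w = np.array([0.30, 0.25, 0.25, 0.20])

    # Comprehensive measure vector for the city.
    m = w @ mu

    # Credible degree recognition: assign the first grade whose
    # cumulative measure reaches the credibility threshold lambda.
    lam = 0.6
    grade = int(np.argmax(np.cumsum(m) >= lam)) + 1
    print("measure vector:", np.round(m, 3), " grade:", grade)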
ERIC Educational Resources Information Center
Herman, Heather A.
2017-01-01
This mixed methods research explores the effects of literacy support tools to support comprehension strategies when reading informational e-books and print-based text with 14 first-grade students. This study focused on the following comprehension strategies: annotating connections, annotating "I wonders," and looking back in the text.…
NASA Astrophysics Data System (ADS)
Gutiérrez, Jose Manuel; Maraun, Douglas; Widmann, Martin; Huth, Radan; Hertig, Elke; Benestad, Rasmus; Roessler, Ole; Wibig, Joanna; Wilcke, Renate; Kotlarski, Sven
2016-04-01
VALUE is an open European network to validate and compare downscaling methods for climate change research (http://www.value-cost.eu). A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. This framework is based on a user-focused validation tree, guiding the selection of relevant validation indices and performance measures for different aspects of the validation (marginal, temporal, spatial, multi-variable). Moreover, several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur (assessment of intrinsic performance, effect of errors inherited from the global models, effect of non-stationarity, etc.). The list of downscaling experiments includes 1) cross-validation with perfect predictors, 2) GCM predictors (aligned with the EURO-CORDEX experiment), and 3) pseudo-reality predictors (see Maraun et al. 2015, Earth's Future, 3, doi:10.1002/2014EF000259, for more details). The results of these experiments are gathered, validated, and publicly distributed through the VALUE validation portal, allowing for a comprehensive community-open downscaling intercomparison study. In this contribution we describe the overall results from Experiment 1), consisting of a European-wide 5-fold cross-validation (with consecutive 6-year periods from 1979 to 2008) using predictors from ERA-Interim to downscale precipitation and temperatures (minimum and maximum) over a set of 86 ECA&D stations representative of the main geographical and climatic regions in Europe. As a result of the open call for contributions to this experiment (closed in Dec. 2015), over 40 methods representative of the main approaches (MOS and Perfect Prognosis, PP) and techniques (linear scaling, quantile mapping, analogs, weather typing, linear and generalized regression, weather generators, etc.) were submitted, including both data (downscaled values) and metadata (characterizing different aspects of the downscaling methods). This constitutes the largest and most comprehensive intercomparison of statistical downscaling methods to date. Here, we present an overall validation, analyzing marginal and temporal aspects to assess the intrinsic performance and added value of statistical downscaling methods at both annual and seasonal levels. This validation takes into account the different properties/limitations of the various approaches and techniques (as reported in the provided metadata) in order to perform a fair comparison. It is pointed out that this experiment alone is not sufficient to evaluate the limitations of (MOS) bias correction techniques. Moreover, it also does not fully validate PP, since it does not reveal whether the right predictors were used or whether the PP assumption is valid. These problems will be analyzed in the subsequent community-open VALUE experiments 2) and 3), which will be open for participation throughout the present year.
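Experiment 1's cross-validation splits the 1979-2008 record into five consecutive 6-year blocks rather than random folds. A minimal sketch of that blocked split (the fitting and prediction steps are placeholders):

```python
import numpy as np

years = np.arange(1979, 2009)          # 30 years, 1979..2008
folds = np.array_split(years, 5)       # five consecutive 6-year blocks

for k, test_years in enumerate(folds):
    train_years = np.setdiff1d(years, test_years)
    # Fit the downscaling method on the training years' predictors and
    # predictands, then predict station temperature/precipitation for
    # the held-out block.
    print(f"fold {k}: test {test_years[0]}-{test_years[-1]}, "
          f"train {len(train_years)} years")
```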
Dai, Mingwei; Ming, Jingsi; Cai, Mingxuan; Liu, Jin; Yang, Can; Wan, Xiang; Xu, Zongben
2017-09-15
Results from genome-wide association studies (GWAS) suggest that a complex phenotype is often affected by many variants with small effects, known as 'polygenicity'. Tens of thousands of samples are often required to ensure statistical power for identifying these variants with small effects. However, it is often the case that a research group can only get approval for access to individual-level genotype data with a limited sample size (e.g. a few hundred or thousand). Meanwhile, summary statistics generated using single-variant-based analysis are becoming publicly available. The sample sizes associated with the summary statistics datasets are usually quite large. How to make the most efficient use of existing abundant data resources largely remains an open question. In this study, we propose a statistical approach, IGESS, to increasing the statistical power of identifying risk variants and improving the accuracy of risk prediction by integrating individual-level genotype data and summary statistics. An efficient algorithm based on variational inference is developed to handle the genome-wide analysis. Through comprehensive simulation studies, we demonstrated the advantages of IGESS over methods which take either individual-level data or summary statistics data as input. We applied IGESS to perform an integrative analysis of Crohn's Disease from WTCCC and summary statistics from other studies. IGESS was able to significantly increase the statistical power of identifying risk variants and improve the risk prediction accuracy from 63.2% (±0.4%) to 69.4% (±0.1%) using about 240,000 variants. The IGESS software is available at https://github.com/daviddaigithub/IGESS. Contact: zbxu@xjtu.edu.cn, xwan@comp.hkbu.edu.hk, or eeyang@hkbu.edu.hk. Supplementary data are available at Bioinformatics online.
Spring, C; French, L
1990-01-01
A method of identifying children with specific reading disabilities by identifying discrepancies between their reading and listening comprehension scores was validated with disabled and nondisabled readers in Grades 4, 5, and 6. The method is based on a modification of the reading comprehension subtest of the Peabody Individual Achievement Test (Dunn & Markwardt, 1970). In this modification, even-numbered sentences are read by subjects, and odd-numbered sentences are read by the test administrator as subjects listen. The features of this test that reduce demands on working memory, thereby making it suitable for the detection of a discrepancy between reading and listening comprehension in readers with disabilities, are discussed. A significant group-by-modality interaction was obtained. Children with reading disabilities scored significantly lower on reading than on listening comprehension, while nondisabled readers scored slightly higher, but not significantly so, on reading than on listening comprehension. The appropriateness of this method as a substitute for the traditional method, which is based on the detection of a discrepancy between intelligence and reading and which has recently been proscribed in certain school districts, is discussed. Issues concerning the listening comprehension skills of disabled readers are also discussed.
Logistic regression for southern pine beetle outbreaks with spatial and temporal autocorrelation
M. L. Gumpertz; C.-T. Wu; John M. Pye
2000-01-01
Regional outbreaks of southern pine beetle (Dendroctonus frontalis Zimm.) show marked spatial and temporal patterns. While these patterns are of interest in themselves, we focus on statistical methods for estimating the effects of underlying environmental factors in the presence of spatial and temporal autocorrelation. The most comprehensive available information on...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wurtz, R.; Kaplan, A.
Pulse shape discrimination (PSD) is a variety of statistical classifier. Fully-realized statistical classifiers rely on a comprehensive set of tools for designing, building, and implementing. PSD advances rely on improvements to the implemented algorithm and can benefit from conventional statistical classifier or machine learning methods. This paper provides the reader with a glossary of classifier-building elements and their functions in a fully-designed and operational classifier framework that can be used to discover opportunities for improving PSD classifier projects. This paper recommends reporting the PSD classifier's receiver operating characteristic (ROC) curve and its behavior at a gamma rejection rate (GRR) relevant for realistic applications.
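Reporting a PSD classifier's ROC curve and its behavior at a chosen gamma rejection rate, as recommended here, can be sketched with standard tools; the labels and scores below are simulated, not detector data:

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
# Hypothetical PSD scores: gammas (label 0) and neutrons (label 1).
labels = np.concatenate([np.zeros(1000), np.ones(1000)])
scores = np.concatenate([rng.normal(0.0, 1.0, 1000),
                         rng.normal(1.5, 1.0, 1000)])

fpr, tpr, thresholds = roc_curve(labels, scores)
# Neutron acceptance at a gamma rejection rate (GRR) of 99.9%,
# i.e. a false-positive rate of 1e-3:
idx = np.searchsorted(fpr, 1e-3)
print(f"neutron acceptance at GRR 99.9%: {tpr[idx]:.3f}")
```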
DIMM-SC: a Dirichlet mixture model for clustering droplet-based single cell transcriptomic data.
Sun, Zhe; Wang, Ting; Deng, Ke; Wang, Xiao-Feng; Lafyatis, Robert; Ding, Ying; Hu, Ming; Chen, Wei
2018-01-01
Single cell transcriptome sequencing (scRNA-Seq) has become a revolutionary tool to study cellular and molecular processes at single cell resolution. Among existing technologies, the recently developed droplet-based platform enables efficient parallel processing of thousands of single cells with direct counting of transcript copies using Unique Molecular Identifier (UMI). Despite the technology advances, statistical methods and computational tools are still lacking for analyzing droplet-based scRNA-Seq data. Particularly, model-based approaches for clustering large-scale single cell transcriptomic data are still under-explored. We developed DIMM-SC, a Dirichlet Mixture Model for clustering droplet-based Single Cell transcriptomic data. This approach explicitly models UMI count data from scRNA-Seq experiments and characterizes variations across different cell clusters via a Dirichlet mixture prior. We performed comprehensive simulations to evaluate DIMM-SC and compared it with existing clustering methods such as K-means, CellTree and Seurat. In addition, we analyzed public scRNA-Seq datasets with known cluster labels and in-house scRNA-Seq datasets from a study of systemic sclerosis with prior biological knowledge to benchmark and validate DIMM-SC. Both simulation studies and real data applications demonstrated that overall, DIMM-SC achieves substantially improved clustering accuracy and much lower clustering variability compared to other existing clustering methods. More importantly, as a model-based approach, DIMM-SC is able to quantify the clustering uncertainty for each single cell, facilitating rigorous statistical inference and biological interpretations, which are typically unavailable from existing clustering methods. DIMM-SC has been implemented in a user-friendly R package with a detailed tutorial available at www.pitt.edu/~wec47/singlecell.html. Contact: wei.chen@chp.edu or hum@ccf.org. Supplementary data are available at Bioinformatics online.
Hierarchical multivariate covariance analysis of metabolic connectivity.
Carbonell, Felix; Charil, Arnaud; Zijdenbos, Alex P; Evans, Alan C; Bedell, Barry J
2014-12-01
Conventional brain connectivity analysis is typically based on the assessment of interregional correlations. Given that correlation coefficients are derived from both covariance and variance, group differences in covariance may be obscured by differences in the variance terms. To facilitate a comprehensive assessment of connectivity, we propose a unified statistical framework that interrogates the individual terms of the correlation coefficient. We have evaluated the utility of this method for metabolic connectivity analysis using [18F]2-fluoro-2-deoxyglucose (FDG) positron emission tomography (PET) data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. As an illustrative example of the utility of this approach, we examined metabolic connectivity in angular gyrus and precuneus seed regions of mild cognitive impairment (MCI) subjects with low and high β-amyloid burdens. This new multivariate method allowed us to identify alterations in the metabolic connectome, which would not have been detected using classic seed-based correlation analysis. Ultimately, this novel approach should be extensible to brain network analysis and broadly applicable to other imaging modalities, such as functional magnetic resonance imaging (fMRI).
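The framework's premise, that r = cov(x, y)/(sd_x * sd_y) can hide covariance differences behind compensating variance differences, is easy to demonstrate numerically; the two groups below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)

def corr_terms(x, y):
    """Return covariance, the two standard deviations, and the correlation."""
    cov = np.cov(x, y)[0, 1]
    sx, sy = x.std(ddof=1), y.std(ddof=1)
    return cov, sx, sy, cov / (sx * sy)

# Two hypothetical groups engineered so covariance differs while the
# correlation stays essentially the same: scaling x and the noise by 2
# quadruples the covariance but leaves r unchanged.
x1 = rng.normal(size=500)
y1 = 0.5 * x1 + rng.normal(size=500)
x2 = 2.0 * x1
y2 = 0.5 * x2 + 2.0 * rng.normal(size=500)

for x, y in [(x1, y1), (x2, y2)]:
    cov, sx, sy, r = corr_terms(x, y)
    print(f"cov={cov:.2f}  sd_x={sx:.2f}  sd_y={sy:.2f}  r={r:.2f}")
```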
Salutogenic factors for mental health promotion in work settings and organizations.
Graeser, Silke
2011-12-01
Accompanied by an increasing awareness among companies and organizations of mental health conditions in work settings, the salutogenic perspective provides a promising approach to identifying supportive factors and resources of organizations to promote mental health. Based on the sense of coherence (SOC), usually treated as an individual personality-trait concept, an organization-based SOC scale was developed to identify potential salutogenic factors of a university as an organization and workplace. Based on the results of two samples of employees (n = 362, n = 204), factors associated with the organization-based SOC were evaluated. Statistical analysis yielded significant correlations between mental health and the setting-based SOC as well as its three factors identified by factor analysis: comprehensibility, manageability, and meaningfulness. Significant results of bivariate and multivariate analyses emphasize the importance, for an organization-based SOC, of aspects such as participation and comprehensibility at the organizational level, social cohesion and social climate at the social level, and recognition at the individual level. Potential approaches for the further development of work-place health promotion interventions based on salutogenic factors and resources at the individual, social, and organizational levels are elaborated, and the transcultural dimensions of these factors are discussed.
Reporting Practices and Use of Quantitative Methods in Canadian Journal Articles in Psychology.
Counsell, Alyssa; Harlow, Lisa L
2017-05-01
With recent focus on the state of research in psychology, it is essential to assess the nature of the statistical methods and analyses used and reported by psychological researchers. To that end, we investigated the prevalence of different statistical procedures and the nature of statistical reporting practices in recent articles from the four major Canadian psychology journals. The majority of authors evaluated their research hypotheses through the use of analysis of variance (ANOVA), t-tests, and multiple regression. Multivariate approaches were less common. Null hypothesis significance testing remains a popular strategy, but the majority of authors reported a standardized or unstandardized effect size measure alongside their significance test results. Confidence intervals on effect sizes were infrequently employed. Many authors provided minimal details about their statistical analyses, and fewer than a third of the articles reported data complications such as missing data and violations of statistical assumptions. Strengths of and areas needing improvement for reporting quantitative results are highlighted. The paper concludes with recommendations for how researchers and reviewers can improve comprehension and transparency in statistical reporting.
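The practices surveyed, significance tests accompanied by effect sizes and, ideally, confidence intervals on those effects, can be illustrated for a two-group design; the data below are simulated and the CI uses a common large-sample approximation for Cohen's d:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
g1 = rng.normal(0.0, 1.0, 40)   # simulated group scores
g2 = rng.normal(0.5, 1.0, 40)

t, p = stats.ttest_ind(g1, g2)

# Cohen's d with a pooled standard deviation.
n1, n2 = len(g1), len(g2)
sp = np.sqrt(((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1))
             / (n1 + n2 - 2))
d = (g2.mean() - g1.mean()) / sp

# Approximate 95% CI for d via its large-sample standard error.
se_d = np.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
ci = (d - 1.96 * se_d, d + 1.96 * se_d)
print(f"t={t:.2f}, p={p:.3f}, d={d:.2f}, 95% CI=({ci[0]:.2f}, {ci[1]:.2f})")
```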
[Current status and trends in the health of the Moscow population].
Tishuk, E A; Plavunov, N F; Soboleva, N P
1997-01-01
Based on a vast and comprehensive medical statistical database, the authors analyze the health status of the population and the efficacy of the public health service in Moscow. The pre-crisis tendencies and the present state of public health under current socioeconomic conditions are noted.
Parsons, Brendon A; Marney, Luke C; Siegler, W Christopher; Hoggard, Jamin C; Wright, Bob W; Synovec, Robert E
2015-04-07
Comprehensive two-dimensional (2D) gas chromatography coupled with time-of-flight mass spectrometry (GC × GC-TOFMS) is a versatile instrumental platform capable of collecting highly informative, yet highly complex, chemical data for a variety of samples. Fisher-ratio (F-ratio) analysis applied to the supervised comparison of sample classes algorithmically reduces complex GC × GC-TOFMS data sets to find class distinguishing chemical features. F-ratio analysis, using a tile-based algorithm, significantly reduces the adverse effects of chromatographic misalignment and spurious covariance of the detected signal, enhancing the discovery of true positives while simultaneously reducing the likelihood of detecting false positives. Herein, we report a study using tile-based F-ratio analysis whereby four non-native analytes were spiked into diesel fuel at several concentrations ranging from 0 to 100 ppm. Spike level comparisons were performed in two regimes: comparing the spiked samples to the nonspiked fuel matrix and to each other at relative concentration factors of two. Redundant hits were algorithmically removed by refocusing the tiled results onto the original high resolution pixel level data. To objectively limit the tile-based F-ratio results to only features which are statistically likely to be true positives, we developed a combinatorial technique using null class comparisons, called null distribution analysis, by which we determined a statistically defensible F-ratio cutoff for the analysis of the hit list. After applying null distribution analysis, spiked analytes were reliably discovered at ∼1 to ∼10 ppm (∼5 to ∼50 pg using a 200:1 split), depending upon the degree of mass spectral selectivity and 2D chromatographic resolution, with minimal occurrence of false positives. To place the relevance of this work among other methods in this field, results are compared to those for pixel and peak table-based approaches.
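The F-ratio underlying this analysis is the between-class variance of a chromatographic feature divided by its within-class variance. A minimal sketch for a single tile's summed signal (the alignment-robust tiling itself is not shown):

```python
import numpy as np

def fisher_ratio(class_a, class_b):
    """Between-class over within-class variance for one feature
    (e.g., summed signal in one 2D tile), two sample classes."""
    grand = np.concatenate([class_a, class_b]).mean()
    n_a, n_b = len(class_a), len(class_b)
    between = (n_a * (class_a.mean() - grand) ** 2 +
               n_b * (class_b.mean() - grand) ** 2)
    within = (((class_a - class_a.mean()) ** 2).sum() +
              ((class_b - class_b.mean()) ** 2).sum())
    dof_b, dof_w = 1, n_a + n_b - 2
    return (between / dof_b) / (within / dof_w)

# Hypothetical tile signals: spiked fuel vs. nonspiked fuel replicates.
spiked = np.array([10.2, 9.8, 10.5, 10.1])
blank = np.array([1.1, 0.9, 1.3, 1.0])
print(fisher_ratio(spiked, blank))   # a large ratio flags a class-distinguishing feature
```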
Toward a comprehensive and systematic methylome signature in colorectal cancers.
Ashktorab, Hassan; Rahi, Hamed; Wansley, Daniel; Varma, Sudhir; Shokrani, Babak; Lee, Edward; Daremipouran, Mohammad; Laiyemo, Adeyinka; Goel, Ajay; Carethers, John M; Brim, Hassan
2013-08-01
CpG Island Methylator Phenotype (CIMP) is one of the underlying mechanisms in colorectal cancer (CRC). This study aimed to define a methylome signature in CRC through a methylation microarray analysis and a compilation of promising CIMP markers from the literature. Illumina HumanMethylation27 (IHM27) array data was generated and analyzed based on statistical differences in methylation data (1st approach) or based on overall differences in methylation percentages using the lower 95% CI (2nd approach). Pyrosequencing was performed for the validation of nine genes. A meta-analysis was used to identify CIMP and non-CIMP markers that were hypermethylated in CRC but did not yet make it to the CIMP genes' list. Our 1st approach for array data analysis demonstrated the limitations in selecting genes for further validation, highlighting the need for the 2nd bioinformatics approach to adequately select genes with differential aberrant methylation. A more comprehensive list, which included non-CIMP genes, such as APC, EVL, CD109, PTEN, TWIST1, DCC, PTPRD, SFRP1, ICAM5, RASSF1A, EYA4, 3OST2, LAMA1, KCNQ5, ADHEF1, and TFPI2, was established. Array data are useful to categorize and cluster colonic lesions based on their global methylation profiles; however, their usefulness in identifying robust methylation markers is limited and relies on the data analysis method. We have identified 16 non-CIMP-panel genes for which we provide a rationale for inclusion in a more comprehensive characterization of CIMP+ CRCs. The identification of a definitive list of methylome-specific genes in CRC will contribute to better clinical management of CRC patients.
Caregivers' health literacy and their young children's oral-health-related expenditures.
Vann, W F; Divaris, K; Gizlice, Z; Baker, A D; Lee, J Y
2013-07-01
Caregivers' health literacy has emerged as an important determinant of young children's health care and outcomes. We examined the hypothesis that caregivers' health literacy influences children's oral-health-care-related expenditures. This was a prospective cohort study of 1,132 child/caregiver dyads (children's mean age = 19 months), participating in the Carolina Oral Health Literacy Project. Health literacy was measured by the REALD-30 (word recognition based) and NVS (comprehension based) instruments. Follow-up data included child Medicaid claims for CY2008-10. We quantified expenditures using annualized 2010 fee-adjusted Medicaid-paid dollars for oral-health-related visits involving preventive, restorative, and emergency care. We used descriptive, bivariate, and multivariate statistical methods based on generalized gamma models. Mean oral-health-related annual expenditures totaled $203: preventive--$81, restorative--$99, and emergency care--$22. Among children who received services, mean expenditures were: emergency hospital-based--$1282, preventive--$106, and restorative care--$343. Caregivers' low literacy in the oral health context was associated with a statistically non-significant increase in total expenditures (average annual difference = $40; 95% confidence interval, -32, 111). Nevertheless, with both instruments, emergency dental care expenditures were consistently elevated among children of low-literacy caregivers. These findings provide initial support for health literacy as an important determinant of the meaningful use and cost of oral health care.
ParseCNV integrative copy number variation association software with quality tracking
Glessner, Joseph T.; Li, Jin; Hakonarson, Hakon
2013-01-01
A number of copy number variation (CNV) calling algorithms exist; however, comprehensive software tools for CNV association studies are lacking. We describe ParseCNV, unique software that takes CNV calls and creates probe-based statistics for CNV occurrence in both case–control design and in family based studies addressing both de novo and inheritance events, which are then summarized based on CNV regions (CNVRs). CNVRs are defined in a dynamic manner to allow for a complex CNV overlap while maintaining precise association region. Using this approach, we avoid failure to converge and non-monotonic curve fitting weaknesses of programs, such as CNVtools and CNVassoc, and although Plink is easy to use, it only provides combined CNV state probe-based statistics, not state-specific CNVRs. Existing CNV association methods do not provide any quality tracking information to filter confident associations, a key issue which is fully addressed by ParseCNV. In addition, uncertainty in CNV calls underlying CNV associations is evaluated to verify significant results, including CNV overlap profiles, genomic context, number of probes supporting the CNV and single-probe intensities. When optimal quality control parameters are followed using ParseCNV, 90% of CNVs validate by polymerase chain reaction, an often problematic stage because of inadequate significant association review. ParseCNV is freely available at http://parsecnv.sourceforge.net. PMID:23293001
ASM Based Synthesis of Handwritten Arabic Text Pages.
Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif; Ghoneim, Ahmed
2015-01-01
Document analysis tasks, such as text recognition, word spotting, or segmentation, are highly dependent on comprehensive and suitable databases for training and validation. However, their generation is expensive in terms of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods, each with individual demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents with detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-spline interpolation, and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis methods on synthetic samples whenever no sufficient naturally ground-truthed data is available.
On state-of-charge determination for lithium-ion batteries
NASA Astrophysics Data System (ADS)
Li, Zhe; Huang, Jun; Liaw, Bor Yann; Zhang, Jianbo
2017-04-01
Accurate estimation of the state-of-charge (SOC) of a battery through its life remains challenging in battery research. Although improved precision continues to be reported, almost all estimates are based on empirical regression methods, while accuracy is often not properly addressed. Here, a comprehensive review is set to address such issues, from the fundamental principles that are supposed to define SOC to methodologies for estimating SOC in practical use. It covers topics from calibration and regression (including modeling methods) to validation in terms of precision and accuracy. At the end, we intend to answer the following questions: 1) Can SOC estimation be self-adaptive without bias? 2) Why is Ah-counting a necessity in almost all battery-model-assisted regression methods? 3) How can a consistent framework of coupling in multi-physics battery models be established? 4) To assess the accuracy of SOC estimation, statistical methods should be employed to analyze the factors that contribute to the uncertainty. We hope that, through this discussion of the principles, accurate SOC estimation can be widely achieved.
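Ah-counting, which question 2 identifies as a fixture of battery-model-assisted regression methods, integrates measured current to update SOC. A minimal sketch, assuming discharge current is positive and a known capacity:

```python
import numpy as np

def soc_ah_counting(soc0, current_a, dt_s, capacity_ah, coulombic_eff=1.0):
    """Update state-of-charge by integrating current (Ah-counting).
    soc0: initial SOC (0..1); current_a: sampled current in amperes,
    discharge positive; dt_s: sample interval in seconds."""
    charge_ah = np.cumsum(current_a) * dt_s / 3600.0   # ampere-hours drawn
    return soc0 - coulombic_eff * charge_ah / capacity_ah

# One hour of 2 A discharge, sampled each second, from a 10 Ah cell:
soc = soc_ah_counting(1.0, np.full(3600, 2.0), 1.0, 10.0)
print(soc[-1])   # ~0.8 after removing 2 Ah
```

Because any current-sensor bias accumulates in the integral, Ah-counting drifts without periodic recalibration, which is one reason the review ties accuracy to calibration.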
Ghasemi Damavandi, Hamidreza; Sen Gupta, Ananya; Nelson, Robert K; Reddy, Christopher M
2016-01-01
Comprehensive two-dimensional gas chromatography (GC × GC) provides high-resolution separations across hundreds of compounds in a complex mixture, thus unlocking unprecedented information for intricate quantitative interpretation. We exploit this compound diversity across the GC × GC topography to provide quantitative compound-cognizant interpretation beyond target compound analysis, with petroleum forensics as a practical application. We focus on the GC × GC topography of biomarker hydrocarbons, hopanes and steranes, as they are generally recalcitrant to weathering. We introduce peak topography maps (PTM) and topography partitioning techniques that consider a notably broader and more diverse range of target and non-target biomarker compounds compared to traditional approaches that consider approximately 20 biomarker ratios. Specifically, we consider a range of 33-154 target and non-target biomarkers, with highest-to-lowest peak ratios within an injection ranging from 4.86 to 19.6 (precise numbers depend on the biomarker diversity of individual injections). We also provide a robust quantitative measure for directly determining a "match" between samples, without necessitating training data sets. We validate our methods across 34 GC × GC injections from a diverse portfolio of petroleum sources, and provide quantitative comparison of performance against established statistical methods such as principal components analysis (PCA). Our data set includes a wide range of samples collected following the 2010 Deepwater Horizon disaster that released approximately 160 million gallons of crude oil from the Macondo well (MW). Samples that were clearly collected following this disaster exhibit a statistically significant match using PTM-based interpretation against other closely related sources. PTM-based interpretation also provides higher differentiation between closely correlated but distinct sources than obtained using PCA-based statistical comparisons. In addition to results based on this experimental field data, we also provide extensive perturbation analysis of the PTM method over numerical simulations that introduce random variability of peak locations over the GC × GC biomarker ROI image of the MW pre-spill sample (described in Additional file 4: Table S1). We compare the robustness of the cross-PTM score against peak location variability in both dimensions and compare the results against PCA analysis over the same set of simulated images. A detailed description of the simulation experiment and discussion of results are provided in Additional file 1: Section S8. We provide a peak-cognizant informational framework for quantitative interpretation of GC × GC topography. The proposed topographic analysis enables GC × GC forensic interpretation across target petroleum biomarkers, while including the nuances of lesser-known non-target biomarkers clustered around the target peaks. This allows potential discovery of hitherto unknown connections between target and non-target biomarkers.
NASA Astrophysics Data System (ADS)
Shafii, M.; Tolson, B.; Matott, L. S.
2012-04-01
Hydrologic modeling has benefited from significant developments over the past two decades. This has resulted in building of higher levels of complexity into hydrologic models, which eventually makes the model evaluation process (parameter estimation via calibration and uncertainty analysis) more challenging. In order to avoid unreasonable parameter estimates, many researchers have suggested implementation of multi-criteria calibration schemes. Furthermore, for predictive hydrologic models to be useful, proper consideration of uncertainty is essential. Consequently, recent research has emphasized comprehensive model assessment procedures in which multi-criteria parameter estimation is combined with statistically-based uncertainty analysis routines such as Bayesian inference using Markov Chain Monte Carlo (MCMC) sampling. Such a procedure relies on the use of formal likelihood functions based on statistical assumptions, and moreover, the Bayesian inference structured on MCMC samplers requires a considerably large number of simulations. Due to these issues, especially in complex non-linear hydrological models, a variety of alternative informal approaches have been proposed for uncertainty analysis in the multi-criteria context. This study aims at exploring a number of such informal uncertainty analysis techniques in multi-criteria calibration of hydrological models. The informal methods addressed in this study are (i) Pareto optimality which quantifies the parameter uncertainty using the Pareto solutions, (ii) DDS-AU which uses the weighted sum of objective functions to derive the prediction limits, and (iii) GLUE which describes the total uncertainty through identification of behavioral solutions. The main objective is to compare such methods with MCMC-based Bayesian inference with respect to factors such as computational burden, and predictive capacity, which are evaluated based on multiple comparative measures. The measures for comparison are calculated both for calibration and evaluation periods. The uncertainty analysis methodologies are applied to a simple 5-parameter rainfall-runoff model, called HYMOD.
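Of the informal methods listed, GLUE is the most compact to sketch: sample parameters from a prior, keep "behavioral" sets whose informal likelihood exceeds a threshold, and form prediction limits from the retained ensemble. The `prior_sampler` and `simulate` functions below are placeholders, not HYMOD itself:

```python
import numpy as np

def glue(prior_sampler, simulate, observed, n_samples=10000, threshold=0.6):
    """GLUE sketch: retain 'behavioral' parameter sets whose informal
    likelihood (Nash-Sutcliffe efficiency here) exceeds a threshold,
    then form prediction limits from the retained simulations."""
    sims = []
    for _ in range(n_samples):
        sim = simulate(prior_sampler())
        nse = 1 - (np.sum((sim - observed) ** 2) /
                   np.sum((observed - observed.mean()) ** 2))
        if nse > threshold:          # behavioral solution
            sims.append(sim)
    sims = np.array(sims)
    # Full GLUE uses likelihood-weighted quantiles; plain percentiles
    # of the behavioral ensemble are shown for brevity.
    return np.percentile(sims, 5, axis=0), np.percentile(sims, 95, axis=0)
```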
Optimal block cosine transform image coding for noisy channels
NASA Technical Reports Server (NTRS)
Vaishampayan, V.; Farvardin, N.
1986-01-01
The two-dimensional block transform coding scheme based on the discrete cosine transform has been studied extensively for image coding applications. While this scheme has proven to be efficient in the absence of channel errors, its performance degrades rapidly over noisy channels. A method is presented for the joint source-channel coding optimization of a scheme based on the 2-D block cosine transform when the output of the encoder is to be transmitted over a noisy memoryless channel; the method involves the design of the quantizers used for encoding the transform coefficients. This algorithm produces a set of locally optimum quantizers and the corresponding binary code assignment for the assumed transform coefficient statistics. To determine the optimum bit assignment among the transform coefficients, an algorithm was used based on the steepest descent method, which, under certain convexity conditions on the performance of the channel-optimized quantizers, yields the optimal bit allocation. Comprehensive simulation results for the performance of this locally optimum system over noisy channels were obtained, and appropriate comparisons were made against a reference system designed for no channel errors.
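The transform stage of such a coder applies a 2-D DCT to non-overlapping image blocks; a minimal sketch of that stage (the quantizer design and optimal bit allocation, the actual subject of the paper, are omitted):

```python
import numpy as np
from scipy.fft import dctn

def block_dct(image, block=8):
    """Apply a 2-D type-II DCT to each non-overlapping block."""
    h, w = image.shape
    out = np.empty_like(image, dtype=float)
    for i in range(0, h, block):
        for j in range(0, w, block):
            out[i:i+block, j:j+block] = dctn(
                image[i:i+block, j:j+block], norm="ortho")
    return out

img = np.random.default_rng(3).random((64, 64))
coeffs = block_dct(img)
# Each 8x8 block of `coeffs` would next be quantized coefficient-by-
# coefficient under the chosen bit allocation, then channel coded.
```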
Li, Bo; Tang, Jing; Yang, Qingxia; Cui, Xuejiao; Li, Shuang; Chen, Sijie; Cao, Quanxing; Xue, Weiwei; Chen, Na; Zhu, Feng
2016-01-01
In untargeted metabolomics analysis, several factors (e.g., unwanted experimental & biological variations and technical errors) may hamper the identification of differential metabolic features, which requires the data-driven normalization approaches before feature selection. So far, ≥16 normalization methods have been widely applied for processing the LC/MS based metabolomics data. However, the performance and the sample size dependence of those methods have not yet been exhaustively compared, and no online tool for comparatively and comprehensively evaluating the performance of all 16 normalization methods has been provided. In this study, a comprehensive comparison of these methods was conducted. As a result, the 16 methods were categorized into three groups based on their normalization performances across various sample sizes. The VSN, the Log Transformation and the PQN were identified as the methods with the best normalization performance, while the Contrast consistently underperformed across all sub-datasets of different benchmark data. Moreover, an interactive web tool comprehensively evaluating the performance of the 16 methods specifically for normalizing LC/MS based metabolomics data was constructed and hosted at http://server.idrb.cqu.edu.cn/MetaPre/. In summary, this study could serve as useful guidance for the selection of suitable normalization methods in analyzing the LC/MS based metabolomics data. PMID:27958387
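Two of the best performers identified, the log transformation and PQN, are short enough to sketch; the samples-by-features orientation and the median reference spectrum are assumptions of this illustration:

```python
import numpy as np

def log_transform(X, eps=1.0):
    """Simple log transformation of intensities (samples x features)."""
    return np.log2(X + eps)

def pqn(X):
    """Probabilistic quotient normalization: scale each sample by the
    median quotient against a median reference spectrum."""
    ref = np.median(X, axis=0)                 # reference spectrum
    quotients = X / ref                        # per-feature quotients
    factors = np.median(quotients, axis=1)     # one factor per sample
    return X / factors[:, None]

X = np.random.default_rng(4).lognormal(size=(20, 100))
print(pqn(X).shape, log_transform(X).shape)
```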
School-based clinics: their role in helping students meet the 1990 objectives.
Dryfoos, J G; Klerman, L V
1988-01-01
Service statistics and observations from site visits across the country indicate that school-based clinics (SBCs) may be having an impact on several of the problems targeted in the 1990 health objectives, including unplanned pregnancy and substance abuse. At least 120 junior and senior high schools in 61 communities are currently operating or developing clinics. Growth is attributed to increasing concern about high-risk youth, especially among educators in their roles of "surrogate parents"; to disillusion with categorical interventions and a movement toward more comprehensive services; and to student, parent, school, and community approval of the new programs. This article describes the comprehensive school-based clinic model, including its history, organizational strategies, school/community partnerships, and services.
The Community College Story. Third Edition
ERIC Educational Resources Information Center
Vaughan, George B.
2006-01-01
This concise history of community colleges touches on major themes, including open access and equity, comprehensiveness, community-based philosophy, commitment to teaching, and lifelong learning. The third edition includes revised text as well as updated statistical information, time line, reading list, and Internet resources. In the more than a…
2009-09-01
The Systems Engineering Process (SEP), displayed in Figure 2, is a comprehensive, iterative, and recursive problem…
High-order fuzzy time-series based on multi-period adaptation model for forecasting stock markets
NASA Astrophysics Data System (ADS)
Chen, Tai-Liang; Cheng, Ching-Hsue; Teoh, Hia-Jong
2008-02-01
Stock investors usually make their short-term investment decisions according to recent stock information such as late market news, technical analysis reports, and price fluctuations. To reflect these short-term factors which impact stock prices, this paper proposes a comprehensive fuzzy time-series model, which factors both linear relationships between recent periods of stock prices and fuzzy logical relationships (nonlinear relationships) mined from the time-series into the forecasting process. In the empirical analysis, the TAIEX (Taiwan Stock Exchange Capitalization Weighted Stock Index) and HSI (Hang Seng Index) are employed as experimental datasets, and four recent fuzzy time-series models, Chen's (1996), Yu's (2005), Cheng's (2006), and Chen's (2007), are used as comparison models. Besides, to compare with a conventional statistical method, the method of least squares is utilized to estimate auto-regressive models for the testing periods within the datasets. The performance comparisons indicate that the multi-period adaptation model proposed in this paper can effectively improve the forecasting performance of conventional fuzzy time-series models, which only factor fuzzy logical relationships into the forecasting process. From the empirical study, the traditional statistical method and the proposed model both reveal that stock price patterns in the Taiwan and Hong Kong stock markets are short-term.
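The least-squares auto-regressive baseline used for comparison can be sketched directly; the price series below is synthetic, standing in for the TAIEX/HSI data:

```python
import numpy as np

def fit_ar(series, order=2):
    """Fit an AR(order) model y_t = c + sum_i a_i * y_{t-i} by least squares."""
    y = series[order:]
    lags = np.column_stack([series[order - i - 1:len(series) - i - 1]
                            for i in range(order)])
    X = np.column_stack([np.ones(len(y)), lags])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [c, a_1, ..., a_order]

prices = np.cumsum(np.random.default_rng(5).normal(size=300)) + 100
c, a1, a2 = fit_ar(prices, order=2)
next_price = c + a1 * prices[-1] + a2 * prices[-2]   # one-step forecast
```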
Evolution of semilocal string networks. II. Velocity estimators
NASA Astrophysics Data System (ADS)
Lopez-Eiguren, A.; Urrestilla, J.; Achúcarro, A.; Avgoustidis, A.; Martins, C. J. A. P.
2017-07-01
We continue a comprehensive numerical study of semilocal string networks and their cosmological evolution. These can be thought of as hybrid networks comprised of (nontopological) string segments, whose core structure is similar to that of Abelian Higgs vortices, and whose ends have long-range interactions and behavior similar to that of global monopoles. Our study provides further evidence of a linear scaling regime, already reported in previous studies, for the typical length scale and velocity of the network. We introduce a new algorithm to identify the position of the segment cores. This allows us to determine the length and velocity of each individual segment and follow their evolution in time. We study the statistical distribution of segment lengths and velocities for radiation- and matter-dominated evolution in the regime where the strings are stable. Our segment detection algorithm gives higher length values than previous studies based on indirect detection methods. The statistical distribution shows no evidence of (anti)correlation between the speed and the length of the segments.
Mohler, Rachel E; Dombek, Kenneth M; Hoggard, Jamin C; Pierce, Karisa M; Young, Elton T; Synovec, Robert E
2007-08-01
The first extensive study of yeast metabolite GC x GC-TOFMS data from cells grown under fermenting, R, and respiring, DR, conditions is reported. In this study, recently developed chemometric software for use with three-dimensional instrumentation data was implemented, using a statistically-based Fisher ratio method. The Fisher ratio method is fully automated and will rapidly reduce the data to pinpoint two-dimensional chromatographic peaks differentiating sample types while utilizing all the mass channels. The effect of lowering the Fisher ratio threshold on peak identification was studied. At the lowest threshold (just above the noise level), 73 metabolite peaks were identified, nearly three-fold greater than the number of previously reported metabolite peaks identified (26). In addition to the 73 identified metabolites, 81 unknown metabolites were also located. A Parallel Factor Analysis graphical user interface (PARAFAC GUI) was applied to selected mass channels to obtain a concentration ratio, for each metabolite under the two growth conditions. Of the 73 known metabolites identified by the Fisher ratio method, 54 were statistically changing to the 95% confidence limit between the DR and R conditions according to the rigorous Student's t-test. PARAFAC determined the concentration ratio and provided a fully-deconvoluted (i.e. mathematically resolved) mass spectrum for each of the metabolites. The combination of the Fisher ratio method with the PARAFAC GUI provides high-throughput software for discovery-based metabolomics research, and is novel for GC x GC-TOFMS data due to the use of the entire data set in the analysis (640 MB x 70 runs, double precision floating point).
Lee, Mikyung; Kim, Yangseok
2009-12-16
Genomic alterations frequently occur in many cancer patients and play important mechanistic roles in the pathogenesis of cancer. Furthermore, they can modify the expression level of genes due to altered copy number in the corresponding region of the chromosome. An accumulating body of evidence supports the possibility that strong genome-wide correlation exists between DNA content and gene expression. Therefore, more comprehensive analysis is needed to quantify the relationship between genomic alteration and gene expression. A well-designed bioinformatics tool is essential to perform this kind of integrative analysis. A few programs have already been introduced for integrative analysis. However, there are many limitations in their performance of comprehensive integrated analysis using published software because of limitations in implemented algorithms and visualization modules. To address this issue, we have implemented the Java-based program CHESS to allow integrative analysis of two experimental data sets: genomic alteration and genome-wide expression profile. CHESS is composed of a genomic alteration analysis module and an integrative analysis module. The genomic alteration analysis module detects genomic alteration by applying a threshold based method or SW-ARRAY algorithm and investigates whether the detected alteration is phenotype specific or not. On the other hand, the integrative analysis module measures the genomic alteration's influence on gene expression. It is divided into two separate parts. The first part calculates overall correlation between comparative genomic hybridization ratio and gene expression level by applying following three statistical methods: simple linear regression, Spearman rank correlation and Pearson's correlation. In the second part, CHESS detects the genes that are differentially expressed according to the genomic alteration pattern with three alternative statistical approaches: Student's t-test, Fisher's exact test and Chi square test. By successive operations of two modules, users can clarify how gene expression levels are affected by the phenotype specific genomic alterations. As CHESS was developed in both Java application and web environments, it can be run on a web browser or a local machine. It also supports all experimental platforms if a properly formatted text file is provided to include the chromosomal position of probes and their gene identifiers. CHESS is a user-friendly tool for investigating disease specific genomic alterations and quantitative relationships between those genomic alterations and genome-wide gene expression profiling.
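The correlation computations in CHESS's first integrative part reduce, per gene, to the following (illustrative arrays, not CHESS's actual interface):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
cgh = rng.normal(size=50)                  # CGH log-ratios for one gene across samples
expr = 0.8 * cgh + rng.normal(size=50)     # expression values for the same samples

slope, intercept, r, p_lin, se = stats.linregress(cgh, expr)  # simple linear regression
rho, p_rho = stats.spearmanr(cgh, expr)                       # Spearman rank correlation
r_p, p_pear = stats.pearsonr(cgh, expr)                       # Pearson's correlation
print(f"regression p={p_lin:.3g}, Spearman rho={rho:.2f}, Pearson r={r_p:.2f}")
```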
Time Series Analysis Based on Running Mann Whitney Z Statistics
USDA-ARS?s Scientific Manuscript database
A sensitive and objective time series analysis method based on the calculation of Mann-Whitney U statistics is described. This method samples data rankings over moving time windows, converts those samples to Mann-Whitney U statistics, and then normalizes the U statistics to Z statistics using Monte Carlo simulation.
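A sketch of the running statistic described, using the normal approximation for the U-to-Z conversion in place of the manuscript's Monte Carlo normalization:

```python
import numpy as np
from scipy.stats import mannwhitneyu

def running_mw_z(series, window=12):
    """Slide two adjacent windows along the series and convert each
    Mann-Whitney U to an approximate Z via the normal approximation."""
    n = window
    mu = n * n / 2.0
    sigma = np.sqrt(n * n * (2 * n + 1) / 12.0)
    z = []
    for t in range(n, len(series) - n + 1):
        u = mannwhitneyu(series[t - n:t], series[t:t + n],
                         alternative="two-sided").statistic
        z.append((u - mu) / sigma)
    return np.array(z)

data = np.concatenate([np.random.default_rng(7).normal(0, 1, 60),
                       np.random.default_rng(8).normal(1, 1, 60)])
print(np.abs(running_mw_z(data)).max())   # large |Z| marks the shift point
```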
USDA-ARS?s Scientific Manuscript database
This paper provides a summary of results presented in a much more comprehensive article (Sampson et al. 2014). Specifics regarding methods and statistical procedures can be found in Sampson et al. 2014. Here, we summarize these results for popular cultivars of rabbiteye blueberry (V. virgatum syn. a...
Comparing the Effectiveness of SPSS and EduG Using Different Designs for Generalizability Theory
ERIC Educational Resources Information Center
Teker, Gulsen Tasdelen; Guler, Nese; Uyanik, Gulden Kaya
2015-01-01
Generalizability theory (G theory) provides a broad conceptual framework for social sciences such as psychology and education, and a comprehensive construct for numerous measurement events by using analysis of variance, a strong statistical method. G theory, as an extension of both classical test theory and analysis of variance, is a model which…
1988-09-01
…and selection of test waves. Measured prototype wave data, on which a comprehensive statistical analysis of wave conditions could be based, were… Prior to testing of the various improvement plans, comprehensive tests were conducted for existing conditions (Plate 1)…
Evaluation of Low-Voltage Distribution Network Index Based on Improved Principal Component Analysis
NASA Astrophysics Data System (ADS)
Fan, Hanlu; Gao, Suzhou; Fan, Wenjie; Zhong, Yinfeng; Zhu, Lei
2018-01-01
In order to evaluate the development level of the low-voltage distribution network objectively and scientifically, a hierarchical analysis method is utilized to construct an evaluation index model of the low-voltage distribution network. Based on principal component analysis and the roughly logarithmic distribution of the index data, a logarithmic centralization method is adopted to improve the principal component analysis algorithm. The algorithm can decorrelate and reduce the dimensions of the evaluation model, and the comprehensive score has a better degree of dispersion. A clustering method is adopted to analyze the comprehensive scores because the comprehensive scores of the courts are concentrated; the stratified evaluation of the courts is thereby realized. An example is given to verify the objectivity and scientific validity of the evaluation method.
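The logarithmic centralization described, log-transforming the roughly log-normally distributed index data before centering and extracting principal components, can be sketched as follows; the index matrix is hypothetical:

```python
import numpy as np

def log_centered_pca(X, n_components=2):
    """Log-transform, center, and project onto principal components.
    X: (observations x indexes) matrix of positive index values."""
    L = np.log(X)
    L -= L.mean(axis=0)                        # logarithmic centralization
    _, s, Vt = np.linalg.svd(L, full_matrices=False)
    scores = L @ Vt[:n_components].T           # basis for comprehensive scores
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]

X = np.random.default_rng(9).lognormal(size=(100, 6))
scores, evr = log_centered_pca(X)
print(evr)   # variance ratio captured by the retained components
```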
Orchestrating high-throughput genomic analysis with Bioconductor
Huber, Wolfgang; Carey, Vincent J.; Gentleman, Robert; Anders, Simon; Carlson, Marc; Carvalho, Benilton S.; Bravo, Hector Corrada; Davis, Sean; Gatto, Laurent; Girke, Thomas; Gottardo, Raphael; Hahne, Florian; Hansen, Kasper D.; Irizarry, Rafael A.; Lawrence, Michael; Love, Michael I.; MacDonald, James; Obenchain, Valerie; Oleś, Andrzej K.; Pagès, Hervé; Reyes, Alejandro; Shannon, Paul; Smyth, Gordon K.; Tenenbaum, Dan; Waldron, Levi; Morgan, Martin
2015-01-01
Bioconductor is an open-source, open-development software project for the analysis and comprehension of high-throughput data in genomics and molecular biology. The project aims to enable interdisciplinary research, collaboration and rapid development of scientific software. Based on the statistical programming language R, Bioconductor comprises 934 interoperable packages contributed by a large, diverse community of scientists. Packages cover a range of bioinformatic and statistical applications. They undergo formal initial review and continuous automated testing. We present an overview for prospective users and contributors. PMID:25633503
METHANE EMISSIONS FROM THE NATURAL GAS INDUSTRY VOLUME 4: STATISTICAL METHODOLOGY
The 15-volume report summarizes the results of a comprehensive program to quantify methane (CH4) emissions from the U.S. natural gas industry for the base year. The objective was to determine CH4 emissions beginning at the wellhead and ending downstream at the customer's meter. The accur...
Test Standards for Contingency Base Waste-to-Energy Technologies
2015-08-01
…test runs are preferred to allow a more comprehensive statistical evaluation of the results. … Minimize the complexity, difficulty, and… with water or, in the case of cyanide- or sulfide-bearing wastes, when exposed to mild acidic or basic conditions; 4) explode when subjected to a…
26 CFR 6a.103A-2 - Qualified mortgage bond.
Code of Federal Regulations, 2013 CFR
2013-04-01
... CFR 570.452, by the Secretary of Housing and Urban Development. (E) Statistical and descriptive.... Settlement costs include titling and transfer costs, title insurance, survey fees, or other similar costs... actually $43,000. Such determination is based on a comprehensive survey of residential housing sales in the...
26 CFR 6a.103A-2 - Qualified mortgage bond.
Code of Federal Regulations, 2012 CFR
2012-04-01
... CFR 570.452, by the Secretary of Housing and Urban Development. (E) Statistical and descriptive.... Settlement costs include titling and transfer costs, title insurance, survey fees, or other similar costs... actually $43,000. Such determination is based on a comprehensive survey of residential housing sales in the...
26 CFR 6a.103A-2 - Qualified mortgage bond.
Code of Federal Regulations, 2011 CFR
2011-04-01
... CFR 570.452, by the Secretary of Housing and Urban Development. (E) Statistical and descriptive.... Settlement costs include titling and transfer costs, title insurance, survey fees, or other similar costs... actually $43,000. Such determination is based on a comprehensive survey of residential housing sales in the...
26 CFR 6a.103A-2 - Qualified mortgage bond.
Code of Federal Regulations, 2014 CFR
2014-04-01
... CFR 570.452, by the Secretary of Housing and Urban Development. (E) Statistical and descriptive.... Settlement costs include titling and transfer costs, title insurance, survey fees, or other similar costs... actually $43,000. Such determination is based on a comprehensive survey of residential housing sales in the...
26 CFR 6a.103A-2 - Qualified mortgage bond.
Code of Federal Regulations, 2010 CFR
2010-04-01
... CFR 570.452, by the Secretary of Housing and Urban Development. (E) Statistical and descriptive.... Settlement costs include titling and transfer costs, title insurance, survey fees, or other similar costs... actually $43,000. Such determination is based on a comprehensive survey of residential housing sales in the...
Assessment of heavy metals in sediment in a heavily polluted urban river in the Chaohu Basin, China
NASA Astrophysics Data System (ADS)
Shao, Shiguang; Xue, Lianqing; Liu, Cheng; Shang, Jingge; Wang, Zhaode; He, Xiang; Fan, Chengxin
2016-05-01
The Nanfei River (Anhui Province, China) is a severely polluted urban river that flows into Chaohu Lake. In the present study, sediments were collected from the river and analyzed for their heavy metal contents. Multivariate statistics and the fuzzy comprehensive assessment method were used to determine the sources of pollution, the current pollution status, and spatial and temporal variations in heavy metal pollution in sediments. The concentrations of arsenic (As), cadmium (Cd), chromium (Cr), copper (Cu), mercury (Hg), nickel (Ni), lead (Pb), and zinc (Zn) in sediments ranged from 5.67-113, 0.08-40.2, 41.6-524, 15.5-460, 0.03-4.84, 13.5-180, 18.8-250, and 47.9-1,996 mg/kg, respectively, and the average concentrations of each metal were 1.7, 38.7, 1.8, 5.5, 18.8, 1.3, 2.5, and 11.1 times greater than the background values, respectively. Multivariate statistical analysis demonstrated that Hg, Cu, Cr, Cd, and Ni may have originated from industrial activities, whereas As and Pb came from agricultural activities. The fuzzy comprehensive assessment method, based on fuzzy mathematics theory, was used to obtain a detailed assessment of the sediment quality in the Nanfei River watershed. The results indicated that the pollution was moderate in the downstream tributaries of the Nianbu and Dianbu Rivers, but severe in the main channel of the Nanfei River and in the upstream tributaries of the Sili and Banqiao Rivers. Sediments in the Nanfei River watershed are therefore heavily polluted, and urgent measures should be taken to remedy the situation.
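The fuzzy comprehensive assessment used here maps each concentration to membership degrees in pollution grades and aggregates them with index weights; the sketch below uses triangular memberships and hypothetical grade thresholds, not the study's standards:

```python
import numpy as np

def triangular_membership(x, grade_levels):
    """Membership of concentration x in each pollution grade, using
    simple triangular functions centered on hypothetical grade levels."""
    m = np.zeros(len(grade_levels))
    for k, c in enumerate(grade_levels):
        left = grade_levels[k - 1] if k > 0 else 0.0
        right = grade_levels[k + 1] if k < len(grade_levels) - 1 else c * 2
        if left <= x <= c:
            m[k] = (x - left) / (c - left) if c > left else 1.0
        elif c < x <= right:
            m[k] = (right - x) / (right - c)
    return m

# Hypothetical: 3 metals, 3 grades (clean/moderate/heavy, mg/kg).
weights = np.array([0.5, 0.3, 0.2])          # index weights for the 3 metals
R = np.vstack([triangular_membership(x, [0.5, 2.0, 8.0])
               for x in [1.2, 3.5, 0.4]])    # membership matrix, metals x grades
grade_membership = weights @ R
print(grade_membership.argmax())             # dominant pollution grade
```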
NASA Astrophysics Data System (ADS)
Gao, Chen; Ding, Zhongan; Deng, Bofa; Yan, Shengteng
2017-10-01
Based on the characteristics of the electric energy data acquisition system (EEDAS), and considering the availability of each index and the connections among indices, a performance evaluation index system is established covering three aspects: the master station system, the communication channel, and the terminal equipment. The comprehensive weight of each index is determined by a triangular fuzzy number analytic hierarchy process combined with the entropy weight method, so that both subjective preference and objective attributes are taken into account, making the comprehensive performance evaluation more reasonable and reliable. An example analysis shows that combining the analytic hierarchy process (AHP) and triangular fuzzy numbers (TFN) with an entropy-based comprehensive index evaluation system yields results that are not only convenient and practical but also more objective and accurate.
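The entropy-weight component of this method is standard and compact enough to sketch. In the sketch below, the subjective TFN-AHP weights and the multiply-and-renormalize combination rule are assumptions, since the abstract does not spell them out:

    import numpy as np

    # Rows = evaluated objects, columns = performance indices; values invented.
    X = np.array([[0.82, 0.64, 0.91],
                  [0.75, 0.80, 0.88],
                  [0.90, 0.55, 0.95]])

    P = X / X.sum(axis=0)                        # normalize each index column
    k = 1.0 / np.log(X.shape[0])
    entropy = -k * (P * np.log(P)).sum(axis=0)   # Shannon entropy per index
    divergence = 1.0 - entropy
    w_objective = divergence / divergence.sum()  # objective entropy weights

    w_subjective = np.array([0.5, 0.3, 0.2])     # assumed TFN-AHP output
    w = w_subjective * w_objective               # one common combination rule
    w /= w.sum()
    print(w.round(3))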
Zubkova, O V; Samosiuk, I Z; Polishchuk, O V; Shul'ga, N M; Samosiuk, N I
2012-01-01
The efficacy of magnetic-laser therapy applied according to the method we developed was studied in patients with brain concussion (BC) in the acute period. The study was based on the dynamics of the evoked vestibular potentials and the clinical course of the disease. It was shown that following magnetic-laser therapy combined with traditional pharmacotherapy in the acute period of BC, statistically significant positive changes were registered in the quantitative characteristics of the evoked vestibular brain potentials, and these correlated with the dynamics of the clinical course. The data obtained substantiate the possibility of using magnetic-laser therapy in patients with mild craniocerebral injury in the acute period.
Detecting Disease Specific Pathway Substructures through an Integrated Systems Biology Approach
Alaimo, Salvatore; Marceca, Gioacchino Paolo; Ferro, Alfredo; Pulvirenti, Alfredo
2017-01-01
In the era of network medicine, pathway analysis methods play a central role in the prediction of phenotype from high throughput experiments. In this paper, we present a network-based systems biology approach capable of extracting disease-perturbed subpathways within pathway networks in connection with expression data taken from The Cancer Genome Atlas (TCGA). Our system extends pathways with missing regulatory elements, such as microRNAs, and their interactions with genes. The framework enables the extraction, visualization, and analysis of statistically significant disease-specific subpathways through an easy-to-use web interface. Our analysis shows that the methodology is able to fill the gap in current techniques, allowing a more comprehensive analysis of the phenomena underlying disease states. PMID:29657291
Gillam, Ronald B.; Evans, Julia L.
2016-01-01
Purpose Compared with same-age typically developing peers, school-age children with specific language impairment (SLI) exhibit significant deficits in spoken sentence comprehension. They also demonstrate a range of memory limitations. Whether these 2 deficit areas are related is unclear. The present review article aims to (a) review 2 main theoretical accounts of SLI sentence comprehension and various studies supporting each and (b) offer a new, broader, more integrated memory-based framework to guide future SLI research, as we believe the available evidence favors a memory-based perspective of SLI comprehension limitations. Method We reviewed the literature on the sentence comprehension abilities of English-speaking children with SLI from 2 theoretical perspectives. Results The sentence comprehension limitations of children with SLI appear to be more fully captured by a memory-based perspective than by a syntax-specific deficit perspective. Conclusions Although a memory-based view appears to be the better account of SLI sentence comprehension deficits, this view requires refinement and expansion. Current memory-based perspectives of adult sentence comprehension, with proper modification, offer SLI investigators new, more integrated memory frameworks within which to study and better understand the sentence comprehension abilities of children with SLI. PMID:27973643
Jancey, Jonine; Howat, Peter; Ledger, Melissa; Lee, Andy H.
2013-01-01
Introduction Workplace health promotion programs to prevent overweight and obesity in office-based employees should be evidence-based and comprehensive and should consider behavioral, social, organizational, and environmental factors. The objective of this study was to identify barriers to and enablers of physical activity and nutrition as well as intervention strategies for health promotion in office-based workplaces in the Perth, Western Australia, metropolitan area in 2012. Methods We conducted an online survey of 111 employees from 55 organizations. The online survey investigated demographics, individual and workplace characteristics, barriers and enablers, intervention-strategy preferences, and physical activity and nutrition behaviors. We used χ2 and Mann–Whitney U statistics to test for differences between age and sex groups for barriers and enablers, intervention-strategy preferences, and physical activity and nutrition behaviors. Stepwise multiple regression analysis determined factors that affect physical activity and nutrition behaviors. Results We identified several factors that affected physical activity and nutrition behaviors, including the most common barriers (“too tired” and “access to unhealthy food”) and enablers (“enjoy physical activity” and “nutrition knowledge”). Intervention-strategy preferences demonstrated employee support for health promotion in the workplace. Conclusion The findings provide useful insights into employees’ preferences for interventions; they can be used to develop comprehensive programs for evidence-based workplace health promotion that consider environmental and policy influences as well as the individual. PMID:24028834
NASA Astrophysics Data System (ADS)
Liu, J.
2017-12-01
Accurate estimation of ET is crucial for studies of land-atmosphere interactions. A series of ET products have been developed recently using various simulation methods; however, uncertainties in their accuracy limit their applications. In this study, the accuracies of 8 popular global ET products, simulated from satellite retrievals (ETMODIS and ETZhang), reanalysis (ETJRA55), a machine learning method (ETJung), and land surface models (ETCLM, ETMOS, ETNoah, and ETVIC) forced by the Global Land Data Assimilation System (GLDAS), were comprehensively evaluated against observations from eddy covariance FLUXNET sites by year, land cover, and climate zone. The results show that all simulated ET products tend to underestimate in the lower ET ranges and overestimate in the higher ET ranges compared with ET observations. Examined against four statistical criteria, the root mean square error (RMSE), mean bias error (MBE), R2, and Taylor skill score (TSS), ETJung performed well whether compared by year, land cover, or climate zone. Satellite-based ET products also performed impressively: ETMODIS and ETZhang show comparable accuracy, while each is skilled for different land covers and climate zones. Generally, the ET products from GLDAS show reasonable accuracy, although ETCLM has relatively higher RMSE and MBE in the yearly, land cover, and climate zone comparisons. Although ETJRA55 shows R2 comparable to the other products, its performance is constrained by high RMSE and MBE. Knowledge from this study is important for improving ET products and for selecting among them.
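The four criteria are straightforward to compute from paired simulated and observed values. The sketch below uses one common form of the Taylor skill score (Taylor 2001), with the maximum attainable correlation set to 1; this is an assumption, since the exact variant used in the study is not stated:

    import numpy as np

    def evaluate(sim, obs, r0=1.0):
        # RMSE, MBE, R2, and one common form of the Taylor skill score
        sim, obs = np.asarray(sim, float), np.asarray(obs, float)
        rmse = np.sqrt(np.mean((sim - obs) ** 2))
        mbe = np.mean(sim - obs)
        r = np.corrcoef(sim, obs)[0, 1]
        sigma_ratio = sim.std() / obs.std()
        tss = 4 * (1 + r) / ((sigma_ratio + 1 / sigma_ratio) ** 2 * (1 + r0))
        return {"RMSE": rmse, "MBE": mbe, "R2": r ** 2, "TSS": tss}

    # Toy annual ET values (mm/yr), purely illustrative
    print(evaluate([450, 520, 610, 380], [430, 550, 600, 400]))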
Yan, Chang-An; Zhang, Wanchang; Zhang, Zhijie; Liu, Yuanmin; Deng, Cai; Nie, Ning
2015-01-01
Water quality assessment at the watershed scale requires not only investigating water pollution and recognizing the main pollution factors, but also identifying the polluted high-risk regions responsible for polluted surrounding river sections. To this end, we collected water samples from 67 sampling sites in the Honghe River watershed of China, using a grid-based GIS method, and analyzed six parameters: dissolved oxygen (DO), ammonia nitrogen (NH3-N), nitrate nitrogen (NO3-N), nitrite nitrogen (NO2-N), total nitrogen (TN), and total phosphorus (TP). A single-factor pollution index and a comprehensive pollution index were adopted to identify the main water pollutants and evaluate the water pollution level. Based on these two evaluation methods, geostatistical analysis and a Geographical Information System (GIS) were used to visualize the spatial pollution characteristics and identify potentially polluted high-risk regions. The results indicated that water quality throughout the watershed has been exposed to various pollutants, among which TP, NO2-N, and TN were the main pollutants and seriously exceeded the Category III standard. The zones polluted by TP, TN, DO, NO2-N, and NH3-N covered 99.07%, 62.22%, 59.72%, 37.34%, and 13.82% of the watershed, respectively, with pollution ranging from medium to serious. In total, 83.27% of the watershed was polluted according to the comprehensive index. These conclusions may provide useful and effective information for watershed water pollution control and management. PMID:25768942
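The single-factor index is simply the ratio of a measured concentration to its Category III standard. The abstract does not define the comprehensive index, so the widely used Nemerow form is assumed in this sketch, and all concentrations and limits are invented:

    import numpy as np

    def single_factor(conc, standard):
        # Single-factor pollution index P_i = C_i / S_i
        return np.asarray(conc, float) / np.asarray(standard, float)

    def nemerow(p):
        # Assumed comprehensive index: sqrt((mean(P)^2 + max(P)^2) / 2)
        p = np.asarray(p, float)
        return np.sqrt((p.mean() ** 2 + p.max() ** 2) / 2.0)

    # Invented concentrations and Category III limits (mg/L)
    # for NH3-N, NO2-N, TN, TP
    conc = [1.4, 0.25, 2.1, 0.31]
    limits = [1.0, 0.15, 1.0, 0.20]
    p = single_factor(conc, limits)
    print(p.round(2), round(nemerow(p), 2))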
Lindsay, Kaitlin E; Rühli, Frank J; Deleon, Valerie Burke
2015-06-01
The technique of forensic facial approximation, or reconstruction, is one of many facets of the field of mummy studies. Although far from a rigorous scientific technique, evidence-based visualization of antemortem appearance may supplement radiological, chemical, histological, and epidemiological studies of ancient remains. Published guidelines exist for creating facial approximations, but few approximations are published with documentation of the specific process and references used. Additionally, significant new research has taken place in recent years which helps define best practices in the field. This case study records the facial approximation of a 3,000-year-old ancient Egyptian woman using medical imaging data and the digital sculpting program, ZBrush. It represents a synthesis of current published techniques based on the most solid anatomical and/or statistical evidence. Through this study, it was found that although certain improvements have been made in developing repeatable, evidence-based guidelines for facial approximation, there are many proposed methods still awaiting confirmation from comprehensive studies. This study attempts to assist artists, anthropologists, and forensic investigators working in facial approximation by presenting the recommended methods in a chronological and usable format. © 2015 Wiley Periodicals, Inc.
Geostatistics and GIS: tools for characterizing environmental contamination.
Henshaw, Shannon L; Curriero, Frank C; Shields, Timothy M; Glass, Gregory E; Strickland, Paul T; Breysse, Patrick N
2004-08-01
Geostatistics is a set of statistical techniques used in the analysis of georeferenced data that can be applied to environmental contamination and remediation studies. In this study, the 1,1-dichloro-2,2-bis(p-chlorophenyl)ethylene (DDE) contamination at a Superfund site in western Maryland is evaluated. Concern about the site and its future cleanup has triggered interest within the community because residential development surrounds the area. Spatial statistical methods, of which geostatistics is a subset, are becoming increasingly popular, in part due to the availability of geographic information system (GIS) software in a variety of application packages. In this article, the joint use of ArcGIS software and the R statistical computing environment is demonstrated as an approach for comprehensive geostatistical analyses. The spatial regression method, kriging, is used to provide predictions of DDE levels at unsampled locations both within the site and the surrounding areas where residential development is ongoing.
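A rough Python analogue of the ArcGIS-plus-R workflow is ordinary kriging of log-transformed DDE levels. The sketch below assumes the PyKrige package is available and uses synthetic coordinates and concentrations; variogram and grid choices are illustrative:

    import numpy as np
    from pykrige.ok import OrdinaryKriging

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 1000, 50)                 # synthetic sample locations (m)
    y = rng.uniform(0, 1000, 50)
    dde = np.exp(rng.normal(2.0, 1.0, 50))       # synthetic DDE levels

    ok = OrdinaryKriging(x, y, np.log(dde), variogram_model="spherical")
    gridx = np.arange(0.0, 1000.0, 50.0)
    gridy = np.arange(0.0, 1000.0, 50.0)
    z_pred, z_var = ok.execute("grid", gridx, gridy)  # predictions and variances
    print(z_pred.shape, z_var.shape)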
Vander Zwart, Karlijn E; Geytenbeek, Joke J; de Kleijn, Maaike; Oostrom, Kim J; Gorter, Jan Willem; Hidecker, Mary Jo Cooley; Vermeulen, R Jeroen
2016-02-01
The aims of this study were to determine the intra- and interrater reliability of the Dutch-language version of the Communication Function Classification System (CFCS-NL) and to investigate the association between the CFCS level and (1) spoken language comprehension and (2) preferred method of communication in children with cerebral palsy (CP). Participants were 93 children with CP (50 males, 43 females; mean age 7y, SD 2y 6mo, range 2y 9mo-12y 10mo; unilateral spastic [n=22], bilateral spastic [n=51], dyskinetic [n=15], ataxic [n=3], not specified [n=2]; Gross Motor Function Classification System level I [n=16], II [n=14], III, [n=7], IV [n=24], V [n=31], unknown [n=1]), recruited from rehabilitation centres throughout the Netherlands. Because some centres only contributed to part of the study, different numbers of participants are presented for different aspects of the study. Parents and speech and language therapists (SLTs) classified the communication level using the CFCS. Kappa was used to determine the intra- and interrater reliability. Spearman's correlation coefficient was used to determine the association between CFCS level and spoken language comprehension, and Fisher's exact test was used to examine the association between the CFCS level and method of communication. Interrater reliability of the CFCS-NL between parents and SLTs was fair (r=0.54), between SLTs good (r=0.78), and the intrarater (SLT) reliability very good (r=0.85). The association between the CFCS and spoken language comprehension was strong for SLTs (r=0.63) and moderate for parents (r=0.51). There was a statistically significant difference between the CFCS level and the preferred method of communication of the child (p<0.01). Also, CFCS level classification showed a statistically significant difference between parents and SLTs (p<0.01). These data suggest that the CFCS-NL is a valid and reliable clinical tool to classify everyday communication in children with CP. Preferably, professionals should classify the child's CFCS level in collaboration with the parents to acquire the most comprehensive information about the everyday communication of the child in various situations both with familiar and with unfamiliar partners. © 2015 Mac Keith Press.
Probabilistic arithmetic automata and their applications.
Marschall, Tobias; Herms, Inke; Kaltenbach, Hans-Michael; Rahmann, Sven
2012-01-01
We present a comprehensive review on probabilistic arithmetic automata (PAAs), a general model to describe chains of operations whose operands depend on chance, along with two algorithms to numerically compute the distribution of the results of such probabilistic calculations. PAAs provide a unifying framework to approach many problems arising in computational biology and elsewhere. We present five different applications, namely 1) pattern matching statistics on random texts, including the computation of the distribution of occurrence counts, waiting times, and clump sizes under hidden Markov background models; 2) exact analysis of window-based pattern matching algorithms; 3) sensitivity of filtration seeds used to detect candidate sequence alignments; 4) length and mass statistics of peptide fragments resulting from enzymatic cleavage reactions; and 5) read length statistics of 454 and IonTorrent sequencing reads. The diversity of these applications indicates the flexibility and unifying character of the presented framework. While the construction of a PAA depends on the particular application, we single out a frequently applicable construction method: We introduce deterministic arithmetic automata (DAAs) to model deterministic calculations on sequences, and demonstrate how to construct a PAA from a given DAA and a finite-memory random text model. This procedure is used for all five discussed applications and greatly simplifies the construction of PAAs. Implementations are available as part of the MoSDi package. Its application programming interface facilitates the rapid development of new applications based on the PAA framework.
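To make the PAA construction concrete, here is a minimal sketch (not the MoSDi implementation) of the first application listed: the exact distribution of the number of possibly overlapping pattern occurrences in an i.i.d. random text. A KMP-style automaton supplies the deterministic part, and a joint distribution over (state, count) is propagated:

    from collections import defaultdict

    def kmp_automaton(pattern, alphabet):
        # delta[q][ch] = length of the longest pattern prefix that is a
        # suffix of (matched prefix of length q) + ch
        m = len(pattern)
        delta = [dict() for _ in range(m + 1)]
        for q in range(m + 1):
            for ch in alphabet:
                k = min(m, q + 1)
                while k > 0 and pattern[:k] != (pattern[:q] + ch)[-k:]:
                    k -= 1
                delta[q][ch] = k
        return delta

    def count_distribution(pattern, letter_probs, n):
        # Exact distribution of the occurrence count in an i.i.d. text of length n
        delta = kmp_automaton(pattern, list(letter_probs))
        m = len(pattern)
        dist = {(0, 0): 1.0}            # (automaton state, count) -> probability
        for _ in range(n):
            nxt = defaultdict(float)
            for (q, c), p in dist.items():
                for ch, p_ch in letter_probs.items():
                    q2 = delta[q][ch]
                    nxt[(q2, c + (q2 == m))] += p * p_ch
            dist = nxt
        counts = defaultdict(float)
        for (q, c), p in dist.items():
            counts[c] += p
        return dict(sorted(counts.items()))

    print(count_distribution("ab", {"a": 0.5, "b": 0.5}, n=5))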
Comprehensive risk assessment method of catastrophic accident based on complex network properties
NASA Astrophysics Data System (ADS)
Cui, Zhen; Pang, Jun; Shen, Xiaohong
2017-09-01
At the macro level, the structural properties of the network, together with the electrical characteristics of its micro-level components, determine the risk of cascading failures. Because a cascading failure is a dynamically developing process, both its direct risk and its potential risk should be considered. In this paper, we comprehensively consider the direct and potential risks of failures based on uncertain risk analysis theory and connection number theory, quantify the uncertain correlation using node degree and node clustering coefficient, and then establish a comprehensive risk indicator of failure. The proposed method is validated by simulation on an actual power grid: a network is modeled according to the actual grid, and the rationality of the method is verified.
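The abstract names node degree and node clustering coefficient as the structural ingredients; the combination rule in the sketch below is purely illustrative, since the paper's connection-number formulation is not given here:

    import networkx as nx

    G = nx.karate_club_graph()          # stand-in for a power-grid topology
    deg = dict(G.degree())
    clust = nx.clustering(G)

    max_deg = max(deg.values())
    # Illustrative equal-weight score only; the paper's actual fusion of
    # direct and potential risk via connection numbers is not reproduced.
    risk = {n: 0.5 * deg[n] / max_deg + 0.5 * clust[n] for n in G}
    top = sorted(risk, key=risk.get, reverse=True)[:5]
    print([(n, round(risk[n], 3)) for n in top])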
NASA Astrophysics Data System (ADS)
Oliveira, Sérgio C.; Zêzere, José L.; Lajas, Sara; Melo, Raquel
2017-07-01
Approaches used to assess shallow slide susceptibility at the basin scale are conceptually different depending on the use of statistical or physically based methods. The former are based on the assumption that the same causes are more likely to produce the same effects, whereas the latter are based on the comparison between forces which tend to promote movement along the slope and the counteracting forces that are resistant to motion. Within this general framework, this work tests two hypotheses: (i) although conceptually and methodologically distinct, the statistical and deterministic methods generate similar shallow slide susceptibility results regarding the model's predictive capacity and spatial agreement; and (ii) the combination of shallow slide susceptibility maps obtained with statistical and physically based methods, for the same study area, generate a more reliable susceptibility model for shallow slide occurrence. These hypotheses were tested at a small test site (13.9 km2) located north of Lisbon (Portugal), using a statistical method (the information value method, IV) and a physically based method (the infinite slope method, IS). The landslide susceptibility maps produced with the statistical and deterministic methods were combined into a new landslide susceptibility map. The latter was based on a set of integration rules defined by the cross tabulation of the susceptibility classes of both maps and analysis of the corresponding contingency tables. The results demonstrate a higher predictive capacity of the new shallow slide susceptibility map, which combines the independent results obtained with statistical and physically based models. Moreover, the combination of the two models allowed the identification of areas where the results of the information value and the infinite slope methods are contradictory. Thus, these areas were classified as uncertain and deserve additional investigation at a more detailed scale.
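For the statistical half of the comparison, the information value of a factor class is conventionally the log of the ratio between the landslide density inside the class and the density over the whole area; summing the class scores of all factor maps per cell yields the susceptibility score. A numpy sketch under that standard definition (the paper's exact variant may differ):

    import numpy as np

    def information_value(class_map, landslide_mask):
        # IV per class: ln((slides in class / cells in class) / (slides / cells))
        n, n_slides = class_map.size, landslide_mask.sum()
        iv = {}
        for j in np.unique(class_map):
            in_j = class_map == j
            dens_j = landslide_mask[in_j].sum() / in_j.sum()
            dens_all = n_slides / n
            iv[j] = np.log(dens_j / dens_all) if dens_j > 0 else np.nan
        return iv

    rng = np.random.default_rng(1)
    slope_class = rng.integers(0, 3, size=(100, 100))  # synthetic factor map
    slides = rng.random((100, 100)) < 0.02             # synthetic inventory
    print(information_value(slope_class, slides))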
The research on the fairness of carbon emissions for China's energy based on GIS
NASA Astrophysics Data System (ADS)
Wang, Qiuxian; Gao, Zhiqiang; Ning, Jicai; Lu, Qingshui; Shi, Runhe; Gao, Wei
2013-09-01
This article first calculates China's energy-related carbon emissions for 30 provinces in 2010, using the 2006 IPCC carbon emission inventory method and data from the China Energy Statistical Yearbook, and then calculates carbon emission intensity using GDP data from the China Statistical Yearbook. Next, using the formulated equations, the EEI (Economic Efficiency Index) and ECI (Ecological Carrying Index) are calculated and mapped with the help of GIS to analyze the fairness of China's energy CO2 emissions in 2010. The results show that China's energy CO2 emissions in 2010 decrease outward from the Bohai Bay region, with the westernmost provinces having the lowest emissions. The intensity of China's energy CO2 emissions in 2010 increases from southeast China toward north China. The distributions of EEI and ECI for China's energy CO2 emissions differ considerably from each other, and from their combined result. As to the fairness of China's energy CO2 emissions in 2010, the southern provinces fare better than the Bohai Bay areas (except Beijing and Tianjin).
From reading numbers to seeing ratios: a benefit of icons for risk comprehension.
Tubau, Elisabet; Rodríguez-Ferreiro, Javier; Barberia, Itxaso; Colomé, Àngels
2018-06-21
Promoting a better understanding of statistical data is becoming increasingly important for improving risk comprehension and decision-making. In this regard, previous studies on Bayesian problem solving have shown that iconic representations help infer frequencies in sets and subsets. Nevertheless, the mechanisms by which icons enhance performance remain unclear. Here, we tested the hypothesis that the benefit offered by icon arrays lies in a better alignment between presented and requested relationships, which should facilitate the comprehension of the requested ratio beyond the represented quantities. To this end, we analyzed individual risk estimates based on data presented either in standard verbal presentations (percentages and natural frequency formats) or as icon arrays. Compared to the other formats, icons led to estimates that were more accurate, and importantly, promoted the use of equivalent expressions for the requested probability. Furthermore, whereas the accuracy of the estimates based on verbal formats depended on their alignment with the text, all the estimates based on icons were equally accurate. Therefore, these results support the proposal that icons enhance the comprehension of the ratio and its mapping onto the requested probability and point to relational misalignment as potential interference for text-based Bayesian reasoning. The present findings also argue against an intrinsic difficulty with understanding single-event probabilities.
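The ratio that icon arrays make visible is the positive predictive value of the Bayesian problem; computed with natural frequencies and invented screening numbers:

    def ppv(base_rate, sensitivity, false_positive_rate, population=1000):
        # Positive predictive value via natural frequencies
        sick = population * base_rate
        true_pos = sick * sensitivity
        false_pos = (population - sick) * false_positive_rate
        return true_pos / (true_pos + false_pos)

    # With a 1% base rate, 80% sensitivity, and a 9.6% false-positive rate,
    # only about 8 of the ~103 positives per 1000 people are true positives.
    print(round(ppv(0.01, 0.80, 0.096), 3))   # ~0.078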
Chen Peng; Ao Li
2017-01-01
The emergence of multi-dimensional data offers opportunities for more comprehensive analysis of the molecular characteristics of human diseases, and therefore for improving diagnosis, treatment, and prevention. In this study, we propose a heterogeneous network based method integrating multi-dimensional data (HNMD) to identify GBM-related genes. The novelty of the method lies in combining the multi-dimensional GBM data from the TCGA dataset, which provide comprehensive information about genes, with protein-protein interactions to construct a weighted heterogeneous network that reflects both the general and the disease-specific relationships between genes. In addition, a propagation algorithm with resistance is introduced to precisely score and rank GBM-related genes. The results of a comprehensive performance evaluation show that the proposed method significantly outperforms network based methods using single-dimensional data as well as other existing approaches. Subsequent analysis of the top ranked genes suggests they may be functionally implicated in GBM, which further corroborates the superiority of the proposed method. The source code and the results of HNMD can be downloaded from the following URL: http://bioinformatics.ustc.edu.cn/hnmd/ .
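The "propagation algorithm with resistance" is described only at a high level. The sketch below shows the closely related random-walk-with-restart iteration, in which the restart term plays the resistance-like role of pulling scores back toward the seed genes; this is an assumption, not the authors' exact update rule:

    import numpy as np

    def propagate(W, seeds, restart=0.5, tol=1e-8):
        # Iterate p <- (1 - r) * W_norm @ p + r * p0 until convergence
        W = np.asarray(W, float)
        W_norm = W / W.sum(axis=0, keepdims=True)   # column-normalize
        p0 = np.zeros(W.shape[0])
        p0[list(seeds)] = 1.0 / len(seeds)
        p = p0.copy()
        while True:
            p_new = (1 - restart) * W_norm @ p + restart * p0
            if np.abs(p_new - p).sum() < tol:
                return p_new
            p = p_new

    W = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], float)             # toy interaction network
    print(propagate(W, seeds=[0]).round(3))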
Staudacher, Erich M.; Huetteroth, Wolf; Schachtner, Joachim; Daly, Kevin C.
2009-01-01
A central problem facing studies of neural encoding in sensory systems is how to accurately quantify the extent of spatial and temporal responses. In this study, we take advantage of the relatively simple and stereotypic neural architecture found in invertebrates. We combine standard electrophysiological techniques, recently developed population analysis techniques, and novel anatomical methods to form an innovative 4-dimensional view of odor output representations in the antennal lobe of the moth Manduca sexta. This novel approach allows quantification of olfactory responses of characterized neurons with spike time resolution. Additionally, arbitrary integration windows can be used for comparisons with other methods such as imaging. By assigning statistical significance to changes in neuronal firing, this method can visualize activity across the entire antennal lobe. The resulting 4-dimensional representation of antennal lobe output complements imaging and multi-unit experiments yet provides a more comprehensive and accurate view of glomerular activation patterns in spike time resolution. PMID:19464513
A preliminary study of DTI Fingerprinting on stroke analysis.
Ma, Heather T; Ye, Chenfei; Wu, Jun; Yang, Pengfei; Chen, Xuhui; Yang, Zhengyi; Ma, Jingbo
2014-01-01
DTI (Diffusion Tensor Imaging) is a well-known MRI (Magnetic Resonance Imaging) technique that provides useful structural information about the human brain. However, quantitative measurement of the physiological variation among subtypes of ischemic stroke is not available, and an automatic, quantitative method for DTI analysis would enhance the application of DTI in the clinic. In this study, we propose a DTI Fingerprinting technology to quantitatively analyze white matter tissue, applied here to stroke classification. The TBSS (Tract-Based Spatial Statistics) method was employed to generate masks automatically. To evaluate the clustering performance of the automatic method, lesion ROIs (Regions of Interest) were manually drawn on the DWI images as a reference, and the results from DTI Fingerprinting were compared with those obtained from the reference ROIs. The comparison indicates that DTI Fingerprinting can identify different states of ischemic stroke and has promising potential to provide a more comprehensive measure of DTI data. Further development should be carried out to improve DTI Fingerprinting technology for clinical use.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Adams, Brian M.; Ebeida, Mohamed Salah; Eldred, Michael S
The Dakota (Design Analysis Kit for Optimization and Terascale Applications) toolkit provides a flexible and extensible interface between simulation codes and iterative analysis methods. Dakota contains algorithms for optimization with gradient and nongradient-based methods; uncertainty quantification with sampling, reliability, and stochastic expansion methods; parameter estimation with nonlinear least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods. These capabilities may be used on their own or as components within advanced strategies such as surrogate-based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing object-oriented design to implement abstractions of the key components required for iterative systems analyses, the Dakota toolkit provides a flexible and extensible problem-solving environment for design and performance analysis of computational models on high performance computers. This report serves as a theoretical manual for selected algorithms implemented within the Dakota software. It is not intended as a comprehensive theoretical treatment, since a number of existing texts cover general optimization theory, statistical analysis, and other introductory topics. Rather, this manual is intended to summarize a set of Dakota-related research publications in the areas of surrogate-based optimization, uncertainty quantification, and optimization under uncertainty that provide the foundation for many of Dakota's iterative analysis capabilities.
Schott, Ann-Sophie; Behr, Jürgen; Quinn, Jennifer; Vogel, Rudi F.
2016-01-01
Lactic acid bacteria (LAB) are widely used as starter cultures in the manufacture of foods. Upon preparation, these cultures undergo various stresses resulting in losses of survival and fitness. In order to find conditions for the subsequent identification of proteomic biomarkers and their exploitation for preconditioning of strains, we subjected Lactobacillus (Lb.) paracasei subsp. paracasei TMW 1.1434 (F19) to different stress qualities (osmotic stress, oxidative stress, temperature stress, pH stress and starvation stress). We analysed the dynamics of its stress responses based on the expression of stress proteins using MALDI-TOF mass spectrometry (MS), which has so far been used for species identification. Exploiting the methodology of accumulating protein expression profiles by MALDI-TOF MS followed by statistical evaluation with cluster analysis and discriminant analysis of principal components (DAPC), it was possible to monitor the expression of low molecular weight stress proteins, identify a specific time point when the expression of stress proteins reached its maximum, and statistically differentiate types of adaptive responses into groups. Beyond the specific result for F19 and its stress response, these results demonstrate the discriminatory power of MALDI-TOF MS to characterize even the dynamics of stress responses of bacteria and enable a knowledge-based focus on the laborious identification of biomarkers and stress proteins. To our knowledge, the implementation of MALDI-TOF MS protein profiling for the fast and comprehensive analysis of various stress responses is new to the field of bacterial stress responses. Consequently, we generally propose MALDI-TOF MS as an easy and quick method to characterize responses of microbes to different environmental conditions, to focus efforts of more elaborate approaches on time points and dynamics of stress responses. PMID:27783652
Enriched pathways for major depressive disorder identified from a genome-wide association study.
Kao, Chung-Feng; Jia, Peilin; Zhao, Zhongming; Kuo, Po-Hsiu
2012-11-01
Major depressive disorder (MDD) has caused a substantial burden of disease worldwide with moderate heritability. Despite efforts through conducting numerous association studies and now, genome-wide association (GWA) studies, the success of identifying susceptibility loci for MDD has been limited, which is partially attributed to the complex nature of depression pathogenesis. A pathway-based analytic strategy to investigate the joint effects of various genes within specific biological pathways has emerged as a powerful tool for complex traits. The present study aimed to identify enriched pathways for depression using a GWA dataset for MDD. For each gene, we estimated its gene-wise p value using combined and minimum p value, separately. Canonical pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) and BioCarta were used. We employed four pathway-based analytic approaches (gene set enrichment analysis, hypergeometric test, sum-square statistic, sum-statistic). We adjusted for multiple testing using Benjamini & Hochberg's method to report significant pathways. We found 17 significantly enriched pathways for depression, which presented low-to-intermediate crosstalk. The top four pathways were long-term depression (p ⩽ 1×10^-5), calcium signalling (p ⩽ 6×10^-5), arrhythmogenic right ventricular cardiomyopathy (p ⩽ 1.6×10^-4) and cell adhesion molecules (p ⩽ 2.2×10^-4). In conclusion, our comprehensive pathway analyses identified promising pathways for depression that are related to neurotransmitter and neuronal systems, immune system and inflammatory response, which may be involved in the pathophysiological mechanisms underlying depression. We demonstrated that pathway enrichment analysis is promising to facilitate our understanding of complex traits through a deeper interpretation of GWA data. Application of this comprehensive analytic strategy in upcoming GWA data for depression could validate the findings reported in this study.
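Of the four approaches, the hypergeometric test is the most compact to illustrate: with scipy, the enrichment p value for one pathway is the upper tail of the hypergeometric distribution. All gene counts below are invented:

    from scipy.stats import hypergeom

    N = 18000   # background genes (invented)
    K = 150     # genes annotated to the pathway
    n = 800     # significant genes from the gene-wise p values
    k = 18      # overlap: significant genes inside the pathway

    # P(X >= k) for X ~ Hypergeom(N, K, n)
    p_value = hypergeom.sf(k - 1, N, K, n)
    print(f"enrichment p = {p_value:.3g}")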
NASA Astrophysics Data System (ADS)
Boudard, Emmanuel; Morlaix, Sophie
2003-09-01
This article addresses the main predictors of adult education, using statistical methods different from those generally used by social science researchers. Its aim is twofold. First, it seeks to explain in a simple and comprehensible manner the methodological value of these methods (in relation to the use of structural models); secondly, it demonstrates the concrete usefulness of these methods on the basis of a recent piece of research on the data from the International Adult Literacy Survey (IALS).
Assessment of NDE reliability data
NASA Technical Reports Server (NTRS)
Yee, B. G. W.; Couchman, J. C.; Chang, F. H.; Packman, D. F.
1975-01-01
Twenty sets of relevant nondestructive test (NDT) reliability data were identified, collected, compiled, and categorized. A criterion for the selection of data for statistical analysis considerations was formulated, and a model to grade the quality and validity of the data sets was developed. Data input formats, which record the pertinent parameters of the defect/specimen and inspection procedures, were formulated for each NDE method. A comprehensive computer program was written and debugged to calculate the probability of flaw detection at several confidence limits by the binomial distribution. This program also selects the desired data sets for pooling and tests the statistical pooling criteria before calculating the composite detection reliability. An example of the calculated reliability of crack detection in bolt holes by an automatic eddy current method is presented.
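The binomial detection-probability calculation can be sketched as an exact (Clopper-Pearson) one-sided lower confidence bound on the probability of detection (POD); this is a standard form, though the report's own pooling and confidence conventions may differ:

    from scipy.stats import beta

    def pod_lower_bound(k, n, confidence=0.95):
        # One-sided lower Clopper-Pearson bound on detection probability
        if k == 0:
            return 0.0
        return beta.ppf(1 - confidence, k, n - k + 1)

    # e.g. 28 of 29 flaws detected -> POD above roughly 0.85 at 95% confidence
    print(round(pod_lower_bound(28, 29), 3))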
NASA Astrophysics Data System (ADS)
Li, Jingwan; Sharma, Ashish; Evans, Jason; Johnson, Fiona
2018-01-01
Addressing systematic biases in regional climate model simulations of extreme rainfall is a necessary first step before assessing changes in future rainfall extremes. Commonly used bias correction methods are designed to match statistics of the overall simulated rainfall with observations. This assumes that change in the mix of different types of extreme rainfall events (i.e. convective and non-convective) in a warmer climate is of little relevance in the estimation of overall change, an assumption that is not supported by empirical or physical evidence. This study proposes an alternative approach to account for the potential change of alternate rainfall types, characterized here by synoptic weather patterns (SPs) using self-organizing maps classification. The objective of this study is to evaluate the added influence of SPs on the bias correction, which is achieved by comparing the corrected distribution of future extreme rainfall with that using conventional quantile mapping. A comprehensive synthetic experiment is first defined to investigate the conditions under which the additional information of SPs makes a significant difference to the bias correction. Using over 600,000 synthetic cases, statistically significant differences are found to be present in 46% cases. This is followed by a case study over the Sydney region using a high-resolution run of the Weather Research and Forecasting (WRF) regional climate model, which indicates a small change in the proportions of the SPs and a statistically significant change in the extreme rainfall over the region, although the differences between the changes obtained from the two bias correction methods are not statistically significant.
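A minimal version of the proposed conditioning is empirical quantile mapping applied separately within each synoptic-pattern class. The sketch below uses synthetic data and simple empirical CDF matching; the study's actual implementation may differ in detail:

    import numpy as np

    def quantile_map(model_hist, obs_hist, model_fut):
        # Empirical quantile mapping: pass future values through historical CDFs
        q = np.linspace(0, 1, 101)
        mq = np.quantile(model_hist, q)
        oq = np.quantile(obs_hist, q)
        return np.interp(model_fut, mq, oq)

    def quantile_map_by_sp(model_hist, obs_hist, model_fut, sp_hist, sp_fut):
        # Apply quantile mapping separately within each synoptic pattern
        out = np.empty_like(model_fut, dtype=float)
        for sp in np.unique(sp_fut):
            out[sp_fut == sp] = quantile_map(model_hist[sp_hist == sp],
                                             obs_hist[sp_hist == sp],
                                             model_fut[sp_fut == sp])
        return out

    rng = np.random.default_rng(2)
    sp_h = rng.integers(0, 2, 2000)          # two synthetic SP classes
    obs_h = rng.gamma(2 + sp_h, 5, 2000)     # synthetic observed rainfall
    mod_h = rng.gamma(2 + sp_h, 6, 2000)     # model biased within each class
    sp_f = rng.integers(0, 2, 2000)
    mod_f = rng.gamma(2 + sp_f, 6, 2000)
    print(quantile_map_by_sp(mod_h, obs_h, mod_f, sp_h, sp_f)[:5].round(2))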
ERIC Educational Resources Information Center
Osler, James Edward, II; Mansaray, Mahmud
2014-01-01
Many universities and colleges are increasingly concerned about enhancing the comprehension and knowledge of their students, particularly in the classroom. One method of enhancing student success is improving teaching effectiveness. The objective of this research paper is to propose a novel research model which examines the relationship between…
Advances in segmentation modeling for health communication and social marketing campaigns.
Albrecht, T L; Bryant, C
1996-01-01
Large-scale communication campaigns for health promotion and disease prevention involve analysis of audience demographic and psychographic factors for effective message targeting. A variety of segmentation modeling techniques, including tree-based methods such as Chi-squared Automatic Interaction Detection (CHAID) as well as logistic regression, are used to identify meaningful target groups within a large sample or population (N = 750-1,000+). Such groups are based on statistically significant combinations of factors (e.g., gender, marital status, and personality predispositions). The identification of groups or clusters facilitates message design in order to address the particular needs, attention patterns, and concerns of audience members within each group. We review current segmentation techniques, their contributions to conceptual development, and cost-effective decision making. Examples from a major study in which these strategies were used are provided from the Texas Women, Infants and Children Program's Comprehensive Social Marketing Program.
Experience and Sentence Processing: Statistical Learning and Relative Clause Comprehension
Wells, Justine B.; Christiansen, Morten H.; Race, David S.; Acheson, Daniel J.; MacDonald, Maryellen C.
2009-01-01
Many explanations of the difficulties associated with interpreting object relative clauses appeal to the demands that object relatives make on working memory. MacDonald and Christiansen (2002) pointed to variations in reading experience as a source of differences, arguing that the unique word order of object relatives makes their processing more difficult and more sensitive to the effects of previous experience than the processing of subject relatives. This hypothesis was tested in a large-scale study manipulating reading experiences of adults over several weeks. The group receiving relative clause experience increased reading speeds for object relatives more than for subject relatives, whereas a control experience group did not. The reading time data were compared to performance of a computational model given different amounts of experience. The results support claims for experience-based individual differences and an important role for statistical learning in sentence comprehension processes. PMID:18922516
Fatty acid, cholesterol, vitamin, and mineral content of cooked beef cuts from a national study
USDA-ARS?s Scientific Manuscript database
The U.S. Department of Agriculture (USDA) provides foundational nutrient data for U.S. and international databases. For currency of retail beef data in USDA’s database, a nationwide comprehensive study obtained samples by primal categories using a statistically based sampling plan, resulting in 72 ...
Comprehensive database of diameter-based biomass regressions for North American tree species
Jennifer C. Jenkins; David C. Chojnacky; Linda S. Heath; Richard A. Birdsey
2004-01-01
A database of 2,640 equations for predicting the biomass of trees and tree components from diameter measurements of species found in North America was compiled from the literature. Bibliographic information, geographic locations, diameter limits, diameter and biomass units, equation forms, statistical errors, and coefficients are provided for each equation,...
Interpretation of statistical results.
García Garmendia, J L; Maroto Monserrat, F
2018-02-21
The appropriate interpretation of statistical results is crucial to understanding advances in medical science. Statistical tools allow us to transform the uncertainty and apparent chaos of nature into measurable parameters that are applicable to our clinical practice. Understanding the meaning and actual extent of these instruments is essential for researchers, for the funders of research, and for professionals who require permanent updating based on good evidence and support for decision making. Various aspects of designs, results and statistical analysis are reviewed, trying to facilitate their comprehension from the basics to what is most common but not better understood, and offering a constructive, non-exhaustive but realistic look. Copyright © 2018 Elsevier España, S.L.U. y SEMICYUC. All rights reserved.
Crowdsourcing Participatory Evaluation of Medical Pictograms Using Amazon Mechanical Turk
Willis, Matt; Sun, Peiyuan; Wang, Jun
2013-01-01
Background Consumer and patient participation proved to be an effective approach for medical pictogram design, but it can be costly and time-consuming. We proposed and evaluated an inexpensive approach that crowdsourced the pictogram evaluation task to Amazon Mechanical Turk (MTurk) workers, who are usually referred to as the “turkers”. Objective To answer two research questions: (1) Is the turkers’ collective effort effective for identifying design problems in medical pictograms? and (2) Do the turkers’ demographic characteristics affect their performance in medical pictogram comprehension? Methods We designed a Web-based survey (open-ended tests) to ask 100 US turkers to type in their guesses of the meaning of 20 US pharmacopeial pictograms. Two judges independently coded the turkers’ guesses into four categories: correct, partially correct, wrong, and completely wrong. The comprehensibility of a pictogram was measured by the percentage of correct guesses, with each partially correct guess counted as 0.5 correct. We then conducted a content analysis on the turkers’ interpretations to identify misunderstandings and assess whether the misunderstandings were common. We also conducted a statistical analysis to examine the relationship between turkers’ demographic characteristics and their pictogram comprehension performance. Results The survey was completed within 3 days of our posting the task to the MTurk, and the collected data are publicly available in the multimedia appendix for download. The comprehensibility for the 20 tested pictograms ranged from 45% to 98%, with an average of 72.5%. The comprehensibility scores of 10 pictograms were strongly correlated to the scores of the same pictograms reported in another study that used oral response–based open-ended testing with local people. The turkers’ misinterpretations shared common errors that exposed design problems in the pictograms. Participant performance was positively correlated with their educational level. Conclusions The results confirmed that crowdsourcing can be used as an effective and inexpensive approach for participatory evaluation of medical pictograms. Through Web-based open-ended testing, the crowd can effectively identify problems in pictogram designs. The results also confirmed that education has a significant effect on the comprehension of medical pictograms. Since low-literate people are underrepresented in the turker population, further investigation is needed to examine to what extent turkers’ misunderstandings overlap with those elicited from low-literate people. PMID:23732572
Bednarz, Haley M; Maximo, Jose O; Murdaugh, Donna L; O'Kelley, Sarah; Kana, Rajesh K
2017-06-01
Despite intact decoding ability, deficits in reading comprehension are relatively common in children with autism spectrum disorders (ASD). However, few neuroimaging studies have tested the neural bases of this specific profile of reading deficit in ASD. This fMRI study examined activation and synchronization of the brain's reading network in children with ASD with specific reading comprehension deficits during a word similarities task. Thirteen typically developing children and 18 children with ASD performed the task in the MRI scanner. No statistically significant group differences in functional activation were observed; however, children with ASD showed decreased functional connectivity between the left inferior frontal gyrus (LIFG) and the left inferior occipital gyrus (LIOG). In addition, reading comprehension ability significantly positively predicted functional connectivity between the LIFG and left thalamus (LTHAL) among all subjects. The results of this study provide evidence for altered recruitment of reading-related neural resources in ASD children and suggest specific weaknesses in top-down modulation of semantic processing. Copyright © 2017 Elsevier Inc. All rights reserved.
2012-01-01
Background Teaching evidence-based medicine (EBM) should be evaluated and guided by evidence of its own effectiveness. However, no data are available on adoption of EBM by Syrian undergraduate, postgraduate, or practicing physicians. In fact, the teaching of EBM in Syria is not yet a part of undergraduate medical curricula. The authors evaluated education of evidence-based medicine through a two-day intensive training course. Methods The authors evaluated education of evidence-based medicine through a two-day intensive training course that took place in 2011. The course included didactic lectures as well as interactive hands-on workshops on all topics of EBM. A comprehensive questionnaire, that included the Berlin questionnaire, was used to inspect medical students’ awareness of, attitudes toward, and competencies’ in EBM. Results According to students, problems facing proper EBM practice in Syria were the absence of the following: an EBM teaching module in medical school curriculum (94%), role models among professors and instructors (92%), a librarian (70%), institutional subscription to medical journals (94%), and sufficient IT hardware (58%). After the course, there was a statistically significant increase in medical students' perceived ability to go through steps of EBM, namely: formulating PICO questions (56.9%), searching for evidence (39.8%), appraising the evidence (27.3%), understanding statistics (48%), and applying evidence at point of care (34.1%). However, mean increase in Berlin scores after the course was 2.68, a non-statistically significant increase of 17.86%. Conclusion The road to a better EBM reality in Syria starts with teaching EBM in medical school and developing the proper environment to facilitate transforming current medical education and practice to an evidence-based standard in Syria. PMID:22882872
Kim, Yoonsang; Choi, Young-Ku; Emery, Sherry
2013-08-01
Several statistical packages are capable of estimating generalized linear mixed models and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods' performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performances of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages-SAS GLIMMIX Laplace and SuperMix Gaussian quadrature-perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages in regard to sample sizes.
Dosimetric treatment course simulation based on a statistical model of deformable organ motion
NASA Astrophysics Data System (ADS)
Söhn, M.; Sobotta, B.; Alber, M.
2012-06-01
We present a method of modeling dosimetric consequences of organ deformation and correlated motion of adjacent organ structures in radiotherapy. Based on a few organ geometry samples and the respective deformation fields as determined by deformable registration, principal component analysis (PCA) is used to create a low-dimensional parametric statistical organ deformation model (Söhn et al 2005 Phys. Med. Biol. 50 5893-908). PCA determines the most important geometric variability in terms of eigenmodes, which represent 3D vector fields of correlated organ deformations around the mean geometry. Weighted sums of a few dominating eigenmodes can be used to simulate synthetic geometries, which are statistically meaningful inter- and extrapolations of the input geometries, and predict their probability of occurrence. We present the use of PCA as a versatile treatment simulation tool, which allows comprehensive dosimetric assessment of the detrimental effects that deformable geometric uncertainties can have on a planned dose distribution. For this, a set of random synthetic geometries is generated by a PCA model for each simulated treatment course, and the dose of a given treatment plan is accumulated in the moving tissue elements via dose warping. This enables the calculation of average voxel doses, local dose variability, dose-volume histogram uncertainties, marginal as well as joint probability distributions of organ equivalent uniform doses and thus of TCP and NTCP, and other dosimetric and biologic endpoints. The method is applied to the example of deformable motion of prostate/bladder/rectum in prostate IMRT. Applications include dosimetric assessment of the adequacy of margin recipes, adaptation schemes, etc, as well as prospective ‘virtual’ evaluation of the possible benefits of new radiotherapy schemes.
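A minimal sketch of the geometry-sampling step: stack the deformation samples as vectors, extract the dominating eigenmodes via PCA, and draw random mode weights to synthesize new geometries. Toy arrays stand in for registered organ surfaces; deformable registration and dose warping are omitted:

    import numpy as np

    rng = np.random.default_rng(3)
    n_samples, n_dof = 8, 3000            # 8 geometries, 1000 surface points * 3
    mean_geom = rng.normal(0, 1, n_dof)
    X = mean_geom + rng.normal(0, 0.1, (n_samples, n_dof))  # toy geometry samples

    Xc = X - X.mean(axis=0)
    # PCA via SVD: rows of Vt are eigenmodes, s**2 / (n - 1) their variances
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2 / (n_samples - 1)

    n_modes = 3                                            # dominating eigenmodes
    weights = rng.normal(0, np.sqrt(var[:n_modes]))        # random mode weights
    synthetic = X.mean(axis=0) + weights @ Vt[:n_modes]    # one synthetic geometry
    print(synthetic.shape)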
Comparison between two methods for assessing agricultural drought disaster risk in southwestern China
NASA Astrophysics Data System (ADS)
han, lanying; zhang, qiang
2016-04-01
Drought is a natural disaster that causes huge losses to agricultural yields worldwide. Drought risk has become increasingly prominent with the climatic warming of the past century, and it is one of the main meteorological disasters and a serious problem in southwestern China, where drought risk exceeds the national average. Climate change is likely to exacerbate the problem, thereby endangering China's food security. In this paper, drought disasters in southwestern China (a region with serious drought risk, whose comprehensive losses account for 3.9% of the national drought-affected area) were selected to show how drought changes under climate change, and two methods were used to assess drought disaster risk: a drought risk assessment model and a comprehensive drought risk index. First, we used the analytic hierarchy process together with meteorological, geographic, soil, and remote-sensing data to develop a drought risk assessment model (defined using a comprehensive drought disaster risk index, R) based on the drought hazard, environmental vulnerability, sensitivity and exposure of the values at risk, and the capacity to prevent or mitigate the problem. Second, we built a comprehensive drought risk index (defined using a comprehensive drought disaster loss, L) based on statistical drought disaster data, including crop yields, drought-induced areas, drought-occurred areas, areas with no harvest caused by drought, and planting areas. Using the model, we assessed the drought risk. The results showed that the spatial distributions of the two drought disaster risks were coherent and revealed complex zonality in southwestern China. The results also showed that drought risk is becoming more serious and more frequent under the global warming background. The eastern part of the study area had extremely high risk; risk was generally greater in the north than in the south and increased from southwest to northeast. Drought disaster risk and loss were highest in Sichuan Province and Chongqing Municipality, and lowest in Yunnan Province. The comprehensive drought disaster loss trended upward over the past 60 years, as did the occurrence of drought in every province of the southwestern region. The drought risk of each province is related to regional climate change factors such as temperature, precipitation, soil moisture, and vegetation coverage. The contribution of the risk factors to R was highest for the capacity for prevention and mitigation, followed by the drought hazard, sensitivity and exposure, and environmental vulnerability.
Individual Differences in Statistical Learning Predict Children's Comprehension of Syntax
ERIC Educational Resources Information Center
Kidd, Evan; Arciuli, Joanne
2016-01-01
Variability in children's language acquisition is likely due to a number of cognitive and social variables. The current study investigated whether individual differences in statistical learning (SL), which has been implicated in language acquisition, independently predicted 6- to 8-year-old's comprehension of syntax. Sixty-eight (N = 68)…
NASA Astrophysics Data System (ADS)
Smid, Marek; Costa, Ana; Pebesma, Edzer; Granell, Carlos; Bhattacharya, Devanjan
2016-04-01
Humankind is now predominantly urban-based, and the majority of continuing population growth will take place in urban agglomerations. Urban systems are not only major drivers of climate change but also impact hot spots. Furthermore, climate change impacts are commonly managed at the city scale. Assessing climate change impacts on urban systems is therefore a highly relevant subject of research. Climate and its impacts at all levels (local, meso, and global scale), as well as the inter-scale dependencies of those processes, should be subject to detailed analysis. While global and regional projections of future climate are currently available, local-scale information is lacking; hence, statistical downscaling methodologies represent a potentially efficient way to help close this gap. In general, methodological reviews of downscaling procedures cover the various methods according to their application (e.g. downscaling for hydrological modelling). Some of the most recent and comprehensive studies, such as the ESSEM COST Action ES1102 (VALUE), use the concepts of Perfect Prog and MOS. Other classification schemes of downscaling techniques consider three main categories: linear methods, weather classifications, and weather generators. Downscaling and climate modelling represent a multidisciplinary field in which researchers from various backgrounds intersect their efforts, resulting in specific terminology that can be somewhat confusing. For instance, Polynomial Regression (also called Surface Trend Analysis) is a statistical technique, yet in the context of spatial interpolation procedures it is commonly classified as a deterministic technique, while kriging approaches are classified as stochastic. Furthermore, the terms "statistical" and "stochastic" (frequently used as names of sub-classes in downscaling methodological reviews) are not always considered synonymous, even though both could be seen as identical since both refer to methods handling input modelling factors as variables with certain probability distributions. In addition, recent development is moving towards multi-step methodologies containing deterministic and stochastic components. This evolution has led to the introduction of new terms such as hybrid or semi-stochastic approaches, which makes the effort to systematically classify downscaling methods into the previously defined categories even more challenging. This work presents a review of statistical downscaling procedures that classifies the methods in two steps. In the first step, we describe several techniques that produce a single climatic surface based on observations; these methods are classified into two categories using an approximation to the broadest consensual statistical terms: linear and non-linear methods. The second step covers techniques that use simulations to generate alternative surfaces corresponding to different realizations of the same processes. Those simulations are essential because real observational data are limited in number, and such procedures are crucial for modelling extremes. This work emphasises the link between statistical downscaling methods and research into climate change impacts at the city scale.
A Study of Bicycle and Passenger Car Collisions Based on Insurance Claims Data
Isaksson-Hellman, Irene
2012-01-01
In Sweden, bicycle crashes are under-reported in the official statistics that are based on police reports. Statistics from hospital reports show that cyclists constitute the highest percentage of severely injured road users compared to other road user groups. However, hospital reports lack detailed information about the crash. To get a more comprehensive view, additional data are needed to accurately reflect the casualty situation for cyclists. An analysis based on 438 cases of bicycle and passenger car collisions is presented, using data collected from insurance claims. The most frequent crash situations are described with factors such as where and when collisions occur and the age and gender of the involved cyclists and drivers. Information on environmental circumstances such as road status, weather and light conditions, speed limits and traffic environment is also included. Based on the various crash events, a total of 32 different scenarios were categorized, and more than 75% were found to be different kinds of intersection-related situations. From the data, it was concluded that factors such as estimated impact speed and age significantly influence injury severity. The insurance claims data complement the official statistics and provide a more comprehensive view of bicycle and passenger car collisions by considering all levels of crash and injury severity. The detailed descriptions of the crash situations also provide an opportunity to find countermeasures to prevent or mitigate collisions. The results provide a useful basis and facilitate the work of reducing the number of bicycle and passenger car collisions with serious consequences. PMID:23169111
Discrete Inverse and State Estimation Problems
NASA Astrophysics Data System (ADS)
Wunsch, Carl
2006-06-01
The problems of making inferences about the natural world from noisy observations and imperfect theories occur in almost all scientific disciplines. This book addresses these problems using examples taken from geophysical fluid dynamics. It focuses on discrete formulations, both static and time-varying, known variously as inverse, state estimation or data assimilation problems. Starting with fundamental algebraic and statistical ideas, the book guides the reader through a range of inference tools including the singular value decomposition, Gauss-Markov and minimum variance estimates, Kalman filters and related smoothers, and adjoint (Lagrange multiplier) methods. The final chapters discuss a variety of practical applications to geophysical flow problems. Discrete Inverse and State Estimation Problems is an ideal introduction to the topic for graduate students and researchers in oceanography, meteorology, climate dynamics, and geophysical fluid dynamics. It is also accessible to a wider scientific audience; the only prerequisite is an understanding of linear algebra. The book provides a comprehensive introduction to discrete methods of inference from incomplete information; is based upon 25 years of practical experience using real data and models; develops sequential and whole-domain analysis methods from simple least-squares; and contains many examples and problems, with web-based support through MIT OpenCourseWare.
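To give a flavor of this toolkit, here is a minimal, independent sketch of one of the simplest estimators such books build on: a truncated-SVD solution of a noisy linear inverse problem y = Ex + n. The matrix sizes, noise level, and truncation rule are invented for illustration:

    import numpy as np

    rng = np.random.default_rng(1)
    E = rng.normal(size=(40, 20))                 # observation (model) matrix
    x_true = rng.normal(size=20)
    y = E @ x_true + 0.1 * rng.normal(size=40)    # noisy observations

    # Truncated SVD: keep only singular values carrying real information
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    k = int(np.sum(s > 0.05 * s[0]))              # crude truncation rule
    x_hat = Vt[:k].T @ ((U[:, :k].T @ y) / s[:k])

    # The resolution matrix Vt[:k].T @ Vt[:k] shows which solution
    # components the data actually determine.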
A comprehensive prediction and evaluation method of pilot workload
Feng, Chuanyan; Wanyan, Xiaoru; Yang, Kun; Zhuang, Damin; Wu, Xu
2018-01-01
BACKGROUND: The prediction and evaluation of pilot workload is a key problem in the human-factors airworthiness of cockpits. OBJECTIVE: A pilot traffic pattern task was designed in a flight simulation environment in order to carry out pilot workload prediction and improve the evaluation method. METHODS: Predictions of typical flight subtasks and dynamic workloads (cruise, approach, and landing) were built up based on multiple resource theory, and favorable validity was achieved by correlation analysis between sensitive physiological data and the predicted values. RESULTS: Statistical analysis indicated that eye movement indices (fixation frequency, mean fixation time, saccade frequency, mean saccade time, and mean pupil diameter), electrocardiogram indices (mean normal-to-normal interval and the ratio between low frequency and the sum of low and high frequency), and electrodermal activity indices (mean tonic and mean phasic) were all sensitive to the subjects' typical workloads. CONCLUSION: A multinomial logistic regression model based on a combination of physiological indices (fixation frequency, mean normal-to-normal interval, the ratio between low frequency and the sum of low and high frequency, and mean tonic) was constructed, and its discrimination accuracy was 84.85%. PMID:29710742
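A hedged sketch of the kind of model named in the conclusion: a multinomial logistic classifier over four physiological indices. The feature values, class structure, and resulting accuracy are synthetic stand-ins, not the study's data:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Synthetic stand-ins for the four indices (fixation frequency, mean NN
    # interval, LF/(LF+HF) ratio, mean tonic EDA); labels are the three
    # workload phases (0 = cruise, 1 = approach, 2 = landing).
    rng = np.random.default_rng(2)
    X = rng.normal(size=(90, 4)) + np.repeat(np.arange(3), 30)[:, None] * 0.8
    y = np.repeat(np.arange(3), 30)

    clf = LogisticRegression(max_iter=1000)   # lbfgs handles multinomial targets
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"cross-validated discrimination accuracy: {acc:.2%}")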
Sanchez, Ana M; Denny, Thomas N; O'Gorman, Maurice
2014-07-01
This Special Issue of the Journal of Immunological Methods includes 16 manuscripts describing quality assurance activities related to virologic and immunologic monitoring of six global laboratory resource programs that support international HIV/AIDS clinical trial studies: Collaboration for AIDS Vaccine Discovery (CAVD); Center for HIV/AIDS Vaccine Immunology (CHAVI); External Quality Assurance Program Oversight Laboratory (EQAPOL); HIV Vaccine Trial Network (HVTN); International AIDS Vaccine Initiative (IAVI); and Immunology Quality Assessment (IQA). The reports from these programs address the many components required to develop comprehensive quality control activities and subsequent quality assurance programs for immune monitoring in global clinical trials, including: all aspects of processing, storing, and quality assessment of PBMC preparations used ubiquitously in HIV clinical trials; the development and optimization of assays for CD8 HIV responses and HIV neutralization; a comprehensive global HIV virus repository; and reports on the development and execution of novel external proficiency testing programs for immunophenotyping, intracellular cytokine staining, ELISPOT, and Luminex-based cytokine measurements. In addition, there are articles describing the implementation of Good Clinical Laboratory Practices (GCLP) in a large quality assurance laboratory, the development of statistical methods specific for external proficiency testing assessment, a discussion on the ability to set objective thresholds for measuring rare events by flow cytometry, and finally, a manuscript that addresses a framework for the structured reporting of T cell immune function-based assays. It is anticipated that this series of manuscripts covering a wide range of quality assurance activities associated with the conduct of global clinical trials will provide a resource for individuals and programs involved in improving the harmonization, standardization, accuracy, and sensitivity of virologic and immunologic testing. Copyright © 2014 Elsevier B.V. All rights reserved.
Cross-validation and Peeling Strategies for Survival Bump Hunting using Recursive Peeling Methods
Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J. Sunil
2015-01-01
We introduce a framework to build a survival/risk bump hunting model with a censored time-to-event response. Our Survival Bump Hunting (SBH) method is based on a recursive peeling procedure that uses a specific survival peeling criterion derived from non- or semi-parametric statistics such as the hazard ratio, the log-rank test or the Nelson-Aalen estimator. To optimize the tuning parameter of the model and validate it, we introduce an objective function based on survival or prediction-error statistics, such as the log-rank test and the concordance error rate. We also describe two alternative cross-validation techniques adapted to the joint task of decision-rule making by recursive peeling and survival estimation. Numerical analyses show the importance of replicated cross-validation and the differences between criteria and techniques in both low- and high-dimensional settings. Although several non-parametric survival models exist, none addresses the problem of directly identifying local extrema. We show how SBH efficiently estimates extreme survival/risk subgroups, unlike other models. This provides an insight into the behavior of commonly used models and suggests alternatives to be adopted in practice. Finally, our SBH framework was applied to a clinical dataset, in which we identified subsets of patients characterized by clinical and demographic covariates with a distinct extreme survival outcome, for which tailored medical interventions could be made. An R package, PRIMsrc (Patient Rule Induction Method in Survival, Regression and Classification settings), is available on CRAN (Comprehensive R Archive Network) and GitHub. PMID:27034730
Wu, Jiaxin; Li, Yanda; Jiang, Rui
2014-03-01
Exome sequencing has been widely used in detecting pathogenic nonsynonymous single nucleotide variants (SNVs) for human inherited diseases. However, traditional statistical genetics methods are ineffective in analyzing exome sequencing data, owing to factors such as the large number of sequenced variants, the presence of a non-negligible fraction of pathogenic rare variants or de novo mutations, and the limited size of affected and normal populations. Indeed, the prevalence of exome sequencing has created demand for an effective computational method for identifying causative nonsynonymous SNVs among the large number of sequenced variants. Here, we propose a bioinformatics approach called SPRING (Snv PRioritization via the INtegration of Genomic data) for identifying pathogenic nonsynonymous SNVs for a given query disease. Based on six functional effect scores calculated by existing methods (SIFT, PolyPhen2, LRT, MutationTaster, GERP and PhyloP) and five association scores derived from a variety of genomic data sources (gene ontology, protein-protein interactions, protein sequences, protein domain annotations and gene pathway annotations), SPRING calculates the statistical significance that an SNV is causative for a query disease and hence provides a means of prioritizing candidate SNVs. With a series of comprehensive validation experiments, we demonstrate that SPRING is valid for diseases whose genetic bases are either partly known or completely unknown and effective for diseases with a variety of inheritance styles. In applications of our method to real exome sequencing data sets, we show the capability of SPRING in detecting causative de novo mutations for autism, epileptic encephalopathies and intellectual disability. We further provide an online service, the standalone software and genome-wide predictions of causative SNVs for 5,080 diseases at http://bioinfo.au.tsinghua.edu.cn/spring.
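SPRING's actual significance calculation is more elaborate, but the general idea of merging heterogeneous per-SNV scores into one ranking can be sketched with empirical p-values combined by Fisher's method. The score matrix and the larger-is-more-damaging convention below are invented for illustration:

    import numpy as np
    from scipy.stats import combine_pvalues, rankdata

    # Rows = candidate SNVs, columns = 11 scores (6 functional + 5 association)
    rng = np.random.default_rng(3)
    scores = rng.uniform(size=(1000, 11))     # assume larger = more damaging

    # Convert each score column to an empirical p-value, then combine per SNV
    pvals = 1.0 - (rankdata(scores, axis=0) - 0.5) / scores.shape[0]
    combined = np.array([combine_pvalues(row, method="fisher")[1] for row in pvals])
    priority = np.argsort(combined)           # smallest combined p-value first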
Hierarchical multivariate covariance analysis of metabolic connectivity
Carbonell, Felix; Charil, Arnaud; Zijdenbos, Alex P; Evans, Alan C; Bedell, Barry J
2014-01-01
Conventional brain connectivity analysis is typically based on the assessment of interregional correlations. Given that correlation coefficients are derived from both covariance and variance, group differences in covariance may be obscured by differences in the variance terms. To facilitate a comprehensive assessment of connectivity, we propose a unified statistical framework that interrogates the individual terms of the correlation coefficient. We have evaluated the utility of this method for metabolic connectivity analysis using [18F]2-fluoro-2-deoxyglucose (FDG) positron emission tomography (PET) data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. As an illustrative example of the utility of this approach, we examined metabolic connectivity in angular gyrus and precuneus seed regions of mild cognitive impairment (MCI) subjects with low and high β-amyloid burdens. This new multivariate method allowed us to identify alterations in the metabolic connectome, which would not have been detected using classic seed-based correlation analysis. Ultimately, this novel approach should be extensible to brain network analysis and broadly applicable to other imaging modalities, such as functional magnetic resonance imaging (fMRI). PMID:25294129
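The decomposition the authors interrogate is easy to state numerically: two groups can share a covariance yet differ in correlation purely through variance. A toy sketch with invented data:

    import numpy as np

    def corr_terms(x, y):
        """Return the pieces of r = cov(x, y) / (sd(x) * sd(y))."""
        cov = np.cov(x, y)[0, 1]
        return cov, y.std(ddof=1), cov / (x.std(ddof=1) * y.std(ddof=1))

    rng = np.random.default_rng(4)
    seed_a = rng.normal(size=200); target_a = seed_a + rng.normal(size=200)
    # Group B: same covariance structure but inflated variance in the target region
    seed_b = rng.normal(size=200); target_b = seed_b + 2.0 * rng.normal(size=200)

    for label, (s, t) in {"A": (seed_a, target_a), "B": (seed_b, target_b)}.items():
        cov, sd_t, r = corr_terms(s, t)
        print(f"group {label}: cov={cov:.2f} sd_target={sd_t:.2f} r={r:.2f}")

Both groups show a covariance near 1, yet group B's correlation is markedly lower, which is exactly the situation a correlation-only comparison would misread.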
Anima: Modular Workflow System for Comprehensive Image Data Analysis
Rantanen, Ville; Valori, Miko; Hautaniemi, Sampsa
2014-01-01
Modern microscopes produce vast amounts of image data, and computational methods are needed to analyze and interpret these data. Furthermore, a single image analysis project may require tens or hundreds of analysis steps, starting from data import and pre-processing to segmentation and statistical analysis, and ending with visualization and reporting. To manage such large-scale image data analysis projects, we present here a modular workflow system called Anima. Anima is designed for comprehensive and efficient image data analysis development, and it contains several features that are crucial in high-throughput image data analysis: programming language independence, batch processing, easily customized data processing, interoperability with other software via application programming interfaces, and advanced multivariate statistical analysis. The utility of Anima is shown with two case studies: testing different algorithms developed on different imaging platforms, and the automated prediction of alive/dead C. elegans worms by integrating several analysis environments. Anima is fully open source and available with documentation at www.anduril.org/anima. PMID:25126541
Pathway analysis with next-generation sequencing data.
Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao
2015-04-01
Although pathway analysis methods have been developed and successfully applied to association studies of common variants, statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators have observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their inability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic based on smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. Through intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type I error rates. We also evaluate the power of the SFPCA-based statistic and 22 additional existing statistics, and find that the SFPCA-based statistic has much higher power than the other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic yields much smaller P-values for identifying pathway associations than other existing methods.
Zhang, Yan; Deng, Xi-Hai; Peng, Bu-Zhuo
2006-08-01
It is difficult to evaluate the comprehensive quality of sediment and to understand the development trend of pollution in the absence of monitoring data, especially historical data. Combining 137Cs dating with general sampling and measurement can resolve this absence of data and also makes it possible to calculate a weighted comprehensive environmental quality index using an adjusted analytic hierarchy process (AHP) method. To avoid arbitrariness, the judgment matrix is formed objectively from the calculated monitoring data. Based on the monitoring data of sediment pollution and the factor weights obtained by the adjusted AHP method, the comprehensive quality of sediment in each zone of Dianchi Lake was evaluated. The results indicated that sediment pollution in each zone is more serious at present than in the past, a condition that may be related to industrial development and the distribution of industries in the Dianchi Lake basin. Therefore, in order to improve the comprehensive quality of sediment in Dianchi Lake and to prevent secondary pollution by heavy metals in the sediment, it is necessary to control pollutant discharge and to remove pollutants by various means.
2010-01-01
Background Irregularly shaped spatial clusters are difficult to delineate. A cluster found by an algorithm often spreads through large portions of the map, impacting its geographical meaning. Penalized likelihood methods for Kulldorff's spatial scan statistics have been used to control the excessive freedom of the shape of clusters. Penalty functions based on cluster geometry and non-connectivity have been proposed recently. Another approach involves the use of a multi-objective algorithm to maximize two objectives: the spatial scan statistic and the geometric penalty function. Results & Discussion We present a novel scan statistic algorithm employing a function based on graph topology to penalize the presence of under-populated disconnection nodes in candidate clusters: the disconnection nodes cohesion function. A disconnection node is defined as a region within a cluster whose removal disconnects the cluster. By applying this function, the most geographically meaningful clusters are sifted through the immense set of possible irregularly shaped candidate cluster solutions. To evaluate the statistical significance of solutions for multi-objective scans, a statistical approach based on the concept of the attainment function is used. In this paper we compare different penalized likelihoods employing the geometric and non-connectivity regularity functions and the novel disconnection nodes cohesion function. We also build multi-objective scans using those three functions and compare them with the previous penalized likelihood scans. An application is presented using comprehensive state-wide data for Chagas' disease in puerperal women in Minas Gerais state, Brazil. Conclusions We show that, compared to the other single-objective algorithms, multi-objective scans present better performance regarding power, sensitivity and positive predictive value. The multi-objective non-connectivity scan is faster and better suited for the detection of moderately irregularly shaped clusters. The multi-objective cohesion scan is most effective for the detection of highly irregularly shaped clusters. PMID:21034451
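In graph terms, a disconnection node is an articulation point of the candidate cluster's adjacency graph. A minimal sketch of flagging under-populated ones with networkx; the regions, adjacencies, populations, and 100-person cutoff are invented:

    import networkx as nx

    # Candidate cluster as a graph: nodes are map regions, edges are adjacencies
    cluster = nx.Graph([(1, 2), (2, 3), (3, 4), (4, 5), (5, 3)])
    population = {1: 500, 2: 35, 3: 800, 4: 450, 5: 390}

    # Articulation points are exactly the regions whose removal disconnects the
    # cluster; a cohesion penalty would target the under-populated ones.
    cut_nodes = set(nx.articulation_points(cluster))
    weak_links = [n for n in cut_nodes if population[n] < 100]
    print("disconnection nodes:", cut_nodes, "under-populated:", weak_links)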
Orientation Examples Showing Application of the C.A.M.P.U.S. Simulation Model.
ERIC Educational Resources Information Center
Hansen, B. L.; Barron, J. G.
This pamphlet contains information and examples intended to show how the University of Toronto C.A.M.P.U.S. model operates. C.A.M.P.U.S. (Comprehensive Analytical Method for Planning in the University Sphere) is a computer model which processes projected enrollment statistics and other necessary information in such a way as to yield time-based…
WordCluster: detecting clusters of DNA words and genomic elements
2011-01-01
Background Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds. Results We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method in a web server connected to a MySQL backend, which also determines co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome. Conclusions WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method as a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php, including additional features like the detection of co-localization with gene regions and an annotation enrichment tool for functional analysis of overlapped genes. PMID:21261981
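The core statistic, distances between consecutive copies compared against a null model, can be sketched briefly. The geometric null and 5% cutoff below are a simplification of the paper's significance assignment, and the sequence is synthetic:

    import numpy as np
    from scipy.stats import geom

    seq = "".join(np.random.default_rng(5).choice(list("ACGT"), 5000)) + "CAG" * 30
    word = "CAG"

    # Positions of all (overlapping) copies and gaps between consecutive ones
    pos = np.array([i for i in range(len(seq) - len(word) + 1)
                    if seq[i:i + len(word)] == word])
    gaps = np.diff(pos)

    # Null model: copies fall independently, so gaps are roughly geometric with
    # p = observed copy density; unusually short gaps mark cluster candidates.
    p = len(pos) / len(seq)
    cutoff = geom.ppf(0.05, p)          # 5th percentile of the null gap length
    in_cluster = gaps <= cutoff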
A generalized plate method for estimating total aerobic microbial count.
Ho, Kai Fai
2004-01-01
The plate method outlined in Chapter 61: Microbial Limit Tests of the U.S. Pharmacopeia (USP 61) provides very specific guidance for assessing total aerobic bioburden in pharmaceutical articles. This methodology, while comprehensive, lacks the flexibility to be useful in all situations. By studying the plate method as a special case within a more general family of assays, the effects of each parameter in the guidance can be understood. Using a mathematical model to describe the plate counting procedure, a statistical framework for making more definitive statements about total aerobic bioburden is developed. Such a framework allows the laboratory scientist to adjust the USP 61 methods to satisfy specific practical constraints. In particular, it is shown that the plate method can be conducted, albeit with stricter acceptance criteria, using a test specimen quantity that is smaller than the 10 g or 10 mL prescribed in the guidance. Finally, the interpretation of results proffered by the guidance is re-examined within this statistical framework and shown to be overly aggressive.
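The statistical heart of such a framework is a Poisson model for colony counts. A hedged sketch of why a smaller specimen quantity forces a stricter acceptance criterion; the 100 CFU/g limit, sample quantities, and 5% risk level are illustrative, not USP values:

    from scipy.stats import poisson

    limit = 100.0          # hypothetical specification: <= 100 CFU per gram
    for grams in (10.0, 2.0):
        mean_at_limit = limit * grams        # expected colonies if exactly at limit
        # Largest count still surprising (5% level) if the article truly exceeded
        # the limit; smaller samples allow fewer colonies per gram before
        # rejection, i.e. a stricter per-gram acceptance criterion.
        accept_max = poisson.ppf(0.05, mean_at_limit)
        print(f"{grams} g tested: accept if count <= {accept_max:.0f} "
              f"(= {accept_max / grams:.1f} CFU/g)")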
Physics-based statistical model and simulation method of RF propagation in urban environments
Pao, Hsueh-Yuan; Dvorak, Steven L.
2010-09-14
A physics-based statistical model and simulation/modeling method and system for electromagnetic wave propagation (wireless communication) in urban environments. In particular, the model is a computationally efficient closed-form parametric model of RF propagation in an urban environment which is extracted from a physics-based statistical wireless channel simulation method and system. The simulation divides the complex urban environment into a network of interconnected urban canyon waveguides which can be analyzed individually; calculates spectral coefficients of modal fields in the waveguides excited by the propagation using a database of statistical impedance boundary conditions which incorporates the complexity of building walls in the propagation model; determines statistical parameters of the calculated modal fields; and determines a parametric propagation model based on the statistical parameters of the calculated modal fields, from which predictions of communications capability may be made.
NASA Technical Reports Server (NTRS)
Petty, Grant W.
1990-01-01
A reasonably rigorous basis for understanding and extracting the physical information content of Special Sensor Microwave/Imager (SSM/I) satellite images of the marine environment is provided. To this end, a comprehensive algebraic parameterization is developed for the response of the SSM/I to a set of nine atmospheric and ocean surface parameters. The brightness temperature model includes a closed-form approximation to microwave radiative transfer in a non-scattering atmosphere and fitted models for surface emission and scattering based on geometric optics calculations for the roughened sea surface. The combined model is empirically tuned using suitable sets of SSM/I data and coincident surface observations. The brightness temperature model is then used to examine the sensitivity of the SSM/I to realistic variations in the scene being observed and to evaluate the theoretical maximum precision of global SSM/I retrievals of integrated water vapor, integrated cloud liquid water, and surface wind speed. A general minimum-variance method for optimally retrieving geophysical parameters from multichannel brightness temperature measurements is outlined, and several global statistical constraints of the type required by this method are computed. Finally, a unified set of efficient statistical and semi-physical algorithms is presented for obtaining fields of surface wind speed, integrated water vapor, cloud liquid water, and precipitation from SSM/I brightness temperature data. Features include: a semi-physical method for retrieving integrated cloud liquid water at 15 km resolution and with rms errors as small as approximately 0.02 kg/sq m; a 3-channel statistical algorithm for integrated water vapor which was constructed so as to have improved linear response to water vapor and reduced sensitivity to precipitation; and two complementary indices of precipitation activity (based on 37 GHz attenuation and 85 GHz scattering, respectively), each of which are relatively insensitive to variations in other environmental parameters.
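In the linear case, the minimum-variance retrieval outlined above reduces to the familiar Gauss-Markov form. A compact sketch follows; the channel count, response matrix, and covariances are invented for shape only, not SSM/I values:

    import numpy as np

    rng = np.random.default_rng(6)
    A = rng.normal(size=(7, 3))       # linearized response: 7 channels, 3 parameters
    S_x = np.diag([1.0, 0.5, 2.0])    # prior (climatological) parameter covariance
    S_e = 0.3 * np.eye(7)             # channel noise covariance

    # Gauss-Markov / minimum-variance gain and retrieval about a background x_b
    K = S_x @ A.T @ np.linalg.inv(A @ S_x @ A.T + S_e)
    x_b = np.zeros(3)
    y = A @ rng.normal(size=3) + 0.3 * rng.normal(size=7)  # simulated Tb anomalies
    x_hat = x_b + K @ (y - A @ x_b)

    # Posterior covariance: the theoretical maximum retrieval precision
    S_post = S_x - K @ A @ S_x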
Kim, Yoonsang; Emery, Sherry
2013-01-01
Several statistical packages are capable of estimating generalized linear mixed models, and these packages provide one or more of three estimation methods: penalized quasi-likelihood, Laplace, and Gauss-Hermite. Many studies have investigated these methods' performance for the mixed-effects logistic regression model. However, the authors focused on models with one or two random effects and assumed a simple covariance structure between them, which may not be realistic. When there are multiple correlated random effects in a model, the computation becomes intensive, and often an algorithm fails to converge. Moreover, in our analysis of smoking status and exposure to anti-tobacco advertisements, we have observed that when a model included multiple random effects, parameter estimates varied considerably from one statistical package to another, even when using the same estimation method. This article presents a comprehensive review of the advantages and disadvantages of each estimation method. In addition, we compare the performance of the three methods across statistical packages via simulation, which involves two- and three-level logistic regression models with at least three correlated random effects. We apply our findings to a real dataset. Our results suggest that two packages, SAS GLIMMIX Laplace and SuperMix Gaussian quadrature, perform well in terms of accuracy, precision, convergence rates, and computing speed. We also discuss the strengths and weaknesses of the two packages with regard to sample sizes. PMID:24288415
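For orientation, the third method approximates each cluster's marginal likelihood by Gauss-Hermite quadrature over the random effect. A bare-bones sketch for a random-intercept logistic model, with arbitrary data and parameter values:

    import numpy as np
    from numpy.polynomial.hermite import hermgauss

    def cluster_loglik(y, x, beta, sigma, n_quad=15):
        """Gauss-Hermite marginal log-likelihood of one cluster's binary outcomes."""
        nodes, weights = hermgauss(n_quad)        # nodes/weights for weight exp(-t^2)
        b = np.sqrt(2.0) * sigma * nodes          # candidate random-intercept values
        eta = beta * x[:, None] + b[None, :]      # n_obs x n_quad linear predictors
        p = 1.0 / (1.0 + np.exp(-eta))
        lik = np.prod(np.where(y[:, None] == 1, p, 1.0 - p), axis=0)
        return np.log((weights * lik).sum() / np.sqrt(np.pi))

    y = np.array([1, 0, 1, 1]); x = np.array([0.2, -1.0, 0.5, 1.3])
    print(cluster_loglik(y, x, beta=0.8, sigma=1.2))

Laplace approximation corresponds roughly to a single well-placed quadrature point, which is why quadrature is usually more accurate but more expensive as the number of correlated random effects grows.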
ERIC Educational Resources Information Center
Parrott, Roxanne; Silk, Kami; Dorgan, Kelly; Condit, Celeste; Harris, Tina
2005-01-01
Too little theory and research has considered the effects of communicating statistics in various forms on comprehension, perceptions of evidence quality, or evaluations of message persuasiveness. In a considered extension of Subjective Message Construct Theory (Morley, 1987), we advance a rationale relating evidence form to the formation of…
Text-Based Recall and Extra-Textual Generations Resulting from Simplified and Authentic Texts
ERIC Educational Resources Information Center
Crossley, Scott A.; McNamara, Danielle S.
2016-01-01
This study uses a moving windows self-paced reading task to assess text comprehension of beginning and intermediate-level simplified texts and authentic texts by L2 learners engaged in a text-retelling task. Linear mixed effects (LME) models revealed statistically significant main effects for reading proficiency and text level on the number of…
Using Bloom's Taxonomy to Evaluate the Cognitive Levels of Master Class Textbook's Questions
ERIC Educational Resources Information Center
Assaly, Ibtihal R.; Smadi, Oqlah M.
2015-01-01
This study aimed at evaluating the cognitive levels of the questions following the reading texts of Master Class textbook. A checklist based on Bloom's Taxonomy was the instrument used to categorize the cognitive levels of these questions. The researchers used proper statistics to rank the cognitive levels of the comprehension questions. The…
The State of Arizona's Children 1997: Kids Count Data Book.
ERIC Educational Resources Information Center
Thompson, Anne, Ed.
This Kids Count report examines statewide trends between 1990 and 1996 in the well-being of Arizona's children. The statistical portrait is based on 16 indicators of well-being: (1) prenatal care; (2) incidence of low birth weight; (3) state-approved child care spaces; (4) comprehensive preschool services; (5) lack of health insurance; (6) infant…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gueorguiev, G; Cotter, C; Young, M
2016-06-15
Purpose: To present a 3D QA method and clinical results for 550 patients. Methods: Five hundred and fifty patient treatment deliveries (400 IMRT, 75 SBRT and 75 VMAT) from various treatment sites, planned with the RayStation treatment planning system (TPS), were measured on three beam-matched Elekta linear accelerators using IBA's COMPASS system. The difference between TPS-computed and delivered dose was evaluated in 3D by applying three statistical parameters to each structure of interest: absolute average dose difference (AADD, 6% allowed difference), absolute dose difference greater than 6% (ADD6, 4% structure volume allowed to fail) and a 3D gamma test (3%/3mm DTA, 4% structure volume allowed to fail). If the allowed value was not met for a given structure, manual review was performed. The review consisted of overlaying dose difference or gamma results with the patient CT and scrolling through the slices. For QA to pass, areas of high dose difference or gamma must be small and not on consecutive slices. For AADD to manually pass QA, the average dose difference in cGy must be less than 50 cGy. The QA protocol also includes DVH analysis based on QUANTEC and TG-101 recommended dose constraints. Results: Figures 1-3 show the results for the three parameters per treatment modality. Manual review was performed on 67 deliveries (27 IMRT, 22 SBRT and 18 VMAT), all of which passed QA. Results show that the statistical parameter AADD may be overly sensitive for structures receiving low dose, especially for the SBRT deliveries (Fig. 1). The TPS-computed and measured DVH values were in excellent agreement, with minimal differences. Conclusion: Applying DVH analysis and different statistical parameters to any structure of interest, as part of the 3D QA protocol, provides a comprehensive treatment plan evaluation. Author G. Gueorguiev discloses receiving travel and research funding from IBA for work unrelated to this project. Author B. Crawford discloses receiving travel funding from IBA for work unrelated to this project.
Hybrid statistics-simulations based method for atom-counting from ADF STEM images.
De Wael, Annelies; De Backer, Annick; Jones, Lewys; Nellist, Peter D; Van Aert, Sandra
2017-06-01
A hybrid statistics-simulations based method for atom-counting from annular dark field scanning transmission electron microscopy (ADF STEM) images of monotype crystalline nanostructures is presented. Different atom-counting methods already exist for model-like systems. However, the increasing relevance of radiation damage in the study of nanostructures demands a method that allows atom-counting from low dose images with a low signal-to-noise ratio. Therefore, the hybrid method directly includes prior knowledge from image simulations into the existing statistics-based method for atom-counting, and accounts in this manner for possible discrepancies between actual and simulated experimental conditions. It is shown by means of simulations and experiments that this hybrid method outperforms the statistics-based method, especially for low electron doses and small nanoparticles. The analysis of a simulated low dose image of a small nanoparticle suggests that this method allows for far more reliable quantitative analysis of beam-sensitive materials. Copyright © 2017 Elsevier B.V. All rights reserved.
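The statistics-based side of atom-counting is essentially a Gaussian mixture over the scattering cross-sections of atomic columns, and the hybrid idea can be caricatured as seeding that mixture with simulated values. A sketch with invented numbers; the simulated means and noise level are placeholders, not actual image-simulation output:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Scattering cross-sections of atomic columns from an ADF STEM image (synthetic)
    rng = np.random.default_rng(7)
    counts_true = rng.integers(1, 5, size=300)
    cross_sections = counts_true * 1.0 + rng.normal(0, 0.15, 300)

    # Hybrid idea: initialize component means from image-simulation values for
    # 1..4 atoms instead of letting them float freely from a low-SNR histogram.
    simulated_means = np.array([[1.02], [2.05], [3.01], [3.96]])
    gmm = GaussianMixture(n_components=4, means_init=simulated_means,
                          random_state=0).fit(cross_sections[:, None])
    atom_counts = gmm.predict(cross_sections[:, None]) + 1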
NASA Astrophysics Data System (ADS)
Chen, Yi
2018-03-01
The comprehensive water quality identification index method can assess the general water quality situation, represent the water quality classification, and relate pollution levels to water environment functional zone standards objectively and systematically. This paper selects three representative zones along the deep-water channel of Guangzhou port and applies the comprehensive water quality identification index method to seawater quality monitoring data for the selected zones from 2006 to 2014, in order to investigate the temporal variation of water quality along the deep-water channel. The comprehensive water quality level from north to south presents an increasing trend, and the water quality of the three zones in 2014 is much better than in 2006. The paper puts forward environmental protection measures and suggestions for the Pearl River Estuary, and provides data support and a theoretical basis for pollution prevention and control in the studied sea area.
Use of derivatives to assess preservation of hydrocarbon deposits
NASA Astrophysics Data System (ADS)
Koshkin, K. A.; Melkishev, O. A.
2018-05-01
The paper considers the calculation of derivatives along the surfaces of present-day and paleostructure maps of the Tl2-b formation top, used to forecast, via statistical methods, the preservation of oil and gas deposits in traps identified from 3D seismic surveys. It also suggests a method to evaluate morphological changes of the formation top by calculating the difference between the derivatives. The proposed method allows analyzing structural changes of the formation top over time with respect to the primary migration of hydrocarbons. The comprehensive use of the calculated indicators allowed ranking the prepared structures in terms of preservation of hydrocarbon deposits.
The Impact of Normalization Methods on RNA-Seq Data Analysis
Zyprych-Walczak, J.; Szabelska, A.; Handschuh, L.; Górczak, K.; Klamecka, K.; Figlerowicz, M.; Siatkowski, I.
2015-01-01
High-throughput sequencing technologies, such as the Illumina HiSeq, are powerful new tools for investigating a wide range of biological and medical problems. The massive and complex data sets produced by the sequencers create a need for the development of statistical and computational methods that can tackle the analysis and management of data. Data normalization is one of the most crucial steps of data processing, and this process must be carefully considered as it has a profound effect on the results of the analysis. In this work, we focus on a comprehensive comparison of five normalization methods related to sequencing depth, widely used for transcriptome sequencing (RNA-seq) data, and their impact on the results of gene expression analysis. Based on this study, we suggest a universal workflow that can be applied for the selection of the optimal normalization procedure for any particular data set. The described workflow includes calculation of the bias and variance values for the control genes, sensitivity and specificity of the methods, and classification errors, as well as generation of the diagnostic plots. Combining the above information facilitates the selection of the most appropriate normalization method for the studied data sets and determines which methods can be used interchangeably. PMID:26176014
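The workflow's core comparison, bias and variance of control genes under competing normalizations, can be sketched compactly. The three scaling rules and the metric definitions below are simplified illustrations on synthetic counts, not the paper's exact procedure:

    import numpy as np

    rng = np.random.default_rng(8)
    counts = rng.negative_binomial(5, 0.1, size=(1000, 12)).astype(float)  # genes x samples
    controls = np.arange(20)            # indices of assumed-stable control genes

    def size_factors(c, method):
        if method == "total":
            return c.sum(axis=0)
        if method == "upper_quartile":
            return np.percentile(c, 75, axis=0)
        if method == "median_of_ratios":                 # DESeq-style scaling
            ref = np.exp(np.log(c + 1).mean(axis=1))
            return np.median((c + 1) / ref[:, None], axis=0)

    for m in ("total", "upper_quartile", "median_of_ratios"):
        sf = size_factors(counts, m)
        norm = counts / (sf / sf.mean())                 # rescale to mean factor 1
        ctrl = norm[controls]
        bias = np.abs(np.log2(ctrl.mean(axis=1) / counts[controls].mean(axis=1))).mean()
        var = ctrl.var(axis=1).mean()
        print(f"{m}: control-gene bias={bias:.3f} variance={var:.1f}")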
Monte Carlo investigation of thrust imbalance of solid rocket motor pairs
NASA Technical Reports Server (NTRS)
Sforzini, R. H.; Foster, W. A., Jr.
1976-01-01
The Monte Carlo method of statistical analysis is used to investigate the theoretical thrust imbalance of pairs of solid rocket motors (SRMs) firing in parallel. Sets of the significant variables are selected using a random sampling technique and the imbalance calculated for a large number of motor pairs using a simplified, but comprehensive, model of the internal ballistics. The treatment of burning surface geometry allows for the variations in the ovality and alignment of the motor case and mandrel as well as those arising from differences in the basic size dimensions and propellant properties. The analysis is used to predict the thrust-time characteristics of 130 randomly selected pairs of Titan IIIC SRMs. A statistical comparison of the results with test data for 20 pairs shows the theory underpredicts the standard deviation in maximum thrust imbalance by 20% with variability in burning times matched within 2%. The range in thrust imbalance of Space Shuttle type SRM pairs is also estimated using applicable tolerances and variabilities and a correction factor based on the Titan IIIC analysis.
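The Monte Carlo skeleton of such an analysis is simple even though the internal ballistics model is not. A toy sketch with a stand-in thrust function; the variables, tolerances, and thrust formula are invented, not the SRM model:

    import numpy as np

    rng = np.random.default_rng(9)
    n_pairs = 10000

    def thrust(sample):
        """Toy internal-ballistics stand-in: thrust from three motor variables."""
        burn_rate, density, throat_d = sample
        return burn_rate * density * throat_d**2 * 1e3

    # Draw each motor's significant variables about nominal with small tolerances
    nominal = np.array([1.0, 1.0, 1.0])
    sigma = np.array([0.01, 0.005, 0.002])
    a = thrust((nominal + sigma * rng.normal(size=(n_pairs, 3))).T)
    b = thrust((nominal + sigma * rng.normal(size=(n_pairs, 3))).T)

    imbalance = np.abs(a - b)
    print(f"mean imbalance {imbalance.mean():.2f}, std {imbalance.std():.2f}")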
NASA Astrophysics Data System (ADS)
Torres Irribarra, D.; Freund, R.; Fisher, W.; Wilson, M.
2015-02-01
Computer-based, online assessments modelled, designed, and evaluated for adaptively administered invariant measurement are uniquely suited to defining and maintaining traceability to standardized units in education. An assessment of this kind is embedded in the Assessing Data Modeling and Statistical Reasoning (ADM) middle school mathematics curriculum. Diagnostic information about middle school students' learning of statistics and modeling is provided via computer-based formative assessments for seven constructs that comprise a learning progression for statistics and modeling from late elementary through the middle school grades. The seven constructs are: Data Display, Meta-Representational Competence, Conceptions of Statistics, Chance, Modeling Variability, Theory of Measurement, and Informal Inference. The end product is a web-delivered system built with Ruby on Rails for use by curriculum development teams working with classroom teachers in designing, developing, and delivering formative assessments. The online accessible system allows teachers to accurately diagnose students' unique comprehension and learning needs in a common language of real-time assessment, logging, analysis, feedback, and reporting.
ERIC Educational Resources Information Center
Clark, Amy K.
2013-01-01
The present study sought to fit a cognitive diagnostic model (CDM) across multiple forms of a passage-based reading comprehension assessment using the attribute hierarchy method. Previous research on CDMs for reading comprehension assessments served as a basis for the attributes in the hierarchy. The two attribute hierarchies were fit to data from…
Cardiac surgery report cards: comprehensive review and statistical critique.
Shahian, D M; Normand, S L; Torchiana, D F; Lewis, S M; Pastore, J O; Kuntz, R E; Dreyer, P I
2001-12-01
Public report cards and confidential, collaborative peer education represent distinctly different approaches to cardiac surgery quality assessment and improvement. This review discusses the controversies regarding their methodology and relative effectiveness. Report cards have been the more commonly used approach, typically as a result of state legislation. They are based on the presumption that publication of outcomes effectively motivates providers, and that market forces will reward higher quality. Numerous studies have challenged the validity of these hypotheses. Furthermore, although states with report cards have reported significant decreases in risk-adjusted mortality, it is unclear whether this improvement resulted from public disclosure or, rather, from the development of internal quality programs by hospitals. An additional confounding factor is the nationwide decline in heart surgery mortality, including states without quality monitoring. Finally, report cards may engender negative behaviors such as high-risk case avoidance and "gaming" of the reporting system, especially if individual surgeon results are published. The alternative approach, continuous quality improvement, may provide an opportunity to enhance performance and reduce interprovider variability while avoiding the unintended negative consequences of report cards. This collaborative method, which uses exchange visits between programs and determination of best practice, has been highly effective in northern New England and in the Veterans Affairs Administration. However, despite their potential advantages, quality programs based solely on confidential continuous quality improvement do not address the issue of public accountability. For this reason, some states may continue to mandate report cards. In such instances, it is imperative that appropriate statistical techniques and report formats are used, and that professional organizations simultaneously implement continuous quality improvement programs. The statistical methodology underlying current report cards is flawed, and does not justify the degree of accuracy presented to the public. All existing risk-adjustment methods have substantial inherent imprecision, and this is compounded when the results of such patient-level models are aggregated and used inappropriately to assess provider performance. Specific problems include sample size differences, clustering of observations, multiple comparisons, and failure to account for the random component of interprovider variability. We advocate the use of hierarchical or multilevel statistical models to address these concerns, as well as report formats that emphasize the statistical uncertainty of the results.
Study/experimental/research design: much more than statistics.
Knight, Kenneth L
2010-01-01
The purpose of study, experimental, or research design in scientific manuscripts has changed significantly over the years. It has evolved from an explanation of the design of the experiment (ie, data gathering or acquisition) to an explanation of the statistical analysis. This practice makes "Methods" sections hard to read and understand. Our objectives are to clarify the difference between study design and statistical analysis, to show the advantages of a properly written study design for article comprehension, and to encourage authors to describe study designs correctly. The role of study design is explored from the introduction of the concept by Fisher through modern-day scientists and the AMA Manual of Style. At one time, when experiments were simpler, the study design and statistical design were identical or very similar. With the complex research that is common today, which often includes manipulating variables to create new variables and multiple (and different) analyses of a single data set, data collection is very different from statistical design. Thus, both a study design and a statistical design are necessary, and with both, scientific manuscripts will be much easier to read and comprehend. A proper experimental design serves as a road map to the study methods, helping readers to understand more clearly how the data were obtained and, therefore, assisting them in properly analyzing the results.
Novel Image Encryption Scheme Based on Chebyshev Polynomial and Duffing Map
2014-01-01
We present a novel image encryption algorithm using a Chebyshev polynomial based on permutation and substitution and a Duffing map based on substitution. Comprehensive security analysis has been performed on the designed scheme using key space analysis, visual testing, histogram analysis, information entropy calculation, correlation coefficient analysis, differential analysis, key sensitivity testing, and speed testing. The study demonstrates that the proposed image encryption algorithm offers a key space of more than 10^113 and a desirable level of security based on the good statistical results and theoretical arguments. PMID:25143970
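The permutation stage of such a scheme can be illustrated with the Chebyshev map x -> cos(k·arccos(x)). A toy sketch for scrambling pixel indices; the key values are invented and this is not a secure implementation:

    import numpy as np

    def chebyshev_sequence(x0, k, n):
        """Iterate the Chebyshev map x -> cos(k * arccos(x)) on [-1, 1]."""
        xs = np.empty(n)
        x = x0
        for i in range(n):
            x = np.cos(k * np.arccos(x))
            xs[i] = x
        return xs

    pixels = np.arange(64 * 64)                  # flattened toy image indices
    chaos = chebyshev_sequence(x0=0.3141, k=4, n=pixels.size)
    perm = np.argsort(chaos)                     # key-dependent permutation
    scrambled = pixels[perm]
    restored = np.empty_like(scrambled)
    restored[perm] = scrambled                   # inverse permutation recovers input

The decryption side only needs the same (x0, k) key to regenerate the chaotic sequence and invert the permutation, which is the property such schemes rely on.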
NASA Astrophysics Data System (ADS)
Luzzi, R.; Vasconcellos, A. R.; Ramos, J. G.; Rodrigues, C. G.
2018-01-01
We describe the formalism of statistical irreversible thermodynamics constructed on the basis of Zubarev's nonequilibrium statistical operator (NSO) method, which is a powerful and universal tool for investigating the most varied physical phenomena. We present brief overviews of the statistical ensemble formalism and statistical irreversible thermodynamics. The former can be constructed either through a heuristic approach or in the framework of information theory in the Jeffreys-Jaynes scheme of scientific inference; Zubarev and his school used both approaches in formulating the NSO method. We describe the main characteristics of statistical irreversible thermodynamics and discuss some particular considerations of several authors. We briefly describe how Rosenfeld, Bohr, and Prigogine proposed to derive a thermodynamic uncertainty principle.
Yang, Yan-Mei; Lin, Li; Lu, You-Yuan; Ma, Xiao-Hui; Jin, Ling; Zhu, Tian-Tian
2016-03-01
The study aimed to analyze the commercial specifications and grades of wild and cultivated Gentianae Macrophyllae Radix based on multiple index constituents. Seven main chemical components of Gentianae Macrophyllae Radix were determined by UPLC, and the chemical quality levels were then clustered and classified by modern statistical methods (canonical correspondence analysis, Fisher discriminant analysis, and so on). Quality indices were selected, their correlations were analyzed, and a comprehensive quantitative grade division for quality was made across the different commercial specifications, and across the different grades within each specification, of wild and cultivated material. The results provide a basis for a reasonable division of the commercial specifications and grades of Gentianae Macrophyllae Radix. A range for the quality evaluation of the main index components (gentiopicrin, loganic acid and swertiamarin) was proposed, and a Herbal Quality Index (HQI) was introduced. A rank discriminant function based on quality was established by Fisher discriminant analysis. According to the analysis, the quality of wild and cultivated Luobojiao, one of the commercial specifications of Gentianae Macrophyllae Radix, was the best; Mahuajiao, another commercial specification, was average; and Xiaoqinjiao was inferior. Among grades, the quality of first-class cultivated Luobojiao was the worst, second-class intermediate, and third-class the best; the quality of first-class wild Luobojiao was intermediate and second-class the best; the quality of second-class Mahuajiao was intermediate and first-class the best; and the quality of first-class Xiaoqinjiao was intermediate and second-class the better of the two grades, though the difference was not significant. The method provides a new idea and approach for comprehensive quantitative evaluation of the quality of Gentianae Macrophyllae Radix. Copyright© by the Chinese Pharmaceutical Association.
Sabel, Michael S.; Rice, John D.; Griffith, Kent A.; Lowe, Lori; Wong, Sandra L.; Chang, Alfred E.; Johnson, Timothy M.; Taylor, Jeremy M.G.
2013-01-01
Introduction: To identify melanoma patients at sufficiently low risk of nodal metastases who could avoid SLN biopsy (SLNB), several statistical models have been proposed based upon patient/tumor characteristics, including logistic regression, classification trees, random forests and support vector machines. We sought to validate recently published models meant to predict sentinel node status. Methods: We queried our comprehensive, prospectively collected melanoma database for consecutive melanoma patients undergoing SLNB. Prediction values were estimated based upon four published models, calculating the same reported metrics: negative predictive value (NPV), rate of negative predictions (RNP), and false negative rate (FNR). Results: Logistic regression performed comparably with our data when considering NPV (89.4% vs. 93.6%); however, the model's specificity was not high enough to significantly reduce the rate of biopsies (SLNB reduction rate of 2.9%). When applied to our data, the classification tree produced NPV and biopsy-reduction rates that were lower: 87.7% vs. 94.1% and 29.8% vs. 14.3%, respectively. Two published models could not be applied to our data due to model complexity and the use of proprietary software. Conclusions: Published models meant to reduce the SLNB rate among patients with melanoma either underperformed when applied to our larger dataset or could not be validated. Differences in selection criteria and histopathologic interpretation likely resulted in underperformance. Statistical predictive models must be developed in a clinically applicable manner to allow for both validation and, ultimately, clinical utility. PMID:21822550
Boulesteix, Anne-Laure; Wilson, Rory; Hapfelmeier, Alexander
2017-09-09
The goal of medical research is to develop interventions that are in some sense superior, with respect to patient outcome, to interventions currently in use. Similarly, the goal of research in methodological computational statistics is to develop data analysis tools that are themselves superior to the existing tools. The methodology of the evaluation of medical interventions continues to be discussed extensively in the literature and it is now well accepted that medicine should be at least partly "evidence-based". Although we statisticians are convinced of the importance of unbiased, well-thought-out study designs and evidence-based approaches in the context of clinical research, we tend to ignore these principles when designing our own studies for evaluating statistical methods in the context of our methodological research. In this paper, we draw an analogy between clinical trials and real-data-based benchmarking experiments in methodological statistical science, with datasets playing the role of patients and methods playing the role of medical interventions. Through this analogy, we suggest directions for improvement in the design and interpretation of studies which use real data to evaluate statistical methods, in particular with respect to dataset inclusion criteria and the reduction of various forms of bias. More generally, we discuss the concept of "evidence-based" statistical research, its limitations and its impact on the design and interpretation of real-data-based benchmark experiments. We suggest that benchmark studies (a method of assessment of statistical methods using real-world datasets) might benefit from adopting (some) concepts from evidence-based medicine towards the goal of more evidence-based statistical research.
ERIC Educational Resources Information Center
Kiran, Swathi; Caplan, David; Sandberg, Chaleece; Levy, Joshua; Berardino, Alex; Ascenso, Elsa; Villard, Sarah; Tripodis, Yorghos
2012-01-01
Purpose: Two new treatments, 1 based on sentence to picture matching (SPM) and the other on object manipulation (OM), that train participants on the thematic roles of sentences using pictures or by manipulating objects were piloted. Method: Using a single-subject multiple-baseline design, sentence comprehension was trained on the affected sentence…
ERIC Educational Resources Information Center
Gutkind, Rebeka Chaia
2012-01-01
This mixed method study investigated the schema strategy uses of fourth-grade boys with reading challenges; specifically, their ability to understand text based on two components within schema theory: tuning and restructuring. Based on the reading comprehension scores from the Iowa Test of Basic Skills (Form 2010), four comparison groups were…
Gandolla, Marta; Molteni, Franco; Ward, Nick S; Guanziroli, Eleonora; Ferrigno, Giancarlo; Pedrocchi, Alessandra
2015-11-01
The intended outcome of a rehabilitation treatment is a stable improvement in functional outcomes, which can be assessed longitudinally through multiple measures to help clinicians in functional evaluation. In this study, we propose an automatic, comprehensive method of combining multiple measures in order to assess functional improvement. As a test-bed, a functional electrical stimulation based treatment for foot drop correction performed with chronic post-stroke participants is presented. Patients were assessed on five relevant outcome measures before and after the intervention and at a follow-up time point. A novel algorithm based on the variables' minimum detectable change is proposed and implemented in custom-made software, combining the outcome measures to obtain a single parameter: the capacity score. The difference between capacity scores at different time points is thresholded to evaluate improvement. Ten clinicians evaluated patients on the Improvement Clinical Global Impression scale. Eleven patients underwent the treatment, and five achieved a stable functional improvement as assessed by the proposed algorithm. A statistically significant agreement between intra-clinician and algorithm-clinician evaluations was demonstrated. The proposed method evaluates functional improvement on a single-subject yes/no basis by merging different measures (e.g., kinematic, muscular), and it was validated against clinical evaluation.
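The minimum-detectable-change logic at the heart of the algorithm can be sketched compactly. The MDC95 formula is standard, but the measure values, reliabilities, and 0.5 threshold below are invented placeholders, not the study's parameters:

    import numpy as np

    def mdc95(sd_baseline, icc):
        """MDC95 from test-retest reliability: 1.96 * sqrt(2) * SEM."""
        sem = sd_baseline * np.sqrt(1.0 - icc)
        return 1.96 * np.sqrt(2.0) * sem

    # Per-measure change scores for one patient (post - pre), with each measure's
    # baseline SD and test-retest ICC (all numbers invented)
    change = np.array([4.0, 0.10, -1.5, 6.0, 0.3])
    sd = np.array([3.0, 0.08, 2.0, 5.0, 0.4])
    icc = np.array([0.90, 0.85, 0.92, 0.88, 0.80])
    sign = np.array([1, 1, -1, 1, 1])   # direction in which change means improvement

    improved = (sign * change) > mdc95(sd, icc)
    capacity_score = improved.mean()                  # fraction of measures improved
    functional_improvement = capacity_score >= 0.5    # illustrative threshold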
NASA Astrophysics Data System (ADS)
Noda, Isao
2014-07-01
A comprehensive survey review of new and noteworthy developments advancing the frontiers of 2D correlation spectroscopy during the last four years is compiled. This review covers books, proceedings, and review articles published on 2D correlation spectroscopy; a number of significant conceptual developments in the field; data pretreatment methods and other pertinent topics; as well as patent and publication trends and citation activities. Developments discussed include projection 2D correlation analysis, concatenated 2D correlation, and correlation under multiple perturbation effects, as well as orthogonal sample design, predicting 2D correlation spectra, manipulating and comparing 2D spectra, correlation strategies based on segmented data blocks, such as moving-window analysis, features like determination of sequential order and enhanced spectral resolution, statistical 2D spectroscopy using covariance and other statistical metrics, hetero-correlation analysis, and the sample-sample correlation technique. Data pretreatment operations prior to 2D correlation analysis are discussed, including correction for physical effects, background and baseline subtraction, selection of the reference spectrum, normalization and scaling of data, derivative spectra and deconvolution techniques, and smoothing and noise reduction. Other pertinent topics include chemometrics and statistical considerations, peak position shift phenomena, variable sampling increments, computation and software, and display schemes, such as color-coded format, slice and power spectra, tabulation, and other schemes.
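As a concrete anchor for readers new to the field, the synchronous and asynchronous spectra of Noda's formalism reduce to two matrix products on the mean-centered dynamic spectra. A minimal sketch with synthetic data:

    import numpy as np

    rng = np.random.default_rng(10)
    m, nu = 20, 150                       # perturbation steps x spectral variables
    Y = rng.normal(size=(m, nu)).cumsum(axis=0)
    Yd = Y - Y.mean(axis=0)               # dynamic spectra (mean-centered)

    # Hilbert-Noda transformation matrix: 0 on the diagonal, 1/(pi*(k-j)) off it
    j, k = np.meshgrid(np.arange(m), np.arange(m), indexing="ij")
    with np.errstate(divide="ignore"):
        N = np.where(j == k, 0.0, 1.0 / (np.pi * (k - j)))

    sync = Yd.T @ Yd / (m - 1)            # synchronous 2D correlation spectrum
    asyn = Yd.T @ (N @ Yd) / (m - 1)      # asynchronous 2D correlation spectrum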
Sanchez, Travis H; Stein, Aryeh D; Stephenson, Rob; Zlotorzynska, Maria; Sineath, Robert Craig; Sullivan, Patrick S
2017-01-01
Background Web-based surveys are increasingly used to capture data essential for human immunodeficiency virus (HIV) prevention research. However, there are challenges in ensuring the informed consent of Web-based research participants. Objective The aim of our study was to develop and assess the efficacy of alternative methods of administering informed consent in Web-based HIV research with men who have sex with men (MSM). Methods From July to September 2014, paid advertisements on Facebook were used to recruit adult MSM living in the United States for a Web-based survey about risk and preventive behaviors. Participants were randomized to one of 4 methods of delivering informed consent: a professionally produced video, a study staff-produced video, a frequently asked questions (FAQs) text page, and a standard informed consent text page. Following the behavior survey, participants answered 15 questions about comprehension of consent information. Correct responses to each question were given a score of 1, for a total possible scale score of 15. General linear regression and post-hoc Tukey comparisons were used to assess differences (P<.001) in mean consent comprehension scores. A mediation analysis was used to examine the relationship between time spent on the consent page and consent comprehension. Results Of the 665 MSM participants who completed the comprehension questions, 24.2% (161/665) received the standard consent, 27.1% (180/665) received the FAQ consent, 26.8% (178/665) received the professional consent video, and 22.0% (146/665) received the staff video. The overall average consent comprehension score was 6.28 (SD=2.89). The average consent comprehension score differed significantly across consent type (P<.001), age (P=.04), race or ethnicity (P<.001), and highest level of education (P=.001). Compared with those who received the standard consent, comprehension was significantly higher for participants who received the professional video consent (score increase=1.79; 95% CI 1.02-2.55) and participants who received the staff video consent (score increase=1.79; 95% CI 0.99-2.59). There was no significant difference in comprehension for those who received the FAQ consent. Participants spent more time on the 2 video consents (staff video median time=117 seconds; professional video median time=115 seconds) than on the FAQ (median=21 seconds) and standard consents (median=37 seconds). Mediation analysis showed that though time spent on the consent page was partially responsible for some of the differences in comprehension, the direct effects of the professional video (score increase=0.93; 95% CI 0.39-1.48) and the staff-produced video (score increase=0.99; 95% CI 0.42-1.56) were still significant. Conclusions Video-based consent methods improve consent comprehension of MSM participating in a Web-based HIV behavioral survey. This effect may be partially mediated through increased time spent reviewing the consent material; however, the video consent may still be superior to standard consent in improving participant comprehension of key study facts. Trial Registration Clinicaltrials.gov NCT02139566; https://clinicaltrials.gov/ct2/show/NCT02139566 (Archived by WebCite at http://www.webcitation.org/6oRnL261N). PMID:28264794
COMAN: a web server for comprehensive metatranscriptomics analysis.
Ni, Yueqiong; Li, Jun; Panagiotou, Gianni
2016-08-11
Microbiota-oriented studies based on metagenomic or metatranscriptomic sequencing have revolutionised our understanding of microbial ecology and the roles of both clinical and environmental microbes. The analysis of massive metatranscriptomic data requires extensive computational resources, a collection of bioinformatics tools and expertise in programming. We developed COMAN (Comprehensive Metatranscriptomics Analysis), a web-based tool dedicated to automatically and comprehensively analysing metatranscriptomic data. The COMAN pipeline includes quality control of raw reads and removal of reads derived from non-coding RNA, followed by functional annotation, comparative statistical analysis, pathway enrichment analysis, co-expression network analysis and high-quality visualisation. The essential data generated by COMAN are also provided in tabular format for additional analysis and integration with other software. The web server has an easy-to-use interface and detailed instructions, and is freely available at http://sbb.hku.hk/COMAN/. COMAN is an integrated web server dedicated to comprehensive functional analysis of metatranscriptomic data, translating massive amounts of reads into data tables and high-standard figures. It is expected to help researchers with less expertise in bioinformatics answer microbiota-related biological questions and to increase the accessibility and interpretation of microbiota RNA-Seq data.
Suner, Aslı; Karakülah, Gökhan; Dicle, Oğuz
2014-01-01
Statistical hypothesis testing is an essential component of biological and medical studies for making inferences and estimations from the data collected in a study; however, misuse of statistical tests is common. To prevent possible errors in statistical test selection, it is currently possible to consult available test-selection algorithms developed for various purposes. However, the lack of an algorithm presenting the most common statistical tests used in biomedical research in a single flowchart causes several problems, such as shifting users among algorithms, poor decision support in test selection and dissatisfaction among potential users. Herein, we present a unified flowchart that covers the statistical tests most commonly used in the biomedical domain, to provide decision support to non-statistician users in choosing the appropriate statistical test for their hypotheses. We also discuss some of the issues encountered while integrating the individual flowcharts into a single, more comprehensive decision algorithm.
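The spirit of such a unified flowchart can be sketched as a small decision function; the rules below cover only a few textbook branches and are far less comprehensive than the published algorithm.

```python
def suggest_test(outcome, groups, paired, normal):
    """Toy decision rules for a few textbook branches; the published
    flowchart covers many more designs and assumptions."""
    if outcome == "continuous":
        if groups == 2:
            if paired:
                return "paired t-test" if normal else "Wilcoxon signed-rank test"
            return "independent t-test" if normal else "Mann-Whitney U test"
        return "one-way ANOVA" if normal else "Kruskal-Wallis test"
    if outcome == "categorical":
        return "McNemar test" if paired else "chi-squared / Fisher exact test"
    raise ValueError("outcome must be 'continuous' or 'categorical'")

print(suggest_test("continuous", groups=2, paired=False, normal=False))
```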
Methods of Comprehensive Assessment for China’s Energy Sustainability
NASA Astrophysics Data System (ADS)
Xu, Zhijin; Song, Yankui
2018-02-01
In order to assess the sustainable development of China’s energy objectively and accurately, we need to establish a reasonable indicator system for energy sustainability and make a targeted comprehensive assessment with scientific methods. This paper constructs a comprehensive indicator system for energy sustainability from five aspects of economy, society, environment, energy resources and energy technology, based on the theory of sustainable development and the theory of symbiosis. On this basis, it establishes and discusses the assessment models and the general assessment methods for energy sustainability with the help of fuzzy mathematics. The results provide a useful reference for promoting the sustainable development of China’s energy, economy and society.
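A fuzzy comprehensive evaluation of the kind the paper builds typically combines indicator weights with a fuzzy membership matrix; the sketch below uses hypothetical weights and memberships purely for illustration.

```python
import numpy as np

# Hypothetical weights over the five aspects
# (economy, society, environment, energy resources, energy technology).
w = np.array([0.25, 0.15, 0.25, 0.20, 0.15])

# Hypothetical membership matrix R: each row gives an aspect's degree of
# membership in the grades (good, fair, poor).
R = np.array([
    [0.5, 0.3, 0.2],
    [0.4, 0.4, 0.2],
    [0.3, 0.4, 0.3],
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
])

# Weighted fuzzy composition B = w . R, then normalization.
B = w @ R
B /= B.sum()
print(dict(zip(["good", "fair", "poor"], B.round(3))))
```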
Application of Competency-Based Education in Laparoscopic Training
Xue, Dongbo; Bo, Hong; Zhao, Song; Meng, Xianzhi
2015-01-01
Background and Objectives: To introduce competency-based education/developing a curriculum (CBE/DACUM) into the training of postgraduate students in laparoscopic surgery. Methods: This study selected postgraduate students before the implementation of competency-based education (n = 16) or after the implementation of competency-based education (n = 17). On the basis of the 5 competencies of patient care, medical knowledge, practice-based learning and improvement, interpersonal and communication skills, and professionalism, the research team created a DACUM chart and specific improvement measures that were implemented in the competency-based education group. Results: On the basis of the DACUM chart, the assessment of the 5 comprehensive competencies using the 360° assessment method indicated that the competency-based education group's competencies were significantly improved compared with those of the traditional group (P < .05). The improvement in the comprehensive assessment was also significant compared with the traditional group (P < .05). Conclusion: The implementation of CBE/DACUM teaching helps to improve the comprehensive competencies of postgraduate students and enables them to become qualified clinicians equipped to meet society's needs. PMID:25901105
Reframing Serial Murder Within Empirical Research.
Gurian, Elizabeth A
2017-04-01
Empirical research on serial murder is limited due to the lack of consensus on a definition, the continued use of primarily descriptive statistics, and linkage to popular culture depictions. These limitations also inhibit our understanding of these offenders and affect credibility in the field of research. Therefore, this comprehensive overview of a sample of 508 cases (738 total offenders, including partnered groups of two or more offenders) provides analyses of solo male, solo female, and partnered serial killers to elucidate statistical differences and similarities in offending and adjudication patterns among the three groups. This analysis of serial homicide offenders not only supports previous research on offending patterns present in the serial homicide literature but also reveals that empirically based analyses can enhance our understanding beyond traditional case studies and descriptive statistics. Further research based on these empirical analyses can aid in the development of more accurate classifications and definitions of serial murderers.
Use of Statistical Analyses in the Ophthalmic Literature
Lisboa, Renato; Meira-Freitas, Daniel; Tatham, Andrew J.; Marvasti, Amir H.; Sharpsten, Lucie; Medeiros, Felipe A.
2014-01-01
Purpose To identify the most commonly used statistical analyses in the ophthalmic literature and to determine the likely gain in comprehension of the literature that readers could expect if they were to sequentially add knowledge of more advanced techniques to their statistical repertoire. Design Cross-sectional study Methods All articles published from January 2012 to December 2012 in Ophthalmology, American Journal of Ophthalmology and Archives of Ophthalmology were reviewed. A total of 780 peer-reviewed articles were included. Two reviewers examined each article and assigned categories to each one depending on the type of statistical analyses used. Discrepancies between reviewers were resolved by consensus. Main Outcome Measures Total number and percentage of articles containing each category of statistical analysis were obtained. Additionally we estimated the accumulated number and percentage of articles that a reader would be expected to be able to interpret depending on their statistical repertoire. Results Readers with little or no statistical knowledge would be expected to be able to interpret the statistical methods presented in only 20.8% of articles. In order to understand more than half (51.4%) of the articles published, readers were expected to be familiar with at least 15 different statistical methods. Knowledge of 21 categories of statistical methods was necessary to comprehend 70.9% of articles, while knowledge of more than 29 categories was necessary to comprehend more than 90% of articles. Articles in retina and glaucoma subspecialties showed a tendency for using more complex analysis when compared to cornea. Conclusions Readers of clinical journals in ophthalmology need to have substantial knowledge of statistical methodology to understand the results of published studies in the literature. The frequency of use of complex statistical analyses also indicates that those involved in the editorial peer-review process must have sound statistical knowledge in order to critically appraise articles submitted for publication. The results of this study could provide guidance to direct the statistical learning of clinical ophthalmologists, researchers and educators involved in the design of courses for residents and medical students. PMID:24612977
Callegaro, Giulia; Malkoc, Kasja; Corvi, Raffaella; Urani, Chiara; Stefanini, Federico M
2017-12-01
The identification of the carcinogenic risk of chemicals is currently based mainly on animal studies. The in vitro Cell Transformation Assays (CTAs) are a promising alternative to be considered in an integrated approach. CTAs measure the induction of foci of transformed cells. CTAs model key stages of the in vivo neoplastic process and are able to detect both genotoxic and some non-genotoxic compounds, being the only in vitro method able to deal with the latter. Despite their favorable features, CTAs can be further improved, especially by reducing the possible subjectivity arising from the last phase of the protocol, namely the visual scoring of foci using coded morphological features. The aim of our work is to take advantage of digital image analysis to translate morphological features into statistical descriptors of foci images, and to use them to mimic the classification performance of the visual scorer in discriminating between transformed and non-transformed foci. Here we present a classifier based on five descriptors trained on a dataset of 1364 foci, obtained with different compounds and concentrations. Our classifier showed accuracy, sensitivity and specificity equal to 0.77 and an area under the curve (AUC) of 0.84. The presented classifier outperforms a previously published model. Copyright © 2017 Elsevier Ltd. All rights reserved.
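For reference, the reported performance figures follow directly from a confusion matrix and ROC analysis; a minimal sketch (with made-up labels and classifier scores) is:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

# Hypothetical labels (1 = transformed focus) and classifier scores.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1])
y_pred = (y_score >= 0.5).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("accuracy   ", (tp + tn) / len(y_true))
print("sensitivity", tp / (tp + fn))
print("specificity", tn / (tn + fp))
print("AUC        ", roc_auc_score(y_true, y_score))
```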
3Drefine: an interactive web server for efficient protein structure refinement
Bhattacharya, Debswapna; Nowotny, Jackson; Cao, Renzhi; Cheng, Jianlin
2016-01-01
3Drefine is an interactive web server for consistent and computationally efficient protein structure refinement with the capability to perform web-based statistical and visual analysis. The 3Drefine refinement protocol utilizes iterative optimization of the hydrogen-bonding network combined with atomic-level energy minimization of the optimized model using composite physics- and knowledge-based force fields for efficient protein structure refinement. The method has been extensively evaluated in blind CASP experiments as well as on large-scale and diverse benchmark datasets, and exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine web server allows for convenient protein structure refinement through text or file input submission, email notification and a provided example submission, and is freely available without any registration requirement. The server also provides comprehensive analysis of submissions through various energy and statistical feedback and interactive visualization of multiple refined models through the JSmol applet, which is equipped with numerous protein model analysis tools. The web server has been extensively tested and used by many users. As a result, the 3Drefine web server conveniently provides a useful tool easily accessible to the community. The 3Drefine web server has been made publicly available at the URL: http://sysbio.rnet.missouri.edu/3Drefine/. PMID:27131371
NASA Astrophysics Data System (ADS)
Wang, Dong
2016-03-01
Gears are the most commonly used components in mechanical transmission systems. Their failures may cause transmission system breakdown and result in economic loss. Identification of different gear crack levels is important to prevent any unexpected gear failure, because gear cracks lead to gear tooth breakage. Signal-processing-based methods mainly require expertise to interpret gear fault signatures, which ordinary users usually find difficult. In order to automatically identify different gear crack levels, intelligent gear crack identification methods should be developed. Previous case studies experimentally proved that K-nearest neighbors based methods exhibit high prediction accuracies for identification of 3 different gear crack levels under different motor speeds and loads. In this short communication, to further enhance the prediction accuracies of existing K-nearest neighbors based methods and extend identification from 3 to 5 different gear crack levels, redundant statistical features are constructed by using the Daubechies 44 (db44) binary wavelet packet transform at different wavelet decomposition levels, prior to the use of a K-nearest neighbors method. The dimensionality of the redundant statistical features is 620, which provides richer gear fault signatures. Since many of these statistical features are redundant and highly correlated with each other, dimensionality reduction is conducted to obtain new significant statistical features. Finally, the K-nearest neighbors method is used to identify 5 different gear crack levels under different motor speeds and loads. A case study including 3 experiments is investigated to demonstrate that the developed method provides higher prediction accuracies than existing K-nearest neighbors based methods for recognizing different gear crack levels under different motor speeds and loads. Based on the new significant statistical features, some other popular statistical models, including linear discriminant analysis, quadratic discriminant analysis, classification and regression trees and the naive Bayes classifier, are compared with the developed method. The results show that the developed method has the highest prediction accuracies among these statistical models. Additionally, selection of the number of new significant features and parameter selection for K-nearest neighbors are thoroughly investigated.
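A hedged sketch of the feature-construction-plus-KNN pipeline described above, using PyWavelets and scikit-learn on synthetic signals; db4 stands in for db44 (which PyWavelets does not provide), and the dimensionality-reduction step is omitted.

```python
import numpy as np
import pywt
from sklearn.neighbors import KNeighborsClassifier

def wp_stat_features(signal, wavelet="db4", level=3):
    """Statistical features (mean, std, RMS, peak) from every
    wavelet-packet node at `level`."""
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, maxlevel=level)
    feats = []
    for node in wp.get_level(level, order="natural"):
        c = np.asarray(node.data)
        feats += [c.mean(), c.std(), np.sqrt(np.mean(c ** 2)), np.max(np.abs(c))]
    return np.array(feats)

rng = np.random.default_rng(0)
X = np.array([wp_stat_features(rng.standard_normal(1024)) for _ in range(40)])
y = rng.integers(0, 5, size=40)  # 5 hypothetical crack levels
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict(X[:3]))
```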
Jennings, Mary Carol; Pradhan, Subarna; Schleiff, Meike; Sacks, Emma; Freeman, Paul A; Gupta, Sundeep; Rassekh, Bahie M; Perry, Henry B
2017-01-01
Background We summarize the findings of assessments of projects, programs, and research studies (collectively referred to as projects) included in a larger review of the effectiveness of community-based primary health care (CBPHC) in improving maternal, neonatal and child health (MNCH). Findings on neonatal and child health are reported elsewhere in this series. Methods We searched PUBMED and other databases through December 2015, and included assessments that underwent data extraction. Data were analyzed to identify themes in interventions implemented, health outcomes, and strategies used in implementation. Results 152 assessments met inclusion criteria. The majority of assessments were set in rural communities. 72% of assessments included 1-10 specific interventions aimed at improving maternal health. A total of 1298 discrete interventions were assessed. Outcome measures were grouped into five main categories: maternal mortality (19% of assessments); maternal morbidity (21%); antenatal care attendance (50%); attended delivery (66%) and facility delivery (69%), with many assessments reporting results on multiple indicators. 15 assessments reported maternal mortality as a primary outcome, and of the seven that performed statistical testing, six reported significant decreases. Seven assessments measured changes in maternal morbidity: postpartum hemorrhage, malaria or eclampsia. Of those, six reported significant decreases and one did not find a significant effect. Assessments of community-based interventions on antenatal care attendance, attended delivery and facility-based deliveries all showed a positive impact. The community-based strategies used to achieve these results often involved community collaboration, home visits, formation of participatory women's groups, and provision of services by outreach teams from peripheral health facilities. Conclusions This comprehensive and systematic review provides evidence of the effectiveness of CBPHC in improving key indicators of maternal morbidity and mortality. Most projects combined community- and facility-based approaches, emphasizing potential added benefits from such holistic approaches. Community-based interventions will be an important component of a comprehensive approach to accelerate improvements in maternal health and to end preventable maternal deaths by 2030. PMID:28685040
Modeling Area-Level Health Rankings.
Courtemanche, Charles; Soneji, Samir; Tchernis, Rusty
2015-10-01
We rank county health using a Bayesian factor analysis model, drawing on secondary county data from the National Center for Health Statistics (through 2007) and the Behavioral Risk Factor Surveillance System (through 2009). Our model builds on the existing county health rankings (CHRs) by using data-derived weights to compute ranks from mortality and morbidity variables, and by quantifying uncertainty based on population, spatial correlation, and missing data. We apply our model to Wisconsin, which has comprehensive data, and Texas, which has substantial missing information. The data were downloaded from www.countyhealthrankings.org. Our estimated rankings are more similar to the CHRs for Wisconsin than for Texas, as the data-derived factor weights are closer to the assigned weights for Wisconsin. The correlations between the CHRs and our ranks are 0.89 for Wisconsin and 0.65 for Texas. Uncertainty is especially severe for Texas given the state's substantial missing data. The reliability of comprehensive CHRs varies from state to state. We advise focusing on the counties that remain among the least healthy after incorporating alternate weighting methods and accounting for uncertainty. Our results also highlight the need for broader geographic coverage in health data. © Health Research and Educational Trust.
Su, Weixing; Chen, Hanning; Liu, Fang; Lin, Na; Jing, Shikai; Liang, Xiaodan; Liu, Wei
2017-03-01
There are many dynamic optimization problems in the real world, in which both convergence behavior and searching ability are carefully required, in contrast to static optimization cases. Such problems require an optimization algorithm to adaptively track the changing optima over dynamic environments, instead of only finding the global optimal solution in a static environment. This paper proposes a novel comprehensive learning artificial bee colony optimizer (CLABC) for optimization in dynamic environments, which employs a pool of optimal foraging strategies to balance the exploration-exploitation tradeoff. The main motive of CLABC is to enrich artificial bee foraging behaviors in the ABC model by combining Powell's pattern search method, a life-cycle mechanism, and a crossover-based social learning strategy. The proposed CLABC is a more colony-realistic model in that bees can reproduce and die dynamically throughout the foraging process, and the population size varies as the algorithm runs. The experiments for evaluating CLABC are conducted on the dynamic moving peaks benchmark. Furthermore, the proposed algorithm is applied to a real-world application of dynamic RFID network optimization. Statistical analysis of all these cases highlights the significant performance improvement due to the beneficial combination and demonstrates the performance superiority of the proposed algorithm.
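For orientation, a bare-bones artificial bee colony loop is sketched below on a static test function; CLABC layers Powell's pattern search, life-cycle population dynamics and crossover-based learning on top of this skeleton, none of which is reproduced here.

```python
import numpy as np

def abc_minimize(f, bounds, n_bees=20, limit=10, iters=100, seed=0):
    """Bare-bones artificial bee colony: perturb food sources toward a
    random partner, and let scouts reinitialize stale sources."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    dim = lo.size
    X = rng.uniform(lo, hi, (n_bees, dim))
    fit = np.apply_along_axis(f, 1, X)
    stale = np.zeros(n_bees, int)
    for _ in range(iters):
        for i in range(n_bees):                 # employed/onlooker phases merged
            j = rng.integers(n_bees)
            cand = np.clip(X[i] + rng.uniform(-1, 1, dim) * (X[i] - X[j]), lo, hi)
            fc = f(cand)
            if fc < fit[i]:
                X[i], fit[i], stale[i] = cand, fc, 0
            else:
                stale[i] += 1
        for i in np.where(stale > limit)[0]:    # scout phase
            X[i] = rng.uniform(lo, hi, dim)
            fit[i], stale[i] = f(X[i]), 0
    b = fit.argmin()
    return X[b], fit[b]

x, v = abc_minimize(lambda z: np.sum(z ** 2), (np.full(5, -5.0), np.full(5, 5.0)))
print(x.round(3), v)
```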
Transmission overhaul and replacement predictions using Weibull and renewal theory
NASA Technical Reports Server (NTRS)
Savage, M.; Lewicki, D. G.
1989-01-01
A method to estimate the frequency of transmission overhauls is presented. This method is based on the two-parameter Weibull statistical distribution for component life. A second method is presented to estimate the number of replacement components needed to support the transmission overhaul pattern. The second method is based on renewal theory. Confidence statistics are applied with both methods to improve the statistical estimate of sample behavior. A transmission example is also presented to illustrate the use of the methods. Transmission overhaul frequency and component replacement calculations are included in the example.
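A minimal sketch of the renewal-theory step, assuming a two-parameter Weibull component life: the expected number of replacements up to an overhaul horizon can be estimated by Monte Carlo simulation (the paper's confidence-statistics refinements are not reproduced here).

```python
import numpy as np
from scipy.stats import weibull_min

def expected_renewals(shape, scale, horizon, n_sim=2000, seed=0):
    """Monte Carlo estimate of the renewal function M(t): the expected
    number of component replacements by `horizon` operating hours,
    assuming two-parameter Weibull component lives."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(n_sim)
    for i in range(n_sim):
        t = weibull_min.rvs(shape, scale=scale, random_state=rng)
        while t <= horizon:
            counts[i] += 1
            t += weibull_min.rvs(shape, scale=scale, random_state=rng)
    return counts.mean()

# Hypothetical component: shape 2.5 (wear-out), characteristic life 4000 h.
print("expected replacements per unit by 10000 h:",
      expected_renewals(2.5, 4000.0, 10000.0))
```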
A note on the kappa statistic for clustered dichotomous data.
Zhou, Ming; Yang, Zhao
2014-06-30
The kappa statistic is widely used to assess the agreement between two raters. Motivated by a simulation-based cluster bootstrap method for calculating the variance of the kappa statistic for clustered physician-patient dichotomous data, we investigate its special correlation structure and develop a new simple and efficient data generation algorithm. For clustered physician-patient dichotomous data, based on the delta method and its special covariance structure, we propose a semi-parametric variance estimator for the kappa statistic. An extensive Monte Carlo simulation study is performed to evaluate the performance of the new proposal and five existing methods with respect to the empirical coverage probability, root-mean-square error, and average width of the 95% confidence interval for the kappa statistic. The variance estimator ignoring the dependence within a cluster is generally inappropriate, and the variance estimators from the new proposal, bootstrap-based methods, and the sampling-based delta method perform reasonably well for at least a moderately large number of clusters (e.g., the number of clusters K ⩾50). The new proposal and the sampling-based delta method provide convenient tools for efficient computations and non-simulation-based alternatives to the existing bootstrap-based methods. Moreover, the new proposal has acceptable performance even when the number of clusters is as small as K = 25. To illustrate the practical application of all the methods, one psychiatric research dataset and two simulated clustered physician-patient dichotomous datasets are analyzed. Copyright © 2014 John Wiley & Sons, Ltd.
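For reference, the kappa point estimate itself is simple; clustering affects its variance, not the estimate. A sketch for two raters' dichotomous ratings:

```python
import numpy as np

def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters' dichotomous ratings (0/1)."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    po = np.mean(r1 == r2)                      # observed agreement
    p1, p2 = r1.mean(), r2.mean()
    pe = p1 * p2 + (1 - p1) * (1 - p2)          # chance agreement
    return (po - pe) / (1 - pe)

print(cohens_kappa([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 0, 1]))
```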
Taguchi optimization of bismuth-telluride based thermoelectric cooler
NASA Astrophysics Data System (ADS)
Anant Kishore, Ravi; Kumar, Prashant; Sanghadasa, Mohan; Priya, Shashank
2017-07-01
In the last few decades, considerable effort has been made to enhance the figure-of-merit (ZT) of thermoelectric (TE) materials. However, the performance of commercial TE devices still remains low due to the fact that the module figure-of-merit depends not only on the material ZT, but also on the operating conditions and configuration of TE modules. This study takes into account a comprehensive set of parameters to conduct a numerical performance analysis of the thermoelectric cooler (TEC) using a Taguchi optimization method. The Taguchi method is a statistical tool that predicts the optimal performance with far fewer experimental runs than conventional experimental techniques. Taguchi results are also compared with the optimized parameters obtained by a full factorial optimization method, which reveals that the Taguchi method provides an optimum or near-optimum TEC configuration using only 25 experiments, against the 3125 experiments needed by the conventional optimization method. This study also shows that environmental factors such as ambient temperature and cooling coefficient do not significantly affect the optimum geometry and optimum operating temperature of TECs. The optimum TEC configuration for simultaneous optimization of cooling capacity and coefficient of performance is also provided.
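The quoted run counts are consistent with five design factors at five levels each (an assumption; the abstract does not list the factors). Under that reading, the comparison is simply:

```latex
N_{\text{full factorial}} = 5^5 = 3125 \ \text{runs}, \qquad
N_{\text{Taguchi}} = 25 \ \text{runs} \quad (L_{25} \text{ orthogonal array})
```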
Dinov, Ivo D; Heavner, Ben; Tang, Ming; Glusman, Gustavo; Chard, Kyle; Darcy, Mike; Madduri, Ravi; Pa, Judy; Spino, Cathie; Kesselman, Carl; Foster, Ian; Deutsch, Eric W; Price, Nathan D; Van Horn, John D; Ames, Joseph; Clark, Kristi; Hood, Leroy; Hampstead, Benjamin M; Dauer, William; Toga, Arthur W
2016-01-01
A unique archive of Big Data on Parkinson's Disease is collected, managed and disseminated by the Parkinson's Progression Markers Initiative (PPMI). The integration of such complex and heterogeneous Big Data from multiple sources offers unparalleled opportunities to study the early stages of prevalent neurodegenerative processes, track their progression and quickly identify the efficacies of alternative treatments. Many previous human and animal studies have examined the relationship of Parkinson's disease (PD) risk to trauma, genetics, environment, co-morbidities, or life style. The defining characteristics of Big Data (large size, incongruency, incompleteness, complexity, multiplicity of scales, and heterogeneity of information-generating sources) all pose challenges to the classical techniques for data management, processing, visualization and interpretation. We propose, implement, test and validate complementary model-based and model-free approaches for PD classification and prediction. To explore PD risk using Big Data methodology, we jointly processed complex PPMI imaging, genetics, clinical and demographic data. Collective representation of the multi-source data facilitates the aggregation and harmonization of complex data elements. This enables joint modeling of the complete data, leading to the development of Big Data analytics, predictive synthesis, and statistical validation. Using heterogeneous PPMI data, we developed a comprehensive protocol for end-to-end data characterization, manipulation, processing, cleaning, analysis and validation. Specifically, we (i) introduce methods for rebalancing imbalanced cohorts, (ii) utilize a wide spectrum of classification methods to generate consistent and powerful phenotypic predictions, and (iii) generate reproducible machine-learning based classification that enables the reporting of model parameters and diagnostic forecasting based on new data. We evaluated several complementary model-based predictive approaches, which failed to generate accurate and reliable diagnostic predictions. However, the results of several machine-learning based classification methods indicated significant power to predict Parkinson's disease in the PPMI subjects (consistent accuracy, sensitivity, and specificity exceeding 96%, confirmed using statistical n-fold cross-validation). Clinical (e.g., Unified Parkinson's Disease Rating Scale (UPDRS) scores), demographic (e.g., age), genetics (e.g., rs34637584, chr12), and derived neuroimaging biomarker (e.g., cerebellum shape index) data all contributed to the predictive analytics and diagnostic forecasting. Model-free Big Data machine learning-based classification methods (e.g., adaptive boosting, support vector machines) can outperform model-based techniques in terms of predictive precision and reliability (e.g., forecasting patient diagnosis). We observed that statistical rebalancing of cohort sizes yields better discrimination of group differences, specifically for predictive analytics based on heterogeneous and incomplete PPMI data. UPDRS scores play a critical role in predicting diagnosis, which is expected based on the clinical definition of Parkinson's disease. Even without longitudinal UPDRS data, however, the accuracy of model-free machine learning based classification is over 80%.
The methods, software and protocols developed here are openly shared and can be employed to study other neurodegenerative disorders (e.g., Alzheimer's, Huntington's, amyotrophic lateral sclerosis), as well as for other predictive Big Data analytics applications.
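A minimal sketch of the model-free classification-with-cross-validation step on synthetic stand-in features (the PPMI data themselves are not reproduced here):

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))             # hypothetical harmonized features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # hypothetical PD/control labels

# Statistical n-fold cross-validation of two model-free classifiers.
for clf in (AdaBoostClassifier(), SVC()):
    acc = cross_val_score(clf, X, y, cv=5)
    print(type(clf).__name__, acc.mean().round(3))
```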
NASA Astrophysics Data System (ADS)
Yang, Zhou; Zhu, Yunpeng; Ren, Hongrui; Zhang, Yimin
2015-03-01
Reliability allocation of computer numerical control (CNC) lathes is very important in industry. Traditional allocation methods focus only on high-failure-rate components rather than moderate-failure-rate components, which is not applicable in some conditions. Aiming to solve the problem of reliability allocation for CNC lathes, a comprehensive reliability allocation method based on cubic transformed functions of failure modes and effects analysis (FMEA) is presented. Firstly, conventional reliability allocation methods are introduced. Then the limitations of directly combining the comprehensive allocation method with the exponentially transformed FMEA method are investigated. Subsequently, a cubic transformed function is established in order to overcome these limitations. Properties of the new transformed function are discussed by considering the failure severity and the failure occurrence. Designers can choose appropriate transform amplitudes according to their requirements. Finally, a CNC lathe and a spindle system are used as examples to verify the new allocation method. Seven criteria are considered to compare the results of the new method with traditional methods. The allocation results indicate that the new method is more flexible than traditional methods. By employing the new cubic transformed function, the method covers a wider range of problems in CNC reliability allocation without losing the advantages of traditional methods.
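One hedged reading of FMEA-weighted allocation, with a placeholder cubic transform (the paper's actual cubic transformed function and amplitude parameters are not reproduced):

```python
import numpy as np

def allocate(system_failure_rate, severity, occurrence):
    """Sketch of FMEA-weighted reliability allocation: components with
    higher transformed criticality receive a smaller share of the system
    failure-rate budget. The cubic transform below is a placeholder, not
    the paper's published function."""
    s, o = np.asarray(severity, float), np.asarray(occurrence, float)
    criticality = (s / 10.0) ** 3 * (o / 10.0) ** 3  # placeholder cubic transform
    w = 1.0 / criticality
    w /= w.sum()                                     # inverse-criticality weights
    return w * system_failure_rate

# Hypothetical subsystem ratings on 1-10 FMEA scales.
print(allocate(1e-3, severity=[9, 6, 3], occurrence=[4, 5, 6]))
```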
Mayo, Charles S; Yao, John; Eisbruch, Avraham; Balter, James M; Litzenberg, Dale W; Matuszak, Martha M; Kessler, Marc L; Weyburn, Grant; Anderson, Carlos J; Owen, Dawn; Jackson, William C; Haken, Randall Ten
2017-01-01
To develop statistical dose-volume histogram (DVH)-based metrics and a visualization method to quantify the comparison of treatment plans with historical experience and among different institutions. The descriptive statistical summary (ie, median, first and third quartiles, and 95% confidence intervals) of volume-normalized DVH curve sets of past experiences was visualized through the creation of statistical DVH plots. Detailed distribution parameters were calculated and stored in JavaScript Object Notation files to facilitate management, including transfer and potential multi-institutional comparisons. In the treatment plan evaluation, structure DVH curves were scored against computed statistical DVHs and weighted experience scores (WESs). Individual, clinically used, DVH-based metrics were integrated into a generalized evaluation metric (GEM) as a priority-weighted sum of normalized incomplete gamma functions. Historical treatment plans for 351 patients with head and neck cancer, 104 with prostate cancer who were treated with conventional fractionation, and 94 with liver cancer who were treated with stereotactic body radiation therapy were analyzed to demonstrate the usage of statistical DVH, WES, and GEM in a plan evaluation. A shareable dashboard plugin was created to display statistical DVHs and integrate GEM and WES scores into a clinical plan evaluation within the treatment planning system. Benchmarking with normal tissue complication probability scores was carried out to compare the behavior of GEM and WES scores. DVH curves from historical treatment plans were characterized and presented, with difficult-to-spare structures (ie, frequently compromised organs at risk) identified. Quantitative evaluations by GEM and/or WES compared favorably with the normal tissue complication probability Lyman-Kutcher-Burman model, transforming a set of discrete threshold-priority limits into a continuous model reflecting physician objectives and historical experience. Statistical DVH offers an easy-to-read, detailed, and comprehensive way to visualize the quantitative comparison with historical experiences and among institutions. WES and GEM metrics offer a flexible means of incorporating discrete threshold-prioritizations and historic context into a set of standardized scoring metrics. Together, they provide a practical approach for incorporating big data into clinical practice for treatment plan evaluations.
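One plausible reading of a GEM-style score (a sketch; the published parameterization of the normalized incomplete gamma terms may differ):

```python
import numpy as np
from scipy.special import gammainc  # regularized lower incomplete gamma

def gem(metrics, limits, priorities, k=4.0):
    """Priority-weighted sum of incomplete-gamma scores of DVH metrics
    relative to their clinical limits: each term is near 0 well under the
    limit and rises steeply toward 1 as the limit is approached/exceeded.
    Shape k and the scaling are illustrative assumptions."""
    m, l = np.asarray(metrics, float), np.asarray(limits, float)
    w = np.asarray(priorities, float)
    w = w / w.sum()
    scores = gammainc(k, k * m / l)
    return float(w @ scores)

# Hypothetical: two OAR metrics against their threshold-priority limits.
print(gem(metrics=[18, 45], limits=[20, 50], priorities=[2, 1]))
```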
Gao, Wen; Yang, Hua; Qi, Lian-Wen; Liu, E-Hu; Ren, Mei-Ting; Yan, Yu-Ting; Chen, Jun; Li, Ping
2012-07-06
Plant-based medicines are becoming increasingly popular around the world. Authentication of herbal raw materials is important to ensure their safety and efficacy. Some herbs belonging to closely related species but differing in medicinal properties are difficult to identify because of similar morphological and microscopic characteristics. Chromatographic fingerprinting is an alternative method to distinguish them. Existing approaches do not allow a comprehensive analysis for herbal authentication. We have now developed a strategy consisting of (1) full metabolic profiling of herbal medicines by rapid resolution liquid chromatography (RRLC) combined with quadrupole time-of-flight mass spectrometry (QTOF MS), (2) global analysis of non-targeted compounds by a molecular feature extraction algorithm, (3) multivariate statistical analysis for classification and prediction, and (4) marker compound characterization. This approach has provided a fast and unbiased comparative multivariate analysis of the metabolite composition of 33 batches of samples covering seven Lonicera species. Individual metabolic profiles are performed at the level of molecular fragments without prior structural assignment. In the entire set, the obtained classifier for the flower buds of the seven Lonicera species showed good prediction performance, and a total of 82 statistically different components were rapidly obtained by the strategy. The elemental compositions of discriminative metabolites were characterized by accurate mass measurement of the pseudomolecular ions, and their chemical types were assigned by the MS/MS spectra. The high-resolution, comprehensive and unbiased strategy for metabolite data analysis presented here is powerful and opens a new direction for authentication in herbal analysis. Copyright © 2012 Elsevier B.V. All rights reserved.
Nimbalkar, Prakash Madhav; Tripathi, Nitin Kumar
2016-11-21
Influenza-like illness (ILI) is an acute respiratory disease that remains a public health concern for its ability to circulate globally, affect any age group and gender, and cause serious illness with mortality risk. Comprehensive assessment of the spatio-temporal dynamics of ILI is a prerequisite for effective risk assessment and application of control measures. Though meteorological parameters, such as rainfall, average relative humidity and temperature, influence ILI and represent crucial information for control of this disease, the relation between the disease and these variables is not clearly understood in tropical climates. The aim of this study was to analyse the epidemiology of ILI cases using integrated methods (space-time analysis, spatial autocorrelation and other correlation statistics). After the 2009 H1N1 influenza pandemic, Phitsanulok Province in northern Thailand was strongly affected by ILI for many years. This study is based on ILI cases in villages in this province from 2005 to 2012. We used highly precise weekly incidence records covering eight years, which allowed accurate estimation of the ILI outbreak. A comprehensive methodology was developed to analyse the global and local patterns of the spread of the disease. Significant space-time clusters were detected over the study region during eight different periods. ILI cases showed seasonal clustered patterns with a peak in 2010 (P<0.05; 9,999 iterations). Local indicators of spatial association identified hotspots for each year. Statistically, the weather pattern showed a clear influence on ILI cases, which correlated strongly with humidity at a lag of 1 month, while temperature showed a weaker correlation.
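The global spatial autocorrelation statistic underlying such cluster analyses is Moran's I; a minimal sketch with hypothetical village incidences and a binary adjacency matrix:

```python
import numpy as np

def morans_i(x, W):
    """Global Moran's I: x = incidence per village, W = spatial weights
    (here simple binary adjacency)."""
    x = np.asarray(x, float)
    z = x - x.mean()
    return len(x) / W.sum() * (z @ W @ z) / (z @ z)

# Four hypothetical villages on a line; neighbors share an edge.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], float)
print(morans_i([12, 10, 3, 2], W))  # positive value -> clustered incidence
```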
Language Sampling for Preschoolers With Severe Speech Impairments
Ragsdale, Jamie; Bustos, Aimee
2016-01-01
Purpose The purposes of this investigation were to determine if measures such as mean length of utterance (MLU) and percentage of comprehensible words can be derived reliably from language samples of children with severe speech impairments and if such measures correlate with tools that measure constructs assumed to be related. Method Language samples of 15 preschoolers with severe speech impairments (but receptive language within normal limits) were transcribed independently by 2 transcribers. Nonparametric statistics were used to determine which measures, if any, could be transcribed reliably and to determine if correlations existed between language sample measures and standardized measures of speech, language, and cognition. Results Reliable measures were extracted from the majority of the language samples, including MLU in words, mean number of syllables per utterance, and percentage of comprehensible words. Language sample comprehensibility measures were correlated with a single word comprehensibility task. Also, language sample MLUs and mean length of the participants' 3 longest sentences from the MacArthur–Bates Communicative Development Inventory (Fenson et al., 2006) were correlated. Conclusion Language sampling, given certain modifications, may be used for some 3- to 5-year-old children with normal receptive language who have severe speech impairments to provide reliable expressive language and comprehensibility information. PMID:27552110
Informing public policy toward binational health insurance: Empirical evidence from California
Fulton, Brent D; Galárraga, Omar; Dow, William H
2015-01-01
Objective To estimate reimbursement rate differences between Mexico and US based physicians reimbursed by a binational health insurance (BHI) plan and US payers, respectively; and show the relationship between plan benefit designs and health care utilization in Mexico. Materials and methods Data include 33 841 and 53 909 HMO enrollees in California from Sistemas Médicos Nacionales (SIMNSA) and Salud con Health Net, respectively. We use descriptive statistical methods. Results SIMNSA’s physician reimbursement rates averaged 50.7% (95% CI: 34.5%–67.0%) of Medi-Cal’s, 28.3% (95% CI: 19.6%–37.0%) of Medicare’s, and 22% of US private plans’. Each year, 99.4% of SIMNSA enrollees but only 0.1% of Salud con Health Net enrollees obtained care in Mexico. Conclusion SIMNSA only covers emergency and urgent care in the US, while Salud con Health Net covers comprehensive care with higher patient cost sharing than in Mexico. To realize potential savings, plans need strong incentives to increase utilization in Mexico. PMID:25153186
Structural kinetic modeling of metabolic networks.
Steuer, Ralf; Gross, Thilo; Selbig, Joachim; Blasius, Bernd
2006-08-08
To develop and investigate detailed mathematical models of metabolic processes is one of the primary challenges in systems biology. However, despite considerable advance in the topological analysis of metabolic networks, kinetic modeling is still often severely hampered by inadequate knowledge of the enzyme-kinetic rate laws and their associated parameter values. Here we propose a method that aims to give a quantitative account of the dynamical capabilities of a metabolic system, without requiring any explicit information about the functional form of the rate equations. Our approach is based on constructing a local linear model at each point in parameter space, such that each element of the model is either directly experimentally accessible or amenable to a straightforward biochemical interpretation. This ensemble of local linear models, encompassing all possible explicit kinetic models, then allows for a statistical exploration of the comprehensive parameter space. The method is exemplified on two paradigmatic metabolic systems: the glycolytic pathway of yeast and a realistic-scale representation of the photosynthetic Calvin cycle.
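A toy sketch of the structural kinetic idea for a two-metabolite chain: sample normalized elasticities instead of committing to explicit rate laws, build the local Jacobian, and tally stability (the matrices below are illustrative, not from the paper).

```python
import numpy as np

# Toy chain: -> S1 -> S2 -> out, at a presumed steady state.
N = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]])     # stoichiometry (2 metabolites, 3 reactions)
S0 = np.array([1.0, 1.0])            # steady-state concentrations (normalized)
v0 = np.array([1.0, 1.0, 1.0])       # steady-state fluxes

rng = np.random.default_rng(0)
n_stable = 0
for _ in range(1000):
    # Sample normalized saturation (elasticity) parameters in [0, 1].
    theta = np.zeros((3, 2))
    theta[1, 0] = rng.uniform(0, 1)   # v2 depends on S1
    theta[2, 1] = rng.uniform(0, 1)   # v3 depends on S2
    J = (N * v0) @ theta / S0[:, None]  # local Jacobian of the scaled system
    n_stable += np.all(np.linalg.eigvals(J).real < 0)
# This simple chain is always stable; richer networks with feedback need not be.
print("fraction stable:", n_stable / 1000)
```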
LFSTAT - Low-Flow Analysis in R
NASA Astrophysics Data System (ADS)
Koffler, Daniel; Laaha, Gregor
2013-04-01
The calculation of characteristic stream flow during dry conditions is a basic requirement for many problems in hydrology, ecohydrology and water resources management. As opposed to floods, a number of different indices are used to characterise low flows and streamflow droughts. Although these indices and methods of calculation have been well documented in the WMO Manual on Low-flow Estimation and Prediction [1], comprehensive software enabling a fast and standardized calculation of low-flow statistics was missing. We present the new software package lfstat to fill this obvious gap. Our package is based on the statistical open-source software R and extends it to analyse daily stream flow records with a focus on low flows. As command-line based programs are not everyone's preference, we also offer a plug-in for the R-Commander, an easy-to-use graphical user interface (GUI) for R based on tcl/tk. The functionality of lfstat includes estimation methods for low-flow indices, extreme value statistics, deficit characteristics, and additional graphical methods to control the computation of complex indices and to illustrate the data. Beside the basic low-flow indices, the baseflow index and recession constants can be computed. For extreme value statistics, state-of-the-art methods for L-moment based local and regional frequency analysis (RFA) are available. The tools for deficit characteristics include various pooling and threshold selection methods to support the calculation of drought duration and deficit indices. The most common graphics for low-flow analysis are available, and the plots can be modified according to the user's preferences. Graphics include hydrographs for different periods, flexible streamflow deficit plots, baseflow visualisation, recession diagnostics, flow duration curves as well as double mass curves, and many more. From a technical point of view, the package uses an S3 class called lfobj (low-flow objects). These objects are ordinary R data frames including date, flow, hydrological year and, optionally, baseflow information. Once these objects are created, analyses can be performed by mouse-click and a script can be saved to make the analysis easily reproducible. At the moment we offer implementations of all major methods proposed in the WMO Manual on Low-flow Estimation and Prediction [1]. Future plans include a dynamic low-flow report in odt file format using odf-weave, which allows automatic updates if data or analysis change. We hope to offer a tool that eases and structures the analysis of stream flow data with a focus on low flows, and that makes analyses transparent and communicable. The package can also be used to teach students the first steps in low-flow hydrology. The software package can be installed from CRAN (latest stable) and R-Forge: http://r-forge.r-project.org (development version). References: [1] Gustard, Alan; Demuth, Siegfried, (eds.) Manual on Low-flow Estimation and Prediction. Geneva, Switzerland, World Meteorological Organization, (Operational Hydrology Report No. 50, WMO-No. 1029).
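lfstat itself is an R package; as a language-neutral illustration of two indices it computes, the sketch below derives Q95 and a crude baseflow-index proxy in Python on synthetic daily flows (the WMO/lfstat BFI uses a block-minima turning-point procedure, which is not reproduced here).

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
q = pd.Series(np.exp(rng.normal(0, 0.6, 365)),
              index=pd.date_range("2012-01-01", periods=365))  # synthetic daily flow

# Q95: the flow exceeded 95% of the time (a standard low-flow index).
q95 = q.quantile(0.05)

# Crude baseflow proxy: smoothed running minimum, only a stand-in for the
# proper turning-point baseflow separation.
baseflow = (q.rolling(5, center=True, min_periods=1).min()
             .rolling(5, center=True, min_periods=1).mean())
bfi = baseflow.sum() / q.sum()
print(f"Q95 = {q95:.3f}, BFI ~ {bfi:.2f}")
```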
Impact of Implementation and Conduct of the HEALTHY Primary Prevention Trial on Student Performance
Hernandez, Arthur E.; Marcus, Marsha D.; Hirst, Kathryn; Faith, Myles S.; Goldberg, Linn; Treviño, Roberto P.
2016-01-01
Purpose To determine whether a school-wide intervention program to reduce risk factors for type 2 diabetes (T2D) affected student achievement, rates of disciplinary actions, and attendance rates. Design The HEALTHY primary prevention trial was designed to evaluate a comprehensive school-based intervention to reduce factors for T2D, especially overweight and obesity. Students were followed up from beginning of sixth grade (Fall 2006) through end of eighth grade (Spring 2009). Setting Forty-two middle schools at seven U.S. sites. Subjects Schools were randomized in equal numbers at each site to intervention (21 schools, 2307 students) or control (21 schools, 2296 students). Intervention An integrated school-wide program that focused on (1) foods and beverages, (2) physical education, (3) classroom-based behavior change and education, and (4) social marketing communication and promotional campaigns. Measures Aggregate (grade- and school-wide) test performance (passing rate), attendance, and referrals for disciplinary actions. Analysis Descriptive statistics and tests of intervention versus control using mixed linear models methods to adjust for the clustering of students within schools. Results There were no differences between intervention and control schools in test performance for mathematics (p = .7835) or reading (p = .6387), attendance (p = .5819), or referrals for disciplinary action (p = .8671). Conclusion The comprehensive HEALTHY intervention and associated research procedures did not negatively impact student achievement test scores, attendance, or referrals for disciplinary action. PMID:24200256
Comprehensive European dietary exposure model (CEDEM) for food additives.
Tennant, David R
2016-05-01
European methods for assessing dietary exposures to nutrients, additives and other substances in food are limited by the availability of detailed food consumption data for all member states. A proposed comprehensive European dietary exposure model (CEDEM) applies summary data published by the European Food Safety Authority (EFSA) in a deterministic model based on an algorithm from the EFSA intake method for food additives. The proposed approach can predict estimates of food additive exposure provided in previous EFSA scientific opinions that were based on the full European food consumption database.
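The additive deterministic structure of such an intake algorithm can be sketched in a few lines; the foods, consumption amounts and concentrations below are hypothetical, not EFSA values.

```python
# Deterministic additive exposure: sum of (consumption x concentration)
# over food categories, normalized by body weight.
foods = {
    #              consumption (g/day), additive concentration (mg/kg food)
    "soft drinks": (250.0, 150.0),
    "confectionery": (40.0, 300.0),
    "sauces": (20.0, 500.0),
}
body_weight_kg = 70.0

exposure_mg_day = sum(cons * conc / 1000.0 for cons, conc in foods.values())
print(f"{exposure_mg_day / body_weight_kg:.3f} mg/kg bw/day")
```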
Binocular optical axis parallelism detection precision analysis based on Monte Carlo method
NASA Astrophysics Data System (ADS)
Ying, Jiaju; Liu, Bingqi
2018-02-01
Based on the working principle of the digital calibration instrument for binocular photoelectric instrument optical axis parallelism, and considering all components of the instrument, the various factors that affect system precision are analyzed and a precision analysis model is established. Based on the error distribution, the Monte Carlo method is used to analyze the relationship between the comprehensive error and the variation of the center coordinate of the circle-target image. The method can further guide the error distribution, optimize and control the factors that have a greater influence on the comprehensive error, and improve the measurement accuracy of the optical axis parallelism digital calibration instrument.
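A minimal Monte Carlo error-propagation sketch in the spirit of the analysis above; the three error sources and their 1-sigma budgets are hypothetical.

```python
import numpy as np

# Perturb assumed component error sources and observe the spread of the
# measured circle-target centre coordinate.
rng = np.random.default_rng(0)
n = 100_000

e_detector = rng.normal(0, 0.10, n)    # centroiding noise (pixels)
e_optics = rng.normal(0, 0.05, n)      # distortion residual (pixels)
e_mount = rng.uniform(-0.08, 0.08, n)  # mechanical play (non-Gaussian)

centre_x = e_detector + e_optics + e_mount
print("comprehensive error, std:", centre_x.std().round(4), "pixels")
print("95% interval:", np.percentile(centre_x, [2.5, 97.5]).round(3))
```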
ERIC Educational Resources Information Center
Wang, Bo; Meier, Ann; Shah, Iqbal; Li, Xiaoming
2006-01-01
The purpose of this study was to evaluate a community-based comprehensive sex education program among unmarried youth in China. The impact of the intervention on sexual knowledge, attitudes, and sexual initiation were assessed, using a pre-test post-test quasi-experimental research design. The program used six methods for providing sex-related…
Kralj, Damir; Kern, Josipa; Tonkovic, Stanko; Koncar, Miroslav
2015-09-09
Family medicine practices (FMPs) form the basis of the Croatian health care system. Use of electronic health record (EHR) software is mandatory and plays an important role in running these practices, but important functional features remain uneven and largely left to the will of the software developers. The objective of this study was to develop a novel and comprehensive model for functional evaluation of EHR software in FMPs, based on current world standards, models and projects, as well as on actual user satisfaction and requirements. Based on previous theoretical and experimental research in this area, we constructed an initial framework model consisting of six basic categories as the basis for an online survey questionnaire. Family doctors assessed perceived software quality using a five-point Likert-type scale. Using exploratory factor analysis and appropriate statistical methods on the collected data, the final optimal structure of the novel model was formed. Special attention was focused on the validity and quality of the novel model. The online survey collected a total of 384 cases. The obtained results indicate both the quality of the assessed software and the quality in use of the novel model. The strong ergonomic orientation of the novel measurement model was particularly emphasised. The resulting novel model is thoroughly validated, comprehensive and universal. It could be used to assess the user-perceived quality of almost all forms of ambulatory EHR software and is therefore useful to all stakeholders in this area of health care informatisation.
Iwata, Hiroaki; Sawada, Ryusuke; Mizutani, Sayaka; Yamanishi, Yoshihiro
2015-02-23
Drug repositioning, or the application of known drugs to new indications, is a challenging issue in pharmaceutical science. In this study, we developed a new computational method to predict unknown drug indications for systematic drug repositioning in a framework of supervised network inference. We defined a descriptor for each drug-disease pair based on the phenotypic features of drugs (e.g., medicinal effects and side effects) and various molecular features of diseases (e.g., disease-causing genes, diagnostic markers, disease-related pathways, and environmental factors) and constructed a statistical model to predict new drug-disease associations for a wide range of diseases in the International Classification of Diseases. Our results show that the proposed method outperforms previous methods in terms of accuracy and applicability, and its performance does not depend on drug chemical structure similarity. Finally, we performed a comprehensive prediction of a drug-disease association network consisting of 2349 drugs and 858 diseases and described biologically meaningful examples of newly predicted drug indications for several types of cancers and nonhereditary diseases.
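A hedged sketch of the supervised pair-classification setup: build a descriptor for each drug-disease pair by concatenating feature vectors (one simple choice; the paper's descriptor construction may differ) and fit a classifier on known associations.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_drugs, n_dis, p_drug, p_dis = 30, 20, 8, 6
drug_feats = rng.integers(0, 2, (n_drugs, p_drug))  # e.g., effects/side effects
dis_feats = rng.integers(0, 2, (n_dis, p_dis))      # e.g., genes/pathways

# Descriptor for a (drug, disease) pair: concatenated feature vectors.
pairs = [(i, j) for i in range(n_drugs) for j in range(n_dis)]
X = np.array([np.concatenate([drug_feats[i], dis_feats[j]]) for i, j in pairs])
y = rng.integers(0, 2, len(pairs))                  # known associations (hypothetical)

clf = LogisticRegression(max_iter=1000).fit(X, y)
print("top candidate score:", clf.predict_proba(X)[:, 1].max().round(3))
```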
Coupling functions: Universal insights into dynamical interaction mechanisms
NASA Astrophysics Data System (ADS)
Stankovski, Tomislav; Pereira, Tiago; McClintock, Peter V. E.; Stefanovska, Aneta
2017-10-01
The dynamical systems found in nature are rarely isolated. Instead they interact and influence each other. The coupling functions that connect them contain detailed information about the functional mechanisms underlying the interactions and prescribe the physical rule specifying how an interaction occurs. A coherent and comprehensive review is presented encompassing the rapid progress made recently in the analysis, understanding, and applications of coupling functions. The basic concepts and characteristics of coupling functions are presented through demonstrative examples of different domains, revealing the mechanisms and emphasizing their multivariate nature. The theory of coupling functions is discussed through gradually increasing complexity from strong and weak interactions to globally coupled systems and networks. A variety of methods that have been developed for the detection and reconstruction of coupling functions from measured data is described. These methods are based on different statistical techniques for dynamical inference. Stemming from physics, such methods are being applied in diverse areas of science and technology, including chemistry, biology, physiology, neuroscience, social sciences, mechanics, and secure communications. This breadth of application illustrates the universality of coupling functions for studying the interaction mechanisms of coupled dynamical systems.
A statistical method for the detection of variants from next-generation resequencing of DNA pools.
Bansal, Vikas
2010-06-15
Next-generation sequencing technologies have enabled the sequencing of several human genomes in their entirety. However, the routine resequencing of complete genomes remains infeasible. The massive capacity of next-generation sequencers can be harnessed for sequencing specific genomic regions in hundreds to thousands of individuals. Sequencing-based association studies are currently limited by the low level of multiplexing offered by sequencing platforms. Pooled sequencing represents a cost-effective approach for studying rare variants in large populations. To utilize the power of DNA pooling, it is important to accurately identify sequence variants from pooled sequencing data. Detection of rare variants from pooled sequencing represents a different challenge than detection of variants from individual sequencing. We describe a novel statistical approach, CRISP [Comprehensive Read analysis for Identification of Single Nucleotide Polymorphisms (SNPs) from Pooled sequencing] that is able to identify both rare and common variants by using two approaches: (i) comparing the distribution of allele counts across multiple pools using contingency tables and (ii) evaluating the probability of observing multiple non-reference base calls due to sequencing errors alone. Information about the distribution of reads between the forward and reverse strands and the size of the pools is also incorporated within this framework to filter out false variants. Validation of CRISP on two separate pooled sequencing datasets generated using the Illumina Genome Analyzer demonstrates that it can detect 80-85% of SNPs identified using individual sequencing while achieving a low false discovery rate (3-5%). Comparison with previous methods for pooled SNP detection demonstrates the significantly lower false positive and false negative rates for CRISP. Implementation of this method is available at http://polymorphism.scripps.edu/~vbansal/software/CRISP/.
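The first of CRISP's two approaches, comparing allele counts across pools with a contingency table, can be sketched directly (counts are hypothetical; CRISP's strand-distribution and pool-size filters are not reproduced):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Non-reference vs reference base counts at one site across three pools.
# Under "sequencing error only", the alternate-allele fraction should not
# differ between pools; a skewed table hints at a real variant.
table = np.array([
    #  alt, ref reads
    [  25, 975],   # pool 1
    [   2, 998],   # pool 2
    [   3, 997],   # pool 3
])
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, p = {p:.2e}")
```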
Voormolen, Eduard H.J.; Wei, Corie; Chow, Eva W.C.; Bassett, Anne S.; Mikulis, David J.; Crawley, Adrian P.
2011-01-01
Voxel-based morphometry (VBM) and automated lobar region of interest (ROI) volumetry are comprehensive and fast methods to detect differences in overall brain anatomy on magnetic resonance images. However, VBM and automated lobar ROI volumetry have detected dissimilar gray matter differences within identical image sets in our own experience and in previous reports. To gain more insight into how diverging results arise and to attempt to establish whether one method is superior to the other, we investigated how differences in spatial scale and in the need to statistically correct for multiple spatial comparisons influence the relative sensitivity of either technique to group differences in gray matter volumes. We assessed the performance of both techniques on a small dataset containing simulated gray matter deficits and additionally on a dataset of 22q11-deletion syndrome patients with schizophrenia (22q11DS-SZ) vs. matched controls. VBM was more sensitive to simulated focal deficits compared to automated ROI volumetry, and could detect global cortical deficits equally well. Moreover, theoretical calculations of VBM and ROI detection sensitivities to focal deficits showed that at increasing ROI size, ROI volumetry suffers more from loss in sensitivity than VBM. Furthermore, VBM and automated ROI found corresponding GM deficits in 22q11DS-SZ patients, except in the parietal lobe. Here, automated lobar ROI volumetry found a significant deficit only after a smaller subregion of interest was employed. Thus, sensitivity to focal differences is impaired relatively more by averaging over larger volumes in automated ROI methods than by the correction for multiple comparisons in VBM. These findings indicate that VBM is to be preferred over automated lobar-scale ROI volumetry for assessing gray matter volume differences between groups. PMID:19619660
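A one-line way to see the dilution effect discussed above (a sketch, writing $d$ for a focal gray matter deficit, $v$ for its volume and $V$ for the ROI volume): averaging spreads the focal effect over the whole ROI,

```latex
\Delta_{\mathrm{ROI}} = d \cdot \frac{v}{V},
```

which shrinks toward zero as $V$ grows relative to $v$, whereas voxel-scale VBM statistics retain an effect near $d$ at the cost of correcting for many more comparisons.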
Evaluation of normalization methods in mammalian microRNA-Seq data
Garmire, Lana Xia; Subramaniam, Shankar
2012-01-01
Simple total tag count normalization is inadequate for microRNA sequencing data generated by next-generation sequencing technology. However, a systematic evaluation of normalization methods on microRNA sequencing data has so far been lacking. We comprehensively evaluate seven commonly used normalization methods: global normalization, Lowess normalization, Trimmed Mean Method (TMM), quantile normalization, scaling normalization, variance stabilization, and the invariant method. We assess these methods on two individual experimental data sets with the empirical statistical metrics of mean square error (MSE) and the Kolmogorov-Smirnov (K-S) statistic. Additionally, we evaluate the methods against results from quantitative PCR validation. Our results consistently show that Lowess normalization and quantile normalization perform the best, whereas TMM, a method developed for RNA-Seq normalization, performs the worst. The poor performance of TMM normalization is further evidenced by abnormal results in tests of differential expression (DE) on microRNA-Seq data. Compared with the choice of model used for DE testing, the choice of normalization method is the primary factor affecting DE results. In summary, Lowess normalization and quantile normalization are recommended for normalizing microRNA-Seq data, whereas the TMM method should be used with caution. PMID:22532701
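For readers unfamiliar with one of the recommended methods, here is a minimal quantile-normalization sketch (not the study's evaluation code); rows are miRNAs, columns are samples, and ties are broken by order.

```python
import numpy as np

def quantile_normalize(counts):
    """Force every sample (column) onto the mean empirical distribution."""
    ranks = counts.argsort(axis=0).argsort(axis=0)         # rank within sample
    mean_quantiles = np.sort(counts, axis=0).mean(axis=1)  # reference quantiles
    return mean_quantiles[ranks]

x = np.array([[5., 4., 3.],
              [2., 1., 4.],
              [3., 4., 6.],
              [4., 2., 8.]])
print(quantile_normalize(x))   # columns now share one distribution
```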
Zhou, Mu; Tian, Zengshan; Xu, Kunjie; Yu, Xiang; Wu, Haibo
2014-01-01
This paper studies the statistical errors of fingerprint-based RADAR neighbor matching localization with linearly calibrated reference points (RPs) in a logarithmic received signal strength (RSS) varying Wi-Fi environment. To the best of our knowledge, little comprehensive analysis has appeared on the error performance of neighbor matching localization with respect to the deployment of RPs. However, achieving efficient and reliable location-based services (LBSs), as well as ubiquitous context-awareness, in Wi-Fi environments requires highly accurate and cost-efficient localization systems. To this end, the statistical errors of the widely used neighbor matching localization are discussed in detail in this paper to examine the inherent mathematical relations between the localization errors and the locations of RPs, using a basic linear logarithmic strength varying model. Furthermore, based on the mathematical demonstrations and some testing results, the closed-form solutions to the statistical errors of RADAR neighbor matching localization can be an effective tool for exploring alternative deployments of fingerprint-based neighbor matching localization systems in the future. PMID:24683349
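A toy version of this setup, with an assumed path-loss exponent, grid spacing, and noise level (all illustrative, not the paper's parameters), makes the fingerprinting mechanics concrete:

```python
import numpy as np

rng = np.random.default_rng(0)
ap = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])   # access point positions

def rss(pos, p0=-30.0, n=3.0, noise=0.0):
    """Linear logarithmic (log-distance) strength model with optional noise."""
    d = np.linalg.norm(ap - pos, axis=1)
    return (p0 - 10 * n * np.log10(np.maximum(d, 0.1))
            + noise * rng.standard_normal(len(ap)))

# Linearly calibrated reference points on a grid, with stored fingerprints.
rps = np.array([[x, y] for x in range(0, 11, 2) for y in range(0, 11, 2)], float)
fingerprints = np.array([rss(rp) for rp in rps])

true_pos = np.array([4.3, 6.7])
observed = rss(true_pos, noise=2.0)
nearest = rps[np.argmin(np.linalg.norm(fingerprints - observed, axis=1))]
print(nearest, np.linalg.norm(nearest - true_pos))       # localization error
```

The paper's contribution is the closed-form relation between this error and where the RPs sit; the sketch only reproduces the matching step itself.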
Statistical lamb wave localization based on extreme value theory
NASA Astrophysics Data System (ADS)
Harley, Joel B.
2018-04-01
Guided wave localization methods based on delay-and-sum imaging, matched field processing, and other techniques have been designed and researched to create images that locate and describe structural damage. The maximum value of these images typically represents an estimated damage location. Yet, it is often unclear whether this maximum value, or any other value in the image, is a statistically significant indicator of damage. Furthermore, there are currently few, if any, approaches to assess the statistical significance of guided wave localization images. As a result, we present statistical delay-and-sum and statistical matched field processing localization methods to create statistically significant images of damage. Our framework uses constant false alarm rate statistics and extreme value theory to detect damage with little prior information. We demonstrate our methods with in situ guided wave data from an aluminum plate to detect two 0.75 cm diameter holes. Our results show an expected improvement in statistical significance as the number of sensors increases. With seventeen sensors, both methods successfully detect damage with statistical significance.
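The delay-and-sum step can be sketched compactly. The toy below fabricates scatter signals for an assumed plate geometry and wave speed and back-projects them onto an image grid; the extreme value theory and constant-false-alarm-rate calibration of the paper are not reproduced.

```python
import numpy as np

c, fs = 5000.0, 1e6                  # assumed group velocity (m/s), sampling (Hz)
sensors = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5], [0.5, 0.5]])
defect = np.array([0.3, 0.2])
pairs = [(i, j) for i in range(4) for j in range(4) if i != j]
t = np.arange(2048) / fs

# Synthetic residual signals: one Gaussian pulse at each pair's scatter time.
signals = []
for i, j in pairs:
    tof = (np.linalg.norm(sensors[i] - defect) +
           np.linalg.norm(defect - sensors[j])) / c
    signals.append(np.exp(-((t - tof) * 2e5) ** 2))

# Delay-and-sum: back-project each pair's signal onto every pixel.
xs = ys = np.linspace(0, 0.5, 51)
image = np.zeros((51, 51))
for k, (i, j) in enumerate(pairs):
    for a, x in enumerate(xs):
        for b, y in enumerate(ys):
            tof = (np.hypot(*(sensors[i] - [x, y])) +
                   np.hypot(*([x, y] - sensors[j]))) / c
            image[b, a] += signals[k][min(int(tof * fs), len(t) - 1)]

peak = np.unravel_index(image.argmax(), image.shape)
print(xs[peak[1]], ys[peak[0]])      # estimated damage location near (0.3, 0.2)
```

The paper's question is precisely whether such a peak is significant; its answer is to calibrate the image maximum against an extreme-value null distribution.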
Hu, Yiwen; Chen, Jiahui; Hu, Guping; Yu, Jianchen; Zhu, Xun; Lin, Yongcheng; Chen, Shengping; Yuan, Jie
2015-01-07
Every year, hundreds of new compounds are discovered from the metabolites of marine organisms. Finding new and useful compounds is one of the crucial drivers for this field of research. Here we describe the statistics of bioactive compounds discovered from marine organisms from 1985 to 2012. This work is based on our database, which contains information on more than 15,000 chemical substances including 4196 bioactive marine natural products. We performed a comprehensive statistical analysis to understand the characteristics of the novel bioactive compounds and detail temporal trends, chemical structures, species distribution, and research progress. We hope this meta-analysis will provide useful information for research into the bioactivity of marine natural products and drug development.
ERIC Educational Resources Information Center
Brickner, Daniel R.; McCombs, Gary B.
2004-01-01
In this article, the authors provide an instructional resource for presenting the indirect method of the statement of cash flows (SCF) in an introductory financial accounting course. The authors focus primarily on presenting a comprehensive example that illustrates the "why" of SCF preparation and show how journal entries and T-accounts can be…
The effect of teaching method on long-term knowledge retention.
Beers, Geri W; Bowden, Susan
2005-11-01
Choosing a teaching strategy that results in knowledge retention on the part of learners can be challenging for educators. Studies on problem-based learning (PBL) have supported its effectiveness, compared to other, more traditional strategies. The results of a previous study comparing the effect of lecture versus PBL on objective test scores indicated there was no significant difference in scores. To measure long-term knowledge retention, the same groups were evaluated 1 year after instruction. The posttest administered in the original study was repeated, and the scores from a comprehensive adult health examination and the endocrine subsection were analyzed. At an alpha level of 0.05, a statistically significant difference was found in the scores on two of the measures. The scores of the PBL group were significantly higher on the endocrine section of the examination and the repeat posttest.
INfORM: Inference of NetwOrk Response Modules.
Marwah, Veer Singh; Kinaret, Pia Anneli Sofia; Serra, Angela; Scala, Giovanni; Lauerma, Antti; Fortino, Vittorio; Greco, Dario
2018-06-15
Detecting and interpreting responsive modules from gene expression data by using network-based approaches is a common but laborious task. It often requires the application of several computational methods implemented in different software packages, forcing biologists to compile complex analytical pipelines. Here we introduce INfORM (Inference of NetwOrk Response Modules), an R shiny application that enables non-expert users to detect, evaluate and select gene modules with high statistical and biological significance. INfORM is a comprehensive tool for the identification of biologically meaningful response modules from consensus gene networks inferred by using multiple algorithms. It is accessible through an intuitive graphical user interface allowing for a level of abstraction from the computational steps. INfORM is freely available for academic use at https://github.com/Greco-Lab/INfORM. Supplementary data are available at Bioinformatics online.
Jiang, Wei; Yu, Weichuan
2017-02-15
In genome-wide association studies (GWASs) of common diseases/traits, we often analyze multiple GWASs with the same phenotype together to discover associated genetic variants with higher power. Since it is difficult to access data with detailed individual measurements, summary-statistics-based meta-analysis methods have become popular to jointly analyze datasets from multiple GWASs. In this paper, we propose a novel summary-statistics-based joint analysis method based on controlling the joint local false discovery rate (Jlfdr). We prove that our method is the most powerful summary-statistics-based joint analysis method when controlling the false discovery rate at a certain level. In particular, the Jlfdr-based method achieves higher power than commonly used meta-analysis methods when analyzing heterogeneous datasets from multiple GWASs. Simulation experiments demonstrate the superior power of our method over meta-analysis methods. Also, our method discovers more associations than meta-analysis methods from empirical datasets of four phenotypes. The R-package is available at: http://bioinformatics.ust.hk/Jlfdr.html. Supplementary data are available at Bioinformatics online.
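The single-study local false discovery rate that Jlfdr generalizes to joint z-scores can be illustrated in a few lines; the two-group mixture, null fraction, and alternative distribution below are assumptions for demonstration, not the paper's estimator.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
z = np.concatenate([rng.standard_normal(9000),       # null z-scores
                    rng.normal(3.0, 1.0, 1000)])     # true signals

pi0, mu1, sd1 = 0.9, 3.0, 1.0                        # assumed two-group mixture
f0 = norm.pdf(z)
f = pi0 * f0 + (1 - pi0) * norm.pdf(z, mu1, sd1)     # marginal density
lfdr = pi0 * f0 / f                                  # P(null | z)

reject = lfdr < 0.1
print(reject.sum(), "rejections; estimated FDR:", lfdr[reject].mean())
```

Averaging the lfdr over the rejected set estimates the realized FDR, which is the control mechanism the joint method extends to multiple studies.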
Zhao, Xing; Zhou, Xiao-Hua; Feng, Zijian; Guo, Pengfei; He, Hongyan; Zhang, Tao; Duan, Lei; Li, Xiaosong
2013-01-01
As a useful tool for geographical cluster detection of events, the spatial scan statistic is widely applied in many fields and plays an increasingly important role. The classic version of the spatial scan statistic for binary outcomes was developed by Kulldorff, based on the Bernoulli or the Poisson probability model. In this paper, we apply the Hypergeometric probability model to construct the likelihood function under the null hypothesis. Compared with existing methods, this likelihood function offers an alternative, indirect way to identify the potential cluster, and the test statistic is the extreme value of the likelihood function. As in Kulldorff's methods, we adopt a Monte Carlo test for the assessment of significance. Both methods are applied to detect spatial clusters of Japanese encephalitis in Sichuan province, China, in 2009, and the detected clusters are identical. A simulation on independent benchmark data indicates that the test statistic based on the Hypergeometric model outperforms Kulldorff's statistics for clusters of high population density or large size; otherwise, Kulldorff's statistics are superior.
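A generic scan-plus-Monte-Carlo skeleton, shared by Kulldorff's statistics and the Hypergeometric variant, looks like the sketch below; for brevity it uses the Poisson likelihood ratio rather than the paper's Hypergeometric likelihood, and all data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)
xy = rng.uniform(0, 10, size=(100, 2))        # region centroids
pop = rng.integers(50, 500, size=100)         # population at risk per region
cases = rng.binomial(pop, 0.02)
cases[np.linalg.norm(xy - [2.0, 2.0], axis=1) < 2.0] += 10  # plant a cluster

def best_llr(cases, radius=2.0):
    """Max Poisson log-likelihood ratio over circular windows."""
    C, P = cases.sum(), pop.sum()
    best = 0.0
    for center in xy:                         # windows centered on each region
        inside = np.linalg.norm(xy - center, axis=1) <= radius
        c, e = cases[inside].sum(), C * pop[inside].sum() / P
        if e < c < C:
            best = max(best, c * np.log(c / e)
                       + (C - c) * np.log((C - c) / (C - e)))
    return best

obs = best_llr(cases)
null = [best_llr(rng.multinomial(cases.sum(), pop / pop.sum()))
        for _ in range(99)]                   # condition on the total case count
print(obs, (1 + sum(n >= obs for n in null)) / 100)   # Monte Carlo p-value
```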
Probing the Statistical Properties of Unknown Texts: Application to the Voynich Manuscript
Amancio, Diego R.; Altmann, Eduardo G.; Rybski, Diego; Oliveira, Osvaldo N.; Costa, Luciano da F.
2013-01-01
While the use of statistical physics methods to analyze large corpora has been useful to unveil many patterns in texts, no comprehensive investigation has been performed on the interdependence between syntactic and semantic factors. In this study we propose a framework for determining whether a text (e.g., written in an unknown alphabet) is compatible with a natural language and to which language it could belong. The approach is based on three types of statistical measurements: first-order statistics of word properties in a text, the topology of complex networks representing texts, and intermittency concepts in which the text is treated as a time series. Comparative experiments were performed with the New Testament in 15 different languages and with distinct books in English and Portuguese in order to quantify the dependency of the different measurements on the language and on the story being told in the book. The metrics found to be informative in distinguishing real texts from their shuffled versions include assortativity, degree and selectivity of words. As an illustration, we analyze an undeciphered medieval manuscript known as the Voynich Manuscript. We show that it is mostly compatible with natural languages and incompatible with random texts. We also obtain candidates for keywords of the Voynich Manuscript which could be helpful in the effort of deciphering it. Because we were able to identify statistical measurements that are more dependent on the syntax than on the semantics, the framework may also serve for text analysis in language-dependent applications. PMID:23844002
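One intermittency-type measurement can be sketched quickly: the burstiness of recurrence gaps for a frequent word, compared against a shuffled baseline. The corpus file name and word selection below are placeholders, not the study's data.

```python
from collections import Counter
import random

def gap_cv(tokens, word):
    """Coefficient of variation of gaps between occurrences of `word`."""
    pos = [i for i, t in enumerate(tokens) if t == word]
    gaps = [b - a for a, b in zip(pos, pos[1:])]
    mean = sum(gaps) / len(gaps)
    var = sum((g - mean) ** 2 for g in gaps) / len(gaps)
    return var ** 0.5 / mean                 # > 1 suggests bursty, text-like use

tokens = open("corpus.txt").read().lower().split()   # placeholder corpus file
word = Counter(tokens).most_common(10)[-1][0]        # a frequent word
shuffled = tokens[:]
random.shuffle(shuffled)
print(gap_cv(tokens, word), gap_cv(shuffled, word))  # text vs. shuffled baseline
```

Real texts typically show bursty recurrence for content words; shuffling destroys it, which is the kind of contrast the framework exploits.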
Multiplex Microsphere Immunoassays for the Detection of IgM and IgG to Arboviral Diseases
Basile, Alison J.; Horiuchi, Kalanthe; Panella, Amanda J.; Laven, Janeen; Kosoy, Olga; Lanciotti, Robert S.; Venkateswaran, Neeraja; Biggerstaff, Brad J.
2013-01-01
Serodiagnosis of arthropod-borne viruses (arboviruses) at the Division of Vector-Borne Diseases, CDC, employs a combination of individual enzyme-linked immunosorbent assays and microsphere immunoassays (MIAs) to test for IgM and IgG, followed by confirmatory plaque-reduction neutralization tests. Based upon the geographic origin of a sample, it may be tested concurrently for multiple arboviruses, which can be a cumbersome task. The advent of multiplexing represents an opportunity to streamline these types of assays; however, because serologic cross-reactivity of the arboviral antigens often confounds results, it is of interest to employ data analysis methods that address this issue. Here, we constructed 13-virus multiplexed IgM and IgG MIAs that included internal and external controls, based upon the Luminex platform. Results from samples tested using these methods were analyzed using 8 different statistical schemes to identify the best way to classify the data. Geographic batteries were also devised to serve as a more practical diagnostic format, and further samples were tested using the abbreviated multiplexes. Comparative error rates for the classification schemes identified a specific boosting method based on logistic regression “Logitboost” as the classification method of choice. When the data from all samples tested were combined into one set, error rates from the multiplex IgM and IgG MIAs were <5% for all geographic batteries. This work represents both the most comprehensive, validated multiplexing method for arboviruses to date, and also the most systematic attempt to determine the most useful classification method for use with these types of serologic tests. PMID:24086608
Forecasting runout of rock and debris avalanches
Iverson, Richard M.; Evans, S.G.; Mugnozza, G.S.; Strom, A.; Hermanns, R.L.
2006-01-01
Physically based mathematical models and statistically based empirical equations each may provide useful means of forecasting runout of rock and debris avalanches. This paper compares the foundations, strengths, and limitations of a physically based model and a statistically based forecasting method, both of which were developed to predict runout across three-dimensional topography. The chief advantage of the physically based model results from its ties to physical conservation laws and well-tested axioms of soil and rock mechanics, such as the Coulomb friction rule and effective-stress principle. The output of this model provides detailed information about the dynamics of avalanche runout, at the expense of high demands for accurate input data, numerical computation, and experimental testing. In comparison, the statistical method requires relatively modest computation and no input data except identification of prospective avalanche source areas and a range of postulated avalanche volumes. Like the physically based model, the statistical method yields maps of predicted runout, but it provides no information on runout dynamics. Although the two methods differ significantly in their structure and objectives, insights gained from one method can aid refinement of the other.
Evaluation of methods for managing censored results when calculating the geometric mean.
Mikkonen, Hannah G; Clarke, Bradley O; Dasika, Raghava; Wallis, Christian J; Reichman, Suzie M
2018-01-01
Currently, there are conflicting views on the best statistical methods for managing censored environmental data. The method commonly applied by environmental science researchers and professionals is to substitute half the limit of reporting when deriving summary statistics. This approach has been criticised by some researchers, raising questions around the interpretation of historical scientific data. This study evaluated four complete soil datasets, at three levels of simulated censorship, to test the accuracy of a range of censored data management methods for calculating the geometric mean. The methods assessed included removal of censored results, substitution of a fixed value (near zero, half the limit of reporting, or the limit of reporting), substitution by nearest neighbour imputation, maximum likelihood estimation, regression on order statistics, and Kaplan-Meier/survival analysis. This is the first time such a comprehensive range of censored data management methods has been applied to assess the accuracy of geometric mean calculation. The results of this study show that, for describing the geometric mean, the simple method of substituting half the limit of reporting is comparable to or more accurate than alternative censored data management methods, including nearest neighbour imputation methods.
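The substitution method the study found accurate is trivially implemented; the limit of reporting and concentrations below are invented for illustration.

```python
import numpy as np

lor = 2.0                                   # assumed limit of reporting
raw = [5.1, 3.4, None, 8.2, None, 4.7]      # None marks a censored (<LoR) result

values = [v if v is not None else lor / 2 for v in raw]   # half-LoR substitution
print(np.exp(np.mean(np.log(values))))                    # geometric mean
```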
Mishmast Nehy, GhA
2015-01-01
Developing and expanding the universities and increasing the admission of medical students resolved the physician shortage, but it lowered educational quality in return. To address this problem, administrators needed to improve the quality of education, which in turn required accurate, up-to-date information about conditions in the different universities. Information on these issues was collected by the Medical Education Council Secretariat and finally published as the Data Bank and Ranking of the Medical Faculties. Method: Although ranking nowadays is more qualitative than quantitative, the above ranking was done by a statistical method. In this research, the statistical population consisted of the data included in the database and the ranking of all 38 medical faculties. First, the ranking of faculties in the comprehensive entrance exam, which reflects the input of the educational system, was taken as the index; the ranking of faculties on the factors affecting education was then arranged according to this input ranking, the outputs of the educational system were adjusted accordingly, and finally a comprehensive table of all the educational information was produced. The relationships of various factors in education with the outputs of the educational system were then examined. Result: The correlations of the factors affecting education were considered separately, collectively, and together, based on the information in the above book. No connection was detected between the factors affecting education and the outputs of the different universities. The only notable relation was between admission ranking and the outcomes of the national basic science exams. Since no meaningful connection was found among the present parameters, it appears inappropriate to simply adopt the ranking factors chosen elsewhere in the world. PMID:28316660
Völkel, Gabriela; Seabi, Joseph; Cockcroft, Kate; Goldschagg, Paul
2016-01-01
The current study constituted part of a larger, longitudinal, South African-based study, namely, The Road and Aircraft Noise Exposure on Children’s Cognition and Health (RANCH—South Africa). In the context of a multicultural South Africa and varying demographic variables thereof, this study sought to investigate and describe the effects of gender, socioeconomic status and home language on primary school children’s reading comprehension in KwaZulu-Natal. In total, 834 learners across 5 public schools in the KwaZulu-Natal province participated in the study. A biographical questionnaire was used to obtain biographical data relevant to this study, and the Suffolk Reading Scale 2 (SRS2) was used to obtain reading comprehension scores. The findings revealed that there was no statistical difference between males and females on reading comprehension scores. In terms of socioeconomic status (SES), learners from a low socioeconomic background performed significantly better than those from a high socioeconomic background. English as a First Language (EL1) speakers had a higher mean reading comprehension score than speakers who spoke English as an Additional Language (EAL). Reading comprehension is indeed affected by a variety of variables, most notably that of language proficiency. The tool to measure reading comprehension needs to be standardized and administered in more than one language, which will ensure increased reliability and validity of reading comprehension scores. PMID:26999169
Concurrent profiling of polar metabolites and lipids in human plasma using HILIC-FTMS
NASA Astrophysics Data System (ADS)
Cai, Xiaoming; Li, Ruibin
2016-11-01
Blood plasma is the most popular sample matrix for metabolite profiling studies, which aim at global metabolite profiling and biomarker discovery. However, most current studies of plasma metabolite profiling have focused on either the polar metabolites or the lipids. In this study, a comprehensive analysis approach based on HILIC-FTMS was developed to examine polar metabolites and lipids concurrently. The HILIC-FTMS method was developed using mixed standards of polar metabolites and lipids, whose separation efficiency is better in HILIC mode than in C5 and C18 reversed phase (RP) chromatography. The method exhibits good reproducibility in retention times (CVs < 3.43%) and high mass accuracy (<3.5 ppm). In addition, we found that a MeOH/ACN/acetone (1:1:1, v/v/v) extraction cocktail achieved desirable recovery of the required extracts from plasma samples. We further integrated the MeOH/ACN/acetone extraction with the HILIC-FTMS method for metabolite profiling and smoking-related biomarker discovery in human plasma samples. Heavy smokers could be successfully distinguished from non-smokers by univariate and multivariate statistical analysis of the profiling data, and 62 biomarkers of cigarette smoking were found. These results indicate that our concurrent analysis approach could potentially be used for clinical biomarker discovery, metabolite-based diagnosis, and related applications.
Zuin, Vânia G; Budarin, Vitaliy L; De Bruyn, Mario; Shuttleworth, Peter S; Hunt, Andrew J; Pluciennik, Camille; Borisova, Aleksandra; Dodson, Jennifer; Parker, Helen L; Clark, James H
2017-09-21
The recovery and separation of high-value, low-volume extractives pose a considerable challenge for the commercial realisation of zero-waste biorefineries. Solid-phase extraction (SPE) based on sustainable sorbents is a promising method to enable efficient, green and selective separation of these complex extractive mixtures. Mesoporous carbonaceous solids derived from renewable polysaccharides are ideal stationary phases due to their tuneable functionality and surface structure. In this study, the structure-separation relationships of thirteen polysaccharide-derived mesoporous materials and two modified types as sorbents for ten naturally-occurring bioactive phenolic compounds were investigated. For the first time, a comprehensive statistical analysis of the key molecular and surface properties influencing the recovery of these species was carried out. The results show the possibility of developing tailored materials for purification, separation or extraction, depending on the molecular composition of the analyte. The wide versatility and application span of these polysaccharide-derived mesoporous materials offer new sustainable and inexpensive alternatives to traditional silica-based stationary phases.
Li, Yingxue; Hu, Yiying; Yang, Jingang; Li, Xiang; Liu, Haifeng; Xie, Guotong; Xu, Meilin; Hu, Jingyi; Yang, Yuejin
2017-01-01
Treatment effectiveness plays a fundamental role in patient therapy. In most observational studies, researchers design an analysis pipeline for a specific treatment based on the study cohort. To evaluate other treatments in the data set, much repetitive and multifarious work, including cohort construction and statistical analysis, needs to be done. In addition, because treatments often have an intrinsic hierarchical relationship, many rational, comparable treatment pairs can be derived from the original cohort data set as new treatment variables besides the original single-treatment variable. In this paper, we propose an automatic treatment effectiveness analysis approach to solve this problem. With our approach, clinicians can assess the effect of treatments not only more conveniently but also more thoroughly and comprehensively. We applied this method to a real-world case of estimating drug effectiveness on the Chinese Acute Myocardial Infarction (CAMI) data set, and some meaningful results were obtained for potential improvement of patient treatments.
Liao, Pei-Hung; Hsu, Pei-Ti; Chu, William; Chu, Woei-Chyn
2015-06-01
This study applied artificial intelligence to help nurses address problems and receive instructions through information technology. Nurses make diagnoses according to professional knowledge, clinical experience, and even instinct; without comprehensive knowledge and thinking, diagnostic accuracy can be compromised and decisions may be delayed. We used a back-propagation neural network and other tools for data mining and statistical analysis. We further compared the prediction accuracy of the previous methods with an adaptive-network-based fuzzy inference system and the back-propagation neural network, identifying differences in the questions and in nurse satisfaction levels before and after using the nursing information system. This study investigated the use of artificial intelligence to generate nursing diagnoses. The percentage of agreement between diagnoses suggested by the information system and those made by nurses was as high as 87 percent. When patients are hospitalized, we can calculate the probability of various nursing diagnoses based on certain characteristics.
Kamal, Ghulam Mustafa; Wang, Xiaohua; Bin Yuan; Wang, Jie; Sun, Peng; Zhang, Xu; Liu, Maili
2016-09-01
Soy sauce, a seasoning well known all over the world and especially in Asia, is available on the global market in a wide range of types based on its purpose and processing methods. Its composition varies with the fermentation process and the addition of additives, preservatives and flavor enhancers. A comprehensive (1)H NMR based study of the metabonomic variations among the different types of soy sauce available on the global market has been limited by the complexity of the mixture. In the present study, (13)C NMR spectroscopy coupled with multivariate statistical data analysis, such as principal component analysis (PCA) and orthogonal partial least squares-discriminant analysis (OPLS-DA), was applied to investigate metabonomic variations among different types of soy sauce, namely super light, super dark, red cooking and mushroom soy sauce. The main additives in soy sauce, such as glutamate, sucrose and glucose, were easily distinguished and quantified using (13)C NMR spectroscopy, whereas they are difficult to assign and quantify from (1)H NMR spectra due to serious signal overlaps. The significantly higher concentration of sucrose in dark, red cooking and mushroom flavored soy sauce can be linked directly to the addition of caramel. Similarly, the significantly higher level of glutamate in super light soy sauce, compared with super dark and mushroom flavored soy sauce, may come from the addition of monosodium glutamate. The study highlights the potential of (13)C NMR based metabonomics coupled with multivariate statistical data analysis to differentiate among types of soy sauce on the basis of additive levels, raw materials and fermentation procedures.
An entropy-based statistic for genomewide association studies.
Zhao, Jinying; Boerwinkle, Eric; Xiong, Momiao
2005-07-01
Efficient genotyping methods and the availability of a large collection of single-nucleotide polymorphisms provide valuable tools for genetic studies of human disease. The standard chi2 statistic for case-control studies, which uses a linear function of allele frequencies, has limited power when the number of marker loci is large. We introduce a novel test statistic for genetic association studies that uses Shannon entropy and a nonlinear function of allele frequencies to amplify the differences in allele and haplotype frequencies to maintain statistical power with large numbers of marker loci. We investigate the relationship between the entropy-based test statistic and the standard chi2 statistic and show that, in most cases, the power of the entropy-based statistic is greater than that of the standard chi2 statistic. The distribution of the entropy-based statistic and the type I error rates are validated using simulation studies. Finally, we apply the new entropy-based test statistic to two real data sets, one for the COMT gene and schizophrenia and one for the MMP-2 gene and esophageal carcinoma, to evaluate the performance of the new method for genetic association studies. The results show that the entropy-based statistic obtained smaller P values than did the standard chi2 statistic.
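To make the contrast concrete, the sketch below computes the standard chi2 statistic and a Shannon-entropy-based contrast on toy allele counts; the exact nonlinear form used in the paper may differ from this illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency

case_alleles = np.array([130, 70])     # counts of allele A, a in cases
ctrl_alleles = np.array([100, 100])    # counts of allele A, a in controls

chi2 = chi2_contingency(np.vstack([case_alleles, ctrl_alleles]))[0]

def entropy(p):
    p = p[p > 0]
    return -(p * np.log(p)).sum()

p_case = case_alleles / case_alleles.sum()
p_ctrl = ctrl_alleles / ctrl_alleles.sum()
p_pool = (case_alleles + ctrl_alleles) / (case_alleles + ctrl_alleles).sum()
# Entropy of the pooled sample minus average group entropy: a nonlinear,
# Jensen-Shannon-style contrast of the allele frequency distributions.
entropy_stat = entropy(p_pool) - 0.5 * (entropy(p_case) + entropy(p_ctrl))
print(chi2, entropy_stat)
```

The nonlinearity is the point: entropy amplifies small frequency differences across many loci, which is how the statistic retains power as marker counts grow.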
NASA Astrophysics Data System (ADS)
Zhang, Y.; Li, F.; Zhang, S.; Hao, W.; Zhu, T.; Yuan, L.; Xiao, F.
2017-09-01
In this paper, a Statistical Distribution based Conditional Random Fields (STA-CRF) algorithm is exploited to improve marginal ice-water classification. Pixel-level ice concentration is presented as part of the comparison of CRF-based methods. Furthermore, in order to find an effective statistical distribution model to integrate into STA-CRF, five statistical distribution models are investigated. The STA-CRF methods are tested on two scenes around Prydz Bay and Adélie Depression, which contain a variety of ice types during the melt season. Experimental results indicate that the proposed method resolves the sea ice edge well in the Marginal Ice Zone (MIZ) and shows a robust distinction between ice and water.
Shulman, Lawrence N; Palis, Bryan E; McCabe, Ryan; Mallin, Kathy; Loomis, Ashley; Winchester, David; McKellar, Daniel
2018-01-01
Survival is considered an important indicator of the quality of cancer care, but the validity of different methodologies to measure comparative survival rates is less well understood. We explored whether the National Cancer Data Base (NCDB) could serve as a source of unadjusted and risk-adjusted cancer survival data and whether these data could be used as quality indicators for individual hospitals or in the aggregate by hospital type. The NCDB, an aggregate of > 1,500 hospital cancer registries, was queried to analyze unadjusted and risk-adjusted hazards of death for patients with stage III breast cancer (n = 116,787) and stage IIIB or IV non-small-cell lung cancer (n = 252,392). Data were analyzed at the individual hospital level and by hospital type. At the hospital level, after risk adjustment, few hospitals had comparative risk-adjusted survival rates that were statistically better or worse. By hospital type, National Cancer Institute-designated comprehensive cancer centers had risk-adjusted survival ratios that were statistically significantly better than those of academic cancer centers and community hospitals. Using the NCDB as the data source, survival rates for patients with stage III breast cancer and stage IIIB or IV non-small-cell lung cancer were statistically better at National Cancer Institute-designated comprehensive cancer centers when compared with other hospital types. Compared with academic hospitals, risk-adjusted survival was lower in community hospitals. At the individual hospital level, after risk adjustment, few hospitals were shown to have statistically better or worse survival, suggesting that, using NCDB data, survival may not be a good metric to determine relative quality of cancer care at this level.
Wisdom, Jennifer P; Cavaleri, Mary A; Onwuegbuzie, Anthony J; Green, Carla A
2012-01-01
Objectives Methodologically sound mixed methods research can improve our understanding of health services by providing a more comprehensive picture of health services than either method can alone. This study describes the frequency of mixed methods in published health services research and compares the presence of methodological components indicative of rigorous approaches across mixed methods, qualitative, and quantitative articles. Data Sources All empirical articles (n = 1,651) published between 2003 and 2007 from four top-ranked health services journals. Study Design All mixed methods articles (n = 47) and random samples of qualitative and quantitative articles were evaluated to identify reporting of key components indicating rigor for each method, based on accepted standards for evaluating the quality of research reports (e.g., use of p-values in quantitative reports, description of context in qualitative reports, and integration in mixed method reports). We used chi-square tests to evaluate differences between article types for each component. Principal Findings Mixed methods articles comprised 2.85 percent (n = 47) of empirical articles, quantitative articles 90.98 percent (n = 1,502), and qualitative articles 6.18 percent (n = 102). There was a statistically significant difference (χ2(1) = 12.20, p = .0005, Cramer's V = 0.09, odds ratio = 1.49 [95% confidence interval = 1.27, 1.74]) in the proportion of quantitative methodological components present in mixed methods compared to quantitative papers (21.94 versus 47.07 percent, respectively) but no statistically significant difference (χ2(1) = 0.02, p = .89, Cramer's V = 0.01) in the proportion of qualitative methodological components in mixed methods compared to qualitative papers (21.34 versus 25.47 percent, respectively). Conclusion Few published health services research articles use mixed methods. The frequency of key methodological components is variable. Suggestions are provided to increase the transparency of mixed methods studies and the presence of key methodological components in published reports. PMID:22092040
A Study of Wind Turbine Comprehensive Operational Assessment Model Based on EM-PCA Algorithm
NASA Astrophysics Data System (ADS)
Zhou, Minqiang; Xu, Bin; Zhan, Yangyan; Ren, Danyuan; Liu, Dexing
2018-01-01
To assess wind turbine performance accurately and provide a theoretical basis for wind farm management, a hybrid assessment model based on the Entropy Method and Principal Component Analysis (EM-PCA) was established, which takes most factors of operational performance into consideration and reaches a comprehensive result. To verify the model, six wind turbines were chosen as the research objects. The ranking obtained by the proposed method was 4#>6#>1#>5#>2#>3#, in complete conformity with the theoretical ranking, which indicates that the EM-PCA method is reliable and effective. The method can guide state comparisons among different units and support wind farm operational assessment.
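The entropy-weighting half of such a hybrid model can be sketched as follows; the turbine indicator matrix is invented, the indicators are assumed to be benefit-type (larger is better), and the PCA stage is omitted.

```python
import numpy as np

# Rows: turbines; columns: illustrative benefit-type indicators
# (availability, energy yield, efficiency index), all invented.
X = np.array([[0.95, 120.0, 3.1],
              [0.90, 150.0, 2.7],
              [0.97, 110.0, 3.4],
              [0.85, 170.0, 2.2]])

P = X / X.sum(axis=0)                        # column-wise proportions
k = 1.0 / np.log(len(X))
entropy = -k * (P * np.log(P)).sum(axis=0)   # high entropy = low information
weights = (1 - entropy) / (1 - entropy).sum()
scores = (X / X.max(axis=0)) @ weights       # weighted score per turbine
print(weights, scores.argsort()[::-1])       # indicator weights, turbine ranking
```

The entropy step rewards indicators that actually discriminate between turbines; PCA would then be layered on to handle correlated indicators.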
Water Quality Evaluation of the Yellow River Basin Based on Gray Clustering Method
NASA Astrophysics Data System (ADS)
Fu, X. Q.; Zou, Z. H.
2018-03-01
We comprehensively evaluated the water quality of 12 monitoring sections in the Yellow River Basin by the gray clustering method, based on water quality monitoring data from the Ministry of Environmental Protection of China for May 2016 and the environmental quality standards for surface water. The results reflect the water quality of the Yellow River Basin objectively. Furthermore, the evaluation results are essentially the same as those of the fuzzy comprehensive evaluation method. The results also show that the overall water quality of the Yellow River Basin is good, which is consistent with the actual situation of the basin. Overall, the gray clustering method is reasonable and feasible for water quality evaluation, and it is also convenient to compute.
Dinov, Ivo D.; Heavner, Ben; Tang, Ming; Glusman, Gustavo; Chard, Kyle; Darcy, Mike; Madduri, Ravi; Pa, Judy; Spino, Cathie; Kesselman, Carl; Foster, Ian; Deutsch, Eric W.; Price, Nathan D.; Van Horn, John D.; Ames, Joseph; Clark, Kristi; Hood, Leroy; Hampstead, Benjamin M.; Dauer, William; Toga, Arthur W.
2016-01-01
Background A unique archive of Big Data on Parkinson’s Disease is collected, managed and disseminated by the Parkinson’s Progression Markers Initiative (PPMI). The integration of such complex and heterogeneous Big Data from multiple sources offers unparalleled opportunities to study the early stages of prevalent neurodegenerative processes, track their progression and quickly identify the efficacies of alternative treatments. Many previous human and animal studies have examined the relationship of Parkinson’s disease (PD) risk to trauma, genetics, environment, co-morbidities, or life style. The defining characteristics of Big Data–large size, incongruency, incompleteness, complexity, multiplicity of scales, and heterogeneity of information-generating sources–all pose challenges to the classical techniques for data management, processing, visualization and interpretation. We propose, implement, test and validate complementary model-based and model-free approaches for PD classification and prediction. To explore PD risk using Big Data methodology, we jointly processed complex PPMI imaging, genetics, clinical and demographic data. Methods and Findings Collective representation of the multi-source data facilitates the aggregation and harmonization of complex data elements. This enables joint modeling of the complete data, leading to the development of Big Data analytics, predictive synthesis, and statistical validation. Using heterogeneous PPMI data, we developed a comprehensive protocol for end-to-end data characterization, manipulation, processing, cleaning, analysis and validation. Specifically, we (i) introduce methods for rebalancing imbalanced cohorts, (ii) utilize a wide spectrum of classification methods to generate consistent and powerful phenotypic predictions, and (iii) generate reproducible machine-learning based classification that enables the reporting of model parameters and diagnostic forecasting based on new data. We evaluated several complementary model-based predictive approaches, which failed to generate accurate and reliable diagnostic predictions. However, the results of several machine-learning based classification methods indicated significant power to predict Parkinson’s disease in the PPMI subjects (consistent accuracy, sensitivity, and specificity exceeding 96%, confirmed using statistical n-fold cross-validation). Clinical (e.g., Unified Parkinson's Disease Rating Scale (UPDRS) scores), demographic (e.g., age), genetics (e.g., rs34637584, chr12), and derived neuroimaging biomarker (e.g., cerebellum shape index) data all contributed to the predictive analytics and diagnostic forecasting. Conclusions Model-free Big Data machine learning-based classification methods (e.g., adaptive boosting, support vector machines) can outperform model-based techniques in terms of predictive precision and reliability (e.g., forecasting patient diagnosis). We observed that statistical rebalancing of cohort sizes yields better discrimination of group differences, specifically for predictive analytics based on heterogeneous and incomplete PPMI data. UPDRS scores play a critical role in predicting diagnosis, which is expected based on the clinical definition of Parkinson’s disease. Even without longitudinal UPDRS data, however, the accuracy of model-free machine learning based classification is over 80%. 
The methods, software and protocols developed here are openly shared and can be employed to study other neurodegenerative disorders (e.g., Alzheimer’s, Huntington’s, amyotrophic lateral sclerosis), as well as for other predictive Big Data analytics applications. PMID:27494614
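The model-free recipe reported here, boosted classification validated by n-fold cross-validation, reduces to a few lines on stand-in data; the PPMI features and preprocessing pipeline are not reproduced.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# Imbalanced stand-in cohort (70/30), loosely mimicking case/control skew.
X, y = make_classification(n_samples=600, n_features=20,
                           weights=[0.7, 0.3], random_state=0)
clf = AdaBoostClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5, scoring="balanced_accuracy").mean())
```

Balanced accuracy is one simple guard against the cohort-imbalance issue the study addresses by rebalancing.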
VALUE - A Framework to Validate Downscaling Approaches for Climate Change Studies
NASA Astrophysics Data System (ADS)
Maraun, Douglas; Widmann, Martin; Gutiérrez, José M.; Kotlarski, Sven; Chandler, Richard E.; Hertig, Elke; Wibig, Joanna; Huth, Radan; Wilcke, Renate A. I.
2015-04-01
VALUE is an open European network to validate and compare downscaling methods for climate change research. VALUE aims to foster collaboration and knowledge exchange between climatologists, impact modellers, statisticians, and stakeholders to establish an interdisciplinary downscaling community. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. Here, we present the key ingredients of this framework. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur: What is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Do methods fail to represent regional climate change? How good is the overall representation of regional climate, including errors inherited from global climate models? The framework will be the basis for a comprehensive community-open downscaling intercomparison study, but is also intended to provide general guidance for other validation studies.
Three Reading Comprehension Strategies: TELLS, Story Mapping, and QARs.
ERIC Educational Resources Information Center
Sorrell, Adrian L.
1990-01-01
Three reading comprehension strategies are presented to assist learning-disabled students: an advance organizer technique called "TELLS Fact or Fiction" used before reading a passage, a schema-based technique called "Story Mapping" used while reading, and a postreading method of categorizing questions called…
Jacob LaFontaine; Lauren Hay; Stacey Archfield; William Farmer; Julie Kiang
2016-01-01
The U.S. Geological Survey (USGS) has developed a National Hydrologic Model (NHM) to support coordinated, comprehensive and consistent hydrologic model development, and facilitate the application of hydrologic simulations within the continental US. The portion of the NHM located within the Gulf Coastal Plains and Ozarks Landscape Conservation Cooperative (GCPO LCC) is...
A New Mathematical Framework for Design Under Uncertainty
2016-05-05
blending multiple information sources via auto-regressive stochastic modeling. A computationally efficient machine learning framework is developed based on ...sion and machine learning approaches; see Fig. 1. This will lead to a comprehensive description of system performance with less uncertainty than in the ... Bayesian optimization of super-cavitating hydrofoils: The goal of this study is to demonstrate the capabilities of statistical learning and
Statistical Abstract of the United States: 2012. 131st Edition
ERIC Educational Resources Information Center
US Census Bureau, 2011
2011-01-01
"The Statistical Abstract of the United States," published from 1878 to 2012, is the authoritative and comprehensive summary of statistics on the social, political, and economic organization of the United States. It is designed to serve as a convenient volume for statistical reference, and as a guide to other statistical publications and…
NASA Astrophysics Data System (ADS)
Dieppois, B.; Pohl, B.; Eden, J.; Crétat, J.; Rouault, M.; Keenlyside, N.; New, M. G.
2017-12-01
The water management community has hitherto neglected or underestimated many of the uncertainties in climate impact scenarios, in particular uncertainties associated with decadal climate variability. Uncertainty in state-of-the-art global climate models (GCMs) is time-scale-dependent, e.g. stronger at decadal than at interannual timescales, in response to the different parameterizations and to internal climate variability. In addition, non-stationarity in statistical downscaling is widely recognized as a key problem, in which the time-scale dependency of predictors plays an important role. As with global climate modelling, therefore, the selection of downscaling methods must proceed with caution to avoid unintended consequences of over-correcting the noise in GCMs (e.g., interpreting internal climate variability as a model bias). GCM outputs from the Coupled Model Intercomparison Project 5 (CMIP5) have therefore first been selected based on their ability to reproduce southern African summer rainfall variability and its teleconnections with Pacific sea-surface temperature across the dominant timescales. In observations, southern African summer rainfall has recently been shown to exhibit significant periodicities at interannual (2-8 years), quasi-decadal (8-13 years) and inter-decadal (15-28 years) timescales, which can be interpreted as the signatures of ENSO, the IPO, and the PDO over the region. Most CMIP5 GCMs underestimate southern African summer rainfall variability and its teleconnections with Pacific SSTs at these three timescales. In addition, according to a more in-depth analysis of historical and pi-control runs, this bias might result from internal climate variability in some of the CMIP5 GCMs, suggesting potential for bias-corrected, prediction-based empirical statistical downscaling. A multi-timescale regression-based downscaling procedure, which determines the predictors across the different timescales, has thus been used to simulate southern African summer rainfall. This multi-timescale procedure shows much better skill in simulating decadal variability than commonly used statistical downscaling approaches.
A nonparametric spatial scan statistic for continuous data.
Jung, Inkyung; Cho, Ho Jin
2015-10-20
Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of this model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compare the performance of the method with that of parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases considered in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.
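A bare-bones rank-sum scan with Monte Carlo calibration, on synthetic skewed data with invented window sizes, might look like this (not the authors' implementation):

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(3)
xy = rng.uniform(0, 10, (200, 2))
vals = rng.lognormal(0.0, 1.0, 200)            # skewed, non-normal data
vals[np.linalg.norm(xy - [3.0, 3.0], axis=1) < 1.5] *= 3   # raised cluster

def max_stat(vals, radius=1.5):
    """Max |Wilcoxon rank-sum| over a coarse set of circular windows."""
    best = 0.0
    for center in xy[::10]:
        inside = np.linalg.norm(xy - center, axis=1) < radius
        if 5 < inside.sum() < 100:
            best = max(best,
                       abs(ranksums(vals[inside], vals[~inside]).statistic))
    return best

obs = max_stat(vals)
null = [max_stat(rng.permutation(vals)) for _ in range(99)]
print(obs, (1 + sum(n >= obs for n in null)) / 100)        # Monte Carlo p-value
```

Because ranks are invariant to monotone transformations, the same scan behaves identically on the raw and log-transformed data, which is the robustness the paper exploits.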
A comprehensive prediction and evaluation method of pilot workload.
Feng, Chuanyan; Wanyan, Xiaoru; Yang, Kun; Zhuang, Damin; Wu, Xu
2018-01-01
The prediction and evaluation of pilot workload is a key problem in the human factors airworthiness of the cockpit. A pilot traffic-pattern task was designed in a flight simulation environment in order to carry out pilot workload prediction and improve the evaluation method. Workload predictions for typical flight subtasks and dynamic phases (cruise, approach, and landing) were built up based on multiple resource theory, and favorable validity was achieved, as verified by correlation analysis between sensitive physiological data and the predicted values. Statistical analysis indicated that eye-movement indices (fixation frequency, mean fixation time, saccade frequency, mean saccade time, and mean pupil diameter), electrocardiogram indices (mean normal-to-normal interval and the ratio of low frequency to the sum of low and high frequency), and electrodermal activity indices (mean tonic and mean phasic) were all sensitive to the typical workloads of subjects. A multinomial logistic regression model based on a combination of physiological indices (fixation frequency, mean normal-to-normal interval, the ratio of low frequency to the sum of low and high frequency, and mean tonic) was constructed, and its classification accuracy was a comparatively satisfactory 84.85%.
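A sketch of the final modeling step with placeholder data; the four feature names mirror the abstract, but the values, class structure, and separation injected below are simulated for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# placeholder features: fixation frequency, mean NN interval,
# LF/(LF+HF) ratio, mean tonic EDA
X = rng.standard_normal((90, 4))
y = np.repeat([0, 1, 2], 30)       # workload level per trial (low/medium/high)
X[y == 1] += 0.8                   # inject separation so the demo discriminates
X[y == 2] += 1.6

# multinomial by default with the lbfgs solver and more than two classes
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
print(cross_val_score(clf, X, y, cv=5).mean())   # cross-validated accuracy
```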
nRC: non-coding RNA Classifier based on structural features.
Fiannaca, Antonino; La Rosa, Massimo; La Paglia, Laura; Rizzo, Riccardo; Urso, Alfonso
2017-01-01
Non-coding RNAs (ncRNAs) are small non-coding sequences involved in gene expression regulation in many biological processes and diseases. The recent discovery of a large set of different ncRNAs with biologically relevant roles has opened the way to developing methods able to discriminate between the different ncRNA classes. Moreover, the lack of knowledge about the complete mechanisms of regulative processes, together with the development of high-throughput technologies, calls for bioinformatics tools that provide biologists and clinicians with a deeper comprehension of the functional roles of ncRNAs. In this work, we introduce a new ncRNA classification tool, nRC (non-coding RNA Classifier). Our approach is based on feature extraction from the ncRNA secondary structure together with a supervised classification algorithm implementing a deep learning architecture based on convolutional neural networks. We tested our approach on the classification of 13 different ncRNA classes and report classification scores using the most common statistical measures; in particular, we reach accuracy and sensitivity scores of about 74%. The proposed method outperforms other similar classification methods based on secondary structure features and machine learning algorithms, including the RNAcon tool that, to date, is the reference classifier. The nRC tool is freely available as a docker image at https://hub.docker.com/r/tblab/nrc/. The source code of the nRC tool is also available at https://github.com/IcarPA-TBlab/nrc.
Coenen, Michaela; Stamm, Tanja A; Stucki, Gerold; Cieza, Alarcos
2012-03-01
To compare two different approaches to performing focus groups and individual interviews: an open approach and an approach based on the International Classification of Functioning, Disability and Health (ICF). Patients with rheumatoid arthritis attended focus groups (n = 49) and individual interviews (n = 21). Time, number of concepts, ICF categories identified, and sample size for reaching saturation of data were compared. Descriptive statistics, Chi-square tests, and independent t tests were performed. With an overall time of 183 h, focus groups were more time consuming than individual interviews (t = 9.782; P < 0.001). In the open approach, 188 categories were identified in the focus groups and 102 in the interviews, compared to the 231 and 110 respective categories identified in the ICF-based approach. Saturation of data was reached after performing five focus groups and nine individual interviews in the open approach, and five focus groups and 12 individual interviews in the ICF-based approach. The method chosen should depend on the objective of the study, issues related to the health condition, and the study's participants. We recommend performing focus groups if the objective of the study is to comprehensively explore the patient perspective.
Hsiao, Tzu-Hung; Chiu, Yu-Chiao; Hsu, Pei-Yin; Lu, Tzu-Pin; Lai, Liang-Chuan; Tsai, Mong-Hsun; Huang, Tim H.-M.; Chuang, Eric Y.; Chen, Yidong
2016-01-01
Several mutual information (MI)-based algorithms have been developed to identify dynamic gene-gene and function-function interactions governed by key modulators (genes, proteins, etc.). Due to intensive computation, however, these methods rely heavily on prior knowledge and are limited in genome-wide analysis. We present the modulated gene/gene set interaction (MAGIC) analysis to systematically identify genome-wide modulation of interaction networks. Based on a novel statistical test employing conjugate Fisher transformations of correlation coefficients, MAGIC features fast computation and adaptation to variations in clinical cohorts. In simulated datasets MAGIC achieved greatly improved computational efficiency and overall superior performance compared with the MI-based method. We applied MAGIC to construct the estrogen receptor (ER)-modulated gene and gene set (representing biological function) interaction networks in breast cancer. Several novel interaction hubs and functional interactions were discovered. The ER+-dependent interaction between TGFβ and NFκB was further shown to be associated with patient survival. The findings were verified in independent datasets. Using MAGIC, we also assessed the essential roles of ER modulation in another hormonal cancer, ovarian cancer. Overall, MAGIC is a systematic framework for comprehensively identifying and constructing modulated interaction networks in a whole-genome landscape. A MATLAB implementation of MAGIC is available for academic use at https://github.com/chiuyc/MAGIC. PMID:26972162
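The statistical device at MAGIC's core, testing whether a correlation differs between two cohorts via Fisher-transformed correlation coefficients, can be illustrated generically; this is the textbook two-sample version, not MAGIC's exact conjugate formulation:

```python
import numpy as np
from scipy.stats import norm

def diff_corr_test(r1, n1, r2, n2):
    """Two-sample test for equality of Pearson correlations (Fisher z)."""
    z1, z2 = np.arctanh(r1), np.arctanh(r2)          # Fisher z-transformation
    se = np.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))    # standard error of z1 - z2
    z = (z1 - z2) / se
    return z, 2.0 * norm.sf(abs(z))                  # z statistic, two-sided p

# e.g. a gene pair with r = 0.6 in 80 ER+ samples vs r = 0.1 in 70 ER- samples
print(diff_corr_test(0.6, 80, 0.1, 70))
```

Because each test reduces to closed-form arithmetic on precomputed correlations, scanning all gene pairs genome-wide stays cheap, which is the computational advantage claimed over MI-based approaches.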
Advances for the Topographic Characterisation of SMC Materials
Calvimontes, Alfredo; Grundke, Karina; Müller, Anett; Stamm, Manfred
2009-01-01
For a comprehensive study of Sheet Moulding Compound (SMC) surfaces, topographical data obtained by a contact-free optical method (chromatic aberration confocal imaging) were systematically acquired to characterise these surfaces with regard to their statistical, functional and volumetric properties. Optimal sampling conditions (cut-off length and resolution) were obtained by a topographical-statistical procedure proposed in the present work. By using different length scales, specific morphologies due to the influence of moulding conditions, metallic mould topography, glass fibre content and glass fibre orientation can be characterised. The aim of this study is to suggest a systematic topographical characterisation procedure for composite materials in order to study and recognise the influence of production conditions on their surface quality.
Ing, Alex; Schwarzbauer, Christian
2014-01-01
Functional connectivity has become an increasingly important area of research in recent years. At a typical spatial resolution, approximately 300 million connections link each voxel in the brain with every other. This pattern of connectivity is known as the functional connectome. Connectivity is often compared between experimental groups and conditions. Standard methods used to control the type 1 error rate are likely to be insensitive when comparisons are carried out across the whole connectome, due to the huge number of statistical tests involved. To address this problem, two new cluster based methods – the cluster size statistic (CSS) and cluster mass statistic (CMS) – are introduced to control the family wise error rate across all connectivity values. These methods operate within a statistical framework similar to the cluster based methods used in conventional task based fMRI. Both methods are data driven, permutation based and require minimal statistical assumptions. Here, the performance of each procedure is evaluated in a receiver operating characteristic (ROC) analysis, utilising a simulated dataset. The relative sensitivity of each method is also tested on real data: BOLD (blood oxygen level dependent) fMRI scans were carried out on twelve subjects under normal conditions and during the hypercapnic state (induced through the inhalation of 6% CO2 in 21% O2 and 73% N2). Both CSS and CMS detected significant changes in connectivity between normal and hypercapnic states. A family wise error correction carried out at the individual connection level exhibited no significant changes in connectivity. PMID:24906136
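A toy, one-dimensional version of the cluster mass statistic with sign-flipping permutation inference; real connectome data would define clusters over a connectivity graph, and the cluster-forming threshold here is an assumption:

```python
import numpy as np

def cluster_masses(t, thresh):
    """Mass (sum of supra-threshold t-values) of each contiguous cluster."""
    masses, current = [], 0.0
    for v in t:
        if v > thresh:
            current += v
        elif current > 0.0:
            masses.append(current)
            current = 0.0
    if current > 0.0:
        masses.append(current)
    return masses

def cms_fwe_pvalue(a, b, thresh=2.0, n_perm=500, seed=0):
    """FWE-corrected p-value for the largest cluster mass, comparing two
    paired conditions (subjects x tests) via sign-flipping permutations."""
    rng = np.random.default_rng(seed)
    d = a - b                                   # paired differences
    t_of = lambda x: x.mean(0) / (x.std(0, ddof=1) / np.sqrt(len(x)))
    obs = max(cluster_masses(t_of(d), thresh), default=0.0)
    null_max = [
        max(cluster_masses(t_of(d * rng.choice([-1.0, 1.0], size=(len(d), 1))),
                           thresh), default=0.0)
        for _ in range(n_perm)
    ]
    return obs, (1 + sum(m >= obs for m in null_max)) / (n_perm + 1)

rng = np.random.default_rng(1)
a = rng.standard_normal((12, 200))    # 12 subjects x 200 connections, state A
b = rng.standard_normal((12, 200))
b[:, 50:70] -= 0.8                    # planted connectivity change
print(cms_fwe_pvalue(a, b))
```

Pooling evidence over clusters of neighbouring tests is what recovers sensitivity relative to correcting each of the millions of connections individually.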
Lee, Kyubum; Kim, Byounggun; Jeon, Minji; Kim, Jihye; Tan, Aik Choon
2018-01-01
Background With the development of artificial intelligence (AI) technology centered on deep-learning, the computer has evolved to a point where it can read a given text and answer a question based on the context of the text. Such a specific task is known as the task of machine comprehension. Existing machine comprehension tasks mostly use datasets of general texts, such as news articles or elementary school-level storybooks. However, no attempt has been made to determine whether an up-to-date deep learning-based machine comprehension model can also process scientific literature containing expert-level knowledge, especially in the biomedical domain. Objective This study aims to investigate whether a machine comprehension model can process biomedical articles as well as general texts. Since there is no dataset for the biomedical literature comprehension task, our work includes generating a large-scale question answering dataset using PubMed and manually evaluating the generated dataset. Methods We present an attention-based deep neural model tailored to the biomedical domain. To further enhance the performance of our model, we used a pretrained word vector and biomedical entity type embedding. We also developed an ensemble method of combining the results of several independent models to reduce the variance of the answers from the models. Results The experimental results showed that our proposed deep neural network model outperformed the baseline model by more than 7% on the new dataset. We also evaluated human performance on the new dataset. The human evaluation result showed that our deep neural model outperformed humans in comprehension by 22% on average. Conclusions In this work, we introduced a new task of machine comprehension in the biomedical domain using a deep neural model. Since there was no large-scale dataset for training deep neural models in the biomedical domain, we created the new cloze-style datasets Biomedical Knowledge Comprehension Title (BMKC_T) and Biomedical Knowledge Comprehension Last Sentence (BMKC_LS) (together referred to as BioMedical Knowledge Comprehension) using the PubMed corpus. The experimental results showed that the performance of our model is much higher than that of humans. We observed that our model performed consistently better regardless of the degree of difficulty of a text, whereas humans have difficulty when performing biomedical literature comprehension tasks that require expert level knowledge. PMID:29305341
2010-01-01
Background The strongest causal evidence that customary spanking increases antisocial behavior is based on prospective studies that control statistically for initial antisocial differences. None of those studies have investigated alternative disciplinary tactics that parents could use instead of spanking, however. Further, the small effects in those studies could be artifactual due to residual confounding, reflecting child effects on the frequency of all disciplinary tactics. This study re-analyzes the strongest causal evidence against customary spanking and uses these same methods to determine whether alternative disciplinary tactics are more effective in reducing antisocial behavior. Methods This study re-analyzed a study by Straus et al.[1] on spanking and antisocial behavior using a sample of 785 children who were 6 to 9 years old in the 1988 cohort of the American National Longitudinal Survey of Youth. The comprehensiveness and reliability of the covariate measure of initial antisocial behavior were varied to test for residual confounding. All analyses were repeated for grounding, privilege removal, and sending children to their room, and for psychotherapy. To account for covarying use of disciplinary tactics, the analyses were redone first for the 73% who had reported using at least one discipline tactic and second by controlling for usage of other disciplinary tactics and psychotherapy. Results The apparently adverse effect of spanking on antisocial behavior was replicated using the original trichotomous covariate for initial antisocial behavior. A similar pattern of adverse effects was shown for grounding and psychotherapy and partially for the other two disciplinary tactics. All of these effects became non-significant after controlling for latent comprehensive measures of externalizing behavior problems. Conclusions These results are consistent with residual confounding, a statistical artifact that makes all corrective actions by parents and psychologists appear to increase children's antisocial behavior due to child effects on parents. Improved research methods are needed to discriminate between effective vs. counterproductive implementations of disciplinary tactics. How and when disciplinary tactics are used may be more important than which type of tactic is used. PMID:20175902
SPOTting Model Parameters Using a Ready-Made Python Package
Houska, Tobias; Kraft, Philipp; Chamorro-Chavez, Alejandro; Breuer, Lutz
2015-01-01
The choice of a specific parameter estimation method is often driven more by its availability than by its performance. We developed SPOTPY (Statistical Parameter Optimization Tool), an open-source Python package containing a comprehensive set of methods typically used to calibrate, analyze and optimize parameters for a wide range of ecological models. SPOTPY currently contains eight widely used algorithms and 11 objective functions, and can sample from eight parameter distributions. SPOTPY has a model-independent structure and can be run in parallel, from the workstation to large computation clusters, using the Message Passing Interface (MPI). We tested SPOTPY in five different case studies: parameterization of the Rosenbrock, Griewank and Ackley functions; a one-dimensional physically based soil moisture routine, in which we searched for parameters of the van Genuchten-Mualem function; and calibration of a biogeochemistry model with different objective functions. The case studies reveal that the implemented SPOTPY methods can be used for any model with just a minimal amount of code for maximal power of parameter optimization. They further show the benefit of having at hand one package that includes a number of well-performing parameter search methods, since not every case study can be solved satisfactorily with every algorithm or every objective function. PMID:26680783
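SPOTPY's own classes and samplers are documented with the package; as a library-agnostic sketch of the task it automates, a bare Monte Carlo search on the two-dimensional Rosenbrock test function looks like this:

```python
import numpy as np

def rosenbrock(x, y):
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2

rng = np.random.default_rng(42)
n = 100_000
xs = rng.uniform(-2, 2, n)     # draw candidate parameters from uniform priors
ys = rng.uniform(-1, 3, n)
obj = rosenbrock(xs, ys)       # objective is 0 at the optimum (1, 1)
best = np.argmin(obj)
print(xs[best], ys[best], obj[best])
```

Packages like SPOTPY wrap exactly this loop, swapping in smarter samplers (e.g. SCE-UA or DREAM-type algorithms) and alternative objective functions without requiring changes to the model code.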
Sun, Qian; Chang, Lu; Ren, Yanping; Cao, Liang; Sun, Yingguang; Du, Yingfeng; Shi, Xiaowei; Wang, Qiao; Zhang, Lantong
2012-11-01
A novel method based on high-performance liquid chromatography coupled with electrospray ionization tandem mass spectrometry was developed for simultaneous determination of the 11 major active components, ten flavonoids and one phenolic acid, in Cirsium setosum. Separation was performed on a reversed-phase C(18) column with gradient elution of methanol and 0.1‰ acetic acid (v/v). The identification and quantification of the analytes were achieved on a hybrid quadrupole linear ion trap mass spectrometer. Multiple-reaction monitoring scanning was employed for quantification, with the electrospray ion source polarity switched between positive and negative modes in a single run. Full validation of the assay was carried out, including linearity, precision, accuracy, stability, and limits of detection and quantification. The results demonstrated that the developed method was reliable, rapid, and specific. Twenty-five batches of C. setosum samples from different sources were determined for the first time using the developed method, and the total contents of the 11 analytes ranged from 1717.460 to 23028.258 μg/g. Among them, the content of linarin was highest, with a mean value of 7340.967 μg/g. Principal component analysis and hierarchical clustering analysis were performed to differentiate and classify the samples, which is helpful for comprehensive evaluation of the quality of C. setosum. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
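The chemometric step can be sketched as follows, with a simulated 25 × 11 content matrix standing in for the measured analyte concentrations:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.lognormal(mean=2.0, sigma=0.5, size=(25, 11))  # 25 batches x 11 analytes
Xs = StandardScaler().fit_transform(X)                 # autoscale each analyte

scores = PCA(n_components=2).fit_transform(Xs)         # scores for a 2-PC plot
Z = linkage(Xs, method="ward")                         # hierarchical clustering
groups = fcluster(Z, t=3, criterion="maxclust")        # cut into three groups
```

Autoscaling matters here because the analyte contents span orders of magnitude (linarin alone averages over 7000 μg/g), and unscaled PCA would be dominated by the most abundant compound.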
Shuster, Sara M.; Davey, Cynthia S.
2014-01-01
Objective: Determine the percentage of subjects taking antipsychotics who meet criteria for metabolic syndrome based on point-of-care testing analyses. Evaluate pharmacist comprehensive medication management services using point-of-care tests to reduce the mean difference in number of metabolic syndrome risk parameters at 6 and 12 months. Method: This 12-month, prospective, multisite, randomized, controlled study included 120 subjects taking antipsychotics (mean [SD] age of 42.9 [11.3] years) recruited from 3 community mental health clinics in Minnesota. Subjects consented to receive either pharmacist (PCS; n = 60) or no pharmacist (NCS; n = 60) comprehensive medication management services. Data were collected from February 2010 to January 2012. Results: No statistical differences in metabolic syndrome based on point-of-care tests were observed between the 2 groups at baseline (PCS: 85.2%, n = 46 versus NCS: 71.2%, n = 42, P = .073) or at 12 months (PCS: 84.4%, n = 38 versus NCS: 70.2%, n = 33, P = .104). Subjects, overall, screened positive at baseline for dyslipidemia (85.8%, n = 106), hypertension (52.5%, n = 63), and diabetes (22.5%, n = 27) based on point-of-care testing for metabolic risk criteria. After 12 months, a nonsignificant (P = .099) higher adjusted mean number of metabolic syndrome parameters in PCS subjects compared to NCS subjects (mean difference [95% CI] = 0.41 [−0.08 to 0.90]) was found. Conclusions: A relatively high proportion of subjects met criteria for metabolic syndrome, although no significant improvement was observed between the groups after 12 months. Point-of-care test analyses identified a high proportion of subjects meeting criteria for dyslipidemia, hypertension, and diabetes. Utilizing point-of-care tests in mental health settings and fostering interprofessional partnerships with comprehensive medication management pharmacists may improve identification and long-term management of metabolic risks among patients prescribed antipsychotics. Trial Registration: ClinicalTrials.gov identifier: NCT02029989 PMID:25667811
Weinberg, Benjamin A.; Gowen, Kyle; Lee, Thomas K.; Ou, Sai‐Hong Ignatius; Bristow, Robert; Krill, Lauren; Almira‐Suarez, M. Isabel; Ali, Siraj M.; Miller, Vincent A.; Liu, Stephen V.
2017-01-01
Background. Metastatic recurrence after treatment for locoregional cancer is a major cause of morbidity and cancer‐specific mortality. Distinguishing metastatic recurrence from the development of a second primary cancer has important prognostic and therapeutic value and represents a difficult clinical scenario. Advances beyond histopathological comparison are needed. We sought to interrogate the ability of comprehensive genomic profiling (CGP) to aid in distinguishing between these clinical scenarios. Materials and Methods. We identified three prospective cases of recurrent tumors in patients previously treated for localized cancers in which histologic analyses suggested subsequent development of a distinct second primary. Paired samples from the original primary and recurrent tumor were subjected to hybrid capture next‐generation sequencing‐based CGP to identify base pair substitutions, insertions, deletions, copy number alterations (CNA), and chromosomal rearrangements. Genomic profiles between paired samples were compared using previously established statistical clonality assessment software to gauge relatedness beyond global CGP similarities. Results. A high degree of similarity was observed among genomic profiles from morphologically distinct primary and recurrent tumors. Genomic information suggested reclassification as recurrent metastatic disease, and patients received therapy for metastatic disease based on the molecular determination. Conclusions. Our cases demonstrate an important adjunct role for CGP technologies in separating metastatic recurrence from development of a second primary cancer. Larger series are needed to confirm our observations, but comparative CGP may be considered in patients for whom distinguishing metastatic recurrence from a second primary would alter the therapeutic approach. Implications for Practice. Distinguishing a metastatic recurrence from a second primary cancer can represent a difficult clinicopathologic problem but has important prognostic and therapeutic implications. Approaches to aid histologic analysis may improve clinician and pathologist confidence in this increasingly common clinical scenario. Our series provides early support for incorporating paired comprehensive genomic profiling in clinical situations in which determination of metastatic recurrence versus a distinct second primary cancer would influence patient management. PMID:28193735
Participant comprehension of research for which they volunteer: a systematic review.
Montalvo, Wanda; Larson, Elaine
2014-11-01
Evidence indicates that research participants often do not fully understand the studies for which they have volunteered. The aim of this systematic review was to examine the relationship between the process of obtaining informed consent for research and participant comprehension and satisfaction with the research. Systematic review of published research on informed consent and participant comprehension of research for which they volunteer using the Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) Statement as a guide. PubMed, the Cumulative Index to Nursing and Allied Health Literature, the Cochrane Central Register of Controlled Trials, and the Cochrane Database of Systematic Reviews were used to search the literature for studies meeting the following inclusion criteria: (a) published between January 1, 2006, and December 31, 2013, (b) interventional or descriptive quantitative design, (c) published in a peer-reviewed journal, (d) written in English, and (e) assessed participant comprehension or satisfaction with the research process. Studies were assessed for quality using seven indicators: sampling method, use of controls or comparison groups, response rate, description of intervention, description of outcome, statistical method, and health literacy assessment. Of 176 studies identified, 27 met inclusion criteria: 13 (48%) were randomized interventional designs and 14 (52%) were descriptive. Three categories of studies included projects assessing (a) an enhanced consent process or form, (b) multimedia methods, and (c) education to improve participant understanding. Most (78%) used investigator-developed tools to assess participant comprehension, did not assess participant health literacy (74%), or did not assess the readability level of the consent form (89%). Researchers found participants lacked basic understanding of research elements: randomization, placebo, risks, and therapeutic misconception. Findings indicate (a) inconsistent assessment of participant reading or health literacy level, (b) measurement variation associated with use of nonstandardized tools, and (c) continued therapeutic misconception and lack of understanding among research participants of randomization, placebo, benefit, and risk. While the Agency for Healthcare Research and Quality and the National Quality Forum have published informed consent and authorization toolkits, previously published validated tools are underutilized. Informed consent requires the assessment of health literacy, reading level, and comprehension of research participants using validated assessment tools and methods. © 2014 Sigma Theta Tau International.
Evaluation on Cost Overrun Risks of Long-distance Water Diversion Project Based on SPA-IAHP Method
NASA Astrophysics Data System (ADS)
Yuanyue, Yang; Huimin, Li
2018-02-01
Large investment, long routes, and frequent change orders are among the main causes of cost overruns in long-distance water diversion projects. Building on existing research, this paper constructs a full-process cost overrun risk evaluation index system for water diversion projects, applies the SPA-IAHP method to establish a cost overrun risk evaluation model, and calculates and ranks the weight of every risk evaluation index. Finally, the cost overrun risks are comprehensively evaluated by calculating the linkage measure, and a comprehensive risk level is obtained. The SPA-IAHP method can evaluate risks accurately and with high reliability. Case calculation and verification show that it can provide valid cost overrun decision-making information to construction companies.
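The AHP side of SPA-IAHP rests on the standard principal-eigenvector weighting of a pairwise comparison matrix; a minimal sketch, with an illustrative matrix rather than the paper's actual indexes:

```python
import numpy as np

# illustrative pairwise comparison matrix for three indexes (Saaty 1-9 scale)
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                                     # normalized index weights

ci = (eigvals[k].real - len(A)) / (len(A) - 1)   # consistency index
print(w, ci / 0.58)                              # CR = CI/RI; RI = 0.58 for n = 3
```

A consistency ratio below 0.1 is conventionally taken to mean the expert judgments are coherent enough for the weights to be used.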
Sybil--efficient constraint-based modelling in R.
Gelius-Dietrich, Gabriel; Desouki, Abdelmoneim Amer; Fritzemeier, Claus Jonathan; Lercher, Martin J
2013-11-13
Constraint-based analyses of metabolic networks are widely used to simulate the properties of genome-scale metabolic networks. Publicly available implementations tend to be slow, impeding large scale analyses such as the genome-wide computation of pairwise gene knock-outs, or the automated search for model improvements. Furthermore, available implementations cannot easily be extended or adapted by users. Here, we present sybil, an open source software library for constraint-based analyses in R; R is a free, platform-independent environment for statistical computing and graphics that is widely used in bioinformatics. Among other functions, sybil currently provides efficient methods for flux-balance analysis (FBA), MOMA, and ROOM that are about ten times faster than previous implementations when calculating the effect of whole-genome single gene deletions in silico on a complete E. coli metabolic model. Due to the object-oriented architecture of sybil, users can easily build analysis pipelines in R or even implement their own constraint-based algorithms. Based on its highly efficient communication with different mathematical optimisation programs, sybil facilitates the exploration of high-dimensional optimisation problems on small time scales. Sybil and all its dependencies are open source. Sybil and its documentation are available for download from the comprehensive R archive network (CRAN).
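The computational core of FBA, maximizing an objective flux subject to the steady-state constraint S·v = 0 and flux bounds, is a linear program. The following generic sketch (Python with scipy, not sybil's R interface) shows the structure on a toy three-reaction network:

```python
import numpy as np
from scipy.optimize import linprog

# toy network A_ext -> A -> B -> biomass; S is metabolites x reactions
S = np.array([[1, -1,  0],    # metabolite A: made by uptake, used by r2
              [0,  1, -1]])   # metabolite B: made by r2, used by biomass
bounds = [(0, 10), (0, 10), (0, 10)]   # flux bounds for uptake, r2, biomass
c = [0, 0, -1]                          # linprog minimizes, so negate biomass

res = linprog(c, A_eq=S, b_eq=np.zeros(2), bounds=bounds)
print(res.x, -res.fun)                  # optimal fluxes and biomass rate (10)
```

A genome-scale model simply has a much larger S, and a single in-silico gene deletion amounts to clamping the bounds of the affected reactions to zero and re-solving, which is why solver efficiency dominates whole-genome knockout screens.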
[Strengthen the cancer surveillance to promote cancer prevention and control in China].
He, J
2018-01-23
Cancer is a major chronic disease threatening the people's health in China. We reviewed the latest advances in cancer surveillance, prevention and control in the country, which may provide important clues for future cancer control. We used data from the National Central Cancer Registry to describe and analyze the latest cancer statistics in China, summarized updated information on cancer control policies, networks and programs in the country, and provide important suggestions on future strategies for cancer prevention and control. The overall cancer burden in China has been increasing during the past decades. In 2014, there were about 3 804 000 new cancer cases and 2 296 000 cancer deaths in China. The age-standardized cancer incidence and mortality rates were 190.63/100 000 and 106.98/100 000, respectively. China has formed a comprehensive network for cancer prevention and control. Nationwide population-based cancer surveillance has been established, its population coverage has been expanded, and data quality has improved. As the population ages and unhealthy lifestyles persist, cancer will remain a considerable burden in China. Based on the comprehensive rationale of cancer control and prevention, the National Cancer Center of China will perform its duty for future precise cancer control and prevention, grounded in cancer surveillance statistics.
von Krogh, Gunn; Nåden, Dagfinn; Aasland, Olaf Gjerløw
2012-10-01
To present the results from the test-site application of the documentation model KPO (quality assurance, problem solving and caring), designed to improve the quality of nursing information in the electronic patient record (EPR). The KPO model was developed by means of a consensus group and clinical testing. Four documentation arenas and eight content categories, nursing terminologies and a decision-support system were designed to improve the completeness, comprehensiveness and consistency of nursing information. The testing was performed in a pre-test/post-test time series design, three times at one-year intervals. Content analysis of the nursing documentation was accomplished through the identification, interpretation and coding of information units. Data from the pre-test and post-test 2 were subjected to statistical analyses; to estimate the differences, paired t-tests were used. At post-test 2, the information is found to be more complete, comprehensive and consistent than at pre-test. The findings indicate that documentation arenas combining workflow and content categories deduced from theories of nursing practice can influence the quality of nursing information. The KPO model can be used as a guide when shifting from paper-based to electronic nursing documentation with the aim of obtaining complete, comprehensive and consistent nursing information. © 2012 Blackwell Publishing Ltd.
Rhodes, Lindsay A.; Huisingh, Carrie E.; Quinn, Adam E.; McGwin, Gerald; LaRussa, Frank; Box, Daniel; Owsley, Cynthia; Girkin, Christopher A.
2016-01-01
Purpose: To examine whether racial differences in Bruch's membrane opening-minimum rim width (BMO-MRW) on spectral domain optical coherence tomography (SDOCT) exist, specifically between people of African descent (AD) and European descent (ED) in normal ocular health. Design: Cross-sectional study. Methods: Patients presenting for a comprehensive eye exam at retail-based primary eye clinics were enrolled based on ≥1 of the following at-risk criteria for glaucoma: AD aged ≥40 years, ED aged ≥50 years, diabetes, family history of glaucoma, and/or preexisting diagnosis of glaucoma. Participants with normal optic nerves on exam received SDOCT of the optic nerve head (24 radial scans). Global and regional (temporal, superotemporal, inferotemporal, nasal, superonasal, and inferonasal) BMO-MRW were measured and compared by race using generalized estimating equations. Models were adjusted for age, gender, and BMO area. Results: SDOCT scans from 269 eyes (148 participants) were included in the analysis. Mean global BMO-MRW declined as age increased. After adjusting for age, gender, and BMO area, there was not a statistically significant difference in mean global BMO-MRW by race (P = 0.60). Regionally, mean BMO-MRW was lower in the crude model among AD eyes in the temporal, superotemporal, and nasal regions and higher in the inferotemporal, superonasal, and inferonasal regions. However, in the adjusted model, these differences were not statistically significant. Conclusions: BMO-MRW was not statistically different between those of AD and ED. Race-specific normative data may not be necessary for the deployment of BMO-MRW in AD patients. PMID:27825982
Lu, R; Xiao, Y
2017-07-18
Objective: To evaluate the clinical value of ultrasonic elastography and an ultrasonography comprehensive scoring method in the diagnosis of cervical lesions. Methods: A total of 116 patients were selected from the Department of Gynecology of the first affiliated hospital of Central South University from March 2014 to September 2015. All lesions were preoperatively examined by Doppler ultrasound and elastography. The elasticity score was determined by a 5-point scoring method. Calculation of the strain ratio was based on a comparison of the average strain measured in the lesion with that of adjacent tissue of the same depth, size, and shape. All these ultrasonic parameters were quantified and summed to arrive at comprehensive ultrasonography scores. Using surgical pathology as the gold standard, the sensitivity, specificity, and accuracy of Doppler ultrasound, the elasticity score and strain ratio methods, and the ultrasonography comprehensive scoring method were comparatively analyzed. Results: (1) The sensitivity, specificity, and accuracy of Doppler ultrasound in diagnosing cervical lesions were 82.89% (63/76), 85.0% (34/40), and 83.62% (97/116), respectively. (2) Those of the elasticity score method were 77.63% (59/76), 82.5% (33/40), and 79.31% (92/116), respectively; those of the strain ratio method were 84.21% (64/76), 87.5% (35/40), and 85.34% (99/116), respectively. (3) Those of the ultrasonography comprehensive scoring method were 90.79% (69/76), 92.5% (37/40), and 91.38% (106/116), respectively. Conclusion: (1) Ultrasonic elastography has clear diagnostic value for cervical lesions, and strain ratio measurement is more objective than the elasticity score method. (2) The combined application of the ultrasonography comprehensive scoring method, ultrasonic elastography and conventional sonography was more accurate than any single parameter.
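For reference, the reported diagnostic figures follow directly from the underlying confusion counts, e.g. for the comprehensive scoring method:

```python
tp, fn = 69, 76 - 69   # malignant lesions correctly detected / missed
tn, fp = 37, 40 - 37   # benign lesions correctly excluded / false positives

sensitivity = tp / (tp + fn)                  # 69/76   -> 90.79%
specificity = tn / (tn + fp)                  # 37/40   -> 92.5%
accuracy = (tp + tn) / (tp + fn + tn + fp)    # 106/116 -> 91.38%
print(sensitivity, specificity, accuracy)
```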
On Teaching about the Coefficient of Variation in Introductory Statistics Courses
ERIC Educational Resources Information Center
Trafimow, David
2014-01-01
The standard deviation is related to the mean by virtue of the coefficient of variation. Teachers of statistics courses can make use of that fact to make the standard deviation more comprehensible for statistics students.
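A worked example of the relation: with a mean of x̄ = 40 and a standard deviation of s = 6, the coefficient of variation is CV = s/x̄ = 6/40 = 0.15. In other words, the typical deviation is 15% of the mean, which makes the standard deviation directly comparable across variables measured on different scales.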
Driban, Jeffrey B.; Stout, Alina C.; Lo, Grace H.; Eaton, Charles B.; Price, Lori Lyn; Lu, Bing; Barbe, Mary F.; McAlindon, Timothy E.
2016-01-01
Background: We evaluated agreement among several definitions of accelerated knee osteoarthritis (AKOA) and construct validity by comparing their individual associations with injury, age, obesity, and knee pain. Methods: We selected knees from the Osteoarthritis Initiative that had no radiographic knee osteoarthritis [Kellgren–Lawrence (KL) 0 or 1] at baseline and had high-quality quantitative medial joint space width (JSW) measures on two or more consecutive visits (n = 1655 knees, 1143 participants). Quantitative medial JSW was based on a semi-automated method and was location specific (x = 0.25). We compared six definitions of AKOA: stringent JSW (averaged): average JSW loss greater than 1.05 mm/year over 4 years; stringent JSW (consistent): JSW loss greater than 1.05 mm/year for at least 2 years; lenient JSW (averaged): average JSW loss greater than 0.25 mm/year over 4 years; lenient JSW (consistent): JSW loss greater than 0.25 mm/year for at least 2 years; comprehensive KL based: progression from no radiographic osteoarthritis to advanced-stage osteoarthritis (KL 3 or 4; development of a definite osteophyte and joint space narrowing) within 4 years; and lenient KL based: an increase of at least two KL grades within 4 years. Results: Over 4 years the incidence rate of AKOA was 0.4%, 0.8%, 15.5%, 22.1%, 12.4%, and 7.2% based on the stringent JSW (averaged and consistent), lenient JSW (averaged and consistent), lenient KL-based, and comprehensive KL-based definitions, respectively. All but one knee that met the stringent JSW definition also met the comprehensive KL-based definition. There was fair to substantial agreement between the lenient JSW (averaged), lenient KL-based, and comprehensive KL-based definitions. A comprehensive KL-based definition led to larger effect sizes for injury, age, body mass index, and average pain over 4 years. Conclusions: A comprehensive KL-based definition of AKOA may be ideal because it represents a broader definition of joint deterioration compared with those focused on just joint space or osteophytes alone. PMID:27721902
Harris, R. Alan; Wang, Ting; Coarfa, Cristian; Nagarajan, Raman P.; Hong, Chibo; Downey, Sara L.; Johnson, Brett E.; Fouse, Shaun D.; Delaney, Allen; Zhao, Yongjun; Olshen, Adam; Ballinger, Tracy; Zhou, Xin; Forsberg, Kevin J.; Gu, Junchen; Echipare, Lorigail; O’Geen, Henriette; Lister, Ryan; Pelizzola, Mattia; Xi, Yuanxin; Epstein, Charles B.; Bernstein, Bradley E.; Hawkins, R. David; Ren, Bing; Chung, Wen-Yu; Gu, Hongcang; Bock, Christoph; Gnirke, Andreas; Zhang, Michael Q.; Haussler, David; Ecker, Joseph; Li, Wei; Farnham, Peggy J.; Waterland, Robert A.; Meissner, Alexander; Marra, Marco A.; Hirst, Martin; Milosavljevic, Aleksandar; Costello, Joseph F.
2010-01-01
Sequencing-based DNA methylation profiling methods are comprehensive and, as accuracy and affordability improve, will increasingly supplant microarrays for genome-scale analyses. Here, four sequencing-based methodologies were applied to biological replicates of human embryonic stem cells to compare their CpG coverage genome-wide and in transposons, resolution, cost, concordance and its relationship with CpG density and genomic context. The two bisulfite methods reached concordance of 82% for CpG methylation levels and 99% for non-CpG cytosine methylation levels. Using binary methylation calls, two enrichment methods were 99% concordant, while regions assessed by all four methods were 97% concordant. To achieve comprehensive methylome coverage while reducing cost, an approach integrating two complementary methods was examined. The integrative methylome profile along with histone methylation, RNA, and SNP profiles derived from the sequence reads allowed genome-wide assessment of allele-specific epigenetic states, identifying most known imprinted regions and new loci with monoallelic epigenetic marks and monoallelic expression. PMID:20852635
Nevada's Children: Selected Educational and Social Statistics. Nevada and National.
ERIC Educational Resources Information Center
Horner, Mary P., Comp.
This statistical report describes the successes and shortcomings of education in Nevada and compares some statistics concerning education in Nevada to national norms. The report, which provides a comprehensive array of information helpful to policy makers and citizens, is divided into three sections. The first section presents statistics about…
Statistics Report on TEQSA Registered Higher Education Providers
ERIC Educational Resources Information Center
Australian Government Tertiary Education Quality and Standards Agency, 2015
2015-01-01
This statistics report provides a comprehensive snapshot of national statistics on all parts of the sector for the year 2013, by bringing together data collected directly by TEQSA with data sourced from the main higher education statistics collections managed by the Australian Government Department of Education and Training. The report provides…
NASA Astrophysics Data System (ADS)
Lv, Z. H.; Li, Q.; Huang, R. W.; Liu, H. M.; Liu, D.
2016-08-01
Based on a discussion of the topology of integrated distributed photovoltaic (PV) power generation and energy storage (ES) systems of single or mixed type, this paper focuses on analyzing the grid-connected performance of integrated distributed photovoltaic and energy storage (PV-ES) systems and proposes a comprehensive evaluation index system. A multi-level fuzzy comprehensive evaluation method based on grey correlation degree is then proposed, and the calculations of the weight matrix and the fuzzy matrix are presented step by step. Finally, a distributed integrated PV-ES power generation system connected to a 380 V low-voltage distribution network is taken as an example, and some suggestions are made based on the evaluation results.
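The final aggregation in a fuzzy comprehensive evaluation is a weight vector applied to a membership matrix; a minimal sketch, where both the weights (assumed to come from the grey correlation step) and the memberships are illustrative:

```python
import numpy as np

# index weights, assumed here to come from the grey correlation degree step
W = np.array([0.40, 0.35, 0.25])
# membership of each index in the grades (excellent, good, fair, poor)
R = np.array([[0.5, 0.3, 0.2, 0.0],
              [0.2, 0.5, 0.2, 0.1],
              [0.1, 0.3, 0.4, 0.2]])

B = W @ R                             # weighted-average fuzzy operator
grades = ["excellent", "good", "fair", "poor"]
print(B, grades[int(np.argmax(B))])   # maximum-membership principle
```

In a multi-level scheme the same operation is applied bottom-up: each sub-criterion group produces a membership vector B that becomes one row of the next level's R.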
Perinetti, G; Perillo, L; Franchi, L; Di Lenarda, R; Contardo, L
2014-11-01
Diagnostic agreement on an individual basis between the third middle phalanx maturation (MPM) method and the cervical vertebral maturation (CVM) method has hitherto been inferred mainly from overall correlation analyses. Herein, the true agreement between the methods according to stage and sex was evaluated through a comprehensive diagnostic performance analysis. Four hundred and fifty-one Caucasian subjects were included in the study, 231 females and 220 males (mean age, 12.2 ± 2.5 years; range, 7.0-17.9 years). The X-rays of the middle phalanx of the third finger and the lateral cephalograms were examined for staging by operators who were blinded to the MPM stages and to the subjects' age. The MPM and CVM methods were each based on six stages: two pre-pubertal (1 and 2), two pubertal (3 and 4), and two post-pubertal (5 and 6). Specifically, for each MPM stage, the diagnostic performance in identifying the corresponding CVM stage was described by Bayesian statistics. For both sexes, overall agreement was 77.6%. Most of the disagreement amounted to a one-stage difference. Slight disagreement was seen for stages 5 and 6, where the third middle phalanx shows an earlier maturation. The two maturational methods show overall satisfactory diagnostic agreement. However, at post-pubertal stages, the middle phalanx of the third finger appears to mature earlier than the cervical vertebrae. The post-pubertal growth phase should thus be based on the presence of stage 6 in MPM. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Comparisons of Non-Gaussian Statistical Models in DNA Methylation Analysis
Ma, Zhanyu; Teschendorff, Andrew E.; Yu, Hong; Taghia, Jalil; Guo, Jun
2014-01-01
As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance. PMID:24937687
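One simple way to respect the bounded support of methylation values (an illustration of the general point, not one of the authors' specific models) is a Beta distribution fitted by the method of moments:

```python
import numpy as np

def beta_mom(x):
    """Method-of-moments Beta(alpha, beta) fit for data on (0, 1)."""
    m, v = x.mean(), x.var()
    common = m * (1.0 - m) / v - 1.0   # valid when v < m * (1 - m)
    return m * common, (1.0 - m) * common

rng = np.random.default_rng(0)
meth = rng.beta(0.5, 2.0, size=1000)   # simulated methylation beta-values
print(beta_mom(meth))                   # estimates close to (0.5, 2.0)
```

A Gaussian fit to the same data would place probability mass outside [0, 1] and mis-model the skew near the boundaries, which is exactly the deficiency the abstract attributes to conventional Gaussian approaches.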
Complex patterns of abnormal heartbeats
NASA Technical Reports Server (NTRS)
Schulte-Frohlinde, Verena; Ashkenazy, Yosef; Goldberger, Ary L.; Ivanov, Plamen Ch; Costa, Madalena; Morley-Davies, Adrian; Stanley, H. Eugene; Glass, Leon
2002-01-01
Individuals having frequent abnormal heartbeats interspersed with normal heartbeats may be at an increased risk of sudden cardiac death. However, mechanistic understanding of such cardiac arrhythmias is limited. We present a visual and qualitative method to display statistical properties of abnormal heartbeats. We introduce dynamical "heartprints" which reveal characteristic patterns in long clinical records encompassing approximately 10^5 heartbeats and may provide information about underlying mechanisms. We test if these dynamics can be reproduced by model simulations in which abnormal heartbeats are generated (i) randomly, (ii) at a fixed time interval following a preceding normal heartbeat, or (iii) by an independent oscillator that may or may not interact with the normal heartbeat. We compare the results of these three models and test their limitations to comprehensively simulate the statistical features of selected clinical records. This work introduces methods that can be used to test mathematical models of arrhythmogenesis and to develop a new understanding of underlying electrophysiologic mechanisms of cardiac arrhythmia.
Hu, Yiwen; Chen, Jiahui; Hu, Guping; Yu, Jianchen; Zhu, Xun; Lin, Yongcheng; Chen, Shengping; Yuan, Jie
2015-01-01
Every year, hundreds of new compounds are discovered from the metabolites of marine organisms. Finding new and useful compounds is one of the crucial drivers for this field of research. Here we describe the statistics of bioactive compounds discovered from marine organisms from 1985 to 2012. This work is based on our database, which contains information on more than 15,000 chemical substances including 4196 bioactive marine natural products. We performed a comprehensive statistical analysis to understand the characteristics of the novel bioactive compounds and detail temporal trends, chemical structures, species distribution, and research progress. We hope this meta-analysis will provide useful information for research into the bioactivity of marine natural products and drug development. PMID:25574736
Allen, Robert C; Rutan, Sarah C
2011-10-31
Simulated and experimental data were used to measure the effectiveness of common interpolation techniques during chromatographic alignment of comprehensive two-dimensional liquid chromatography-diode array detector (LC×LC-DAD) data. Interpolation was used to generate a sufficient number of data points in the sampled first chromatographic dimension to allow for alignment of retention times from different injections. Five different interpolation methods, linear interpolation followed by cross correlation, piecewise cubic Hermite interpolating polynomial, cubic spline, Fourier zero-filling, and Gaussian fitting, were investigated. The fully aligned chromatograms, in both the first and second chromatographic dimensions, were analyzed by parallel factor analysis to determine the relative area for each peak in each injection. A calibration curve was generated for the simulated data set. The standard error of prediction and percent relative standard deviation were calculated for the simulated peak for each technique. The Gaussian fitting interpolation technique resulted in the lowest standard error of prediction and average relative standard deviation for the simulated data. However, upon applying the interpolation techniques to the experimental data, most of the interpolation methods were not found to produce statistically different relative peak areas from each other. While most of the techniques were not statistically different, the performance was improved relative to the PARAFAC results obtained when analyzing the unaligned data. Copyright © 2011 Elsevier B.V. All rights reserved.
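A sketch of the Gaussian-fitting idea for an undersampled first-dimension peak (generic, not the authors' code): fit a Gaussian to the few sampled fraction points, then evaluate it on a fine grid so retention times can be aligned across injections.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(t, amp, mu, sigma):
    return amp * np.exp(-0.5 * ((t - mu) / sigma) ** 2)

# a first-dimension peak sampled at only five fraction times (minutes)
t = np.array([10.0, 10.5, 11.0, 11.5, 12.0])
y = np.array([0.08, 0.55, 1.00, 0.62, 0.10])

popt, _ = curve_fit(gauss, t, y, p0=[1.0, 11.0, 0.5])
t_fine = np.arange(10.0, 12.001, 0.05)   # dense grid for alignment
y_fine = gauss(t_fine, *popt)            # popt[1] is the fitted retention time
```

Fitting a peak model rather than interpolating between points is what lets the method place the peak centre between sampling times, which is the crux when the first dimension is sampled only a handful of times per peak.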
Hall, Eric William; Sanchez, Travis H; Stein, Aryeh D; Stephenson, Rob; Zlotorzynska, Maria; Sineath, Robert Craig; Sullivan, Patrick S
2017-03-06
Web-based surveys are increasingly used to capture data essential for human immunodeficiency virus (HIV) prevention research. However, there are challenges in ensuring the informed consent of Web-based research participants. The aim of our study was to develop and assess the efficacy of alternative methods of administering informed consent in Web-based HIV research with men who have sex with men (MSM). From July to September 2014, paid advertisements on Facebook were used to recruit adult MSM living in the United States for a Web-based survey about risk and preventive behaviors. Participants were randomized to one of the 4 methods of delivering informed consent: a professionally produced video, a study staff-produced video, a frequently asked questions (FAQs) text page, and a standard informed consent text page. Following the behavior survey, participants answered 15 questions about comprehension of consent information. Correct responses to each question were given a score of 1, for a total possible scale score of 15. General linear regression and post-hoc Tukey comparisons were used to assess difference (P<.001) in mean consent comprehension scores. A mediation analysis was used to examine the relationship between time spent on consent page and consent comprehension. Of the 665 MSM participants who completed the comprehension questions, 24.2% (161/665) received the standard consent, 27.1% (180/665) received the FAQ consent, 26.8% (178/665) received the professional consent video, and 22.0% (146/665) received the staff video. The overall average consent comprehension score was 6.28 (SD=2.89). The average consent comprehension score differed significantly across consent type (P<.001), age (P=.04), race or ethnicity (P<.001), and highest level of education (P=.001). Compared with those who received the standard consent, comprehension was significantly higher for participants who received the professional video consent (score increase=1.79; 95% CI 1.02-2.55) and participants who received the staff video consent (score increase=1.79; 95% CI 0.99-2.59). There was no significant difference in comprehension for those who received the FAQ consent. Participants spent more time on the 2 video consents (staff video median time=117 seconds; professional video median time=115 seconds) than the FAQ (median=21 seconds) and standard consents (median=37 seconds). Mediation analysis showed that though time spent on the consent page was partially responsible for some of the differences in comprehension, the direct effects of the professional video (score increase=0.93; 95% CI 0.39-1.48) and the staff-produced video (score increase=0.99; 95% CI 0.42-1.56) were still significant. Video-based consent methods improve consent comprehension of MSM participating in a Web-based HIV behavioral survey. This effect may be partially mediated through increased time spent reviewing the consent material; however, the video consent may still be superior to standard consent in improving participant comprehension of key study facts. Clinicaltrials.gov NCT02139566; https://clinicaltrials.gov/ct2/show/NCT02139566 (Archived by WebCite at http://www.webcitation.org/6oRnL261N). ©Eric William Hall, Travis H Sanchez, Aryeh D Stein, Rob Stephenson, Maria Zlotorzynska, Robert Craig Sineath, Patrick S Sullivan. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 06.03.2017.
NASA Astrophysics Data System (ADS)
Lafontaine, J.; Hay, L.; Archfield, S. A.; Farmer, W. H.; Kiang, J. E.
2014-12-01
The U.S. Geological Survey (USGS) has developed a National Hydrologic Model (NHM) to support coordinated, comprehensive and consistent hydrologic model development, and facilitate the application of hydrologic simulations within the continental US. The portion of the NHM located within the Gulf Coastal Plains and Ozarks Landscape Conservation Cooperative (GCPO LCC) is being used to test the feasibility of improving streamflow simulations in gaged and ungaged watersheds by linking statistically- and physically-based hydrologic models. The GCPO LCC covers part or all of 12 states and 5 sub-geographies, totaling approximately 726,000 km2, and is centered on the lower Mississippi Alluvial Valley. A total of 346 USGS streamgages in the GCPO LCC region were selected to evaluate the performance of this new calibration methodology for the period 1980 to 2013. Initially, the physically-based models are calibrated to measured streamflow data to provide a baseline for comparison. An enhanced calibration procedure then is used to calibrate the physically-based models in the gaged and ungaged areas of the GCPO LCC using statistically-based estimates of streamflow. For this application, the calibration procedure is adjusted to address the limitations of the statistically generated time series to reproduce measured streamflow in gaged basins, primarily by incorporating error and bias estimates. As part of this effort, estimates of uncertainty in the model simulations are also computed for the gaged and ungaged watersheds.
ERIC Educational Resources Information Center
Webster, Collin A.; Nesbitt, Danielle; Lee, Heesu; Egan, Cate
2017-01-01
Purpose: The purpose of this study was to examine preservice physical education teachers' (PPET) service learning experiences planning and implementing course assignments aligned with comprehensive school physical activity program (CSPAP) recommendations. Methods: Based on service learning principles, PPETs (N = 18) enrolled in a physical…
A review of approaches to identifying patient phenotype cohorts using electronic health records
Shivade, Chaitanya; Raghavan, Preethi; Fosler-Lussier, Eric; Embi, Peter J; Elhadad, Noemie; Johnson, Stephen B; Lai, Albert M
2014-01-01
Objective To summarize literature describing approaches aimed at automatically identifying patients with a common phenotype. Materials and methods We performed a review of studies describing systems or reporting techniques developed for identifying cohorts of patients with specific phenotypes. Every full-text article published in (1) Journal of the American Medical Informatics Association, (2) Journal of Biomedical Informatics, (3) Proceedings of the Annual American Medical Informatics Association Symposium, and (4) Proceedings of the Clinical Research Informatics Conference within the past 3 years was assessed for inclusion in the review. Only articles using automated techniques were included. Results Ninety-seven articles met our inclusion criteria. Forty-six used natural language processing (NLP)-based techniques, 24 described rule-based systems, 41 used statistical analyses, data mining, or machine learning techniques, and 22 described hybrid systems. Nine articles described the architecture of large-scale systems developed for determining the cohort eligibility of patients. Discussion We observe a rise in the number of studies associated with cohort identification using electronic medical records. Statistical analyses or machine learning, followed by NLP techniques, are gaining popularity over rule-based systems. Conclusions There are a variety of approaches for classifying patients into a particular phenotype. Different techniques and data sources are used, and good performance is reported on datasets at the respective institutions. However, no system makes comprehensive use of electronic medical records, addressing all of their known weaknesses. PMID:24201027
Statistical analysis of short-term water stress conditions at Riggs Creek OzFlux tower site
NASA Astrophysics Data System (ADS)
Azmi, Mohammad; Rüdiger, Christoph; Walker, Jeffrey P.
2017-10-01
A large range of indices and proxies is available to describe the water stress conditions of an area for different applications; their capabilities and limitations vary depending on the prevailing local climatic conditions and land cover. The present study uses a range of spatio-temporally high-resolution (daily and sub-daily) data sources to evaluate a number of drought indices (DIs) for the Riggs Creek OzFlux tower site in southeastern Australia. The main aim of this study is to evaluate the statistical characteristics of individual DIs under short-term water stress conditions. To derive a more general and therefore representative DI, a new criterion is required to quantify the statistical similarity between each pair of indices, allowing the dominant drought types and their representative DIs to be determined. The results show that the monitoring of water stress at this case study area can be achieved by evaluating the individual behaviour of three clusters: (i) vegetation conditions, (ii) water availability, and (iii) water consumption. This indicates that it is not necessary to assess all individual DIs one by one to derive a comprehensive and informative data set about the water stress of an area; instead, this can be achieved by analysing one DI from each cluster or by deriving a new combinatory index for each cluster, based on established combination methods.
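A minimal sketch of how such a statistical-similarity criterion might group DIs into clusters: hierarchical clustering on the distance 1 − |r| between index time series is one plausible choice, not the paper's exact procedure, and the index names and data below are invented placeholders.

```python
# Illustrative: cluster drought indices by time-series similarity, then pick
# one representative per cluster. Names and series are hypothetical.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(1)
names = ["NDVI_anom", "soil_moisture", "SPI", "ET_ratio", "VPD"]
series = rng.normal(size=(5, 365))
series[1] += 0.8 * series[2]          # make two water-availability DIs covary
series[3] += 0.8 * series[4]

corr = np.corrcoef(series)
dist = 1 - np.abs(corr)               # statistical similarity -> distance
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="average")
labels = fcluster(Z, t=3, criterion="maxclust")
for name, lab in zip(names, labels):
    print(f"cluster {lab}: {name}")
```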
Haas, Kevin R; Yang, Haw; Chu, Jhih-Wei
2013-12-12
The dynamics of a protein along a well-defined coordinate can be formally projected onto the form of an overdamped Langevin equation. Here, we present a comprehensive statistical-learning framework for simultaneously quantifying the deterministic force (the potential of mean force, PMF) and the stochastic force (characterized by the diffusion coefficient, D) from single-molecule Förster-type resonance energy transfer (smFRET) experiments. The likelihood functional of the Langevin parameters, PMF and D, is expressed by a path integral of the latent smFRET distance that follows Langevin dynamics and is realized by the donor and the acceptor photon emissions. The solution is made possible by an eigendecomposition of the time-symmetrized form of the corresponding Fokker-Planck equation coupled with photon statistics. To extract the Langevin parameters from photon arrival time data, we advance the expectation-maximization algorithm in statistical learning, originally developed for and mostly used in discrete-state systems, to a general form in continuous space that allows for a variational calculus on the continuous PMF function. We also introduce regularization of the solution space in this Bayesian inference based on a maximum trajectory-entropy principle. We use a highly nontrivial example with realistically simulated smFRET data to illustrate the application of this new method.
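For reference, the overdamped Langevin equation onto which the projected coordinate x is mapped is commonly written (in a standard notation that may differ from the paper's own symbols) as:

```latex
\frac{dx}{dt} = -\frac{D(x)}{k_B T}\,\frac{\partial V(x)}{\partial x}
              + \sqrt{2 D(x)}\,\xi(t),
\qquad \langle \xi(t)\,\xi(t') \rangle = \delta(t - t'),
```

where V(x) is the PMF, D(x) the (possibly position-dependent) diffusion coefficient, and ξ(t) Gaussian white noise.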
Dubois, Albertine; Hérard, Anne-Sophie; Delatour, Benoît; Hantraye, Philippe; Bonvento, Gilles; Dhenain, Marc; Delzescaux, Thierry
2010-06-01
Biomarkers and technologies similar to those used in humans are essential for the follow-up of Alzheimer's disease (AD) animal models, particularly for the clarification of mechanisms and the screening and validation of new candidate treatments. In humans, changes in brain metabolism can be detected by 2-deoxy-2-[18F]fluoro-D-glucose PET (FDG-PET) and assessed in a user-independent manner with dedicated software, such as Statistical Parametric Mapping (SPM). FDG-PET can be carried out in small animals, but its resolution is low as compared to the size of rodent brain structures. In mouse models of AD, changes in cerebral glucose utilization are usually detected by [14C]-2-deoxyglucose (2DG) autoradiography, but this requires prior manual outlining of regions of interest (ROI) on selected sections. Here, we evaluate the feasibility of applying the SPM method to 3D autoradiographic data sets mapping brain metabolic activity in a transgenic mouse model of AD. We report the preliminary results obtained with 4 APP/PS1 (64 ± 1 weeks) and 3 PS1 (65 ± 2 weeks) mice. We also describe new procedures for the acquisition and use of "blockface" photographs and provide the first demonstration of their value for the 3D reconstruction and spatial normalization of post mortem mouse brain volumes. Despite this limited sample size, our results appear to be meaningful, consistent, and more comprehensive than findings from previously published studies based on conventional ROI-based methods. The establishment of statistical significance at the voxel level, rather than with a user-defined ROI, makes it possible to detect subtle differences more reliably in geometrically complex regions, such as the hippocampus. Our approach is generic and could easily be applied to other biomarkers and extended to other species and applications. Copyright 2010 Elsevier Inc. All rights reserved.
Chern, Yahn-Bor; Ho, Pei-Shan; Kuo, Li-Chueh; Chen, Jin-Bor
2013-01-01
Peritoneal dialysis (PD)-related peritonitis remains an important complication in PD patients, potentially causing technique failure and influencing patient outcome. To date, no comprehensive study in the Taiwanese PD population has used a time-dependent statistical method to analyze the factors associated with PD-related peritonitis. Our single-center retrospective cohort study, conducted in southern Taiwan between February 1999 and July 2010, used time-dependent statistical methods to analyze the factors associated with PD-related peritonitis. The study recruited 404 PD patients for analysis, 150 of whom experienced at least 1 episode of peritonitis during the follow-up period. The incidence rate of peritonitis was highest during the first 6 months after the start of PD. A comparison of patients in the two groups (peritonitis vs no peritonitis) by univariate analysis showed that the peritonitis group included fewer men (p = 0.048) and more patients of older age (≥65 years, p = 0.049). In addition, patients who had never received compulsory education showed a statistically higher incidence of PD-related peritonitis in the univariate analysis (p = 0.04). A proportional hazards model identified education level (less than elementary school vs any higher education level) as having an independent association with PD-related peritonitis [hazard ratio (HR): 1.45; 95% confidence interval (CI): 1.01 to 2.06; p = 0.045]. Comorbidities measured using the Charlson comorbidity index (score >2 vs ≤2) showed borderline statistical significance (HR: 1.44; 95% CI: 1.00 to 2.13; p = 0.053). A lower education level is a major risk factor for PD-related peritonitis independent of age, sex, hypoalbuminemia, and comorbidities. Our study emphasizes that a comprehensive PD education program is crucial for PD patients with a lower education level.
General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies
Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong
2013-01-01
We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variant tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by instead calculating score statistics that require only fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual-level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
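A schematic of the score-aggregation idea in Python: each study contributes a per-variant score vector and a between-variant covariance matrix, and a burden-type meta-statistic is formed from their sums. The notation, weights, and simulated inputs are illustrative, not the paper's exact estimators.

```python
# Hedged sketch of meta-analyzing score statistics across studies.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
p = 8                                    # rare variants in the gene region
studies = []
for _ in range(3):                       # three hypothetical cohorts
    A = rng.normal(size=(p, p))
    sigma = A @ A.T + p * np.eye(p)      # positive-definite covariance
    u = rng.multivariate_normal(np.zeros(p), sigma)  # null score vector
    studies.append((u, sigma))

U = sum(u for u, _ in studies)           # pooled score vector
S = sum(s for _, s in studies)           # pooled covariance
w = np.ones(p)                           # burden weights (e.g., all 1)
Q = (w @ U) ** 2 / (w @ S @ w)           # burden-type meta statistic ~ chi2(1)
print(f"burden Q = {Q:.2f}, p = {chi2.sf(Q, df=1):.3f}")
```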
Al-Kindi, Khalifa M; Kwan, Paul; R Andrew, Nigel; Welch, Mitchell
2017-01-01
In order to understand the distribution and prevalence of Ommatissus lybicus (Hemiptera: Tropiduchidae), as well as to analyse its current biogeographical patterns and predict its future spread, comprehensive and detailed information on environmental, climatic, and agricultural practices is essential. Spatial analytical techniques, such as remote sensing and spatial statistics tools, can help detect and model spatial links and correlations between the presence, absence, and density of O. lybicus in response to climatic, environmental, and human factors. The main objective of this paper is to review remote sensing and relevant analytical techniques that can be applied in mapping and modelling the habitat and population density of O. lybicus. An exhaustive search of related literature revealed that there are very limited studies linking location-based infestation levels of pests like O. lybicus with climatic, environmental, and human-practice-related variables. This review also highlights the accumulated knowledge and addresses the gaps in this area of research. Furthermore, it makes recommendations for future studies and gives suggestions on monitoring and surveillance methods for designing both local- and regional-level integrated pest management strategies for palm trees and other affected cultivated crops. PMID:28875085
Schmitt, J Eric; Scanlon, Mary H; Servaes, Sabah; Levin, Dayna; Cook, Tessa S
2015-10-01
The advent of the ACGME's Next Accreditation System represents a significant new challenge for residencies and fellowships, owing to its requirements for more complex and detailed information. We developed a system of online assessment tools to provide comprehensive coverage of the twelve ACGME Milestones and digitized them using freely available cloud-based productivity tools. These tools include a combination of point-of-care procedural assessments, electronic quizzes, online modules, and other data entry forms. Using free statistical analytic tools, we also developed an automated system for management, processing, and data reporting. After one year of use, our Milestones project has resulted in the submission of over 20,000 individual data points. The use of automated statistical methods to generate resident-specific profiles has allowed for dynamic reports of individual residents' progress. These profiles both summarize data and also allow program directors access to more granular information as needed. Informatics-driven strategies for data assessment and processing represent feasible solutions to Milestones assessment and analysis, reducing the potential administrative burden for program directors, residents, and staff. Copyright © 2015 AUR. Published by Elsevier Inc. All rights reserved.
1H NMR-based metabolic profiling for evaluating poppy seed rancidity and brewing.
Jawień, Ewa; Ząbek, Adam; Deja, Stanisław; Łukaszewicz, Marcin; Młynarz, Piotr
2015-12-01
Poppy seeds are widely used in household and commercial confectionery. The aim of this study was to demonstrate the application of metabolic profiling for industrial monitoring of the molecular changes which occur during minced poppy seed rancidity and brewing processes performed on raw seeds. Both forms of poppy seeds were obtained from a confectionery company. Proton nuclear magnetic resonance (1H NMR) was applied as the analytical method of choice, together with multivariate statistical data analysis. Metabolic fingerprinting was applied as a bioprocess control tool to monitor rancidity, with its trajectory of change, and brewing progression. Low-molecular-weight compounds were found to be statistically significant biomarkers of these bioprocesses. Changes in the concentrations of chemical compounds were explained relative to the biochemical processes and external conditions. The obtained results provide valuable and comprehensive information for a better understanding of the biology of rancidity and brewing processes, while demonstrating the potential of NMR spectroscopy combined with multivariate data analysis tools for quality control in food industries involved in the processing of oilseeds.
Story Processing Ability in Cognitively Healthy Younger and Older Adults
Wright, Heather Harris; Capilouto, Gilson J.; Srinivasan, Cidambi; Fergadiotis, Gerasimos
2012-01-01
Purpose The purpose of the study was to examine the relationships among measures of comprehension and production for stories depicted in wordless picture books and measures of memory and attention for 2 age groups. Method Sixty cognitively healthy adults participated. They consisted of two groups: young adults (20–29 years of age) and older adults (70–89 years of age). Participants completed cognitive measures and several discourse tasks; these included telling stories depicted in wordless picture books and answering multiple-choice comprehension questions pertaining to the story. Results The 2 groups did not differ significantly for proportion of story propositions conveyed; however, the younger group performed significantly better on the comprehension measure as compared with the older group. Only the older group demonstrated a statistically significant relationship between the story measures. Performance on the production and comprehension measures significantly correlated with performance on the cognitive measures for the older group but not for the younger group. Conclusions The relationship between adults' comprehension of stimuli used to elicit narrative production samples and their narrative productions differed across the life span, suggesting that discourse processing performance changes in healthy aging. Finally, the study's findings suggest that memory and attention contribute to older adults' story processing performance. PMID:21106701
Giri, Veda N.; Coups, Elliot J.; Ruth, Karen; Goplerud, Julia; Raysor, Susan; Kim, Taylor Y.; Bagden, Loretta; Mastalski, Kathleen; Zakrzewski, Debra; Leimkuhler, Suzanne; Watkins-Bruner, Deborah
2009-01-01
Purpose Men with a family history (FH) of prostate cancer (PCA) and African American (AA) men are at higher risk for PCA. Recruitment and retention of these high-risk men into early detection programs has been challenging. We report a comprehensive analysis of recruitment methods, show rates, and participant factors from the Prostate Cancer Risk Assessment Program (PRAP), which is a prospective, longitudinal PCA screening study. Materials and Methods Men aged 35–69 years are eligible if they have a FH of PCA, are AA, or have a BRCA1/2 mutation. Recruitment methods were analyzed with respect to participant demographics and show to the first PRAP appointment using standard statistical methods. Results Out of 707 men recruited, 64.9% showed to the initial PRAP appointment. More individuals were recruited via radio than from referral or other methods (χ2 = 298.13, p < .0001). Men recruited via radio were more likely to be AA (p<0.001), less educated (p=0.003), not married or partnered (p=0.007), and have no FH of PCA (p<0.001). Men recruited via referrals had higher incomes (p=0.007). Men recruited via referral were more likely to attend their initial PRAP visit than those recruited by radio or other methods (χ2 = 27.08, p < .0001). Conclusions This comprehensive analysis finds that radio leads to higher recruitment of AA men with lower socioeconomic status. However, these are the high-risk men that have lower show rates for PCA screening. Targeted motivational measures need to be studied to improve show rates for PCA risk assessment for these high-risk men. PMID:19758657
Late paleozoic fusulinoidean gigantism driven by atmospheric hyperoxia.
Payne, Jonathan L; Groves, John R; Jost, Adam B; Nguyen, Thienan; Moffitt, Sarah E; Hill, Tessa M; Skotheim, Jan M
2012-09-01
Atmospheric hyperoxia, with pO2 in excess of 30%, has long been hypothesized to account for late Paleozoic (360-250 million years ago) gigantism in numerous higher taxa. However, this hypothesis has not been evaluated statistically because comprehensive size data have not been compiled previously at sufficient temporal resolution to permit quantitative analysis. In this study, we test the hyperoxia-gigantism hypothesis by examining the fossil record of fusulinoidean foraminifers, a dramatic example of protistan gigantism with some individuals exceeding 10 cm in length and exceeding their relatives by six orders of magnitude in biovolume. We assembled and examined comprehensive regional and global, species-level datasets containing 270 and 1823 species, respectively. A statistical model of size evolution forced by atmospheric pO2 is conclusively favored over alternative models based on random walks or a constant tendency toward size increase. Moreover, the ratios of volume to surface area in the largest fusulinoideans are consistent in magnitude and trend with a mathematical model based on oxygen transport limitation. We further validate the hyperoxia-gigantism model through an examination of modern foraminiferal species living along a measured gradient in oxygen concentration. These findings provide the first quantitative confirmation of a direct connection between Paleozoic gigantism and atmospheric hyperoxia. © 2012 The Author(s). Evolution © 2012 The Society for the Study of Evolution.
Clinical research of comprehensive rehabilitation in treating brachial plexus injury patients.
Zhou, Jun-Ming; Gu, Yu-Dong; Xu, Xiao-Jun; Zhang, Shen-Yu; Zhao, Xin
2012-07-01
Brachial plexus injury is one of the difficult medical problems in the world. The aim of this study was to observe the clinical therapeutic effect of comprehensive rehabilitation in treating dysfunction after brachial plexus injury. Forty-three cases of dysfunction after brachial plexus injury were divided randomly into two groups. The treatment group, which totaled 21 patients (including 14 cases of total brachial plexus injury and seven cases of branch brachial plexus injury), was treated with comprehensive rehabilitation including transcutaneous electrical nerve stimulation, mid-frequency electrotherapy, Tuina therapy, and occupational therapy. The control group, which totaled 22 patients (including 16 cases of total brachial plexus injury and six cases of branch brachial plexus injury), was treated with home-based electrical nerve stimulation and occupational therapy. Each course lasted 30 days, and the patients received four courses in total. After four courses, the rehabilitation effect was evaluated according to the brachial plexus function evaluation standard and electromyogram (EMG) assessment. In the treatment group, there was a significant difference in the scores of brachial plexus function pre- and post-treatment (P < 0.01) for both "total" and "branch" injury. The scores of the two "total injury" groups showed statistically significant differences (P < 0.01), while the scores of the two "branch injury" groups showed statistically significant differences (P < 0.05) after four courses. EMG suggested that regeneration potentials of the recipient nerves appeared earlier in the treatment group than in the control group, with a significant difference (P < 0.05). Comprehensive rehabilitation was more effective in treating dysfunction after brachial plexus injury than nonintegrated rehabilitation.
Schaid, Daniel J
2010-01-01
Measures of genomic similarity are the basis of many statistical analytic methods. We review the mathematical and statistical basis of similarity methods, particularly based on kernel methods. A kernel function converts information for a pair of subjects to a quantitative value representing either similarity (larger values meaning more similar) or distance (smaller values meaning more similar), with the requirement that it must create a positive semidefinite matrix when applied to all pairs of subjects. This review emphasizes the wide range of statistical methods and software that can be used when similarity is based on kernel methods, such as nonparametric regression, linear mixed models and generalized linear mixed models, hierarchical models, score statistics, and support vector machines. The mathematical rigor for these methods is summarized, as is the mathematical framework for making kernels. This review provides a framework to move from intuitive and heuristic approaches to define genomic similarities to more rigorous methods that can take advantage of powerful statistical modeling and existing software. A companion paper reviews novel approaches to creating kernels that might be useful for genomic analyses, providing insights with examples [1]. Copyright © 2010 S. Karger AG, Basel.
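To make the positive semidefiniteness requirement concrete, here is a small self-contained check in Python; the Gaussian (RBF) kernel over simulated genotype vectors is a generic illustration, not code from the review.

```python
# Illustrative: an RBF kernel over genotype vectors yields a positive
# semidefinite similarity matrix. Data are simulated placeholders.
import numpy as np

rng = np.random.default_rng(3)
G = rng.integers(0, 3, size=(20, 50)).astype(float)   # 20 subjects, 50 SNPs

d2 = ((G[:, None, :] - G[None, :, :]) ** 2).sum(axis=2)
K = np.exp(-d2 / G.shape[1])                          # RBF kernel matrix

eigvals = np.linalg.eigvalsh(K)
print("min eigenvalue:", eigvals.min())               # >= 0 up to roundoff
```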
Probabilistic biological network alignment.
Todor, Andrei; Dobra, Alin; Kahveci, Tamer
2013-01-01
Interactions between molecules are probabilistic events. An interaction may or may not happen with some probability, depending on a variety of factors such as the size, abundance, or proximity of the interacting molecules. In this paper, we consider the problem of aligning two biological networks. Unlike existing methods, we allow one of the two networks to contain probabilistic interactions. Allowing interaction probabilities makes the alignment more biologically relevant at the expense of explosive growth in the number of alternative topologies that may arise from different subsets of interactions that take place. We develop a novel method that efficiently and precisely characterizes this massive search space. We represent the topological similarity between pairs of aligned molecules (i.e., proteins) with the help of random variables and compute their expected values. We validate our method by showing that, without sacrificing running-time performance, it can produce novel alignments. Our results also demonstrate that our method identifies biologically meaningful mappings under a comprehensive set of criteria used in the literature, as well as under the statistical coherence measure that we developed to analyze the statistical significance of the similarity of the functions of the aligned protein pairs.
Adapt-Mix: learning local genetic correlation structure improves summary statistics-based analyses
Park, Danny S.; Brown, Brielin; Eng, Celeste; Huntsman, Scott; Hu, Donglei; Torgerson, Dara G.; Burchard, Esteban G.; Zaitlen, Noah
2015-01-01
Motivation: Approaches to identifying new risk loci, training risk prediction models, imputing untyped variants and fine-mapping causal variants from summary statistics of genome-wide association studies are playing an increasingly important role in the human genetics community. Current summary statistics-based methods rely on global ‘best guess’ reference panels to model the genetic correlation structure of the dataset being studied. This approach, especially in admixed populations, has the potential to produce misleading results, ignores variation in local structure and is not feasible when appropriate reference panels are missing or small. Here, we develop a method, Adapt-Mix, that combines information across all available reference panels to produce estimates of local genetic correlation structure for summary statistics-based methods in arbitrary populations. Results: We applied Adapt-Mix to estimate the genetic correlation structure of both admixed and non-admixed individuals using simulated and real data. We evaluated our method by measuring the performance of two summary statistics-based methods: imputation and joint-testing. When using our method as opposed to the current standard of ‘best guess’ reference panels, we observed a 28% decrease in mean-squared error for imputation and a 73.7% decrease in mean-squared error for joint-testing. Availability and implementation: Our method is publicly available in a software package called ADAPT-Mix available at https://github.com/dpark27/adapt_mix. Contact: noah.zaitlen@ucsf.edu PMID:26072481
Evaluating and interpreting cross-taxon congruence: Potential pitfalls and solutions
NASA Astrophysics Data System (ADS)
Gioria, Margherita; Bacaro, Giovanni; Feehan, John
2011-05-01
Characterizing the relationship between different taxonomic groups is critical to identifying potential surrogates for biodiversity. Previous studies have shown that cross-taxa relationships are generally weak and/or inconsistent. The difficulty in finding predictive patterns has often been attributed to the spatial and temporal scales of these studies and to differences in the measure used to evaluate such relationships (species richness versus composition). However, the choice of the analytical approach used to evaluate cross-taxon congruence inevitably represents a major source of variation. Here, we describe a range of methods that can be used to comprehensively assess cross-taxa relationships. To do so, we used data for two taxonomic groups, wetland plants and water beetles, collected from 54 farmland ponds in Ireland. Specifically, we used the Pearson correlation and rarefaction curves to analyse patterns in species richness, while Mantel tests, Procrustes analysis, and co-correspondence analysis were used to evaluate congruence in species composition. We compared the results of these analyses and described some of the potential pitfalls associated with each of these statistical approaches. Cross-taxon congruence was moderate to strong, depending on the choice of the analytical approach, on the nature of the response variable, and on local and environmental conditions. Our findings indicate that multiple approaches and measures of community structure are required for a comprehensive assessment of cross-taxa relationships. In particular, we showed that selection of surrogate taxa in conservation planning should not be based on a single statistic expressing the degree of correlation in species richness or composition. Potential solutions to the analytical issues associated with the assessment of cross-taxon congruence are provided, and the implications of our findings for the selection of surrogates for biodiversity are discussed.
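Of the composition-based methods listed, the Mantel test is simple enough to sketch end to end: correlate two among-site distance matrices and assess significance by permuting site labels. The simulated pond-by-species matrices and the Bray-Curtis distance below are illustrative choices, not the study's data.

```python
# Bare-bones permutation Mantel test on simulated community data.
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(4)
plants = rng.random((54, 30))                         # 54 ponds x 30 species
beetles = np.clip(plants + rng.normal(0, 0.3, plants.shape), 0, None)

d1, d2 = pdist(plants, "braycurtis"), pdist(beetles, "braycurtis")
r_obs = np.corrcoef(d1, d2)[0, 1]

D2 = squareform(d2)                                   # square form for permuting
perm_r = []
for _ in range(999):
    idx = rng.permutation(54)
    perm_r.append(np.corrcoef(d1, squareform(D2[np.ix_(idx, idx)]))[0, 1])
p = (1 + sum(r >= r_obs for r in perm_r)) / 1000
print(f"Mantel r = {r_obs:.2f}, p = {p:.3f}")
```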
Montagna, Matteo; Sassera, Davide; Epis, Sara; Bazzocchi, Chiara; Vannini, Claudia; Lo, Nathan; Sacchi, Luciano; Fukatsu, Takema; Petroni, Giulio
2013-01-01
“Candidatus Midichloria mitochondrii” is an intramitochondrial bacterium of the order Rickettsiales associated with the sheep tick Ixodes ricinus. Bacteria phylogenetically related to “Ca. Midichloria mitochondrii” (midichloria and like organisms [MALOs]) have been shown to be associated with a wide range of hosts, from amoebae to a variety of animals, including humans. Despite numerous studies focused on specific members of the MALO group, no comprehensive phylogenetic and statistical analyses have so far been performed on the group as a whole. Here, we present a multidisciplinary investigation based on 16S rRNA gene sequences using both phylogenetic and statistical methods, thereby analyzing MALOs in the overall framework of the Rickettsiales. This study revealed that (i) MALOs form a monophyletic group; (ii) the MALO group is structured into distinct subgroups, verifying current genera as significant evolutionary units and identifying several subclades that could represent novel genera; (iii) the MALO group ranks at the level of described Rickettsiales families, leading to the proposal of the novel family “Candidatus Midichloriaceae.” In addition, based on the phylogenetic trees generated, we present an evolutionary scenario to interpret the distribution and life history transitions of these microorganisms associated with highly divergent eukaryotic hosts: we suggest that aquatic/environmental protista have acted as evolutionary reservoirs for members of this novel family, from which one or more lineages with the capacity of infecting metazoa have evolved. PMID:23503305
3Drefine: an interactive web server for efficient protein structure refinement.
Bhattacharya, Debswapna; Nowotny, Jackson; Cao, Renzhi; Cheng, Jianlin
2016-07-08
3Drefine is an interactive web server for consistent and computationally efficient protein structure refinement with the capability to perform web-based statistical and visual analysis. The 3Drefine refinement protocol utilizes iterative optimization of the hydrogen-bonding network combined with atomic-level energy minimization of the optimized model using composite physics- and knowledge-based force fields for efficient protein structure refinement. The method has been extensively evaluated in blind CASP experiments as well as on large-scale and diverse benchmark datasets, and it exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine web server allows for convenient protein structure refinement through text or file input submission, with email notification and a provided example submission, and is freely available without any registration requirement. The server also provides comprehensive analysis of submissions through various energy and statistical feedback and interactive visualization of multiple refined models through the JSmol applet, which is equipped with numerous protein model analysis tools. The web server has been extensively tested and used by many users. As a result, the 3Drefine web server conveniently provides a useful tool easily accessible to the community. The 3Drefine web server has been made publicly available at the URL: http://sysbio.rnet.missouri.edu/3Drefine/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Zhu, Zhi-Liang; Reed, Bradley C.; Zhu, Zhi-Liang; Reed, Bradley C.
2014-01-01
This assessment was conducted to fulfill the requirements of section 712 of the Energy Independence and Security Act of 2007 and to conduct a comprehensive national assessment of storage and flux (flow) of carbon and the fluxes of other greenhouse gases in ecosystems of the Eastern United States. These carbon and greenhouse gas variables were examined for major terrestrial ecosystems (forests, grasslands/shrublands, agricultural lands, and wetlands) and aquatic ecosystems (rivers, streams, lakes, estuaries, and coastal waters) in the Eastern United States in two time periods: baseline (from 2001 through 2005) and future (projections from the end of the baseline through 2050). The Great Lakes were not included in this assessment due to a lack of input data. The assessment was based on measured and observed data collected by the U.S. Geological Survey and many other agencies and organizations and used remote sensing, statistical methods, and simulation models.
On the Computation of Comprehensive Boolean Gröbner Bases
NASA Astrophysics Data System (ADS)
Inoue, Shutaro
We show that a comprehensive Boolean Gröbner basis of an ideal I in a Boolean polynomial ring B(Ā, X̄), with main variables X̄ and parameters Ā, can be obtained by simply computing a usual Boolean Gröbner basis of I regarding both X̄ and Ā as variables, with a certain block term order such that X̄ ≫ Ā. This result, together with the fact that a finite Boolean ring is isomorphic to a direct product of the Galois field GF(2), enables us to compute a comprehensive Boolean Gröbner basis by computing only the corresponding Gröbner bases in a polynomial ring over GF(2). Our implementation in the computer algebra system Risa/Asir shows that our method is extremely efficient compared with existing algorithms for computing comprehensive Boolean Gröbner bases.
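A hedged sketch of the reduction to GF(2), in SymPy rather than Risa/Asir: a lex order with the main variables listed first stands in for a block order with X̄ ≫ Ā, and adding the field equations v² + v makes the quotient ring Boolean. The ideal below is a toy example, not from the paper.

```python
# Assumed sketch: Groebner basis over GF(2) with field equations v**2 = v
# (over GF(2), v**2 - v == v**2 + v). Lex order with x, y before the
# parameter a acts as an elimination/block order with {x, y} >> {a}.
from sympy import groebner, symbols

x, y, a = symbols("x y a")
ideal = [x*y + a*x, x + y + a]                 # toy parametric Boolean ideal
field_eqs = [v**2 + v for v in (x, y, a)]      # Boolean ring relations
gb = groebner(ideal + field_eqs, x, y, a, order="lex", modulus=2)
print(gb)
```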
NASA Astrophysics Data System (ADS)
Shulgina, T.; Genina, E.; Gordov, E.; Nikitchuk, K.
2009-04-01
At present, numerous data archives, including meteorological observations as well as climate modeling data, are available to Earth science specialists. Methods of mathematical statistics are widely used for their processing and analysis; in many cases they represent the only way of quantitatively assessing meteorological and climatic information. A unified set of analysis methods allows us to compare climatic characteristics calculated on the basis of different datasets in order to perform a more detailed analysis of climate dynamics at both regional and global levels. The report presents the results of a comparative analysis of atmospheric temperature behavior over Northern Eurasia for the period from 1979 to 2004, based on the NCEP/NCAR Reanalysis, NCEP/DOE Reanalysis AMIP II, JMA/CRIEPI JRA-25 Reanalysis, and ECMWF ERA-40 Reanalysis data, together with observation data obtained from meteorological stations of the former Soviet Union. Statistical processing of the atmospheric temperature data included analysis of the homogeneity of time series of WMO-approved climate indices, such as "Number of frost days", "Number of summer days", "Number of icing days", and "Number of tropical nights", by means of parametric methods of mathematical statistics (Fisher and Student tests). This allowed a comprehensive study of the spatio-temporal features of atmospheric temperature. Analysis of the atmospheric temperature dynamics revealed inhomogeneity of the data obtained over large observation intervals. In particular, the analysis for the period 1979-2004 showed a significant increase in the numbers of frost and icing days of approximately 1 day every 2 years, and a decrease of roughly 1 day every 2 years in the number of summer days. The growing-period mean temperature increased by 1.5-2 °C over the period considered. The use of different reanalysis datasets in conjunction with in-situ observations allowed comparison of climate index values calculated from different datasets, which improves the reliability of the results obtained. Partial support of SB RAS Basic Research Program 4.5.2 (Project 2) is acknowledged.
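As an illustration of how such indices are computed, the sketch below derives annual counts of two of the named indices from a synthetic daily temperature series; the threshold definitions (Tmin < 0 °C for frost days, Tmax > 25 °C for summer days) are the standard WMO ones, but the data are invented.

```python
# Illustrative computation of annual climate indices from daily temperatures.
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
days = pd.date_range("1979-01-01", "2004-12-31", freq="D")
doy = days.dayofyear.to_numpy()
tmin = -12 * np.cos(2 * np.pi * doy / 365) + rng.normal(0, 4, len(days))
tmax = tmin + 8                                       # crude diurnal range

ts = pd.DataFrame({"tmin": tmin, "tmax": tmax}, index=days)
yearly = ts.groupby(ts.index.year).agg(
    frost_days=("tmin", lambda s: int((s < 0).sum())),    # FD: Tmin < 0 C
    summer_days=("tmax", lambda s: int((s > 25).sum())),  # SU: Tmax > 25 C
)
print(yearly.head())
```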
Ries, Kernell G.; Eng, Ken
2010-01-01
The U.S. Geological Survey, in cooperation with the Maryland Department of the Environment, operated a network of 20 low-flow partial-record stations during 2008 in a region that extends from southwest of Baltimore to the northeastern corner of Maryland to obtain estimates of selected streamflow statistics at the station locations. The study area is expected to face a substantial influx of new residents and businesses as a result of military and civilian personnel transfers associated with the Federal Base Realignment and Closure Act of 2005. The estimated streamflow statistics, which include monthly 85-percent duration flows, the 10-year recurrence-interval minimum base flow, and the 7-day, 10-year low flow, are needed to provide a better understanding of the availability of water resources in the area to be affected by base-realignment activities. Streamflow measurements collected for this study at the low-flow partial-record stations, and measurements collected previously for 8 of the 20 stations, were related to concurrent daily flows at nearby index streamgages to estimate the streamflow statistics. Three methods were used to estimate the streamflow statistics and two methods were used to select the index streamgages. Of the three methods used to estimate the streamflow statistics, two, the Moments and MOVE1 methods, rely on correlating the streamflow measurements at the low-flow partial-record stations with concurrent streamflows at nearby, hydrologically similar index streamgages to determine the estimates. These methods, recommended for use by the U.S. Geological Survey, generally require about 10 streamflow measurements at the low-flow partial-record station. The third method transfers the streamflow statistics from the index streamgage to the partial-record station based on the average of the ratios of the measured streamflows at the partial-record station to the concurrent streamflows at the index streamgage. This method can be used with as few as one pair of streamflow measurements made on a single streamflow recession at the low-flow partial-record station, although additional pairs of measurements will increase the accuracy of the estimates. Errors associated with the two correlation methods generally were lower than the errors associated with the flow-ratio method, but the advantages of the flow-ratio method are that it can produce reasonably accurate estimates from streamflow measurements much faster and at lower cost than estimates obtained using the correlation methods. The two index-streamgage selection methods were (1) selection based on the highest correlation coefficient between the low-flow partial-record station and the index streamgages, and (2) selection based on Euclidean distance, where the Euclidean distance was computed as a function of geographic proximity and the basin characteristics: drainage area, percentage of forested area, percentage of impervious area, and the base-flow recession time constant, τ. Method 1 generally selected index streamgages that were significantly closer to the low-flow partial-record stations than method 2. The errors associated with the estimated streamflow statistics generally were lower for method 1 than for method 2, but the differences were not statistically significant. The flow-ratio method for estimating streamflow statistics at low-flow partial-record stations was shown to be independent of the two correlation-based estimation methods.
As a result, final estimates were determined for eight low-flow partial-record stations by weighting estimates from the flow-ratio method with estimates from one of the two correlation methods according to the respective variances of the estimates. Average standard errors of estimate for the final estimates ranged from 7.0 to 90.0 percent, with an average value of 26.5 percent. Average standard errors of estimate for the weighted estimates were, on average, 4.3 percent less than the best average standard errors of estimate.
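For reference, the MOVE1 (Maintenance of Variance Extension, Type 1) relation transfers information from the index streamgage by matching the mean and standard deviation of the (typically log-transformed) concurrent flows. A minimal sketch with simulated measurements follows; all numbers are placeholders, not values from the report.

```python
# Hedged MOVE1 sketch: estimate log-flow at a partial-record station from
# the index gage by matching means and standard deviations of log flows.
import numpy as np

rng = np.random.default_rng(6)
index_flows = np.exp(rng.normal(3.0, 0.8, 12))       # ~10 concurrent pairs
partial_flows = np.exp(0.9 * np.log(index_flows) + rng.normal(0.2, 0.1, 12))

x, y = np.log(index_flows), np.log(partial_flows)

def move1(x_new):
    """MOVE1 line: mean/variance matching of log-flows."""
    return y.mean() + y.std(ddof=1) / x.std(ddof=1) * (x_new - x.mean())

q85_index = np.log(15.0)      # e.g., a duration statistic at the index gage
print("estimated flow at station:", np.exp(move1(q85_index)))
```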
The Effect of Local Smokefree Regulations on Birth Outcomes and Prenatal Smoking.
Bartholomew, Karla S; Abouk, Rahi
2016-07-01
Objectives We assessed the impact of varying levels of smokefree regulations on birth outcomes and prenatal smoking. Methods We exploited variations in the timing and restrictiveness of West Virginia's county smokefree regulations to assess their impact on birthweight, gestational age, low birthweight, very low birthweight, preterm birth, and prenatal smoking. We conducted regression analysis using state Vital Statistics individual-level data for singletons born to West Virginia residents between 1995 and 2010 (N = 293,715). Results Only more comprehensive smokefree regulations were associated with statistically significant favorable effects on birth outcomes in the full sample: Comprehensive (workplace/restaurant/bar ban) demonstrated increased birthweight (29 grams, p < 0.05) and gestational age (1.64 days, p < 0.01), as well as reductions in very low birthweight (-0.4%, p < 0.05) and preterm birth (-1.5%, p < 0.01); Restrictive (workplace/restaurant ban) demonstrated a small decrease in very low birthweight (-0.2%, p < 0.05). Among less restrictive regulations, Moderate (workplace ban) was associated with a 23 g (p < 0.01) decrease in birthweight, and Limited (partial ban) had no effect. Comprehensive's improvements extended to most maternal groups and were broadest among mothers 21+ years, non-smokers, and unmarried mothers. Prenatal smoking declined slightly (-1.7%, p < 0.01) only among married women under Comprehensive. Conclusions Regulation restrictiveness is a determining factor in the impact of smokefree regulations on birth outcomes, with comprehensive smokefree regulations showing promise in improving birth outcomes. Favorable effects on birth outcomes appear to stem from reduced secondhand smoke exposure rather than reduced prenatal smoking prevalence. This study is limited by an inability to measure secondhand smoke exposure and the paucity of data on policy implementation and enforcement.
ERIC Educational Resources Information Center
McCulloch, Ryan Sterling
2017-01-01
The role of any statistics course is to increase the understanding and comprehension of statistical concepts and those goals can be achieved via both theoretical instruction and statistical software training. However, many introductory courses either forego advanced software usage, or leave its use to the student as a peripheral activity. The…
Method and system for efficient video compression with low-complexity encoder
NASA Technical Reports Server (NTRS)
Chen, Jun (Inventor); He, Dake (Inventor); Sheinin, Vadim (Inventor); Jagmohan, Ashish (Inventor); Lu, Ligang (Inventor)
2012-01-01
Disclosed are a method and system for video compression, wherein the video encoder has low computational complexity and high compression efficiency. The disclosed system comprises a video encoder and a video decoder, wherein the method for encoding includes the steps of: converting a source frame into a space-frequency representation; estimating conditional statistics of at least one vector of space-frequency coefficients; estimating encoding rates based on the said conditional statistics; and applying Slepian-Wolf codes with the said computed encoding rates. The preferred method for decoding includes the steps of: generating a side-information vector of frequency coefficients based on previously decoded source data, encoder statistics, and previous reconstructions of the source frequency vector; and performing Slepian-Wolf decoding of at least one source frequency vector based on the generated side information, the Slepian-Wolf code bits, and the encoder statistics.
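The encoding-rate step rests on the Slepian-Wolf result that, with side information Y available only at the decoder, X can still be encoded at a rate approaching the conditional entropy H(X|Y). A toy numerical illustration of that rate bound (a binary source through an assumed 10% bit-flip channel, not the patent's estimator):

```python
# Estimate H(X|Y) empirically for a binary source with noisy side information.
import numpy as np

rng = np.random.default_rng(7)
n = 100_000
x = rng.integers(0, 2, n)
flip = rng.random(n) < 0.1                  # Y = X through a BSC(0.1)
y = x ^ flip.astype(int)

joint = np.zeros((2, 2))
np.add.at(joint, (x, y), 1)                 # empirical joint counts
joint /= n
py = joint.sum(axis=0)                      # marginal P(Y)
h_x_given_y = -np.nansum(joint * np.log2(joint / py))   # H(X|Y)
print(f"required rate ~ {h_x_given_y:.3f} bits/symbol (vs 1.0 for H(X))")
```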
Association of ED with chronic periodontal disease.
Matsumoto, S; Matsuda, M; Takekawa, M; Okada, M; Hashizume, K; Wada, N; Hori, J; Tamaki, G; Kita, M; Iwata, T; Kakizaki, H
2014-01-01
To examine the relationship between chronic periodontal disease (CPD) and ED, an interview sheet including the CPD self-checklist (CPD score) and the five-item version of the International Index of Erectile Function (IIEF-5) was distributed to 300 adult men who received a comprehensive dental examination. Statistical analyses were performed using Spearman's rank correlation coefficient and other methods. Statistical significance was accepted at the level of P<0.05. The interview sheets were collected from 88 men (response rate 29.3%, 50.9±16.6 years old). There was a statistically significant correlation between the CPD score and the presence of ED (P=0.0415). The results of the present study suggest that ED is related to the damage caused by endothelial dysfunction and the systemic inflammatory changes associated with CPD. The present study also suggests that dental health is important as preventive medicine for ED.
2009-01-01
Background Biomedical scientists need to access figures to validate research facts and to formulate or to test novel research hypotheses. However, figures are difficult to comprehend without associated text (e.g., figure legend and other reference text). We are developing automated systems to extract the relevant explanatory information along with figures extracted from full text articles. Such systems could be very useful in improving figure retrieval and in reducing the workload of biomedical scientists, who otherwise have to retrieve and read the entire full-text journal article to determine which figures are relevant to their research. As a crucial step, we studied the importance of associated text in biomedical figure comprehension. Methods Twenty subjects evaluated three figure-text combinations: figure+legend, figure+legend+title+abstract, and figure+full-text. Using a Likert scale, each subject scored each figure+text according to the extent to which the subject thought he/she understood the meaning of the figure and the confidence in providing the assigned score. Additionally, each subject entered a free text summary for each figure-text. We identified missing information using indicator words present within the text summaries. Both the Likert scores and the missing information were statistically analyzed for differences among the figure-text types. We also evaluated the quality of text summaries with the text-summarization evaluation method the ROUGE score. Results Our results showed statistically significant differences in figure comprehension when varying levels of text were provided. When the full-text article is not available, presenting just the figure+legend left biomedical researchers lacking 39–68% of the information about a figure as compared to having complete figure comprehension; adding the title and abstract improved the situation, but still left biomedical researchers missing 30% of the information. When the full-text article is available, figure comprehension increased to 86–97%; this indicates that researchers felt that only 3–14% of the necessary information for full figure comprehension was missing when full text was available to them. Clearly there is information in the abstract and in the full text that biomedical scientists deem important for understanding the figures that appear in full-text biomedical articles. Conclusion We conclude that the texts that appear in full-text biomedical articles are useful for understanding the meaning of a figure, and an effective figure-mining system needs to unlock the information beyond figure legend. Our work provides important guidance to the figure mining systems that extract information only from figure and figure legend. PMID:19126221
Using Purposefully Created Stories to Teach Academic Vocabulary
ERIC Educational Resources Information Center
Lee, Changnam; Roberts, Carly; Coffey, Debra
2017-01-01
Students' knowledge of vocabulary affects their reading comprehension. Despite abundant research findings in vocabulary learning, practical instructional methods for use in schools are typically underdeveloped. This article proposes a research-based method for teaching the meanings of base academic vocabulary (i.e., Tier 2) words. The method…
Barik, Amita; Das, Santasabuj
2018-01-02
Small RNAs (sRNAs) in bacteria have emerged as key players in transcriptional and post-transcriptional regulation of gene expression. Here, we present a statistical analysis of different sequence- and structure-related features of bacterial sRNAs to identify the descriptors that could discriminate sRNAs from other bacterial RNAs. We investigated a comprehensive and heterogeneous collection of 816 sRNAs, identified by northern blotting across 33 bacterial species, and compared their various features with other classes of bacterial RNAs, such as tRNAs, rRNAs and mRNAs. We observed that sRNAs differed significantly from the rest with respect to G+C composition, normalized minimum free energy of folding, motif frequency and several RNA-folding parameters, such as base-pairing propensity, Shannon entropy and base-pair distance. Based on the selected features, we developed a predictive model using the Random Forests (RF) method to classify the above four classes of RNAs. Our model displayed an overall predictive accuracy of 89.5%. These findings would help to differentiate bacterial sRNAs from other RNAs and further promote the prediction of novel sRNAs in different bacterial species.
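A schematic of the classification step using scikit-learn; the simulated feature matrix merely stands in for the sequence- and structure-derived descriptors named above (G+C content, normalized folding energy, Shannon entropy, etc.), so the printed accuracy has no relation to the paper's 89.5%.

```python
# Illustrative four-class Random Forest over simulated RNA descriptors.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
n_per_class, n_features = 200, 6
X, y = [], []
for label in range(4):                      # sRNA, tRNA, rRNA, mRNA
    center = rng.normal(0, 1, n_features)   # class-specific feature profile
    X.append(center + rng.normal(0, 1.2, (n_per_class, n_features)))
    y += [label] * n_per_class
X = np.vstack(X)

clf = RandomForestClassifier(n_estimators=300, random_state=0)
scores = cross_val_score(clf, X, np.array(y), cv=5)
print(f"CV accuracy: {scores.mean():.3f}")
```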
Alternative evaluation metrics for risk adjustment methods.
Park, Sungchul; Basu, Anirban
2018-06-01
Risk adjustment is instituted to counter risk selection by accurately equating payments with expected expenditures. Traditional risk-adjustment methods are designed to estimate accurate payments at the group level. However, this generates residual risks at the individual level, especially for high-expenditure individuals, thereby inducing health plans to avoid those with high residual risks. To identify an optimal risk-adjustment method, we perform a comprehensive comparison of prediction accuracies at the group level, at the tail distributions, and at the individual level across 19 estimators: 9 parametric regression, 7 machine learning, and 3 distributional estimators. Using the 2013-2014 MarketScan database, we find that no one estimator performs best in all prediction accuracies. Generally, machine learning and distribution-based estimators achieve higher group-level prediction accuracy than parametric regression estimators. However, parametric regression estimators show higher tail distribution prediction accuracy and individual-level prediction accuracy, especially at the tails of the distribution. This suggests that there is a trade-off in selecting an appropriate risk-adjustment method between estimating accurate payments at the group level and lower residual risks at the individual level. Our results indicate that an optimal method cannot be determined solely on the basis of statistical metrics but rather needs to account for simulating plans' risk selective behaviors. Copyright © 2018 John Wiley & Sons, Ltd.
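The trade-off described here can be made concrete with a toy comparison; the two models, the synthetic skewed-expenditure data, and the two error summaries below are illustrative choices, not the paper's 19 estimators or its MarketScan analysis.

```python
# Hedged sketch: contrast group-level payment accuracy with residual risk at
# the high-expenditure tail for two generic estimators.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(9)
n = 5000
X = rng.normal(size=(n, 5))                       # risk-adjuster covariates
y = np.exp(1 + X @ np.array([0.5, 0.3, 0.2, 0.1, 0.05]) + rng.normal(0, 1, n))

for name, model in [("OLS", LinearRegression()),
                    ("GBM", GradientBoostingRegressor(random_state=0))]:
    pred = model.fit(X, y).predict(X)
    group_err = abs(pred.mean() - y.mean())       # group-level payment gap
    tail = y >= np.quantile(y, 0.99)              # top 1% spenders
    tail_err = np.mean(y[tail] - pred[tail])      # mean residual risk at tail
    print(f"{name}: group gap={group_err:.1f}  tail residual={tail_err:.1f}")
```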
[Effectiveness of different maintenance methods for codonopsis radix].
Shi, Yan-Bin; Wang, Yu-Ping; Li, Yan; Liu, Cheng-Song; Li, Hui-Li; Zhang, Xiao-Yun; Li, Shou-Tang
2014-05-01
To observe different maintenance methods, including vacuum-packing, storage together with tobacco, storage together with fennel, ethanol steam, and sulfur fumigation, for the protection of Codonopsis Radix against mildew and insect damage, and to analyze the polysaccharide and flavonoid content of the Codonopsis Radix tested in this study, in order to identify scientific maintenance methods that could replace traditional sulfur fumigation. Except for sulfur fumigation, naturally air-dried Codonopsis Radix was used to investigate the effectiveness of each of the above methods. Mildew was observed by visual inspection, and the polysaccharide and flavonoid contents were determined by ultraviolet-visible spectrophotometry. A comprehensive evaluation was made based on the results of the different maintenance methods. Low-temperature vacuum-packing, ambient-temperature vacuum-packing, and sulfur fumigation could keep Codonopsis Radix free from mildew and insect damage for one year, but ambient-temperature vacuum-packing showed a flatulent (bag-swelling) phenomenon; ethanol steam could keep Codonopsis Radix free from mildew and insects for over half a year; storage together with tobacco or fennel did not have a maintenance effect. The differences in polysaccharide and flavonoid contents among all tested Codonopsis Radix samples were not statistically significant. Low-temperature vacuum-packing can replace traditional sulfur fumigation and can maintain the quality of Codonopsis Radix to a certain extent.
Harper, Marc; Gronenberg, Luisa; Liao, James; Lee, Christopher
2014-01-01
Discovering all the genetic causes of a phenotype is an important goal in functional genomics. We combine an experimental design for detecting independent genetic causes of a phenotype with a high-throughput sequencing analysis that maximizes sensitivity for comprehensively identifying them. Testing this approach on a set of 24 mutant strains generated for a metabolic phenotype with many known genetic causes, we show that this pathway-based phenotype sequencing analysis greatly improves sensitivity of detection compared with previous methods, and reveals a wide range of pathways that can cause this phenotype. We demonstrate our approach on a metabolic re-engineering phenotype, the PEP/OAA metabolic node in E. coli, which is crucial to a substantial number of metabolic pathways and under renewed interest for biofuel research. Out of 2157 mutations in these strains, pathway-phenoseq discriminated just five gene groups (12 genes) as statistically significant causes of the phenotype. Experimentally, these five gene groups, and the next two high-scoring pathway-phenoseq groups, either have a clear connection to the PEP metabolite level or offer an alternative path of producing oxaloacetate (OAA), and thus clearly explain the phenotype. These high-scoring gene groups also show strong evidence of positive selection pressure, compared with strictly neutral selection in the rest of the genome.
Wieser, Stefan; Axmann, Markus; Schütz, Gerhard J.
2008-01-01
We propose here an approach for the analysis of single-molecule trajectories which is based on a comprehensive comparison of an experimental data set with multiple Monte Carlo simulations of the diffusion process. It allows quantitative data analysis, particularly whenever analytical treatment of a model is infeasible. Simulations are performed on a discrete parameter space and compared with the experimental results by a nonparametric statistical test. The method provides a matrix of p-values that assess the probability for having observed the experimental data at each setting of the model parameters. We show the testing approach for three typical situations observed in the cellular plasma membrane: (i) free Brownian motion of the tracer, (ii) hop diffusion of the tracer in a periodic meshwork of squares, and (iii) transient binding of the tracer to slowly diffusing structures. By plotting the p-value as a function of the model parameters, one can easily identify the most consistent parameter settings but also recover mutual dependencies and ambiguities which are difficult to determine by standard fitting routines. Finally, we used the test to reanalyze previous data obtained on the diffusion of the glycosylphosphatidylinositol-protein CD59 in the plasma membrane of the human T24 cell line. PMID:18805933
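As a hedged illustration of the general recipe (not the authors' implementation), the sketch below simulates 1D displacement steps for free Brownian motion over a grid of candidate diffusion coefficients and scores each setting against "observed" steps with a two-sample Kolmogorov-Smirnov test, yielding one p-value per parameter setting; the frame interval, grid, and all data are assumptions.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
dt = 0.01                                                     # frame interval (s), assumed
obs_steps = rng.normal(0, np.sqrt(2 * 0.5 * dt), size=500)    # "observed" steps, true D = 0.5

D_grid = np.linspace(0.1, 1.0, 10)        # candidate diffusion coefficients (um^2/s)
p_values = np.empty_like(D_grid)
for i, D in enumerate(D_grid):
    sim_steps = rng.normal(0, np.sqrt(2 * D * dt), size=5000)  # Monte Carlo at this setting
    p_values[i] = ks_2samp(obs_steps, sim_steps).pvalue        # nonparametric comparison

# Settings with a high p-value are consistent with the observation.
print("most consistent D:", D_grid[p_values.argmax()])
```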
Development of a Comprehensive Heart Disease Knowledge Questionnaire
Bergman, Hannah E.; Reeve, Bryce B.; Moser, Richard P.; Scholl, Sarah; Klein, William M. P.
2011-01-01
Background Heart disease is the number one killer of both men and women in the United States, yet a comprehensive and evidence-based heart disease knowledge assessment is currently not available. Purpose This paper describes the two-phase development of a novel heart disease knowledge questionnaire. Methods After review and critique of the existing literature, a questionnaire addressing 5 central domains of heart disease knowledge was constructed. In Phase I, 606 undergraduates completed an 82-item questionnaire. In Phase II, 248 undergraduates completed a revised 74-item questionnaire. In both phases, item clarity and difficulty were evaluated, along with the overall factor structure of the scale. Results Exploratory and confirmatory factor analyses were used to reduce the scale to 30 items with fit statistics, CFI = .82, TLI = .88, and RMSEA = .03. Scores were correlated moderately positively with an existing scale and weakly positively with a measure of health literacy, thereby establishing both convergent and divergent validity. Discussion The finalized 30-item questionnaire is a concise yet discriminating instrument that reliably measures participants' heart disease knowledge levels. Translation to Health Education Practice Health professionals can use this scale to assess their patients' heart disease knowledge so that they can create a tailored program to help their patients reduce their heart disease risk. PMID:21720571
A Probability-Based Statistical Method to Extract Water Body of TM Images with Missing Information
NASA Astrophysics Data System (ADS)
Lian, Shizhong; Chen, Jiangping; Luo, Minghai
2016-06-01
Water information cannot be accurately extracted from some TM images because true information is lost to blocking clouds and missing data stripes. Since water is continuously distributed under natural conditions, this paper proposes a new method of water body extraction based on probability statistics to improve the accuracy of water information extraction from TM images with missing information. Different kinds of disturbing information from clouds and missing data stripes were simulated, and water information was extracted from the simulated images using global histogram matching, local histogram matching, and the probability-based statistical method. Experiments show that a smaller Areal Error and a higher Boundary Recall can be obtained using this method compared with the conventional methods.
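For reference, global histogram matching, one of the baselines compared above, can be written compactly as a quantile mapping. The sketch below is a generic NumPy implementation on synthetic bands, not the paper's code; image values and sizes are made up.

```python
import numpy as np

def match_histograms(image, reference):
    """Map image gray levels onto the reference distribution via quantile matching."""
    src_vals, src_idx, src_cnt = np.unique(image.ravel(),
                                           return_inverse=True, return_counts=True)
    ref_vals, ref_cnt = np.unique(reference.ravel(), return_counts=True)
    src_cdf = np.cumsum(src_cnt) / image.size
    ref_cdf = np.cumsum(ref_cnt) / reference.size
    mapped = np.interp(src_cdf, ref_cdf, ref_vals)   # quantile mapping
    return mapped[src_idx].reshape(image.shape)

rng = np.random.default_rng(2)
reference = rng.normal(100, 20, (64, 64))            # intact band (hypothetical)
damaged = rng.normal(60, 10, (64, 64))               # band with altered statistics
print(match_histograms(damaged, reference).mean())   # ~100 after matching
```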
A computerized data base of nitrate concentrations in Indiana ground water
Risch, M.R.; Cohen, D.A.
1995-01-01
The nitrate data base was compiled from numerous data sets that were readily accessible in electronic format. The uses of these data may be limited because they were neither comprehensive nor of a single statistical design. Nonetheless, the nitrate data can be used in several ways: (1) to identify geographic areas with and without nitrate data; (2) to evaluate assumptions, models, and maps of ground-water-contamination potential; and (3) to investigate the relation between environmental factors, land-use types, and the occurrence of nitrate.
A virtual climate library of surface temperature over North America for 1979-2015
NASA Astrophysics Data System (ADS)
Kravtsov, Sergey; Roebber, Paul; Brazauskas, Vytaras
2017-10-01
The most comprehensive continuous-coverage modern climatic data sets, known as reanalyses, come from combining state-of-the-art numerical weather prediction (NWP) models with diverse available observations. These reanalysis products estimate the path of climate evolution that actually happened, and their use in a probabilistic context—for example, to document trends in extreme events in response to climate change—is, therefore, limited. Free runs of NWP models without data assimilation can in principle be used for the latter purpose, but such simulations are computationally expensive and are prone to systematic biases. Here we produce a high-resolution, 100-member ensemble simulation of surface atmospheric temperature over North America for the 1979-2015 period using a comprehensive spatially extended non-stationary statistical model derived from the data based on the North American Regional Reanalysis. The surrogate climate realizations generated by this model are independent from, yet nearly statistically congruent with reality. This data set provides unique opportunities for the analysis of weather-related risk, with applications in agriculture, energy development, and protection of human life.
Alsharif, Abdelhamid M; Potts, Michelle; Laws, Regina; Freire, Amado X; Sultan-Ali, Ibrahim
2016-10-01
Obstructive sleep apnea (OSA) is a prevalent disorder that is associated with multiple medical consequences. Although in-laboratory polysomnography is the gold standard for the diagnosis of OSA, portable monitors have been developed and studied to help increase efficiency and ease of diagnosis. We aimed to assess the adequacy of a midlevel provider specializing in sleep medicine to risk-stratify patients for OSA based on a chart review versus a comprehensive clinic evaluation before scheduling an unattended sleep study. This study was an observational, nonrandomized, retrospective data collection by chart review of patients accrued prospectively who underwent an unattended sleep study at the Sleep Health Center at the Memphis Veterans Affairs Medical Center during the first 13 months of the program (May 1, 2011-May 31, 2012). A total of 205 patients were included in the data analysis. Analysis showed no statistically significant difference between the chart review and clinic visit groups (P = 0.54) in terms of OSA diagnosis. Although not statistically significant, the analysis shows a trend toward higher mean age (50.3 vs 47.4 years; P = 0.10) and lower mean body mass index (34.4 vs 36.0; P = 0.08) in individuals who were evaluated during a comprehensive clinic visit. A statistically significant difference is seen in the pretest clinical probability of OSA being moderate or high: 62.2% of patients in the clinic visit group versus 95.7% in the chart review group (χ2 P ≤ 0.0001). In the Veterans Health Administration's system, the assessment of pretest probability may be determined by a midlevel provider using chart review with equal efficacy to a comprehensive face-to-face evaluation in terms of OSA diagnosis via unattended sleep studies.
Statistical tools for transgene copy number estimation based on real-time PCR.
Yuan, Joshua S; Burris, Jason; Stewart, Nathan R; Mentewab, Ayalew; Stewart, C Neal
2007-11-01
As compared with traditional transgene copy number detection technologies such as Southern blot analysis, real-time PCR provides a fast, inexpensive and high-throughput alternative. However, real-time PCR-based transgene copy number estimation tends to be ambiguous and subjective, stemming from the lack of proper statistical analysis and data quality control needed to render a reliable copy number estimate with a predictive value. Despite recent progress in the statistical analysis of real-time PCR, few publications have integrated these advancements into real-time PCR-based transgene copy number determination. Three experimental designs and four data-quality-control-integrated statistical models are presented. In the first design, external calibration curves are established for the transgene based on serially diluted templates. The Ct numbers from a control transgenic event and a putative transgenic event are then compared to derive the transgene copy number or zygosity estimate. Simple linear regression and two-group t-test procedures were combined to model the data from this design. In the second design, standard curves are generated for both an internal reference gene and the transgene, and the copy number of the transgene is compared with that of the internal reference gene. Multiple regression models and ANOVA models can be employed to analyze the data and perform quality control for this approach. In the third design, transgene copy number is compared with the reference gene without a standard curve, based directly on the fluorescence data. Two different multiple regression models were proposed to analyze the data, based on two different approaches to integrating amplification efficiency. Our results highlight the importance of proper statistical treatment and quality control integration in real-time PCR-based transgene copy number determination. These statistical methods make real-time PCR-based transgene copy number estimation more reliable and precise, and proper confidence intervals are necessary for unambiguous prediction of transgene copy number. The four statistical methods are compared for their advantages and disadvantages. Moreover, they can also be applied to other real-time PCR-based quantification assays, including transfection efficiency analysis and pathogen quantification.
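A minimal sketch of the first design, with hypothetical Ct values: fit the external calibration curve by simple linear regression of Ct against log10 template amount, estimate amplification efficiency from the slope, and compare a control event against a putative event with a two-group t-test. All numbers are illustrative.

```python
import numpy as np
from scipy import stats

log10_template = np.array([3.0, 4.0, 5.0, 6.0, 7.0])   # serial dilutions
ct_curve = np.array([30.1, 26.8, 23.4, 20.0, 16.7])    # hypothetical Ct values
fit = stats.linregress(log10_template, ct_curve)
print(f"slope = {fit.slope:.2f}, efficiency = {10**(-1/fit.slope) - 1:.1%}")

ct_control = np.array([24.1, 24.3, 24.0])   # single-copy control event, replicates
ct_test = np.array([23.1, 23.2, 23.0])      # putative event, replicates
t, p = stats.ttest_ind(ct_control, ct_test)

# Copy ratio inferred from the calibration slope (~ -3.32 at 100% efficiency):
ratio = 10 ** ((ct_control.mean() - ct_test.mean()) / -fit.slope)
print(f"estimated copy ratio = {ratio:.2f}, t-test p = {p:.4f}")
```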
Zhang, Yun; Baheti, Saurabh; Sun, Zhifu
2018-05-01
High-throughput bisulfite methylation sequencing such as reduced representation bisulfite sequencing (RRBS), Agilent SureSelect Human Methyl-Seq (Methyl-seq) or whole-genome bisulfite sequencing is commonly used for base-resolution methylome research. These data are represented either by the ratio of methylated cytosines versus total coverage at a CpG site or by the numbers of methylated and unmethylated cytosines. Multiple statistical methods can be used to detect differentially methylated CpGs (DMCs) between conditions, and these methods are often the basis for the next step of differentially methylated region identification. The ratio data have the flexibility of fitting many linear models, whereas the raw count data take coverage information into account. There is an array of options in each datatype for DMC detection; however, it is not clear which is the optimal statistical method. In this study, we systematically evaluated four statistical methods on methylation ratio data and four methods on count-based data and compared their performances with regard to type I error control, sensitivity and specificity of DMC detection, and computational resource demands, using real RRBS data along with simulation. Our results show that the ratio-based tests are generally more conservative (less sensitive) than the count-based tests. However, some count-based methods have high false-positive rates and should be avoided. The beta-binomial model gives a good balance between sensitivity and specificity and is the preferred method. Selection of methods in different settings, signal versus noise, and sample size estimation are also discussed.
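As a hedged sketch of the preferred model class, the snippet below fits a beta-binomial by maximum likelihood to methylated/coverage counts in each condition and forms a likelihood-ratio test for a single CpG. The counts are synthetic, and the tools benchmarked in the study differ in parameterization and dispersion handling.

```python
import numpy as np
from scipy.stats import betabinom, chi2
from scipy.optimize import minimize

def nll(params, meth, cov):
    a, b = np.exp(params)                    # log-parameterization keeps a, b > 0
    return -betabinom.logpmf(meth, cov, a, b).sum()

def min_nll(meth, cov):
    res = minimize(nll, x0=[0.0, 0.0], args=(meth, cov), method="Nelder-Mead")
    return res.fun                           # minimized negative log-likelihood

# Hypothetical methylated counts and coverages at one CpG, two conditions:
meth1, cov1 = np.array([18, 22, 25]), np.array([30, 31, 33])
meth2, cov2 = np.array([8, 12, 9]),   np.array([29, 34, 30])

ll_alt = -(min_nll(meth1, cov1) + min_nll(meth2, cov2))        # separate fits
ll_null = -min_nll(np.r_[meth1, meth2], np.r_[cov1, cov2])     # pooled fit
lrt = 2 * (ll_alt - ll_null)
print("LRT p-value:", chi2.sf(lrt, df=2))    # 4 vs. 2 free parameters
```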
NASA Astrophysics Data System (ADS)
Pignalosa, Antonio; Di Crescenzo, Giuseppe; Marino, Ermanno; Terracciano, Rosario; Santo, Antonio
2015-04-01
The work presented here concerns a case study in which a complete multidisciplinary workflow was applied for an extensive assessment of rockslide susceptibility and hazard in a common scenario: vertical, fractured rocky cliffs. The studied area is located in a high-relief zone of Southern Italy (Sacco, Salerno, Campania), characterized by wide vertical rocky cliffs formed by tectonized thick successions of shallow-water limestones. The study comprised the following phases: a) topographic surveying integrating 3d laser scanning, photogrammetry and GNSS; b) geological surveying, characterization of single instabilities and geomechanical surveying, conducted by rock-climbing geologists; c) processing of 3d data and reconstruction of high-resolution geometrical models; d) structural and geomechanical analyses; e) data filing in a GIS-based spatial database; f) geo-statistical and spatial analyses and mapping of the whole set of data; g) 3D rockfall analysis. The main goals of the study were a) to set up an investigation method achieving a complete and thorough characterization of the slope stability conditions and b) to provide a detailed base for an accurate definition of the reinforcement and mitigation systems. For these purposes the most up-to-date methods of field surveying, remote sensing, 3d modelling and geospatial data analysis were integrated in a systematic workflow, accounting for the economic sustainability of the whole project. A novel integrated approach was applied, fusing deterministic and statistical surveying methods. This approach made it possible to deal with the wide extension of the studied area (nearly 200,000 m2) without compromising the high accuracy of the results. The deterministic phase, based on field characterization of single instabilities and their further analysis on 3d models, was applied to delineate the peculiarities of each single feature. The statistical approach, based on geostructural field mapping and on point geomechanical data from scan-line surveying, allowed partitioning of the rock mass into homogeneous geomechanical sectors and data interpolation through bounded geostatistical analyses on 3d models. All data resulting from both approaches were referenced and filed in a single spatial database and considered in global geo-statistical analyses to derive a fully modelled and comprehensive evaluation of rockslide susceptibility. The described workflow yielded the following innovative results: a) a detailed census of single potential instabilities, through a spatial database recording their geometrical, geological and mechanical features along with the expected failure modes; b) a high-resolution characterization of the whole slope's rockslide susceptibility, based on partitioning the area according to stability and mechanical conditions that can be directly related to specific hazard mitigation systems; c) the exact extension of the area exposed to rockslide hazard, along with the dynamic parameters of the expected phenomena; d) an intervention design for hazard mitigation.
Education Statistics Quarterly, Spring 2001.
ERIC Educational Resources Information Center
Education Statistics Quarterly, 2001
2001-01-01
The "Education Statistics Quarterly" gives a comprehensive overview of work done across all parts of the National Center for Education Statistics (NCES). Each issue contains short publications, summaries, and descriptions that cover all NCES publications, data products and funding opportunities developed over a 3-month period. Each issue…
Who Benefits from an Intensive Comprehensive Aphasia Program?
ERIC Educational Resources Information Center
Babbitt, Edna M.; Worrall, Linda; Cherney, Leora R.
2016-01-01
Purpose: This article summarizes current outcomes from intensive comprehensive aphasia programs (ICAPs) and examines data from one ICAP to identify those who respond and do not respond to treatment. Methods: Participants were divided into 2 groups, responders and nonresponders, based on ±5-point change score on the Western Aphasia Battery-Revised…
Weighted Statistical Binning: Enabling Statistically Consistent Genome-Scale Phylogenetic Analyses
Bayzid, Md Shamsuzzoha; Mirarab, Siavash; Boussau, Bastien; Warnow, Tandy
2015-01-01
Because biological processes can result in different loci having different evolutionary histories, species tree estimation requires multiple loci from across multiple genomes. While many processes can result in discord between gene trees and species trees, incomplete lineage sorting (ILS), modeled by the multi-species coalescent, is considered to be a dominant cause for gene tree heterogeneity. Coalescent-based methods have been developed to estimate species trees, many of which operate by combining estimated gene trees, and so are called "summary methods". Because summary methods are generally fast (and much faster than more complicated coalescent-based methods that co-estimate gene trees and species trees), they have become very popular techniques for estimating species trees from multiple loci. However, recent studies have established that summary methods can have reduced accuracy in the presence of gene tree estimation error, and also that many biological datasets have substantial gene tree estimation error, so that summary methods may not be highly accurate in biologically realistic conditions. Mirarab et al. (Science 2014) presented the "statistical binning" technique to improve gene tree estimation in multi-locus analyses, and showed that it improved the accuracy of MP-EST, one of the most popular coalescent-based summary methods. Statistical binning, which uses a simple heuristic to evaluate "combinability" and then uses the larger sets of genes to re-calculate gene trees, has good empirical performance, but using statistical binning within a phylogenomic pipeline does not have the desirable property of being statistically consistent. We show that weighting the re-calculated gene trees by the bin sizes makes statistical binning statistically consistent under the multispecies coalescent, and maintains the good empirical performance. Thus, "weighted statistical binning" enables highly accurate genome-scale species tree estimation, and is also statistically consistent under the multi-species coalescent model. New data used in this study are available at DOI: http://dx.doi.org/10.6084/m9.figshare.1411146, and the software is available at https://github.com/smirarab/binning. PMID:26086579
A hierarchical fuzzy rule-based approach to aphasia diagnosis.
Akbarzadeh-T, Mohammad-R; Moshtagh-Khorasani, Majid
2007-10-01
Aphasia diagnosis is a particularly challenging medical diagnostic task due to the linguistic uncertainty and vagueness, inconsistencies in the definition of aphasic syndromes, the large number of measurements with imprecision, and the natural diversity and subjectivity in test subjects as well as in the opinions of the experts who diagnose the disease. To efficiently address this diagnostic process, a hierarchical fuzzy rule-based structure is proposed here that considers the effect of different features of aphasia, identified by statistical analysis, in its construction. This approach can be efficient for the diagnosis of aphasia, and possibly other medical diagnostic applications, due to its fuzzy and hierarchical reasoning construction. Initially, the symptoms of the disease, each of which consists of different features, are analyzed statistically. The statistical parameters measured from the training set are then used to define the membership functions and the fuzzy rules. The resulting two-layered fuzzy rule-based system is then compared with a back-propagating feed-forward neural network for the diagnosis of four aphasia types: Anomic, Broca, Global and Wernicke. In order to reduce the number of required inputs, the technique is applied and compared on both comprehensive and spontaneous speech tests. Statistical t-test analysis confirms that the proposed approach uses fewer aphasia features while also presenting a significant improvement in terms of accuracy.
Technology in Social Work Education: A Systematic Review
ERIC Educational Resources Information Center
Wretman, Christopher J.; Macy, Rebecca J.
2016-01-01
Given the growing prevalence of technology-based instruction, social work faculty need a clear understanding of the strengths and limitations of these methods. We systematically examined the evidence for technology-based instruction in social work education. Using comprehensive and rigorous methods, 38 articles were included in the review. Of…
Eppig, Joel S; Edmonds, Emily C; Campbell, Laura; Sanderson-Cimino, Mark; Delano-Wood, Lisa; Bondi, Mark W
2017-08-01
Research demonstrates heterogeneous neuropsychological profiles among individuals with mild cognitive impairment (MCI). However, few studies have included visuoconstructional ability or used latent mixture modeling to statistically identify MCI subtypes. Therefore, we examined whether unique neuropsychological MCI profiles could be ascertained using latent profile analysis (LPA), and subsequently investigated cerebrospinal fluid (CSF) biomarkers, genotype, and longitudinal clinical outcomes between the empirically derived classes. A total of 806 participants diagnosed by means of the Alzheimer's Disease Neuroimaging Initiative (ADNI) MCI criteria received a comprehensive neuropsychological battery assessing visuoconstructional ability, language, attention/executive function, and episodic memory. Test scores were adjusted for demographic characteristics using standardized regression coefficients based on "robust" normal control performance (n=260). The calculated z-scores were subsequently used in the LPA, and CSF-derived biomarkers, genotype, and longitudinal clinical outcome were evaluated between the LPA-derived MCI classes. Statistical fit indices suggested that a 3-class model was the optimal LPA solution, consisting of a mixed-impairment MCI class (n=106), an amnestic MCI class (n=455), and an LPA-derived normal class (n=245). Additionally, the amnestic and mixed classes were more likely to be apolipoprotein E ε4 positive and to have worse Alzheimer's disease CSF biomarkers than LPA-derived normal subjects. Our study supports significant heterogeneity in MCI neuropsychological profiles using LPA and extends prior work (Edmonds et al., 2015) by demonstrating a lower rate of progression in the approximately one-third of ADNI MCI individuals who may represent "false-positive" diagnoses. Our results underscore the importance of using sensitive, actuarial methods for diagnosing MCI, as current diagnostic methods may be over-inclusive. (JINS, 2017, 23, 564-576).
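LPA on standardized scores is commonly approximated with a Gaussian mixture model, selecting the number of classes by a fit index such as BIC. The sketch below illustrates that workflow on simulated z-scores (not ADNI data); class means, sizes, and dimensions are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(3)
# Hypothetical z-scores on 4 cognitive domains for 800 participants:
normal = rng.normal(0.0, 1.0, (500, 4))
amnestic = rng.normal([-0.2, -0.2, -0.2, -1.5], 0.8, (200, 4))  # memory-impaired
mixed = rng.normal(-1.5, 0.8, (100, 4))                         # impaired across domains
X = np.vstack([normal, amnestic, mixed])

bics = []
for k in range(1, 6):
    gm = GaussianMixture(n_components=k, n_init=5, random_state=0).fit(X)
    bics.append(gm.bic(X))                   # lower BIC = better-supported solution
print("BIC-selected number of classes:", int(np.argmin(bics)) + 1)
```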
NASA Astrophysics Data System (ADS)
Martinez, Patricia
This thesis describes a research study that resulted in an instructional model directed at helping fourth-grade diverse students improve their science knowledge, their reading comprehension, their awareness of the relationship between science and reading, and their ability to transfer strategies. The focus of the instructional model emerged from the intersection of constructs in science and reading literacy; the model identifies cognitive strategies that can be used in science and reading, and inquiry-based instruction related to the science content read by participants. The intervention is termed INSCIREAD (Instruction in Science and Reading). The GoInquire web-based system (2006) was used to develop students' content knowledge of slow landform change. Seventy-eight students participated in the study. The treatment group comprised 49 students without disabilities and 8 students with disabilities; the control group comprised 21 students without disabilities. The design of the study is a combination of a mixed-methods quasi-experimental design (Study 1) and a single-subject design with groups as the unit of analysis (Study 2). Analysis of the text recall data from Study 1 approached statistical significance when comparing the performance of students without disabilities in the treatment group to that of the control group. Visual analyses of the text recall data from Study 2 showed at least minimal change in all groups. Analysis of the level of the generated questions showed a statistically significant increase from pretest to posttest in the scores obtained by students without disabilities. The analyses conducted to detect incongruities, to summarize and rate importance, and to determine the number of propositions on a science and reading concept map showed a statistically significant difference between students without disabilities in the treatment and control groups on post-intervention scores. Analysis of the misconceptions of students without disabilities showed that the frequency of 4 of the 11 misconceptions changed significantly from the pre- to post-elicitation stages. The analyses of the qualitative measures, the think-alouds and interviews, generally supported the above findings.
Yang, James J; Li, Jia; Williams, L Keoki; Buu, Anne
2016-01-05
In genome-wide association studies (GWAS) for complex diseases, the association between a SNP and each phenotype is usually weak. Combining multiple related phenotypic traits can increase the power of gene search and thus is a practically important area that requires methodology work. This study provides a comprehensive review of existing methods for conducting GWAS on complex diseases with multiple phenotypes including the multivariate analysis of variance (MANOVA), the principal component analysis (PCA), the generalized estimating equations (GEE), the trait-based association test involving the extended Simes procedure (TATES), and the classical Fisher combination test. We propose a new method that relaxes the unrealistic independence assumption of the classical Fisher combination test and is computationally efficient. To demonstrate applications of the proposed method, we also present the results of statistical analysis on the Study of Addiction: Genetics and Environment (SAGE) data. Our simulation study shows that the proposed method has higher power than existing methods while controlling for the type I error rate. The GEE and the classical Fisher combination test, on the other hand, do not control the type I error rate and thus are not recommended. In general, the power of the competing methods decreases as the correlation between phenotypes increases. All the methods tend to have lower power when the multivariate phenotypes come from long-tailed distributions. The real data analysis also demonstrates that the proposed method allows us to compare the marginal results with the multivariate results and specify which SNPs are specific to a particular phenotype or contribute to the common construct. The proposed method outperforms existing methods in most settings and also has great applications in GWAS on complex diseases with multiple phenotypes such as the substance abuse disorders.
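For orientation, the classical Fisher combination and one standard correlation-adjusted relaxation of its independence assumption (Brown's method, using the Kost-McDermott polynomial approximation for the covariance of the log-p terms) can be sketched as follows. The p-values and the common correlation are hypothetical, and the authors' proposed method may differ in detail.

```python
import numpy as np
from scipy import stats

p = np.array([0.04, 0.03, 0.20])            # per-phenotype p-values (hypothetical)
stat, p_fisher = stats.combine_pvalues(p, method="fisher")
print("Fisher (independence assumed):", round(p_fisher, 4))

# Brown's adjustment: treat X = -2*sum(log p) as c * chi2_f, with c and f set
# by the mean and variance of X under correlated tests.
k = p.size
rho = 0.5                                   # assumed common correlation between tests
cov_pair = 3.263 * rho + 0.710 * rho**2 + 0.027 * rho**3   # Kost-McDermott approximation
mean_X = 2.0 * k
var_X = 4.0 * k + 2.0 * (k * (k - 1) / 2) * cov_pair
c = var_X / (2.0 * mean_X)
f = 2.0 * mean_X**2 / var_X
X = -2.0 * np.log(p).sum()
print("Brown (correlation-adjusted):", round(stats.chi2.sf(X / c, df=f), 4))
```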
A Comparison of the Achievement of Statistics Students Enrolled in Online and Face-to-Face Settings
ERIC Educational Resources Information Center
Christmann, Edwin P.
2017-01-01
This study compared the achievement of male and female students who were enrolled in an online univariate statistics course to students enrolled in a traditional face-to-face univariate statistics course. The subjects, 47 graduate students enrolled in univariate statistics classes at a public, comprehensive university, were randomly assigned to…
A voxel-based investigation for MRI-only radiotherapy of the brain using ultra short echo times
NASA Astrophysics Data System (ADS)
Edmund, Jens M.; Kjer, Hans M.; Van Leemput, Koen; Hansen, Rasmus H.; Andersen, Jon AL; Andreasen, Daniel
2014-12-01
Radiotherapy (RT) based on magnetic resonance imaging (MRI) as the only modality, so-called MRI-only RT, would remove the systematic registration error between MR and computed tomography (CT), and provide co-registered MRI for assessment of treatment response and adaptive RT. Electron densities, however, need to be assigned to the MRI images for dose calculation and patient setup based on digitally reconstructed radiographs (DRRs). Here, we investigate the geometric and dosimetric performance of a number of popular voxel-based methods to generate a so-called pseudo CT (pCT). Five patients receiving cranial irradiation, each with a co-registered MRI and CT scan, were included. An ultra-short echo time MRI sequence for bone visualization was used. Six methods were investigated, two for each of three popular types of voxel-based approaches: (1) threshold-based segmentation, (2) Bayesian segmentation and (3) statistical regression. Approach 1 used bulk density assignment of MRI voxels into air, soft tissue and bone based on logical masks and the transverse relaxation time T2 of the bone. Approach 2 used similar bulk density assignments with Bayesian statistics including or excluding additional spatial information. Approach 3 used a statistical regression correlating MRI voxels with their corresponding CT voxels. A similar photon and proton treatment plan was generated for a target positioned between the nasal cavity and the brainstem for all patients. The CT agreement with the pCT of each method was quantified and compared with the other methods geometrically and dosimetrically, using a number of reported metrics and introducing some novel metrics. The best geometrical agreement with CT was obtained with the statistical regression methods, which performed significantly better than the threshold and Bayesian segmentation methods (excluding spatial information). All methods agreed significantly better with CT than a reference water MRI comparison. The mean dosimetric deviation for photons and protons compared to the CT was about 2% and highest in the gradient dose region of the brainstem. Both the threshold-based method and the statistical regression methods showed the highest dosimetric agreement. Generation of pCTs using statistical regression seems to be the most promising candidate for MRI-only RT of the brain. Further, the total amount of different tissues needs to be taken into account for dosimetric considerations regardless of their correct geometrical position.
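Approach (1) can be sketched in a few lines: classify voxels into air, soft tissue and bone with logical masks (here an assumed short-T2 criterion for bone) and assign bulk HU values. The thresholds, HU values, and arrays below are illustrative stand-ins, not the study's calibrated values.

```python
import numpy as np

rng = np.random.default_rng(4)
intensity = rng.uniform(0, 1, (32, 32, 32))   # stand-in for UTE signal magnitude
t2 = rng.uniform(0, 80, intensity.shape)      # stand-in for transverse relaxation (ms)

pseudo_ct = np.full(intensity.shape, -1000.0)  # start as air (HU)
soft = intensity > 0.2                         # signal-bearing voxels -> soft tissue
pseudo_ct[soft] = 20.0                         # bulk soft-tissue HU (assumed)
bone = soft & (t2 < 10.0)                      # short-T2 voxels -> bone
pseudo_ct[bone] = 700.0                        # bulk bone HU (assumed)

print({hu: int((pseudo_ct == hu).sum()) for hu in (-1000.0, 20.0, 700.0)})
```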
Development of methods for establishing nutrient criteria in lakes and reservoirs: A review.
Huo, Shouliang; Ma, Chunzi; Xi, Beidou; Zhang, Yali; Wu, Fengchang; Liu, Hongliang
2018-05-01
Nutrient criteria provide a scientific foundation for the comprehensive evaluation, prevention, control and management of water eutrophication. In this review, the literature was examined to systematically evaluate the benefits, drawbacks, and applications of statistical analysis, paleolimnological reconstruction, stressor-response modeling, and model inference approaches for nutrient criteria determination. The developments and challenges in the determination of nutrient criteria in lakes and reservoirs are presented. Reference lakes can reflect the original states of lakes, but reference sites are often unavailable. Using the paleolimnological reconstruction method, it is often difficult to reconstruct the historical nutrient conditions of shallow lakes, in which the sediments are easily disturbed. The model inference approach requires sufficient data to identify the appropriate equations and characterize a waterbody or group of waterbodies, which increases the difficulty of establishing nutrient criteria. The stressor-response model is a promising direction for nutrient criteria determination, and the mechanisms underlying stressor-response models should be studied further. Based on studies of the relationships among water ecological criteria, eutrophication, nutrient criteria and plankton, methods for determining nutrient criteria should be closely integrated with water management requirements. Copyright © 2017. Published by Elsevier B.V.
Xiao, Xun; Geyer, Veikko F.; Bowne-Anderson, Hugo; Howard, Jonathon; Sbalzarini, Ivo F.
2016-01-01
Biological filaments, such as actin filaments, microtubules, and cilia, are often imaged using different light-microscopy techniques. Reconstructing the filament curve from the acquired images constitutes the filament segmentation problem. Since filaments have lower dimensionality than the image itself, there is an inherent trade-off between tracing the filament with sub-pixel accuracy and avoiding noise artifacts. Here, we present a globally optimal filament segmentation method based on B-spline vector level-sets and a generalized linear model for the pixel intensity statistics. We show that the resulting optimization problem is convex and can hence be solved with global optimality. We introduce a simple and efficient algorithm to compute such optimal filament segmentations, and provide an open-source implementation as an ImageJ/Fiji plugin. We further derive an information-theoretic lower bound on the filament segmentation error, quantifying how well an algorithm could possibly do given the information in the image. We show that our algorithm asymptotically reaches this bound in the spline coefficients. We validate our method in comprehensive benchmarks, compare with other methods, and show applications from fluorescence, phase-contrast, and dark-field microscopy. PMID:27104582
Benner, Christian; Havulinna, Aki S; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ripatti, Samuli; Pirinen, Matti
2017-10-05
During the past few years, various novel statistical methods have been developed for fine-mapping with the use of summary statistics from genome-wide association studies (GWASs). Although these approaches require information about the linkage disequilibrium (LD) between variants, there has not been a comprehensive evaluation of how estimation of the LD structure from reference genotype panels performs in comparison with that from the original individual-level GWAS data. Using population genotype data from Finland and the UK Biobank, we show here that a reference panel of 1,000 individuals from the target population is adequate for a GWAS cohort of up to 10,000 individuals, whereas smaller panels, such as those from the 1000 Genomes Project, should be avoided. We also show, both theoretically and empirically, that the size of the reference panel needs to scale with the GWAS sample size; this has important consequences for the application of these methods in ongoing GWAS meta-analyses and large biobank studies. We conclude by providing software tools and by recommending practices for sharing LD information to more efficiently exploit summary statistics in genetics research. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Estimating the Diets of Animals Using Stable Isotopes and a Comprehensive Bayesian Mixing Model
Hopkins, John B.; Ferguson, Jake M.
2012-01-01
Using stable isotope mixing models (SIMMs) as a tool to investigate the foraging ecology of animals is gaining popularity among researchers. As a result, statistical methods are rapidly evolving and numerous models have been produced to estimate the diets of animals—each with its benefits and limitations. Deciding which SIMM to use is contingent on factors such as the consumer of interest, its food sources, sample size, the familiarity a user has with a particular framework for statistical analysis, or the level of inference the researcher desires to make (e.g., population- or individual-level). In this paper, we provide a review of commonly used SIMMs and describe a comprehensive SIMM, IsotopeR, that includes all features commonly used in SIMM analysis and two new features. We used data collected in Yosemite National Park to demonstrate IsotopeR's ability to estimate dietary parameters. We then examined the importance of each feature in the model and compared our results to inferences from commonly used SIMMs. IsotopeR's user interface (in R) will provide researchers with a user-friendly tool for SIMM analysis. The model is also applicable for use in paleontology, archaeology, and forensic studies, as well as for estimating pollution inputs. PMID:22235246
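The linear mass-balance idea at the core of SIMMs can be shown with a determined system: two isotopes plus the sum-to-one constraint identify three source proportions exactly. The sketch below is that point solve with hypothetical signatures; IsotopeR itself is a Bayesian model that additionally handles measurement error, discrimination factors, and individual effects.

```python
import numpy as np

# Rows: d13C, d15N, sum-to-one constraint; columns: sources A, B, C.
sources = np.array([[-26.0, -20.0, -12.0],   # d13C of each source (hypothetical)
                    [  4.0,   9.0,  14.0],   # d15N of each source (hypothetical)
                    [  1.0,   1.0,   1.0]])  # proportions must sum to 1
mixture = np.array([-19.0, 9.5, 1.0])        # consumer signature + constraint

proportions = np.linalg.solve(sources, mixture)
print(dict(zip("ABC", proportions.round(3))))   # {'A': 0.1, 'B': 0.7, 'C': 0.2}
```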
Walsh, Michele E.
2011-01-01
Objectives. We examined the impact of Arizona's May 2007 comprehensive statewide smoking ban on hospital admissions for diagnoses for which there is evidence of a causal relationship with secondhand smoke (SHS) exposure (acute myocardial infarction [AMI], angina, stroke, and asthma). Methods. We compared monthly hospital admissions from January 2004 through May 2008 for these primary diagnoses and 4 diagnoses not associated with SHS (appendicitis, kidney stones, acute cholecystitis, and ulcers) for Arizona counties with preexisting county or municipal smoking bans and counties with no previous bans. We attributed reductions in admissions to the statewide ban if they occurred only in diagnoses associated with SHS and if they were larger in counties with no previous bans. We analyzed the data with Poisson regressions, controlling for seasonality and admissions trends. We also estimated cost savings. Results. Statistically significant reductions in hospital admissions were seen for AMI, angina, stroke, and asthma in counties with no previous bans, beyond those seen in counties with previous bans. No ban-variable coefficients were statistically significant for diagnoses not associated with SHS. Conclusions. Arizona's statewide smoking ban decreased hospital admissions for AMI, stroke, asthma, and angina. PMID:20466955
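The modeling strategy described (Poisson regression with seasonality and trend controls) can be sketched on simulated monthly admissions. The effect size, month indexing, and ban date below are assumptions for illustration, not the study's estimates.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
months = np.arange(53)                      # Jan 2004 through May 2008
ban = (months >= 40).astype(int)            # ban in effect from May 2007 (index 40)
season = months % 12
lam = np.exp(4.0 + 0.002 * months + 0.1 * np.sin(2 * np.pi * season / 12) - 0.12 * ban)
df = pd.DataFrame({"admits": rng.poisson(lam), "t": months,
                   "month": season.astype(str), "ban": ban})

# Poisson GLM: ban indicator + linear trend + month-of-year dummies.
model = smf.glm("admits ~ ban + t + C(month)", data=df,
                family=sm.families.Poisson()).fit()
print("rate ratio for ban:", np.exp(model.params["ban"]).round(3))
```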
Chen, Xin-yu; Zhang, Chun-quan; Zhang, Hong; Liu, Wei; Wang, Jun-yan
2010-05-01
To investigate the use and trends of digestive system drugs in the Hangzhou area using the comprehensive statistics index (CSI). Using an analytical method based on total consumption and the CSI, the use of digestive system drugs from different manufacturers in Hangzhou from 2005 to 2007 was analyzed. Except for H2 receptor antagonists, the total consumption of digestive system drugs increased yearly; in terms of total consumption, the four leading classes were proton pump inhibitors, probiotics (microecological preparations), antiemetic drugs and gastroprokinetic agents. The Laspeyres indices of digestive system drugs increased to different extents: those of proton pump inhibitors, probiotics, antiemetic drugs and gastroprokinetic agents were 1.39650, 1.02042, 1.72890 and 1.14850 in 2006, and 2.08110, 1.21755, 2.22350 and 1.15660 in 2007, respectively. Through the CSI, the results showed the pattern of use and trends of digestive system drugs in Hangzhou. Factors such as the rationality, efficiency and costs of the drugs, as well as the etiology of the diseases, were also explored to some degree.
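The Laspeyres index referenced above weights current prices by base-period quantities; a minimal sketch with made-up prices and quantities (not the Hangzhou data):

```python
def laspeyres(p0, p1, q0):
    """Laspeyres price index: sum(p1*q0) / sum(p0*q0)."""
    return sum(p1_i * q_i for p1_i, q_i in zip(p1, q0)) / \
           sum(p0_i * q_i for p0_i, q_i in zip(p0, q0))

p0 = [10.0, 25.0, 8.0]    # base-year unit prices of three drugs (hypothetical)
p1 = [14.0, 26.0, 9.5]    # current-year unit prices
q0 = [1000, 400, 2500]    # base-year consumption quantities
print(round(laspeyres(p0, p1, q0), 4))   # 1.2038; > 1 means prices rose on average
```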
BTC method for evaluation of remaining strength and service life of bridge cables.
DOT National Transportation Integrated Search
2011-09-01
"This report presents the BTC method; a comprehensive state-of-the-art methodology for evaluation of remaining : strength and residual life of bridge cables. The BTC method is a probability-based, proprietary, patented, and peerreviewed : methodology...
Miyata, Hiromitsu; Minagawa-Kawai, Yasuyo; Watanabe, Shigeru; Sasaki, Toyofumi; Ueda, Kazuhiro
2012-01-01
Background A growing body of evidence suggests that meditative training enhances perception and cognition. In Japan, the Park-Sasaki method of speed-reading involves organized visual training while forming both a relaxed and concentrated state of mind, as in meditation. The present study examined relationships between reading speed, sentence comprehension, and eye movements while reading short Japanese novels. In addition to normal untrained readers, three middle-level trainees and one high-level expert on this method were included for the two case studies. Methodology/Principal Findings In Study 1, three of 17 participants were middle-level trainees on the speed-reading method. Immediately after reading each story once on a computer monitor, participants answered true or false questions regarding the content of the novel. Eye movements while reading were recorded using an eye-tracking system. Results revealed higher reading speed and lower comprehension scores in the trainees than in the untrained participants. Furthermore, eye-tracking data from untrained participants revealed multiple correlations between reading speed, accuracy and eye-movement measures, with faster readers showing shorter fixation durations and larger saccades in the horizontal (X) direction than slower readers. In Study 2, participants included a high-level expert and 14 untrained students. The expert showed higher reading speed and statistically comparable, although numerically lower, comprehension scores compared with the untrained participants. During test sessions this expert moved her eyes along a nearly straight horizontal line as a first pass, without moving her eyes over the whole sentence display as did the untrained students. Conclusions/Significance In addition to revealing correlations between speed, comprehension and eye movements in the reading of Japanese contemporary novels by untrained readers, we describe cases of speed-reading trainees regarding relationships between these variables. The trainees overall tended to show poor performance influenced by the speed-accuracy trade-off, although this trade-off may be reduced in the case of at least one high-level expert. PMID:22590519
Cappel, Daniel; Sherman, Woody; Beuming, Thijs
2017-01-01
The ability to accurately characterize the solvation properties (water locations and thermodynamics) of biomolecules is of great importance to drug discovery. While crystallography, NMR, and other experimental techniques can assist in determining the structure of water networks in proteins and protein-ligand complexes, most water molecules are not fully resolved and accurately placed. Furthermore, understanding the energetic effects of solvation and desolvation on binding requires an analysis of the thermodynamic properties of solvent involved in the interaction between ligands and proteins. WaterMap is a molecular dynamics-based computational method that uses statistical mechanics to describe the thermodynamic properties (entropy, enthalpy, and free energy) of water molecules at the surface of proteins. This method can be used to assess the solvent contributions to ligand binding affinity and to guide lead optimization. In this review, we provide a comprehensive summary of published uses of WaterMap, including applications to lead optimization, virtual screening, selectivity analysis, ligand pose prediction, and druggability assessment. Copyright © Bentham Science Publishers.
NASA Astrophysics Data System (ADS)
Glicksman, Martin E.; Smith, Richard N.; Marsh, Steven P.; Kuklinski, Robert
A key element of mushy zone modeling is the description of the microscopic evolution of the lengthscales within the mushy zone and the influence of macroscopic transport processes. This paper describes some recent progress in developing a mean-field statistical theory of phase coarsening in adiabatic mushy zones. The main theoretical predictions are temporal scaling laws indicating that the average lengthscale increases as t^(1/3), a self-similar distribution of mushy zone lengthscales based on spherical solid particle shapes, and kinetic rate constants which provide the dependences of the coarsening process on material parameters and the volume fraction of the solid phase. High-precision thermal decay experiments are described which verify aspects of the theory in pure-material mushy zones held under adiabatic conditions. The microscopic coarsening theory is then integrated within a macroscopic heat transfer model of one-dimensional alloy solidification, using the Double Integral Method. The method demonstrates an ability to predict the influence of macroscopic heat transfer on the evolution of primary and secondary dendrite arm spacings in Al-Cu alloys. Finally, some suggestions are made for future experimental and theoretical studies required in developing comprehensive solidification processing models.
Analyzing and interpreting genome data at the network level with ConsensusPathDB.
Herwig, Ralf; Hardt, Christopher; Lienhard, Matthias; Kamburov, Atanas
2016-10-01
ConsensusPathDB consists of a comprehensive collection of human (as well as mouse and yeast) molecular interaction data integrated from 32 different public repositories and a web interface featuring a set of computational methods and visualization tools to explore these data. This protocol describes the use of ConsensusPathDB (http://consensuspathdb.org) with respect to the functional and network-based characterization of biomolecules (genes, proteins and metabolites) that are submitted to the system either as a priority list or together with associated experimental data such as RNA-seq. The tool reports interaction network modules, biochemical pathways and functional information that are significantly enriched by the user's input, applying computational methods for statistical over-representation, enrichment and graph analysis. The results of this protocol can be observed within a few minutes, even with genome-wide data. The resulting network associations can be used to interpret high-throughput data mechanistically, to characterize and prioritize biomarkers, to integrate different omics levels, to design follow-up functional assay experiments and to generate topology for kinetic models at different scales.
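The over-representation statistic behind such enrichment reports is typically a hypergeometric tail probability; a minimal sketch with toy counts, which is not ConsensusPathDB's exact procedure:

```python
from scipy.stats import hypergeom

N = 20000   # genes in the background (assumed)
K = 150     # genes annotated to the pathway (assumed)
n = 300     # genes in the user's input list (assumed)
k = 12      # overlap between input list and pathway (assumed)

# P(X >= k) under sampling without replacement:
p = hypergeom.sf(k - 1, N, K, n)
print(f"enrichment p-value = {p:.3g}")
```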
Evaluation of variability in high-resolution protein structures by global distance scoring.
Anzai, Risa; Asami, Yoshiki; Inoue, Waka; Ueno, Hina; Yamada, Koya; Okada, Tetsuji
2018-01-01
Systematic analysis of the statistical and dynamical properties of proteins is critical to understanding cellular events. Extraction of biologically relevant information from a set of high-resolution structures is important because it can provide mechanistic details behind the functional properties of protein families, enabling rational comparison between families. Most current structural comparisons are pairwise-based, which hampers global analysis of the growing contents of the Protein Data Bank. Additionally, pairing of protein structures introduces uncertainty with respect to reproducibility because it frequently requires additional settings for superimposition. This study introduces intramolecular distance scoring for the global analysis of proteins, for each of which at least several high-resolution structures are available. As a pilot study, we tested 300 human proteins and showed that the method can be used to comprehensively overview structural variation in each protein and protein family at the atomic level. This method, together with the interpretation of the model calculations, provides new criteria for understanding specific structural variation in a protein, enabling global comparison of the variability in proteins from different species.
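A hedged sketch of the superposition-free idea: compare two structures of the same protein through their intramolecular (e.g., CA-CA) distance matrices and summarize the difference with a single score. The coordinates below are random stand-ins, and the paper's exact scoring may differ.

```python
import numpy as np

def distance_matrix(coords):
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

def distance_score(coords_a, coords_b):
    """Mean absolute difference of intramolecular distances (lower = more similar)."""
    da, db = distance_matrix(coords_a), distance_matrix(coords_b)
    iu = np.triu_indices_from(da, k=1)      # unique pairs only
    return np.abs(da[iu] - db[iu]).mean()

rng = np.random.default_rng(6)
xyz1 = rng.normal(size=(120, 3)) * 10                    # structure 1 (stand-in)
xyz2 = xyz1 + rng.normal(scale=0.3, size=xyz1.shape)     # structure 2, perturbed
print(f"score = {distance_score(xyz1, xyz2):.3f} A")     # no superposition needed
```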
NASA Astrophysics Data System (ADS)
Knopoff, Damián A.
2016-09-01
The recent review paper [4] constitutes a valuable contribution to the understanding, modeling and simulation of crowd dynamics in extreme situations. It provides a very comprehensive review of the complexity features of the system under consideration and of scaling, with a consequent justification of the methods used. In particular, macroscopic and microscopic models have so far been used to model crowd dynamics [9], and the authors appropriately explain that working at the mesoscale is a good choice to deal with the heterogeneous behaviour of walkers as well as with the difficulty of their deterministic identification. In this way, methods based on kinetic theory and statistical dynamics are employed, more precisely the so-called kinetic theory for active particles [7]. This approach has been applied successfully in the modeling of several complex dynamics, with recent applications to learning [2,8], which constitutes the key to understanding communication and is of great importance in social dynamics and the behavioral sciences.
NASA Astrophysics Data System (ADS)
Müller, M. F.; Thompson, S. E.
2016-02-01
The prediction of flow duration curves (FDCs) in ungauged basins remains an important task for hydrologists given the practical relevance of FDCs for water management and infrastructure design. Predicting FDCs in ungauged basins typically requires spatial interpolation of statistical or model parameters. This task is complicated if climate becomes non-stationary, as the prediction challenge now also requires extrapolation through time. In this context, process-based models for FDCs that mechanistically link the streamflow distribution to climate and landscape factors may have an advantage over purely statistical methods for predicting FDCs. This study compares a stochastic (process-based) and a statistical method for FDC prediction in both stationary and non-stationary contexts, using Nepal as a case study. Under contemporary conditions, both models perform well in predicting FDCs, with Nash-Sutcliffe coefficients above 0.80 in 75% of the tested catchments. The main drivers of uncertainty differ between the models: parameter interpolation was the main source of error for the statistical model, while violations of the assumptions of the process-based model represented the main source of its error. The process-based approach performed better than the statistical approach in numerical simulations with non-stationary climate drivers. The predictions of the statistical method under non-stationary rainfall conditions were poor if (i) local runoff coefficients were not accurately determined from the gauge network, or (ii) streamflow variability was strongly affected by changes in rainfall. A Monte Carlo analysis shows that the streamflow regimes in catchments characterized by frequent wet-season runoff and a rapid, strongly non-linear hydrologic response are particularly sensitive to changes in rainfall statistics. In these cases, process-based prediction approaches are favored over statistical models.
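An empirical FDC and the Nash-Sutcliffe score used above can be computed in a few lines. The sketch below uses synthetic streamflows and a Weibull plotting position, which are assumptions, not the study's setup.

```python
import numpy as np

def fdc(flows):
    """Return (exceedance probability, flows sorted high to low)."""
    q = np.sort(flows)[::-1]
    prob = np.arange(1, q.size + 1) / (q.size + 1)   # Weibull plotting position
    return prob, q

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the mean of obs."""
    return 1 - ((obs - sim) ** 2).sum() / ((obs - obs.mean()) ** 2).sum()

rng = np.random.default_rng(7)
observed = rng.lognormal(1.0, 0.9, 365)                 # daily flows (synthetic)
predicted = observed * rng.lognormal(0.0, 0.1, 365)     # a good "model"
_, q_obs = fdc(observed)
_, q_sim = fdc(predicted)
print("NSE of predicted FDC:", round(nse(q_obs, q_sim), 3))
```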
Schiffman, Eric L.; Truelove, Edmond L.; Ohrbach, Richard; Anderson, Gary C.; John, Mike T.; List, Thomas; Look, John O.
2011-01-01
AIMS The purpose of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD) Validation Project was to assess the diagnostic validity of this examination protocol. An overview is presented, including Axis I and II methodology and descriptive statistics for the study participant sample. This paper details the development of reliable methods to establish the reference standards for assessing criterion validity of the Axis I RDC/TMD diagnoses. Validity testing for the Axis II biobehavioral instruments was based on previously validated reference standards. METHODS The Axis I reference standards were based on the consensus of 2 criterion examiners independently performing a comprehensive history, clinical examination, and evaluation of imaging. Intersite reliability was assessed annually for criterion examiners and radiologists. Criterion exam reliability was also assessed within study sites. RESULTS Study participant demographics were comparable to those of participants in previous studies using the RDC/TMD. Diagnostic agreement of the criterion examiners with each other and with the consensus-based reference standards was excellent with all kappas ≥ 0.81, except for osteoarthrosis (moderate agreement, k = 0.53). Intrasite criterion exam agreement with reference standards was excellent (k ≥ 0.95). Intersite reliability of the radiologists for detecting computed tomography-disclosed osteoarthrosis and magnetic resonance imaging-disclosed disc displacement was good to excellent (k = 0.71 and 0.84, respectively). CONCLUSION The Validation Project study population was appropriate for assessing the reliability and validity of the RDC/TMD Axis I and II. The reference standards used to assess the validity of Axis I TMD were based on reliable and clinically credible methods. PMID:20213028
Misconceptions of the p-value among Chilean and Italian Academic Psychologists
Badenes-Ribera, Laura; Frias-Navarro, Dolores; Iotti, Bryan; Bonilla-Campos, Amparo; Longobardi, Claudio
2016-01-01
Common misconceptions of p-values are based on certain beliefs and attributions about the significance of results. Thus, they affect professionals' decisions and jeopardize the quality of interventions and the accumulation of valid scientific knowledge. We surveyed 164 academic psychologists (134 Italian, 30 Chilean) on this topic. Our findings are consistent with previous research and suggest that some participants do not know how to correctly interpret p-values. The inverse probability fallacy presents the greatest comprehension problems, followed by the replication fallacy. These results highlight the importance of the statistical re-education of researchers. Recommendations for improving statistical cognition are proposed. PMID:27602007
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tratnyek, Paul G.; Bylaska, Eric J.; Weber, Eric J.
2017-01-01
Quantitative structure–activity relationships (QSARs) have long been used in the environmental sciences. More recently, molecular modeling and chemoinformatic methods have become widespread. These methods have the potential to expand and accelerate advances in environmental chemistry because they complement observational and experimental data with “in silico” results and analysis. The opportunities and challenges that arise at the intersection between statistical and theoretical in silico methods are most apparent in the context of properties that determine the environmental fate and effects of chemical contaminants (degradation rate constants, partition coefficients, toxicities, etc.). The main example of this is the calibration of QSARs using descriptor variable data calculated from molecular modeling, which can make QSARs more useful for predicting property data that are unavailable, but also can make them more powerful tools for diagnosis of fate determining pathways and mechanisms. Emerging opportunities for “in silico environmental chemical science” are to move beyond the calculation of specific chemical properties using statistical models and toward more fully in silico models, prediction of transformation pathways and products, incorporation of environmental factors into model predictions, integration of databases and predictive models into more comprehensive and efficient tools for exposure assessment, and extending the applicability of all the above from chemicals to biologicals and materials.
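A minimal sketch of the QSAR calibration step described above, regressing a measured property on descriptors computed from molecular modeling; the descriptor and property values here are hypothetical placeholders, not from the authors' work:

```python
import numpy as np

# Hypothetical training set: rows are compounds, columns are descriptors
# computed from molecular modeling (e.g., an orbital energy and a logP-type term)
X = np.array([[-9.1, 2.3], [-8.7, 1.1], [-9.5, 3.0], [-8.2, 0.4], [-9.0, 2.0]])
y = np.array([-2.1, -1.4, -2.8, -0.9, -1.9])  # e.g., log degradation rate constants

# Ordinary least squares with an intercept column
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(descriptors):
    return coef[0] + descriptors @ coef[1:]

print(predict(np.array([-8.9, 1.8])))  # estimate for an untested compound
```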
Lack of Comprehension of Common Prostate Cancer Terms in an Underserved Population
Kilbridge, Kerry L.; Fraser, Gertrude; Krahn, Murray; Nelson, Elizabeth M.; Conaway, Mark; Bashore, Randall; Wolf, Andrew; Barry, Michael J.; Gong, Debra A.; Nease, Robert F.; Connors, Alfred F.
2009-01-01
Purpose To assess the comprehension of common medical terms used in prostate cancer in patient education materials to obtain informed consent, and to measure outcomes after prostate cancer treatment. We address this issue among underserved, African-American men because of the increased cancer incidence and mortality observed in this population. Patients and Methods We reviewed patient education materials and prostate-specific quality-of-life instruments to identify technical terms describing sexual, urinary, and bowel function. Understanding of these terms was assessed in face-to-face interviews with 105 men, mostly African American, age ≥ 40, from two low-income clinics. Comprehension was evaluated using semiqualitative methods coded by two independent investigators. Demographics were collected and literacy was measured. Results Fewer than 50% of patients understood the terms “erection” or “impotent.” Only 5% of patients understood the term “incontinence” and 25% understood the term “bowel habits.” More patients recognized word roots than related terms or compound words (eg, “rectum” v “rectal urgency,” “intercourse” v “vaginal intercourse”). Comprehension of terms from all domains was statistically significantly correlated with reading level (P < .001). Median literacy level was fourth to sixth grade. Prostate cancer knowledge was poor. Many patients had difficulty locating key anatomic structures. Conclusion Limited comprehension of prostate cancer terms and low literacy create barriers to obtaining informed consent for treatment and to measuring prostate cancer outcomes accurately in our study population. In addition, the level of prostate cancer knowledge was poor. These results highlight the need for prostate cancer education efforts and outcomes measurements that consider literacy and use nonmedical language. PMID:19307512
Major, Nicole; McQuistan, Michelle R; Qian, Fang
2014-08-01
The purpose of this study was to assess which components of a community-based dental education (CBDE) program at The University of Iowa College of Dentistry & Dental Clinics were associated with overall student performance. This retrospective study analyzed data for 444 fourth-year students who graduated from 2006 through 2011. Information pertaining to students' CBDE rotations and their final grades from the comprehensive clinic (in two areas: Production and Competence) was used for statistical analysis. Bivariate analyses indicated that students who completed CBDE in the fall were more likely to receive an A or B in Production compared to students who completed CBDE in the spring. However, students who completed CBDE in the beginning or end of the academic year were more likely to receive an A or B in Competence compared to those who completed CBDE in the middle of the year. Students who treated a variety of patient types during CBDE experiences (comprehensive and emergency care vs. mainly comprehensive care) were more likely to receive better grades in Production, while CBDE clinic type was not associated with grades. Dental schools should consider how CBDE may impact students' performance in their institutional clinics when developing and evaluating CBDE programs.
Swanson, David M; Blacker, Deborah; Alchawa, Taofik; Ludwig, Kerstin U; Mangold, Elisabeth; Lange, Christoph
2013-11-07
Background The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, which is potentially more useful knowledge from a biological perspective, and those methods currently implemented are often permutation-based. Results One property of some permutation-based tests is that their power varies as a function of whether significant markers are in regions of linkage disequilibrium (LD) or not, which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and outcome, both of whose power does not vary as a function of LD structure. One method uses dimension reduction to “filter” redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only their Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. We evaluate the versatility of the modification of the summary-statistic test since the specification of correlation structure between markers can be inaccurate. Conclusion We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach and a borderline significant association using the summary-statistic based approach. We also implement the summary-statistic test using Z-statistics from an already-published GWAS of Chronic Obstructive Pulmonary Disease (COPD) and correlation structure obtained from HapMap. We experiment with the modification of this test because the correlation structure is assumed imperfectly known. PMID:24199751
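The idea of scaling marker Z-statistics by their correlation (LD) matrix can be sketched as a quadratic-form region test; this is a simplified illustration of the general approach, not the authors' exact procedure:

```python
import numpy as np
from scipy import stats

def region_test(z, R):
    """Region-level p-value from marker Z-statistics z and their
    correlation (LD) matrix R. Under H0, z ~ N(0, R), so the quadratic
    form z' R^{-1} z follows a chi-square with len(z) degrees of freedom."""
    R = R + 1e-8 * np.eye(len(z))   # small ridge for numerical stability
    q = z @ np.linalg.solve(R, z)
    return stats.chi2.sf(q, df=len(z))

# Illustrative inputs: three markers in moderate LD
z = np.array([2.4, 2.1, 0.3])
R = np.array([[1.0, 0.7, 0.2],
              [0.7, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
print(region_test(z, R))
```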
Ford, M E; Kallen, M; Richardson, P; Matthiesen, E; Cox, V; Teng, E J; Cook, K F; Petersen, N J
2008-01-01
To evaluate the effects of social support on comprehension and recall of consent form information in a study of Parkinson disease patients and their caregivers. Comparison of comprehension and recall outcomes among participants who read and signed the consent form accompanied by a family member/friend versus those of participants who read and signed the consent form unaccompanied. Comprehension and recall of consent form information were measured at one week and one month respectively, using Part A of the Quality of Informed Consent Questionnaire (QuIC). The mean age of the sample of 143 participants was 71 years (SD = 8.6 years). Analysis of covariance was used to compare QuIC scores between the intervention group (n = 70) and control group (n = 73). In the 1-week model, no statistically significant intervention effect was found (p = 0.860). However, the intervention status by patient status interaction was statistically significant (p = 0.012). In the 1-month model, no statistically significant intervention effect was found (p = 0.480). Again, however, the intervention status by patient status interaction was statistically significant (p = 0.040). At both time periods, intervention group patients scored higher (better) on the QuIC than did intervention group caregivers, and control group patients scored lower (worse) on the QuIC than did control group caregivers. Social support played a significant role in enhancing comprehension and recall of consent form information among patients.
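For reference, an ANCOVA with an intervention-by-patient-status interaction of the kind reported above takes only a few lines; the data frame and column names here are hypothetical:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: QuIC score, baseline score, accompanied (1) vs. not (0),
# and patient (1) vs. caregiver (0)
df = pd.DataFrame({
    "quic":         [78, 84, 71, 90, 66, 88, 75, 81, 73, 86, 69, 92],
    "baseline":     [70, 80, 68, 85, 60, 82, 72, 78, 66, 81, 64, 88],
    "intervention": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    "is_patient":   [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0],
})

# ANCOVA: covariate plus the intervention-by-patient-status interaction
model = smf.ols("quic ~ baseline + intervention * is_patient", data=df).fit()
print(model.params)  # the interaction term carries the reported effect
```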
Spatial Ensemble Postprocessing of Precipitation Forecasts Using High Resolution Analyses
NASA Astrophysics Data System (ADS)
Lang, Moritz N.; Schicker, Irene; Kann, Alexander; Wang, Yong
2017-04-01
Ensemble prediction systems are designed to account for errors or uncertainties in the initial and boundary conditions, imperfect parameterizations, etc. However, due to sampling errors and underestimation of the model errors, these ensemble forecasts tend to be underdispersive and to lack both reliability and sharpness. To overcome such limitations, statistical postprocessing methods are commonly applied to these forecasts. In this study, a full-distributional spatial postprocessing method is applied to short-range precipitation forecasts over Austria using Standardized Anomaly Model Output Statistics (SAMOS). Following Stauffer et al. (2016), observation and forecast fields are transformed into standardized anomalies by subtracting a site-specific climatological mean and dividing by the climatological standard deviation. Because only a single regression model needs to be fitted for the whole domain, the SAMOS framework provides a computationally inexpensive way to create operationally calibrated probabilistic forecasts for any arbitrary location or for all grid points in the domain simultaneously. Taking advantage of the INCA system (Integrated Nowcasting through Comprehensive Analysis), high-resolution analyses are used for the computation of the observed climatology and for model training. The INCA system operationally combines station measurements and remote sensing data into real-time objective analysis fields at 1 km horizontal resolution and 1 h temporal resolution. The precipitation forecast used in this study is obtained from a limited-area-model ensemble prediction system also operated by ZAMG. This system, ALADIN-LAEF, applies a multi-physics approach to provide a 17-member forecast at a horizontal resolution of 10.9 km and a temporal resolution of 1 h. The SAMOS approach thus statistically combines the in-house high-resolution analysis and ensemble prediction systems. The station-based validation of 6-hour precipitation sums shows a mean improvement of more than 40% in CRPS when compared to bilinearly interpolated uncalibrated ensemble forecasts. The validation on randomly selected grid points, representing the true height distribution over Austria, still indicates a mean improvement of 35%. The applied statistical model is currently set up for 6-hourly and daily accumulation periods, but will be extended to a temporal resolution of 1-3 hours within a new probabilistic nowcasting system operated by ZAMG.
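The anomaly transformation at the core of SAMOS is simple to state; a minimal sketch (grid values and climatologies are illustrative):

```python
import numpy as np

def to_standardized_anomaly(field, clim_mean, clim_sd):
    """Transform a forecast or analysis field into standardized anomalies
    by removing the site-specific climatological mean and scaling by the
    climatological standard deviation."""
    return (field - clim_mean) / clim_sd

# Illustrative grid-point precipitation values (mm / 6 h)
forecast  = np.array([4.0, 0.5, 12.0])
clim_mean = np.array([3.2, 1.0,  8.5])
clim_sd   = np.array([2.1, 0.8,  5.0])

anomalies = to_standardized_anomaly(forecast, clim_mean, clim_sd)
# A single regression model can now be fit across the whole domain
print(anomalies)
```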
NASA Technical Reports Server (NTRS)
Bonavito, N. L.; Gordon, C. L.; Inguva, R.; Serafino, G. N.; Barnes, R. A.
1994-01-01
NASA's Mission to Planet Earth (MTPE) will address important interdisciplinary and environmental issues such as global warming, ozone depletion, deforestation, acid rain, and the like with its long-term satellite observations of the Earth and with its comprehensive Data and Information System. Extensive sets of satellite observations supporting MTPE will be provided by the Earth Observing System (EOS), while more specific process-related observations will be provided by smaller Earth Probes. MTPE will use data from ground and airborne scientific investigations to supplement and validate the global observations obtained from satellite imagery, while the EOS satellites will support interdisciplinary research and model development. This is important for understanding the processes that control the global environment and for improving the prediction of events. In this paper we illustrate the potential of powerful artificial intelligence (AI) techniques when used in the analysis of the formidable problems that exist in the NASA Earth Science programs and of those to be encountered in the future MTPE and EOS programs. These techniques, based on the logical and probabilistic reasoning aspects of plausible inference, strongly emphasize the synergistic relation between data and information. As such, they are ideally suited for the analysis of the massive data streams to be provided by both MTPE and EOS. To demonstrate this, we address both the satellite imagery and model enhancement issues for the problem of ozone profile retrieval through a method based on plausible scientific inference. Since in the retrieval problem the atmospheric ozone profile that is consistent with a given set of measured radiances may not be unique, an optimum statistical method is used to estimate a 'best' profile solution from the radiances and from additional a priori information.
An adaptive state of charge estimation approach for lithium-ion series-connected battery system
NASA Astrophysics Data System (ADS)
Peng, Simin; Zhu, Xuelai; Xing, Yinjiao; Shi, Hongbing; Cai, Xu; Pecht, Michael
2018-07-01
Due to incorrect or unknown noise statistics of a battery system and its cell-to-cell variations, state of charge (SOC) estimation of a lithium-ion series-connected battery system is usually inaccurate or even divergent using model-based methods, such as the extended Kalman filter (EKF) and unscented Kalman filter (UKF). To resolve this problem, an adaptive unscented Kalman filter (AUKF) based on a noise statistics estimator and a model parameter regulator is developed to accurately estimate the SOC of a series-connected battery system. An equivalent circuit model is first built based on the model parameter regulator that illustrates the influence of cell-to-cell variation on the battery system. A noise statistics estimator is then used to adaptively estimate the noise statistics for the AUKF when its prior noise statistics are inaccurate or not exactly Gaussian. The accuracy and effectiveness of the SOC estimation method are validated by comparing the developed AUKF with the UKF when the model and measurement noise statistics, respectively, are inaccurate. Compared with the UKF and EKF, the developed method shows the highest SOC estimation accuracy.
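The noise-statistics estimation idea can be illustrated with innovation-based covariance matching in a scalar, linear Kalman filter; this is a simplified stand-in for the paper's AUKF, which applies the same principle within an unscented filter:

```python
import numpy as np

rng = np.random.default_rng(1)
true_R = 0.5                  # true (unknown) measurement-noise variance
x, P, Q, R_hat = 0.0, 1.0, 0.01, 1.0
window = []

for k in range(500):
    z = rng.normal(0.0, np.sqrt(true_R))   # measurement of a constant state
    # Predict
    P = P + Q
    # Innovation and its running sample variance
    innov = z - x
    window.append(innov ** 2)
    if len(window) > 50:
        window.pop(0)
    # Covariance matching: Var(innov) = P + R, so R ~ mean(innov^2) - P
    R_hat = max(np.mean(window) - P, 1e-6)
    # Update with the adapted measurement-noise variance
    K = P / (P + R_hat)
    x = x + K * innov
    P = (1.0 - K) * P

print(R_hat)  # converges toward true_R = 0.5
```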
Application of competency-based education in laparoscopic training.
Xue, Dongbo; Bo, Hong; Zhang, Weihui; Zhao, Song; Meng, Xianzhi; Zhang, Donghua
2015-01-01
To introduce competency-based education and the "developing a curriculum" approach into the training of postgraduate students in laparoscopic surgery. This study selected postgraduate students before the implementation of competency-based education (n = 16) or after its implementation (n = 17). On the basis of the 5 competencies of patient care, medical knowledge, practice-based learning and improvement, interpersonal and communication skills, and professionalism, the research team created a curriculum chart and specific improvement measures that were implemented in the competency-based education group. Assessment of the 5 comprehensive competencies using the 360° assessment method, based on this curriculum chart, indicated that the competencies of the competency-based education group were significantly improved compared with those of the traditional group (P < .05). The improvement in the comprehensive assessment was also significant compared with the traditional group (P < .05). The implementation of competency-based education and curriculum-development-based teaching helps to improve the comprehensive competencies of postgraduate students and enables them to become qualified clinicians equipped to meet society's needs.
Comprehensive studies of the dynamics of geosystems with the use of remote sensing techniques
NASA Astrophysics Data System (ADS)
Vasilev, L. N.; Kaczyński, R.; Ney, B. I.
The described research programme for comprehensive studies of changes occurring within geosystems is a part of the scientific activity of INTERKOSMOS and will be executed mainly with the use of remote sensing methods and techniques. The main aim of the programme is to gain insight into the seasonal rhythm of environmental changes at both the regional and global level. The work will consist of gathering systematized information concerning quantitative and qualitative relations between various components of the environment. The application of remote sensing methods enables the acquisition of such environmental data in a dynamic setting. Research will be conducted for areas comprising distinct geosystems and will lead to the detection of diurnal, seasonal and yearly dynamics of geosystems as well as long-term trends. Besides these cognitive aims, the programme will also serve a methodological purpose. The first aim will be realized with respect to individual geosystems; the resulting sets of data will consist of matrices of statistical data characterizing relations between various components of geosystems. The methodological aim will be achieved through the practical verification of the preliminary assumptions. Information will be collected from different data acquisition levels, namely from satellite and aerial platforms and through ground measurements. Different types of data, such as multispectral photography (SALYUT, KOSMOS), multispectral scanner images (LANDSAT THEMATIC MAPPER, SPOT), infrared photography, radar imagery and spectrometric measurements, will be gathered during simultaneous data acquisition projects. All types of observations will be timed in accordance with the natural rhythm of the observed phenomena. The paper contains a description of geosystems under anthropogenic stress based on the previous research of the authors. The multifactor characterization of soil and crops presented here is part of completed studies on agricultural geosystems. The results of comprehensive remote sensing experiments already completed within the framework of the INTERKOSMOS programme on test sites in member countries fully support the approved programme for studying the dynamics of geosystems with the use of remote sensing.
Missing value imputation for microarray data: a comprehensive comparison study and a web tool.
Chiu, Chia-Chun; Chan, Shih-Yao; Wang, Chung-Ching; Wu, Wei-Sheng
2013-01-01
Microarray data are usually peppered with missing values arising from various causes. However, most downstream analyses for microarray data require complete datasets. Therefore, accurate algorithms for missing value estimation are needed for improving the performance of microarray data analyses. Although many algorithms have been developed, there are many debates on the selection of the optimal algorithm. Existing comparisons of these algorithms remain far from comprehensive, especially in the number of benchmark datasets used, the number of algorithms compared, the rounds of simulation conducted, and the performance measures used. In this paper, we performed a comprehensive comparison by using (I) thirteen datasets, (II) nine algorithms, (III) 110 independent runs of simulation, and (IV) three types of measures to evaluate the performance of each imputation algorithm fairly. First, the effects of different types of microarray datasets on the performance of each imputation algorithm were evaluated. Second, we discussed whether datasets from different species have a different impact on the performance of different algorithms. To assess the performance of each algorithm fairly, all evaluations were performed using three types of measures. Our results indicate that the performance of an imputation algorithm mainly depends on the type of dataset but not on the species from which the samples come. In addition to the statistical measure, two other measures with biological meanings are useful for reflecting the impact of missing value imputation on downstream data analyses. Our study suggests that local-least-squares-based methods are good choices for handling missing values in most microarray datasets. In this work, we carried out a comprehensive comparison of the algorithms for microarray missing value imputation. Based on such a comprehensive comparison, researchers can easily choose the optimal algorithm for their datasets. Moreover, new imputation algorithms can be compared with the existing algorithms using this comparison strategy as a standard protocol. In addition, to assist researchers in dealing with missing values easily, we built a web-based and easy-to-use imputation tool, MissVIA (http://cosbi.ee.ncku.edu.tw/MissVIA), which supports many imputation algorithms. Once users upload a real microarray dataset and choose the imputation algorithms, MissVIA will determine the optimal algorithm for the users' data through a series of simulations, and the imputed results can then be downloaded for downstream data analyses.
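As a flavor of the local, similarity-based imputation family that the study favors, a minimal nearest-rows sketch (a simplified stand-in, not the local-least-squares algorithm itself):

```python
import numpy as np

def knn_impute(matrix, k=2):
    """Fill each missing entry with the average of that column over the
    k rows most similar to the target row (Euclidean distance on the
    commonly observed columns)."""
    data = matrix.copy()
    for i, row in enumerate(data):
        miss = np.isnan(row)
        if not miss.any():
            continue
        dists = []
        for j, other in enumerate(data):
            # Skip the row itself and rows missing in the needed columns
            if j == i or np.isnan(other[miss]).any():
                continue
            shared = ~miss & ~np.isnan(other)
            if shared.any():
                dists.append((np.linalg.norm((row - other)[shared]), j))
        neighbors = [j for _, j in sorted(dists)[:k]]
        data[i, miss] = np.nanmean(data[neighbors][:, miss], axis=0)
    return data

# Illustrative expression matrix: genes in rows, samples in columns
expr = np.array([[1.0, 2.0, np.nan],
                 [0.9, 2.1, 3.0],
                 [1.1, 1.9, 2.8],
                 [5.0, 5.2, 5.1]])
print(knn_impute(expr))  # the missing entry is filled from the two similar rows
```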
Chi-Square Statistics, Tests of Hypothesis and Technology.
ERIC Educational Resources Information Center
Rochowicz, John A.
The use of technology such as computers and programmable calculators enables students to find p-values and conduct tests of hypotheses in many different ways. Comprehension and interpretation of a research problem become the focus for statistical analysis. This paper describes how to calculate chi-square statistics and p-values for statistical…
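In the spirit of the paper, the calculation it describes reduces to a single library call in modern tooling; a sketch with illustrative counts:

```python
from scipy.stats import chi2_contingency

# Illustrative 2x2 contingency table of observed counts
observed = [[30, 10],
            [20, 40]]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.4f}")
```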
Sharif, Mienah Z.; Rizzo, Shemra; Prelip, Michael L; Glik, Deborah C; Belin, Thomas R; Langellier, Brent A; Kuo, Alice A.; Garza, Jeremiah R; Ortega, Alexander N
2014-01-01
Background The Nutrition Facts label can facilitate healthy dietary practices. There is a dearth of research on Latinos’ utilization and comprehension of the Nutrition Facts label. Objective To measure Nutrition Facts label use and comprehension and to identify their correlates among Latinos in East Los Angeles. Design Cross-sectional interviewer-administered survey using computer-assisted personal interview (CAPI) software, conducted in either English or Spanish in the participant’s home. Participants/Setting Eligibility criteria were: living in a household within the block clusters identified, being age 18 or over, speaking English or Spanish, identifying as Latino and as the household’s main food purchaser and preparer. Analyses were based on 269 eligible respondents. Statistical analyses performed Chi-square test and multivariate logistic regression analysis assessed the association between the main outcomes and demographics. Multiple imputation addressed missing data. Results Sixty percent reported using the label; only 13% showed adequate comprehension of the label. Utilization was associated with being female, speaking Spanish and being below the poverty line. Comprehension was associated with younger age, not being married, and higher education. Utilization was not associated with comprehension. Conclusions Latinos who are using the Nutrition Facts label are not correctly interpreting the available information. Targeted education is needed to improve Nutrition Facts label use and comprehension, to directly improve diet, particularly among males, older Latinos, and those with less than a high school education. PMID:24974172
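The reported analysis pairs chi-square tests with multivariate logistic regression; a minimal sketch of the regression step on simulated data (the variable names and coefficients are assumptions for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 200
df = pd.DataFrame({
    "female":        rng.integers(0, 2, n),
    "spanish":       rng.integers(0, 2, n),
    "below_poverty": rng.integers(0, 2, n),
})
# Simulated outcome loosely following the direction of the reported associations
logit_p = -0.5 + 0.8 * df.female + 0.7 * df.spanish + 0.6 * df.below_poverty
df["uses_label"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

model = smf.logit("uses_label ~ female + spanish + below_poverty", data=df).fit()
print(np.exp(model.params))  # odds ratios for each correlate
```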
Enviroplan—a summary methodology for comprehensive environmental planning and design
Robert Allen Jr.; George Nez; Fred Nicholson; Larry Sutphin
1979-01-01
This paper will discuss a comprehensive environmental assessment methodology that includes a numerical method for visual management and analysis. This methodology employs resource and human activity units as a means to produce a visual form unit which is the fundamental unit of the perceptual environment. The resource unit is based on the ecosystem as the fundamental...
Research Review: Reading Comprehension in Developmental Disorders of Language and Communication
ERIC Educational Resources Information Center
Ricketts, Jessie
2011-01-01
Background: Deficits in reading comprehension have been reported in developmental disorders of language and communication, including specific language impairment (SLI), Down syndrome (DS) and autism spectrum disorders (ASD). Methods: In this review (based on a search of the ISI Web of Knowledge database to 2011), the Simple View of Reading is used as a framework for considering reading comprehension in these groups. Conclusions: There is substantial evidence for…
Effects of Comprehensive, Multiple High-Risk Behaviors Prevention Program on High School Students
ERIC Educational Resources Information Center
Collier, Crystal
2013-01-01
The purpose of this mixed methods study was to examine the effect of a multiple high-risk behaviors prevention program applied comprehensively throughout an entire school system, involving universal, selective, and indicated levels of students at a local private high school during a 4-year period. The prevention program was created based upon the…
2013-01-01
Background Population studies on end-of-life decisions have not been conducted in Cyprus. Our study aim was to evaluate the beliefs and attitudes of Greek Cypriots towards end-of-life issues regarding euthanasia and cremation. Methods A population-based telephone survey was conducted in Cyprus. One thousand randomly selected individuals from the population of Cyprus aged 20 years or older were invited to participate. Beliefs and attitudes on end-of-life decisions were collected using an anonymous and validated questionnaire. Statistical analyses included cross-tabulations, Pearson’s chi-square tests and multivariable-adjusted logistic regression models. Results A total of 308 males and 689 females participated in the survey. About 70% of the respondents did not support euthanasia for people with incurable illness and/or elders with dementia when requested by them, and 77% did not support euthanasia for these groups when requested by relatives. Regarding cremation, 78% were against and only 14% reported being in favor. Further statistical analyses showed that male gender, being single and having reached a higher educational level were positively associated with support for euthanasia in a statistically significant fashion. On the contrary, the more religiosity expressed by study participants, the less support they reported for euthanasia or cremation. Conclusions The vast majority of Greek Cypriots do not support euthanasia for people with incurable illness and/or elders with dementia, nor do they support cremation. Certain demographic characteristics such as age and education have a positive influence on attitudes towards euthanasia and cremation, while religiosity exerts a strong negative influence on both. Family bonding as well as social and cultural traditions may also play a role, although these were not comprehensively evaluated in the current study. PMID:24060291
Straight Talk About Birth Control: A Contraceptive Education Protocol for Home Care.
Schoenberg, Leslie
Home healthcare providers play a critical role in the prevention of unintended pregnancies by providing evidence-based contraception education during home visits. This article describes an innovative and comprehensive contraception protocol that was developed for Nurse-Family Partnership to improve contraception education for home healthcare patients. The protocol focused on increasing uptake of long-acting reversible contraception (LARC) for high-risk prenatal and postpartum home healthcare patients. The protocol was designed to reduce early subsequent pregnancies and thereby improve outcomes for mothers and their infants. An evidence-based translation project was designed and piloted in three California counties. The protocol consisted of a contraception education module for nurses and a patient education toolkit. The toolkit included an interactive patient education workbook emphasizing LARC methods for nurses to complete with their patients along with other teaching tools. The project was evaluated using pre- and posttest surveys that measured changes in nurses' knowledge, attitudes, and practice before, after, and 2 months after implementation. Outcomes revealed the following statistically significant results: (a) nurses' knowledge doubled at the first posttest and persisted at 2 months, (b) nurses' attitudes improved on two of the three measures, and (c) there was a 17.7% increase in the frequency of LARC birth control education 2 months after implementation. An evidence-based contraception protocol can promote acceptance of LARC methods and improve home healthcare clinician comfort with and frequency of birth control education.
Inferring Admixture Histories of Human Populations Using Linkage Disequilibrium
Loh, Po-Ru; Lipson, Mark; Patterson, Nick; Moorjani, Priya; Pickrell, Joseph K.; Reich, David; Berger, Bonnie
2013-01-01
Long-range migrations and the resulting admixtures between populations have been important forces shaping human genetic diversity. Most existing methods for detecting and reconstructing historical admixture events are based on allele frequency divergences or patterns of ancestry segments in chromosomes of admixed individuals. An emerging new approach harnesses the exponential decay of admixture-induced linkage disequilibrium (LD) as a function of genetic distance. Here, we comprehensively develop LD-based inference into a versatile tool for investigating admixture. We present a new weighted LD statistic that can be used to infer mixture proportions as well as dates with fewer constraints on reference populations than previous methods. We define an LD-based three-population test for admixture and identify scenarios in which it can detect admixture events that previous formal tests cannot. We further show that we can uncover phylogenetic relationships among populations by comparing weighted LD curves obtained using a suite of references. Finally, we describe several improvements to the computation and fitting of weighted LD curves that greatly increase the robustness and speed of the calculations. We implement all of these advances in a software package, ALDER, which we validate in simulations and apply to test for admixture among all populations from the Human Genome Diversity Project (HGDP), highlighting insights into the admixture history of Central African Pygmies, Sardinians, and Japanese. PMID:23410830
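The dating signal that ALDER exploits is an exponential decay of weighted LD with genetic distance; a minimal curve-fitting sketch on synthetic data (not HGDP output):

```python
import numpy as np
from scipy.optimize import curve_fit

def admixture_ld(d_morgans, amplitude, n_generations, affine):
    """Expected weighted LD at genetic distance d for an admixture
    event n generations ago: A * exp(-n * d) + c."""
    return amplitude * np.exp(-n_generations * d_morgans) + affine

# Synthetic decay curve for an admixture event ~10 generations ago
d = np.linspace(0.005, 0.5, 60)   # genetic distance in Morgans
rng = np.random.default_rng(2)
y = 0.02 * np.exp(-10.0 * d) + 1e-4 + rng.normal(0, 5e-5, d.size)

popt, _ = curve_fit(admixture_ld, d, y, p0=[0.01, 5.0, 0.0])
print(f"estimated admixture date: {popt[1]:.1f} generations ago")
```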
Effectiveness of Project Based Learning in Statistics for Lower Secondary Schools
ERIC Educational Resources Information Center
Siswono, Tatag Yuli Eko; Hartono, Sugi; Kohar, Ahmad Wachidul
2018-01-01
Purpose: This study aimed at investigating the effectiveness of implementing Project Based Learning (PBL) on the topic of statistics at a lower secondary school in Surabaya city, Indonesia, indicated by examining student learning outcomes, student responses, and student activity. Research Methods: A quasi-experimental method was conducted over two…
Consistency of extreme flood estimation approaches
NASA Astrophysics Data System (ADS)
Felder, Guido; Paquet, Emmanuel; Penot, David; Zischg, Andreas; Weingartner, Rolf
2017-04-01
Estimates of low-probability flood events are frequently used for the planning of infrastructure as well as for determining the dimensions of flood protection measures. There are several well-established methodical procedures to estimate low-probability floods. However, a global assessment of the consistency of these methods is difficult to achieve because the "true value" of an extreme flood is not observable. Nevertheless, a detailed comparison performed on a given case study yields useful information about the statistical and hydrological processes involved in the different methods. In this study, the following three different approaches for estimating low-probability floods are compared: a purely statistical approach (ordinary extreme value statistics), a statistical approach based on stochastic rainfall-runoff simulation (SCHADEX method), and a deterministic approach (physically based PMF estimation). These methods are tested on two different Swiss catchments. The results and some intermediate variables are used for assessing the potential strengths and weaknesses of each method, as well as for evaluating the consistency of these methods.
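For the purely statistical branch of the comparison, ordinary extreme value analysis can be sketched by fitting a GEV distribution to annual maxima and reading off a low-probability quantile; the data below are synthetic, and the study's exact distributional choices are not specified here:

```python
from scipy.stats import genextreme

# Synthetic annual maximum discharges (m^3/s) standing in for a gauge record
annual_maxima = genextreme.rvs(c=-0.1, loc=300, scale=80, size=60, random_state=3)

# Fit the GEV and estimate the 1000-year (p = 0.001 exceedance) flood
shape, loc, scale = genextreme.fit(annual_maxima)
q1000 = genextreme.ppf(1 - 1e-3, shape, loc=loc, scale=scale)
print(f"estimated 1000-year flood: {q1000:.0f} m^3/s")
```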
Seok, Junhee; Kaushal, Amit; Davis, Ronald W; Xiao, Wenzhong
2010-01-18
Background The large amount of high-throughput genomic data has facilitated the discovery of the regulatory relationships between transcription factors and their target genes. While early methods for discovery of transcriptional regulation relationships from microarray data often focused on the high-throughput experimental data alone, more recent approaches have explored the integration of external knowledge bases of gene interactions. Results In this work, we develop an algorithm that provides improved performance in the prediction of transcriptional regulatory relationships by supplementing the analysis of microarray data with a new method of integrating information from an existing knowledge base. Using a well-known dataset of yeast microarrays and the Yeast Proteome Database, a comprehensive collection of known information of yeast genes, we show that knowledge-based predictions demonstrate better sensitivity and specificity in inferring new transcriptional interactions than predictions from microarray data alone. We also show that comprehensive, direct and high-quality knowledge bases provide better prediction performance. Comparison of our results with ChIP-chip data and growth fitness data suggests that our predicted genome-wide regulatory pairs in yeast are reasonable candidates for follow-up biological verification. Conclusion High quality, comprehensive, and direct knowledge bases, when combined with appropriate bioinformatic algorithms, can significantly improve the discovery of gene regulatory relationships from high throughput gene expression data. PMID:20122245
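The sensitivity and specificity quoted above reduce to confusion-matrix arithmetic over predicted versus verified regulatory pairs; a sketch with hypothetical (TF, target) pairs:

```python
# Hypothetical sets of (TF, target) regulatory pairs
predicted = {("GAL4", "GAL1"), ("GAL4", "GAL10"), ("MSN2", "CTT1"), ("ROX1", "ANB1")}
known_true = {("GAL4", "GAL1"), ("GAL4", "GAL10"), ("MSN2", "CTT1"), ("HAP1", "CYC1")}
all_candidates = predicted | known_true | {("STE12", "FUS1"), ("YAP1", "TRX2")}

tp = len(predicted & known_true)               # correctly predicted pairs
fn = len(known_true - predicted)               # missed true pairs
fp = len(predicted - known_true)               # false predictions
tn = len(all_candidates - predicted - known_true)

print(f"sensitivity = {tp / (tp + fn):.2f}, specificity = {tn / (tn + fp):.2f}")
```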
Murray, G F; Jones, D R; Stritter, F T
1995-10-01
The Comprehensive Thoracic Surgery Curriculum was developed to provide program directors with a basis for planning instruction and evaluating residents, program practices, and outcomes. A survey design was selected to obtain opinions about the curriculum from a large group of people, ie, all program directors and all active residents. Two parallel instruments were developed: one to be completed by program directors and one to be completed by active residents. Responses were collated for directors and residents, entered into a computerized database, and compared using the chi-square statistic. A response rate of 93% was obtained from the directors and 79% from the residents. The survey demonstrates broad-based support for a comprehensive curriculum by the respondents. Current perceptions of and expectations for the curriculum are diverse and regionalized. Serious concerns are expressed about quality issues and particularly the environment for residency education. The thoughtful responses of our colleagues will guide leaders who will implement the curriculum for thoracic surgery. Strategies for change will necessarily focus on the prerequisite curriculum.
Empirical research in service engineering based on AHP and fuzzy methods
NASA Astrophysics Data System (ADS)
Zhang, Yanrui; Cao, Wenfu; Zhang, Lina
2015-12-01
Recent years have seen rapid worldwide growth in the management consulting industry. Taking a large management consulting company as the research object, this paper establishes an index system for consulting service quality and, based on a customer satisfaction survey, evaluates the company's service quality using AHP and fuzzy comprehensive evaluation methods.
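A minimal sketch of the AHP weighting step named above, deriving criterion weights from a pairwise-comparison matrix via its principal eigenvector (the comparison values are illustrative):

```python
import numpy as np

# Pairwise comparison matrix for three service-quality criteria
# (Saaty scale; entry [i, j] = importance of criterion i relative to j)
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
principal = np.argmax(eigvals.real)
weights = np.abs(eigvecs[:, principal].real)
weights /= weights.sum()

# Consistency index and ratio (random index RI = 0.58 for a 3x3 matrix)
ci = (eigvals.real[principal] - len(A)) / (len(A) - 1)
print(weights, ci / 0.58)  # acceptable consistency if the ratio is below 0.1
```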
Moral Virtue and Practical Wisdom: Theme Comprehension in Children, Youth and Adults
Narvaez, Darcia; Gleason, Tracy; Mitchell, Christyan
2010-01-01
Three hypotheses were tested about the relation of moral comprehension to prudential comprehension by contrasting comprehension of themes in moral stories with comprehension of themes in prudential stories among third grade, fifth grade and college students (n = 168) in Study 1, and among college students, young and middle aged adults, and older adults (n = 96) in Study 2. In both studies, all groups were statistically significantly better at moral theme comprehension than prudential theme comprehension, suggesting that moral comprehension may develop prior to prudential comprehension. In Study 2, all groups performed equally on moral theme generation whereas both adult groups were significantly better than college students on prudential theme generation. Overall, the findings of these studies provide modest evidence that moral and prudential comprehension each develop separately, and that the latter may develop more slowly. PMID:21171549