Sample records for identifying statistical dependence

  1. Statistical dependency in visual scanning

    NASA Technical Reports Server (NTRS)

    Ellis, Stephen R.; Stark, Lawrence

    1986-01-01

    A method to identify statistical dependencies in the positions of eye fixations is developed and applied to eye movement data from subjects who viewed dynamic displays of air traffic and judged future relative position of aircraft. Analysis of approximately 23,000 fixations on points of interest on the display identified statistical dependencies in scanning that were independent of the physical placement of the points of interest. Identification of these dependencies is inconsistent with random-sampling-based theories used to model visual search and information seeking.
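
    Illustrative only, not the authors' method: one simple way to probe such dependence is a chi-square test of independence on first-order fixation transitions, sketched below with a made-up sequence of labelled points of interest.

      import numpy as np
      from scipy.stats import chi2_contingency

      # Hypothetical sequence of fixated points of interest (labels A-D).
      fixations = list("ABABCDABCADBABCDDAAB")
      labels = sorted(set(fixations))
      idx = {lab: i for i, lab in enumerate(labels)}

      counts = np.zeros((len(labels), len(labels)))
      for a, b in zip(fixations, fixations[1:]):
          counts[idx[a], idx[b]] += 1          # first-order transition counts

      # A pure random-sampling model of scanning predicts the next fixation is
      # independent of the current one; a small p-value argues against that.
      chi2, p, dof, _ = chi2_contingency(counts)
      print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")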

  2. Knowledge-Assisted Approach to Identify Pathways with Differential Dependencies | Office of Cancer Genomics

    Cancer.gov

    We have previously developed a statistical method to identify gene sets enriched with condition-specific genetic dependencies. The method constructs gene dependency networks from bootstrapped samples in one condition and computes the divergence between distributions of network likelihood scores from different conditions. It was shown to be capable of sensitive and specific identification of pathways with phenotype-specific dysregulation, i.e., rewiring of dependencies between genes in different conditions.

  3. From Combat to Campus

    ERIC Educational Resources Information Center

    Bellafiore, Margaret

    2012-01-01

    Soldiers are returning from war to college. The number of veterans enrolled nationally is hard to find. Data from the National Center for Veterans Analysis and Statistics identify nearly 924,000 veterans as "total education program beneficiaries" for 2011. These statistics combine many categories, including dependents and survivors. The…

  4. Mixed models, linear dependency, and identification in age-period-cohort models.

    PubMed

    O'Brien, Robert M

    2017-07-20

    This paper examines the identification problem in age-period-cohort models that use either linear or categorically coded ages, periods, and cohorts or combinations of these parameterizations. These models are not identified using the traditional fixed effect regression model approach because of a linear dependency between the ages, periods, and cohorts. However, these models can be identified if the researcher introduces a single just-identifying constraint on the model coefficients. The problem with such constraints is that the results can differ substantially depending on the constraint chosen. Somewhat surprisingly, age-period-cohort models that specify one or more of ages and/or periods and/or cohorts as random effects are identified, without the need for an additional constraint. I label this identification as statistical model identification, show how it comes about in mixed models, and explain why the choice of which effects are treated as fixed and which as random can substantially change the estimates of the age, period, and cohort effects. Copyright © 2017 John Wiley & Sons, Ltd.
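
    A minimal numerical illustration of the stated linear dependency, using linear (not categorical) codings and made-up values:

      import numpy as np

      # With linearly coded effects, Cohort = Period - Age, so the fixed-effects
      # design matrix is rank deficient and the model is not identified.
      age    = np.array([20, 30, 40, 20, 30, 40])
      period = np.array([1990, 1990, 1990, 2000, 2000, 2000])
      cohort = period - age

      X = np.column_stack([np.ones_like(age), age, period, cohort])
      print(np.linalg.matrix_rank(X))   # 3, not 4: age, period, cohort are collinear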

  5. Spurious correlations and inference in landscape genetics

    Treesearch

    Samuel A. Cushman; Erin L. Landguth

    2010-01-01

    Reliable interpretation of landscape genetic analyses depends on statistical methods that have high power to identify the correct process driving gene flow while rejecting incorrect alternative hypotheses. Little is known about statistical power and inference in individual-based landscape genetics. Our objective was to evaluate the power of causal modelling with partial...

  6. About influence of input rate random part of nonstationary queue system on statistical estimates of its macroscopic indicators

    NASA Astrophysics Data System (ADS)

    Korelin, Ivan A.; Porshnev, Sergey V.

    2018-05-01

    A model of a non-stationary queuing system (NQS) is described. The input of this model receives a flow of requests with input rate λ = λdet(t) + λrnd(t), where λdet(t) is a deterministic function of time and λrnd(t) is a random function. The parameters of λdet(t) and λrnd(t) were identified on the basis of statistical information on visitor flows collected at various Russian football stadiums. Statistical modeling of the NQS is carried out and the average statistical dependences are obtained for the length of the queue of requests waiting for service, the average waiting time for service, and the number of visitors admitted to the stadium over time. It is shown that these dependences can be characterized by the following parameters: the number of visitors who have entered by the time of the match; the time required to serve all incoming visitors; the maximum value; and the argument value at which the studied dependence reaches its maximum. The dependences of these parameters on the energy ratio of the deterministic and random components of the input rate are investigated.
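
    A rough discrete-time sketch of such a queue with a deterministic-plus-random input rate; the rates, server count, and surge shape below are invented, not the parameters fitted from the stadium data:

      import numpy as np

      rng = np.random.default_rng(0)
      T, dt = 120.0, 0.1                                   # minutes before kick-off
      t = np.arange(0.0, T, dt)
      lam_det = 40.0 * np.exp(-((t - 90.0) / 25.0) ** 2)   # deterministic surge
      lam_rnd = rng.normal(0.0, 5.0, t.size)               # random component
      lam = np.clip(lam_det + lam_rnd, 0.0, None)          # arrivals per minute

      servers, mu = 10, 1.5                                # turnstiles, service rate each
      queue, served, peak = 0, 0, 0
      for rate in lam:
          arrivals = rng.poisson(rate * dt)
          departures = min(queue + arrivals, rng.poisson(servers * mu * dt))
          queue += arrivals - departures
          served += departures
          peak = max(peak, queue)

      print(f"peak queue length ~ {peak}, visitors served ~ {served}")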

  7. Dependency of high coastal water level and river discharge at the global scale

    NASA Astrophysics Data System (ADS)

    Ward, P.; Couasnon, A.; Haigh, I. D.; Muis, S.; Veldkamp, T.; Winsemius, H.; Wahl, T.

    2017-12-01

    It is widely recognized that floods cause huge socioeconomic impacts. From 1980-2013, global flood losses exceeded $1 trillion, with 220,000 fatalities. These impacts are felt particularly hard in low-lying, densely populated deltas and estuaries, whose location at the coast-land interface makes them naturally prone to flooding. When river and coastal floods coincide, their impacts in these deltas and estuaries are often worse than when they occur in isolation. Such floods are examples of so-called 'compound events'. In this contribution, we present the first global-scale analysis of the statistical dependency between high coastal water levels (and the storm surge component alone) and river discharge. We show that there is statistical dependency between these components at more than half of the stations examined. We also identify time-lags at which the correlation between peak discharges and coastal water levels is highest. Finally, we assess the probability of the simultaneous occurrence of design discharge and design coastal water levels, assuming both independence and statistical dependence. For those stations where we identified statistical dependency, the probability is between 1 and 5 times greater when the dependence structure is accounted for. This information is essential for understanding the likelihood of compound flood events occurring at locations around the world, as well as for accurate flood risk assessments and effective flood risk management. The research was carried out by analysing the statistical dependency between observed coastal water levels (and the storm surge component) from GESLA-2 and river discharge using gauged data from GRDC stations all around the world. The dependence structure was examined using copula functions.
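
    A schematic version of the dependence check on synthetic data (the variables below stand in for paired surge and discharge series; the copula fitting itself is not reproduced):

      import numpy as np
      from scipy.stats import kendalltau

      rng = np.random.default_rng(1)
      n = 500
      shared = rng.normal(size=n)                       # common storm driver
      surge = 0.8 * shared + 0.6 * rng.normal(size=n)
      discharge = 0.8 * shared + 0.6 * rng.normal(size=n)

      tau, p = kendalltau(surge, discharge)
      print(f"Kendall's tau = {tau:.2f} (p = {p:.1e})")

      # Joint exceedance of the 95th percentiles: independence predicts
      # 0.05 * 0.05 = 0.0025; positive dependence inflates this probability.
      joint = np.mean((surge > np.quantile(surge, 0.95)) &
                      (discharge > np.quantile(discharge, 0.95)))
      print(f"joint exceedance {joint:.4f} vs 0.0025 under independence")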

  8. Novel genes identified in a high-density genome wide association study for nicotine dependence.

    PubMed

    Bierut, Laura Jean; Madden, Pamela A F; Breslau, Naomi; Johnson, Eric O; Hatsukami, Dorothy; Pomerleau, Ovide F; Swan, Gary E; Rutter, Joni; Bertelsen, Sarah; Fox, Louis; Fugman, Douglas; Goate, Alison M; Hinrichs, Anthony L; Konvicka, Karel; Martin, Nicholas G; Montgomery, Grant W; Saccone, Nancy L; Saccone, Scott F; Wang, Jen C; Chase, Gary A; Rice, John P; Ballinger, Dennis G

    2007-01-01

    Tobacco use is a leading contributor to disability and death worldwide, and genetic factors contribute in part to the development of nicotine dependence. To identify novel genes for which natural variation contributes to the development of nicotine dependence, we performed a comprehensive genome wide association study using nicotine dependent smokers as cases and non-dependent smokers as controls. To allow the efficient, rapid, and cost effective screen of the genome, the study was carried out using a two-stage design. In the first stage, genotyping of over 2.4 million single nucleotide polymorphisms (SNPs) was completed in case and control pools. In the second stage, we selected SNPs for individual genotyping based on the most significant allele frequency differences between cases and controls from the pooled results. Individual genotyping was performed in 1050 cases and 879 controls using 31,960 selected SNPs. The primary analysis, a logistic regression model with covariates of age, gender, genotype and gender by genotype interaction, identified 35 SNPs with P-values less than 10⁻⁴ (minimum P-value 1.53 × 10⁻⁶). Although none of the individual findings is statistically significant after correcting for multiple tests, additional statistical analyses support the existence of true findings in this group. Our study nominates several novel genes, such as Neurexin 1 (NRXN1), in the development of nicotine dependence while also identifying a known candidate gene, the beta3 nicotinic cholinergic receptor. This work anticipates the future directions of large-scale genome wide association studies with state-of-the-art methodological approaches and sharing of data with the scientific community.

  9. Evaluating the utility of companion animal tick surveillance practices for monitoring spread and occurrence of human Lyme disease in West Virginia, 2014-2016.

    PubMed

    Hendricks, Brian; Mark-Carew, Miguella; Conley, Jamison

    2017-11-13

    Domestic dogs and cats are potentially effective sentinel populations for monitoring occurrence and spread of Lyme disease. Few studies have evaluated the public health utility of sentinel programmes using geo-analytic approaches. Confirmed Lyme disease cases diagnosed by physicians and ticks submitted by veterinarians to the West Virginia State Health Department were obtained for 2014-2016. Ticks were identified to species, and only Ixodes scapularis were incorporated in the analysis. Separate ordinary least squares (OLS) and spatial lag regression models were conducted to estimate the association between average numbers of Ix. scapularis collected on pets and human Lyme disease incidence. Regression residuals were visualised using Local Moran's I as a diagnostic tool to identify spatial dependence. Statistically significant associations were identified between average numbers of Ix. scapularis collected from dogs and human Lyme disease in the OLS (β=20.7, P<0.001) and spatial lag (β=12.0, P=0.002) regression. No significant associations were identified for cats in either regression model. Statistically significant (P≤0.05) spatial dependence was identified in all regression models. Local Moran's I maps produced for spatial lag regression residuals indicated a decrease in model over- and under-estimation, but identified a higher number of statistically significant outliers than OLS regression. Results support previous conclusions that dogs are effective sentinel populations for monitoring risk of human exposure to Lyme disease. Findings reinforce the utility of spatial analysis of surveillance data, and highlight West Virginia's unique position within the eastern United States with regard to Lyme disease occurrence.
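
    A compact sketch of screening regression residuals for spatial dependence with a global Moran's I, a simpler relative of the Local Moran's I used in the study; the residuals and contiguity matrix are simulated, not the West Virginia data:

      import numpy as np

      rng = np.random.default_rng(2)
      n = 30
      resid = rng.normal(size=n)                        # OLS residuals (toy values)
      W = (rng.random((n, n)) < 0.15).astype(float)     # 1 = "counties share a border"
      W = np.triu(W, 1); W = W + W.T                    # symmetric, zero diagonal

      z = resid - resid.mean()
      I = (n / W.sum()) * (z @ W @ z) / (z @ z)         # global Moran's I
      print(f"Moran's I = {I:.3f}; expected {-1/(n-1):.3f} under no autocorrelation")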

  10. Comprehensive Analyses of Ventricular Myocyte Models Identify Targets Exhibiting Favorable Rate Dependence

    PubMed Central

    Bugana, Marco; Severi, Stefano; Sobie, Eric A.

    2014-01-01

    Reverse rate dependence is a problematic property of antiarrhythmic drugs that prolong the cardiac action potential (AP). The prolongation caused by reverse rate dependent agents is greater at slow heart rates, resulting in both reduced arrhythmia suppression at fast rates and increased arrhythmia risk at slow rates. The opposite property, forward rate dependence, would theoretically overcome these parallel problems, yet forward rate dependent (FRD) antiarrhythmics remain elusive. Moreover, there is evidence that reverse rate dependence is an intrinsic property of perturbations to the AP. We have addressed the possibility of forward rate dependence by performing a comprehensive analysis of 13 ventricular myocyte models. By simulating populations of myocytes with varying properties and analyzing population results statistically, we simultaneously predicted the rate-dependent effects of changes in multiple model parameters. An average of 40 parameters were tested in each model, and effects on AP duration were assessed at slow (0.2 Hz) and fast (2 Hz) rates. The analysis identified a variety of FRD ionic current perturbations and generated specific predictions regarding their mechanisms. For instance, an increase in L-type calcium current is FRD when this is accompanied by indirect, rate-dependent changes in slow delayed rectifier potassium current. A comparison of predictions across models identified inward rectifier potassium current and the sodium-potassium pump as the two targets most likely to produce FRD AP prolongation. Finally, a statistical analysis of results from the 13 models demonstrated that models displaying minimal rate-dependent changes in AP shape have little capacity for FRD perturbations, whereas models with large shape changes have considerable FRD potential. This can explain differences between species and between ventricular cell types. Overall, this study provides new insights, both specific and general, into the determinants of AP duration rate dependence, and illustrates a strategy for the design of potentially beneficial antiarrhythmic drugs. PMID:24675446

  11. Comprehensive analyses of ventricular myocyte models identify targets exhibiting favorable rate dependence.

    PubMed

    Cummins, Megan A; Dalal, Pavan J; Bugana, Marco; Severi, Stefano; Sobie, Eric A

    2014-03-01

    Reverse rate dependence is a problematic property of antiarrhythmic drugs that prolong the cardiac action potential (AP). The prolongation caused by reverse rate dependent agents is greater at slow heart rates, resulting in both reduced arrhythmia suppression at fast rates and increased arrhythmia risk at slow rates. The opposite property, forward rate dependence, would theoretically overcome these parallel problems, yet forward rate dependent (FRD) antiarrhythmics remain elusive. Moreover, there is evidence that reverse rate dependence is an intrinsic property of perturbations to the AP. We have addressed the possibility of forward rate dependence by performing a comprehensive analysis of 13 ventricular myocyte models. By simulating populations of myocytes with varying properties and analyzing population results statistically, we simultaneously predicted the rate-dependent effects of changes in multiple model parameters. An average of 40 parameters were tested in each model, and effects on AP duration were assessed at slow (0.2 Hz) and fast (2 Hz) rates. The analysis identified a variety of FRD ionic current perturbations and generated specific predictions regarding their mechanisms. For instance, an increase in L-type calcium current is FRD when this is accompanied by indirect, rate-dependent changes in slow delayed rectifier potassium current. A comparison of predictions across models identified inward rectifier potassium current and the sodium-potassium pump as the two targets most likely to produce FRD AP prolongation. Finally, a statistical analysis of results from the 13 models demonstrated that models displaying minimal rate-dependent changes in AP shape have little capacity for FRD perturbations, whereas models with large shape changes have considerable FRD potential. This can explain differences between species and between ventricular cell types. Overall, this study provides new insights, both specific and general, into the determinants of AP duration rate dependence, and illustrates a strategy for the design of potentially beneficial antiarrhythmic drugs.

  12. Chemical potential dependence of particle ratios within a unified thermal approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bashir, I., E-mail: inamhep@gmail.com; Nanda, H.; Uddin, S.

    2016-06-15

    A unified statistical thermal freeze-out model (USTFM) is used to study the chemical potential dependence of identified particle ratios at mid-rapidity in heavy-ion collisions. We successfully reproduce the experimental data ranging from SPS energies to LHC energies, suggesting the statistical nature of the particle production in these collisions and hence the validity of our approach. The behavior of the freeze-out temperature is studied with respect to chemical potential. The freeze-out temperature is found to be universal at the RHIC and LHC and is close to the QCD predicted phase transition temperature, suggesting that the chemical freeze-out occurs soon after the hadronization takes place.

  13. Spatio-temporal analysis of annual rainfall in Crete, Greece

    NASA Astrophysics Data System (ADS)

    Varouchakis, Emmanouil A.; Corzo, Gerald A.; Karatzas, George P.; Kotsopoulou, Anastasia

    2018-03-01

    Analysis of rainfall data from the island of Crete, Greece, was performed to identify key hydrological years and return periods as well as to analyze the inter-annual behavior of the rainfall variability during the period 1981-2014. The rainfall spatial distribution was also examined in detail to identify vulnerable areas of the island. Data analysis using statistical tools and spectral analysis was applied to investigate and interpret the temporal course of the available rainfall data set. In addition, spatial analysis techniques were applied and compared to determine the rainfall spatial distribution on the island of Crete. The analysis showed that, in contrast to Regional Climate Model estimations, rainfall rates have not decreased, while return periods vary depending on seasonality and geographic location. A small but statistically significant increasing trend was detected in the inter-annual rainfall variations, as well as a significant rainfall cycle almost every 8 years. In addition, a statistically significant correlation of the island's rainfall variability with the North Atlantic Oscillation is identified for the examined period. On the other hand, the regression kriging method, combining surface elevation as secondary information, improved the estimation of the annual rainfall spatial variability on the island of Crete by 70% compared to ordinary kriging. The rainfall spatial and temporal trends on the island of Crete have variable characteristics that depend on the geographical area and on the hydrological period.

  14. USE OF MODELS TO SUPPORT WATER QUALITY CRITERIA - A CASE STUDY

    EPA Science Inventory

    In the United States, current methods for deriving chemical criteria protective of aquatic life depend on acute and chronic toxicity test results involving several species. These results are analyzed statistically to identify chemical concentrations that protect the majority of ...

  15. Microdose Induced Data Loss on Floating Gate Memories

    NASA Technical Reports Server (NTRS)

    Guertin, Steven M.; Nguyen, Duc M.; Patterson, Jeffrey D.

    2006-01-01

    Heavy ion irradiation of flash memories shows loss of stored data. The fluence dependence is indicative of microdose effects. Other qualitative factors identifying the effect as microdose are discussed. The data are presented and compared to statistical results of a microdose target-based model.

  16. Indicator organisms in meat and poultry slaughter operations: their potential use in process control and the role of emerging technologies.

    PubMed

    Saini, Parmesh K; Marks, Harry M; Dreyfuss, Moshe S; Evans, Peter; Cook, L Victor; Dessai, Uday

    2011-08-01

    Measuring commonly occurring, nonpathogenic organisms on poultry products may be used for designing statistical process control systems that could result in reductions of pathogen levels. The extent of pathogen level reduction that could be obtained from actions resulting from monitoring these measurements over time depends upon the degree of understanding cause-effect relationships between processing variables, selected output variables, and pathogens. For such measurements to be effective for controlling or improving processing to some capability level within the statistical process control context, sufficiently frequent measurements would be needed to help identify processing deficiencies. Ultimately the correct balance of sampling and resources is determined by those characteristics of deficient processing that are important to identify. We recommend strategies that emphasize flexibility, depending upon sampling objectives. Coupling the measurement of levels of indicator organisms with practical emerging technologies and suitable on-site platforms that decrease the time between sample collections and interpreting results would enhance monitoring process control.
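
    One simple statistical process control scheme of the kind discussed is a Shewhart c-chart on indicator-organism counts per sampling window; the counts and the choice of baseline below are invented for illustration:

      import numpy as np

      counts = np.array([12, 9, 14, 11, 10, 13, 8, 15, 11, 27, 12, 10])
      c_bar = counts[:9].mean()                      # baseline from in-control samples
      ucl = c_bar + 3 * np.sqrt(c_bar)               # upper control limit
      lcl = max(c_bar - 3 * np.sqrt(c_bar), 0.0)     # lower control limit

      for i, c in enumerate(counts, start=1):
          if c > ucl or c < lcl:
              print(f"sample {i}: count {c} outside ({lcl:.1f}, {ucl:.1f}) -> investigate process")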

  17. Using a cross section to train veterinary students to visualize anatomical structures in three dimensions

    NASA Astrophysics Data System (ADS)

    Provo, Judy; Lamar, Carlton; Newby, Timothy

    2002-01-01

    A cross section was used to enhance three-dimensional knowledge of anatomy of the canine head. All veterinary students in two successive classes (n = 124) dissected the head; experimental groups also identified structures on a cross section of the head. A test assessing spatial knowledge of the head generated 10 dependent variables from two administrations. The test had content validity and statistically significant interrater and test-retest reliability. A live-dog examination generated one additional dependent variable. Analysis of covariance controlling for performance on course examinations and quizzes revealed no treatment effect. Including spatial skill as a third covariate revealed a statistically significant effect of spatial skill on three dependent variables. Men initially had greater spatial skill than women, but spatial skills were equal after 8 months. A qualitative analysis showed the positive impact of this experience on participants. Suggestions for improvement and future research are discussed.

  18. Estimation of trends

    NASA Technical Reports Server (NTRS)

    1981-01-01

    This report concerns the application of statistical methods to recorded ozone measurements. A long-term depletion of ozone at the magnitudes predicted by the NAS would be harmful to most forms of life. Empirical prewhitening filters, whose derivation is independent of the underlying physical mechanisms, were analyzed. Statistical analysis performs a checks-and-balances role: time series filtering separates variations into systematic and random parts, errors are uncorrelated, and significant phase-lag dependencies are identified. The use of time series modeling to enhance the capability of detecting trends is discussed.

  19. Statistical properties of solar granulation derived from the SOUP instrument on Spacelab 2

    NASA Technical Reports Server (NTRS)

    Title, A. M.; Tarbell, T. D.; Topka, K. P.; Ferguson, S. H.; Shine, R. A.

    1989-01-01

    Computer algorithms and statistical techniques were used to identify, measure, and quantify the properties of solar granulation derived from movies collected by the Solar Optical Universal Polarimeter on Spacelab 2. The results show that there is neither a typical solar granule nor a typical granule evolution. A granule's evolution is dependent on local magnetic flux density, its position with respect to the active region plage, its position in the mesogranulation pattern, and the evolution of granules in its immediate neighborhood.

  20. Search for the Theta+ in photoproduction on the deuteron

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    K.H. Hicks

    2005-07-26

    A high-statistics experiment on a deuterium target was performed using a real photon beam with energies up to 3.6 GeV at the CLAS detector of Jefferson Lab. The reaction reported here is γd → pK⁻K⁺n, where the neutron was identified using the missing-mass technique. No statistically significant narrow peak in the mass region from 1.5-1.6 GeV was found. An upper limit on the elementary process γn → K⁻Θ⁺ was estimated to be about 4-5 nb, using a model-dependent correction for rescattering determined from Λ(1520) production. Other reactions with less model-dependence are being pursued.

  1. Segmenting Dynamic Human Action via Statistical Structure

    ERIC Educational Resources Information Center

    Baldwin, Dare; Andersson, Annika; Saffran, Jenny; Meyer, Meredith

    2008-01-01

    Human social, cognitive, and linguistic functioning depends on skills for rapidly processing action. Identifying distinct acts within the dynamic motion flow is one basic component of action processing; for example, skill at segmenting action is foundational to action categorization, verb learning, and comprehension of novel action sequences. Yet…

  2. Method of identifying clusters representing statistical dependencies in multivariate data

    NASA Technical Reports Server (NTRS)

    Borucki, W. J.; Card, D. H.; Lyle, G. C.

    1975-01-01

    Approach is first to cluster and then to compute spatial boundaries for resulting clusters. Next step is to compute, from set of Monte Carlo samples obtained from scrambled data, estimates of probabilities of obtaining at least as many points within boundaries as were actually observed in original data.
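
    A rough sketch of the idea on two correlated variables, with cluster bounding boxes standing in for the computed spatial boundaries and independent column scrambling standing in for the Monte Carlo samples:

      import numpy as np
      from scipy.cluster.vq import kmeans2

      rng = np.random.default_rng(4)
      x = rng.normal(size=300)
      data = np.column_stack([x, x + 0.3 * rng.normal(size=300)])

      _, labels = kmeans2(data, 3, minit="++")
      boxes = [(data[labels == k].min(0), data[labels == k].max(0)) for k in range(3)]

      def inside(pts):
          # count points falling inside any cluster bounding box
          return sum(((pts >= lo) & (pts <= hi)).all(1).sum() for lo, hi in boxes)

      observed = inside(data)
      # Scrambling each column destroys the dependence; count how often the
      # scrambled data put at least as many points inside the boundaries.
      exceed = sum(inside(np.column_stack([rng.permutation(data[:, 0]),
                                           rng.permutation(data[:, 1])])) >= observed
                   for _ in range(500))
      print(f"Monte Carlo p-value ~ {exceed / 500:.3f}")   # small p => real dependence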

  3. Identifying the Factors That Influence Change in SEBD Using Logistic Regression Analysis

    ERIC Educational Resources Information Center

    Camilleri, Liberato; Cefai, Carmel

    2013-01-01

    Multiple linear regression and ANOVA models are widely used in applications since they provide effective statistical tools for assessing the relationship between a continuous dependent variable and several predictors. However these models rely heavily on linearity and normality assumptions and they do not accommodate categorical dependent…

  4. Radial dependence of self-organized criticality behavior in TCABR tokamak

    NASA Astrophysics Data System (ADS)

    dos Santos Lima, G. Z.; Iarosz, K. C.; Batista, A. M.; Guimarães-Filho, Z. O.; Caldas, I. L.; Kuznetsov, Y. K.; Nascimento, I. C.; Viana, R. L.; Lopes, S. R.

    2011-03-01

    In this work we present evidence of the self-organized criticality behavior of the plasma edge electrostatic turbulence in the tokamak TCABR. Analyzing fluctuation data measured by Langmuir probes, we verify the radial dependence of self-organized criticality behavior at the plasma edge and scrape-off layer. We identify evidence of this radial criticality in statistical properties of the laminar period distribution function, power spectral density, autocorrelation, and Hurst parameter for the analyzed fluctuations.

  5. Direction dependence analysis: A framework to test the direction of effects in linear models with an implementation in SPSS.

    PubMed

    Wiedermann, Wolfgang; Li, Xintong

    2018-04-16

    In nonexperimental data, at least three possible explanations exist for the association of two variables x and y: (1) x is the cause of y, (2) y is the cause of x, or (3) an unmeasured confounder is present. Statistical tests that identify which of the three explanatory models fits best would be a useful adjunct to the use of theory alone. The present article introduces one such statistical method, direction dependence analysis (DDA), which assesses the relative plausibility of the three explanatory models on the basis of higher-moment information about the variables (i.e., skewness and kurtosis). DDA involves the evaluation of three properties of the data: (1) the observed distributions of the variables, (2) the residual distributions of the competing models, and (3) the independence properties of the predictors and residuals of the competing models. When the observed variables are nonnormally distributed, we show that DDA components can be used to uniquely identify each explanatory model. Statistical inference methods for model selection are presented, and macros to implement DDA in SPSS are provided. An empirical example is given to illustrate the approach. Conceptual and empirical considerations are discussed for best-practice applications in psychological data, and sample size recommendations based on previous simulation studies are provided.
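
    A small sketch of one DDA ingredient, residual non-normality: when the true cause is skewed, the residuals of the reversed (mis-specified) regression stay skewed while those of the correct model resemble the normal error term. The data are simulated, and this is not the authors' SPSS macro.

      import numpy as np
      from scipy.stats import linregress, skew

      rng = np.random.default_rng(3)
      x = rng.exponential(size=2000)                      # skewed cause
      y = 0.7 * x + rng.normal(scale=0.5, size=2000)      # effect with normal error

      def residual_skew(predictor, outcome):
          fit = linregress(predictor, outcome)
          return skew(outcome - (fit.intercept + fit.slope * predictor))

      print("residual skewness, model x -> y:", round(residual_skew(x, y), 2))
      print("residual skewness, model y -> x:", round(residual_skew(y, x), 2))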

  6. Research participant compensation: A matter of statistical inference as well as ethics.

    PubMed

    Swanson, David M; Betensky, Rebecca A

    2015-11-01

    The ethics of compensation of research subjects for participation in clinical trials has been debated for years. One ethical issue of concern is variation among subjects in the level of compensation for identical treatments. Surprisingly, the impact of variation on the statistical inferences made from trial results has not been examined. We seek to identify how variation in compensation may influence any existing dependent censoring in clinical trials, thereby also influencing inference about the survival curve, hazard ratio, or other measures of treatment efficacy. In simulation studies, we consider a model for how compensation structure may influence the censoring model. Under existing dependent censoring, we estimate survival curves under different compensation structures and observe how these structures induce variability in the estimates. We show through this model that if the compensation structure affects the censoring model and dependent censoring is present, then variation in that structure induces variation in the estimates and affects the accuracy of estimation and inference on treatment efficacy. From the perspectives of both ethics and statistical inference, standardization and transparency in the compensation of participants in clinical trials are warranted. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. Body mass index trends of military dependents: a cross-sectional study.

    PubMed

    Winegarner, James

    2015-03-01

    Obesity is an epidemic affecting many people in the United States, including military beneficiaries, with immediate and long-term implications for health care utilization and costs. We compared the body mass index (BMI) of officer vs. enlisted military-dependent spouses. A retrospective chart review of 7,226 randomly selected dependent spouses cared for at Madigan Army Medical Center was performed. Statistical analysis of BMI was performed comparing the spouses of commissioned officers and enlisted soldiers. There is a higher percentage of overweight and obese enlisted spouses when compared to officer spouses. In all age groups, BMI was 2.6 to 4.8 points higher in enlisted spouses, in both all-inclusive and female-specific analyses (p < 0.001). Male spouse BMI was not statistically different. BMI generally increased with age, with a statistically significant difference in BMI between age groups (p < 0.001). Our study shows that the average BMI of enlisted soldiers' female spouses is significantly higher than that of officer spouses of similar age groups. A much larger proportion of enlisted spouses are obese. This analysis provides public health information for military primary care doctors and identifies at-risk individuals for targeted education and interventions. Reprint & Copyright © 2015 Association of Military Surgeons of the U.S.

  8. An analysis of science versus pseudoscience

    NASA Astrophysics Data System (ADS)

    Hooten, James T.

    2011-12-01

    This quantitative study identified distinctive features in archival datasets commissioned by the National Science Foundation (NSF) for Science and Engineering Indicators reports. The dependent variables included education level, and scores for science fact knowledge, science process knowledge, and pseudoscience beliefs. The dependent variables were aggregated into nine NSF-defined geographic regions and examined for the years 2004 and 2006. The variables were also examined over all years available in the dataset. Descriptive statistics were determined and tests for normality and homogeneity of variances were performed using Statistical Package for the Social Sciences. Analysis of Variance was used to test for statistically significant differences between the nine geographic regions for each of the four dependent variables. Statistical significance of 0.05 was used. Tukey post-hoc analysis was used to compute practical significance of differences between regions. Post-hoc power analysis using G*Power was used to calculate the probability of Type II errors. Tests for correlations across all years of the dependent variables were also performed. Pearson's r was used to indicate the strength of the relationship between the dependent variables. Small to medium differences in science literacy and education level were observed between many of the nine U.S. geographic regions. The most significant differences occurred when the West South Central region was compared to the New England and the Pacific regions. Belief in pseudoscience appeared to be distributed evenly across all U.S. geographic regions. Education level was a strong indicator of science literacy regardless of a respondent's region of residence. Recommendations for further study include more in-depth investigation to uncover the nature of the relationship between education level and belief in pseudoscience.

  9. Score tests for independence in semiparametric competing risks models.

    PubMed

    Saïd, Mériem; Ghazzali, Nadia; Rivest, Louis-Paul

    2009-12-01

    A popular model for competing risks postulates the existence of a latent unobserved failure time for each risk. Assuming that these underlying failure times are independent is attractive since it allows standard statistical tools for right-censored lifetime data to be used in the analysis. This paper proposes simple independence score tests for the validity of this assumption when the individual risks are modeled using semiparametric proportional hazards regressions. It assumes that covariates are available, making the model identifiable. The score tests are derived for alternatives that specify that copulas are responsible for a possible dependency between the competing risks. The test statistics are constructed by adding to the partial likelihoods for the individual risks an explanatory variable for the dependency between the risks. A variance estimator is derived by writing the score function and the Fisher information matrix for the marginal models as stochastic integrals. Pitman efficiencies are used to compare test statistics. A simulation study and a numerical example illustrate the methodology proposed in this paper.

  10. Effect of heating rate and kinetic model selection on activation energy of nonisothermal crystallization of amorphous felodipine.

    PubMed

    Chattoraj, Sayantan; Bhugra, Chandan; Li, Zheng Jane; Sun, Changquan Calvin

    2014-12-01

    The nonisothermal crystallization kinetics of amorphous materials is routinely analyzed by statistically fitting the crystallization data to kinetic models. In this work, we systematically evaluate how the model-dependent crystallization kinetics is impacted by variations in the heating rate and the selection of the kinetic model, two key factors that can lead to significant differences in the crystallization activation energy (Ea) of an amorphous material. Using amorphous felodipine, we show that Ea decreases with increasing heating rate, irrespective of the kinetic model evaluated in this work. The model that best describes the crystallization phenomenon cannot be identified readily through the statistical fitting approach because several kinetic models yield comparable R². Here, we propose an alternate paired model-fitting model-free (PMFMF) approach for identifying the most suitable kinetic model, where Ea obtained from model-dependent kinetics is compared with those obtained from model-free kinetics. The most suitable kinetic model is identified as the one that yields Ea values comparable with the model-free kinetics. Through this PMFMF approach, nucleation and growth is identified as the main mechanism that controls the crystallization kinetics of felodipine. Using this PMFMF approach, we further demonstrate that the crystallization mechanism from the amorphous phase varies with heating rate. © 2014 Wiley Periodicals, Inc. and the American Pharmacists Association.
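
    For contrast with model fitting, a Kissinger-type model-free estimate of the activation energy from peak crystallization temperatures at several heating rates is sketched below; this is one common model-free route, and the numbers are illustrative rather than felodipine data:

      import numpy as np

      R = 8.314                                     # J/(mol K)
      beta = np.array([2.0, 5.0, 10.0, 20.0])       # heating rates, K/min
      Tp = np.array([358.0, 364.0, 369.0, 375.0])   # peak temperatures, K

      # Kissinger relation: ln(beta / Tp^2) = -Ea / (R * Tp) + const
      slope, _ = np.polyfit(1.0 / Tp, np.log(beta / Tp**2), 1)
      print(f"Ea ~ {-slope * R / 1000:.0f} kJ/mol")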

  11. Poor oral intake causes enteral nutrition dependency after concomitant chemoradiotherapy for pharyngeal cancers.

    PubMed

    Ishii, Ryo; Kato, Kengo; Ogawa, Takenori; Sato, Takeshi; Nakanome, Ayako; Ohkoshi, Akira; Kawamoto-Hirano, Ai; Shirakura, Masayuki; Hidaka, Hiroshi; Katori, Yukio

    2018-06-01

    To identify precipitating factors responsible for enteral nutrition (EN) dependency after concomitant chemoradiotherapy (CCRT) of head and neck cancers and to examine their statistical correlations. Factors related to feeding condition, nutritional status, disease, and treatment of 26 oropharyngeal and hypopharyngeal cancer patients who received definitive CCRT were retrospectively investigated by examining their medical records. The days of no oral intake (NOI) during hospitalization and the months using enteral nutrition after CCRT were counted as representing the feeding condition, and the changes in body weight (BW) were examined as reflecting nutritional status. The factors related to EN dependency after CCRT were analyzed. Long duration of total NOI (≥ 30 days) and maximum NOI ≥ 14 days were significant predictors of EN dependency. Decreased BW (≥ 7.5 kg) was the next predictor identified, but it was not significant. Multivariate analysis showed that the total duration of NOI was more correlated with EN dependency than changes in BW. A long duration of NOI was more strongly related to EN dependency than nutritional factors.

  12. Identifying seizure clusters in patients with psychogenic nonepileptic seizures.

    PubMed

    Baird, Grayson L; Harlow, Lisa L; Machan, Jason T; Thomas, Dave; LaFrance, W C

    2017-08-01

    The present study explored how seizure clusters may be defined for those with psychogenic nonepileptic seizures (PNES), a topic for which there is a paucity of literature. The sample was drawn from a multisite randomized clinical trial for PNES; seizure data are from participants' seizure diaries. Three possible cluster definitions were examined: 1) common clinical definition, where ≥3 seizures in a day is considered a cluster, along with two novel statistical definitions, where ≥3 seizures in a day are considered a cluster if the observed number of seizures statistically exceeds what would be expected relative to a patient's: 1) average seizure rate prior to the trial, 2) observed seizure rate for the previous seven days. Prevalence of clusters was 62-68% depending on cluster definition used, and occurrence rate of clusters was 6-19% depending on cluster definition. Based on these data, clusters seem to be common in patients with PNES, and more research is needed to identify if clusters are related to triggers and outcomes. Copyright © 2017 Elsevier Inc. All rights reserved.
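
    A minimal sketch of the statistical flavour of these definitions: flag ≥3 seizures in a day only if that count is unlikely under the patient's reference rate. The baseline rate and the 0.05 cut-off below are hypothetical, not the trial's exact rule:

      from scipy.stats import poisson

      baseline_rate = 0.8            # mean seizures per day, e.g. pre-trial average
      k = 3                          # seizures observed today

      p_tail = poisson.sf(k - 1, baseline_rate)    # P(X >= k) under the baseline rate
      print(f"P(>= {k} seizures | {baseline_rate}/day) = {p_tail:.3f}")
      print("cluster:", k >= 3 and p_tail < 0.05)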

  13. Contextualization of drug-mediator relations using evidence networks.

    PubMed

    Tran, Hai Joey; Speyer, Gil; Kiefer, Jeff; Kim, Seungchan

    2017-05-31

    Genomic analysis of drug response can provide unique insights into therapies that can be used to match the "right drug to the right patient." However, the process of discovering such therapeutic insights using genomic data is not straightforward and represents an area of active investigation. EDDY (Evaluation of Differential DependencY), a statistical test to detect differential statistical dependencies, is one method that leverages genomic data to identify differential genetic dependencies. EDDY has been used in conjunction with the Cancer Therapeutics Response Portal (CTRP), a dataset with drug-response measurements for more than 400 small molecules, and RNAseq data of cell lines in the Cancer Cell Line Encyclopedia (CCLE) to find potential drug-mediator pairs. Mediators were identified as genes that showed significant change in genetic statistical dependencies within annotated pathways between drug sensitive and drug non-sensitive cell lines, and the results are presented as a public web-portal (EDDY-CTRP). However, the interpretability of drug-mediator pairs currently hinders further exploration of these potentially valuable results. In this study, we address this challenge by constructing evidence networks built with protein and drug interactions from the STITCH and STRING interaction databases. STITCH and STRING are sister databases that catalog known and predicted drug-protein interactions and protein-protein interactions, respectively. Using these two databases, we have developed a method to construct evidence networks to "explain" the relation between a drug and a mediator.  RESULTS: We applied this approach to drug-mediator relations discovered in EDDY-CTRP analysis and identified evidence networks for ~70% of drug-mediator pairs where most mediators were not known direct targets for the drug. Constructed evidence networks enable researchers to contextualize the drug-mediator pair with current research and knowledge. Using evidence networks, we were able to improve the interpretability of the EDDY-CTRP results by linking the drugs and mediators with genes associated with both the drug and the mediator. We anticipate that these evidence networks will help inform EDDY-CTRP results and enhance the generation of important insights to drug sensitivity that will lead to improved precision medicine applications.

  14. Statistical analysis for understanding and predicting battery degradations in real-life electric vehicle use

    NASA Astrophysics Data System (ADS)

    Barré, Anthony; Suard, Frédéric; Gérard, Mathias; Montaru, Maxime; Riu, Delphine

    2014-01-01

    This paper describes the statistical analysis of data parameters recorded during electric vehicle use to characterize electrical battery ageing. These data permit a traditional battery ageing investigation based on the evolution of capacity fade and resistance rise. The measured variables are examined in order to explain the correlation between battery ageing and operating conditions during the experiments, enabling the main ageing factors to be identified. Detailed explorations of statistical dependencies then point to the factors responsible for battery ageing phenomena, and predictive battery ageing models are built from this approach. The results demonstrate and quantify a relationship between the measured variables and global observations of battery ageing, and also allow accurate battery ageing diagnosis through predictive models.

  15. Differences in Performance Among Test Statistics for Assessing Phylogenomic Model Adequacy.

    PubMed

    Duchêne, David A; Duchêne, Sebastian; Ho, Simon Y W

    2018-05-18

    Statistical phylogenetic analyses of genomic data depend on models of nucleotide or amino acid substitution. The adequacy of these substitution models can be assessed using a number of test statistics, allowing the model to be rejected when it is found to provide a poor description of the evolutionary process. A potentially valuable use of model-adequacy test statistics is to identify when data sets are likely to produce unreliable phylogenetic estimates, but their differences in performance are rarely explored. We performed a comprehensive simulation study to identify test statistics that are sensitive to some of the most commonly cited sources of phylogenetic estimation error. Our results show that, for many test statistics, traditional thresholds for assessing model adequacy can fail to reject the model when the phylogenetic inferences are inaccurate and imprecise. This is particularly problematic when analysing loci that have few variable informative sites. We propose new thresholds for assessing substitution model adequacy and demonstrate their effectiveness in analyses of three phylogenomic data sets. These thresholds lead to frequent rejection of the model for loci that yield topological inferences that are imprecise and are likely to be inaccurate. We also propose the use of a summary statistic that provides a practical assessment of overall model adequacy. Our approach offers a promising means of enhancing model choice in genome-scale data sets, potentially leading to improvements in the reliability of phylogenomic inference.

  16. Information-dependent enrichment analysis reveals time-dependent transcriptional regulation of the estrogen pathway of toxicity.

    PubMed

    Pendse, Salil N; Maertens, Alexandra; Rosenberg, Michael; Roy, Dipanwita; Fasani, Rick A; Vantangoli, Marguerite M; Madnick, Samantha J; Boekelheide, Kim; Fornace, Albert J; Odwin, Shelly-Ann; Yager, James D; Hartung, Thomas; Andersen, Melvin E; McMullen, Patrick D

    2017-04-01

    The twenty-first century vision for toxicology involves a transition away from high-dose animal studies to in vitro and computational models (NRC in Toxicity testing in the 21st century: a vision and a strategy, The National Academies Press, Washington, DC, 2007). This transition requires mapping pathways of toxicity by understanding how in vitro systems respond to chemical perturbation. Uncovering transcription factors/signaling networks responsible for gene expression patterns is essential for defining pathways of toxicity, and ultimately, for determining the chemical modes of action through which a toxicant acts. Traditionally, transcription factor identification is achieved via chromatin immunoprecipitation studies and summarized by calculating which transcription factors are statistically associated with up- and downregulated genes. These lists are commonly determined via statistical or fold-change cutoffs, a procedure that is sensitive to statistical power and may not be as useful for determining transcription factor associations. To move away from an arbitrary statistical or fold-change-based cutoff, we developed, in the context of the Mapping the Human Toxome project, an enrichment paradigm called information-dependent enrichment analysis (IDEA) to guide identification of the transcription factor network. We used a test case of activation in MCF-7 cells by 17β estradiol (E2). Using this new approach, we established a time course for transcriptional and functional responses to E2. ERα and ERβ were associated with short-term transcriptional changes in response to E2. Sustained exposure led to recruitment of additional transcription factors and alteration of cell cycle machinery. TFAP2C and SOX2 were the transcription factors most highly correlated with dose. E2F7, E2F1, and Foxm1, which are involved in cell proliferation, were enriched only at 24 h. IDEA should be useful for identifying candidate pathways of toxicity. IDEA outperforms gene set enrichment analysis (GSEA) and provides similar results to weighted gene correlation network analysis, a platform that helps to identify genes not annotated to pathways.

  17. Reynolds number dependence of relative dispersion statistics in isotropic turbulence

    NASA Astrophysics Data System (ADS)

    Sawford, Brian L.; Yeung, P. K.; Hackl, Jason F.

    2008-06-01

    Direct numerical simulation results for a range of relative dispersion statistics over Taylor-scale Reynolds numbers up to 650 are presented in an attempt to observe and quantify inertial subrange scaling and, in particular, Richardson's t³ law. The analysis includes the mean-square separation and a range of important but less-studied differential statistics for which the motion is defined relative to that at time t = 0. It seeks to unambiguously identify and quantify the Richardson scaling by demonstrating convergence with both the Reynolds number and initial separation. According to these criteria, the standard compensated plots for these statistics in inertial subrange scaling show clear evidence of a Richardson range but with an imprecise estimate for the Richardson constant. A modified version of the cube-root plots introduced by Ott and Mann [J. Fluid Mech. 422, 207 (2000)] confirms such convergence. It has been used to yield more precise estimates for Richardson's constant g which decrease with Taylor-scale Reynolds numbers over the range of 140-650. Extrapolation to the large Reynolds number limit gives an asymptotic value for Richardson's constant in the range g = 0.55-0.57, depending on the functional form used to make the extrapolation.
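
    For reference, the standard inertial-subrange statement of Richardson's law and the compensated statistic whose plateau gives the constant g (textbook notation, not necessarily the paper's):

      \langle \Delta^2(t) \rangle \simeq g\,\varepsilon\,t^{3}, \qquad \tau_\eta \ll t \ll T_L,
      \qquad \text{so that} \qquad \langle \Delta^2(t) \rangle / (\varepsilon t^{3}) \to g,

    where Δ(t) is the pair separation, ε the mean energy dissipation rate, τ_η the Kolmogorov time scale, and T_L the integral time scale.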

  18. Probing the Statistical Properties of Unknown Texts: Application to the Voynich Manuscript

    PubMed Central

    Amancio, Diego R.; Altmann, Eduardo G.; Rybski, Diego; Oliveira, Osvaldo N.; Costa, Luciano da F.

    2013-01-01

    While the use of statistical physics methods to analyze large corpora has been useful to unveil many patterns in texts, no comprehensive investigation has been performed on the interdependence between syntactic and semantic factors. In this study we propose a framework for determining whether a text (e.g., written in an unknown alphabet) is compatible with a natural language and to which language it could belong. The approach is based on three types of statistical measurements, i.e. obtained from first-order statistics of word properties in a text, from the topology of complex networks representing texts, and from intermittency concepts where text is treated as a time series. Comparative experiments were performed with the New Testament in 15 different languages and with distinct books in English and Portuguese in order to quantify the dependency of the different measurements on the language and on the story being told in the book. The metrics found to be informative in distinguishing real texts from their shuffled versions include assortativity, degree and selectivity of words. As an illustration, we analyze an undeciphered medieval manuscript known as the Voynich Manuscript. We show that it is mostly compatible with natural languages and incompatible with random texts. We also obtain candidates for keywords of the Voynich Manuscript which could be helpful in the effort of deciphering it. Because we were able to identify statistical measurements that are more dependent on the syntax than on the semantics, the framework may also serve for text analysis in language-dependent applications. PMID:23844002

  19. Probing the statistical properties of unknown texts: application to the Voynich Manuscript.

    PubMed

    Amancio, Diego R; Altmann, Eduardo G; Rybski, Diego; Oliveira, Osvaldo N; Costa, Luciano da F

    2013-01-01

    While the use of statistical physics methods to analyze large corpora has been useful to unveil many patterns in texts, no comprehensive investigation has been performed on the interdependence between syntactic and semantic factors. In this study we propose a framework for determining whether a text (e.g., written in an unknown alphabet) is compatible with a natural language and to which language it could belong. The approach is based on three types of statistical measurements, i.e. obtained from first-order statistics of word properties in a text, from the topology of complex networks representing texts, and from intermittency concepts where text is treated as a time series. Comparative experiments were performed with the New Testament in 15 different languages and with distinct books in English and Portuguese in order to quantify the dependency of the different measurements on the language and on the story being told in the book. The metrics found to be informative in distinguishing real texts from their shuffled versions include assortativity, degree and selectivity of words. As an illustration, we analyze an undeciphered medieval manuscript known as the Voynich Manuscript. We show that it is mostly compatible with natural languages and incompatible with random texts. We also obtain candidates for keywords of the Voynich Manuscript which could be helpful in the effort of deciphering it. Because we were able to identify statistical measurements that are more dependent on the syntax than on the semantics, the framework may also serve for text analysis in language-dependent applications.

  20. Strength of Dislocation Junctions in FCC-monocrystals with a [1̄11] Deformation Axis

    NASA Astrophysics Data System (ADS)

    Kurinnaya, R. I.; Zgolich, M. V.; Starenchenko, V. A.

    2017-07-01

    The paper examines all dislocation reactions implemented in FCC-monocrystals with the deformation axis oriented in the [1̄11] direction. It identifies the fracture stresses of dislocation junctions depending on the intersection geometry of the reacting dislocation loop segments. Estimates are produced for the full spectrum of reacting forest dislocations. The paper presents the statistical data of the research performed and identifies the share of long, strong dislocation junctions capable of limiting the zone of dislocation shift.

  1. Predictive Model for the Design of Zwitterionic Polymer Brushes: A Statistical Design of Experiments Approach.

    PubMed

    Kumar, Ramya; Lahann, Joerg

    2016-07-06

    The performance of polymer interfaces in biology is governed by a wide spectrum of interfacial properties. With the ultimate goal of identifying design parameters for stem cell culture coatings, we developed a statistical model that describes the dependence of brush properties on surface-initiated polymerization (SIP) parameters. Employing a design of experiments (DOE) approach, we identified operating boundaries within which four gel architecture regimes can be realized, including a new regime of associated brushes in thin films. Our statistical model can accurately predict the brush thickness and the degree of intermolecular association of poly[{2-(methacryloyloxy) ethyl} dimethyl-(3-sulfopropyl) ammonium hydroxide] (PMEDSAH), a previously reported synthetic substrate for feeder-free and xeno-free culture of human embryonic stem cells. DOE-based multifunctional predictions offer a powerful quantitative framework for designing polymer interfaces. For example, model predictions can be used to decrease the critical thickness at which the wettability transition occurs by simply increasing the catalyst quantity from 1 to 3 mol %.

  2. Multidimensional effects in nonadiabatic statistical theories of spin-forbidden kinetics. A case study of ³O + CO → CO₂

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jasper, Ahren

    2015-04-14

    The appropriateness of treating crossing seams of electronic states of different spins as nonadiabatic transition states in statistical calculations of spin-forbidden reaction rates is considered. We show that the spin-forbidden reaction coordinate, the nuclear coordinate perpendicular to the crossing seam, is coupled to the remaining nuclear degrees of freedom. We found that this coupling gives rise to multidimensional effects that are not typically included in statistical treatments of spin-forbidden kinetics. Three qualitative categories of multidimensional effects may be identified: static multidimensional effects due to the geometry-dependence of the local shape of the crossing seam and of the spin–orbit coupling, dynamical multidimensional effects due to energy exchange with the reaction coordinate during the seam crossing, and nonlocal (history-dependent) multidimensional effects due to interference of the electronic variables at second, third, and later seam crossings. Nonlocal multidimensional effects are intimately related to electronic decoherence, where electronic dephasing acts to erase the history of the system. A semiclassical model based on short-time full-dimensional trajectories that includes all three multidimensional effects as well as a model for electronic decoherence is presented. The results of this multidimensional nonadiabatic statistical theory (MNST) for the ³O + CO → CO₂ reaction are compared with the results of statistical theories employing one-dimensional (Landau–Zener and weak coupling) models for the transition probability and with those calculated previously using multistate trajectories. The MNST method is shown to accurately reproduce the multistate decay-of-mixing trajectory results, so long as consistent thresholds are used. Furthermore, the MNST approach has several advantages over multistate trajectory approaches and is more suitable in chemical kinetics calculations at low temperatures and for complex systems. The error in statistical calculations that neglect multidimensional effects is shown to be as large as a factor of 2 for this system, with static multidimensional effects identified as the largest source of error.

  3. On system behaviour using complex networks of a compression algorithm

    NASA Astrophysics Data System (ADS)

    Walker, David M.; Correa, Debora C.; Small, Michael

    2018-01-01

    We construct complex networks of scalar time series using a data compression algorithm. The structure and statistics of the resulting networks can be used to help characterize complex systems, and one property, in particular, appears to be a useful discriminating statistic in surrogate data hypothesis tests. We demonstrate these ideas on systems with known dynamical behaviour and also show that our approach is capable of identifying behavioural transitions within electroencephalogram recordings as well as changes due to a bifurcation parameter of a chaotic system. The technique we propose is dependent on a coarse grained quantization of the original time series and therefore provides potential for a spatial scale-dependent characterization of the data. Finally the method is as computationally efficient as the underlying compression algorithm and provides a compression of the salient features of long time series.

  4. On the Use of Biomineral Oxygen Isotope Data to Identify Human Migrants in the Archaeological Record: Intra-Sample Variation, Statistical Methods and Geographical Considerations

    PubMed Central

    Lightfoot, Emma; O’Connell, Tamsin C.

    2016-01-01

    Oxygen isotope analysis of archaeological skeletal remains is an increasingly popular tool to study past human migrations. It is based on the assumption that human body chemistry preserves the δ18O of precipitation in such a way as to be a useful technique for identifying migrants and, potentially, their homelands. In this study, the first such global survey, we draw on published human tooth enamel and bone bioapatite data to explore the validity of using oxygen isotope analyses to identify migrants in the archaeological record. We use human δ18O results to show that there are large variations in human oxygen isotope values within a population sample. This may relate to physiological factors influencing the preservation of the primary isotope signal, or to human activities (such as brewing, boiling, stewing, differential access to water sources and so on) causing variation in ingested water and food isotope values. We compare the number of outliers identified using various statistical methods. We determine that the most appropriate method for identifying migrants is dependent on the data but is likely to be the IQR or median absolute deviation from the median under most archaeological circumstances. Finally, through a spatial assessment of the dataset, we show that the degree of overlap in human isotope values from different locations across Europe is such that identifying individuals’ homelands on the basis of oxygen isotope analysis alone is not possible for the regions analysed to date. Oxygen isotope analysis is a valid method for identifying first-generation migrants from an archaeological site when used appropriately; however, it is difficult to identify migrants using statistical methods for a sample size of less than c. 25 individuals. In the absence of local previous analyses, each sample should be treated as an individual dataset and statistical techniques can be used to identify migrants, but in most cases pinpointing a specific homeland should not be attempted. PMID:27124001
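
    The two outlier rules recommended above are simple enough to show concretely; the sketch below flags outlying δ18O values with the interquartile-range and median-absolute-deviation criteria, using made-up enamel values and conventional cutoffs (1.5×IQR, 3 robust SDs) that are assumptions rather than the thresholds used in the paper.

```python
import numpy as np

def outliers_iqr(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

def outliers_mad(values, k=3.0):
    """Flag values more than k robust SDs from the median.

    The 1.4826 factor scales the MAD to be consistent with the
    standard deviation under normality.
    """
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    mad = 1.4826 * np.median(np.abs(values - med))
    return np.abs(values - med) > k * mad

# Hypothetical tooth-enamel d18O values for one site (permil).
d18o = [26.1, 25.8, 26.4, 25.9, 26.0, 27.9, 26.2, 25.7]
print(outliers_iqr(d18o))
print(outliers_mad(d18o))
```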

  5. Time-dynamics of the two-color emission from vertical-external-cavity surface-emitting lasers

    NASA Astrophysics Data System (ADS)

    Chernikov, A.; Wichmann, M.; Shakfa, M. K.; Scheller, M.; Moloney, J. V.; Koch, S. W.; Koch, M.

    2012-01-01

    The temporal stability of a two-color vertical-external-cavity surface-emitting laser is studied using single-shot streak-camera measurements. The collected data is evaluated via quantitative statistical analysis schemes. Dynamically stable and unstable regions for the two-color operation are identified and the dependence on the pump conditions is analyzed.

  6. Automation method to identify the geological structure of seabed using spatial statistic analysis of echo sounding data

    NASA Astrophysics Data System (ADS)

    Kwon, O.; Kim, W.; Kim, J.

    2017-12-01

    Construction of subsea tunnels has increased globally in recent years. For safe construction of a subsea tunnel, identifying geological structures, including faults, at the design and construction stages is critically important. Unlike tunnels on land, however, data on the geological structure of the seabed are difficult to obtain because of the limits of offshore geological surveys. This study addresses that difficulty by developing a technology to identify the geological structure of the seabed automatically from echo sounding data. When investigating a potential site for a deep subsea tunnel, boreholes and geophysical investigations face technical and economic limits; echo sounding data, by contrast, are easily obtainable and their reliability is high compared with the above approaches. The study therefore aims to develop an algorithm that identifies large-scale geological structures of the seabed using a geostatistical approach, building on the structural-geology principle that topographic features indicate geological structure. The basic concept of the algorithm is as follows: (1) convert the seabed topography to grid data using echo sounding data, (2) apply a moving window of optimal size to the grid data, (3) estimate the spatial statistics of the grid data within the window area, (4) set a percentile standard for the spatial statistics, (5) display the values satisfying the standard on the map, and (6) visualize the geological structure on the map. The important elements of this study are the optimal size of the moving window, the choice of spatial statistics, and the determination of the optimal percentile standard; numerous simulations were carried out to determine these optimal elements. A user program based on R was then developed around the optimal analysis algorithm. The program is designed to show the variations of various spatial statistics, making it easy to analyse the geological structure as the type of spatial statistic and the percentile standard are changed. This research was supported by the Korea Agency for Infrastructure Technology Advancement under the Ministry of Land, Infrastructure and Transport of the Korean government (Project Number: 13 Construction Research T01).
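
    A minimal Python sketch of the six-step workflow described above is given below (the authors' implementation is in R); the window size, the choice of roughness statistic, and the 95th-percentile threshold are illustrative assumptions, not the optimal values the study determined by simulation.

```python
import numpy as np

def moving_window_statistic(grid, window=7, stat=np.std):
    """Compute a spatial statistic of gridded bathymetry in a sliding window.

    grid   : 2-D array of gridded echo-sounding depths (step 1)
    window : odd window size in grid cells (step 2)
    stat   : summary statistic applied to each window, e.g. np.std (step 3)
    """
    half = window // 2
    out = np.full(grid.shape, np.nan)
    for i in range(half, grid.shape[0] - half):
        for j in range(half, grid.shape[1] - half):
            out[i, j] = stat(grid[i - half:i + half + 1, j - half:j + half + 1])
    return out

def flag_lineaments(stat_grid, percentile=95):
    """Keep cells whose statistic exceeds the chosen percentile standard (steps 4-5)."""
    threshold = np.nanpercentile(stat_grid, percentile)
    return stat_grid >= threshold

# Toy bathymetry with a linear "fault-like" step to show the idea.
depths = np.tile(np.linspace(-50, -60, 100), (100, 1))
depths[:, 50:] -= 5.0
roughness = moving_window_statistic(depths, window=7)
candidate = flag_lineaments(roughness, percentile=95)   # map these cells in step 6
print(int(np.nansum(candidate)))
```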

  7. DEPEND - A design environment for prediction and evaluation of system dependability

    NASA Technical Reports Server (NTRS)

    Goswami, Kumar K.; Iyer, Ravishankar K.

    1990-01-01

    The development of DEPEND, an integrated simulation environment for the design and dependability analysis of fault-tolerant systems, is described. DEPEND models both hardware and software components at a functional level, and allows automatic failure injection to assess system performance and reliability. It relieves the user of the work needed to inject failures, maintain statistics, and output reports. The automatic failure injection scheme is geared toward evaluating a system under high stress (workload) conditions. The failures that are injected can affect both hardware and software components. To illustrate the capability of the simulator, a distributed system which employs a prediction-based, dynamic load-balancing heuristic is evaluated. Experiments were conducted to determine the impact of failures on system performance and to identify the failures to which the system is especially susceptible.

  8. The Return of Rate Dependence.

    PubMed

    Quisenberry, Amanda J; Snider, Sarah E; Bickel, Warren K

    2016-11-01

    Rate dependence, a well-known phenomenon in behavioral pharmacology, appears to have declined as a topic of interest, perhaps as a result of being viewed as pertinent only to the preclinical investigation of drugs on schedule-controlled performance. Obstacles to data interpretation due to conflation with regression to the mean also appear to have contributed to the topic's decline. Despite this reduction in exposure, rate dependence is a useful concept and tool that can be used to determine sources of variability, predict therapeutic outcomes, and identify individuals that are most likely to respond therapeutically. Armed with new statistical methods and an understanding of the broad range of conditions under which rate dependence can be observed, we urge researchers to revisit the concept, to use the appropriate analysis methods, and to design empirical studies a priori to further explore rate dependence.

  9. Statistical Learning of Probabilistic Nonadjacent Dependencies by Multiple-Cue Integration

    ERIC Educational Resources Information Center

    van den Bos, Esther; Christiansen, Morten H.; Misyak, Jennifer B.

    2012-01-01

    Previous studies have indicated that dependencies between nonadjacent elements can be acquired by statistical learning when each element predicts only one other element (deterministic dependencies). The present study investigates statistical learning of probabilistic nonadjacent dependencies, in which each element predicts several other elements…

  10. Enhancing the mathematical properties of new haplotype homozygosity statistics for the detection of selective sweeps.

    PubMed

    Garud, Nandita R; Rosenberg, Noah A

    2015-06-01

    Soft selective sweeps represent an important form of adaptation in which multiple haplotypes bearing adaptive alleles rise to high frequency. Most statistical methods for detecting selective sweeps from genetic polymorphism data, however, have focused on identifying hard selective sweeps in which a favored allele appears on a single haplotypic background; these methods might be underpowered to detect soft sweeps. Among exceptions is the set of haplotype homozygosity statistics introduced for the detection of soft sweeps by Garud et al. (2015). These statistics, examining frequencies of multiple haplotypes in relation to each other, include H12, a statistic designed to identify both hard and soft selective sweeps, and H2/H1, a statistic that, conditional on high H12 values, seeks to distinguish between hard and soft sweeps. A challenge in the use of H2/H1 is that its range depends on the associated value of H12, so that equal H2/H1 values might provide different levels of support for a soft sweep model at different values of H12. Here, we enhance the H12 and H2/H1 haplotype homozygosity statistics for selective sweep detection by deriving the upper bound on H2/H1 as a function of H12, thereby generating a statistic that normalizes H2/H1 to lie between 0 and 1. Through a reanalysis of resequencing data from inbred lines of Drosophila, we show that the enhanced statistic both strengthens interpretations obtained with the unnormalized statistic and leads to empirical insights that are less readily apparent without the normalization. Copyright © 2015 Elsevier Inc. All rights reserved.
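
    To make the definitions concrete, the sketch below computes H1, H12 and H2/H1 from a vector of haplotype counts (H12 pools the two most frequent haplotypes; H2 drops the most frequent one). It does not include the H12-dependent upper bound used for the normalization described above, and the example counts are invented.

```python
import numpy as np

def garud_h_statistics(haplotype_counts):
    """Haplotype homozygosity statistics in the style of Garud et al. (2015).

    haplotype_counts : counts of each distinct haplotype in the sample.
    Returns (H1, H12, H2/H1).
    """
    p = np.sort(np.asarray(haplotype_counts, dtype=float))[::-1]
    p = p / p.sum()                     # haplotype frequencies, descending
    h1 = np.sum(p**2)                   # standard haplotype homozygosity
    h12 = (p[0] + p[1])**2 + np.sum(p[2:]**2) if p.size > 1 else h1
    h2 = h1 - p[0]**2                   # homozygosity without the top haplotype
    return h1, h12, h2 / h1

print(garud_h_statistics([40, 35, 10, 8, 4, 2, 1]))   # soft-sweep-like pattern
print(garud_h_statistics([80, 5, 5, 4, 3, 2, 1]))     # hard-sweep-like pattern
```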

  11. Magnetosheath plasma stability and ULF wave occurrence as a function of location in the magnetosheath and upstream bow shock parameters

    NASA Astrophysics Data System (ADS)

    Soucek, Jan; Escoubet, C. Philippe; Grison, Benjamin

    2015-04-01

    We present the results of a statistical study of the distribution of mirror and Alfvén-ion cyclotron (AIC) waves in the magnetosheath together with plasma parameters important for the stability of ULF waves, specifically ion temperature anisotropy and ion beta. Magnetosheath crossings registered by Cluster spacecraft over the course of 2 years served as a basis for the statistics. For each observation we used bow shock, magnetopause, and magnetosheath flow models to identify the relative position of the spacecraft with respect to magnetosheath boundaries and local properties of the upstream shock crossing. A strong dependence of both plasma parameters and mirror/AIC wave occurrence on upstream ΘBn and MA is identified. We analyzed a joint dependence of the same parameters on ΘBn and fractional distance between shock and magnetopause, zenith angle, and length of the flow line. Finally, the occurrence of mirror and AIC modes was compared against the respective instability thresholds. We noted that AIC waves occurred nearly exclusively under mirror stable conditions. This is interpreted in terms of different characters of nonlinear saturation of the two modes.

  12. Narrowing the scope of failure prediction using targeted fault load injection

    NASA Astrophysics Data System (ADS)

    Jordan, Paul L.; Peterson, Gilbert L.; Lin, Alan C.; Mendenhall, Michael J.; Sellers, Andrew J.

    2018-05-01

    As society becomes more dependent upon computer systems to perform increasingly critical tasks, ensuring that those systems do not fail becomes increasingly important. Many organizations depend heavily on desktop computers for day-to-day operations. Unfortunately, the software that runs on these computers is written by humans and, as such, is still subject to human error and consequent failure. A natural solution is to use statistical machine learning to predict failure. However, since failure is still a relatively rare event, obtaining labelled training data to train these models is not a trivial task. This work presents new simulated fault-inducing loads that extend the focus of traditional fault injection techniques to predict failure in the Microsoft enterprise authentication service and Apache web server. These new fault loads were successful in creating failure conditions that were identifiable using statistical learning methods, with fewer irrelevant faults being created.

  13. Two-Dimensional Hermite Filters Simplify the Description of High-Order Statistics of Natural Images.

    PubMed

    Hu, Qin; Victor, Jonathan D

    2016-09-01

    Natural image statistics play a crucial role in shaping biological visual systems, understanding their function and design principles, and designing effective computer-vision algorithms. High-order statistics are critical for conveying local features, but they are challenging to study - largely because their number and variety is large. Here, via the use of two-dimensional Hermite (TDH) functions, we identify a covert symmetry in high-order statistics of natural images that simplifies this task. This emerges from the structure of TDH functions, which are an orthogonal set of functions that are organized into a hierarchy of ranks. Specifically, we find that the shape (skewness and kurtosis) of the distribution of filter coefficients depends only on the projection of the function onto a 1-dimensional subspace specific to each rank. The characterization of natural image statistics provided by TDH filter coefficients reflects both their phase and amplitude structure, and we suggest an intuitive interpretation for the special subspace within each rank.

  14. SAFER, an Analysis Method of Quantitative Proteomic Data, Reveals New Interactors of the C. elegans Autophagic Protein LGG-1.

    PubMed

    Yi, Zhou; Manil-Ségalen, Marion; Sago, Laila; Glatigny, Annie; Redeker, Virginie; Legouis, Renaud; Mucchielli-Giorgi, Marie-Hélène

    2016-05-06

    Affinity purifications followed by mass spectrometric analysis are used to identify protein-protein interactions. Because quantitative proteomic data are noisy, it is necessary to develop statistical methods to eliminate false-positives and identify true partners. We present here a novel approach for filtering false interactors, named "SAFER" for mass Spectrometry data Analysis by Filtering of Experimental Replicates, which is based on the reproducibility of the replicates and the fold-change of the protein intensities between bait and control. To identify regulators or targets of autophagy, we characterized the interactors of LGG-1, a ubiquitin-like protein involved in autophagosome formation in C. elegans. LGG-1 partners were purified by affinity, analyzed by nanoLC-MS/MS mass spectrometry, and quantified by a label-free proteomic approach based on the mass spectrometric signal intensity of peptide precursor ions. Because the selection of confident interactions depends on the method used for statistical analysis, we compared SAFER with several statistical tests and different scoring algorithms on this set of data. We show that SAFER recovers high-confidence interactors that have been ignored by the other methods and identified new candidates involved in the autophagy process. We further validated our method on a public data set and conclude that SAFER notably improves the identification of protein interactors.
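
    The published SAFER procedure is not reproduced here, but a toy filter in the same spirit, keeping proteins that are detected reproducibly across bait replicates and enriched over the control by a minimum fold-change, illustrates the idea; the thresholds, function name, and matrix layout are assumptions.

```python
import numpy as np

def filter_interactors(bait, control, min_fold=2.0, min_detected=3):
    """Toy replicate-and-fold-change filter inspired by methods such as SAFER.

    bait, control : (n_proteins, n_replicates) label-free intensity matrices.
    Keeps proteins detected in at least `min_detected` bait replicates whose
    mean bait/control intensity ratio exceeds `min_fold`. The thresholds are
    illustrative, not those of the published method.
    """
    bait = np.asarray(bait, dtype=float)
    control = np.asarray(control, dtype=float)
    detected = (bait > 0).sum(axis=1) >= min_detected            # reproducibility
    fold = bait.mean(axis=1) / np.maximum(control.mean(axis=1), 1e-9)
    return detected & (fold >= min_fold)

# Hypothetical intensities for four proteins, three replicates each.
bait = [[9e5, 8e5, 7e5], [0, 2e4, 0], [5e5, 6e5, 4e5], [1e4, 0, 2e4]]
control = [[1e5, 2e5, 1e5], [1e4, 1e4, 2e4], [4e5, 5e5, 6e5], [0, 0, 1e4]]
print(filter_interactors(bait, control))   # True for confidently enriched proteins
```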

  15. Modeling Protein Expression and Protein Signaling Pathways

    PubMed Central

    Telesca, Donatello; Müller, Peter; Kornblau, Steven M.; Suchard, Marc A.; Ji, Yuan

    2015-01-01

    High-throughput functional proteomic technologies provide a way to quantify the expression of proteins of interest. Statistical inference centers on identifying the activation state of proteins and their patterns of molecular interaction formalized as dependence structure. Inference on dependence structure is particularly important when proteins are selected because they are part of a common molecular pathway. In that case, inference on dependence structure reveals properties of the underlying pathway. We propose a probability model that represents molecular interactions at the level of hidden binary latent variables that can be interpreted as indicators for active versus inactive states of the proteins. The proposed approach exploits available expert knowledge about the target pathway to define an informative prior on the hidden conditional dependence structure. An important feature of this prior is that it provides an instrument to explicitly anchor the model space to a set of interactions of interest, favoring a local search approach to model determination. We apply our model to reverse-phase protein array data from a study on acute myeloid leukemia. Our inference identifies relevant subpathways in relation to the unfolding of the biological process under study. PMID:26246646

  16. In situ statistical observations of EMIC waves by Arase satellite

    NASA Astrophysics Data System (ADS)

    Nomura, R.; Matsuoka, A.; Teramoto, M.; Nose, M.; Yoshizumi, M.; Fujimoto, A.; Shinohara, M.; Tanaka, Y.

    2017-12-01

    We present an in situ statistical survey of electromagnetic ion cyclotron (EMIC) waves observed by the Arase satellite from 3 March to 16 July 2017. We identified 64 events using the fluxgate magnetometer (MGF) on the satellite. EMIC waves are a key phenomenon for understanding the loss dynamics of MeV-energy electrons in the radiation belt. We will show the radial and latitudinal dependence of the wave occurrence rate and the wave parameters (frequency band, coherence, polarization, and ellipticity). In particular, EMIC waves observed in regions of localized weak background magnetic field will be discussed in relation to the wave excitation mechanism in the deep inner magnetosphere.

  17. Defining Face Perception Areas in the Human Brain: A Large-Scale Factorial fMRI Face Localizer Analysis

    ERIC Educational Resources Information Center

    Rossion, Bruno; Hanseeuw, Bernard; Dricot, Laurence

    2012-01-01

    A number of human brain areas showing a larger response to faces than to objects from different categories, or to scrambled faces, have been identified in neuroimaging studies. Depending on the statistical criteria used, the set of areas can be overextended or minimized, both at the local (size of areas) and global (number of areas) levels. Here…

  18. A critique of the usefulness of inferential statistics in applied behavior analysis

    PubMed Central

    Hopkins, B. L.; Cole, Brian L.; Mason, Tina L.

    1998-01-01

    Researchers continue to recommend that applied behavior analysts use inferential statistics in making decisions about effects of independent variables on dependent variables. In many other approaches to behavioral science, inferential statistics are the primary means for deciding the importance of effects. Several possible uses of inferential statistics are considered. Rather than being an objective means for making decisions about effects, as is often claimed, inferential statistics are shown to be subjective. It is argued that the use of inferential statistics adds nothing to the complex and admittedly subjective nonstatistical methods that are often employed in applied behavior analysis. Attacks on inferential statistics that are being made, perhaps with increasing frequency, by those who are not behavior analysts, are discussed. These attackers are calling for banning the use of inferential statistics in research publications and commonly recommend that behavioral scientists should switch to using statistics aimed at interval estimation or the method of confidence intervals. Interval estimation is shown to be contrary to the fundamental assumption of behavior analysis that only individuals behave. It is recommended that authors who wish to publish the results of inferential statistics be asked to justify them as a means for helping us to identify any ways in which they may be useful. PMID:22478304

  19. Modelling nitrate pollution pressure using a multivariate statistical approach: the case of Kinshasa groundwater body, Democratic Republic of Congo

    NASA Astrophysics Data System (ADS)

    Mfumu Kihumba, Antoine; Ndembo Longo, Jean; Vanclooster, Marnik

    2016-03-01

    A multivariate statistical modelling approach was applied to explain the anthropogenic pressure of nitrate pollution on the Kinshasa groundwater body (Democratic Republic of Congo). Multiple regression and regression tree models were compared and used to identify major environmental factors that control the groundwater nitrate concentration in this region. The analyses were made in terms of physical attributes related to the topography, land use, geology and hydrogeology in the capture zone of different groundwater sampling stations. For the nitrate data, groundwater datasets from two different surveys were used. The statistical models identified the topography, the residential area, the service land (cemetery), and the surface-water land-use classes as major factors explaining nitrate occurrence in the groundwater. Also, groundwater nitrate pollution depends not on one single factor but on the combined influence of factors representing nitrogen loading sources and aquifer susceptibility characteristics. The groundwater nitrate pressure was better predicted with the regression tree model than with the multiple regression model. Furthermore, the results elucidated the sensitivity of the model performance towards the method of delineation of the capture zones. For pollution modelling at the monitoring points, therefore, it is better to identify capture-zone shapes based on a conceptual hydrogeological model rather than to adopt arbitrary circular capture zones.

  20. Use of Statistical Analyses in the Ophthalmic Literature

    PubMed Central

    Lisboa, Renato; Meira-Freitas, Daniel; Tatham, Andrew J.; Marvasti, Amir H.; Sharpsten, Lucie; Medeiros, Felipe A.

    2014-01-01

    Purpose: To identify the most commonly used statistical analyses in the ophthalmic literature and to determine the likely gain in comprehension of the literature that readers could expect if they were to sequentially add knowledge of more advanced techniques to their statistical repertoire. Design: Cross-sectional study. Methods: All articles published from January 2012 to December 2012 in Ophthalmology, American Journal of Ophthalmology and Archives of Ophthalmology were reviewed. A total of 780 peer-reviewed articles were included. Two reviewers examined each article and assigned categories to each one depending on the type of statistical analyses used. Discrepancies between reviewers were resolved by consensus. Main Outcome Measures: Total number and percentage of articles containing each category of statistical analysis were obtained. Additionally we estimated the accumulated number and percentage of articles that a reader would be expected to be able to interpret depending on their statistical repertoire. Results: Readers with little or no statistical knowledge would be expected to be able to interpret the statistical methods presented in only 20.8% of articles. In order to understand more than half (51.4%) of the articles published, readers were expected to be familiar with at least 15 different statistical methods. Knowledge of 21 categories of statistical methods was necessary to comprehend 70.9% of articles, while knowledge of more than 29 categories was necessary to comprehend more than 90% of articles. Articles in retina and glaucoma subspecialties showed a tendency for using more complex analysis when compared to cornea. Conclusions: Readers of clinical journals in ophthalmology need to have substantial knowledge of statistical methodology to understand the results of published studies in the literature. The frequency of use of complex statistical analyses also indicates that those involved in the editorial peer-review process must have sound statistical knowledge in order to critically appraise articles submitted for publication. The results of this study could provide guidance to direct the statistical learning of clinical ophthalmologists, researchers and educators involved in the design of courses for residents and medical students. PMID:24612977

  1. An Evaluation of the Euroncap Crash Test Safety Ratings in the Real World

    PubMed Central

    Segui-Gomez, Maria; Lopez-Valdes, Francisco J.; Frampton, Richard

    2007-01-01

    We investigated whether the rating obtained in the EuroNCAP test procedures correlates with injury protection to vehicle occupants in real crashes using data in the UK Cooperative Crash Injury Study (CCIS) database from 1996 to 2005. Multivariate Poisson regression models were developed, using the Abbreviated Injury Scale (AIS) score by body region as the dependent variable and the EuroNCAP score for that particular body region, seat belt use, mass ratio and Equivalent Test Speed (ETS) as independent variables. Our models identified statistically significant relationships between injury severity and safety belt use, mass ratio and ETS. We could not identify any statistically significant relationships between the EuroNCAP body region scores and real injury outcome except for the protection to pelvis-femur-knee in frontal impacts where scoring “green” is significantly better than scoring “yellow” or “red”.
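
    As an illustration of the modelling approach described above, the sketch below fits a Poisson regression of a body-region AIS score on a EuroNCAP body-region score, belt use, mass ratio and ETS; the data frame, column names, and the handful of rows are hypothetical, included purely to make the example run.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical occupant-level records; column names are illustrative only.
df = pd.DataFrame({
    "ais_head":      [0, 1, 2, 0, 3, 1, 0, 2],
    "euroncap_head": [3, 2, 1, 4, 1, 2, 4, 1],   # body-region score
    "belted":        [1, 1, 0, 1, 0, 1, 1, 0],
    "mass_ratio":    [1.0, 0.8, 1.2, 1.1, 0.7, 0.9, 1.0, 1.3],
    "ets_kmh":       [40, 55, 65, 35, 70, 50, 30, 60],
})

X = sm.add_constant(df[["euroncap_head", "belted", "mass_ratio", "ets_kmh"]])
model = sm.GLM(df["ais_head"], X, family=sm.families.Poisson()).fit()
print(model.summary())   # coefficients give the direction and significance of each predictor
```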

  2. Autocorrelation and cross-correlation in time series of homicide and attempted homicide

    NASA Astrophysics Data System (ADS)

    Machado Filho, A.; da Silva, M. F.; Zebende, G. F.

    2014-04-01

    We propose in this paper to establish the relationship between homicides and attempted homicides by a non-stationary time-series analysis. This analysis is carried out by Detrended Fluctuation Analysis (DFA), Detrended Cross-Correlation Analysis (DCCA), and the DCCA cross-correlation coefficient, ρ(n). Through this analysis we can identify a positive cross-correlation between homicides and attempted homicides. At the same time, looked at from the point of view of autocorrelation (DFA), this analysis can be more informative depending on the time scale. For short time scales (days), we cannot identify autocorrelations; on the scale of weeks, DFA presents anti-persistent behavior; and for long time scales (n>90 days), DFA presents persistent behavior. Finally, the application of this new type of statistical analysis proved to be efficient and, in this sense, this paper can contribute to more accurate descriptive statistics of crime.
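
    A compact sketch of the DCCA cross-correlation coefficient ρ(n) is given below. It uses non-overlapping boxes and linear detrending, whereas published implementations typically use overlapping boxes, and the two toy count series merely stand in for the homicide and attempted-homicide data.

```python
import numpy as np

def _detrended_residuals(profile, n):
    """Residuals of a linear fit in consecutive non-overlapping boxes of size n."""
    n_boxes = len(profile) // n
    t = np.arange(n)
    res = []
    for b in range(n_boxes):
        seg = profile[b * n:(b + 1) * n]
        coef = np.polyfit(t, seg, 1)
        res.append(seg - np.polyval(coef, t))
    return np.concatenate(res)

def dcca_coefficient(x, y, n):
    """DCCA cross-correlation coefficient rho(n): F2_DCCA / (F_DFA(x) * F_DFA(y))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.cumsum(x - x.mean())           # integrated profiles
    Y = np.cumsum(y - y.mean())
    rx = _detrended_residuals(X, n)
    ry = _detrended_residuals(Y, n)
    f2_xy = np.mean(rx * ry)              # detrended covariance
    f_x = np.sqrt(np.mean(rx**2))         # DFA fluctuation functions
    f_y = np.sqrt(np.mean(ry**2))
    return f2_xy / (f_x * f_y)

rng = np.random.default_rng(0)
homicides = rng.poisson(5, 1000).astype(float)
attempts = homicides + rng.poisson(2, 1000)   # correlated toy series
print(dcca_coefficient(homicides, attempts, n=30))
```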

  3. Identifying the impact of social determinants of health on disease rates using correlation analysis of area-based summary information.

    PubMed

    Song, Ruiguang; Hall, H Irene; Harrison, Kathleen McDavid; Sharpe, Tanya Telfair; Lin, Lillian S; Dean, Hazel D

    2011-01-01

    We developed a statistical tool that brings together standard, accessible, and well-understood analytic approaches and uses area-based information and other publicly available data to identify social determinants of health (SDH) that significantly affect the morbidity of a specific disease. We specified AIDS as the disease of interest and used data from the American Community Survey and the National HIV Surveillance System. Morbidity and socioeconomic variables in the two data systems were linked through geographic areas that can be identified in both systems. Correlation and partial correlation coefficients were used to measure the impact of socioeconomic factors on AIDS diagnosis rates in certain geographic areas. We developed an easily explained approach that can be used by a data analyst with access to publicly available datasets and standard statistical software to identify the impact of SDH. We found that the AIDS diagnosis rate was highly correlated with the distribution of race/ethnicity, population density, and marital status in an area. The impact of poverty, education level, and unemployment depended on other SDH variables. Area-based measures of socioeconomic variables can be used to identify risk factors associated with a disease of interest. When correlation analysis is used to identify risk factors, potential confounding from other variables must be taken into account.
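
    The core of the approach, an ordinary correlation followed by a partial correlation that controls for other area-based variables, can be sketched in a few lines; the simulated area-level variables below are placeholders for the American Community Survey and surveillance data, not real values.

```python
import numpy as np
from scipy import stats

def partial_correlation(x, y, covariates):
    """Correlation between x and y after regressing out the covariates."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    Z = np.column_stack([np.ones(len(x))] + [np.asarray(c, float) for c in covariates])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return stats.pearsonr(rx, ry)

# Hypothetical area-level data aggregated over the same geographic units.
rng = np.random.default_rng(1)
density = rng.uniform(100, 5000, 200)
poverty = rng.uniform(5, 40, 200)
aids_rate = 0.002 * density + 0.1 * poverty + rng.normal(0, 2, 200)

print(stats.pearsonr(aids_rate, poverty))                  # crude correlation
print(partial_correlation(aids_rate, poverty, [density]))  # adjusted for density
```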

  4. Use of AUDIT-based measures to identify unhealthy alcohol use and alcohol dependence in primary care: a validation study.

    PubMed

    Johnson, J Aaron; Lee, Anna; Vinson, Daniel; Seale, J Paul

    2013-01-01

    As programs for screening, brief intervention, and referral to treatment (SBIRT) for unhealthy alcohol use disseminate, evidence-based approaches for identifying patients with unhealthy alcohol use and alcohol dependence (AD) are needed. While the National Institute on Alcohol Abuse and Alcoholism Clinician Guide suggests use of a single alcohol screening question (SASQ) for screening and Diagnostic and Statistical Manual checklists for assessment, many SBIRT programs use alcohol use disorders identification test (AUDIT) "zones" for screening and assessment. Validation data for these zones are limited. This study used primary care data from a bi-ethnic southern U.S. population to examine the ability of the AUDIT zones and other AUDIT-based approaches to identify unhealthy alcohol use and dependence. Existing data were analyzed from interviews with 625 female and male adult drinkers presenting to 5 southeastern primary care practices. Timeline follow-back was used to identify at-risk drinking, and diagnostic interview schedule was used to identify alcohol abuse and dependence. Validity measures compared performance of AUDIT, AUDIT-C, and AUDIT dependence domains scores, with and without a 30-day binge drinking measure, for detecting unhealthy alcohol use and dependence. Optimal AUDIT scores for detecting unhealthy alcohol use were lower than current commonly used cutoffs (5 for men, 3 for women). Improved performance was obtained by combining AUDIT cutoffs of 6 for men and 4 for women with a 30-day binge drinking measure. AUDIT scores of 15 for men and 13 for women detected AD with 100% specificity but low sensitivity (20 and 18%, respectively). AUDIT dependence subscale scores of 2 or more showed similar specificity (99%) and slightly higher sensitivity (31% for men, 24% for women). Combining lower AUDIT cutoff scores and binge drinking measures may increase the detection of unhealthy alcohol use in primary care. Use of lower cutoff scores and dependence subscale scores may increase diagnosis of AD; however, better measures for detecting dependence are needed. Copyright © 2012 by the Research Society on Alcoholism.
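
    Evaluating a cutoff, with or without the supplementary 30-day binge-drinking item, comes down to sensitivity and specificity against the reference standard; the helper below is a generic sketch of that calculation and of the OR-combination rule, not the study's analysis code, and the simulated data are invented.

```python
import numpy as np

def screen_performance(scores, condition, cutoff, binge=None):
    """Sensitivity and specificity of an AUDIT-style cutoff.

    scores    : AUDIT (or AUDIT-C) scores
    condition : boolean reference standard (e.g. at-risk drinking or dependence)
    cutoff    : screen is positive when score >= cutoff
    binge     : optional boolean 30-day binge indicator; if given, the screen is
                positive when score >= cutoff OR binge is True (combined rule)
    """
    scores = np.asarray(scores)
    condition = np.asarray(condition, dtype=bool)
    positive = scores >= cutoff
    if binge is not None:
        positive = positive | np.asarray(binge, dtype=bool)
    sensitivity = np.mean(positive[condition])
    specificity = np.mean(~positive[~condition])
    return sensitivity, specificity

# Hypothetical screening data for illustration only.
rng = np.random.default_rng(0)
audit = rng.integers(0, 20, 300)
at_risk = (audit + rng.integers(-3, 4, 300)) > 7     # imperfect reference standard
binge30 = rng.random(300) < 0.15
print(screen_performance(audit, at_risk, cutoff=6, binge=binge30))
print(screen_performance(audit, at_risk, cutoff=8))
```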

  5. Multifactor-Dimensionality Reduction Reveals High-Order Interactions among Estrogen-Metabolism Genes in Sporadic Breast Cancer

    PubMed Central

    Ritchie, Marylyn D.; Hahn, Lance W.; Roodi, Nady; Bailey, L. Renee; Dupont, William D.; Parl, Fritz F.; Moore, Jason H.

    2001-01-01

    One of the greatest challenges facing human geneticists is the identification and characterization of susceptibility genes for common complex multifactorial human diseases. This challenge is partly due to the limitations of parametric-statistical methods for detection of gene effects that are dependent solely or partially on interactions with other genes and with environmental exposures. We introduce multifactor-dimensionality reduction (MDR) as a method for reducing the dimensionality of multilocus information, to improve the identification of polymorphism combinations associated with disease risk. The MDR method is nonparametric (i.e., no hypothesis about the value of a statistical parameter is made), is model-free (i.e., it assumes no particular inheritance model), and is directly applicable to case-control and discordant-sib-pair studies. Using simulated case-control data, we demonstrate that MDR has reasonable power to identify interactions among two or more loci in relatively small samples. When it was applied to a sporadic breast cancer case-control data set, in the absence of any statistically significant independent main effects, MDR identified a statistically significant high-order interaction among four polymorphisms from three different estrogen-metabolism genes. To our knowledge, this is the first report of a four-locus interaction associated with a common complex multifactorial disease. PMID:11404819

  6. HLA-A and -B phenotypes associated with tuberculosis in population from north-eastern Romania.

    PubMed

    Vasilca, Venera; Oana, Raluca; Munteanu, Dorina; Zugun, F; Constantinescu, Daniela; Carasevici, E

    2004-01-01

    HLA antigens are involved in inducing either susceptibility or resistance to different diseases. Many studies have reported various associations between HLA antigens and tuberculosis, depending on race, ethnic group and geographic area. Our purpose was to identify HLA class I antigens inducing susceptibility to tuberculosis in the population of north-eastern Romania. The study group consisted of 50 tuberculosis patients and the control group included 90 healthy people. HLA-A and HLA-B antigens were determined using the CDC-NIH (complement-dependent cytotoxicity, National Institutes of Health) assay. A comparison was made between the frequency of HLA antigen expression in the two studied groups. HLA-B18 and HLA-A29(19) were expressed more frequently in tuberculosis patients; the difference was statistically significant only for the HLA-B18 antigen. HLA-B7 and -B61(40) antigens were expressed with statistically significantly higher frequency in controls compared to tuberculosis patients. The frequency of other HLA-A and HLA-B antigens was either comparable in the two groups or differed without statistical significance. In conclusion, we found a positive association between the HLA-B18 antigen and tuberculosis, while HLA-B7 and HLA-B61(40) antigens seem to protect against the disease.

  7. Translational Genomics Research Institute: Identification of Pathways Enriched with Condition-Specific Statistical Dependencies Across Four Subtypes of Glioblastoma Multiforme | Office of Cancer Genomics

    Cancer.gov

    Evaluation of Differential DependencY (EDDY) is a statistical test for the differential dependency relationship of a set of genes between two given conditions. For each condition, possible dependency network structures are enumerated and their likelihoods are computed to represent a probability distribution of dependency networks. The difference between the probability distributions of dependency networks is computed between conditions, and its statistical significance is evaluated with random permutations of condition labels on the samples.  

  8. Translational Genomics Research Institute (TGen): Identification of Pathways Enriched with Condition-Specific Statistical Dependencies Across Four Subtypes of Glioblastoma Multiforme | Office of Cancer Genomics

    Cancer.gov

    Evaluation of Differential DependencY (EDDY) is a statistical test for the differential dependency relationship of a set of genes between two given conditions. For each condition, possible dependency network structures are enumerated and their likelihoods are computed to represent a probability distribution of dependency networks. The difference between the probability distributions of dependency networks is computed between conditions, and its statistical significance is evaluated with random permutations of condition labels on the samples.  

  9. Measuring determinants of career satisfaction of anesthesiologists: validation of a survey instrument.

    PubMed

    Afonso, Anoushka M; Diaz, James H; Scher, Corey S; Beyl, Robbie A; Nair, Singh R; Kaye, Alan David

    2013-06-01

    Study objective: To measure the parameter of job satisfaction among anesthesiologists. Design: Survey instrument. Setting: Academic anesthesiology departments in the United States. Subjects: 320 anesthesiologists who attended the annual meeting of the ASA in 2009 (95% response rate). Measurements: The anonymous 50-item survey collected information on 26 independent demographic variables and 24 dependent ranked variables of career satisfaction among practicing anesthesiologists. Mean survey scores were calculated for each demographic variable and tested for statistically significant differences by analysis of variance. Questions that were internally consistent with each other within domains were identified by Cronbach's alpha ≥ 0.7. P-values ≤ 0.05 were considered statistically significant. Main results: Cronbach's alpha analysis showed strong internal consistency for 10 dependent outcome questions in the practice factor-related domain (α = 0.72), 6 dependent outcome questions in the peer factor-related domain (α = 0.71), and 8 dependent outcome questions in the personal factor-related domain (α = 0.81). Although age was not a variable, full-time status, early satisfaction within the first 5 years of practice, working with respected peers, and personal choice factors were all significantly associated with anesthesiologist job satisfaction. Conclusions: Improvements in factors related to job satisfaction among anesthesiologists may lead to higher early and current career satisfaction. Copyright © 2013 Elsevier Inc. All rights reserved.
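
    Cronbach's alpha, the internal-consistency measure used above, has a simple form: alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score). The sketch below computes it for a small invented rating matrix; the data and the 0.7 comment mirror the convention cited in the abstract, not the study's actual responses.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

# Hypothetical 1-5 ratings from six respondents on four items of one domain.
ratings = np.array([[4, 5, 4, 4],
                    [3, 3, 4, 3],
                    [5, 5, 5, 4],
                    [2, 3, 2, 3],
                    [4, 4, 5, 4],
                    [3, 2, 3, 3]])
print(cronbach_alpha(ratings))   # values >= 0.7 were treated as internally consistent
```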

  10. Dynamic triggering of low magnitude earthquakes in the Middle American Subduction Zone

    NASA Astrophysics Data System (ADS)

    Escudero, C. R.; Velasco, A. A.

    2010-12-01

    We analyze global and Middle American Subduction Zone (MASZ) seismicity from 1998 to 2008 to quantify the effects of transient stresses at teleseismic distances. We use the Bulletin of the International Seismological Centre Catalog (ISCCD) published by the Incorporated Research Institutions for Seismology (IRIS). To identify MASZ seismicity changes due to distant, large (Mw >7) earthquakes, we first identify local earthquakes that occurred before and after the mainshocks. We then group the local earthquakes within a cluster radius of between 75 and 200 km. We obtain statistics based on characteristics of both the mainshocks and the local earthquake clusters, such as local cluster-mainshock azimuth, mainshock focal mechanism, and the position of local earthquake clusters within the MASZ. Due to lateral variations of the dip along the subducted oceanic plate, we divide the Mexican subduction zone into four segments. We then apply the Paired Samples Statistical Test (PSST) to the sorted data to identify increases, decreases, or no change in the local seismicity associated with distant large earthquakes. We identify dynamic triggering in all MASZ segments produced by large earthquakes arriving from specific azimuths, as well as a decrease in some cases. We find no dependence of the seismicity changes on the mainshock focal mechanism.

  11. A conceptual and statistical framework for adaptive radiations with a key role for diversity dependence.

    PubMed

    Etienne, Rampal S; Haegeman, Bart

    2012-10-01

    In this article we propose a new framework for studying adaptive radiations in the context of diversity-dependent diversification. Diversity dependence causes diversification to decelerate at the end of an adaptive radiation but also plays a key role in the initial pulse of diversification. In particular, key innovations (which in our definition include novel traits as well as new environments) may cause decoupling of the diversity-dependent dynamics of the innovative clade from the diversity-dependent dynamics of its ancestral clade. We present a likelihood-based inference method to test for decoupling of diversity dependence using molecular phylogenies. The method, which can handle incomplete phylogenies, identifies when the decoupling took place and which diversification parameters are affected. We illustrate our approach by applying it to the molecular phylogeny of the North American clade of the legume tribe Psoraleeae (47 extant species, of which 4 are missing). Two diversification rate shifts were previously identified for this clade; our analysis shows that the first, positive shift can be associated with decoupling of two Pediomelum subgenera from the other Psoraleeae lineages, while we argue that the second, negative shift can be attributed to speciation being protracted. The latter explanation yields nonzero extinction rates, in contrast to previous findings. Our framework offers a new perspective on macroevolution: new environments and novel traits (ecological opportunity) and diversity dependence (ecological limits) cannot be considered separately.

  12. First high-statistics and high-resolution recoil-ion data from the WITCH retardation spectrometer

    NASA Astrophysics Data System (ADS)

    Finlay, P.; Breitenfeldt, M.; Porobić, T.; Wursten, E.; Ban, G.; Beck, M.; Couratin, C.; Fabian, X.; Fléchard, X.; Friedag, P.; Glück, F.; Herlert, A.; Knecht, A.; Kozlov, V. Y.; Liénard, E.; Soti, G.; Tandecki, M.; Traykov, E.; Van Gorp, S.; Weinheimer, Ch.; Zákoucký, D.; Severijns, N.

    2016-07-01

    The first high-statistics and high-resolution data set for the integrated recoil-ion energy spectrum following the β+ decay of 35Ar has been collected with the WITCH retardation spectrometer located at CERN-ISOLDE. Over 25 million recoil-ion events were recorded on a large-area multichannel plate (MCP) detector with a time-stamp precision of 2 ns and position resolution of 0.1 mm due to the newly upgraded data acquisition based on the LPC Caen FASTER protocol. The number of recoil ions was measured for more than 15 different settings of the retardation potential, complemented by dedicated background and half-life measurements. Previously unidentified systematic effects, including an energy-dependent efficiency of the main MCP and a radiation-induced time-dependent background, have been identified and incorporated into the analysis. However, further understanding and treatment of the radiation-induced background requires additional dedicated measurements and remains the current limiting factor in extracting a beta-neutrino angular correlation coefficient for 35Ar decay using the WITCH spectrometer.

  13. Network assisted analysis to reveal the genetic basis of autism

    PubMed Central

    Liu, Li; Lei, Jing; Roeder, Kathryn

    2016-01-01

    While studies show that autism is highly heritable, the nature of the genetic basis of this disorder remains elusive. Based on the idea that highly correlated genes are functionally interrelated and more likely to affect risk, we develop a novel statistical tool to find more potential autism risk genes by combining the genetic association scores with gene co-expression in specific brain regions and periods of development. The gene dependence network is estimated using a novel partial neighborhood selection (PNS) algorithm, where node specific properties are incorporated into network estimation for improved statistical and computational efficiency. Then we adopt a hidden Markov random field (HMRF) model to combine the estimated network and the genetic association scores in a systematic manner. The proposed modeling framework can be naturally extended to incorporate additional structural information concerning the dependence between genes. Using currently available genetic association data from whole exome sequencing studies and brain gene expression levels, the proposed algorithm successfully identified 333 genes that plausibly affect autism risk. PMID:27134692

  14. Using statistical process control for monitoring the prevalence of hospital-acquired pressure ulcers.

    PubMed

    Kottner, Jan; Halfens, Ruud

    2010-05-01

    Institutionally acquired pressure ulcers are used as outcome indicators to assess the quality of pressure ulcer prevention programs. Determining whether quality improvement projects that aim to decrease the proportions of institutionally acquired pressure ulcers lead to real changes in clinical practice depends on the measurement method and statistical analysis used. To examine whether nosocomial pressure ulcer prevalence rates in hospitals in the Netherlands changed, a secondary data analysis using different statistical approaches was conducted of annual (1998-2008) nationwide nursing-sensitive health problem prevalence studies in the Netherlands. Institutions that participated regularly in all survey years were identified. Risk-adjusted nosocomial pressure ulcers prevalence rates, grade 2 to 4 (European Pressure Ulcer Advisory Panel system) were calculated per year and hospital. Descriptive statistics, chi-square trend tests, and P charts based on statistical process control (SPC) were applied and compared. Six of the 905 healthcare institutions participated in every survey year and 11,444 patients in these six hospitals were identified as being at risk for pressure ulcers. Prevalence rates per year ranged from 0.05 to 0.22. Chi-square trend tests revealed statistically significant downward trends in four hospitals but based on SPC methods, prevalence rates of five hospitals varied by chance only. Results of chi-square trend tests and SPC methods were not comparable, making it impossible to decide which approach is more appropriate. P charts provide more valuable information than single P values and are more helpful for monitoring institutional performance. Empirical evidence about the decrease of nosocomial pressure ulcer prevalence rates in the Netherlands is contradictory and limited.
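
    The P-chart logic referred to above reduces to a pooled proportion with 3-sigma limits that widen for smaller yearly samples; the sketch below computes those limits and flags out-of-control years, using invented counts in place of the survey data.

```python
import numpy as np

def p_chart_limits(event_counts, sample_sizes):
    """Center line and 3-sigma control limits for a P chart.

    event_counts : pressure ulcers observed in each survey year
    sample_sizes : patients at risk in each survey year
    """
    counts = np.asarray(event_counts, dtype=float)
    n = np.asarray(sample_sizes, dtype=float)
    p_bar = counts.sum() / n.sum()                  # pooled proportion (center line)
    sigma = np.sqrt(p_bar * (1 - p_bar) / n)        # per-year standard error
    ucl = p_bar + 3 * sigma
    lcl = np.clip(p_bar - 3 * sigma, 0, None)
    return p_bar, lcl, ucl

# Hypothetical yearly counts for one hospital, 1998-2008.
counts = [22, 18, 20, 15, 17, 12, 14, 10, 11, 9, 8]
at_risk = [150, 140, 160, 145, 155, 138, 150, 142, 148, 139, 151]
p_bar, lcl, ucl = p_chart_limits(counts, at_risk)
rates = np.array(counts) / np.array(at_risk)
print(np.where((rates < lcl) | (rates > ucl))[0])   # years signalling special-cause variation
```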

  15. Performance of cancer cluster Q-statistics for case-control residential histories

    PubMed Central

    Sloan, Chantel D.; Jacquez, Geoffrey M.; Gallagher, Carolyn M.; Ward, Mary H.; Raaschou-Nielsen, Ole; Nordsborg, Rikke Baastrup; Meliker, Jaymie R.

    2012-01-01

    Few investigations of health event clustering have evaluated residential mobility, though causative exposures for chronic diseases such as cancer often occur long before diagnosis. Recently developed Q-statistics incorporate human mobility into disease cluster investigations by quantifying space- and time-dependent nearest neighbor relationships. Using residential histories from two cancer case-control studies, we created simulated clusters to examine Q-statistic performance. Results suggest that the intersection of cases with significant clustering over their life course, Qi, with cases who are constituents of significant local clusters at given times, Qit, yielded the best performance, which improved with increasing cluster size. Upon comparison, a larger proportion of true positives were detected with Kulldorff's spatial scan method if the time of clustering was provided. We recommend using Q-statistics to identify when and where clustering may have occurred, followed by the scan method to localize the candidate clusters. Future work should investigate the generalizability of these findings. PMID:23149326

  16. Biostatistics Series Module 10: Brief Overview of Multivariate Methods.

    PubMed

    Hazra, Avijit; Gogtay, Nithya

    2017-01-01

    Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, that make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. The calculation intensive nature of multivariate analysis has so far precluded most researchers from using these techniques routinely. The situation is now changing with wider availability, and increasing sophistication of statistical software and researchers should no longer shy away from exploring the applications of multivariate methods to real-life data sets.

  17. Copula-based model for rainfall and El Niño in Banyuwangi, Indonesia

    NASA Astrophysics Data System (ADS)

    Caraka, R. E.; Supari; Tahmid, M.

    2018-04-01

    Modelling, describing and measuring the dependence structure between different random events is at the very heart of statistics, and a broad variety of dependence concepts has therefore been developed in the past. Most often, practitioners rely only on linear correlation to describe the degree of dependence between two or more variables, an approach that can lead to quite misleading conclusions, as this measure is only capable of capturing linear relationships. Copulas go beyond simple dependence measures and provide a sound framework for general dependence modelling. This paper introduces an application of copulas to estimate, understand, and interpret the dependence structure in a set of El Niño and rainfall data from Banyuwangi, Indonesia. In a nutshell, we demonstrate the flexibility of Archimedean copulas in rainfall modelling and in capturing the El Niño phenomenon in Banyuwangi, East Java, Indonesia. We also find that the SSTs of the Nino3, Nino4, and Nino3.4 regions are the most appropriate ENSO indicators for identifying the relationship between El Niño and rainfall.
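
    One common way to fit an Archimedean copula is by inverting Kendall's tau; for the Clayton family tau = theta/(theta + 2), so theta = 2*tau/(1 - tau). The sketch below applies this to simulated, positively dependent SST and rainfall series that stand in for the Banyuwangi data; the choice of the Clayton family and the toy data are assumptions, not the paper's fitted model.

```python
import numpy as np
from scipy import stats

def fit_clayton_theta(u, v):
    """Moment estimate of the Clayton copula parameter from Kendall's tau."""
    tau, _ = stats.kendalltau(u, v)
    return 2 * tau / (1 - tau)

def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) for theta > 0."""
    return (u**(-theta) + v**(-theta) - 1) ** (-1 / theta)

# Toy monthly SST anomalies and rainfall with positive dependence for illustration;
# the real ENSO-rainfall relation in Indonesia is typically negative.
rng = np.random.default_rng(2)
sst = rng.normal(size=240)
rain = 0.6 * sst + rng.normal(scale=0.8, size=240)

# Pseudo-observations: empirical ranks rescaled to (0, 1).
u = stats.rankdata(sst) / (len(sst) + 1)
v = stats.rankdata(rain) / (len(rain) + 1)
theta = fit_clayton_theta(u, v)
print(theta, clayton_cdf(0.2, 0.3, theta))
```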

  18. Using spatial statistics to identify emerging hot spots of forest loss

    NASA Astrophysics Data System (ADS)

    Harris, Nancy L.; Goldman, Elizabeth; Gabris, Christopher; Nordling, Jon; Minnemeyer, Susan; Ansari, Stephen; Lippmann, Michael; Bennett, Lauren; Raad, Mansour; Hansen, Matthew; Potapov, Peter

    2017-02-01

    As sources of data for global forest monitoring grow larger, more complex and numerous, data analysis and interpretation become critical bottlenecks for effectively using them to inform land use policy discussions. Here in this paper, we present a method that combines big data analytical tools with Emerging Hot Spot Analysis (ArcGIS) to identify statistically significant spatiotemporal trends of forest loss in Brazil, Indonesia and the Democratic Republic of Congo (DRC) between 2000 and 2014. Results indicate that while the overall rate of forest loss in Brazil declined over the 14-year time period, spatiotemporal patterns of loss shifted, with forest loss significantly diminishing within the Amazonian states of Mato Grosso and Rondônia and intensifying within the cerrado biome. In Indonesia, forest loss intensified in Riau province in Sumatra and in Sukamara and West Kotawaringin regencies in Central Kalimantan. Substantial portions of West Kalimantan became new and statistically significant hot spots of forest loss in the years 2013 and 2014. Similarly, vast areas of DRC emerged as significant new hot spots of forest loss, with intensified loss radiating out from city centers such as Beni and Kisangani. While our results focus on identifying significant trends at the national scale, we also demonstrate the scalability of our approach to smaller or larger regions depending on the area of interest and specific research question involved. When combined with other contextual information, these statistical data models can help isolate the most significant clusters of loss occurring over dynamic forest landscapes and provide more coherent guidance for the allocation of resources for forest monitoring and enforcement efforts.
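
    Emerging Hot Spot Analysis layers a Mann-Kendall trend test on top of per-period Getis-Ord Gi* scores; only the Gi* part is sketched below, with a tiny 1-D example in place of the gridded forest-loss data, so the weights matrix and loss values are purely illustrative.

```python
import numpy as np

def getis_ord_gi_star(values, weights):
    """Getis-Ord Gi* z-scores for a vector of cell values.

    values  : forest-loss amount per grid cell
    weights : (n, n) spatial weights matrix; row i defines cell i's neighbourhood
              and includes the focal cell itself (w[i, i] = 1), as Gi* requires
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    n = x.size
    x_bar = x.mean()
    s = np.sqrt((x**2).mean() - x_bar**2)
    w_sum = w.sum(axis=1)
    w2_sum = (w**2).sum(axis=1)
    num = w @ x - x_bar * w_sum
    den = s * np.sqrt((n * w2_sum - w_sum**2) / (n - 1))
    return num / den   # |z| > 1.96 marks a hot or cold spot at roughly the 5% level

# Tiny illustration: a 1-D strip of cells with one obvious cluster of loss.
loss = np.array([1., 1., 2., 1., 9., 10., 8., 1., 2., 1.])
w = np.zeros((10, 10))
for i in range(10):
    for j in range(max(0, i - 1), min(10, i + 2)):
        w[i, j] = 1.0          # focal cell plus immediate neighbours
print(np.round(getis_ord_gi_star(loss, w), 2))
```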

  19. Economic poverty among children and adolescents in the Nordic countries.

    PubMed

    Povlsen, Lene; Regber, Susann; Fosse, Elisabeth; Karlsson, Leena Eklund; Gunnarsdottir, Hrafnhildur

    2018-02-01

    This study aimed to identify applied definitions and measurements of economic poverty and to explore the proportions and characteristics of children and adolescents living in economic poverty in Denmark, Finland, Iceland, Norway and Sweden during the last decade and to compare various statistics between the Nordic countries. Official data from central national authorities on statistics, national reports and European Union Statistics of income and living conditions data were collected and analysed during 2015-2016. The proportion of Nordic children living in economic poverty in 2014 ranged from 9.4% in Norway to 18.5% in Sweden. Compared with the European Union average, from 2004 to 2014 Nordic families with dependent children experienced fewer difficulties in making their money last, even though Icelandic families reported considerable difficulties. The characteristics of children living in economic poverty proved to be similar in the five countries and were related to their parents' level of education and employment, single-parent households and - in Denmark, Norway and Sweden - to immigrant background. In Finland, poverty among children was linked in particular to low income in employed households. This study showed that economic poverty among Nordic families with dependent children has increased during the latest decade, but it also showed that poverty rates are not necessarily connected to families' ability to make their money last. Therefore additional studies are needed to explore existing policies and political commitments in the Nordic countries to compensate families with dependent children living in poverty.

  20. Reproducibility-optimized test statistic for ranking genes in microarray studies.

    PubMed

    Elo, Laura L; Filén, Sanna; Lahesmaa, Riitta; Aittokallio, Tero

    2008-01-01

    A principal goal of microarray studies is to identify the genes showing differential expression under distinct conditions. In such studies, the selection of an optimal test statistic is a crucial challenge, which depends on the type and amount of data under analysis. While previous studies on simulated or spike-in datasets do not provide practical guidance on how to choose the best method for a given real dataset, we introduce an enhanced reproducibility-optimization procedure, which enables the selection of a suitable gene-ranking statistic directly from the data. In comparison with existing ranking methods, the reproducibility-optimized statistic shows good performance consistently under various simulated conditions and on the Affymetrix spike-in dataset. Further, the feasibility of the novel statistic is confirmed in a practical research setting using data from an in-house cDNA microarray study of asthma-related gene expression changes. These results suggest that the procedure facilitates the selection of an appropriate test statistic for a given dataset without relying on a priori assumptions, which may bias the findings and their interpretation. Moreover, the general reproducibility-optimization procedure is not limited to detecting differential expression only but could be extended to a wide range of other applications as well.

  1. Neural underpinnings of the identifiable victim effect: affect shifts preferences for giving.

    PubMed

    Genevsky, Alexander; Västfjäll, Daniel; Slovic, Paul; Knutson, Brian

    2013-10-23

    The "identifiable victim effect" refers to peoples' tendency to preferentially give to identified versus anonymous victims of misfortune, and has been proposed to partly depend on affect. By soliciting charitable donations from human subjects during behavioral and neural (i.e., functional magnetic resonance imaging) experiments, we sought to determine whether and how affect might promote the identifiable victim effect. Behaviorally, subjects gave more to orphans depicted by photographs versus silhouettes, and their shift in preferences was mediated by photograph-induced feelings of positive arousal, but not negative arousal. Neurally, while photographs versus silhouettes elicited activity in widespread circuits associated with facial and affective processing, only nucleus accumbens activity predicted and could statistically account for increased donations. Together, these findings suggest that presenting evaluable identifiable information can recruit positive arousal, which then promotes giving. We propose that affect elicited by identifiable stimuli can compel people to give more to strangers, even despite costs to the self.

  2. Intrinsic alignment in redMaPPer clusters – II. Radial alignment of satellites towards cluster centres

    DOE PAGES

    Huang, Hung-Jin; Mandelbaum, Rachel; Freeman, Peter E.; ...

    2017-11-23

    We study the orientations of satellite galaxies in redMaPPer clusters constructed from the Sloan Digital Sky Survey at 0.1 < z < 0.35 to determine whether there is any preferential tendency for satellites to point radially towards cluster centres. Here, we analyse the satellite alignment (SA) signal based on three shape measurement methods (re-Gaussianization, de Vaucouleurs, and isophotal shapes), which trace galaxy light profiles at different radii. The measured SA signal depends on these shape measurement methods. We detect the strongest SA signal in isophotal shapes, followed by de Vaucouleurs shapes. While no net SA signal is detected using re-Gaussianization shapes across the entire sample, the observed SA signal reaches a statistically significant level when limiting to a subsample of higher luminosity satellites. We further investigate the impact of noise, systematics, and real physical isophotal twisting effects in the comparison between the SA signal detected via different shape measurement methods. Unlike previous studies, which only consider the dependence of SA on a few parameters, here we explore a total of 17 galaxy and cluster properties, using a statistical model averaging technique to naturally account for parameter correlations and identify significant SA predictors. We find that the measured SA signal is strongest for satellites with the following characteristics: higher luminosity, smaller distance to the cluster centre, rounder in shape, higher bulge fraction, and distributed preferentially along the major axis directions of their centrals. Finally, we provide physical explanations for the identified dependences and discuss the connection to theories of SA.

  3. Intrinsic alignment in redMaPPer clusters – II. Radial alignment of satellites towards cluster centres

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, Hung-Jin; Mandelbaum, Rachel; Freeman, Peter E.

    We study the orientations of satellite galaxies in redMaPPer clusters constructed from the Sloan Digital Sky Survey at 0.1 < z < 0.35 to determine whether there is any preferential tendency for satellites to point radially towards cluster centres. Here, we analyse the satellite alignment (SA) signal based on three shape measurement methods (re-Gaussianization, de Vaucouleurs, and isophotal shapes), which trace galaxy light profiles at different radii. The measured SA signal depends on these shape measurement methods. We detect the strongest SA signal in isophotal shapes, followed by de Vaucouleurs shapes. While no net SA signal is detected using re-Gaussianization shapes across the entire sample, the observed SA signal reaches a statistically significant level when limiting to a subsample of higher luminosity satellites. We further investigate the impact of noise, systematics, and real physical isophotal twisting effects in the comparison between the SA signal detected via different shape measurement methods. Unlike previous studies, which only consider the dependence of SA on a few parameters, here we explore a total of 17 galaxy and cluster properties, using a statistical model averaging technique to naturally account for parameter correlations and identify significant SA predictors. We find that the measured SA signal is strongest for satellites with the following characteristics: higher luminosity, smaller distance to the cluster centre, rounder in shape, higher bulge fraction, and distributed preferentially along the major axis directions of their centrals. Finally, we provide physical explanations for the identified dependences and discuss the connection to theories of SA.

  4. Beyond passivity: Dependency as a risk factor for intimate partner violence.

    PubMed

    Kane, Fallon A; Bornstein, Robert F

    2016-02-01

    Interpersonal dependency in male perpetrators of intimate partner violence (IPV) is an understudied phenomenon but one that has noteworthy clinical implications. The present investigation used meta-analytic techniques to quantify the dependency-IPV link in all extant studies examining this relationship (n of studies = 17). Studies were gathered via an extensive literature search using relevant dependency/IPV search terms in the PsycINFO, MEDLINE and Google Scholar databases. Results revealed a small but statistically significant relationship between dependency and perpetration of IPV in men (r = 0.150, Combined Z = 4.25, p < 0.0001), with the magnitude of the dependency-IPV link becoming stronger (r = 0.365, Combined Z = 6.00, p < 0.0001) when studies using measures of dependent personality disorder symptoms were omitted. Other moderators of the dependency-IPV effect size included IPV measure, type of sample and perpetrator age. These findings illuminate the underlying dynamics and interpersonal processes involved in some instances of IPV and may aid in understanding how to identify and treat male perpetrators of domestic violence. Copyright © 2015 John Wiley & Sons, Ltd.
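
    For readers unfamiliar with how a pooled r and a combined Z of this kind are obtained, the snippet below shows a standard fixed-effect meta-analysis of correlations (Fisher r-to-z transform, inverse-variance weights, and a pooled estimate back-transformed to r). The study-level correlations and sample sizes are invented for illustration and are not the 17 studies analysed in the paper.

```python
import numpy as np
from scipy.stats import norm

# hypothetical study-level correlations and sample sizes (illustrative only)
r = np.array([0.10, 0.22, 0.05, 0.30, 0.18])
n = np.array([120, 85, 200, 60, 150])

z = np.arctanh(r)                 # Fisher r-to-z transform
w = n - 3                         # inverse of Var(z) = 1 / (n - 3)
z_bar = np.sum(w * z) / np.sum(w) # inverse-variance weighted mean
se = 1 / np.sqrt(np.sum(w))

combined_z = z_bar / se
r_pooled = np.tanh(z_bar)         # back-transform to the correlation scale
p = 2 * norm.sf(abs(combined_z))

print(f"pooled r = {r_pooled:.3f}, combined Z = {combined_z:.2f}, p = {p:.2g}")
```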

  5. Statistics of Point Vortex Turbulence in Non-neutral Flows and in Flows with Translational and Rotational Symmetries

    NASA Astrophysics Data System (ADS)

    Esler, J. G.

    2017-12-01

    A theory (Esler and Ashbee in J Fluid Mech 779:275-308, 2015) describing the statistics of N freely-evolving point vortices in a bounded two-dimensional domain is extended. First, the case of a non-neutral vortex gas is addressed, and it is shown that the density of states function can be identified with the probability density function of an infinite sum of independent non-central chi-squared random variables, the details of which depend only on the shape of the domain. Equations for the equilibrium energy spectrum and other statistical quantities follow, the validity of which is verified against direct numerical simulations of the equations of motion. Second, domains with additional conserved quantities associated with a symmetry (e.g., circle, periodic channel) are investigated, and it is shown that the treatment of the non-neutral case can be modified to account for the additional constraint.

  6. Recombination in polymer-fullerene bulk heterojunction solar cells

    NASA Astrophysics Data System (ADS)

    Cowan, Sarah R.; Roy, Anshuman; Heeger, Alan J.

    2010-12-01

    Recombination of photogenerated charge carriers in polymer bulk heterojunction (BHJ) solar cells reduces the short circuit current (Jsc) and the fill factor (FF). Identifying the mechanism of recombination is, therefore, fundamentally important for increasing the power conversion efficiency. Light intensity and temperature-dependent current-voltage measurements on polymer BHJ cells made from a variety of different semiconducting polymers and fullerenes show that the recombination kinetics are voltage dependent and evolve from first-order recombination at short circuit to bimolecular recombination at open circuit as a result of increasing the voltage-dependent charge carrier density in the cell. The “missing 0.3 V” inferred from comparison of the band gaps of the bulk heterojunction materials and the measured open-circuit voltage at room-temperature results from the temperature dependence of the quasi-Fermi levels in the polymer and fullerene domains—a conclusion based on the fundamental statistics of fermions.

  7. An Overview of Interrater Agreement on Likert Scales for Researchers and Practitioners

    PubMed Central

    O'Neill, Thomas A.

    2017-01-01

    Applications of interrater agreement (IRA) statistics for Likert scales are plentiful in research and practice. IRA may be implicated in job analysis, performance appraisal, panel interviews, and any other approach to gathering systematic observations. Any rating system involving subject-matter experts can also benefit from IRA as a measure of consensus. Further, IRA is fundamental to aggregation in multilevel research, which is becoming increasingly common in order to address nesting. Although several technical descriptions of a few specific IRA statistics exist, this paper aims to provide a tractable orientation to common IRA indices to support application. The introductory overview is written with the intent of facilitating contrasts among IRA statistics by critically reviewing equations, interpretations, strengths, and weaknesses. Statistics considered include rwg, rwg*, r′wg, rwg(p), average deviation (AD), awg, standard deviation (Swg), and the coefficient of variation (CVwg). Equations support quick calculation and contrasting of different agreement indices. The article also includes a “quick reference” table and three figures in order to help readers identify how IRA statistics differ and how interpretations of IRA will depend strongly on the statistic employed. A brief consideration of recommended practices involving statistical and practical cutoff standards is presented, and conclusions are offered in light of the current literature. PMID:28553257
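
    As a concrete illustration of the simplest of these indices, rwg for a single Likert item compares the observed variance of judges' ratings to the variance expected under a uniform (random responding) null, (A^2 - 1)/12 for A response options. The sketch below is a minimal single-item example with made-up ratings; the paper should be consulted for multi-item variants and the other indices.

```python
import numpy as np

def rwg_single_item(ratings, n_options):
    """rwg for one Likert item: 1 - observed variance / uniform-null variance."""
    s2 = np.var(ratings, ddof=1)                # observed variance across judges
    sigma_eu2 = (n_options ** 2 - 1) / 12       # variance of a uniform null distribution
    return 1 - s2 / sigma_eu2

# five judges rating one target on a 5-point scale (illustrative data)
ratings = np.array([4, 4, 5, 4, 3])
print(f"rwg = {rwg_single_item(ratings, n_options=5):.2f}")
```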

  8. Local dependence in random graph models: characterization, properties and statistical inference

    PubMed Central

    Schweinberger, Michael; Handcock, Mark S.

    2015-01-01

    Dependent phenomena, such as relational, spatial and temporal phenomena, tend to be characterized by local dependence in the sense that units which are close in a well-defined sense are dependent. In contrast with spatial and temporal phenomena, though, relational phenomena tend to lack a natural neighbourhood structure in the sense that it is unknown which units are close and thus dependent. Owing to the challenge of characterizing local dependence and constructing random graph models with local dependence, many conventional exponential family random graph models induce strong dependence and are not amenable to statistical inference. We take first steps to characterize local dependence in random graph models, inspired by the notion of finite neighbourhoods in spatial statistics and M-dependence in time series, and we show that local dependence endows random graph models with desirable properties which make them amenable to statistical inference. We show that random graph models with local dependence satisfy a natural domain consistency condition which every model should satisfy, but conventional exponential family random graph models do not satisfy. In addition, we establish a central limit theorem for random graph models with local dependence, which suggests that random graph models with local dependence are amenable to statistical inference. We discuss how random graph models with local dependence can be constructed by exploiting either observed or unobserved neighbourhood structure. In the absence of observed neighbourhood structure, we take a Bayesian view and express the uncertainty about the neighbourhood structure by specifying a prior on a set of suitable neighbourhood structures. We present simulation results and applications to two real world networks with ‘ground truth’. PMID:26560142

  9. Outcomes Associated with Adolescent Marijuana and Alcohol Use Among Urban Young Adults: A Prospective Study

    PubMed Central

    Green, Kerry M.; Musci, Rashelle J.; Johnson, Renee M.; Matson, Pamela A.; Reboussin, Beth A.; Ialongo, Nicholas S.

    2015-01-01

    Objective This study identifies and compares outcomes in young adulthood associated with longitudinal patterns of alcohol and marijuana use during adolescence among urban youth. Method Data come from a cohort of 678 urban, predominantly Black children followed from ages 6–25 (1993–2012). Analyses are based on the 608 children who participated over time (53.6% male). Longitudinal patterning of alcohol and marijuana use were based on annual frequency reports from grades 8–12 and estimated through latent profile analysis. Results We identified four classes of alcohol and marijuana use including Non-Use (47%), Moderate Alcohol Use (28%), Moderate Alcohol/Increasing Marijuana Use (12%) and High Dual Use (13%). A marijuana only class was not identified. Analyses show negative outcomes in adulthood associated with all three adolescent substance use classes. Compared to the non-use class, all use classes had statistically significantly higher rates of substance dependence. Those in the ‘High Dual Use’ class had the lowest rate of high school graduation. Comparing classes with similar alcohol but different marijuana patterns, the ‘Moderate Alcohol/Increasing Marijuana Use’ class had a statistically significant increased risk of having a criminal justice record and developing substance use dependence in adulthood. Conclusion Among urban youth, heterogeneous patterns of alcohol and marijuana use across adolescence are evident, and these patterns are associated with distinct outcomes in adulthood. These findings suggest a need for targeted education and intervention efforts to address the needs of youth using both marijuana and alcohol, as well as the importance of universal early preventive intervention efforts. PMID:26517712

  10. Review of Factors, Methods, and Outcome Definition in Designing Opioid Abuse Predictive Models.

    PubMed

    Alzeer, Abdullah H; Jones, Josette; Bair, Matthew J

    2018-05-01

    Several opioid risk assessment tools are available to prescribers to evaluate opioid analgesic abuse among chronic patients. The objectives of this study are to 1) identify variables available in the literature to predict opioid abuse; 2) explore and compare methods (population, database, and analysis) used to develop statistical models that predict opioid abuse; and 3) understand how outcomes were defined in each statistical model predicting opioid abuse. The OVID database was searched for this study. The search was limited to articles written in English and published from January 1990 to April 2016. This search generated 1,409 articles. Only seven studies and nine models met our inclusion-exclusion criteria. We found nine models and identified 75 distinct variables. Three studies used administrative claims data, and four studies used electronic health record data. The majority of articles, four out of seven (six out of nine models), were primarily dependent on the presence or absence of opioid abuse or dependence (ICD-9 diagnosis code) to define opioid abuse. However, two articles used a predefined list of opioid-related aberrant behaviors. We identified variables used to predict opioid abuse from electronic health records and administrative data. Medication variables are the recurrent variables in the articles reviewed (33 variables). Age and gender are the most consistent demographic variables in predicting opioid abuse. Overall, there is similarity in the sampling method and inclusion/exclusion criteria (age, number of prescriptions, follow-up period, and data analysis methods). Research utilizing unstructured data may increase the accuracy of opioid abuse models.

  11. The smart cluster method. Adaptive earthquake cluster identification and analysis in strong seismic regions

    NASA Astrophysics Data System (ADS)

    Schaefer, Andreas M.; Daniell, James E.; Wenzel, Friedemann

    2017-07-01

    Earthquake clustering is an essential part of almost any statistical analysis of spatial and temporal properties of seismic activity. The nature of earthquake clusters and subsequent declustering of earthquake catalogues plays a crucial role in determining the magnitude-dependent earthquake return period and its respective spatial variation for probabilistic seismic hazard assessment. This study introduces the Smart Cluster Method (SCM), a new methodology to identify earthquake clusters, which uses an adaptive point process for spatio-temporal cluster identification. It utilises the magnitude-dependent spatio-temporal earthquake density to adjust the search properties, subsequently analyses the identified clusters to determine directional variation and adjusts its search space with respect to directional properties. In the case of rapid subsequent ruptures like the 1992 Landers sequence or the 2010-2011 Darfield-Christchurch sequence, a reclassification procedure is applied to disassemble subsequent ruptures using near-field searches, nearest neighbour classification and temporal splitting. The method is capable of identifying and classifying earthquake clusters in space and time. It has been tested and validated using earthquake data from California and New Zealand. A total of more than 1500 clusters have been found in both regions since 1980 with M_min = 2.0. Utilising the knowledge of cluster classification, the method has been adjusted to provide an earthquake declustering algorithm, which has been compared to existing methods. Its performance is comparable to established methodologies. The analysis of earthquake clustering statistics leads to various new and updated correlation functions, e.g. for ratios between mainshock and strongest aftershock and general aftershock activity metrics.
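
    The SCM's adaptive, density-driven search is not reproduced here, but the underlying window-linking idea can be sketched: link an event to a cluster when it falls inside a magnitude-dependent space-time window around a larger event. The window function and the toy catalogue below are placeholders; the real method estimates its search properties from the local spatio-temporal earthquake density and applies additional directional and reclassification steps.

```python
import numpy as np

# toy catalogue: time (days), x, y (km), magnitude -- illustrative values only
catalogue = np.array([
    #    t,     x,     y,   M
    [  0.0,   0.0,   0.0, 5.5],
    [  0.5,   2.0,   1.0, 3.1],
    [  3.0,   1.0,  -2.0, 2.8],
    [ 40.0,  80.0,  60.0, 4.2],
    [ 41.0,  81.0,  61.0, 2.5],
    [200.0, -50.0,  30.0, 3.0],
])

def windows(m):
    """Placeholder magnitude-dependent space (km) and time (days) search windows."""
    return 5.0 * 2 ** (m - 3.0), 10.0 * 2 ** (m - 3.0)

# greedy linking: unlabelled events join the cluster of a larger event whose
# space-time window contains them; largest magnitudes are processed first
labels = -np.ones(len(catalogue), dtype=int)
next_label = 0
for i in np.argsort(-catalogue[:, 3]):
    if labels[i] == -1:
        labels[i] = next_label
        next_label += 1
    ti, xi, yi, mi = catalogue[i]
    r_w, t_w = windows(mi)
    for j in range(len(catalogue)):
        if labels[j] != -1 or j == i:
            continue
        tj, xj, yj, _ = catalogue[j]
        if abs(tj - ti) <= t_w and np.hypot(xj - xi, yj - yi) <= r_w:
            labels[j] = labels[i]

print("cluster labels:", labels)
```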

  12. A study of students' learning styles and mathematics anxiety amongst form four students in Kerian Perak

    NASA Astrophysics Data System (ADS)

    Esa, Suraya; Mohamed, Nurul Akmal

    2017-05-01

    This study aims to identify the relationship between students' learning styles and mathematics anxiety amongst Form Four students in Kerian, Perak. The study involves 175 Form Four students as respondents. The instruments used to assess the students' learning styles and mathematics anxiety are adapted from Grasha's Learning Styles Inventory and the Mathematics Anxiety Scale (MAS), respectively. The types of learning styles used are independent, avoidant, collaborative, dependent, competitive and participant. The collected data are processed with SPSS (Statistical Package for the Social Sciences 16.0) and analysed using descriptive and inferential statistics that include the t-test and Pearson correlation. The results show that the majority of the students adopt a collaborative learning style and have a moderate level of mathematics anxiety. Moreover, significant gender differences are found for the avoidant, collaborative, dependent and participant learning styles. Amongst all learning styles, there exists a weak but significant correlation between the avoidant, independent and participant learning styles and mathematics anxiety. It is therefore important for teachers to be concerned about the effects of learning styles on mathematics anxiety, to understand mathematics anxiety, and to implement suitable learning strategies in order for the students to overcome their mathematics anxiety.

  13. Euclidean distance can identify the mannitol level that produces the most remarkable integral effect on sugarcane micropropagation in temporary immersion bioreactors.

    PubMed

    Gómez, Daviel; Hernández, L Ázaro; Yabor, Lourdes; Beemster, Gerrit T S; Tebbe, Christoph C; Papenbrock, Jutta; Lorenzo, José Carlos

    2018-03-15

    Plant scientists usually record several indicators in their abiotic factor experiments. The common statistical management involves univariate analyses. Such analyses generally create a split picture of the effects of experimental treatments since each indicator is addressed independently. The Euclidean distance combined with the information of the control treatment could have potential as an integrating indicator. The Euclidean distance has demonstrated its usefulness in many scientific fields but, as far as we know, it has not yet been employed for plant experimental analyses. To exemplify the use of the Euclidean distance in this field, we performed an experiment focused on the effects of mannitol on sugarcane micropropagation in temporary immersion bioreactors. Five mannitol concentrations were compared: 0, 50, 100, 150 and 200 mM. As dependent variables we recorded shoot multiplication rate, fresh weight, and levels of aldehydes, chlorophylls, carotenoids and phenolics. The statistical protocol we then carried out integrated all dependent variables to easily identify the mannitol concentration that produced the most remarkable integral effect. Results provided by the Euclidean distance demonstrate a gradually increasing distance from the control as a function of increasing mannitol concentrations. Treatment with 200 mM mannitol caused the most significant alteration of sugarcane biochemistry and physiology under the experimental conditions described here. This treatment showed the longest statistically significant Euclidean distance to the control treatment (2.38). In contrast, 50 and 100 mM mannitol showed the lowest Euclidean distances (0.61 and 0.84, respectively) and thus weak integrated effects of mannitol. The analysis shown here indicates that the use of the Euclidean distance can contribute to establishing a more integrated evaluation of the contrasting mannitol treatments.
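
    A minimal sketch of the integrating indicator described above: standardize each dependent variable (here against the control mean and a pooled standard deviation) and compute the Euclidean distance of every treatment's mean vector from the control. All numbers below are invented placeholders rather than the sugarcane measurements.

```python
import numpy as np

# rows: treatments (0, 50, 100, 150, 200 mM mannitol); columns: dependent variables
# (e.g., multiplication rate, fresh weight, aldehydes, chlorophylls) -- invented values
treatment_means = np.array([
    [6.1, 1.20, 0.8, 2.1],   # control (0 mM)
    [5.8, 1.10, 0.9, 2.0],
    [5.5, 1.00, 1.1, 1.8],
    [4.6, 0.85, 1.5, 1.4],
    [3.2, 0.60, 2.2, 0.9],
])
pooled_sd = np.array([0.9, 0.15, 0.3, 0.4])   # per-variable pooled SD (invented)

z = (treatment_means - treatment_means[0]) / pooled_sd   # standardize against the control
distances = np.linalg.norm(z, axis=1)                    # Euclidean distance from control
for conc, d in zip([0, 50, 100, 150, 200], distances):
    print(f"{conc:>3} mM mannitol: distance from control = {d:.2f}")
```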

  14. Indicators of Dysphagia in Aged Care Facilities.

    PubMed

    Pu, Dai; Murry, Thomas; Wong, May C M; Yiu, Edwin M L; Chan, Karen M K

    2017-09-18

    The current cross-sectional study aimed to investigate risk factors for dysphagia in elderly individuals in aged care facilities. A total of 878 individuals from 42 aged care facilities were recruited for this study. The dependent outcome was speech therapist-determined swallowing function. Independent factors were Eating Assessment Tool score, oral motor assessment score, Mini-Mental State Examination, medical history, and various functional status ratings. Binomial logistic regression was used to identify independent variables associated with dysphagia in this cohort. Two statistical models were constructed. Model 1 used variables from case files without the need for hands-on assessment, and Model 2 used variables that could be obtained from hands-on assessment. Variables positively associated with dysphagia identified in Model 1 were male gender, total dependence for activities of daily living, need for feeding assistance, impaired mobility (requiring assistance walking or using a wheelchair), and history of pneumonia. Variables positively associated with dysphagia identified in Model 2 were Mini-Mental State Examination score, edentulousness, and oral motor assessment score. Cognitive function, dentition, and oral motor function are significant indicators associated with the presence of swallowing difficulties in the elderly. When assessing the frail elderly, case file information can help clinicians identify frail elderly individuals who may be suffering from dysphagia.

  15. Probing stochastic inter-galactic magnetic fields using blazar-induced gamma ray halo morphology

    NASA Astrophysics Data System (ADS)

    Duplessis, Francis; Vachaspati, Tanmay

    2017-05-01

    Inter-galactic magnetic fields can imprint their structure on the morphology of blazar-induced gamma ray halos. We show that the halo morphology arises through the interplay of the source's jet and a two-dimensional surface dictated by the magnetic field. Through extensive numerical simulations, we generate mock halos created by stochastic magnetic fields with and without helicity, and study the dependence of the halo features on the properties of the magnetic field. We propose a sharper version of the Q-statistics and demonstrate its sensitivity to the magnetic field strength, the coherence scale, and the handedness of the helicity. We also identify and explain a new feature of the Q-statistics that can further enhance its power.

  16. Computational pathology: Exploring the spatial dimension of tumor ecology.

    PubMed

    Nawaz, Sidra; Yuan, Yinyin

    2016-09-28

    Tumors are evolving ecosystems where cancer subclones and the microenvironment interact. This is analogous to interaction dynamics between species in their natural habitats, which is a prime area of study in ecology. Spatial statistics are frequently used in ecological studies to infer complex relations including predator-prey, resource dependency and co-evolution. Recently, the emerging field of computational pathology has enabled high-throughput spatial analysis by using image processing to identify different cell types and their locations within histological tumor samples. We discuss how these data may be analyzed with spatial statistics used in ecology to reveal patterns and advance our understanding of ecological interactions occurring among cancer cells and their microenvironment. Copyright © 2015 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.

  17. [Suicide due to mental diseases based on the Vital Statistics Survey Death Form].

    PubMed

    Takizawa, Tohru

    2012-06-01

    Mental diseases such as schizophrenia and depression put patients at risk for suicide. It is extremely important to understand that one way of preventing suicide is to determine the actual mental state of the individual. The purpose of this study was to analyze the true mental state of suicide victims reported in the vital statistics. This study investigated the vital statistics of 30,299 suicide victims in Japan in 2008. The use of these basic statistics for non-statistical purposes was approved by the Japanese Ministry of Health, Labour and Welfare. The method involved reviewing the Vital Statistics Survey Death Form at the Ministry of Health, Labour and Welfare as well as analyzing their Online Reporting of Vital Statistics. Furthermore, this study was able to validate 29,799 of the 30,299 suicides (98.3%) that occurred in 2008. Mental diseases were validated not only from the "Cause of death" section as marked on the death certificate, but also by information found in sections for "Additional items for death by external cause" and "Other special remarks." Results: From the Vital Statistics Survey Death Form and Online Reporting of Vital Statistics, 2,964 individuals with either a mental disease or mental disorder were identified. Of the 2,964 identified individuals, 55 had dementia (of which 13 were dementia in Alzheimer's disease), 116 had alcohol dependence/psychotic disorder, 550 had schizophrenia, 101 had bipolar affective disorder, 1,913 had a depressive episode, 13 had obsessive-compulsive disorder, 22 had adjustment disorders, 14 had eating disorders, 49 had nonorganic sleep disorders, 24 had personality disorder, and 6 had pervasive developmental disorders. In addition, 125 individuals had more than one mental disease. The national police statistics from 2008 show that 1,368 suicide victims had schizophrenia and 6,490 had depression. These figures show quite a difference between the results of this study and the police statistics. Further, there have been controversies regarding autopsies of suicide victims. Thus, further investigation into the cause of death is of great importance.

  18. Time-dependent quantum oscillator as attenuator and amplifier: noise and statistical evolutions

    NASA Astrophysics Data System (ADS)

    Portes, D.; Rodrigues, H.; Duarte, S. B.; Baseia, B.

    2004-10-01

    We revisit the quantum oscillator, modelled as a time-dependent LC-circuit. Nonclassical properties concerned with attenuation and amplification regions are considered, as well as time evolution of quantum noise and statistics, with emphasis on revivals of the statistical distribution.

  19. The fragility of statistically significant findings from randomized trials in head and neck surgery.

    PubMed

    Noel, Christopher W; McMullen, Caitlin; Yao, Christopher; Monteiro, Eric; Goldstein, David P; Eskander, Antoine; de Almeida, John R

    2018-04-23

    The Fragility Index (FI) is a novel tool for evaluating the robustness of statistically significant findings in a randomized control trial (RCT). It measures the number of events upon which statistical significance depends. We sought to calculate the FI scores for RCTs in the head and neck cancer literature where surgery was a primary intervention. Potential articles were identified in PubMed (MEDLINE), Embase, and Cochrane without publication date restrictions. Two reviewers independently screened eligible RCTs reporting at least one dichotomous and statistically significant outcome. The data from each trial were extracted and the FI scores were calculated. Associations between trial characteristics and FI were determined. In total, 27 articles were identified. The median sample size was 67.5 (interquartile range [IQR] = 42-143) and the median number of events per trial was 8 (IQR = 2.25-18.25). The median FI score was 1 (IQR = 0-2.5), meaning that changing one patient from a nonevent to an event in the treatment arm would change the result to a statistically nonsignificant result, or P > .05. The FI score was less than the number of patients lost to follow-up in 71% of cases. The FI score was found to be moderately correlated with P value (ρ = -0.52, P = .007) and with journal impact factor (ρ = 0.49, P = .009) on univariable analysis. On multivariable analysis, only the P value was found to be a predictor of FI score (P = .001). Randomized trials in the head and neck cancer literature where surgery is a primary modality are relatively nonrobust statistically with low FI scores. Laryngoscope, 2018. © 2018 The American Laryngological, Rhinological and Otological Society, Inc.
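
    The FI calculation itself is simple to reproduce: starting from the trial's 2x2 table, flip one patient at a time from non-event to event in the arm with fewer events, recompute Fisher's exact test, and count the flips needed before P rises above .05. A minimal sketch with invented counts (not data from the reviewed trials):

```python
from scipy.stats import fisher_exact

def fragility_index(events_a, total_a, events_b, total_b, alpha=0.05):
    """Number of non-event-to-event flips in the arm with fewer events
    needed before Fisher's exact test becomes nonsignificant."""
    if events_a > events_b:  # make arm A the one with fewer events
        events_a, total_a, events_b, total_b = events_b, total_b, events_a, total_a
    flips = 0
    while events_a < total_a:
        _, p = fisher_exact([[events_a, total_a - events_a],
                             [events_b, total_b - events_b]])
        if p >= alpha:
            return flips
        events_a += 1
        flips += 1
    return flips

# illustrative trial: 2/40 events vs 10/40 events
print("Fragility Index:", fragility_index(2, 40, 10, 40))
```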

  20. Dimensional Reduction for the General Markov Model on Phylogenetic Trees.

    PubMed

    Sumner, Jeremy G

    2017-03-01

    We present a method of dimensional reduction for the general Markov model of sequence evolution on a phylogenetic tree. We show that taking certain linear combinations of the associated random variables (site pattern counts) reduces the dimensionality of the model from exponential in the number of extant taxa, to quadratic in the number of taxa, while retaining the ability to statistically identify phylogenetic divergence events. A key feature is the identification of an invariant subspace which depends only bilinearly on the model parameters, in contrast to the usual multi-linear dependence in the full space. We discuss potential applications including the computation of split (edge) weights on phylogenetic trees from observed sequence data.

  1. Markov Logic Networks in the Analysis of Genetic Data

    PubMed Central

    Sakhanenko, Nikita A.

    2010-01-01

    Complex, non-additive genetic interactions are common and can be critical in determining phenotypes. Genome-wide association studies (GWAS) and similar statistical studies of linkage data, however, assume additive models of gene interactions in looking for genotype-phenotype associations. These statistical methods view the compound effects of multiple genes on a phenotype as a sum of influences of each gene and often miss a substantial part of the heritable effect. Such methods do not use any biological knowledge about underlying mechanisms. Modeling approaches from the artificial intelligence (AI) field that incorporate deterministic knowledge into models to perform statistical analysis can be applied to include prior knowledge in genetic analysis. We chose to use the most general such approach, Markov Logic Networks (MLNs), for combining deterministic knowledge with statistical analysis. Using simple, logistic regression-type MLNs we can replicate the results of traditional statistical methods, but we also show that we are able to go beyond finding independent markers linked to a phenotype by using joint inference without an independence assumption. The method is applied to genetic data on yeast sporulation, a complex phenotype with gene interactions. In addition to detecting all of the previously identified loci associated with sporulation, our method identifies four loci with smaller effects. Since their effect on sporulation is small, these four loci were not detected with methods that do not account for dependence between markers due to gene interactions. We show how gene interactions can be detected using more complex models, which can be used as a general framework for incorporating systems biology with genetics. PMID:20958249

  2. Understanding sexual orientation and health in Canada: Who are we capturing and who are we missing using the Statistics Canada sexual orientation question?

    PubMed

    Dharma, Christoffer; Bauer, Greta R

    2017-04-20

    Public health research on inequalities in Canada depends heavily on population data sets such as the Canadian Community Health Survey. While sexual orientation has three dimensions - identity, behaviour and attraction - Statistics Canada and public health agencies assess sexual orientation with a single questionnaire item on identity, defined behaviourally. This study aims to evaluate this item, to allow for clearer interpretation of sexual orientation frequencies and inequalities. Through online convenience sampling of Canadians ≥14 years of age, participants (n = 311) completed the Statistics Canada question and a second set of sexual orientation questions. The single-item question had an 85.8% sensitivity in capturing sexual minorities, broadly defined by their sexual identity, lifetime behaviour and attraction. The kappa statistic for agreement between the single item and sexual identity was 0.89; the kappas for agreement with past-year behaviour, lifetime behaviour and attraction were 0.39, 0.48 and 0.57, respectively. The item captured 99.3% of those with a sexual minority identity, 84.2% of those with any lifetime same-sex partners, 98.4% with a past-year same-sex partner, and 97.8% who indicated at least equal attraction to same-sex persons. Findings from Statistics Canada surveys can be best interpreted as applying to those who identify as sexual minorities. Analyses using this measure will underidentify those with same-sex partners or attractions who do not identify as a sexual minority, and should be interpreted accordingly. To understand patterns of sexual minority health in Canada, there is a need to incorporate other dimensions of sexual orientation.
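
    Agreement statistics of the kind reported here are straightforward to reproduce. The sketch below computes Cohen's kappa for two binary classifications using invented responses (the actual survey data are not reproduced); it illustrates the chance-corrected agreement calculation only.

```python
import numpy as np

def cohens_kappa(a, b):
    """Chance-corrected agreement between two binary classifications."""
    a, b = np.asarray(a), np.asarray(b)
    po = np.mean(a == b)                              # observed agreement
    pe = np.mean(a) * np.mean(b) + (1 - np.mean(a)) * (1 - np.mean(b))  # chance agreement
    return (po - pe) / (1 - pe)

# illustrative data: single-item minority classification vs. lifetime-behaviour classification
single_item = np.array([1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0])
behaviour   = np.array([1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0])
print(f"kappa = {cohens_kappa(single_item, behaviour):.2f}")
```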

  3. Entropy production in a photovoltaic cell

    NASA Astrophysics Data System (ADS)

    Ansari, Mohammad H.

    2017-05-01

    We evaluate entropy production in a photovoltaic cell that is modeled by four electronic levels resonantly coupled to thermally populated field modes at different temperatures. We use a formalism recently proposed, the so-called multiple parallel worlds, to consistently address the nonlinearity of entropy in terms of density matrix. Our result shows that entropy production is the difference between two flows: a semiclassical flow that linearly depends on occupational probabilities, and another flow that depends nonlinearly on quantum coherence and has no semiclassical analog. We show that entropy production in the cells depends on environmentally induced decoherence time and energy detuning. We characterize regimes where reversal flow of information takes place from a cold to hot bath. Interestingly, we identify a lower bound on entropy production, which sets limitations on the statistics of dissipated heat in the cells.

  4. Harnessing the complexity of gene expression data from cancer: from single gene to structural pathway methods

    PubMed Central

    2012-01-01

    High-dimensional gene expression data provide a rich source of information because they capture the expression level of genes in dynamic states that reflect the biological functioning of a cell. For this reason, such data are suitable to reveal systems-related properties inside a cell, e.g., in order to elucidate molecular mechanisms of complex diseases like breast or prostate cancer. However, this is not only strongly dependent on the sample size and the correlation structure of a data set, but also on the statistical hypotheses tested. Many different approaches have been developed over the years to analyze gene expression data to (I) identify changes in single genes, (II) identify changes in gene sets or pathways, and (III) identify changes in the correlation structure in pathways. In this paper, we review statistical methods for all three types of approaches, including subtypes, in the context of cancer data, provide links to software implementations and tools, and also address the general problem of multiple hypothesis testing. Further, we provide recommendations for the selection of such analysis methods. Reviewers: This article was reviewed by Arcady Mushegian, Byung-Soo Kim and Joel Bader. PMID:23227854

  5. The Essential Genome of Escherichia coli K-12

    PubMed Central

    2018-01-01

    Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. PMID:29463657

  6. Using the Bootstrap Method to Evaluate the Critical Range of Misfit for Polytomous Rasch Fit Statistics.

    PubMed

    Seol, Hyunsoo

    2016-06-01

    The purpose of this study was to apply the bootstrap procedure to evaluate how the bootstrapped confidence intervals (CIs) for polytomous Rasch fit statistics might differ according to sample sizes and test lengths in comparison with the rule-of-thumb critical value of misfit. A total of 25 simulated data sets were generated to fit the Rasch measurement model, and then 1,000 replications were conducted to compute the bootstrapped CIs under each of the 25 testing conditions. The results showed that rule-of-thumb critical values for assessing the magnitude of misfit were not applicable because the infit and outfit mean square error statistics showed different magnitudes of variability over testing conditions and the standardized fit statistics did not exactly follow the standard normal distribution. Further, they did not share the same critical range for item and person misfit. Based on the results of the study, the bootstrapped CIs can be used to identify misfitting items or persons as they offer a reasonable alternative solution, especially when the distributions of the infit and outfit statistics are not well known and depend on sample size. © The Author(s) 2016.
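
    The underlying idea (replace rule-of-thumb cutoffs with an empirical critical range derived from resampled fit statistics) can be sketched in a few lines. The replicate values below are stand-ins drawn from an arbitrary distribution rather than fit statistics from actual Rasch-calibrated data sets.

```python
import numpy as np

rng = np.random.default_rng(2)

# placeholder: outfit mean-square values for one item across 1,000 simulated or
# bootstrapped data sets generated under a fitting model (stand-in distribution)
outfit_replicates = rng.chisquare(df=28, size=1000) / 28

# empirical critical range: the central 95% of the resampled fit statistics
lo, hi = np.percentile(outfit_replicates, [2.5, 97.5])
print(f"empirical 95% critical range for the outfit statistic: [{lo:.2f}, {hi:.2f}]")
```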

  7. Local statistics of retinal optic flow for self-motion through natural sceneries.

    PubMed

    Calow, Dirk; Lappe, Markus

    2007-12-01

    Image analysis in the visual system is well adapted to the statistics of natural scenes. Investigations of natural image statistics have so far mainly focused on static features. The present study is dedicated to the measurement and the analysis of the statistics of optic flow generated on the retina during locomotion through natural environments. Natural locomotion includes bouncing and swaying of the head and eye movement reflexes that stabilize gaze onto interesting objects in the scene while walking. We investigate the dependencies of the local statistics of optic flow on the depth structure of the natural environment and on the ego-motion parameters. To measure these dependencies we estimate the mutual information between correlated data sets. We analyze the results with respect to the variation of the dependencies over the visual field, since the visual motions in the optic flow vary depending on visual field position. We find that retinal flow direction and retinal speed show only minor statistical interdependencies. Retinal speed is statistically tightly connected to the depth structure of the scene. Retinal flow direction is statistically mostly driven by the relation between the direction of gaze and the direction of ego-motion. These dependencies differ at different visual field positions such that certain areas of the visual field provide more information about ego-motion and other areas provide more information about depth. The statistical properties of natural optic flow may be used to tune the performance of artificial vision systems based on human imitating behavior, and may be useful for analyzing properties of natural vision systems.

  8. Assessing colour-dependent occupation statistics inferred from galaxy group catalogues

    NASA Astrophysics Data System (ADS)

    Campbell, Duncan; van den Bosch, Frank C.; Hearin, Andrew; Padmanabhan, Nikhil; Berlind, Andreas; Mo, H. J.; Tinker, Jeremy; Yang, Xiaohu

    2015-09-01

    We investigate the ability of current implementations of galaxy group finders to recover colour-dependent halo occupation statistics. To test the fidelity of group catalogue inferred statistics, we run three different group finders used in the literature over a mock that includes galaxy colours in a realistic manner. Overall, the resulting mock group catalogues are remarkably similar, and most colour-dependent statistics are recovered with reasonable accuracy. However, it is also clear that certain systematic errors arise as a consequence of correlated errors in group membership determination, central/satellite designation, and halo mass assignment. We introduce a new statistic, the halo transition probability (HTP), which captures the combined impact of all these errors. As a rule of thumb, errors tend to equalize the properties of distinct galaxy populations (i.e. red versus blue galaxies or centrals versus satellites), and to result in inferred occupation statistics that are more accurate for red galaxies than for blue galaxies. A statistic that is particularly poorly recovered from the group catalogues is the red fraction of central galaxies as a function of halo mass. Group finders do a good job in recovering galactic conformity, but also have a tendency to introduce weak conformity when none is present. We conclude that proper inference of colour-dependent statistics from group catalogues is best achieved using forward modelling (i.e. running group finders over mock data) or by implementing a correction scheme based on the HTP, as long as the latter is not too strongly model dependent.

  9. Manipulating measurement scales in medical statistical analysis and data mining: A review of methodologies

    PubMed Central

    Marateb, Hamid Reza; Mansourian, Marjan; Adibi, Peyman; Farina, Dario

    2014-01-01

    Background: Selecting the correct statistical test and data mining method depends highly on the measurement scale of the data, the type of variables, and the purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are illustrated using several medical examples. We present two ordinal-variable clustering examples, as ordinal variables are more challenging to analyze, using the Wisconsin Breast Cancer Data (WBCD). Ordinal-to-interval scale conversion example: a breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed by two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold standard groups of malignant and benign cases that had been identified by clinical tests. Results: The sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively. Their specificity was comparable. Conclusion: By using an appropriate clustering algorithm based on the measurement scale of the variables in the study, high performance is achieved. Moreover, descriptive and inferential statistics, as well as the modeling approach, must be selected based on the scale of the variables. PMID:24672565

  10. Technical Note: The Initial Stages of Statistical Data Analysis

    PubMed Central

    Tandy, Richard D.

    1998-01-01

    Objective: To provide an overview of several important data-related considerations in the design stage of a research project and to review the levels of measurement and their relationship to the statistical technique chosen for the data analysis. Background: When planning a study, the researcher must clearly define the research problem and narrow it down to specific, testable questions. The next steps are to identify the variables in the study, decide how to group and treat subjects, and determine how to measure, and the underlying level of measurement of, the dependent variables. Then the appropriate statistical technique can be selected for data analysis. Description: The four levels of measurement in increasing complexity are nominal, ordinal, interval, and ratio. Nominal data are categorical or “count” data, and the numbers are treated as labels. Ordinal data can be ranked in a meaningful order by magnitude. Interval data possess the characteristics of ordinal data and also have equal distances between levels. Ratio data have a natural zero point. Nominal and ordinal data are analyzed with nonparametric statistical techniques and interval and ratio data with parametric statistical techniques. Advantages: Understanding the four levels of measurement and when it is appropriate to use each is important in determining which statistical technique to use when analyzing data. PMID:16558489

  11. [Continuity of hospital identifiers in hospital discharge data - Analysis of the nationwide German DRG Statistics from 2005 to 2013].

    PubMed

    Nimptsch, Ulrike; Wengler, Annelene; Mansky, Thomas

    2016-11-01

    In Germany, nationwide hospital discharge data (DRG statistics provided by the research data centers of the Federal Statistical Office and the Statistical Offices of the 'Länder') are increasingly used as data source for health services research. Within this data hospitals can be separated via their hospital identifier ([Institutionskennzeichen] IK). However, this hospital identifier primarily designates the invoicing unit and is not necessarily equivalent to one hospital location. Aiming to investigate direction and extent of possible bias in hospital-level analyses this study examines the continuity of the hospital identifier within a cross-sectional and longitudinal approach and compares the results to official hospital census statistics. Within the DRG statistics from 2005 to 2013 the annual number of hospitals as classified by hospital identifiers was counted for each year of observation. The annual number of hospitals derived from DRG statistics was compared to the number of hospitals in the official census statistics 'Grunddaten der Krankenhäuser'. Subsequently, the temporal continuity of hospital identifiers in the DRG statistics was analyzed within cohorts of hospitals. Until 2013, the annual number of hospital identifiers in the DRG statistics fell by 175 (from 1,725 to 1,550). This decline affected only providers with small or medium case volume. The number of hospitals identified in the DRG statistics was lower than the number given in the census statistics (e.g., in 2013 1,550 IK vs. 1,668 hospitals in the census statistics). The longitudinal analyses revealed that the majority of hospital identifiers persisted in the years of observation, while one fifth of hospital identifiers changed. In cross-sectional studies of German hospital discharge data the separation of hospitals via the hospital identifier might lead to underestimating the number of hospitals and consequential overestimation of caseload per hospital. Discontinuities of hospital identifiers over time might impair the follow-up of hospital cohorts. These limitations must be taken into account in analyses of German hospital discharge data focusing on the hospital level. Copyright © 2016. Published by Elsevier GmbH.

  12. Collective stimulated Brillouin backscatter

    NASA Astrophysics Data System (ADS)

    Lushnikov, Pavel; Rose, Harvey

    2007-11-01

    We develop the statistical theory of linear collective stimulated Brillouin backscatter (CBSBS) in a spatially and temporally incoherent laser beam. The instability is collective because it does not depend on the dynamics of isolated hot spots (speckles) of laser intensity, but rather depends on averaged laser beam intensity, optic f/#, and laser coherence time, Tc. CBSBS has a much larger threshold than that of a classical coherent beam in long-scale-length, high-temperature plasma. It is a novel regime in which Tc is too large for applicability of well-known statistical theories (RPA) but Tc must be small enough to suppress single speckle processes such as self-focusing. Even if laser Tc is too large for a priori applicability of our theory, collective forward SBS^1, perhaps enhanced by high Z dopant, and its resultant self-induced Tc reduction, may regain the CBSBS regime. We identified convective and absolute CBSBS regimes. The threshold of convective instability is inside the typical parameter region of NIF designs. Well above the incoherent threshold, the coherent instability growth rate is recovered. ^1 P.M. Lushnikov and H.A. Rose, Plasma Physics and Controlled Fusion, 48, 1501 (2006).

  13. Characterizing the Joint Effect of Diverse Test-Statistic Correlation Structures and Effect Size on False Discovery Rates in a Multiple-Comparison Study of Many Outcome Measures

    NASA Technical Reports Server (NTRS)

    Feiveson, Alan H.; Ploutz-Snyder, Robert; Fiedler, James

    2011-01-01

    In their 2009 Annals of Statistics paper, Gavrilov, Benjamini, and Sarkar report the results of a simulation assessing the robustness of their adaptive step-down procedure (GBS) for controlling the false discovery rate (FDR) when normally distributed test statistics are serially correlated. In this study we extend the investigation to the case of multiple comparisons involving correlated non-central t-statistics, in particular when several treatments or time periods are being compared to a control in a repeated-measures design with many dependent outcome measures. In addition, we consider several dependence structures other than serial correlation and illustrate how the FDR depends on the interaction between effect size and the type of correlation structure as indexed by Foerstner's distance metric from an identity matrix. The relationship between the correlation matrix R of the original dependent variables and the correlation matrix of the associated t-statistics is also studied. In general, the latter depends not only on R, but also on sample size and the signed effect sizes for the multiple comparisons.
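
    A generic way to experiment with this kind of setting is to simulate equicorrelated test statistics with a block of true effects, apply a multiple-comparison procedure, and estimate the realized FDR. The sketch below uses correlated normal statistics rather than non-central t-statistics and the plain Benjamini-Hochberg step-up procedure rather than the adaptive GBS step-down procedure, so it illustrates the general setting only; all parameters are invented.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

m, n_sim, rho, alpha = 200, 500, 0.5, 0.05
effect = np.zeros(m)
effect[:20] = 3.0                               # 20 true signals (shifted means)

fdp = []
for _ in range(n_sim):
    # equicorrelated statistics: z_i = sqrt(rho)*shared + sqrt(1-rho)*own + effect_i
    shared = rng.normal()
    z = np.sqrt(rho) * shared + np.sqrt(1 - rho) * rng.normal(size=m) + effect
    p = 2 * norm.sf(np.abs(z))

    # Benjamini-Hochberg step-up: reject the k smallest p-values, where k is the
    # largest index with p_(k) <= alpha * k / m
    order = np.argsort(p)
    thresh = alpha * np.arange(1, m + 1) / m
    passed = np.nonzero(p[order] <= thresh)[0]
    rejected = np.zeros(m, bool)
    if passed.size:
        rejected[order[:passed.max() + 1]] = True

    false_rej = np.sum(rejected & (effect == 0))
    fdp.append(false_rej / max(rejected.sum(), 1))

print(f"estimated FDR under equicorrelation rho={rho}: {np.mean(fdp):.3f}")
```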

  14. Identifying Pleiotropic Genes in Genome-Wide Association Studies for Multivariate Phenotypes with Mixed Measurement Scales

    PubMed Central

    Williams, L. Keoki; Buu, Anne

    2017-01-01

    We propose a multivariate genome-wide association test for mixed continuous, binary, and ordinal phenotypes. A latent response model is used to estimate the correlation between phenotypes with different measurement scales so that the empirical distribution of Fisher’s combination statistic under the null hypothesis is estimated efficiently. The simulation study shows that our proposed correlation estimation methods have high levels of accuracy. More importantly, our approach conservatively estimates the variance of the test statistic so that the type I error rate is controlled. The simulation also shows that the proposed test maintains power at a level very close to that of the ideal analysis based on known latent phenotypes while controlling the type I error. In contrast, conventional approaches, such as dichotomizing all observed phenotypes or treating them as continuous variables, could either reduce the power or employ a linear regression model unfit for the data. Furthermore, the statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that conducting a multivariate test on multiple phenotypes can increase the power of identifying markers that may not otherwise be chosen using marginal tests. The proposed method also offers a new approach to analyzing the Fagerström Test for Nicotine Dependence as multivariate phenotypes in genome-wide association studies. PMID:28081206
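
    For reference, the Fisher combination statistic used as the building block here is X^2 = -2 * sum(ln p_i), which under independent null p-values follows a chi-square distribution with 2k degrees of freedom; the paper's contribution lies in estimating its null distribution when the phenotype-level p-values are correlated. The snippet below shows only the independent-case baseline, with invented p-values.

```python
import numpy as np
from scipy.stats import chi2

# per-phenotype association p-values for one genetic marker (illustrative)
p_values = np.array([0.03, 0.20, 0.008])

fisher_x2 = -2 * np.sum(np.log(p_values))          # Fisher's combination statistic
combined_p = chi2.sf(fisher_x2, df=2 * len(p_values))  # chi-square with 2k df under independence
print(f"Fisher's X^2 = {fisher_x2:.2f}, combined p = {combined_p:.4f}")
```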

  15. Identifying trends in climate: an application to the cenozoic

    NASA Astrophysics Data System (ADS)

    Richards, Gordon R.

    1998-05-01

    The recent literature on trending in climate has raised several issues, whether trends should be modeled as deterministic or stochastic, whether trends are nonlinear, and the relative merits of statistical models versus models based on physics. This article models trending since the late Cretaceous. This 68 million-year interval is selected because the reliability of tests for trending is critically dependent on the length of time spanned by the data. Two main hypotheses are tested, that the trend has been caused primarily by CO2 forcing, and that it reflects a variety of forcing factors which can be approximated by statistical methods. The CO2 data is obtained from model simulations. Several widely-used statistical models are found to be inadequate. ARIMA methods parameterize too much of the short-term variation, and do not identify low frequency movements. Further, the unit root in the ARIMA process does not predict the long-term path of temperature. Spectral methods also have little ability to predict temperature at long horizons. Instead, the statistical trend is estimated using a nonlinear smoothing filter. Both of these paradigms make it possible to model climate as a cointegrated process, in which temperature can wander quite far from the trend path in the intermediate term, but converges back over longer horizons. Comparing the forecasting properties of the two trend models demonstrates that the optimal forecasting model includes CO2 forcing and a parametric representation of the nonlinear variability in climate.

  16. Successful classification of cocaine dependence using brain imaging: a generalizable machine learning approach.

    PubMed

    Mete, Mutlu; Sakoglu, Unal; Spence, Jeffrey S; Devous, Michael D; Harris, Thomas S; Adinoff, Bryon

    2016-10-06

    Neuroimaging studies have yielded significant advances in the understanding of neural processes relevant to the development and persistence of addiction. However, these advances have not been explored extensively for diagnostic accuracy in human subjects. The aim of this study was to develop a statistical approach, using a machine learning framework, to correctly classify brain images of cocaine-dependent participants and healthy controls. In this study, a framework suitable for educing potential brain regions that differed between the two groups was developed and implemented. Single Photon Emission Computerized Tomography (SPECT) images obtained during rest or a saline infusion in three cohorts of 2-4 week abstinent cocaine-dependent participants (n = 93) and healthy controls (n = 69) were used to develop a classification model. An information theoretic-based feature selection algorithm was first conducted to reduce the number of voxels. A density-based clustering algorithm was then used to form spatially connected voxel clouds in three-dimensional space. A statistical classifier, the Support Vector Machine (SVM), was then used for participant classification. Statistically insignificant voxels of spatially connected brain regions were removed iteratively and classification accuracy was reported through the iterations. The voxel-based analysis identified 1,500 spatially connected voxels in 30 distinct clusters after a grid search in SVM parameters. Participants were successfully classified with 0.88 and 0.89 F-measure accuracies in 10-fold cross validation (10xCV) and leave-one-out (LOO) approaches, respectively. Sensitivity and specificity were 0.90 and 0.89 for LOO; 0.83 and 0.83 for 10xCV. Many of the 30 selected clusters are highly relevant to the addictive process, including regions relevant to cognitive control, default mode network related self-referential thought, behavioral inhibition, and contextual memories. Relative hyperactivity and hypoactivity of regional cerebral blood flow in brain regions in cocaine-dependent participants are presented with the corresponding level of significance. The SVM-based approach successfully classified cocaine-dependent and healthy control participants using voxels selected with information theoretic-based and statistical methods from participants' SPECT data. The regions found in this study align with brain regions reported in the literature. These findings support the future use of brain imaging and SVM-based classifier in the diagnosis of substance use disorders and furthering an understanding of their underlying pathology.
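
    A generic, simplified version of such a pipeline (univariate feature selection followed by a support vector machine evaluated with leave-one-out cross-validation) can be sketched with scikit-learn. The random feature matrix stands in for SPECT voxel data, and the mutual-information selector, linear kernel, and number of retained features are assumptions for illustration rather than the exact choices made in the study.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import f1_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(4)

# placeholder data: 162 participants x 500 voxel-like features, binary labels
y = np.r_[np.ones(93, int), np.zeros(69, int)]   # 93 patients, 69 controls
X = rng.normal(size=(162, 500))
X[y == 1, :25] += 0.5                            # weak signal in 25 features

clf = make_pipeline(
    StandardScaler(),
    SelectKBest(mutual_info_classif, k=50),      # information-based feature selection
    SVC(kernel="linear", C=1.0),
)

# leave-one-out cross-validation: each participant is predicted by a model
# trained on all remaining participants
pred = cross_val_predict(clf, X, y, cv=LeaveOneOut())
print(f"leave-one-out F-measure: {f1_score(y, pred):.2f}")
```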

  17. 28 CFR 22.25 - Final disposition of identifiable materials.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... RESEARCH AND STATISTICAL INFORMATION § 22.25 Final disposition of identifiable materials. Upon completion of a research or statistical project the security of identifiable research or statistical information...

  18. 28 CFR 22.25 - Final disposition of identifiable materials.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... RESEARCH AND STATISTICAL INFORMATION § 22.25 Final disposition of identifiable materials. Upon completion of a research or statistical project the security of identifiable research or statistical information...

  19. Altruistic behavior in cohesive social groups: The role of target identifiability

    PubMed Central

    Ritov, Ilana; Kogut, Tehila

    2017-01-01

    People’s tendency to be more generous toward identifiable victims than toward unidentifiable or statistical victims is known as the Identifiable Victim Effect. Recent research has called the generality of this effect into question, showing that in cross-national contexts, identifiability mostly affects willingness to help victims of one’s own “in-group.” Furthermore, in inter-group conflict situations, identifiability increased generosity toward a member of the adversary group, but decreased generosity toward a member of one’s own group. In the present research we examine the role of group-cohesiveness as an underlying factor accounting for these divergent findings. In particular, we examined novel groups generated in the lab, using the minimal group paradigm, as well as natural groups of students in regular exercise sections. Allocation decisions in dictator games revealed that a group’s cohesiveness affects generosity toward in-group and out-group recipients differently, depending on their identifiability. In particular, in cohesive groups the identification of an in-group recipient decreased, rather than increased generosity. PMID:29161282

  20. Neural Underpinnings of the Identifiable Victim Effect: Affect Shifts Preferences for Giving

    PubMed Central

    Västfjäll, Daniel; Slovic, Paul; Knutson, Brian

    2013-01-01

    The “identifiable victim effect” refers to people's tendency to preferentially give to identified versus anonymous victims of misfortune, and has been proposed to partly depend on affect. By soliciting charitable donations from human subjects during behavioral and neural (i.e., functional magnetic resonance imaging) experiments, we sought to determine whether and how affect might promote the identifiable victim effect. Behaviorally, subjects gave more to orphans depicted by photographs versus silhouettes, and their shift in preferences was mediated by photograph-induced feelings of positive arousal, but not negative arousal. Neurally, while photographs versus silhouettes elicited activity in widespread circuits associated with facial and affective processing, only nucleus accumbens activity predicted and could statistically account for increased donations. Together, these findings suggest that presenting evaluable identifiable information can recruit positive arousal, which then promotes giving. We propose that affect elicited by identifiable stimuli can compel people to give more to strangers, even despite costs to the self. PMID:24155323

  1. Altruistic behavior in cohesive social groups: The role of target identifiability.

    PubMed

    Ritov, Ilana; Kogut, Tehila

    2017-01-01

    People's tendency to be more generous toward identifiable victims than toward unidentifiable or statistical victims is known as the Identifiable Victim Effect. Recent research has called the generality of this effect into question, showing that in cross-national contexts, identifiability mostly affects willingness to help victims of one's own "in-group." Furthermore, in inter-group conflict situations, identifiability increased generosity toward a member of the adversary group, but decreased generosity toward a member of one's own group. In the present research we examine the role of group-cohesiveness as an underlying factor accounting for these divergent findings. In particular, we examined novel groups generated in the lab, using the minimal group paradigm, as well as natural groups of students in regular exercise sections. Allocation decisions in dictator games revealed that a group's cohesiveness affects generosity toward in-group and out-group recipients differently, depending on their identifiability. In particular, in cohesive groups the identification of an in-group recipient decreased, rather than increased generosity.

  2. Probing stochastic inter-galactic magnetic fields using blazar-induced gamma ray halo morphology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Duplessis, Francis; Vachaspati, Tanmay, E-mail: fdupless@asu.edu, E-mail: tvachasp@asu.edu

    Inter-galactic magnetic fields can imprint their structure on the morphology of blazar-induced gamma ray halos. We show that the halo morphology arises through the interplay of the source's jet and a two-dimensional surface dictated by the magnetic field. Through extensive numerical simulations, we generate mock halos created by stochastic magnetic fields with and without helicity, and study the dependence of the halo features on the properties of the magnetic field. We propose a sharper version of the Q-statistics and demonstrate its sensitivity to the magnetic field strength, the coherence scale, and the handedness of the helicity. We also identify and explain a new feature of the Q-statistics that can further enhance its power.

  3. Exposure time independent summary statistics for assessment of drug dependent cell line growth inhibition.

    PubMed

    Falgreen, Steffen; Laursen, Maria Bach; Bødker, Julie Støve; Kjeldsen, Malene Krag; Schmitz, Alexander; Nyegaard, Mette; Johnsen, Hans Erik; Dybkær, Karen; Bøgsted, Martin

    2014-06-05

    In vitro generated dose-response curves of human cancer cell lines are widely used to develop new therapeutics. The curves are summarised by simplified statistics that ignore the conventionally used dose-response curves' dependency on drug exposure time and growth kinetics. This may lead to suboptimal exploitation of data and biased conclusions on the potential of the drug in question. Therefore we set out to improve the dose-response assessments by eliminating the impact of time dependency. First, a mathematical model for drug induced cell growth inhibition was formulated and used to derive novel dose-response curves and improved summary statistics that are independent of time under the proposed model. Next, a statistical analysis workflow for estimating the improved statistics was suggested consisting of 1) nonlinear regression models for estimation of cell counts and doubling times, 2) isotonic regression for modelling the suggested dose-response curves, and 3) resampling based method for assessing variation of the novel summary statistics. We document that conventionally used summary statistics for dose-response experiments depend on time so that fast growing cell lines compared to slowly growing ones are considered overly sensitive. The adequacy of the mathematical model is tested for doxorubicin and found to fit real data to an acceptable degree. Dose-response data from the NCI60 drug screen were used to illustrate the time dependency and demonstrate an adjustment correcting for it. The applicability of the workflow was illustrated by simulation and application on a doxorubicin growth inhibition screen. The simulations show that under the proposed mathematical model the suggested statistical workflow results in unbiased estimates of the time independent summary statistics. Variance estimates of the novel summary statistics are used to conclude that the doxorubicin screen covers a significant diverse range of responses ensuring it is useful for biological interpretations. Time independent summary statistics may aid the understanding of drugs' action mechanism on tumour cells and potentially renew previous drug sensitivity evaluation studies.
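
    As a hedged illustration of the isotonic-regression and resampling steps in the workflow above (the growth-model-based, time-independent statistics themselves are not reproduced here), one might fit a monotone dose-response curve and bootstrap a simple summary such as the area under that curve. The doses, responses, and choice of summary below are invented.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(1)
dose = np.repeat(np.logspace(-3, 1, 12), 3)                   # µM, triplicate wells (synthetic)
viability = 1.0 / (1.0 + (dose / 0.5) ** 1.3) + rng.normal(0, 0.05, dose.size)

iso = IsotonicRegression(increasing=False, y_min=0.0, y_max=1.2, out_of_bounds="clip")
iso.fit(dose, viability)

# One possible summary statistic: area under the monotone curve over log10(dose)
grid = np.logspace(-3, 1, 200)
auc = np.trapz(iso.predict(grid), np.log10(grid))

# Resampling-based assessment of its variability
boot = []
for _ in range(200):
    idx = rng.integers(0, dose.size, dose.size)
    b = IsotonicRegression(increasing=False, out_of_bounds="clip").fit(dose[idx], viability[idx])
    boot.append(np.trapz(b.predict(grid), np.log10(grid)))
print("AUC =", round(auc, 3), " bootstrap SD =", round(np.std(boot), 3))
```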

  4. Exposure time independent summary statistics for assessment of drug dependent cell line growth inhibition

    PubMed Central

    2014-01-01

    Background In vitro generated dose-response curves of human cancer cell lines are widely used to develop new therapeutics. The curves are summarised by simplified statistics that ignore the conventionally used dose-response curves’ dependency on drug exposure time and growth kinetics. This may lead to suboptimal exploitation of data and biased conclusions on the potential of the drug in question. Therefore we set out to improve the dose-response assessments by eliminating the impact of time dependency. Results First, a mathematical model for drug induced cell growth inhibition was formulated and used to derive novel dose-response curves and improved summary statistics that are independent of time under the proposed model. Next, a statistical analysis workflow for estimating the improved statistics was suggested consisting of 1) nonlinear regression models for estimation of cell counts and doubling times, 2) isotonic regression for modelling the suggested dose-response curves, and 3) resampling based method for assessing variation of the novel summary statistics. We document that conventionally used summary statistics for dose-response experiments depend on time so that fast growing cell lines compared to slowly growing ones are considered overly sensitive. The adequacy of the mathematical model is tested for doxorubicin and found to fit real data to an acceptable degree. Dose-response data from the NCI60 drug screen were used to illustrate the time dependency and demonstrate an adjustment correcting for it. The applicability of the workflow was illustrated by simulation and application on a doxorubicin growth inhibition screen. The simulations show that under the proposed mathematical model the suggested statistical workflow results in unbiased estimates of the time independent summary statistics. Variance estimates of the novel summary statistics are used to conclude that the doxorubicin screen covers a significant diverse range of responses ensuring it is useful for biological interpretations. Conclusion Time independent summary statistics may aid the understanding of drugs’ action mechanism on tumour cells and potentially renew previous drug sensitivity evaluation studies. PMID:24902483

  5. DESCARTES' RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA.

    PubMed

    Bhaskar, Anand; Song, Yun S

    2014-01-01

    The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the "folded" SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes' rule of signs for polynomials to the Laplace transform of piecewise continuous functions.

  6. DESCARTES’ RULE OF SIGNS AND THE IDENTIFIABILITY OF POPULATION DEMOGRAPHIC MODELS FROM GENOMIC VARIATION DATA

    PubMed Central

    Bhaskar, Anand; Song, Yun S.

    2016-01-01

    The sample frequency spectrum (SFS) is a widely-used summary statistic of genomic variation in a sample of homologous DNA sequences. It provides a highly efficient dimensional reduction of large-scale population genomic data and its mathematical dependence on the underlying population demography is well understood, thus enabling the development of efficient inference algorithms. However, it has been recently shown that very different population demographies can actually generate the same SFS for arbitrarily large sample sizes. Although in principle this nonidentifiability issue poses a thorny challenge to statistical inference, the population size functions involved in the counterexamples are arguably not so biologically realistic. Here, we revisit this problem and examine the identifiability of demographic models under the restriction that the population sizes are piecewise-defined where each piece belongs to some family of biologically-motivated functions. Under this assumption, we prove that the expected SFS of a sample uniquely determines the underlying demographic model, provided that the sample is sufficiently large. We obtain a general bound on the sample size sufficient for identifiability; the bound depends on the number of pieces in the demographic model and also on the type of population size function in each piece. In the cases of piecewise-constant, piecewise-exponential and piecewise-generalized-exponential models, which are often assumed in population genomic inferences, we provide explicit formulas for the bounds as simple functions of the number of pieces. Lastly, we obtain analogous results for the “folded” SFS, which is often used when there is ambiguity as to which allelic type is ancestral. Our results are proved using a generalization of Descartes’ rule of signs for polynomials to the Laplace transform of piecewise continuous functions. PMID:28018011

  7. Critical shear stress measurement of cohesive soils in streams: identifying device-dependent variability using an in-situ jet test device and conduit flume

    NASA Astrophysics Data System (ADS)

    Mahalder, B.; Schwartz, J. S.; Palomino, A.; Papanicolaou, T.

    2016-12-01

    Cohesive soil erodibility and threshold shear stress for stream beds and banks depend on both soil physical and geochemical properties in association with channel vegetative conditions. These properties can be spatially variable, which makes critical shear stress measurement in cohesive soils challenging and creates a need for a more comprehensive understanding of erosional processes in streams. Several in-situ and flume-type test devices for estimating critical shear stress have been introduced by different researchers; however, reported shear stress estimates vary between devices by orders of magnitude. Each type of device has advantages and disadvantages. In-situ test devices leave the bed and/or bank material relatively undisturbed and can capture the variable nature of field soil conditions, whereas laboratory flumes provide a means to control environmental conditions that can be quantified and tested. This study was conducted to observe differences in critical shear stress measured with a jet test device and with a well-controlled conduit flume. Soil samples were collected from the jet test locations and tested in a pressurized flume following a standard operational procedure to calculate the critical shear stress. The results were compared using statistical analysis (a mean-separation ANOVA procedure) to identify possible differences. In addition to the device comparison, the mini jet device was used to measure critical shear stress across geologically diverse regions of Tennessee, USA. Statistical correlations between critical shear stress and soil physical and geochemical properties were computed, identifying that geological origin plays a significant role in critical shear stress prediction for cohesive soils. Finally, the critical shear stress prediction equations based on the jet test data were examined, with possible modifications suggested based on the flume test results.

  8. Multivariate pattern dependence

    PubMed Central

    Saxe, Rebecca

    2017-01-01

    When we perform a cognitive task, multiple brain regions are engaged. Understanding how these regions interact is a fundamental step to uncover the neural bases of behavior. Most research on the interactions between brain regions has focused on the univariate responses in the regions. However, fine grained patterns of response encode important information, as shown by multivariate pattern analysis. In the present article, we introduce and apply multivariate pattern dependence (MVPD): a technique to study the statistical dependence between brain regions in humans in terms of the multivariate relations between their patterns of responses. MVPD characterizes the responses in each brain region as trajectories in region-specific multidimensional spaces, and models the multivariate relationship between these trajectories. We applied MVPD to the posterior superior temporal sulcus (pSTS) and to the fusiform face area (FFA), using a searchlight approach to reveal interactions between these seed regions and the rest of the brain. Across two different experiments, MVPD identified significant statistical dependence not detected by standard functional connectivity. Additionally, MVPD outperformed univariate connectivity in its ability to explain independent variance in the responses of individual voxels. In the end, MVPD uncovered different connectivity profiles associated with different representational subspaces of FFA: the first principal component of FFA shows differential connectivity with occipital and parietal regions implicated in the processing of low-level properties of faces, while the second and third components show differential connectivity with anterior temporal regions implicated in the processing of invariant representations of face identity. PMID:29155809
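
    A much-simplified sketch of the idea (an assumed implementation, not the authors' code): summarize a seed region's voxel time series by a few principal components and compare how well they predict a target region's response against a single mean-time-course predictor. Region sizes, component counts, and the synthetic data are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
n_tr = 300                                              # fMRI time points (synthetic)
latent = rng.normal(size=(n_tr, 5))                     # shared sources driving both regions
seed = latent @ rng.normal(size=(5, 80)) + 0.5 * rng.normal(size=(n_tr, 80))
target = latent @ rng.normal(size=(5, 120)) + rng.normal(size=(n_tr, 120))
target_response = target.mean(axis=1)                   # simplified target summary

seed_pcs = PCA(n_components=5).fit_transform(seed)      # multivariate seed description
seed_mean = seed.mean(axis=1, keepdims=True)            # univariate seed description

r2_multi = cross_val_score(LinearRegression(), seed_pcs, target_response, cv=5, scoring="r2")
r2_uni = cross_val_score(LinearRegression(), seed_mean, target_response, cv=5, scoring="r2")
print("multivariate R2:", r2_multi.mean().round(3), " univariate R2:", r2_uni.mean().round(3))
```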

  9. 28 CFR 22.21 - Use of identifiable data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... STATISTICAL INFORMATION § 22.21 Use of identifiable data. Research or statistical information identifiable to a private person may be used only for research or statistical purposes. ... 28 Judicial Administration 1 2010-07-01 2010-07-01 false Use of identifiable data. 22.21 Section...

  10. Data Fusion for a Vision-Radiological System: a Statistical Calibration Algorithm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Enqvist, Andreas; Koppal, Sanjeev; Riley, Phillip

    2015-07-01

    Presented here is a fusion system based on simple, low-cost computer vision and radiological sensors for tracking of multiple objects and identifying potential radiological materials being transported or shipped. The main focus of this work is the development of calibration algorithms for characterizing the fused sensor system as a single entity. There is an apparent need for correcting for a scene deviation from the basic inverse distance-squared law governing the detection rates even when evaluating system calibration algorithms. In particular, the computer vision system enables a map of distance-dependence of the sources being tracked, to which the time-dependent radiological data can be incorporated by means of data fusion of the two sensors' output data. (authors)

  11. Effective connectivity: Influence, causality and biophysical modeling

    PubMed Central

    Valdes-Sosa, Pedro A.; Roebroeck, Alard; Daunizeau, Jean; Friston, Karl

    2011-01-01

    This is the final paper in a Comments and Controversies series dedicated to “The identification of interacting networks in the brain using fMRI: Model selection, causality and deconvolution”. We argue that discovering effective connectivity depends critically on state-space models with biophysically informed observation and state equations. These models have to be endowed with priors on unknown parameters and afford checks for model identifiability. We consider the similarities and differences among Dynamic Causal Modeling, Granger Causal Modeling and other approaches. We establish links between past and current statistical causal modeling, in terms of Bayesian dependency graphs and Wiener–Akaike–Granger–Schweder influence measures. We show that some of the challenges faced in this field have promising solutions and speculate on future developments. PMID:21477655

  12. Statistical detection of systematic election irregularities

    PubMed Central

    Klimek, Peter; Yegorov, Yuri; Hanel, Rudolf; Thurner, Stefan

    2012-01-01

    Democratic societies are built around the principle of free and fair elections, and that each citizen’s vote should count equally. National elections can be regarded as large-scale social experiments, where people are grouped into usually large numbers of electoral districts and vote according to their preferences. The large number of samples implies statistical consequences for the polling results, which can be used to identify election irregularities. Using a suitable data representation, we find that vote distributions of elections with alleged fraud show a kurtosis substantially exceeding the kurtosis of normal elections, depending on the level of data aggregation. As an example, we show that reported irregularities in recent Russian elections are, indeed, well-explained by systematic ballot stuffing. We develop a parametric model quantifying the extent to which fraudulent mechanisms are present. We formulate a parametric test detecting these statistical properties in election results. Remarkably, this technique produces robust outcomes with respect to the resolution of the data and therefore, allows for cross-country comparisons. PMID:23010929
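
    An illustrative fragment only (synthetic numbers, not real election data): compute the kurtosis of the winner's vote-share distribution at two aggregation levels, one ingredient of the statistical signature described above.

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(2)
n_units = 50000
share = rng.beta(6, 5, n_units)                       # winner's vote share per unit (synthetic)
stuffed = rng.random(n_units) < 0.02                  # crude "ballot stuffing" in 2% of units
share[stuffed] = rng.uniform(0.9, 1.0, stuffed.sum())

for label, block in [("polling-station level", 1), ("district level", 50)]:
    agg = share if block == 1 else share.reshape(-1, block).mean(axis=1)
    print(f"{label:22s} excess kurtosis = {kurtosis(agg):+.2f}")
```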

  13. Statistical detection of systematic election irregularities.

    PubMed

    Klimek, Peter; Yegorov, Yuri; Hanel, Rudolf; Thurner, Stefan

    2012-10-09

    Democratic societies are built around the principle of free and fair elections, and that each citizen's vote should count equally. National elections can be regarded as large-scale social experiments, where people are grouped into usually large numbers of electoral districts and vote according to their preferences. The large number of samples implies statistical consequences for the polling results, which can be used to identify election irregularities. Using a suitable data representation, we find that vote distributions of elections with alleged fraud show a kurtosis substantially exceeding the kurtosis of normal elections, depending on the level of data aggregation. As an example, we show that reported irregularities in recent Russian elections are, indeed, well-explained by systematic ballot stuffing. We develop a parametric model quantifying the extent to which fraudulent mechanisms are present. We formulate a parametric test detecting these statistical properties in election results. Remarkably, this technique produces robust outcomes with respect to the resolution of the data and therefore, allows for cross-country comparisons.

  14. Environmentally safe areas and routes in the Baltic proper using Eulerian tracers.

    PubMed

    Höglund, A; Meier, H E M

    2012-07-01

    In recent years, the shipping of environmentally hazardous cargo has increased considerably in the Baltic proper. In this study, a large number of hypothetical oil spills with an idealized, passive tracer are simulated. From the tracer distributions, statistical measures are calculated to optimize the quantity of tracer from a spill that would stay at sea as long as possible. Increased time may permit action to be taken against the spill before the oil reaches environmentally vulnerable coastal zones. The statistical measures are used to calculate maritime routes with maximum probability that an oil spill will stay at sea as long as possible. Under these assumptions, ships should follow routes that are located south of Bornholm instead of the northern routes in use currently. Our results suggest that the location of the optimal maritime routes depends on the season, although interannual variability is too large to identify statistically significant changes. Copyright © 2012. Published by Elsevier Ltd.

  15. Is There a Critical Distance for Fickian Transport? - a Statistical Approach to Sub-Fickian Transport Modelling in Porous Media

    NASA Astrophysics Data System (ADS)

    Most, S.; Nowak, W.; Bijeljic, B.

    2014-12-01

    Transport processes in porous media are frequently simulated as particle movement. This process can be formulated as a stochastic process of particle position increments. At the pore scale, the geometry and micro-heterogeneities prohibit the commonly made assumption of independent and normally distributed increments to represent dispersion. Many recent particle methods seek to loosen this assumption. Recent experimental data suggest that we have not yet reached the end of the need to generalize, because particle increments show statistical dependency beyond linear correlation and over many time steps. The goal of this work is to better understand the validity regions of commonly made assumptions. We investigate after what transport distances we can observe: (1) a statistical dependence between increments that can be modelled as an order-k Markov process reducing to order 1 (this would be the Markovian distance for the process, where the validity of yet-unexplored non-Gaussian-but-Markovian random walks would start); (2) a bivariate statistical dependence that simplifies to a multi-Gaussian dependence based on simple linear correlation (validity of correlated PTRW); and (3) complete absence of statistical dependence (validity of classical PTRW/CTRW). The approach is to derive a statistical model for pore-scale transport from a powerful experimental data set via copula analysis. The model is formulated as a non-Gaussian, mutually dependent Markov process of higher order, which allows us to investigate the validity ranges of simpler models.
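
    A hedged, much-reduced stand-in for the copula analysis sketched above: probe how quickly rank dependence between successive displacement increments decays with lag, here on a synthetic increment series. The real analysis operates on experimental pore-scale trajectories and full bivariate copulas rather than a single rank correlation.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(3)
# synthetic increments with short memory and a skewed, non-Gaussian marginal
z = np.zeros(20000)
for i in range(1, z.size):
    z[i] = 0.6 * z[i - 1] + rng.standard_t(df=4)
increments = np.exp(0.2 * z)

for lag in (1, 2, 5, 10, 20):
    rho, _ = spearmanr(increments[:-lag], increments[lag:])
    print(f"lag {lag:2d}: Spearman rho = {rho:+.3f}")
# rho near zero at large lags suggests a distance beyond which a low-order or
# independent-increment model (e.g. classical PTRW/CTRW) may suffice
```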

  16. Reasons for smoking cessation attempts among Japanese male smokers vary by nicotine dependence level: a cross-sectional study after the 2010 tobacco tax increase

    PubMed Central

    Tanihara, Shinichi; Momose, Yoshito

    2015-01-01

    Objectives To examine the association between smoking cessation attempts during the previous 12 months, motivators to quit smoking and nicotine dependence levels among current male smokers after Japan's massive 2010 tobacco tax increase. Design Cross-sectional study. Setting A self-reported questionnaire about smoking habits, nicotine dependence levels and factors identified as motivators to quit smoking was administered to 9378 employees working at a company located in Fukuoka Prefecture in Japan (as of 1 October 2011). Participants A total of 2251 male current smokers 20–69 years old. Primary and secondary outcome measures Nicotine dependence level assessed by Fagerström Test for Cigarette Dependence (FTCD), smoking cessation attempts during the previous 12 months and motivators for smoking cessation. Results The proportion of current smokers who had attempted to quit smoking within the previous 12 months was 40.6%. Nicotine dependence level of current smokers was negatively associated with cessation attempts during the previous 12 months. Motivators for smoking cessation differed by nicotine dependence levels. ‘The rise in cigarette prices since October 2010’ as a smoking cessation motivator increased significantly at the medium nicotine dependence level (OR 1.44, 95% CI 1.09 to 1.90); however, this association was not statistically significant for individuals with high nicotine dependence (OR 1.24, 95% CI 0.80 to 1.92). ‘Feeling unhealthy’ was significantly negatively associated for medium (OR 0.42, 95% CI 0.27 to 0.65) and high (OR 0.31, 95% CI 0.14 to 0.71) nicotine dependence levels. Trend associations assessed by assigning ordinal numbers to total FTCD score for those two motivators were statistically significant. Conclusions The efficacy of smoking cessation strategies can be improved by considering the target group's nicotine dependence level. For smokers with medium and high nicotine dependence levels, more effective strategies aimed at encouraging smoking cessation, such as policy interventions including increasing tobacco taxes, are needed. PMID:25795690

  17. Reasons for smoking cessation attempts among Japanese male smokers vary by nicotine dependence level: a cross-sectional study after the 2010 tobacco tax increase.

    PubMed

    Tanihara, Shinichi; Momose, Yoshito

    2015-03-20

    To examine the association between smoking cessation attempts during the previous 12 months, motivators to quit smoking and nicotine dependence levels among current male smokers after Japan's massive 2010 tobacco tax increase. Cross-sectional study. A self-reported questionnaire about smoking habits, nicotine dependence levels and factors identified as motivators to quit smoking was administered to 9378 employees working at a company located in Fukuoka Prefecture in Japan (as of 1 October 2011). A total of 2251 male current smokers 20-69 years old. Nicotine dependence level assessed by Fagerström Test for Cigarette Dependence (FTCD), smoking cessation attempts during the previous 12 months and motivators for smoking cessation. The proportion of current smokers who had attempted to quit smoking within the previous 12 months was 40.6%. Nicotine dependence level of current smokers was negatively associated with cessation attempts during the previous 12 months. Motivators for smoking cessation differed by nicotine dependence levels. 'The rise in cigarette prices since October 2010' as a smoking cessation motivator increased significantly at the medium nicotine dependence level (OR 1.44, 95% CI 1.09 to 1.90); however, this association was not statistically significant for individuals with high nicotine dependence (OR 1.24, 95% CI 0.80 to 1.92). 'Feeling unhealthy' was significantly negatively associated for medium (OR 0.42, 95% CI 0.27 to 0.65) and high (OR 0.31, 95% CI 0.14 to 0.71) nicotine dependence levels. Trend associations assessed by assigning ordinal numbers to total FTCD score for those two motivators were statistically significant. The efficacy of smoking cessation strategies can be improved by considering the target group's nicotine dependence level. For smokers with medium and high nicotine dependence levels, more effective strategies aimed at encouraging smoking cessation, such as policy interventions including increasing tobacco taxes, are needed. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  18. Optimal False Discovery Rate Control for Dependent Data

    PubMed Central

    Xie, Jichun; Cai, T. Tony; Maris, John; Li, Hongzhe

    2013-01-01

    This paper considers the problem of optimal false discovery rate control when the test statistics are dependent. An optimal joint oracle procedure, which minimizes the false non-discovery rate subject to a constraint on the false discovery rate is developed. A data-driven marginal plug-in procedure is then proposed to approximate the optimal joint procedure for multivariate normal data. It is shown that the marginal procedure is asymptotically optimal for multivariate normal data with a short-range dependent covariance structure. Numerical results show that the marginal procedure controls false discovery rate and leads to a smaller false non-discovery rate than several commonly used p-value based false discovery rate controlling methods. The procedure is illustrated by an application to a genome-wide association study of neuroblastoma and it identifies a few more genetic variants that are potentially associated with neuroblastoma than several p-value-based false discovery rate controlling procedures. PMID:23378870
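
    For orientation, a minimal marginal, p-value-based procedure (Benjamini-Hochberg) of the kind the paper compares against; the optimal joint oracle procedure and its data-driven plug-in approximation for dependent statistics are not reproduced here, and the p-values below are simulated.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean rejection vector controlling the false discovery rate at level alpha."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])              # largest index meeting the bound
        reject[order[: k + 1]] = True
    return reject

rng = np.random.default_rng(6)
pvals = np.concatenate([rng.uniform(size=950),         # true nulls
                        rng.beta(0.2, 5.0, size=50)])  # non-nulls with small p-values
print("number of rejections:", benjamini_hochberg(pvals).sum())
```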

  19. Geographic Distribution of Disaster-Specific Emergency Department Use After Hurricane Sandy in New York City.

    PubMed

    Lee, David C; Smith, Silas W; Carr, Brendan G; Doran, Kelly M; Portelli, Ian; Grudzen, Corita R; Goldfrank, Lewis R

    2016-06-01

    We aimed to characterize the geographic distribution of post-Hurricane Sandy emergency department use in administrative flood evacuation zones of New York City. Using emergency claims data, we identified significant deviations in emergency department use after Hurricane Sandy. Using time-series analysis, we analyzed the frequency of visits for specific conditions and comorbidities to identify medically vulnerable populations who developed acute postdisaster medical needs. We found statistically significant decreases in overall post-Sandy emergency department use in New York City but increased utilization in the most vulnerable evacuation zone. In addition to dialysis- and ventilator-dependent patients, we identified that patients who were elderly or homeless or who had diabetes, dementia, cardiac conditions, limitations in mobility, or drug dependence were more likely to visit emergency departments after Hurricane Sandy. Furthermore, patients were more likely to develop drug-resistant infections, require isolation, and present for hypothermia, environmental exposures, or administrative reasons. Our study identified high-risk populations who developed acute medical and social needs in specific geographic areas after Hurricane Sandy. Our findings can inform coherent and targeted responses to disasters. Early identification of medically vulnerable populations can help to map "hot spots" requiring additional medical and social attention and prioritize resources for areas most impacted by disasters. (Disaster Med Public Health Preparedness. 2016;10:351-361).

  20. Identification of Homophily and Preferential Recruitment in Respondent-Driven Sampling.

    PubMed

    Crawford, Forrest W; Aronow, Peter M; Zeng, Li; Li, Jianghong

    2018-01-01

    Respondent-driven sampling (RDS) is a link-tracing procedure used in epidemiologic research on hidden or hard-to-reach populations in which subjects recruit others via their social networks. Estimates from RDS studies may have poor statistical properties due to statistical dependence in sampled subjects' traits. Two distinct mechanisms account for dependence in an RDS study: homophily, the tendency for individuals to share social ties with others exhibiting similar characteristics, and preferential recruitment, in which recruiters do not recruit uniformly at random from their network alters. The different effects of network homophily and preferential recruitment in RDS studies have been a source of confusion and controversy in methodological and empirical research in epidemiology. In this work, we gave formal definitions of homophily and preferential recruitment and showed that neither is identified in typical RDS studies. We derived nonparametric identification regions for homophily and preferential recruitment and showed that these parameters were not identified unless the network took a degenerate form. The results indicated that claims of homophily or recruitment bias measured from empirical RDS studies may not be credible. We applied our identification results to a study involving both a network census and RDS on a population of injection drug users in Hartford, Connecticut (2012-2013). © The Author(s) 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  1. Characterization of individuals seeking treatment for caffeine dependence.

    PubMed

    Juliano, Laura M; Evatt, Daniel P; Richards, Brian D; Griffiths, Roland R

    2012-12-01

    Previous investigations have identified individuals who meet criteria for Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR; American Psychiatric Association, 2000) substance dependence as applied to caffeine, but there is little research on treatments for caffeine dependence. This study aimed to thoroughly characterize individuals who are seeking treatment for problematic caffeine use. Ninety-four individuals who identified as being psychologically or physically dependent on caffeine, or who had tried unsuccessfully to modify caffeine consumption participated in a face-to-face diagnostic clinical interview. They also completed measures concerning caffeine use and quitting history, reasons for seeking treatment, and standardized self-report measures of psychological functioning. Caffeine treatment seekers (mean age 41 years, 55% women) consumed an average of 548 mg caffeine per day. The primary source of caffeine was coffee for 50% of the sample and soft drinks for 37%. Eighty-eight percent reported prior serious attempts to modify caffeine use (mean 2.7 prior attempts), and 43% reported being advised by a medical professional to reduce or eliminate caffeine. Ninety-three percent met criteria for caffeine dependence when generic DSM-IV-TR substance dependence criteria were applied to caffeine use. The most commonly endorsed criteria were withdrawal (96%), persistent desire or unsuccessful efforts to control use (89%), and use despite knowledge of physical or psychological problems caused by caffeine (87%). The most common reasons for wanting to modify caffeine use were health-related (59%) and not wanting to be dependent on caffeine (35%). This investigation reveals that there are individuals with problematic caffeine use who are seeking treatment and suggests that there is a need for effective caffeine dependence treatments. 2013 APA, all rights reserved

  2. Identification of altered pathways in breast cancer based on individualized pathway aberrance score.

    PubMed

    Shi, Sheng-Hong; Zhang, Wei; Jiang, Jing; Sun, Long

    2017-08-01

    The objective of the present study was to identify altered pathways in breast cancer based on the individualized pathway aberrance score (iPAS) method combined with the normal reference (nRef). There were 4 steps to identify altered pathways using the iPAS method: Data preprocessing conducted by the robust multi-array average (RMA) algorithm; gene-level statistics based on average Z ; pathway-level statistics according to iPAS; and a significance test dependent on 1 sample Wilcoxon test. The altered pathways were validated by calculating the changed percentage of each pathway in tumor samples and comparing them with pathways from differentially expressed genes (DEGs). A total of 688 altered pathways with P<0.01 were identified, including kinesin (KIF)- and polo-like kinase (PLK)-mediated events. When the percentage of change reached 50%, 310 pathways were involved in the total 688 altered pathways, which may validate the present results. In addition, there were 324 DEGs and 155 common genes between DEGs and pathway genes. DEGs and common genes were enriched in the same 9 significant terms, which also were members of altered pathways. The iPAS method was suitable for identifying altered pathways in breast cancer. Altered pathways (such as KIF and PLK mediated events) were important for understanding breast cancer mechanisms and for the future application of customized therapeutic decisions.

  3. Study of subgrid-scale velocity models for reacting and nonreacting flows

    NASA Astrophysics Data System (ADS)

    Langella, I.; Doan, N. A. K.; Swaminathan, N.; Pope, S. B.

    2018-05-01

    A study is conducted to identify advantages and limitations of existing large-eddy simulation (LES) closures for the subgrid-scale (SGS) kinetic energy using a database of direct numerical simulations (DNS). The analysis is conducted for both reacting and nonreacting flows, different turbulence conditions, and various filter sizes. A model based on dissipation and diffusion of momentum (the LD-D model) is proposed in this paper, informed by the observed behavior of four existing models. This model shows the best overall agreement with DNS statistics. Two main investigations are conducted for both reacting and nonreacting flows: (i) an investigation of the robustness of the model constants, showing that commonly used constants lead to a severe underestimation of the SGS kinetic energy and highlighting their dependence on Reynolds number and filter size; and (ii) an investigation of the statistical behavior of the SGS closures, which suggests that the dissipation of momentum is the key parameter to be considered in such closures and that the dilatation effect is important and must be captured correctly in reacting flows. Additional properties of SGS kinetic energy modeling are identified and discussed.

  4. PPM1D Mosaic Truncating Variants in Ovarian Cancer Cases May Be Treatment-Related Somatic Mutations

    PubMed Central

    Pharoah, Paul D. P.; Song, Honglin; Dicks, Ed; Intermaggio, Maria P.; Harrington, Patricia; Baynes, Caroline; Alsop, Kathryn; Bogdanova, Natalia; Cicek, Mine S.; Cunningham, Julie M.; Fridley, Brooke L.; Gentry-Maharaj, Aleksandra; Hillemanns, Peter; Lele, Shashi; Lester, Jenny; McGuire, Valerie; Moysich, Kirsten B.; Poblete, Samantha; Sieh, Weiva; Sucheston-Campbell, Lara; Widschwendter, Martin; Whittemore, Alice S.; Dörk, Thilo; Menon, Usha; Odunsi, Kunle; Goode, Ellen L.; Karlan, Beth Y.; Bowtell, David D.; Gayther, Simon A.; Ramus, Susan J.

    2016-01-01

    Mosaic truncating mutations in the protein phosphatase, Mg2+/Mn2+-dependent, 1D (PPM1D) gene have recently been reported with a statistically significantly greater frequency in lymphocyte DNA from ovarian cancer case patients compared with unaffected control patients. Using massively parallel sequencing (MPS) we identified truncating PPM1D mutations in 12 of 3236 epithelial ovarian cancer (EOC) case patients (0.37%) but in only one of 3431 unaffected control patients (0.03%) (P = .001). All statistical tests were two-sided. A combination of Sanger sequencing, pyrosequencing, and MPS data suggested that 12 of the 13 mutations were mosaic. All mutations were identified in post-chemotherapy treatment blood samples from case patients (n = 1827) (average 1234 days post-treatment in carriers) rather than from cases collected pretreatment (less than 14 days after diagnosis, n = 1384) (P = .002). These data suggest that PPM1D variants in EOC cases are primarily somatic mosaic mutations caused by treatment and are not associated with germline predisposition to EOC. PMID:26823519

  5. MixGF: spectral probabilities for mixture spectra from more than one peptide.

    PubMed

    Wang, Jian; Bourne, Philip E; Bandeira, Nuno

    2014-12-01

    In large-scale proteomic experiments, multiple peptide precursors are often cofragmented simultaneously in the same mixture tandem mass (MS/MS) spectrum. These spectra tend to elude current computational tools because of the ubiquitous assumption that each spectrum is generated from only one peptide. Therefore, tools that consider multiple peptide matches to each MS/MS spectrum can potentially improve the relatively low spectrum identification rate often observed in proteomics experiments. More importantly, data independent acquisition protocols promoting the cofragmentation of multiple precursors are emerging as alternative methods that can greatly improve the throughput of peptide identifications, but their success also depends on the availability of algorithms to identify multiple peptides from each MS/MS spectrum. Here we address a fundamental question in the identification of mixture MS/MS spectra: determining the statistical significance of multiple peptides matched to a given MS/MS spectrum. We propose the MixGF generating function model to rigorously compute the statistical significance of peptide identifications for mixture spectra and show that this approach improves the sensitivity of current mixture spectra database search tools by ≈30-390%. Analysis of multiple data sets with MixGF reveals that in complex biological samples the number of identified mixture spectra can be as high as 20% of all the identified spectra and the number of unique peptides identified only in mixture spectra can be up to 35.4% of those identified in single-peptide spectra. © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.

  6. MixGF: Spectral Probabilities for Mixture Spectra from more than One Peptide*

    PubMed Central

    Wang, Jian; Bourne, Philip E.; Bandeira, Nuno

    2014-01-01

    In large-scale proteomic experiments, multiple peptide precursors are often cofragmented simultaneously in the same mixture tandem mass (MS/MS) spectrum. These spectra tend to elude current computational tools because of the ubiquitous assumption that each spectrum is generated from only one peptide. Therefore, tools that consider multiple peptide matches to each MS/MS spectrum can potentially improve the relatively low spectrum identification rate often observed in proteomics experiments. More importantly, data independent acquisition protocols promoting the cofragmentation of multiple precursors are emerging as alternative methods that can greatly improve the throughput of peptide identifications, but their success also depends on the availability of algorithms to identify multiple peptides from each MS/MS spectrum. Here we address a fundamental question in the identification of mixture MS/MS spectra: determining the statistical significance of multiple peptides matched to a given MS/MS spectrum. We propose the MixGF generating function model to rigorously compute the statistical significance of peptide identifications for mixture spectra and show that this approach improves the sensitivity of current mixture spectra database search tools by ≈30–390%. Analysis of multiple data sets with MixGF reveals that in complex biological samples the number of identified mixture spectra can be as high as 20% of all the identified spectra and the number of unique peptides identified only in mixture spectra can be up to 35.4% of those identified in single-peptide spectra. PMID:25225354

  7. Pitfalls of national routine death statistics for maternal mortality study.

    PubMed

    Saucedo, Monica; Bouvier-Colle, Marie-Hélène; Chantry, Anne A; Lamarche-Vadel, Agathe; Rey, Grégoire; Deneux-Tharaux, Catherine

    2014-11-01

    The lessons learned from the study of maternal deaths depend on the accuracy of data. Our objective was to assess time trends in the underestimation of maternal mortality (MM) in the national routine death statistics in France and to evaluate their current accuracy for the selection and causes of maternal deaths. National data obtained by enhanced methods in 1989, 1999, and 2007-09 were used as the gold standard to assess time trends in the underestimation of MM ratios (MMRs) in death statistics. Enhanced data and death statistics for 2007-09 were further compared by characterising false negatives (FNs) and false positives (FPs). The distribution of cause-specific MMRs, as assessed by each system, was described. Underestimation of MM in death statistics decreased from 55.6% in 1989 to 11.4% in 2007-09 (P < 0.001). In 2007-09, of 787 pregnancy-associated deaths, 254 were classified as maternal by the enhanced system and 211 by the death statistics; 34% of maternal deaths in the enhanced system were FNs in the death statistics, and 20% of maternal deaths in the death statistics were FPs. The hierarchy of causes of MM differed between the two systems. The discordances were mainly explained by the lack of precision in the drafting of death certificates by clinicians. Although the underestimation of MM in routine death statistics has decreased substantially over time, one third of maternal deaths remain unidentified, and the main causes of death are incorrectly identified in these data. Defining relevant priorities in maternal health requires the use of enhanced methods for MM study. © 2014 John Wiley & Sons Ltd.

  8. Effects of Inaccurate Identification of Interictal Epileptiform Discharges in Concurrent EEG-fMRI

    NASA Astrophysics Data System (ADS)

    Gkiatis, K.; Bromis, K.; Kakkos, I.; Karanasiou, I. S.; Matsopoulos, G. K.; Garganis, K.

    2017-11-01

    Concurrent continuous EEG-fMRI is a novel multimodal technique that is finding its way into clinical practice in epilepsy. EEG time series are used to identify the timing of interictal epileptiform discharges (IEDs), which is then included in a GLM analysis of the fMRI data to localize the epileptic onset zone. Nevertheless, there are still concerns about the reliability of BOLD changes correlated with IEDs. Even though IEDs are identified by an experienced neurologist-epileptologist, the reliability and concordance of the mark-ups depend on many factors, including the level of fatigue, the amount of time spent on the mark-up or, in some cases, even the screen used to display the time series. This investigation aims to unravel the effect of misidentification or inaccuracy in the mark-ups of IEDs on the fMRI statistical parametric maps. Concurrent EEG-fMRI was conducted in six subjects with various types of epilepsy. IEDs were identified by an experienced neurologist-epileptologist. Analysis of EEG was performed with EEGLAB and analysis of fMRI was conducted in FSL. Preliminary results revealed lower statistical significance when events were missed or when the marked IED periods were longer than the actual ones, and the introduction of false positives and false negatives in statistical parametric maps when random events were included in the GLM on top of the IEDs. Our results suggest that EEG mark-ups for simultaneous EEG-fMRI should be done with caution by an experienced and well-rested neurologist, as they affect the fMRI results in various and unpredictable ways.
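
    A hypothetical illustration (not the study's pipeline) of why mark-up accuracy matters: build an HRF-convolved regressor from IED onset times and watch the GLM t-statistic drop when onsets are jittered or half the events are missed. The HRF shape, noise level, and event counts below are invented.

```python
import numpy as np
from scipy.stats import gamma

tr, n_vol = 2.0, 300
frame_times = np.arange(n_vol) * tr

def hrf(lags):                                         # crude double-gamma HRF
    return gamma.pdf(lags, 6) - 0.35 * gamma.pdf(lags, 12)

def regressor(onsets):
    x = np.zeros(n_vol)
    for onset in onsets:
        lags = frame_times - onset
        x[lags >= 0] += hrf(lags[lags >= 0])
    return x

rng = np.random.default_rng(7)
true_onsets = np.sort(rng.uniform(0, n_vol * tr, 40))  # "true" IED times in seconds
bold = 8.0 * regressor(true_onsets) + rng.normal(0, 1, n_vol)   # synthetic voxel signal

def glm_t(onsets):
    X = np.column_stack([regressor(onsets), np.ones(n_vol)])
    beta, res, *_ = np.linalg.lstsq(X, bold, rcond=None)
    se = np.sqrt(res[0] / (n_vol - 2) * np.linalg.inv(X.T @ X)[0, 0])
    return beta[0] / se

print("accurate mark-up   t =", round(glm_t(true_onsets), 1))
print("jittered mark-up   t =", round(glm_t(true_onsets + rng.normal(0, 6, 40)), 1))
print("half events missed t =", round(glm_t(true_onsets[::2]), 1))
```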

  9. Identifying taxonomic and functional surrogates for spring biodiversity conservation.

    PubMed

    Jyväsjärvi, Jussi; Virtanen, Risto; Ilmonen, Jari; Paasivirta, Lauri; Muotka, Timo

    2018-02-27

    Surrogate approaches are widely used to estimate overall taxonomic diversity for conservation planning. Surrogate taxa are frequently selected based on rarity or charisma, whereas selection through statistical modeling has been applied rarely. We used boosted-regression-tree models (BRT) fitted to biological data from 165 springs to identify bryophyte and invertebrate surrogates for taxonomic and functional diversity of boreal springs. We focused on these 2 groups because they are well known and abundant in most boreal springs. The best indicators of taxonomic versus functional diversity differed. The bryophyte Bryum weigelii and the chironomid larva Paratrichocladius skirwithensis best indicated taxonomic diversity, whereas the isopod Asellus aquaticus and the chironomid Macropelopia spp. were the best surrogates of functional diversity. In a scoring algorithm for priority-site selection, taxonomic surrogates performed only slightly better than random selection for all spring-dwelling taxa, but they were very effective in representing spring specialists, providing a distinct improvement over random solutions. However, the surrogates for taxonomic diversity represented functional diversity poorly and vice versa. When combined with cross-taxon complementarity analyses, surrogate selection based on statistical modeling provides a promising approach for identifying groundwater-dependent ecosystems of special conservation value, a key requirement of the EU Water Framework Directive. © 2018 Society for Conservation Biology.
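
    A hedged sketch of the surrogate-screening idea, using scikit-learn's GradientBoostingRegressor as a stand-in for the boosted-regression-tree models in the study; the spring, taxon, and richness data below are invented, and the real analysis models taxonomic and functional diversity jointly from many predictors.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(8)
n_springs, n_taxa = 165, 40
abundance = rng.poisson(2.0, size=(n_springs, n_taxa)).astype(float)   # synthetic counts
richness = abundance[:, 0] * 1.5 + abundance[:, 7] + rng.normal(0, 1, n_springs)

# Rank each candidate taxon by how well its abundance alone predicts site richness
scores = []
for j in range(n_taxa):
    brt = GradientBoostingRegressor(n_estimators=200, learning_rate=0.05, max_depth=2)
    r2 = cross_val_score(brt, abundance[:, [j]], richness, cv=5, scoring="r2").mean()
    scores.append((r2, j))
best = sorted(scores, reverse=True)[:3]
print("top candidate surrogate taxa (index, CV R2):", [(j, round(r, 2)) for r, j in best])
```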

  10. A scan statistic for identifying optimal risk windows in vaccine safety studies using self-controlled case series design.

    PubMed

    Xu, Stanley; Hambidge, Simon J; McClure, David L; Daley, Matthew F; Glanz, Jason M

    2013-08-30

    In the examination of the association between vaccines and rare adverse events after vaccination in postlicensure observational studies, it is challenging to define appropriate risk windows because prelicensure RCTs provide little insight on the timing of specific adverse events. Past vaccine safety studies have often used prespecified risk windows based on prior publications, biological understanding of the vaccine, and expert opinion. Recently, a data-driven approach was developed to identify appropriate risk windows for vaccine safety studies that use the self-controlled case series design. This approach employs both the maximum incidence rate ratio and the linear relation between the estimated incidence rate ratio and the inverse of average person time at risk, given a specified risk window. In this paper, we present a scan statistic that can identify appropriate risk windows in vaccine safety studies using the self-controlled case series design while taking into account the dependence of time intervals within an individual and while adjusting for time-varying covariates such as age and seasonality. This approach uses the maximum likelihood ratio test based on fixed-effects models, which has been used for analyzing data from self-controlled case series design in addition to conditional Poisson models. Copyright © 2013 John Wiley & Sons, Ltd.

  11. Heterogeneity of Depressive Symptom Trajectories through Adolescence: Predicting Outcomes in Young Adulthood.

    PubMed

    Chaiton, Michael; Contreras, Gisèle; Brunet, Jennifer; Sabiston, Catherine M; O'Loughlin, Erin; Low, Nancy C P; Karp, Igor; Barnett, Tracie A; O'Loughlin, Jennifer

    2013-05-01

    This study describes developmental trajectories of depressive symptoms in adolescents and examines the association between trajectory group and mental health outcomes in young adulthood. Depressive symptoms were self-reported every three months from grade seven through grade 11 by 1293 adolescents in the Nicotine Dependence in Teens (NDIT) study and followed in young adulthood (average age 20.4, SD=0.7, n=865). Semi-parametric growth modeling was used to identify sex-specific trajectories of depressive symptoms. Three distinct trajectory groups were identified: 50% of boys and 29% of girls exhibited low, decreasing levels of depressive symptoms; 14% of boys and 28% of girls exhibited high and increasing levels; and 36% of boys and 43% of girls exhibited moderate levels with linear increase. Trajectory group was a statistically significant independent predictor of depression, stress, and self-rated mental health in young adulthood in boys and girls. Boys, but not girls, in the high trajectory group had a statistically significant increase in the likelihood of seeking psychiatric care. Substantial heterogeneity in changes in depressive symptoms over time was found. Because early depressive symptoms predict mental health problems in young adulthood, monitoring adolescents for depressive symptoms may help identify those most at risk and in need of intervention.

  12. Frequency-selective fading statistics of shallow-water acoustic communication channel with a few multipaths

    NASA Astrophysics Data System (ADS)

    Bae, Minja; Park, Jihyun; Kim, Jongju; Xue, Dandan; Park, Kyu-Chil; Yoon, Jong Rak

    2016-07-01

    The bit error rate of an underwater acoustic communication system is related to multipath fading statistics, which determine the signal-to-noise ratio. The amplitude and delay of each path depend on sea surface roughness, propagation medium properties, and source-to-receiver range as a function of frequency. Therefore, received signals will show frequency-dependent fading. A shallow-water acoustic communication channel generally shows a few strong multipaths that interfere with each other and the resulting interference affects the fading statistics model. In this study, frequency-selective fading statistics are modeled on the basis of the phasor representation of the complex path amplitude. The fading statistics distribution is parameterized by the frequency-dependent constructive or destructive interference of multipaths. At a 16 m depth with a muddy bottom, a wave height of 0.2 m, and source-to-receiver ranges of 100 and 400 m, fading statistics tend to show a Rayleigh distribution at a destructive interference frequency, but a Rice distribution at a constructive interference frequency. The theoretical fading statistics well matched the experimental ones.
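
    As an illustrative aside (synthetic envelope samples and invented parameters, not the experiment's measurements): one can ask which fading model better describes received-envelope samples by comparing maximum-likelihood Rayleigh and Rice fits.

```python
import numpy as np
from scipy.stats import rayleigh, rice

rng = np.random.default_rng(9)
# envelope of a dominant (constructive) path plus diffuse scatter -> Rician-like
dominant, sigma = 1.0, 0.3
envelope = np.abs(dominant + sigma * (rng.normal(size=5000) + 1j * rng.normal(size=5000)))

for name, dist in [("Rayleigh", rayleigh), ("Rice", rice)]:
    params = dist.fit(envelope, floc=0)
    ll = dist.logpdf(envelope, *params).sum()
    print(f"{name:8s} log-likelihood = {ll:9.1f}")
# the higher log-likelihood identifies the better fading model for these samples
```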

  13. The Attenuation of Correlation Coefficients: A Statistical Literacy Issue

    ERIC Educational Resources Information Center

    Trafimow, David

    2016-01-01

    Much of the science reported in the media depends on correlation coefficients. But the size of correlation coefficients depends, in part, on the reliability with which the correlated variables are measured. Understanding this is a statistical literacy issue.
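
    For reference, the attenuation relation at issue is Spearman's classical correction for unreliability; with illustrative reliability values it reads:

```latex
% Attenuation of an observed correlation by measurement unreliability,
% where r_xx and r_yy are the reliabilities of the two measures.
\[
  r_{xy}^{\mathrm{observed}} \;=\; r_{xy}^{\mathrm{true}}\,\sqrt{r_{xx}\,r_{yy}}
\]
% Illustrative values: a true correlation of 0.60 measured with
% reliabilities 0.70 and 0.80 is observed, on average, as
\[
  0.60 \times \sqrt{0.70 \times 0.80} \;\approx\; 0.45 .
\]
```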

  14. A Robust Semi-Parametric Test for Detecting Trait-Dependent Diversification.

    PubMed

    Rabosky, Daniel L; Huang, Huateng

    2016-03-01

    Rates of species diversification vary widely across the tree of life and there is considerable interest in identifying organismal traits that correlate with rates of speciation and extinction. However, it has been challenging to develop methodological frameworks for testing hypotheses about trait-dependent diversification that are robust to phylogenetic pseudoreplication and to directionally biased rates of character change. We describe a semi-parametric test for trait-dependent diversification that explicitly requires replicated associations between character states and diversification rates to detect effects. To use the method, diversification rates are reconstructed across a phylogenetic tree with no consideration of character states. A test statistic is then computed to measure the association between species-level traits and the corresponding diversification rate estimates at the tips of the tree. The empirical value of the test statistic is compared to a null distribution that is generated by structured permutations of evolutionary rates across the phylogeny. The test is applicable to binary discrete characters as well as continuous-valued traits and can accommodate extremely sparse sampling of character states at the tips of the tree. We apply the test to several empirical data sets and demonstrate that the method has acceptable Type I error rates. © The Author(s) 2015. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
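
    As a rough illustration of the permutation logic (simplified: a plain, unstructured permutation rather than the structured permutations the published method uses), the tip-level association can be tested as below; the rate estimates and trait vector are synthetic stand-ins, not outputs of the actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: per-species diversification rate estimates and a
# binary trait (in the real method, rates are reconstructed from the tree).
n_species = 300
trait = rng.integers(0, 2, n_species)
rates = 0.1 + 0.05 * trait + 0.03 * rng.standard_normal(n_species)

# Test statistic: difference in mean rate between the two character states.
obs = rates[trait == 1].mean() - rates[trait == 0].mean()

# Null distribution from permutations of the rates (unstructured here; the
# published test permutes rates in a phylogenetically structured way).
null = []
for _ in range(5000):
    perm = rng.permutation(rates)
    null.append(perm[trait == 1].mean() - perm[trait == 0].mean())
null = np.array(null)

p_value = (np.sum(np.abs(null) >= abs(obs)) + 1) / (len(null) + 1)
print(f"observed difference = {obs:.3f}, permutation p = {p_value:.4f}")
```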

  15. Individual determinants of research utilization: a systematic review.

    PubMed

    Estabrooks, Carole A; Floyd, Judith A; Scott-Findlay, Shannon; O'Leary, Katherine A; Gushta, Matthew

    2003-09-01

    In order to design interventions that increase research use in nursing, it is necessary to have an understanding of what influences research use. To report findings on a systematic review of studies that examine individual characteristics of nurses and how they influence the utilization of research. A survey of published articles in English that examine the influence of individual factors on the research utilization behaviour of nurses, without restriction of the study design, from selected computerized databases and hand searches. Articles had to measure one or more individual determinants of research utilization, measure the dependent variable (research utilization), and evaluate the relationship between the dependent and independent variables. The studies also had to indicate the direction of the relationship between the independent and dependent variables, report a P-value and the statistic used, and indicate the magnitude of the relationship. Six categories of potential individual determinants were identified: beliefs and attitudes, involvement in research activities, information seeking, professional characteristics, education and other socio-economic factors. Research design, sampling, measurement, and statistical analysis were examined to evaluate methodological quality. Methodological problems surfaced in all of the studies and, apart from attitude to research, there was little to suggest that any potential individual determinant influences research use. Important conceptual and measurement issues with regard to research utilization could be better addressed if research in the area were undertaken longitudinally by multi-disciplinary teams of researchers.

  16. The influence of narrative v. statistical information on perceiving vaccination risks.

    PubMed

    Betsch, Cornelia; Ulshöfer, Corina; Renkewitz, Frank; Betsch, Tilmann

    2011-01-01

    Health-related information found on the Internet is increasing and impacts patient decision making, e.g. regarding vaccination decisions. In addition to statistical information (e.g. incidence rates of vaccine adverse events), narrative information is also widely available such as postings on online bulletin boards. Previous research has shown that narrative information can impact treatment decisions, even when statistical information is presented concurrently. As the determinants of this effect are largely unknown, we varied features of the narratives to identify mechanisms through which narratives impact risk judgments. An online bulletin board setting provided participants with statistical information and authentic narratives about the occurrence and nonoccurrence of adverse events. Experiment 1 followed a single factorial design with 1, 2, or 4 narratives out of 10 reporting adverse events. Experiment 2 implemented a 2 (statistical risk 20% vs. 40%) × 2 (2/10 vs. 4/10 narratives reporting adverse events) × 2 (high vs. low richness) × 2 (high vs. low emotionality) between-subjects design. Dependent variables were perceived risk of side-effects and vaccination intentions. Experiment 1 shows an inverse relation between the number of narratives reporting adverse events and vaccination intentions, which was mediated by the perceived risk of vaccinating. Experiment 2 showed a stronger influence of the number of narratives than of the statistical risk information. High (vs. low) emotional narratives had a greater impact on the perceived risk, while richness had no effect. The number of narratives influences risk judgments and can potentially override statistical information about risk.

  17. Statistical analysis of Geopotential Height (GH) timeseries based on Tsallis non-extensive statistical mechanics

    NASA Astrophysics Data System (ADS)

    Karakatsanis, L. P.; Iliopoulos, A. C.; Pavlos, E. G.; Pavlos, G. P.

    2018-02-01

    In this paper, we perform statistical analysis of time series deriving from Earth's climate. The time series are concerned with Geopotential Height (GH) and correspond to temporal and spatial components of the global distribution of monthly average values during the period 1948-2012. The analysis is based on Tsallis non-extensive statistical mechanics and in particular on the estimation of Tsallis' q-triplet, namely {qstat, qsens, qrel}, the reconstructed phase space and the estimation of the correlation dimension and the Hurst exponent of rescaled range analysis (R/S). The deviation of the Tsallis q-triplet from unity indicates a non-Gaussian (Tsallis q-Gaussian) non-extensive character with heavy-tailed probability density functions (PDFs), multifractal behavior and long-range dependence for all time series considered. Noticeable differences in the q-triplet estimates were also found in the time series at distinct spatial or temporal regions. Moreover, the reconstructed phase space revealed a lower-dimensional fractal set in the GH dynamics (strong self-organization), and the estimated Hurst exponent indicated multifractality, non-Gaussianity and persistence. The analysis provides significant information for identifying and characterizing the dynamical characteristics of the Earth's climate.
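
    For reference, the q-Gaussian family behind the qstat index is built on the q-exponential (a standard definition in Tsallis statistics, not specific to this study); the ordinary exponential and Gaussian are recovered in the limit q → 1:

```latex
% q-exponential and q-Gaussian of Tsallis non-extensive statistics
\[
  e_q(x) \;=\; \bigl[\,1 + (1-q)\,x\,\bigr]^{\frac{1}{1-q}},
  \qquad
  p_q(x) \;\propto\; e_q\!\bigl(-\beta x^{2}\bigr)
        \;=\; \bigl[\,1 - (1-q)\,\beta\,x^{2}\,\bigr]^{\frac{1}{1-q}} .
\]
```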

  18. Statistical Quality Control of Moisture Data in GEOS DAS

    NASA Technical Reports Server (NTRS)

    Dee, D. P.; Rukhovets, L.; Todling, R.

    1999-01-01

    A new statistical quality control algorithm was recently implemented in the Goddard Earth Observing System Data Assimilation System (GEOS DAS). The final step in the algorithm consists of an adaptive buddy check that either accepts or rejects outlier observations based on a local statistical analysis of nearby data. A basic assumption in any such test is that the observed field is spatially coherent, in the sense that nearby data can be expected to confirm each other. However, the buddy check resulted in excessive rejection of moisture data, especially during the Northern Hemisphere summer. The analysis moisture variable in GEOS DAS is water vapor mixing ratio. Observational evidence shows that the distribution of mixing ratio errors is far from normal. Furthermore, spatial correlations among mixing ratio errors are highly anisotropic and difficult to identify. Both factors contribute to the poor performance of the statistical quality control algorithm. To alleviate the problem, we applied the buddy check to relative humidity data instead. This variable explicitly depends on temperature and therefore exhibits a much greater spatial coherence. As a result, reject rates of moisture data are much more reasonable and homogeneous in time and space.
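
    The buddy-check idea can be sketched in a few lines: an observation's departure from the background is compared with the spread of its neighbours' departures, and observations that disagree with their buddies by too much are rejected. The sketch below is schematic only, not the adaptive GEOS DAS implementation; the threshold and neighbour definition are hypothetical.

```python
import numpy as np

def buddy_check(departures, neighbors, tol=3.0):
    """Schematic buddy check: reject observations whose background departure
    disagrees with the mean departure of nearby observations by more than
    `tol` local standard deviations.  Not the GEOS DAS algorithm."""
    departures = np.asarray(departures, dtype=float)
    accept = np.ones(departures.size, dtype=bool)
    for i, nbrs in enumerate(neighbors):
        if len(nbrs) == 0:
            continue  # no buddies available, leave the observation untested
        local = departures[list(nbrs)]
        spread = local.std() + 1e-6
        if abs(departures[i] - local.mean()) > tol * spread:
            accept[i] = False
    return accept

# Toy usage: five observations, each using the other four as buddies.
deps = [0.1, -0.2, 0.05, 5.0, -0.1]          # the fourth value is an outlier
nbrs = [[j for j in range(5) if j != i] for i in range(5)]
print(buddy_check(deps, nbrs))               # -> [ True  True  True False  True]
```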

  19. A Statistics-based Platform for Quantitative N-terminome Analysis and Identification of Protease Cleavage Products*

    PubMed Central

    auf dem Keller, Ulrich; Prudova, Anna; Gioia, Magda; Butler, Georgina S.; Overall, Christopher M.

    2010-01-01

    Terminal amine isotopic labeling of substrates (TAILS), our recently introduced platform for quantitative N-terminome analysis, enables wide dynamic range identification of original mature protein N-termini and protease cleavage products. Modifying TAILS by use of isobaric tag for relative and absolute quantification (iTRAQ)-like labels for quantification together with a robust statistical classifier derived from experimental protease cleavage data, we report reliable and statistically valid identification of proteolytic events in complex biological systems in MS2 mode. The statistical classifier is supported by a novel parameter evaluating ion intensity-dependent quantification confidences of single peptide quantifications, the quantification confidence factor (QCF). Furthermore, the isoform assignment score (IAS) is introduced, a new scoring system for the evaluation of single peptide-to-protein assignments based on high confidence protein identifications in the same sample prior to negative selection enrichment of N-terminal peptides. By these approaches, we identified and validated, in addition to known substrates, low abundance novel bioactive MMP-2 targets including the plasminogen receptor S100A10 (p11) and the proinflammatory cytokine proEMAP/p43 that were previously undescribed. PMID:20305283

  20. On the inequivalence of the CH and CHSH inequalities due to finite statistics

    NASA Astrophysics Data System (ADS)

    Renou, M. O.; Rosset, D.; Martin, A.; Gisin, N.

    2017-06-01

    Different variants of a Bell inequality, such as CHSH and CH, are known to be equivalent when evaluated on nonsignaling outcome probability distributions. However, in experimental setups, the outcome probability distributions are estimated using a finite number of samples. Therefore the nonsignaling conditions are only approximately satisfied and the robustness of the violation depends on the chosen inequality variant. We explain that phenomenon using the decomposition of the space of outcome probability distributions under the action of the symmetry group of the scenario, and propose a method to optimize the statistical robustness of a Bell inequality. In the process, we describe the finite group composed of relabeling of parties, measurement settings and outcomes, and identify correspondences between the irreducible representations of this group and properties of outcome probability distributions such as normalization, signaling or having uniform marginals.
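
    For reference, the two inequality variants being compared are usually written as follows (standard textbook forms, with E the correlators and P joint and marginal outcome probabilities); on exactly nonsignaling distributions the CH form is equivalent to CHSH, which is why the difference only matters once finite statistics enter.

```latex
% CHSH (correlator form) and CH (probability form) Bell inequalities
\[
  \bigl| E(a,b) + E(a,b') + E(a',b) - E(a',b') \bigr| \;\le\; 2 ,
\]
\[
  P(a,b) + P(a,b') + P(a',b) - P(a',b') - P_A(a) - P_B(b) \;\le\; 0 .
\]
```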

  1. Discrete approach to stochastic parametrization and dimension reduction in nonlinear dynamics.

    PubMed

    Chorin, Alexandre J; Lu, Fei

    2015-08-11

    Many physical systems are described by nonlinear differential equations that are too complicated to solve in full. A natural way to proceed is to divide the variables into those that are of direct interest and those that are not, formulate solvable approximate equations for the variables of greater interest, and use data and statistical methods to account for the impact of the other variables. In the present paper we consider time-dependent problems and introduce a fully discrete solution method, which simplifies both the analysis of the data and the numerical algorithms. The resulting time series are identified by a NARMAX (nonlinear autoregression moving average with exogenous input) representation familiar from engineering practice. The connections with the Mori-Zwanzig formalism of statistical physics are discussed, as well as an application to the Lorenz 96 system.
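
    A minimal discrete illustration of the fitting idea, under simplifying assumptions: only a polynomial NARX term (autoregression plus an exogenous input) is fitted by ordinary least squares, omitting the moving-average noise terms the full NARMAX representation includes. The toy series and coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "data": a scalar series with a nonlinear one-step dependence plus an
# exogenous forcing u and noise (a stand-in for unresolved variables).
T = 5000
u = np.sin(0.05 * np.arange(T))
x = np.zeros(T)
for t in range(T - 1):
    x[t + 1] = 0.8 * x[t] - 0.3 * x[t] ** 3 + 0.5 * u[t] + 0.1 * rng.standard_normal()

# NARX-style regression: predict x[t+1] from polynomial terms in x[t] and u[t].
X = np.column_stack([x[:-1], x[:-1] ** 2, x[:-1] ** 3, u[:-1]])
y = x[1:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated coefficients [x, x^2, x^3, u]:", np.round(coef, 3))
```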

  2. Histogram of gradient and binarized statistical image features of wavelet subband-based palmprint features extraction

    NASA Astrophysics Data System (ADS)

    Attallah, Bilal; Serir, Amina; Chahir, Youssef; Boudjelal, Abdelwahhab

    2017-11-01

    Palmprint recognition systems are dependent on feature extraction. A method of feature extraction using higher discrimination information was developed to characterize palmprint images. In this method, two individual feature extraction techniques are applied to a discrete wavelet transform of a palmprint image, and their outputs are fused. The two techniques used in the fusion are the histogram of gradient and the binarized statistical image features. They are then evaluated using an extreme learning machine classifier before selecting a feature based on principal component analysis. Three palmprint databases, the Hong Kong Polytechnic University (PolyU) Multispectral Palmprint Database, Hong Kong PolyU Palmprint Database II, and the Delhi Touchless (IIDT) Palmprint Database, are used in this study. The study shows that our method effectively identifies and verifies palmprints and outperforms other methods based on feature extraction.

  3. Statistical Detection of Atypical Aircraft Flights

    NASA Technical Reports Server (NTRS)

    Statler, Irving; Chidester, Thomas; Shafto, Michael; Ferryman, Thomas; Amidan, Brett; Whitney, Paul; White, Amanda; Willse, Alan; Cooley, Scott; Jay, Joseph

    2006-01-01

    A computational method and software to implement the method have been developed to sift through vast quantities of digital flight data to alert human analysts to aircraft flights that are statistically atypical in ways that signify that safety may be adversely affected. On a typical day, there are tens of thousands of flights in the United States and several times that number throughout the world. Depending on the specific aircraft design, the volume of data collected by sensors and flight recorders can range from a few dozen to several thousand parameters per second during a flight. Whereas these data have long been utilized in investigating crashes, the present method is oriented toward helping to prevent crashes by enabling routine monitoring of flight operations to identify portions of flights that may be of interest with respect to safety issues.

  4. The use of open source bioinformatics tools to dissect transcriptomic data.

    PubMed

    Nitsche, Benjamin M; Ram, Arthur F J; Meyer, Vera

    2012-01-01

    Microarrays are a valuable technology to study fungal physiology on a transcriptomic level. Various microarray platforms are available comprising both single and two channel arrays. Despite different technologies, preprocessing of microarray data generally includes quality control, background correction, normalization, and summarization of probe level data. Subsequently, depending on the experimental design, diverse statistical analyses can be performed, including the identification of differentially expressed genes and the construction of gene coexpression networks. We describe how Bioconductor, a collection of open source and open development packages for the statistical programming language R, can be used for dissecting microarray data. We provide fundamental details that facilitate the process of getting started with R and Bioconductor. Using two publicly available microarray datasets from Aspergillus niger, we give detailed protocols on how to identify differentially expressed genes and how to construct gene coexpression networks.

  5. Extreme Unconditional Dependence Vs. Multivariate GARCH Effect in the Analysis of Dependence Between High Losses on Polish and German Stock Indexes

    NASA Astrophysics Data System (ADS)

    Rokita, Pawel

    Classical portfolio diversification methods do not take account of any dependence between extreme returns (losses). Many researchers, however, provide empirical evidence for various assets that extreme losses co-occur. If the co-occurrence is frequent enough to be statistically significant, it may seriously influence portfolio risk. Such effects may result from a few different properties of financial time series, for instance: (1) extreme dependence in a (long-term) unconditional distribution, (2) extreme dependence in subsequent conditional distributions, (3) time-varying conditional covariance, (4) time-varying (long-term) unconditional covariance, (5) market contagion. Moreover, a mix of these properties may be present in return time series. Modeling each of them requires different approaches. It seems reasonable to investigate whether distinguishing between the properties is highly significant for portfolio risk measurement. If it is, identifying the effect responsible for high loss co-occurrence would be of great importance. If it is not, the best solution would be selecting the easiest-to-apply model. This article concentrates on two of the aforementioned properties: extreme dependence (in a long-term unconditional distribution) and time-varying conditional covariance.

  6. A Statistical Analysis of IrisCode and Its Security Implications.

    PubMed

    Kong, Adams Wai-Kin

    2015-03-01

    IrisCode has been used to gather iris data for 430 million people. Because of the huge impact of IrisCode, it is vital that it is completely understood. This paper first studies the relationship between bit probabilities and a mean of iris images (The mean of iris images is defined as the average of independent iris images.) and then uses the Chi-square statistic, the correlation coefficient and a resampling algorithm to detect statistical dependence between bits. The results show that the statistical dependence forms a graph with a sparse and structural adjacency matrix. A comparison of this graph with a graph whose edges are defined by the inner product of the Gabor filters that produce IrisCodes shows that partial statistical dependence is induced by the filters and propagates through the graph. Using this statistical information, the security risk associated with two patented template protection schemes that have been deployed in commercial systems for producing application-specific IrisCodes is analyzed. To retain high identification speed, they use the same key to lock all IrisCodes in a database. The belief has been that if the key is not compromised, the IrisCodes are secure. This study shows that even without the key, application-specific IrisCodes can be unlocked and that the key can be obtained through the statistical dependence detected.
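
    How a chi-square test can flag statistical dependence between two bit positions across a set of codes is sketched below; this illustrates the test itself, not the paper's full graph construction or resampling algorithm, and the toy bit columns are synthetic.

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)

# Toy bit columns across 10,000 codes: bit_b partially copies bit_a, so the
# two positions are statistically dependent (illustration only).
n = 10_000
bit_a = rng.integers(0, 2, n)
noise = rng.integers(0, 2, n)
copy = rng.random(n) < 0.3                 # 30% of positions copy bit_a
bit_b = np.where(copy, bit_a, noise)

# 2x2 contingency table of joint bit values and the chi-square test.
table = np.array([[np.sum((bit_a == i) & (bit_b == j)) for j in (0, 1)]
                  for i in (0, 1)])
chi2, p, dof, _ = chi2_contingency(table)
print(table)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.2e}")
```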

  7. Topical tranexamic acid in total knee replacement: a systematic review and meta-analysis.

    PubMed

    Panteli, Michalis; Papakostidis, Costas; Dahabreh, Ziad; Giannoudis, Peter V

    2013-10-01

    To examine the safety and efficacy of topical use of tranexamic acid (TA) in total knee arthroplasty (TKA). An electronic literature search of PubMed Medline; Ovid Medline; Embase; and the Cochrane Library was performed, identifying studies published in any language from 1966 to February 2013. The studies enrolled adults undergoing a primary TKA, where topical TA was used. An inverse variance statistical method and either a fixed or random effects model, depending on the absence or presence of statistical heterogeneity, were used; subgroup analysis was performed when possible. We identified a total of seven eligible reports for analysis. Our meta-analysis indicated that when compared with the control group, topical application of TA significantly limited postoperative drain output (mean difference: -268.36ml), total blood loss (mean difference=-220.08ml), Hb drop (mean difference=-0.94g/dL) and lowered the risk of transfusion requirements (risk ratio=0.47, 95% CI=0.26-0.84), without increased risk of thromboembolic events. Sub-group analysis indicated that a higher dose of topical TA (>2g) significantly reduced transfusion requirements. Although the present meta-analysis demonstrated a statistically significant reduction of postoperative blood loss and transfusion requirements with topical use of TA in TKA, the clinical importance of the respective estimates of effect size should be interpreted with caution. I, II. Copyright © 2013 Elsevier B.V. All rights reserved.

  8. Poisson Statistics of Combinatorial Library Sampling Predict False Discovery Rates of Screening

    PubMed Central

    2017-01-01

    Microfluidic droplet-based screening of DNA-encoded one-bead-one-compound combinatorial libraries is a miniaturized, potentially widely distributable approach to small molecule discovery. In these screens, a microfluidic circuit distributes library beads into droplets of activity assay reagent, photochemically cleaves the compound from the bead, then incubates and sorts the droplets based on assay result for subsequent DNA sequencing-based hit compound structure elucidation. Pilot experimental studies revealed that Poisson statistics describe nearly all aspects of such screens, prompting the development of simulations to understand system behavior. Monte Carlo screening simulation data showed that increasing mean library sampling (ε), mean droplet occupancy, or library hit rate all increase the false discovery rate (FDR). Compounds identified as hits on k > 1 beads (the replicate k class) were much more likely to be authentic hits than singletons (k = 1), in agreement with previous findings. Here, we explain this observation by deriving an equation for authenticity, which reduces to the product of a library sampling bias term (exponential in k) and a sampling saturation term (exponential in ε) setting a threshold that the k-dependent bias must overcome. The equation thus quantitatively describes why each hit structure’s FDR is based on its k class, and further predicts the feasibility of intentionally populating droplets with multiple library beads, assaying the micromixtures for function, and identifying the active members by statistical deconvolution. PMID:28682059
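
    A toy Monte Carlo in the spirit of the simulations described, showing that compounds recovered on k > 1 beads are enriched for authentic hits relative to singletons. The library size, sampling depth, hit rate, and sorting error rates below are hypothetical, and the assay model is deliberately simplified.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters (illustration only, not the paper's values).
n_compounds = 20_000      # library size
eps = 1.0                 # mean sampling depth per compound (beads screened)
hit_rate = 0.002          # fraction of truly active compounds
fp_rate = 0.01            # per-bead probability an inactive compound is sorted as a hit
tp_rate = 0.90            # per-bead probability an active compound is sorted as a hit

true_hit = rng.random(n_compounds) < hit_rate
beads = rng.poisson(eps, n_compounds)           # beads sampled per compound
p_sort = np.where(true_hit, tp_rate, fp_rate)
k = rng.binomial(beads, p_sort)                  # hit-called beads per compound

for k_class in (1, 2):
    sel = (k >= k_class) if k_class > 1 else (k == 1)
    if sel.sum():
        fdr = 1.0 - true_hit[sel].mean()         # fraction of called hits that are spurious
        label = "k == 1" if k_class == 1 else "k >= 2"
        print(f"{label}: {sel.sum()} compounds, FDR ~ {fdr:.2f}")
```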

  9. Recurrent network dynamics reconciles visual motion segmentation and integration.

    PubMed

    Medathati, N V Kartheek; Rankin, James; Meso, Andrew I; Kornprobst, Pierre; Masson, Guillaume S

    2017-09-12

    In sensory systems, a range of computational rules are presumed to be implemented by neuronal subpopulations with different tuning functions. For instance, in primate cortical area MT, different classes of direction-selective cells have been identified and related either to motion integration, segmentation or transparency. Still, how such different tuning properties are constructed is unclear. The dominant theoretical viewpoint based on a linear-nonlinear feed-forward cascade does not account for their complex temporal dynamics and their versatility when facing different input statistics. Here, we demonstrate that a recurrent network model of visual motion processing can reconcile these different properties. Using a ring network, we show how excitatory and inhibitory interactions can implement different computational rules such as vector averaging, winner-take-all or superposition. The model also captures ordered temporal transitions between these behaviors. In particular, depending on the inhibition regime the network can switch from motion integration to segmentation, thus being able to compute either a single pattern motion or to superpose multiple inputs as in motion transparency. We thus demonstrate that recurrent architectures can adaptively give rise to different cortical computational regimes depending upon the input statistics, from sensory flow integration to segmentation.

  10. Spatial Pattern Classification for More Accurate Forecasting of Variable Energy Resources

    NASA Astrophysics Data System (ADS)

    Novakovskaia, E.; Hayes, C.; Collier, C.

    2014-12-01

    The accuracy of solar and wind forecasts is becoming increasingly essential as grid operators continue to integrate additional renewable generation onto the electric grid. Forecast errors affect rate payers, grid operators, wind and solar plant maintenance crews and energy traders through increases in prices, project down time or lost revenue. While extensive and beneficial efforts were undertaken in recent years to improve physical weather models for a broad spectrum of applications these improvements have generally not been sufficient to meet the accuracy demands of system planners. For renewables, these models are often used in conjunction with additional statistical models utilizing both meteorological observations and the power generation data. Forecast accuracy can be dependent on specific weather regimes for a given location. To account for these dependencies it is important that parameterizations used in statistical models change as the regime changes. An automated tool, based on an artificial neural network model, has been developed to identify different weather regimes as they impact power output forecast accuracy at wind or solar farms. In this study, improvements in forecast accuracy were analyzed for varying time horizons for wind farms and utility-scale PV plants located in different geographical regions.

  11. Spatial heterogeneity and risk factors for stunting among children under age five in Ethiopia: A Bayesian geo-statistical model.

    PubMed

    Hagos, Seifu; Hailemariam, Damen; WoldeHanna, Tasew; Lindtjørn, Bernt

    2017-01-01

    Understanding the spatial distribution of stunting and underlying factors operating at meso-scale is of paramount importance for intervention design and implementation. Yet, little is known about the spatial distribution of stunting and some discrepancies are documented on the relative importance of reported risk factors. Therefore, the present study aims at exploring the spatial distribution of stunting at meso- (district) scale, and evaluates the effect of spatial dependency on the identification of risk factors and their relative contribution to the occurrence of stunting and severe stunting in a rural area of Ethiopia. A community based cross sectional study was conducted to measure the occurrence of stunting and severe stunting among children aged 0-59 months. Additionally, we collected relevant information on anthropometric measures, dietary habits, parent and child-related demographic and socio-economic status. Latitude and longitude of surveyed households were also recorded. Local Anselin Moran's I was calculated to investigate the spatial variation of stunting prevalence and identify potential local pockets (hotspots) of high prevalence. Finally, we employed a Bayesian geo-statistical model, which accounted for spatial dependency structure in the data, to identify potential risk factors for stunting in the study area. Overall, the prevalence of stunting and severe stunting in the district was 43.7% [95%CI: 40.9, 46.4] and 21.3% [95%CI: 19.5, 23.3] respectively. We identified statistically significant clusters of high prevalence of stunting (hotspots) in the eastern part of the district and clusters of low prevalence (cold spots) in the western part. We found that including the spatial structure of the data in the Bayesian model improved the fit of the stunting model. The Bayesian geo-statistical model indicated that the risk of stunting increased as the child's age increased (OR 4.74; 95% Bayesian credible interval [BCI]: 3.35-6.58) and among boys (OR 1.28; 95%BCI: 1.12-1.45). However, maternal education and household food security were found to be protective against stunting and severe stunting. Stunting prevalence may vary across space at different scales. It is therefore important that nutrition studies and, more importantly, control interventions take into account this spatial heterogeneity in the distribution of nutritional deficits and their underlying associated factors. The findings of this study also indicated that interventions integrating household food insecurity in nutrition programs in the district might help to avert the burden of stunting.
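
    The local Moran statistic used for the hotspot step can be written compactly. The sketch below is an illustrative implementation only; significance would normally be assessed by permutation, and the toy prevalence values and binary contiguity weight matrix W are hypothetical.

```python
import numpy as np

def local_morans_i(x, W):
    """Local Moran's I_i for values x and spatial weight matrix W (n x n).
    W[i, j] > 0 marks j as a neighbour of i.  Illustrative only."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()
    m2 = (z ** 2).mean()              # second moment of the deviations
    return (z / m2) * (W @ z)         # I_i = (z_i / m2) * sum_j w_ij z_j

# Toy example: four locations on a line, neighbours share an edge.
x = np.array([0.9, 0.8, 0.1, 0.2])    # e.g. local prevalence values (hypothetical)
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(np.round(local_morans_i(x, W), 2))   # positive values flag local clustering
```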

  12. 28 CFR 22.22 - Revelation of identifiable data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... STATISTICAL INFORMATION § 22.22 Revelation of identifiable data. (a) Except as noted in paragraph (b) of this section, research and statistical information relating to a private person may be revealed in identifiable... Act. (3) Persons or organizations for research or statistical purposes. Information may only be...

  13. Accuracy Rates of Ancestry Estimation by Forensic Anthropologists Using Identified Forensic Cases.

    PubMed

    Thomas, Richard M; Parks, Connie L; Richard, Adam H

    2017-07-01

    A common task in forensic anthropology involves the estimation of the ancestry of a decedent by comparing their skeletal morphology and measurements to skeletons of individuals from known geographic groups. However, the accuracy rates of ancestry estimation methods in actual forensic casework have rarely been studied. This article uses 99 forensic cases with identified skeletal remains to develop accuracy rates for ancestry estimations conducted by forensic anthropologists. The overall rate of correct ancestry estimation from these cases is 90.9%, which is comparable to most research-derived rates and those reported by individual practitioners. Statistical tests showed no significant difference in accuracy rates depending on examiner education level or on the estimated or identified ancestry. More recent cases showed a significantly higher accuracy rate. The incorporation of metric analyses into the ancestry estimate in these cases led to a higher accuracy rate. © 2017 American Academy of Forensic Sciences.

  14. Statistics of natural binaural sounds.

    PubMed

    Młynarski, Wiktor; Jost, Jürgen

    2014-01-01

    Binaural sound localization is usually considered a discrimination task, where interaural phase (IPD) and level (ILD) disparities at narrowly tuned frequency channels are utilized to identify the position of a sound source. In natural conditions, however, binaural circuits are exposed to stimulation by sound waves originating from multiple, often moving and overlapping sources. Therefore statistics of binaural cues depend on acoustic properties and the spatial configuration of the environment. The distribution of cues encountered naturally and their dependence on physical properties of an auditory scene have not been studied before. In the present work we analyzed statistics of naturally encountered binaural sounds. We performed binaural recordings of three auditory scenes with varying spatial configuration and analyzed empirical cue distributions from each scene. We have found that certain properties such as the spread of IPD distributions as well as an overall shape of ILD distributions do not vary strongly between different auditory scenes. Moreover, we found that ILD distributions vary much more weakly across frequency channels, and IPDs often attain much higher values than can be predicted from head filtering properties. In order to understand the complexity of the binaural hearing task in the natural environment, sound waveforms were analyzed by performing Independent Component Analysis (ICA). Properties of learned basis functions indicate that in natural conditions sound waves in each ear are predominantly generated by independent sources. This implies that real-world sound localization must rely on mechanisms more complex than mere cue extraction.
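
    The two cues can be computed per frequency channel from a short-time Fourier transform of a binaural recording; a minimal sketch follows, in which simple STFT bins stand in for the narrowly tuned channels discussed, and the toy stereo signal (a tone attenuated and delayed in one ear) is hypothetical.

```python
import numpy as np

def binaural_cues(left, right, fs, nfft=1024, hop=512):
    """Per-frame, per-frequency ILD (dB) and IPD (rad) from a stereo signal.
    A plain STFT stands in for narrow auditory frequency channels."""
    win = np.hanning(nfft)
    frames = range(0, len(left) - nfft, hop)
    L = np.array([np.fft.rfft(win * left[i:i + nfft]) for i in frames])
    R = np.array([np.fft.rfft(win * right[i:i + nfft]) for i in frames])
    eps = 1e-12
    ild = 20.0 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps))   # level difference
    ipd = np.angle(L * np.conj(R))                                 # phase difference
    freqs = np.fft.rfftfreq(nfft, 1.0 / fs)
    return freqs, ild, ipd

# Toy usage: a 500 Hz tone, attenuated and delayed in the right ear.
fs = 16_000
t = np.arange(fs) / fs
left = np.sin(2 * np.pi * 500 * t)
right = 0.5 * np.sin(2 * np.pi * 500 * (t - 0.0004))
freqs, ild, ipd = binaural_cues(left, right, fs)
k = np.argmin(np.abs(freqs - 500))
print(f"at ~500 Hz: median ILD = {np.median(ild[:, k]):.1f} dB, "
      f"median IPD = {np.median(ipd[:, k]):.2f} rad")
```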

  15. Statistics of Natural Binaural Sounds

    PubMed Central

    Młynarski, Wiktor; Jost, Jürgen

    2014-01-01

    Binaural sound localization is usually considered a discrimination task, where interaural phase (IPD) and level (ILD) disparities at narrowly tuned frequency channels are utilized to identify the position of a sound source. In natural conditions, however, binaural circuits are exposed to stimulation by sound waves originating from multiple, often moving and overlapping sources. Therefore statistics of binaural cues depend on acoustic properties and the spatial configuration of the environment. The distribution of cues encountered naturally and their dependence on physical properties of an auditory scene have not been studied before. In the present work we analyzed statistics of naturally encountered binaural sounds. We performed binaural recordings of three auditory scenes with varying spatial configuration and analyzed empirical cue distributions from each scene. We have found that certain properties such as the spread of IPD distributions as well as an overall shape of ILD distributions do not vary strongly between different auditory scenes. Moreover, we found that ILD distributions vary much more weakly across frequency channels, and IPDs often attain much higher values than can be predicted from head filtering properties. In order to understand the complexity of the binaural hearing task in the natural environment, sound waveforms were analyzed by performing Independent Component Analysis (ICA). Properties of learned basis functions indicate that in natural conditions sound waves in each ear are predominantly generated by independent sources. This implies that real-world sound localization must rely on mechanisms more complex than mere cue extraction. PMID:25285658

  16. The Dependence Structure of Conditional Probabilities in a Contingency Table

    ERIC Educational Resources Information Center

    Joarder, Anwar H.; Al-Sabah, Walid S.

    2002-01-01

    Conditional probability and statistical independence can be better explained with contingency tables. In this note some special cases of 2 x 2 contingency tables are considered. In turn an interesting insight into statistical dependence as well as independence of events is obtained.

  17. An Investigation of the Variety and Complexity of Statistical Methods Used in Current Internal Medicine Literature.

    PubMed

    Narayanan, Roshni; Nugent, Rebecca; Nugent, Kenneth

    2015-10-01

    Accreditation Council for Graduate Medical Education guidelines require internal medicine residents to develop skills in the interpretation of medical literature and to understand the principles of research. A necessary component is the ability to understand the statistical methods used and their results, material that is not an in-depth focus of most medical school curricula and residency programs. Given the breadth and depth of the current medical literature and an increasing emphasis on complex, sophisticated statistical analyses, the statistical foundation and education necessary for residents are uncertain. We reviewed the statistical methods and terms used in 49 articles discussed at the journal club in the Department of Internal Medicine residency program at Texas Tech University between January 1, 2013 and June 30, 2013. We collected information on the study type and on the statistical methods used for summarizing and comparing samples, determining the relations between independent variables and dependent variables, and estimating models. We then identified the typical statistics education level at which each term or method is learned. A total of 14 articles came from the Journal of the American Medical Association Internal Medicine, 11 from the New England Journal of Medicine, 6 from the Annals of Internal Medicine, 5 from the Journal of the American Medical Association, and 13 from other journals. Twenty reported randomized controlled trials. Summary statistics included mean values (39 articles), category counts (38), and medians (28). Group comparisons were based on t tests (14 articles), χ2 tests (21), and nonparametric ranking tests (10). The relations between dependent and independent variables were analyzed with simple regression (6 articles), multivariate regression (11), and logistic regression (8). Nine studies reported odds ratios with 95% confidence intervals, and seven analyzed test performance using sensitivity and specificity calculations. These papers used 128 statistical terms and context-defined concepts, including some from data analysis (56), epidemiology-biostatistics (31), modeling (24), data collection (12), and meta-analysis (5). Ten different software programs were used in these articles. Based on usual undergraduate and graduate statistics curricula, 64.3% of the concepts and methods used in these papers required at least a master's degree-level statistics education. The interpretation of the current medical literature can require an extensive background in statistical methods at an education level exceeding the material and resources provided to most medical students and residents. Given the complexity and time pressure of medical education, these deficiencies will be hard to correct, but this project can serve as a basis for developing a curriculum in study design and statistical methods needed by physicians-in-training.

  18. A Statistics-Based Material Property Analysis to Support TPS Characterization

    NASA Technical Reports Server (NTRS)

    Copeland, Sean R.; Cozmuta, Ioana; Alonso, Juan J.

    2012-01-01

    Accurate characterization of entry capsule heat shield material properties is a critical component in modeling and simulating Thermal Protection System (TPS) response in a prescribed aerothermal environment. The thermal decomposition of the TPS material during the pyrolysis and charring processes is poorly characterized and typically results in large uncertainties in material properties as inputs for ablation models. These material property uncertainties contribute to large design margins on flight systems and cloud reconstruction efforts for data collected during flight and ground testing, making revisions to existing models for entry systems more challenging. The analysis presented in this work quantifies how material property uncertainties propagate through an ablation model and guides an experimental test regimen aimed at reducing these uncertainties and characterizing the dependencies between properties in the virgin and charred states for a Phenolic Impregnated Carbon Ablator (PICA) based TPS. A sensitivity analysis identifies how the high-fidelity model behaves in the expected flight environment, while a Monte Carlo based uncertainty propagation strategy is used to quantify the expected spread in the in-depth temperature response of the TPS. An examination of how perturbations to the input probability density functions affect output temperature statistics is accomplished using a Kriging response surface of the high-fidelity model. Simulations are based on capsule configuration and aerothermal environments expected during the Mars Science Laboratory (MSL) entry sequence. We identify and rank primary sources of uncertainty from material properties in a flight-relevant environment, show how those uncertainty contributors depend on spatial orientation and in-depth location, and quantify how sensitive the expected results are.

  19. Efficient Global Aerodynamic Modeling from Flight Data

    NASA Technical Reports Server (NTRS)

    Morelli, Eugene A.

    2012-01-01

    A method for identifying global aerodynamic models from flight data in an efficient manner is explained and demonstrated. A novel experiment design technique was used to obtain dynamic flight data over a range of flight conditions with a single flight maneuver. Multivariate polynomials and polynomial splines were used with orthogonalization techniques and statistical modeling metrics to synthesize global nonlinear aerodynamic models directly and completely from flight data alone. Simulation data and flight data from a subscale twin-engine jet transport aircraft were used to demonstrate the techniques. Results showed that global multivariate nonlinear aerodynamic dependencies could be accurately identified using flight data from a single maneuver. Flight-derived global aerodynamic model structures, model parameter estimates, and associated uncertainties were provided for all six nondimensional force and moment coefficients for the test aircraft. These models were combined with a propulsion model identified from engine ground test data to produce a high-fidelity nonlinear flight simulation very efficiently. Prediction testing using a multi-axis maneuver showed that the identified global model accurately predicted aircraft responses.

  20. User Selection Criteria of Airspace Designs in Flexible Airspace Management

    NASA Technical Reports Server (NTRS)

    Lee, Hwasoo E.; Lee, Paul U.; Jung, Jaewoo; Lai, Chok Fung

    2011-01-01


  1. An investigation of new toxicity test method performance in validation studies: 1. Toxicity test methods that have predictive capacity no greater than chance.

    PubMed

    Bruner, L H; Carr, G J; Harbell, J W; Curren, R D

    2002-06-01

    An approach commonly used to measure new toxicity test method (NTM) performance in validation studies is to divide toxicity results into positive and negative classifications, and then identify true positive (TP), true negative (TN), false positive (FP) and false negative (FN) results. After this step is completed, the contingent probability statistics (CPS), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) are calculated. Although these statistics are widely used and often the only statistics used to assess the performance of toxicity test methods, there is little specific guidance in the validation literature on what values for these statistics indicate adequate performance. The purpose of this study was to begin developing data-based answers to this question by characterizing the CPS obtained from an NTM whose data have a completely random association with a reference test method (RTM). Determining the CPS of this worst-case scenario is useful because it provides a lower baseline from which the performance of an NTM can be judged in future validation studies. It also provides an indication of relationships in the CPS that help identify random or near-random relationships in the data. The results from this study of randomly associated tests show that the values obtained for the statistics vary significantly depending on the cut-offs chosen, that high values can be obtained for individual statistics, and that the different measures cannot be considered independently when evaluating the performance of an NTM. When the association between results of an NTM and RTM is random, the sum of the complementary pairs of statistics (sensitivity + specificity, NPV + PPV) is approximately 1, and the prevalence (i.e., the proportion of toxic chemicals in the population of chemicals) and PPV are equal. Given that combinations of high sensitivity-low specificity or low sensitivity-high specificity (i.e., the sum of the sensitivity and specificity equal to approximately 1) indicate lack of predictive capacity, an NTM having these performance characteristics should be considered no better for predicting toxicity than by chance alone.
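
    The stated baseline is easy to reproduce in a quick simulation: when an NTM's classifications are statistically independent of the reference method, sensitivity and specificity sum to roughly 1 and PPV converges to the prevalence. The prevalence and positive-call rate below are arbitrary toy values.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 100_000
prevalence = 0.3                       # fraction of truly toxic chemicals (hypothetical)
truth = rng.random(n) < prevalence     # reference test method (RTM) classification
ntm_pos_rate = 0.45                    # NTM positive rate at an arbitrary cut-off
ntm = rng.random(n) < ntm_pos_rate     # NTM classification, independent of the RTM

tp = np.sum(ntm & truth)
tn = np.sum(~ntm & ~truth)
fp = np.sum(ntm & ~truth)
fn = np.sum(~ntm & truth)

sens, spec = tp / (tp + fn), tn / (tn + fp)
ppv, npv = tp / (tp + fp), tn / (tn + fn)
print(f"sensitivity + specificity = {sens + spec:.3f}  (~1 under random association)")
print(f"PPV = {ppv:.3f} vs prevalence = {prevalence}")
print(f"NPV + PPV = {npv + ppv:.3f}")
```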

  2. Tanning as an addictive behavior: a literature review.

    PubMed

    Nolan, Bridgit V; Taylor, Sarah L; Liguori, Anthony; Feldman, Steven R

    2009-02-01

    Recent studies have identified reinforcing properties associated with tanning and suggest a possible physiologic mechanism and addiction driving tanning behavior. This article attempts to synthesize the existing literature on tanning and addiction to investigate possible associations. We investigated a variety of substance dependence models to define what constitutes dependence/addiction and to determine how current studies on tanning meet these criteria. In some individuals, tanning has met Diagnostic and Statistical Manual criteria for a substance-related disorder or tanning-modified Cut Down, Annoyed, Guilt, Eye-opener criteria. Trial studies have demonstrated the induction of withdrawal symptoms in frequent tanners. Additional studies are needed to investigate the associated dependency and addiction more fully and to elucidate its similarities to other better-known addictive syndromes. Tanning is a problem behavior, both as a health risk and as a possible dependency. Future studies, especially in the area of cognitive mapping and cue-related stimuli are needed. Imaging studies may be important in elucidating whether the same areas of the brain are involved in tanning addiction as in other addictive syndromes.

  3. Triadic motifs in the dependence networks of virtual societies.

    PubMed

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-06-10

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.

  4. Triadic motifs in the dependence networks of virtual societies

    NASA Astrophysics Data System (ADS)

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-06-01

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs.

  5. Triadic motifs in the dependence networks of virtual societies

    PubMed Central

    Xie, Wen-Jie; Li, Ming-Xia; Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2014-01-01

    In friendship networks, individuals have different numbers of friends, and the closeness or intimacy between an individual and her friends is heterogeneous. Using a statistical filtering method to identify relationships about who depends on whom, we construct dependence networks (which are directed) from weighted friendship networks of avatars in more than two hundred virtual societies of a massively multiplayer online role-playing game (MMORPG). We investigate the evolution of triadic motifs in dependence networks. Several metrics show that the virtual societies evolved through a transient stage in the first two to three weeks and reached a relatively stable stage. We find that the unidirectional loop motif (M9) is underrepresented and does not appear, open motifs are also underrepresented, while other close motifs are overrepresented. We also find that, for most motifs, the overall level difference of the three avatars in the same motif is significantly lower than average, whereas the sum of ranks is only slightly larger than average. Our findings show that avatars' social status plays an important role in the formation of triadic motifs. PMID:24912755

  6. An evaluation of intraoperative and postoperative outcomes of torsional mode versus longitudinal ultrasound mode phacoemulsification: a Meta-analysis.

    PubMed

    Leon, Pia; Umari, Ingrid; Mangogna, Alessandro; Zanei, Andrea; Tognetto, Daniele

    2016-01-01

    To evaluate and compare the intraoperative parameters and postoperative outcomes of torsional mode and longitudinal mode of phacoemulsification. Pertinent studies were identified by a computerized MEDLINE search from January 2002 to September 2013. The Meta-analysis is composed of two parts. In the first part the intraoperative parameters were considered: ultrasound time (UST) and cumulative dissipated energy (CDE). The intraoperative values were also distinctly considered for two categories (moderate and hard cataract group) depending on the nuclear opacity grade. In the second part of the study, postoperative outcomes such as the best corrected visual acuity (BCVA) and the endothelial cell loss (ECL) were taken into consideration. The UST and CDE values proved statistically significant in support of torsional mode for both the moderate and hard cataract groups. The analysis of BCVA did not show a statistically significant difference between the two surgical modalities. The ECL count was statistically significant in support of torsional mode (P<0.001). The Meta-analysis shows the superiority of the torsional mode for intraoperative parameters (UST, CDE) and postoperative ECL outcomes.

  7. An evaluation of intraoperative and postoperative outcomes of torsional mode versus longitudinal ultrasound mode phacoemulsification: a Meta-analysis

    PubMed Central

    Leon, Pia; Umari, Ingrid; Mangogna, Alessandro; Zanei, Andrea; Tognetto, Daniele

    2016-01-01

    AIM To evaluate and compare the intraoperative parameters and postoperative outcomes of torsional mode and longitudinal mode of phacoemulsification. METHODS Pertinent studies were identified by a computerized MEDLINE search from January 2002 to September 2013. The Meta-analysis is composed of two parts. In the first part the intraoperative parameters were considered: ultrasound time (UST) and cumulative dissipated energy (CDE). The intraoperative values were also distinctly considered for two categories (moderate and hard cataract group) depending on the nuclear opacity grade. In the second part of the study, postoperative outcomes such as the best corrected visual acuity (BCVA) and the endothelial cell loss (ECL) were taken into consideration. RESULTS The UST and CDE values proved statistically significant in support of torsional mode for both the moderate and hard cataract groups. The analysis of BCVA did not show a statistically significant difference between the two surgical modalities. The ECL count was statistically significant in support of torsional mode (P<0.001). CONCLUSION The Meta-analysis shows the superiority of the torsional mode for intraoperative parameters (UST, CDE) and postoperative ECL outcomes. PMID:27366694

  8. Graphical and statistical techniques for cardiac cycle time (phase) dependent changes in interbeat interval: problems with the Jennings et al. (1991) proposals.

    PubMed

    Barry, R J

    1993-01-01

    Two apparently new effects in human cardiac responding, "primary bradycardia" and "vagal inhibition", were first described by the Laceys. These effects have been considered by some researchers to reflect differential cardiac innervation, analogous to similar effects observed in animal preparations with direct vagal stimulation. However, it has been argued that such effects arise merely from the data-analytic techniques introduced by the Laceys, and hence are not genuine cardiac cycle effects. Jennings, van der Molen, Somsen and Ridderinkhoff (Psychophysiology, 28 (1991) 596-606) recently proposed a plotting technique and statistical procedure in an attempt to resolve this issue. The present paper demonstrates that the plotting technique fails to achieve their stated aim, since it identifies data from identical cardiac responses as showing cardiac-cycle effects. In addition, the statistical procedure is shown to be reducible to a trivial test of response occurrence. The implication of these demonstrations, in the context of other work, is that this area of investigation has reached a dead end.

  9. Automated thematic mapping and change detection of ERTS-A images. [digital interpretation of Arizona imagery

    NASA Technical Reports Server (NTRS)

    Gramenopoulos, N. (Principal Investigator)

    1973-01-01

    The author has identified the following significant results. For the recognition of terrain types, spatial signatures are developed from the diffraction patterns of small areas of ERTS-1 images. This knowledge is exploited for the measurements of a small number of meaningful spatial features from the digital Fourier transforms of ERTS-1 image cells containing 32 x 32 picture elements. Using these spatial features and a heuristic algorithm, the terrain types in the vicinity of Phoenix, Arizona were recognized by the computer with a high accuracy. Then, the spatial features were combined with spectral features and using the maximum likelihood criterion the recognition accuracy of terrain types increased substantially. It was determined that the recognition accuracy with the maximum likelihood criterion depends on the statistics of the feature vectors. Nonlinear transformations of the feature vectors are required so that the terrain class statistics become approximately Gaussian. It was also determined that for a given geographic area the statistics of the classes remain invariable for a period of a month but vary substantially between seasons.

  10. Heterogeneous Structure of Stem Cells Dynamics: Statistical Models and Quantitative Predictions

    PubMed Central

    Bogdan, Paul; Deasy, Bridget M.; Gharaibeh, Burhan; Roehrs, Timo; Marculescu, Radu

    2014-01-01

    Understanding stem cell (SC) population dynamics is essential for developing models that can be used in basic science and medicine, to aid in predicting cell fate. These models can be used as tools e.g. in studying patho-physiological events at the cellular and tissue level, predicting (mal)functions along the developmental course, and personalized regenerative medicine. Using time-lapsed imaging and statistical tools, we show that the dynamics of SC populations involve a heterogeneous structure consisting of multiple sub-population behaviors. Using non-Gaussian statistical approaches, we identify the co-existence of fast- and slow-dividing subpopulations, and quiescent cells, in stem cells from three species. The mathematical analysis also shows that, instead of developing independently, SCs exhibit a time-dependent fractal behavior as they interact with each other through molecular and tactile signals. These findings suggest that more sophisticated models of SC dynamics should view SC populations as a collective and avoid the simplifying homogeneity assumption by accounting for the presence of more than one dividing sub-population, and their multi-fractal characteristics. PMID:24769917

  11. Is it possible to identify a risk factor condition of hypocalcemia in patients candidates to thyroidectomy for benign disease?

    PubMed

    Del Rio, Paolo; Iapichino, Gioacchino; De Simone, Belinda; Bezer, Lamia; Arcuri, MariaFrancesca; Sianesi, Mario

    2010-01-01

    Hypocalcaemia is the most frequent complication after total thyroidectomy, and its incidence is reported with varying percentages in the literature. We report on 227 patients undergoing surgery for benign thyroid disease. After obtaining each patient's informed consent, we prospectively collected and analyzed the following data: pre- and postoperative serum calcium levels in the first 24 hours after surgery, according to sex, age, duration of surgery, number of parathyroids identified by the surgeon, and surgical technique (open versus minimally invasive video-assisted thyroidectomy, i.e., MIVAT). We considered cases treated consecutively by the same two experienced endocrine surgeons. Hypocalcaemia was defined as a serum calcium value below 7.5 mg/dL. Pre- and postoperative mean serum calcium, with 99% confidence intervals and stratified by sex, revealed a statistically significant difference in incidence in the ANOVA test (p < 0.01): female patients had a higher incidence of hypocalcemia. The evaluation of mean pre- and postoperative serum calcium, with 95% confidence intervals, according to the number of parathyroid glands identified by the surgeon showed no correlation with postoperative serum calcium values. Age and pre- and postoperative serum calcium values, with 99% confidence intervals stratified by sex, did not show statistically significant differences. We did not find a significant difference in postoperative hypocalcemia between patients treated with conventional thyroidectomy and those treated with MIVAT. A difference between pre- and postoperative mean serum calcium occurred in all surgically treated patients. The only statistically meaningful risk factor for hypocalcemia was female sex.

  12. Gene-environment studies: any advantage over environmental studies?

    PubMed

    Bermejo, Justo Lorenzo; Hemminki, Kari

    2007-07-01

    Gene-environment studies have been motivated by the likely existence of prevalent low-risk genes that interact with common environmental exposures. The present study assessed the statistical advantage of the simultaneous consideration of genes and environment to investigate the effect of environmental risk factors on disease. In particular, we contemplated the possibility that several genes modulate the environmental effect. Environmental exposures, genotypes and phenotypes were simulated according to a wide range of parameter settings. Different models of gene-gene-environment interaction were considered. For each parameter combination, we estimated the probability of detecting the main environmental effect, the power to identify the gene-environment interaction and the frequency of environmentally affected individuals at which environmental and gene-environment studies show the same statistical power. The proportion of cases in the population attributable to the modeled risk factors was also calculated. Our data indicate that environmental exposures with weak effects may account for a significant proportion of the population prevalence of the disease. A general result was that, if the environmental effect was restricted to rare genotypes, the power to detect the gene-environment interaction was higher than the power to identify the main environmental effect. In other words, when few individuals contribute to the overall environmental effect, individual contributions are large and result in easily identifiable gene-environment interactions. Moreover, when multiple genes interacted with the environment, the statistical benefit of gene-environment studies was limited to those studies that included major contributors to the gene-environment interaction. The advantage of gene-environment over plain environmental studies also depends on the inheritance mode of the involved genes, on the study design and, to some extent, on the disease prevalence.

  13. Regression: The Apple Does Not Fall Far From the Tree.

    PubMed

    Vetter, Thomas R; Schober, Patrick

    2018-05-15

    Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
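
    As a purely illustrative sketch (not taken from the tutorial), the following Python snippet fits a simple linear regression and a logistic regression to synthetic data using statsmodels; the variable names and simulated effect sizes are arbitrary.

```python
# Hedged illustration: simple linear and logistic regression on synthetic data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 + 1.5 * x + rng.normal(size=200)                 # continuous outcome
z = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + x))))    # binary outcome

X = sm.add_constant(x)                       # adds the intercept column
linear_fit = sm.OLS(y, X).fit()              # simple linear regression
logistic_fit = sm.Logit(z, X).fit(disp=0)    # logistic regression

print(linear_fit.params)     # estimated intercept and slope
print(logistic_fit.params)   # estimated log-odds coefficients
```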

  14. A Retrospective Cohort Study of Obstetric Outcomes in Opioid-Dependent Women Treated with Implant Naltrexone, Oral Methadone or Sublingual Buprenorphine, and Non-Dependent Controls.

    PubMed

    Kelty, Erin; Hulse, Gary

    2017-07-01

    Opioid pharmacotherapies play an important role in the treatment of opioid-dependent women; however, very little is known about the safety of naltrexone in pregnant patients. This study examined the obstetric health of opioid-dependent women who were treated with implant naltrexone during pregnancy, and compared them with women treated with methadone and/or buprenorphine and a cohort of non-opioid-dependent controls. Women treated with implant naltrexone, oral methadone or sublingual buprenorphine between 2001 and 2010, along with a cohort of age-matched controls, were linked with records from midwives, hospital and emergency departments (EDs) and the death registry to identify pregnancy and health events that occurred during pregnancy and in the post-partum period. Overall rates of pregnancy loss (requiring hospital or ED attendance) were significantly elevated in naltrexone-treated women compared with buprenorphine-treated women (p = 0.018) and controls (p < 0.001); however, they were not statistically different to methadone-treated women (p = 0.210). Birth rates in women on naltrexone implant treatment were significantly higher than in all three comparison groups (p < 0.001). Rates of hospital and ED attendance during pregnancy in the naltrexone-treated women were not statistically different to those of either the methadone or buprenorphine groups, and neither were overall complications during pregnancy and labour. Overall rates of complications during pregnancy were significantly higher in the naltrexone-treated women than in the controls. Opioid-dependent women treated with naltrexone implant had higher rates of birth than the other three groups (methadone- or buprenorphine-treated women, or age-matched controls). Overall rates of complications during pregnancy were elevated in naltrexone-treated women when compared with the control group, but were generally not significantly different to rates in methadone- or buprenorphine-treated women.

  15. Data communications in a parallel active messaging interface of a parallel computer

    DOEpatents

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-11-12

    Data communications in a parallel active messaging interface (`PAMI`) of a parallel computer composed of compute nodes that execute a parallel application, each compute node including application processors that execute the parallel application and at least one management processor dedicated to gathering information regarding data communications. The PAMI is composed of data communications endpoints, each endpoint composed of a specification of data communications parameters for a thread of execution on a compute node, including specifications of a client, a context, and a task, the compute nodes and the endpoints coupled for data communications through the PAMI and through data communications resources. Embodiments function by gathering call site statistics describing data communications resulting from execution of data communications instructions and identifying in dependence upon the call site statistics a data communications algorithm for use in executing a data communications instruction at a call site in the parallel application.
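
    A schematic, hypothetical sketch of the idea described above: per-call-site statistics are gathered at run time and then consulted to choose a data communications algorithm. All names (CallSiteStats, choose_broadcast_algorithm, the 4 KiB cutoff) are illustrative and not part of the patent; plain Python stands in for the PAMI runtime.

```python
# Hypothetical sketch: pick a collective algorithm per call site from gathered statistics.
from collections import defaultdict

class CallSiteStats:
    def __init__(self):
        self.count = 0
        self.total_bytes = 0

    def record(self, nbytes):
        self.count += 1
        self.total_bytes += nbytes

    @property
    def mean_bytes(self):
        return self.total_bytes / self.count if self.count else 0

stats = defaultdict(CallSiteStats)

def choose_broadcast_algorithm(call_site_id):
    """Select an algorithm in dependence upon the call-site statistics."""
    s = stats[call_site_id]
    if s.mean_bytes < 4096:
        return "binomial_tree"     # latency-bound: small messages
    return "pipelined_ring"        # bandwidth-bound: large messages

# Usage: the runtime records each transfer, then consults the statistics.
stats["bcast@main.c:42"].record(256)
print(choose_broadcast_algorithm("bcast@main.c:42"))  # -> binomial_tree
```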

  16. A Maximum Likelihood Approach to Functional Mapping of Longitudinal Binary Traits

    PubMed Central

    Wang, Chenguang; Li, Hongying; Wang, Zhong; Wang, Yaqun; Wang, Ningtao; Wang, Zuoheng; Wu, Rongling

    2013-01-01

    Despite their importance in biology and biomedicine, genetic mapping of binary traits that change over time has not been well explored. In this article, we develop a statistical model for mapping quantitative trait loci (QTLs) that govern longitudinal responses of binary traits. The model is constructed within the maximum likelihood framework by which the association between binary responses is modeled in terms of conditional log odds-ratios. With this parameterization, the maximum likelihood estimates (MLEs) of marginal mean parameters are robust to the misspecification of time dependence. We implement an iterative procedure to obtain the MLEs of QTL genotype-specific parameters that define longitudinal binary responses. The usefulness of the model was validated by analyzing a real example in rice. Simulation studies were performed to investigate the statistical properties of the model, showing that the model has power to identify and map specific QTLs responsible for the temporal pattern of binary traits. PMID:23183762

  17. Quantitative characterization of nonstructural carbohydrates of mezcal Agave (Agave salmiana Otto ex Salm-Dick).

    PubMed

    Michel-Cuello, Christian; Juárez-Flores, Bertha Irene; Aguirre-Rivera, Juan Rogelio; Pinos-Rodríguez, Juan Manuel

    2008-07-23

    Fructans are the reserve carbohydrates in Agave spp. plants. In mezcal factories, fructans undergo thermal hydrolysis to release fructose and glucose, which are the basis for producing this spirit. Carbohydrate content determines the yield of the final product, which depends on plant organ, ripeness stage, and thermal hydrolysis. Thus, a qualitative and quantitative characterization of nonstructural carbohydrates was conducted in raw and hydrolyzed juices extracted from Agave salmiana stems and leaves at three ripeness stages. By high-performance liquid chromatography (HPLC), fructose, glucose, sucrose, xylose, and maltose were identified in agave juice. For glucose concentration, only the plant fraction by hydrolysis interaction was found to be significant. For fructose concentration, the fraction by hydrolysis and ripeness by hydrolysis interactions were statistically significant. Fructose concentration rose considerably with hydrolysis, but only in juice extracted from ripe agave stems (early mature and castrated). This increase was statistically significant only with acid hydrolysis.

  18. Statistical research into low-power solar flares. Main phase duration

    NASA Astrophysics Data System (ADS)

    Borovik, Aleksandr; Zhdanov, Anton

    2017-12-01

    This paper is a sequel to earlier papers on time parameters of solar flares in the Hα line. Using data from the International Flare Patrol, an electronic database of solar flares for the period 1972-2010 has been created. The statistical analysis of the duration of the main phase has shown that it increases with increasing flare class and brightness. It has been found that the duration of the main phase depends on the type and features of development of solar flares. Flares with one brilliant point have the shortest main phase; flares with several intensity maxima and two-ribbon flares, the longest one. We have identified more than 3000 cases with an ultra-long duration of the main phase (more than 60 minutes). For 90% of such flares the duration of the main phase is 2-3 hrs, but sometimes it reaches 12 hrs.

  19. Can the behavioral sciences self-correct? A social epistemic study.

    PubMed

    Romero, Felipe

    2016-12-01

    Advocates of the self-corrective thesis argue that scientific method will refute false theories and find closer approximations to the truth in the long run. I discuss a contemporary interpretation of this thesis in terms of frequentist statistics in the context of the behavioral sciences. First, I identify experimental replications and systematic aggregation of evidence (meta-analysis) as the self-corrective mechanism. Then, I present a computer simulation study of scientific communities that implement this mechanism to argue that frequentist statistics may converge upon a correct estimate or not depending on the social structure of the community that uses it. Based on this study, I argue that methodological explanations of the "replicability crisis" in psychology are limited and propose an alternative explanation in terms of biases. Finally, I conclude suggesting that scientific self-correction should be understood as an interaction effect between inference methods and social structures. Copyright © 2016 Elsevier Ltd. All rights reserved.
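
    A toy illustration (not Romero's simulation model) of how aggregating study results can converge on a true effect, and how a selection filter on statistical significance can bias the aggregate; all numbers below are arbitrary.

```python
# Toy self-correction simulation: fixed-effect meta-analysis of study means,
# optionally discarding a fraction of non-significant results (publication bias).
import numpy as np

def simulated_meta_estimate(true_effect, n_studies, n_per_study, pub_bias=0.0, seed=0):
    rng = np.random.default_rng(seed)
    effects, weights = [], []
    for _ in range(n_studies):
        x = rng.normal(true_effect, 1.0, n_per_study)
        se = x.std(ddof=1) / np.sqrt(n_per_study)
        significant = abs(x.mean() / se) > 1.96
        if significant or rng.random() > pub_bias:   # biased "publication" filter
            effects.append(x.mean())
            weights.append(1.0 / se ** 2)
    return np.average(effects, weights=weights)

print(simulated_meta_estimate(0.2, 200, 30, pub_bias=0.0))  # converges near 0.2
print(simulated_meta_estimate(0.2, 200, 30, pub_bias=0.9))  # biased upward
```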

  20. A new test statistic for climate models that includes field and spatial dependencies using Gaussian Markov random fields

    DOE PAGES

    Nosedal-Sanchez, Alvaro; Jackson, Charles S.; Huerta, Gabriel

    2016-07-20

    A new test statistic for climate model evaluation has been developed that potentially mitigates some of the limitations that exist for observing and representing field and space dependencies of climate phenomena. Traditionally such dependencies have been ignored when climate models have been evaluated against observational data, which makes it difficult to assess whether any given model is simulating observed climate for the right reasons. The new statistic uses Gaussian Markov random fields for estimating field and space dependencies within a first-order grid point neighborhood structure. We illustrate the ability of Gaussian Markov random fields to represent empirical estimates of field and space covariances using "witch hat" graphs. We further use the new statistic to evaluate the tropical response of a climate model (CAM3.1) to changes in two parameters important to its representation of cloud and precipitation physics. Overall, the inclusion of dependency information did not alter significantly the recognition of those regions of parameter space that best approximated observations. However, there were some qualitative differences in the shape of the response surface that suggest how such a measure could affect estimates of model uncertainty.
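
    A minimal sketch of the kind of machinery involved, assuming a first-order Gaussian Markov random field on a regular grid: a lattice precision matrix and a quadratic-form discrepancy between a model field and observations. This is a generic construction, not the published statistic; kappa and tau are illustrative parameters.

```python
# Generic first-order GMRF on a grid: precision matrix and quadratic-form statistic.
import numpy as np

def gmrf_precision(nx, ny, kappa=1.0, tau=1.0):
    """Q = tau * (kappa^2 * I + graph Laplacian of the 4-neighbour lattice)."""
    n = nx * ny
    Q = np.zeros((n, n))
    idx = lambda i, j: i * ny + j
    for i in range(nx):
        for j in range(ny):
            k = idx(i, j)
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < nx and 0 <= jj < ny:
                    Q[k, k] += 1.0
                    Q[k, idx(ii, jj)] -= 1.0
            Q[k, k] += kappa ** 2
    return tau * Q

def gmrf_statistic(residual_field, Q):
    """Quadratic-form discrepancy r' Q r of a model-minus-observation field."""
    r = residual_field.ravel()
    return float(r @ Q @ r)

r = np.random.default_rng(0).normal(size=(8, 10))     # toy residual field
print(gmrf_statistic(r, gmrf_precision(8, 10)))
```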

  1. A new test statistic for climate models that includes field and spatial dependencies using Gaussian Markov random fields

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nosedal-Sanchez, Alvaro; Jackson, Charles S.; Huerta, Gabriel

    A new test statistic for climate model evaluation has been developed that potentially mitigates some of the limitations that exist for observing and representing field and space dependencies of climate phenomena. Traditionally such dependencies have been ignored when climate models have been evaluated against observational data, which makes it difficult to assess whether any given model is simulating observed climate for the right reasons. The new statistic uses Gaussian Markov random fields for estimating field and space dependencies within a first-order grid point neighborhood structure. We illustrate the ability of Gaussian Markov random fields to represent empirical estimates of field and space covariances using "witch hat" graphs. We further use the new statistic to evaluate the tropical response of a climate model (CAM3.1) to changes in two parameters important to its representation of cloud and precipitation physics. Overall, the inclusion of dependency information did not alter significantly the recognition of those regions of parameter space that best approximated observations. However, there were some qualitative differences in the shape of the response surface that suggest how such a measure could affect estimates of model uncertainty.

  2. Impact of an Onsite Clinic on Utilization of Preventive Services.

    PubMed

    Ostovari, Mina; Yu, Denny; Yih, Yuehwern; Steele-Morris, Charlotte Joy

    2017-07-01

    To assess the impact of an onsite clinic on healthcare utilization of preventive services for employees of a public university and their dependents, descriptive statistics, logistic regression, and classification tree techniques were applied to health claim data to identify changes in patterns of healthcare utilization and factors affecting usage of the onsite clinic. Utilization of preventive services significantly increased for female and male employees by 9% and 14%, respectively, one year after implementation of the onsite clinic. Hourly-paid employees, employees without diabetes, and employees whose spouses opted out of or had no coverage were more likely to use the onsite clinic. An adapted framework for assessing the performance of onsite clinics based on usage of health informatics would help identify health utilization patterns and the interaction between the onsite clinic and offsite health providers.

  3. Identifying pleiotropic genes in genome-wide association studies from related subjects using the linear mixed model and Fisher combination function.

    PubMed

    Yang, James J; Williams, L Keoki; Buu, Anne

    2017-08-24

    A multivariate genome-wide association test is proposed for analyzing data on multivariate quantitative phenotypes collected from related subjects. The proposed method is a two-step approach. The first step models the association between the genotype and marginal phenotype using a linear mixed model. The second step uses the correlation between residuals of the linear mixed model to estimate the null distribution of the Fisher combination test statistic. The simulation results show that the proposed method controls the type I error rate and is more powerful than the marginal tests across different population structures (admixed or non-admixed) and relatedness (related or independent). The statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that applying the multivariate association test may facilitate identification of the pleiotropic genes contributing to the risk for alcohol dependence commonly expressed by four correlated phenotypes. This study proposes a multivariate method for identifying pleiotropic genes while adjusting for cryptic relatedness and population structure between subjects. The two-step approach is not only powerful but also computationally efficient even when the number of subjects and the number of phenotypes are both very large.
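
    For reference, a minimal sketch of the standard Fisher combination statistic used in the second step, assuming independent p-values; the published method instead calibrates the null distribution using the correlation between linear mixed model residuals, which is not shown here.

```python
# Fisher's combination of p-values (independence case only).
import numpy as np
from scipy import stats

def fisher_combination(p_values):
    """X^2 = -2 * sum(log p_i) ~ chi-square with 2k df under independence."""
    p = np.asarray(p_values, dtype=float)
    x2 = -2.0 * np.sum(np.log(p))
    return stats.chi2.sf(x2, df=2 * len(p))

# e.g. marginal p-values for four correlated phenotypes at one variant (hypothetical):
print(fisher_combination([0.03, 0.20, 0.008, 0.15]))
```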

  4. PPM1D Mosaic Truncating Variants in Ovarian Cancer Cases May Be Treatment-Related Somatic Mutations.

    PubMed

    Pharoah, Paul D P; Song, Honglin; Dicks, Ed; Intermaggio, Maria P; Harrington, Patricia; Baynes, Caroline; Alsop, Kathryn; Bogdanova, Natalia; Cicek, Mine S; Cunningham, Julie M; Fridley, Brooke L; Gentry-Maharaj, Aleksandra; Hillemanns, Peter; Lele, Shashi; Lester, Jenny; McGuire, Valerie; Moysich, Kirsten B; Poblete, Samantha; Sieh, Weiva; Sucheston-Campbell, Lara; Widschwendter, Martin; Whittemore, Alice S; Dörk, Thilo; Menon, Usha; Odunsi, Kunle; Goode, Ellen L; Karlan, Beth Y; Bowtell, David D; Gayther, Simon A; Ramus, Susan J

    2016-03-01

    Mosaic truncating mutations in the protein phosphatase, Mg(2+)/Mn(2+)-dependent, 1D (PPM1D) gene have recently been reported with a statistically significantly greater frequency in lymphocyte DNA from ovarian cancer case patients compared with unaffected control patients. Using massively parallel sequencing (MPS) we identified truncating PPM1D mutations in 12 of 3236 epithelial ovarian cancer (EOC) case patients (0.37%) but in only one of 3431 unaffected control patients (0.03%) (P = .001). All statistical tests were two-sided. A combination of Sanger sequencing, pyrosequencing, and MPS data suggested that 12 of the 13 mutations were mosaic. All mutations were identified in post-chemotherapy treatment blood samples from case patients (n = 1827) (average 1234 days post-treatment in carriers) rather than from cases collected pretreatment (less than 14 days after diagnosis, n = 1384) (P = .002). These data suggest that PPM1D variants in EOC cases are primarily somatic mosaic mutations caused by treatment and are not associated with germline predisposition to EOC. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  5. Biological effective dose evaluation in gynaecological brachytherapy: LDR and HDR treatments, dependence on radiobiological parameters, and treatment optimisation.

    PubMed

    Bianchi, C; Botta, F; Conte, L; Vanoli, P; Cerizza, L

    2008-10-01

    This study was undertaken to compare the biological efficacy of different high-dose-rate (HDR) and low-dose-rate (LDR) treatments of gynaecological lesions, to identify the causes of possible nonuniformity and to optimise treatment through customised calculation. The study considered 110 patients treated between 2001 and 2006 with external beam radiation therapy and/or brachytherapy with either LDR (afterloader Selectron, (137)Cs) or HDR (afterloader microSelectron Classic, (192)Ir). The treatments were compared in terms of biologically effective dose (BED) to the tumour and to the rectum (linear-quadratic model) by using statistical tests for comparisons between independent samples. The difference between the two treatments was statistically significant in one case only. However, within each technique, we identified considerable nonuniformity in therapeutic efficacy due to differences in fractionation schemes and overall treatment time. To solve this problem, we created a Microsoft Excel spreadsheet allowing calculation of the optimal treatment for each patient: best efficacy (BED(tumour)) without exceeding toxicity threshold (BED(rectum)). The efficacy of a treatment may vary as a result of several factors. Customised radiobiological evaluation is a useful adjunct to clinical evaluation in planning equivalent treatments that satisfy all dosimetric constraints.
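
    A small sketch of the underlying linear-quadratic calculation, assuming the standard fractionated (acute-delivery) BED formula without repair or overall-time corrections that LDR comparisons would require; the fraction numbers and alpha/beta values below are illustrative, not those of the study.

```python
# Biologically effective dose under the linear-quadratic model:
# BED = n * d * (1 + d / (alpha/beta)).
def bed(n_fractions, dose_per_fraction, alpha_beta):
    d = dose_per_fraction
    return n_fractions * d * (1.0 + d / alpha_beta)

# Illustrative values only (not from the study):
print(bed(5, 7.0, 10.0))   # tumour, alpha/beta ~ 10 Gy
print(bed(5, 7.0, 3.0))    # rectum (late-responding tissue), alpha/beta ~ 3 Gy
```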

  6. Exploring Marine Corps Officer Quality: An Analysis of Promotion to Lieutenant Colonel

    DTIC Science & Technology

    2017-03-01

    [Only table-of-contents fragments were captured for this record: Descriptive Statistics; Dependent Variable Summary Statistics; Performance; Further Research; Appendix A, Summary Statistics of FITREP data.]

  7. Cluster size statistic and cluster mass statistic: two novel methods for identifying changes in functional connectivity between groups or conditions.

    PubMed

    Ing, Alex; Schwarzbauer, Christian

    2014-01-01

    Functional connectivity has become an increasingly important area of research in recent years. At a typical spatial resolution, approximately 300 million connections link each voxel in the brain with every other. This pattern of connectivity is known as the functional connectome. Connectivity is often compared between experimental groups and conditions. Standard methods used to control the type 1 error rate are likely to be insensitive when comparisons are carried out across the whole connectome, due to the huge number of statistical tests involved. To address this problem, two new cluster based methods--the cluster size statistic (CSS) and cluster mass statistic (CMS)--are introduced to control the family wise error rate across all connectivity values. These methods operate within a statistical framework similar to the cluster based methods used in conventional task based fMRI. Both methods are data driven, permutation based and require minimal statistical assumptions. Here, the performance of each procedure is evaluated in a receiver operator characteristic (ROC) analysis, utilising a simulated dataset. The relative sensitivity of each method is also tested on real data: BOLD (blood oxygen level dependent) fMRI scans were carried out on twelve subjects under normal conditions and during the hypercapnic state (induced through the inhalation of 6% CO2 in 21% O2 and 73%N2). Both CSS and CMS detected significant changes in connectivity between normal and hypercapnic states. A family wise error correction carried out at the individual connection level exhibited no significant changes in connectivity.
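
    A schematic sketch of the cluster computation, assuming a symmetric matrix of connection-wise t statistics: supra-threshold connections are grouped into connected components, and each cluster's size (edge count) and mass (summed |t|) are returned. In the full CSS/CMS procedures these values are compared against a permutation null built from the maximum cluster statistic under shuffled group labels, which is omitted here.

```python
# Cluster size and mass of supra-threshold connections in a t-statistic matrix.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def cluster_size_and_mass(t_mat, threshold):
    supra = np.abs(t_mat) > threshold
    np.fill_diagonal(supra, False)
    n_comp, labels = connected_components(csr_matrix(supra), directed=False)
    sizes = np.zeros(n_comp)
    masses = np.zeros(n_comp)
    for i, j in zip(*np.triu_indices_from(t_mat, k=1)):
        if supra[i, j]:
            c = labels[i]                 # i and j share a component label
            sizes[c] += 1
            masses[c] += abs(t_mat[i, j])
    keep = sizes > 0                      # drop isolated (edge-free) nodes
    return sizes[keep], masses[keep]

rng = np.random.default_rng(1)
t_demo = rng.normal(size=(20, 20))
t_demo = (t_demo + t_demo.T) / 2          # toy symmetric "t matrix"
print(cluster_size_and_mass(t_demo, threshold=2.0))
```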

  8. Cluster Size Statistic and Cluster Mass Statistic: Two Novel Methods for Identifying Changes in Functional Connectivity Between Groups or Conditions

    PubMed Central

    Ing, Alex; Schwarzbauer, Christian

    2014-01-01

    Functional connectivity has become an increasingly important area of research in recent years. At a typical spatial resolution, approximately 300 million connections link each voxel in the brain with every other. This pattern of connectivity is known as the functional connectome. Connectivity is often compared between experimental groups and conditions. Standard methods used to control the type 1 error rate are likely to be insensitive when comparisons are carried out across the whole connectome, due to the huge number of statistical tests involved. To address this problem, two new cluster based methods – the cluster size statistic (CSS) and cluster mass statistic (CMS) – are introduced to control the family wise error rate across all connectivity values. These methods operate within a statistical framework similar to the cluster based methods used in conventional task based fMRI. Both methods are data driven, permutation based and require minimal statistical assumptions. Here, the performance of each procedure is evaluated in a receiver operator characteristic (ROC) analysis, utilising a simulated dataset. The relative sensitivity of each method is also tested on real data: BOLD (blood oxygen level dependent) fMRI scans were carried out on twelve subjects under normal conditions and during the hypercapnic state (induced through the inhalation of 6% CO2 in 21% O2 and 73%N2). Both CSS and CMS detected significant changes in connectivity between normal and hypercapnic states. A family wise error correction carried out at the individual connection level exhibited no significant changes in connectivity. PMID:24906136

  9. I Cannot Read My Statistics Textbook: The Relationship between Reading Ability and Statistics Anxiety

    ERIC Educational Resources Information Center

    Collins, Kathleen M. T.; Onwuegbuzie, Anthony J.

    2007-01-01

    Although several antecedents of statistics anxiety have been identified, many of these factors are relatively immutable (e.g., gender) and, at best, identify students who are at risk for debilitative levels of statistics anxiety, thereby having only minimal implications for intervention. Furthermore, the few interventions that have been designed…

  10. Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates.

    PubMed

    Xia, Li C; Steele, Joshua A; Cram, Jacob A; Cardon, Zoe G; Simmons, Sheri L; Vallino, Joseph J; Fuhrman, Jed A; Sun, Fengzhu

    2011-01-01

    The increasing availability of time series microbial community data from metagenomics and other molecular biological studies has enabled the analysis of large-scale microbial co-occurrence and association networks. Among the many analytical techniques available, the Local Similarity Analysis (LSA) method is unique in that it captures local and potentially time-delayed co-occurrence and association patterns in time series data that cannot otherwise be identified by ordinary correlation analysis. However LSA, as originally developed, does not consider time series data with replicates, which hinders the full exploitation of available information. With replicates, it is possible to understand the variability of local similarity (LS) score and to obtain its confidence interval. We extended our LSA technique to time series data with replicates and termed it extended LSA, or eLSA. Simulations showed the capability of eLSA to capture subinterval and time-delayed associations. We implemented the eLSA technique into an easy-to-use analytic software package. The software pipeline integrates data normalization, statistical correlation calculation, statistical significance evaluation, and association network construction steps. We applied the eLSA technique to microbial community and gene expression datasets, where unique time-dependent associations were identified. The extended LSA analysis technique was demonstrated to reveal statistically significant local and potentially time-delayed association patterns in replicated time series data beyond that of ordinary correlation analysis. These statistically significant associations can provide insights to the real dynamics of biological systems. The newly designed eLSA software efficiently streamlines the analysis and is freely available from the eLSA homepage, which can be accessed at http://meta.usc.edu/softs/lsa.
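
    A simplified sketch of the local similarity (LS) score at the core of LSA: the maximum local, possibly time-delayed, aligned partial sum of two series (Kadane's algorithm over each delay), normalized by series length. The published eLSA adds normal-score transformation, replicate handling, and permutation-based significance; the function below is illustrative only.

```python
# Simplified local similarity score with a bounded time delay.
import numpy as np

def local_similarity(x, y, max_delay=3):
    """Max local aligned partial sum of x[t]*y[t+d], |d| <= max_delay, over n."""
    n = len(x)
    best = 0.0
    for d in range(-max_delay, max_delay + 1):
        prod = x[: n - d] * y[d:] if d >= 0 else x[-d:] * y[: n + d]
        for sign in (1.0, -1.0):           # positive and negative associations
            run = cum = 0.0
            for v in sign * prod:          # Kadane: best contiguous sum
                cum = max(0.0, cum + v)
                run = max(run, cum)
            best = max(best, run)
    return best / n

rng = np.random.default_rng(0)
a = rng.normal(size=50)
b = np.roll(a, 2) + 0.3 * rng.normal(size=50)   # b lags a by two steps
print(local_similarity(a, b, max_delay=3))
```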

  11. Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates

    PubMed Central

    2011-01-01

    Background The increasing availability of time series microbial community data from metagenomics and other molecular biological studies has enabled the analysis of large-scale microbial co-occurrence and association networks. Among the many analytical techniques available, the Local Similarity Analysis (LSA) method is unique in that it captures local and potentially time-delayed co-occurrence and association patterns in time series data that cannot otherwise be identified by ordinary correlation analysis. However LSA, as originally developed, does not consider time series data with replicates, which hinders the full exploitation of available information. With replicates, it is possible to understand the variability of local similarity (LS) score and to obtain its confidence interval. Results We extended our LSA technique to time series data with replicates and termed it extended LSA, or eLSA. Simulations showed the capability of eLSA to capture subinterval and time-delayed associations. We implemented the eLSA technique into an easy-to-use analytic software package. The software pipeline integrates data normalization, statistical correlation calculation, statistical significance evaluation, and association network construction steps. We applied the eLSA technique to microbial community and gene expression datasets, where unique time-dependent associations were identified. Conclusions The extended LSA analysis technique was demonstrated to reveal statistically significant local and potentially time-delayed association patterns in replicated time series data beyond that of ordinary correlation analysis. These statistically significant associations can provide insights to the real dynamics of biological systems. The newly designed eLSA software efficiently streamlines the analysis and is freely available from the eLSA homepage, which can be accessed at http://meta.usc.edu/softs/lsa. PMID:22784572

  12. Hydrometeor classification through statistical clustering of polarimetric radar measurements: a semi-supervised approach

    NASA Astrophysics Data System (ADS)

    Besic, Nikola; Ventura, Jordi Figueras i.; Grazioli, Jacopo; Gabella, Marco; Germann, Urs; Berne, Alexis

    2016-09-01

    Polarimetric radar-based hydrometeor classification is the procedure of identifying different types of hydrometeors by exploiting polarimetric radar observations. The main drawback of the existing supervised classification methods, mostly based on fuzzy logic, is a significant dependency on a presumed electromagnetic behaviour of different hydrometeor types: the results of the classification largely rely upon the quality of scattering simulations. The unsupervised approach, in turn, lacks the constraints related to hydrometeor microphysics. The idea of the proposed method is to compensate for these drawbacks by combining the two approaches so that microphysical hypotheses can, to a degree, adjust the content of the classes obtained statistically from the observations. This is done by means of an iterative approach, performed offline, which, in a statistical framework, examines clustered representative polarimetric observations by comparing them to the presumed polarimetric properties of each hydrometeor class. Aside from this comparison, the routine alters the content of clusters by encouraging further statistical clustering when a cluster cannot be identified with any class. By merging all identified clusters, the multi-dimensional polarimetric signatures of various hydrometeor types are obtained for each of the studied representative datasets, i.e. for each radar system of interest. These are depicted by sets of centroids which are then employed in operational labelling of different hydrometeors. The method has been applied to three C-band datasets, each acquired by a different operational radar from the MeteoSwiss Rad4Alp network, as well as to two X-band datasets acquired by two research mobile radars. The results are discussed through a comparative analysis which includes a corresponding supervised and unsupervised approach, emphasising the operational potential of the proposed method.

  13. High Variability in Cellular Stoichiometry of Carbon, Nitrogen, and Phosphorus Within Classes of Marine Eukaryotic Phytoplankton Under Sufficient Nutrient Conditions.

    PubMed

    Garcia, Nathan S; Sexton, Julie; Riggins, Tracey; Brown, Jeff; Lomas, Michael W; Martiny, Adam C

    2018-01-01

    Current hypotheses suggest that cellular elemental stoichiometry of marine eukaryotic phytoplankton such as the ratios of cellular carbon:nitrogen:phosphorus (C:N:P) vary between phylogenetic groups. To investigate how phylogenetic structure, cell volume, growth rate, and temperature interact to affect the cellular elemental stoichiometry of marine eukaryotic phytoplankton, we examined the C:N:P composition in 30 isolates across 7 classes of marine phytoplankton that were grown with a sufficient supply of nutrients and nitrate as the nitrogen source. The isolates covered a wide range in cell volume (5 orders of magnitude), growth rate (<0.01-0.9 d⁻¹), and habitat temperature (2-24°C). Our analysis indicates that C:N:P is highly variable, with statistical model residuals accounting for over half of the total variance and no relationship between phylogeny and elemental stoichiometry. Furthermore, our data indicated that variability in C:P, N:P, and C:N within Bacillariophyceae (diatoms) was as high as that among all of the isolates that we examined. In addition, a linear statistical model identified a positive relationship between diatom cell volume and C:P and N:P. Among all of the isolates that we examined, the statistical model identified temperature as a significant factor, consistent with the temperature-dependent translation efficiency model, but temperature only explained 5% of the total statistical model variance. While some of our results support data from previous field studies, the high variability of elemental ratios within Bacillariophyceae contradicts previous work that suggests that this cosmopolitan group of microalgae has consistently low C:P and N:P ratios in comparison with other groups.

  14. Statistical identification of gene association by CID in application of constructing ER regulatory network

    PubMed Central

    Liu, Li-Yu D; Chen, Chien-Yu; Chen, Mei-Ju M; Tsai, Ming-Shian; Lee, Cho-Han S; Phang, Tzu L; Chang, Li-Yun; Kuo, Wen-Hung; Hwa, Hsiao-Lin; Lien, Huang-Chun; Jung, Shih-Ming; Lin, Yi-Shing; Chang, King-Jen; Hsieh, Fon-Jou

    2009-01-01

    Background A variety of high-throughput techniques are now available for constructing comprehensive gene regulatory networks in systems biology. In this study, we report a new statistical approach for facilitating in silico inference of regulatory network structure. The new measure of association, coefficient of intrinsic dependence (CID), is model-free and can be applied to both continuous and categorical distributions. When given two variables X and Y, CID answers whether Y is dependent on X by examining the conditional distribution of Y given X. In this paper, we apply CID to analyze the regulatory relationships between transcription factors (TFs) (X) and their downstream genes (Y) based on clinical data. More specifically, we use estrogen receptor α (ERα) as the variable X, and the analyses are based on 48 clinical breast cancer gene expression arrays (48A). Results The analytical utility of CID was evaluated in comparison with four commonly used statistical methods, Galton-Pearson's correlation coefficient (GPCC), Student's t-test (STT), coefficient of determination (CoD), and mutual information (MI). When compared to GPCC, CoD, and MI, CID reveals its preferential ability to discover regulatory associations where the distribution of the mRNA expression levels of X and Y does not fit linear models. On the other hand, when CID is used to measure the association of a continuous variable (Y) against a discrete variable (X), it shows similar performance to STT and appears to outperform CoD and MI. In addition, this study established a two-layer transcriptional regulatory network to exemplify the usage of CID, in combination with GPCC, in deciphering gene networks based on gene expression profiles from patient arrays. Conclusion CID is shown to provide useful information for identifying associations between genes and transcription factors of interest in patient arrays. When coupled with the relationships detected by GPCC, the associations predicted by CID are applicable to the construction of transcriptional regulatory networks. This study shows how information from different data sources and learning algorithms can be integrated to investigate whether relevant regulatory mechanisms identified in cell models can also be partially re-identified in clinical samples of breast cancers. Availability: the implementation of CID in R code can be freely downloaded from . PMID:19292896

  15. Deriving inertial wave characteristics from surface drifter velocities - Frequency variability in the tropical Pacific

    NASA Technical Reports Server (NTRS)

    Poulain, Pierre-Marie; Luther, Douglas S.; Patzert, William C.

    1992-01-01

    Two techniques were developed for estimating statistics of inertial oscillations from satellite-tracked drifters that overcome the difficulties inherent in estimating such statistics from data dependent upon space coordinates that are a function of time. Application of these techniques to tropical surface drifter data collected during the NORPAX, EPOCS, and TOGA programs reveals a latitude-dependent, statistically significant 'blue shift' of inertial wave frequency. The latitudinal dependence of the blue shift is similar to predictions based on 'global' internal-wave spectral models, with a superposition of frequency shifting due to modification of the effective local inertial frequency by the presence of strongly sheared zonal mean currents within 12 deg of the equator.

  16. Self-consistent mean-field approach to the statistical level density in spherical nuclei

    NASA Astrophysics Data System (ADS)

    Kolomietz, V. M.; Sanzhur, A. I.; Shlomo, S.

    2018-06-01

    A self-consistent mean-field approach within the extended Thomas-Fermi approximation with Skyrme forces is applied to the calculations of the statistical level density in spherical nuclei. Landau's concept of quasiparticles with the nucleon effective mass and the correct description of the continuum states for the finite-depth potentials are taken into consideration. The A dependence and the temperature dependence of the statistical inverse level-density parameter K are obtained in good agreement with experimental data.

  17. Factors associated with frailty in chronically ill older adults.

    PubMed

    Hackstaff, Lynn

    2009-01-01

    An ex post facto analysis of a secondary dataset examined relationships between physical frailty, depression, and the self-perceived domains of health status and quality-of-life in older adults. The randomized sample included 992 community-dwelling, chronically ill, and functionally impaired adults age 65 and older who received care from a Southern California Kaiser Permanente medical center between 1998 and 2002. Physical frailty represents a level of physiologic vulnerability and functional loss that results in dependence on others for basic, daily living needs (Fried et al., 2001). The purpose of the study was to identify possible intervention junctures related to self-efficacy of older adults in order to help optimize their functionality. Multivariate correlation analyses showed statistically significant positive correlations between frailty level and depression (r = .18, p < .05), number of medical conditions (r = .09, p < .05), and self-rated quality-of-life (r = .24, p < .05). Frailty level showed a statistically significant negative correlation with self-perceived health status (r = -.25, p < .05). Notably, no statistically significant correlation was found between age and frailty level (r = -.03). In linear regression, self-perceived health status shared partial variance with frailty level (part r = -.18). The significant correlations found support further research to identify interventions that help vulnerable older adults challenge self-perceived capabilities, so that they may achieve optimum functionality through increased physical activity earlier on and increased self-efficacy to support successful adaptation to aging-related losses.

  18. 28 CFR 22.20 - Applicability.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Judicial Administration DEPARTMENT OF JUSTICE CONFIDENTIALITY OF IDENTIFIABLE RESEARCH AND STATISTICAL INFORMATION § 22.20 Applicability. (a) These regulations govern use and revelation of research and statistical... identifiable research or statistical information was originally obtained; or to any records which are...

  19. 28 CFR 22.20 - Applicability.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Judicial Administration DEPARTMENT OF JUSTICE CONFIDENTIALITY OF IDENTIFIABLE RESEARCH AND STATISTICAL INFORMATION § 22.20 Applicability. (a) These regulations govern use and revelation of research and statistical... identifiable research or statistical information was originally obtained; or to any records which are...

  20. Equitability, mutual information, and the maximal information coefficient.

    PubMed

    Kinney, Justin B; Atwal, Gurinder S

    2014-03-04

    How should one quantify the strength of association between two random variables without bias for relationships of a specific form? Despite its conceptual simplicity, this notion of statistical "equitability" has yet to receive a definitive mathematical formalization. Here we argue that equitability is properly formalized by a self-consistency condition closely related to Data Processing Inequality. Mutual information, a fundamental quantity in information theory, is shown to satisfy this equitability criterion. These findings are at odds with the recent work of Reshef et al. [Reshef DN, et al. (2011) Science 334(6062):1518-1524], which proposed an alternative definition of equitability and introduced a new statistic, the "maximal information coefficient" (MIC), said to satisfy equitability in contradistinction to mutual information. These conclusions, however, were supported only with limited simulation evidence, not with mathematical arguments. Upon revisiting these claims, we prove that the mathematical definition of equitability proposed by Reshef et al. cannot be satisfied by any (nontrivial) dependence measure. We also identify artifacts in the reported simulation evidence. When these artifacts are removed, estimates of mutual information are found to be more equitable than estimates of MIC. Mutual information is also observed to have consistently higher statistical power than MIC. We conclude that estimating mutual information provides a natural (and often practical) way to equitably quantify statistical associations in large datasets.
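
    For concreteness, a naive plug-in (histogram) estimator of mutual information; this is not the estimator used by either set of authors, but it illustrates the quantity being compared with MIC. Bin count and sample sizes are arbitrary.

```python
# Plug-in (histogram) estimate of mutual information I(X;Y) in bits.
import numpy as np

def mutual_information(x, y, bins=16):
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
print(mutual_information(x, x + 0.5 * rng.normal(size=5000)))   # dependent pair
print(mutual_information(x, rng.normal(size=5000)))             # ~0 for independence
```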

  1. Extreme between-study homogeneity in meta-analyses could offer useful insights.

    PubMed

    Ioannidis, John P A; Trikalinos, Thomas A; Zintzaras, Elias

    2006-10-01

    Meta-analyses are routinely evaluated for the presence of large between-study heterogeneity. We examined whether it is also important to probe whether there is extreme between-study homogeneity. We used heterogeneity tests with left-sided statistical significance for inference and developed a Monte Carlo simulation test for testing extreme homogeneity in risk ratios across studies, using the empiric distribution of the summary risk ratio and heterogeneity statistic. A left-sided P=0.01 threshold was set for claiming extreme homogeneity to minimize type I error. Among 11,803 meta-analyses with binary contrasts from the Cochrane Library, 143 (1.21%) had left-sided P-value <0.01 for the asymptotic Q statistic and 1,004 (8.50%) had left-sided P-value <0.10. The frequency of extreme between-study homogeneity did not depend on the number of studies in the meta-analyses. We identified examples where extreme between-study homogeneity (left-sided P-value <0.01) could result from various possibilities beyond chance. These included inappropriate statistical inference (asymptotic vs. Monte Carlo), use of a specific effect metric, correlated data or stratification using strong predictors of outcome, and biases and potential fraud. Extreme between-study homogeneity may provide useful insights about a meta-analysis and its constituent studies.
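
    A minimal sketch of the asymptotic version of this idea: Cochran's Q computed from study-level log risk ratios, with the left tail of the chi-square distribution used to flag extreme homogeneity. The paper's preferred inference uses a Monte Carlo test instead; the study values below are hypothetical.

```python
# Left-sided homogeneity test based on the asymptotic Q statistic.
import numpy as np
from scipy import stats

def left_sided_homogeneity_p(log_rr, var_log_rr):
    w = 1.0 / np.asarray(var_log_rr)
    theta = np.asarray(log_rr)
    pooled = np.sum(w * theta) / np.sum(w)
    q = np.sum(w * (theta - pooled) ** 2)          # Cochran's Q
    df = len(theta) - 1
    return stats.chi2.cdf(q, df)                   # small value -> suspiciously homogeneous

# Five hypothetical studies with nearly identical effects:
print(left_sided_homogeneity_p([0.10, 0.11, 0.10, 0.09, 0.10],
                               [0.04, 0.05, 0.03, 0.04, 0.05]))
```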

  2. Statistics of Magnetic Reconnection X-Lines in Kinetic Turbulence

    NASA Astrophysics Data System (ADS)

    Haggerty, C. C.; Parashar, T.; Matthaeus, W. H.; Shay, M. A.; Wan, M.; Servidio, S.; Wu, P.

    2016-12-01

    In this work we examine the statistics of magnetic reconnection (x-lines) and their associated reconnection rates in intermittent current sheets generated in turbulent plasmas. Although such statistics have been studied previously for fluid simulations (e.g. [1]), they have not yet been generalized to fully kinetic particle-in-cell (PIC) simulations. A significant problem with PIC simulations, however, is electrostatic fluctuations generated due to numerical particle counting statistics. We find that analyzing gradients of the magnetic vector potential from the raw PIC field data identifies numerous artificial (or non-physical) x-points. Using small Orszag-Tang vortex PIC simulations, we analyze x-line identification and show that these artificial x-lines can be removed using sub-Debye length filtering of the data. We examine how turbulent properties such as the magnetic spectrum and scale dependent kurtosis are affected by particle noise and sub-Debye length filtering. We subsequently apply these analysis methods to a large scale kinetic PIC turbulent simulation. Consistent with previous fluid models, we find a range of normalized reconnection rates as large as ½, but with the bulk of the rates below about 0.1. [1] Servidio, S., W. H. Matthaeus, M. A. Shay, P. A. Cassak, and P. Dmitruk (2009), Magnetic reconnection and two-dimensional magnetohydrodynamic turbulence, Phys. Rev. Lett., 102, 115003.
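
    A schematic sketch of X-point identification from a 2-D out-of-plane vector potential: candidate X-points are grid cells where the gradient of A_z is near zero and the Hessian determinant is negative (a saddle). The gradient tolerance and the synthetic test field are arbitrary, and in practice sub-Debye smoothing of the fields would be applied before this step to suppress artificial X-points from particle-count noise.

```python
# Saddle-point (X-point) candidates of a 2-D vector potential A_z on a grid.
import numpy as np

def find_x_points(az, dx=1.0, dy=1.0, tol=0.05):
    dax, day = np.gradient(az, dx, dy)
    daxx, daxy = np.gradient(dax, dx, dy)
    _, dayy = np.gradient(day, dx, dy)
    grad_mag = np.hypot(dax, day)
    det_hess = daxx * dayy - daxy ** 2
    mask = (grad_mag < tol * grad_mag.max()) & (det_hess < 0)   # near-zero gradient, saddle curvature
    return np.argwhere(mask)

# Synthetic test field with known saddles at odd multiples of pi/2:
x = np.linspace(0, 2 * np.pi, 64, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
print(find_x_points(np.cos(X) * np.cos(Y), dx=x[1] - x[0], dy=x[1] - x[0])[:5])
```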

  3. Significant Pre-Accession Factors Predicting Success or Failure During a Marine Corps Officer’s Initial Service Obligation

    DTIC Science & Technology

    2015-12-01

    [Only table-of-contents fragments were captured for this record: Waivers; Appendix C, Descriptive Statistics; summary statistics of the dependent, academics, and application variables.]

  4. Identification of reliable gridded reference data for statistical downscaling methods in Alberta

    NASA Astrophysics Data System (ADS)

    Eum, H. I.; Gupta, A.

    2017-12-01

    Climate models provide essential information to assess impacts of climate change at regional and global scales; statistical downscaling methods are then applied to prepare climate model data for applications such as hydrologic and ecologic modelling at the watershed scale. Because the reliability and the spatial and temporal resolution of statistically downscaled climate data depend mainly on the reference data, identifying the most reliable reference data is crucial for statistical downscaling. A growing number of gridded climate products are available for the key climate variables that drive regional modelling systems. However, inconsistencies among these products, for example different combinations of climate variables, varying data domains and data lengths, and accuracy that varies with the physiographic characteristics of the landscape, have made it challenging to select the most suitable reference climate data for environmental studies and modelling. Employing several observation-based daily gridded climate products available in the public domain, i.e. thin plate spline regression products (ANUSPLIN and TPS), an inverse distance method (Alberta Townships), a numerical climate model (North American Regional Reanalysis), and an optimum interpolation technique (Canadian Precipitation Analysis), this study evaluates the accuracy of the climate products at each grid point by comparison with the Adjusted and Homogenized Canadian Climate Data (AHCCD) observations for precipitation and minimum and maximum temperature over the province of Alberta. Based on the performance of the climate products at AHCCD stations, we ranked their reliability for station elevations discretized into several classes. According to the rank of climate products for each elevation class, we identified the most reliable climate products based on the elevation of target points. A web-based system was developed to allow users to easily select the most reliable reference climate data at each target point based on the elevation of the grid cell. By constructing the best combination of reference data for the study domain, the accuracy and reliability of statistically downscaled climate projections could be significantly improved.

  5. A consistent framework for Horton regression statistics that leads to a modified Hack's law

    USGS Publications Warehouse

    Furey, P.R.; Troutman, B.M.

    2008-01-01

    A statistical framework is introduced that resolves important problems with the interpretation and use of traditional Horton regression statistics. The framework is based on a univariate regression model that leads to an alternative expression for the Horton ratio, connects Horton regression statistics to distributional simple scaling, and improves the accuracy in estimating Horton plot parameters. The model is used to examine data for drainage area A and mainstream length L from two groups of basins located in different physiographic settings. Results show that confidence intervals for the Horton plot regression statistics are quite wide. Nonetheless, an analysis of covariance shows that regression intercepts, but not regression slopes, can be used to distinguish between basin groups. The univariate model is generalized to include n > 1 dependent variables. For the case where the dependent variables represent ln A and ln L, the generalized model performs somewhat better at distinguishing between basin groups than two separate univariate models. The generalized model leads to a modification of Hack's law where L depends on both A and Strahler order ω. Data show that ω plays a statistically significant role in the modified Hack's law expression. © 2008 Elsevier B.V.
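
    An illustrative sketch (with synthetic numbers, not the basin data) of fitting the classic Hack's law, log L = log c + h log A, and the modified form that adds Strahler order as a second predictor.

```python
# Hypothetical fit of classic and order-modified Hack's law on synthetic data.
import numpy as np

rng = np.random.default_rng(0)
A = 10 ** rng.uniform(0, 4, 200)                    # drainage areas (km^2)
omega = rng.integers(1, 6, 200)                     # Strahler orders 1..5
logL = 0.5 + 0.57 * np.log10(A) + 0.03 * omega + 0.05 * rng.normal(size=200)

X = np.column_stack([np.ones(200), np.log10(A), omega])
coef, *_ = np.linalg.lstsq(X, logL, rcond=None)
print(coef)   # intercept, Hack exponent, Strahler-order effect
```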

  6. 28 CFR 22.1 - Purpose.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Administration DEPARTMENT OF JUSTICE CONFIDENTIALITY OF IDENTIFIABLE RESEARCH AND STATISTICAL INFORMATION § 22.1... information identifiable to a private person obtained in a research or statistical program may only be used... reliability of federally-supported research and statistical findings by minimizing subject concern over...

  7. 28 CFR 22.1 - Purpose.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Administration DEPARTMENT OF JUSTICE CONFIDENTIALITY OF IDENTIFIABLE RESEARCH AND STATISTICAL INFORMATION § 22.1... information identifiable to a private person obtained in a research or statistical program may only be used... reliability of federally-supported research and statistical findings by minimizing subject concern over...

  8. Relational memory and self-efficacy measures reveal distinct profiles of subjective memory concerns in older adults.

    PubMed

    Lucas, Heather D; Monti, Jim M; McAuley, Edward; Watson, Patrick D; Kramer, Arthur F; Cohen, Neal J

    2016-07-01

    Subjective memory concerns (SMCs) in healthy older adults are associated with future decline and can indicate preclinical dementia. However, SMCs may be multiply determined, and often correlate with affective or psychosocial variables rather than with performance on memory tests. Our objective was to identify sensitive and selective methods to disentangle the underlying causes of SMCs. Because preclinical dementia pathology targets the hippocampus, we hypothesized that performance on hippocampally dependent relational memory tests would correlate with SMCs. We thus administered a series of memory tasks with varying dependence on relational memory processing to 91 older adults, along with questionnaires assessing depression, anxiety, and memory self-efficacy. We used correlational, regression, and mediation analyses to compare the variance in SMCs accounted for by these measures. Performance on the task most dependent on relational memory processing showed a stronger negative association with SMCs than did other memory performance metrics. SMCs were also negatively associated with memory self-efficacy. These 2 measures, along with age and education, accounted for 40% of the variance in SMCs. Self-efficacy and relational memory were uncorrelated and independent predictors of SMCs. Moreover, self-efficacy statistically mediated the relationship between SMCs and depression and anxiety, which can be detrimental to cognitive aging. These data identify multiple mechanisms that can contribute to SMCs, and suggest that SMCs can both cause and be caused by age-related cognitive decline. Relational memory measures may be effective assays of objective memory difficulties, while assessing self-efficacy could identify detrimental affective responses to cognitive aging. (PsycINFO Database Record (c) 2016 APA, all rights reserved).

  9. Characterizing Protease Specificity: How Many Substrates Do We Need?

    PubMed Central

    Schauperl, Michael; Fuchs, Julian E.; Waldner, Birgit J.; Huber, Roland G.; Kramer, Christian; Liedl, Klaus R.

    2015-01-01

    Calculation of cleavage entropies makes it possible to quantify, map, and compare protease substrate specificity by an information-entropy-based approach. The metric intrinsically depends on the number of experimentally determined substrates (data points). Thus, a statistical analysis of its numerical stability is crucial to estimate the systematic error made by estimating specificity based on a limited number of substrates. In this contribution, we show the mathematical basis for estimating the uncertainty in cleavage entropies. Sets of cleavage entropies are calculated using experimental cleavage data and modeled extreme cases. By analyzing the underlying mathematics and applying statistical tools, a linear dependence of the metric with respect to 1/n was found. This allows us to extrapolate the values to an infinite number of samples and to estimate the errors. Analyzing the errors, a minimum number of 30 substrates was found to be necessary to characterize substrate specificity, in terms of amino acid variability, for a protease (S4-S4’) with an uncertainty of 5 percent. Therefore, we encourage experimental researchers in the protease field to record specificity profiles of novel proteases aiming to identify at least 30 peptide substrates of maximum sequence diversity. We expect a full characterization of protease specificity to be helpful in rationalizing biological functions of proteases and in assisting rational drug design. PMID:26559682
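    The 1/n extrapolation described above can be illustrated with a toy calculation: estimate a Shannon entropy from random substrate subsets of increasing size n, regress the estimates against 1/n, and read the intercept off as the infinite-sample value. The single-position entropy and all numbers below are simplifying assumptions for illustration, not the paper's cleavage-entropy code.

      # Toy sketch of the 1/n extrapolation for an entropy-type specificity metric.
      import numpy as np

      rng = np.random.default_rng(1)
      true_p = rng.dirichlet(np.ones(20))            # hypothetical residue preferences

      def entropy_from_sample(n):
          counts = np.bincount(rng.choice(20, size=n, p=true_p), minlength=20)
          p = counts[counts > 0] / n
          return -(p * np.log2(p)).sum()

      sizes = np.array([10, 20, 30, 50, 80, 120, 200])
      ent = np.array([np.mean([entropy_from_sample(n) for _ in range(200)])
                      for n in sizes])

      slope, intercept = np.polyfit(1.0 / sizes, ent, 1)     # linear in 1/n
      print("extrapolated entropy (n -> infinity):", intercept)
      print("entropy of the underlying distribution:", -(true_p * np.log2(true_p)).sum())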

  10. A weighted U statistic for association analyses considering genetic heterogeneity.

    PubMed

    Wei, Changshuai; Elston, Robert C; Lu, Qing

    2016-07-20

    Converging evidence suggests that common complex diseases with the same or similar clinical manifestations could have different underlying genetic etiologies. While current research interests have shifted toward uncovering rare variants and structural variations predisposing to human diseases, the impact of heterogeneity in genetic studies of complex diseases has been largely overlooked. Most of the existing statistical methods assume the disease under investigation has a homogeneous genetic effect and could, therefore, have low power if the disease undergoes heterogeneous pathophysiological and etiological processes. In this paper, we propose a heterogeneity-weighted U (HWU) method for association analyses considering genetic heterogeneity. HWU can be applied to various types of phenotypes (e.g., binary and continuous) and is computationally efficient for high-dimensional genetic data. Through simulations, we showed the advantage of HWU when the underlying genetic etiology of a disease was heterogeneous, as well as the robustness of HWU against different model assumptions (e.g., phenotype distributions). Using HWU, we conducted a genome-wide analysis of nicotine dependence from the Study of Addiction: Genetics and Environments dataset. The genome-wide analysis of nearly one million genetic markers took 7 h, identifying heterogeneous effects of two new genes (i.e., CYP3A5 and IKBKB) on nicotine dependence. Copyright © 2016 John Wiley & Sons, Ltd.
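    The general shape of a weighted U statistic can be sketched in a few lines: sum a pairwise genetic-similarity kernel over subject pairs, weighting each pair by how close the subjects are phenotypically. The kernel, weight function, and simulated data below are illustrative assumptions, not the HWU implementation; in practice significance would come from the method's asymptotic theory or from permutation.

      # Generic sketch of a phenotype-weighted U statistic (not the HWU package).
      import numpy as np

      rng = np.random.default_rng(2)
      n, m = 100, 50
      G = rng.integers(0, 3, size=(n, m)).astype(float)   # genotypes coded 0/1/2
      y = rng.normal(size=n)                               # a continuous phenotype

      Gc = (G - G.mean(axis=0)) / (G.std(axis=0) + 1e-12)
      h = Gc @ Gc.T / m                                    # genetic-similarity kernel
      w = np.exp(-np.abs(y[:, None] - y[None, :]))         # phenotype-closeness weights

      iu = np.triu_indices(n, k=1)                         # use each pair i < j once
      U = np.sum(w[iu] * h[iu]) / iu[0].size
      print("weighted U statistic:", U)
      # A null distribution can be built by permuting y and recomputing U.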

  11. Characteristics of Sudden Commencements Observed by Van Allen Probes in the Inner Magnetosphere

    NASA Astrophysics Data System (ADS)

    Fathy, A.; Kim, K.-H.; Park, J.-S.; Jin, H.; Kletzing, C.; Wygant, J. R.; Ghamry, E.

    2018-02-01

    We have statistically studied sudden commencement (SC) by using the data acquired from Van Allen Probes (VAP) in the inner magnetosphere (L = 3.0-6.5) and GOES spacecraft at geosynchronous orbit (L =˜ 6.7) from October 2012 to September 2017. During the time period, we identified 85 SCs in the inner magnetosphere and 90 SCs at geosynchronous orbit. Statistical results of the SC events reveal the following characteristics. (1) There is strong seasonal dependence of the geosynchronous SC amplitude in the radial BV component at all local times. However, BV shows weak seasonal variation on the dayside in the inner magnetosphere. (2) The local time dependence of the SC amplitude in the compressional BH component at geosynchronous orbit is similar to that in the inner magnetosphere. (3) In a nightside region of L = 5.0-6.5, ˜19% of BH events are negative, while ˜58% of BH events are negative at geosynchronous orbit. (4) The amplitude of the SC-associated Ey perturbations varies systematically with local time with a morning-afternoon asymmetry near noon. These observations can be explained by spatial and/or temporal changes in the magnetopause and cross-tail currents, which are caused by changes in the solar wind dynamic pressure, with respect to spacecraft positions.

  12. Controlling the crystal polymorph by exploiting the time dependence of nucleation rates.

    PubMed

    Little, Laurie J; King, Alice A K; Sear, Richard P; Keddie, Joseph L

    2017-10-14

    Most substances can crystallise into two or more different crystal lattices called polymorphs. Despite this, there are no systems in which we can quantitatively predict the probability of one competing polymorph forming instead of the other. We address this problem using large scale (hundreds of events) studies of the competing nucleation of the alpha and gamma polymorphs of glycine. In situ Raman spectroscopy is used to identify the polymorph of each crystal. We find that the nucleation kinetics of the two polymorphs are very different. Nucleation of the alpha polymorph starts off slowly but accelerates, while nucleation of the gamma polymorph starts off fast but then slows. We exploit this difference to increase the purity with which we obtain the gamma polymorph by a factor of ten. The statistics of the nucleation of crystals is analogous to that of human mortality, and using a result from medical statistics, we show that conventional nucleation data can say nothing about what correlations, if any, exist between competing nucleation processes. Thus we can show that with data of our form it is impossible to disentangle the competing nucleation processes. We also find that the growth rate and the shape of a crystal depend on when it nucleated. This is new evidence that nucleation and growth are linked.

  13. Further developments in cloud statistics for computer simulations

    NASA Technical Reports Server (NTRS)

    Chang, D. T.; Willand, J. H.

    1972-01-01

    This study is a part of NASA's continued program to provide global statistics of cloud parameters for computer simulation. The primary emphasis was on the development of the data bank of the global statistical distributions of cloud types and cloud layers and their applications in the simulation of the vertical distributions of in-cloud parameters such as liquid water content. These statistics were compiled from actual surface observations as recorded in Standard WBAN forms. Data for a total of 19 stations were obtained and reduced. These stations were selected to be representative of the 19 primary cloud climatological regions defined in previous studies of cloud statistics. Using the data compiled in this study, a limited study was conducted of the homogeneity of cloud regions, the latitudinal dependence of cloud-type distributions, the dependence of these statistics on sample size, and other factors in the statistics which are of significance to the problem of simulation. The application of the statistics in cloud simulation was investigated. In particular, the inclusion of the new statistics in an expanded multi-step Monte Carlo simulation scheme is suggested and briefly outlined.

  14. A functional U-statistic method for association analysis of sequencing data.

    PubMed

    Jadhav, Sneha; Tong, Xiaoran; Lu, Qing

    2017-11-01

    Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of the sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases for studying common genetic mechanism. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease-associated genes, the different types of phenotypes pose challenges for association analysis. To address these challenges, we propose a nonparametric method, functional U-statistic method (FU), for multivariate analysis of sequencing data. It first constructs smooth functions from individuals' sequencing data, and then tests the association of these functions with multiple phenotypes by using a U-statistic. The method provides a general framework for analyzing various types of phenotypes (e.g., binary and continuous phenotypes) with unknown distributions. Fitting the genetic variants within a gene using a smoothing function also allows us to capture complexities of gene structure (e.g., linkage disequilibrium, LD), which could potentially increase the power of association analysis. Through simulations, we compared our method to the multivariate outcome score test (MOST), and found that our test attained better performance than MOST. In a real data application, we apply our method to the sequencing data from Minnesota Twin Study (MTS) and found potential associations of several nicotine receptor subunit (CHRN) genes, including CHRNB3, associated with nicotine dependence and/or alcohol dependence. © 2017 WILEY PERIODICALS, INC.

  15. Collaborative Project: The problem of bias in defining uncertainty in computationally enabled strategies for data-driven climate model development. Final Technical Report.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huerta, Gabriel

    The objective of the project is to develop strategies for better representing scientific sensibilities within statistical measures of model skill that then can be used within a Bayesian statistical framework for data-driven climate model development and improved measures of model scientific uncertainty. One of the thorny issues in model evaluation is quantifying the effect of biases on climate projections. While any bias is not desirable, only those biases that affect feedbacks affect scatter in climate projections. The effort at the University of Texas is to analyze previously calculated ensembles of CAM3.1 with perturbed parameters to discover how biases affect projections of global warming. The hypothesis is that compensating errors in the control model can be identified by their effect on a combination of processes and that developing metrics that are sensitive to dependencies among state variables would provide a way to select versions of climate models that may reduce scatter in climate projections. Gabriel Huerta at the University of New Mexico is responsible for developing statistical methods for evaluating these field dependencies. The UT effort will incorporate these developments into MECS, which is a set of python scripts being developed at the University of Texas for managing the workflow associated with data-driven climate model development over HPC resources. This report reflects the main activities at the University of New Mexico where the PI (Huerta) and the Postdocs (Nosedal, Hattab and Karki) worked on the project.

  16. Methodological choices affect cancer incidence rates: a cohort study.

    PubMed

    Brooke, Hannah L; Talbäck, Mats; Feychting, Maria; Ljung, Rickard

    2017-01-19

    Incidence rates are fundamental to epidemiology, but their magnitude and interpretation depend on methodological choices. We aimed to examine the extent to which the definition of the study population affects cancer incidence rates. All primary cancer diagnoses in Sweden between 1958 and 2010 were identified from the national Cancer Register. Age-standardized and age-specific incidence rates of 29 cancer subtypes between 2000 and 2010 were calculated using four definitions of the study population: persons resident in Sweden 1) based on general population statistics; 2) with no previous subtype-specific cancer diagnosis; 3) with no previous cancer diagnosis except non-melanoma skin cancer; and 4) with no previous cancer diagnosis of any type. We calculated absolute and relative differences between methods. Age-standardized incidence rates calculated using general population statistics ranged from 6% lower (prostate cancer, incidence rate difference: -13.5/100,000 person-years) to 8% higher (breast cancer in women, incidence rate difference: 10.5/100,000 person-years) than incidence rates based on individuals with no previous subtype-specific cancer diagnosis. Age-standardized incidence rates in persons with no previous cancer of any type were up to 10% lower (bladder cancer in women) than rates in those with no previous subtype-specific cancer diagnosis; however, absolute differences were <5/100,000 person-years for all cancer subtypes. For some cancer subtypes incidence rates vary depending on the definition of the study population. For these subtypes, standardized incidence ratios calculated using general population statistics could be misleading. Moreover, etiological arguments should be used to inform methodological choices during study design.

  17. Spatial dependency of V. cholera prevalence on open space refuse dumps in Kumasi, Ghana: a spatial statistical modelling

    PubMed Central

    Osei, Frank B; Duker, Alfred A

    2008-01-01

    Background Cholera has persisted in Ghana since its introduction in the early 70's. From 1999 to 2005, the Ghana Ministry of Health officially reported a total of 26,924 cases and 620 deaths to the WHO. Etiological studies suggest that the natural habitat of V. cholera is the aquatic environment. Its ability to survive within and outside the aquatic environment makes cholera a complex health problem to manage. Once the disease is introduced in a population, several environmental factors may lead to prolonged transmission and secondary cases. An important environmental factor that predisposes individuals to cholera infection is sanitation. In this study, we exploit the importance of two main spatial measures of sanitation in cholera transmission in an urban city, Kumasi. These are proximity and density of refuse dumps within a community. Results A spatial statistical modelling carried out to determine the spatial dependency of cholera prevalence on refuse dumps show that, there is a direct spatial relationship between cholera prevalence and density of refuse dumps, and an inverse spatial relationship between cholera prevalence and distance to refuse dumps. A spatial scan statistics also identified four significant spatial clusters of cholera; a primary cluster with greater than expected cholera prevalence, and three secondary clusters with lower than expected cholera prevalence. A GIS based buffer analysis also showed that the minimum distance within which refuse dumps should not be sited within community centres is 500 m. Conclusion The results suggest that proximity and density of open space refuse dumps play a contributory role in cholera infection in Kumasi. PMID:19087235

  18. Capturing spatial and temporal patterns of widespread, extreme flooding across Europe

    NASA Astrophysics Data System (ADS)

    Busby, Kathryn; Raven, Emma; Liu, Ye

    2013-04-01

    Statistical characterisation of physical hazards is an integral part of probabilistic catastrophe models used by the reinsurance industry to estimate losses from large scale events. Extreme flood events are not restricted by country boundaries which poses an issue for reinsurance companies as their exposures often extend beyond them. We discuss challenges and solutions that allow us to appropriately capture the spatial and temporal dependence of extreme hydrological events on a continental scale, which in turn enables us to generate an industry-standard stochastic event set for estimating financial losses for widespread flooding. By presenting our event set methodology, we focus on explaining how extreme value theory (EVT) and dependence modelling are used to account for short, inconsistent hydrological data from different countries, and how to make appropriate statistical decisions that best characterise the nature of flooding across Europe. The consistency of input data is of vital importance when identifying historical flood patterns. Collating data from numerous sources inherently causes inconsistencies and we demonstrate our robust approach to assessing the data and refining it to compile a single consistent dataset. This dataset is then extrapolated using a parameterised EVT distribution to estimate extremes. Our method then captures the dependence of flood events across countries using an advanced multivariate extreme value model. Throughout, important statistical decisions are explored including: (1) distribution choice; (2) the threshold to apply for extracting extreme data points; (3) a regional analysis; (4) the definition of a flood event, which is often linked with the reinsurance industry's hours clause; and (5) handling of missing values. Finally, having modelled the historical patterns of flooding across Europe, we sample from this model to generate our stochastic event set comprising thousands of events over thousands of years. We then briefly illustrate how this is applied within a probabilistic model to estimate catastrophic loss curves used by the reinsurance industry.
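    The threshold-exceedance step that underlies this kind of EVT analysis can be sketched compactly: pick a high threshold, keep the exceedances, and fit a Generalized Pareto tail, from which return levels follow. The synthetic flow record, the 95th-percentile threshold, and the zero-location constraint below are illustrative choices, not the event-set methodology itself.

      # Peaks-over-threshold sketch for a single gauge record (scipy).
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(3)
      flows = rng.gumbel(loc=100.0, scale=25.0, size=3650)   # hypothetical daily flows

      u = np.quantile(flows, 0.95)                           # threshold choice matters
      excess = flows[flows > u] - u
      shape, loc, scale = stats.genpareto.fit(excess, floc=0.0)
      print("threshold:", round(u, 1), "shape:", round(shape, 3), "scale:", round(scale, 2))

      # Return level exceeded on average once every 100 years under the fitted tail
      rate = excess.size / (flows.size / 365.25)              # exceedances per year
      p = 1.0 / (100.0 * rate)
      print("100-year level:",
            round(u + stats.genpareto.ppf(1.0 - p, shape, loc=0.0, scale=scale), 1))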

  19. Kepler AutoRegressive Planet Search

    NASA Astrophysics Data System (ADS)

    Feigelson, Eric

    NASA's Kepler mission is the source of more exoplanets than any other instrument, but the discovery depends on complex statistical analysis procedures embedded in the Kepler pipeline. A particular challenge is mitigating irregular stellar variability without loss of sensitivity to faint periodic planetary transits. This proposal presents a two-stage alternative analysis procedure. First, parametric autoregressive ARFIMA models, commonly used in econometrics, remove most of the stellar variations. Second, a novel matched filter is used to create a periodogram from which transit-like periodicities are identified. This analysis procedure, the Kepler AutoRegressive Planet Search (KARPS), is confirming most of the Kepler Objects of Interest and is expected to identify additional planetary candidates. The proposed research will complete application of the KARPS methodology to the prime Kepler mission light curves of 200,000 stars, and compare the results with Kepler Objects of Interest obtained with the Kepler pipeline. We will then conduct a variety of astronomical studies based on the KARPS results. Important subsamples will be extracted including Habitable Zone planets, hot super-Earths, grazing-transit hot Jupiters, and multi-planet systems. Ground-based spectroscopy of poorly studied candidates will be performed to better characterize the host stars. Studies of stellar variability will then be pursued based on KARPS analysis. The autocorrelation function and nonstationarity measures will be used to identify spotted stars at different stages of autoregressive modeling. Periodic variables with folded light curves inconsistent with planetary transits will be identified; they may be eclipsing or mutually-illuminating binary star systems. Classification of stellar variables with KARPS-derived statistical properties will be attempted. KARPS procedures will then be applied to archived K2 data to identify planetary transits and characterize stellar variability.
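    The two-stage idea (whiten the stellar variability with an autoregressive model, then look for periodic residual structure) can be illustrated with a toy light curve. A plain AR(2) fit stands in for the ARFIMA step and a simple periodogram stands in for the matched filter; none of this is the KARPS code, and all parameters are invented for the example.

      # Toy two-stage detection sketch: AR whitening followed by a residual periodogram.
      import numpy as np
      from statsmodels.tsa.arima.model import ARIMA
      from scipy.signal import periodogram

      rng = np.random.default_rng(4)
      n = 4000
      star = np.zeros(n)
      for t in range(1, n):                        # slow, correlated stellar variability
          star[t] = 0.97 * star[t - 1] + rng.normal(0.0, 1.0)
      flux = star.copy()
      flux[np.arange(n) % 200 < 3] -= 4.0          # shallow transit-like dips, period 200

      resid = ARIMA(flux, order=(2, 0, 0)).fit().resid
      freq, power = periodogram(resid)
      k = np.argmax(power[1:]) + 1                 # skip the zero-frequency bin
      print("strongest residual frequency:", freq[k],
            "(the injected signal lives at 1/200 and its harmonics)")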

  20. 75 FR 15709 - Agency Forms Undergoing Paperwork Reduction Act Review

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-03-30

    ... statistics at the national level, referred to as the U.S. National Vital Statistics System (NVSS), depends on.... Proposed Project Vital Statistics Training Application (OMB No. 0920-0217 exp. 7/31/ 2010)--Extension--National Center for Health Statistics (NCHS), Centers for Disease Control and Prevention (CDC). Background...

  1. Dependency Structures for Statistical Machine Translation

    ERIC Educational Resources Information Center

    Bach, Nguyen

    2012-01-01

    Dependency structures represent a sentence as a set of dependency relations. Normally the dependency structures form a tree that connects all the words in a sentence. One of the defining characteristics of dependency structures is the ability to bring long-distance dependencies between words into local dependency structures. Another main attraction of…

  2. Single-molecule photon emission statistics for systems with explicit time dependence: Generating function approach

    NASA Astrophysics Data System (ADS)

    Peng, Yonggang; Xie, Shijie; Zheng, Yujun; Brown, Frank L. H.

    2009-12-01

    Generating function calculations are extended to allow for laser pulse envelopes of arbitrary shape in numerical applications. We investigate photon emission statistics for two-level and V- and Λ-type three-level systems under time-dependent excitation. Applications relevant to electromagnetically induced transparency and photon emission from single quantum dots are presented.

  3. Data Analysis and Graphing in an Introductory Physics Laboratory: Spreadsheet versus Statistics Suite

    ERIC Educational Resources Information Center

    Peterlin, Primoz

    2010-01-01

    Two methods of data analysis are compared: spreadsheet software and a statistics software suite. Their use is compared analysing data collected in three selected experiments taken from an introductory physics laboratory, which include a linear dependence, a nonlinear dependence and a histogram. The merits of each method are compared. (Contains 7…

  4. Deriving inertial wave characteristics from surface drifter velocities: Frequency variability in the Tropical Pacific

    NASA Astrophysics Data System (ADS)

    Poulain, Pierre-Marie; Luther, Douglas S.; Patzert, William C.

    1992-11-01

    Two techniques have been developed for estimating statistics of inertial oscillations from satellite-tracked drifters. These techniques overcome the difficulties inherent in estimating such statistics from data dependent upon space coordinates that are a function of time. Application of these techniques to tropical surface drifter data collected during the NORPAX, EPOCS, and TOGA programs reveals a latitude-dependent, statistically significant "blue shift" of inertial wave frequency. The latitudinal dependence of the blue shift is similar to predictions based on "global" internal wave spectral models, with a superposition of frequency shifting due to modification of the effective local inertial frequency by the presence of strongly sheared zonal mean currents within 12° of the equator.

  5. Nutrition education intervention for dependent patients: protocol of a randomized controlled trial.

    PubMed

    Arija, Victoria; Martín, Núria; Canela, Teresa; Anguera, Carme; Castelao, Ana I; García-Barco, Montserrat; García-Campo, Antoni; González-Bravo, Ana I; Lucena, Carme; Martínez, Teresa; Fernández-Barrés, Silvia; Pedret, Roser; Badia, Waleska; Basora, Josep

    2012-05-24

    Malnutrition in dependent patients has a high prevalence and can influence the prognosis associated with diverse pathologic processes, decrease quality of life, and increase morbidity-mortality and hospital admissions.The aim of the study is to assess the effect of an educational intervention for caregivers on the nutritional status of dependent patients at risk of malnutrition. Intervention study with control group, randomly allocated, of 200 patients of the Home Care Program carried out in 8 Primary Care Centers (Spain). These patients are dependent and at risk of malnutrition, older than 65, and have caregivers. The socioeconomic and educational characteristics of the patient and the caregiver are recorded. On a schedule of 0-6-12 months, patients are evaluated as follows: Mini Nutritional Assessment (MNA), food intake, dentures, degree of dependency (Barthel test), cognitive state (Pfeiffer test), mood status (Yesavage test), and anthropometric and serum parameters of nutritional status: albumin, prealbumin, transferrin, haemoglobin, lymphocyte count, iron, and ferritin.Prior to the intervention, the educational procedure and the design of educational material are standardized among nurses. The nurses conduct an initial session for caregivers and then monitor the education impact at home every month (4 visits) up to 6 months. The North American Nursing Diagnosis Association (NANDA) methodology will be used. The investigators will study the effect of the intervention with caregivers on the patient's nutritional status using the MNA test, diet, anthropometry, and biochemical parameters.Bivariate normal test statistics and multivariate models will be created to adjust the effect of the intervention.The SPSS/PC program will be used for statistical analysis. The nutritional status of dependent patients has been little studied. This study allows us to know nutritional risk from different points of view: diet, anthropometry and biochemistry in dependent patients at nutritional risk and to assess the effect of a nutritional education intervention. The design with random allocation, inclusion of all patients, validated methods, caregivers' education and standardization between nurses allows us to obtain valuable information about nutritional status and prevention. Clinical Trial Registration-URL: http://www.clinicaltrials.gov. Unique identifier: NCT01360775.

  6. Nutrition education intervention for dependent patients: protocol of a randomized controlled trial

    PubMed Central

    2012-01-01

    Background Malnutrition in dependent patients has a high prevalence and can influence the prognosis associated with diverse pathologic processes, decrease quality of life, and increase morbidity-mortality and hospital admissions. The aim of the study is to assess the effect of an educational intervention for caregivers on the nutritional status of dependent patients at risk of malnutrition. Methods/Design Intervention study with control group, randomly allocated, of 200 patients of the Home Care Program carried out in 8 Primary Care Centers (Spain). These patients are dependent and at risk of malnutrition, older than 65, and have caregivers. The socioeconomic and educational characteristics of the patient and the caregiver are recorded. On a schedule of 0–6–12 months, patients are evaluated as follows: Mini Nutritional Assessment (MNA), food intake, dentures, degree of dependency (Barthel test), cognitive state (Pfeiffer test), mood status (Yesavage test), and anthropometric and serum parameters of nutritional status: albumin, prealbumin, transferrin, haemoglobin, lymphocyte count, iron, and ferritin. Prior to the intervention, the educational procedure and the design of educational material are standardized among nurses. The nurses conduct an initial session for caregivers and then monitor the education impact at home every month (4 visits) up to 6 months. The North American Nursing Diagnosis Association (NANDA) methodology will be used. The investigators will study the effect of the intervention with caregivers on the patient’s nutritional status using the MNA test, diet, anthropometry, and biochemical parameters. Bivariate normal test statistics and multivariate models will be created to adjust the effect of the intervention. The SPSS/PC program will be used for statistical analysis. Discussion The nutritional status of dependent patients has been little studied. This study allows us to know nutritional risk from different points of view: diet, anthropometry and biochemistry in dependent patients at nutritional risk and to assess the effect of a nutritional education intervention. The design with random allocation, inclusion of all patients, validated methods, caregivers’ education and standardization between nurses allows us to obtain valuable information about nutritional status and prevention. Trial Registration number Clinical Trial Registration-URL: http://www.clinicaltrials.gov. Unique identifier: NCT01360775 PMID:22625878

  7. Continuous age- and sex-adjusted reference intervals of urinary markers for cerebral creatine deficiency syndromes: a novel approach to the definition of reference intervals.

    PubMed

    Mørkrid, Lars; Rowe, Alexander D; Elgstoen, Katja B P; Olesen, Jess H; Ruijter, George; Hall, Patricia L; Tortorelli, Silvia; Schulze, Andreas; Kyriakopoulou, Lianna; Wamelink, Mirjam M C; van de Kamp, Jiddeke M; Salomons, Gajja S; Rinaldo, Piero

    2015-05-01

    Urinary concentrations of creatine and guanidinoacetic acid divided by creatinine are informative markers for cerebral creatine deficiency syndromes (CDSs). The renal excretion of these substances varies substantially with age and sex, challenging the sensitivity and specificity of postanalytical interpretation. Results from 155 patients with CDS and 12 507 reference individuals were contributed by 5 diagnostic laboratories. They were binned into 104 adjacent age intervals and renormalized with Box-Cox transforms (Ξ). Estimates for central tendency (μ) and dispersion (σ) of Ξ were obtained for each bin. Polynomial regression analysis was used to establish the age dependence of both μ[log(age)] and σ[log(age)]. The regression residuals were then calculated as z-scores = {Ξ - μ[log(age)]}/σ[log(age)]. The process was iterated until all z-scores outside Tukey fences ±3.372 were identified and removed. Continuous percentile charts were then calculated and plotted by retransformation. Statistically significant and biologically relevant subgroups of z-scores were identified. Significantly higher marker values were seen in females than males, necessitating separate reference intervals in both adolescents and adults. Comparison between our reconstructed reference percentiles and current standard age-matched reference intervals highlights an underlying risk of false-positive and false-negative events at certain ages. Disease markers depending strongly on covariates such as age and sex require large numbers of reference individuals to establish peripheral percentiles with sufficient precision. This is feasible only through collaborative data sharing and the use of appropriate statistical methods. Broad application of this approach can be implemented through freely available Web-based software. © 2015 American Association for Clinical Chemistry.
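    The reference-interval construction described above (Box-Cox renormalisation, z-scores against bin-wise location and spread, and iterative trimming at Tukey fences of ±3.372) can be condensed for a single age bin as below. The simulated marker values and the percentile-based interval at the end are illustrative assumptions; the published method additionally smooths μ and σ across bins with polynomial regression in log(age).

      # Condensed sketch of the per-bin Box-Cox / z-score / fence-trimming step.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(5)
      marker = rng.lognormal(mean=1.0, sigma=0.5, size=500)   # hypothetical marker ratios
      marker = np.append(marker, [60.0, 80.0])                # two gross outliers

      values = marker.copy()
      while True:
          xi, lam = stats.boxcox(values)                      # renormalise the bin
          z = (xi - xi.mean()) / xi.std(ddof=1)
          keep = np.abs(z) <= 3.372                           # Tukey-style fence
          if keep.all():
              break
          values = values[keep]                               # iterate until no outliers remain

      lo, hi = np.percentile(values, [2.5, 97.5])
      print("reference interval for this bin: %.2f - %.2f" % (lo, hi))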

  8. Presymptomatic and longitudinal neuroimaging in neurodegeneration--from snapshots to motion picture: a systematic review.

    PubMed

    Schuster, Christina; Elamin, Marwa; Hardiman, Orla; Bede, Peter

    2015-10-01

    Recent quantitative neuroimaging studies have been successful in capturing phenotype and genotype-specific changes in dementia syndromes, amyotrophic lateral sclerosis, Parkinson's disease and other neurodegenerative conditions. However, the majority of imaging studies are cross-sectional, despite the obvious superiority of longitudinal study designs in characterising disease trajectories, response to therapy, progression rates and evaluating the presymptomatic phase of neurodegenerative conditions. The aim of this work is to perform a systematic review of longitudinal imaging initiatives in neurodegeneration focusing on methodology, optimal statistical models, follow-up intervals, attrition rates, primary study outcomes and presymptomatic studies. Longitudinal imaging studies were identified from 'PubMed' and reviewed from 1990 to 2014. The search terms 'longitudinal', 'MRI', 'presymptomatic' and 'imaging' were utilised in combination with one of the following degenerative conditions; Alzheimer's disease, amyotrophic lateral sclerosis/motor neuron disease, frontotemporal dementia, Huntington's disease, multiple sclerosis, Parkinson's disease, ataxia, HIV, alcohol abuse/dependence. A total of 423 longitudinal imaging papers and 103 genotype-based presymptomatic studies were identified and systematically reviewed. Imaging techniques, follow-up intervals and attrition rates showed significant variation depending on the primary diagnosis. Commonly used statistical models included analysis of annualised percentage change, mixed and random effect models, and non-linear cumulative models with acceleration-deceleration components. Although longitudinal imaging studies have the potential to provide crucial insights into the presymptomatic phase and natural trajectory of neurodegenerative processes a standardised design is required to enable meaningful data interpretation. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  9. Hydrological modelling of the Chaohe Basin in China: Statistical model formulation and Bayesian inference

    NASA Astrophysics Data System (ADS)

    Yang, Jing; Reichert, Peter; Abbaspour, Karim C.; Yang, Hong

    2007-07-01

    Calibration of hydrologic models is very difficult because of measurement errors in input and response, errors in model structure, and the large number of non-identifiable parameters of distributed models. The difficulties even increase in arid regions with high seasonal variation of precipitation, where the modelled residuals often exhibit high heteroscedasticity and autocorrelation. On the other hand, support of water management by hydrologic models is important in arid regions, particularly if there is increasing water demand due to urbanization. The use and assessment of model results for this purpose require a careful calibration and uncertainty analysis. Extending earlier work in this field, we developed a procedure to overcome (i) the problem of non-identifiability of distributed parameters by introducing aggregate parameters and using Bayesian inference, (ii) the problem of heteroscedasticity of errors by combining a Box-Cox transformation of results and data with seasonally dependent error variances, (iii) the problems of autocorrelated errors, missing data and outlier omission with a continuous-time autoregressive error model, and (iv) the problem of the seasonal variation of error correlations with seasonally dependent characteristic correlation times. The technique was tested with the calibration of the hydrologic sub-model of the Soil and Water Assessment Tool (SWAT) in the Chaohe Basin in North China. The results demonstrated the good performance of this approach to uncertainty analysis, particularly with respect to the fulfilment of statistical assumptions of the error model. A comparison with an independent error model and with error models that only considered a subset of the suggested techniques clearly showed the superiority of the approach based on all the features (i)-(iv) mentioned above.

  10. Identifying prognostic signature in ovarian cancer using DirGenerank

    PubMed Central

    Wang, Jian-Yong; Chen, Ling-Ling; Zhou, Xiong-Hui

    2017-01-01

    Identifying the prognostic genes in cancer is essential not only for the treatment of cancer patients, but also for drug discovery. However, it's still a big challenge to select the prognostic genes that can distinguish the risk of cancer patients across various data sets because of tumor heterogeneity. In this situation, the selected genes whose expression levels are statistically related to prognostic risks may be passengers. In this paper, based on gene expression data and prognostic data of ovarian cancer patients, we used conditional mutual information to construct a gene dependency network in which the nodes (genes) with higher out-degree have more chance of being modulators of cancer prognosis. After that, we proposed the DirGenerank (Generank in directed network) algorithm, which considers both the gene dependency network and genes’ correlations to prognostic risks, to identify the gene signature that can predict the prognostic risks of ovarian cancer patients. Using the ovarian cancer data set from TCGA (The Cancer Genome Atlas) as the training data set, 40 genes with the highest importance were selected as the prognostic signature. Survival analysis of these patients divided by the prognostic signature in the testing data set and four independent data sets showed that the signature can distinguish the prognostic risks of cancer patients significantly. Enrichment analysis of the signature with curated cancer genes and the drugs selected by CMAP showed that the genes in the signature may be drug targets for therapy. In summary, we have proposed a useful pipeline to identify prognostic genes of cancer patients. PMID:28615526
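    A minimal, generic version of ranking genes on a directed dependency network can be written with the networkx library: score nodes with PageRank on the reversed graph so that genes with many outgoing dependencies rank highly. The toy edge list is invented for illustration, and this sketch omits the prognostic-correlation weighting that DirGenerank combines with the network.

      # Illustrative ranking of a toy directed gene-dependency network (not DirGenerank).
      import networkx as nx

      G = nx.DiGraph()
      G.add_edges_from([("TP53", "MDM2"), ("TP53", "CDKN1A"), ("TP53", "BAX"),
                        ("MDM2", "CDKN1A"), ("BRCA1", "RAD51")])

      # PageRank rewards incoming edges, so reverse the graph to reward out-degree hubs.
      scores = nx.pagerank(G.reverse(), alpha=0.85)
      for gene, s in sorted(scores.items(), key=lambda kv: -kv[1]):
          print(gene, round(s, 3))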

  11. Scale Dependence of Statistics of Spatially Averaged Rain Rate Seen in TOGA COARE Comparison with Predictions from a Stochastic Model

    NASA Technical Reports Server (NTRS)

    Kundu, Prasun K.; Bell, T. L.; Lau, William K. M. (Technical Monitor)

    2002-01-01

    A characteristic feature of rainfall statistics is that they in general depend on the space and time scales over which rain data are averaged. As a part of an earlier effort to determine the sampling error of satellite rain averages, a space-time model of rainfall statistics was developed to describe the statistics of gridded rain observed in GATE. The model allows one to compute the second moment statistics of space- and time-averaged rain rate which can be fitted to satellite or rain gauge data to determine the four model parameters appearing in the precipitation spectrum - an overall strength parameter, a characteristic length separating the long and short wavelength regimes, a characteristic relaxation time for decay of the autocorrelation of the instantaneous local rain rate, and a certain 'fractal' power law exponent. For area-averaged instantaneous rain rate, this exponent governs the power law dependence of these statistics on the averaging length scale L predicted by the model in the limit of small L. In particular, the variance of rain rate averaged over an L × L area exhibits a power law singularity as L → 0. In the present work the model is used to investigate how the statistics of area-averaged rain rate over the tropical Western Pacific measured with shipborne radar during TOGA COARE (Tropical Ocean Global Atmosphere Coupled Ocean Atmospheric Response Experiment) and gridded on a 2 km grid depends on the size of the spatial averaging scale. Good agreement is found between the data and predictions from the model over a wide range of averaging length scales.
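    How the averaging scale enters such second-moment statistics can be demonstrated directly from a gridded field: average the field over L × L blocks and watch the variance of the block averages change with L. The field below is an uncorrelated toy rain map, so it only illustrates the mechanics of the calculation, not the correlated spectrum (or fractal exponent) of the model.

      # Variance of area-averaged rain rate versus averaging length L (toy field).
      import numpy as np

      rng = np.random.default_rng(9)
      field = rng.gamma(shape=0.3, scale=5.0, size=(256, 256))   # toy gridded rain rates

      for blocks in (128, 64, 32, 16, 8):
          L = 256 // blocks                                      # box side in grid cells
          boxed = field.reshape(blocks, L, blocks, L).mean(axis=(1, 3))
          print("L =", L, "cells, variance of area average = %.4f" % boxed.var())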

  12. Pitfalls in chronobiology: a suggested analysis using intrathecal bupivacaine analgesia as an example.

    PubMed

    Shafer, Steven L; Lemmer, Bjoern; Boselli, Emmanuel; Boiste, Fabienne; Bouvet, Lionel; Allaouchiche, Bernard; Chassard, Dominique

    2010-10-01

    The duration of analgesia from epidural administration of local anesthetics to parturients has been shown to follow a rhythmic pattern according to the time of drug administration. We studied whether there was a similar pattern after intrathecal administration of bupivacaine in parturients. In the course of the analysis, we came to believe that some data points coincident with provider shift changes were influenced by nonbiological, health care system factors, thus incorrectly suggesting a periodic signal in duration of labor analgesia. We developed graphical and analytical tools to help assess the influence of individual points on the chronobiological analysis. Women with singleton term pregnancies in vertex presentation, cervical dilation 3 to 5 cm, pain score >50 mm (of 100 mm), and requesting labor analgesia were enrolled in this study. Patients received 2.5 mg of intrathecal bupivacaine in 2 mL using a combined spinal-epidural technique. Analgesia duration was the time from intrathecal injection until the first request for additional analgesia. The duration of analgesia was analyzed by visual inspection of the data, application of smoothing functions (Supersmoother; LOWESS and LOESS [locally weighted scatterplot smoothing functions]), analysis of variance, Cosinor (Chronos-Fit), Excel, and NONMEM (nonlinear mixed effect modeling). Confidence intervals (CIs) were determined by bootstrap analysis (1000 replications with replacement) using PLT Tools. Eighty-two women were included in the study. Examination of the raw data using 3 smoothing functions revealed a bimodal pattern, with a peak at approximately 0630 and a subsequent peak in the afternoon or evening, depending on the smoother. Analysis of variance did not identify any statistically significant difference between the duration of analgesia when intrathecal injection was given from midnight to 0600 compared with the duration of analgesia after intrathecal injection at other times. Chronos-Fit, Excel, and NONMEM produced identical results, with a mean duration of analgesia of 38.4 minutes (95% CI: 35.4-41.6 minutes), an 8-hour periodic waveform with an amplitude of 5.8 minutes (95% CI: 2.1-10.7 minutes), and a phase offset of 6.5 hours (95% CI: 5.4-8.0 hours) relative to midnight. The 8-hour periodic model did not reach statistical significance in 40% of bootstrap analyses, implying that statistical significance of the 8-hour periodic model was dependent on a subset of the data. Two data points before the change of shift at 0700 contributed most strongly to the statistical significance of the periodic waveform. Without these data points, there was no evidence of an 8-hour periodic waveform for intrathecal bupivacaine analgesia. Chronobiology includes the influence of external daily rhythms in the environment (e.g., nursing shifts) as well as human biological rhythms. We were able to distinguish the influence of an external rhythm by combining several novel analyses: (1) graphical presentation superimposing the raw data, external rhythms (e.g., nursing and anesthesia provider shifts), and smoothing functions; (2) graphical display of the contribution of each data point to the statistical significance; and (3) bootstrap analysis to identify whether the statistical significance was highly dependent on a data subset. These approaches suggested that 2 data points were likely artifacts of the change in nursing and anesthesia shifts. 
When these points were removed, there was no suggestion of biological rhythm in the duration of intrathecal bupivacaine analgesia.

  13. ANCA: Anharmonic Conformational Analysis of Biomolecular Simulations.

    PubMed

    Parvatikar, Akash; Vacaliuc, Gabriel S; Ramanathan, Arvind; Chennubhotla, S Chakra

    2018-05-08

    Anharmonicity in time-dependent conformational fluctuations is noted to be a key feature of functional dynamics of biomolecules. Although anharmonic events are rare, long-timescale (μs-ms and beyond) simulations facilitate probing of such events. We have previously developed quasi-anharmonic analysis to resolve higher-order spatial correlations and characterize anharmonicity in biomolecular simulations. In this article, we have extended this toolbox to resolve higher-order temporal correlations and built a scalable Python package called anharmonic conformational analysis (ANCA). ANCA has modules to: 1) measure anharmonicity in the form of higher-order statistics and its variation as a function of time, 2) output a storyboard representation of the simulations to identify key anharmonic conformational events, and 3) identify putative anharmonic conformational substates and visualize transitions between these substates. Copyright © 2018 Biophysical Society. Published by Elsevier Inc. All rights reserved.
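    "Higher-order statistics as a function of time" can be illustrated without the package itself: track the excess kurtosis of a coordinate in sliding windows, since a purely harmonic (Gaussian) window sits near zero while windows containing rare excursions deviate from it. The synthetic trajectory and window length are assumptions for the example; this is not the ANCA API.

      # Windowed excess kurtosis as a simple anharmonicity indicator (not ANCA).
      import numpy as np
      from scipy.stats import kurtosis

      rng = np.random.default_rng(6)
      traj = rng.normal(size=5000)                 # quasi-harmonic fluctuations
      traj[2600:2660] += 6.0                       # a rare excursion to another substate

      window = 500
      for start in range(0, traj.size, window):
          seg = traj[start:start + window]
          print("frames %4d-%4d  excess kurtosis % .2f"
                % (start, start + window, kurtosis(seg)))   # far from 0 flags anharmonic windows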

  14. Statistical test for ΔρDCCA cross-correlation coefficient

    NASA Astrophysics Data System (ADS)

    Guedes, E. F.; Brito, A. A.; Oliveira Filho, F. M.; Fernandez, B. F.; de Castro, A. P. N.; da Silva Filho, A. M.; Zebende, G. F.

    2018-07-01

    In this paper we propose a new statistical test for ΔρDCCA, the Detrended Cross-Correlation Coefficient Difference, a tool to measure contagion/interdependence effects in time series of size N at different time scales n. For this proposition we analyzed simulated and real time series. The results showed that the statistical significance of ΔρDCCA depends on the size N and the time scale n, and that a critical value for this dependency can be defined at the 90%, 95%, and 99% confidence levels, as will be shown in this paper.
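    The coefficient underlying ΔρDCCA can be computed in a few lines: integrate both series, detrend them in boxes of length n, and take the ratio of the mean detrended covariance to the product of the detrended standard deviations. The version below uses non-overlapping boxes and linear detrending as simplifying assumptions and is not the authors' test code; a critical value for a given (N, n) could be approximated by recomputing the coefficient over many independent surrogate pairs.

      # Sketch of the detrended cross-correlation coefficient rho_DCCA at scale n.
      import numpy as np

      def rho_dcca(x, y, n):
          X = np.cumsum(x - np.mean(x))            # integrated profiles
          Y = np.cumsum(y - np.mean(y))
          cov, var_x, var_y = [], [], []
          t = np.arange(n)
          for start in range(0, len(X) - n + 1, n):
              xs, ys = X[start:start + n], Y[start:start + n]
              rx = xs - np.polyval(np.polyfit(t, xs, 1), t)   # detrend each box
              ry = ys - np.polyval(np.polyfit(t, ys, 1), t)
              cov.append(np.mean(rx * ry))
              var_x.append(np.mean(rx * rx))
              var_y.append(np.mean(ry * ry))
          return np.mean(cov) / np.sqrt(np.mean(var_x) * np.mean(var_y))

      rng = np.random.default_rng(7)
      common = rng.normal(size=2000)
      x = common + rng.normal(size=2000)           # two series sharing a common component
      y = common + rng.normal(size=2000)
      for n in (8, 32, 128):
          print("n =", n, " rho_DCCA = %.3f" % rho_dcca(x, y, n))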

  15. Identifiability of PBPK Models with Applications to ...

    EPA Pesticide Factsheets

    Any statistical model should be identifiable in order for estimates and tests using it to be meaningful. We consider statistical analysis of physiologically-based pharmacokinetic (PBPK) models in which parameters cannot be estimated precisely from available data, and discuss different types of identifiability that occur in PBPK models and give reasons why they occur. We particularly focus on how the mathematical structure of a PBPK model and lack of appropriate data can lead to statistical models in which it is impossible to estimate at least some parameters precisely. Methods are reviewed which can determine whether a purely linear PBPK model is globally identifiable. We propose a theorem which determines when identifiability at a set of finite and specific values of the mathematical PBPK model (global discrete identifiability) implies identifiability of the statistical model. However, we are unable to establish conditions that imply global discrete identifiability, and conclude that the only safe approach to analysis of PBPK models involves Bayesian analysis with truncated priors. Finally, computational issues regarding posterior simulations of PBPK models are discussed. The methodology is very general and can be applied to numerous PBPK models which can be expressed as linear time-invariant systems. A real data set of a PBPK model for exposure to dimethyl arsinic acid (DMA(V)) is presented to illustrate the proposed methodology. We consider statistical analy

  16. Evidence Integration in Natural Acoustic Textures during Active and Passive Listening

    PubMed Central

    Rupp, Andre; Celikel, Tansu

    2018-01-01

    Abstract Many natural sounds can be well described on a statistical level, for example, wind, rain, or applause. Even though the spectro-temporal profile of these acoustic textures is highly dynamic, changes in their statistics are indicative of relevant changes in the environment. Here, we investigated the neural representation of change detection in natural textures in humans, and specifically addressed whether active task engagement is required for the neural representation of this change in statistics. Subjects listened to natural textures whose spectro-temporal statistics were modified at variable times by a variable amount. Subjects were instructed to either report the detection of changes (active) or to passively listen to the stimuli. A subset of passive subjects had performed the active task before (passive-aware vs passive-naive). Psychophysically, longer exposure to pre-change statistics was correlated with faster reaction times and better discrimination performance. EEG recordings revealed that the build-up rate and size of parieto-occipital (PO) potentials reflected change size and change time. Reduced effects were observed in the passive conditions. While P2 responses were comparable across conditions, slope and height of PO potentials scaled with task involvement. Neural source localization identified a parietal source as the main contributor of change-specific potentials, in addition to more limited contributions from auditory and frontal sources. In summary, the detection of statistical changes in natural acoustic textures is predominantly reflected in parietal locations both on the skull and source level. The scaling in magnitude across different levels of task involvement suggests a context-dependent degree of evidence integration. PMID:29662943

  17. Evidence Integration in Natural Acoustic Textures during Active and Passive Listening.

    PubMed

    Górska, Urszula; Rupp, Andre; Boubenec, Yves; Celikel, Tansu; Englitz, Bernhard

    2018-01-01

    Many natural sounds can be well described on a statistical level, for example, wind, rain, or applause. Even though the spectro-temporal profile of these acoustic textures is highly dynamic, changes in their statistics are indicative of relevant changes in the environment. Here, we investigated the neural representation of change detection in natural textures in humans, and specifically addressed whether active task engagement is required for the neural representation of this change in statistics. Subjects listened to natural textures whose spectro-temporal statistics were modified at variable times by a variable amount. Subjects were instructed to either report the detection of changes (active) or to passively listen to the stimuli. A subset of passive subjects had performed the active task before (passive-aware vs passive-naive). Psychophysically, longer exposure to pre-change statistics was correlated with faster reaction times and better discrimination performance. EEG recordings revealed that the build-up rate and size of parieto-occipital (PO) potentials reflected change size and change time. Reduced effects were observed in the passive conditions. While P2 responses were comparable across conditions, slope and height of PO potentials scaled with task involvement. Neural source localization identified a parietal source as the main contributor of change-specific potentials, in addition to more limited contributions from auditory and frontal sources. In summary, the detection of statistical changes in natural acoustic textures is predominantly reflected in parietal locations both on the skull and source level. The scaling in magnitude across different levels of task involvement suggests a context-dependent degree of evidence integration.

  18. Probability of identification: a statistical model for the validation of qualitative botanical identification methods.

    PubMed

    LaBudde, Robert A; Harnly, James M

    2012-01-01

    A qualitative botanical identification method (BIM) is an analytical procedure that returns a binary result (1 = Identified, 0 = Not Identified). A BIM may be used by a buyer, manufacturer, or regulator to determine whether a botanical material being tested is the same as the target (desired) material, or whether it contains excessive nontarget (undesirable) material. The report describes the development and validation of studies for a BIM based on the proportion of replicates identified, or probability of identification (POI), as the basic observed statistic. The statistical procedures proposed for data analysis follow closely those of the probability of detection, and harmonize the statistical concepts and parameters between quantitative and qualitative method validation. Use of POI statistics also harmonizes statistical concepts for botanical, microbiological, toxin, and other analyte identification methods that produce binary results. The POI statistical model provides a tool for graphical representation of response curves for qualitative methods, reporting of descriptive statistics, and application of performance requirements. Single collaborator and multicollaborative study examples are given.

  19. Examination of two methods for statistical analysis of data with magnitude and direction emphasizing vestibular research applications

    NASA Technical Reports Server (NTRS)

    Calkins, D. S.

    1998-01-01

    When the dependent (or response) variable in an experiment has direction and magnitude, one approach that has been used for statistical analysis involves splitting magnitude and direction and applying univariate statistical techniques to the components. However, such treatment of quantities with direction and magnitude is not justifiable mathematically and can lead to incorrect conclusions about relationships among variables and, as a result, to flawed interpretations. This note discusses a problem with that practice and recommends mathematically correct procedures to be used with dependent variables that have direction and magnitude for 1) computation of mean values, 2) statistical contrasts of and confidence intervals for means, and 3) correlation methods.
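    The core point about mean values can be shown in a few lines: averaging directions and magnitudes separately gives a nonsensical result when directions straddle 0°/360°, whereas averaging the vector components does not. The numbers below are an invented illustration.

      # Componentwise (vector) mean versus naive separate means of direction and magnitude.
      import numpy as np

      angles_deg = np.array([350.0, 10.0, 20.0])        # directions straddling north
      mags = np.array([2.0, 2.0, 2.0])

      print("naive mean direction:", angles_deg.mean())  # 126.7 deg, clearly wrong

      theta = np.deg2rad(angles_deg)
      u = (mags * np.cos(theta)).mean()
      v = (mags * np.sin(theta)).mean()
      print("vector mean magnitude: %.2f" % np.hypot(u, v))
      print("vector mean direction: %.1f deg" % (np.rad2deg(np.arctan2(v, u)) % 360.0))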

  20. General quadrupolar statistical anisotropy: Planck limits

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ramazanov, S.; Rubtsov, G.; Thorsrud, M.

    2017-03-01

    Several early Universe scenarios predict a direction-dependent spectrum of primordial curvature perturbations. This translates into the violation of the statistical isotropy of cosmic microwave background radiation. Previous searches for statistical anisotropy mainly focussed on a quadrupolar direction-dependence characterised by a single multipole vector and an overall amplitude g_*. Generically, however, the quadrupole has a more complicated geometry described by two multipole vectors and g_*. This is the subject of the present work. In particular, we limit the amplitude g_* for different shapes of the quadrupole by making use of Planck 2015 maps. We also constrain certain inflationary scenarios which predict this kind of more general quadrupolar statistical anisotropy.

  1. 28 CFR 22.24 - Information transfer agreement.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... STATISTICAL INFORMATION § 22.24 Information transfer agreement. Prior to the transfer of any identifiable... identifiable to a private person will be used only for research and statistical purposes. (b) Information...-know basis for research or statistical purposes, provided that such transfer is approved by the person...

  2. Influence of case definition on incidence and outcome of acute coronary syndromes.

    PubMed

    Torabi, Azam; Cleland, John G F; Sherwi, Nasser; Atkin, Paul; Panahi, Hossein; Kilpatrick, Eric; Thackray, Simon; Hoye, Angela; Alamgir, Farqad; Goode, Kevin; Rigby, Alan; Clark, Andrew L

    2016-01-01

    Acute coronary syndromes (ACS) are common, but their incidence and outcome might depend greatly on how data are collected. We compared case ascertainment rates for ACS and myocardial infarction (MI) in a single institution using several different strategies. The Hull and East Yorkshire Hospitals serve a population of ∼560 000. Patients admitted with ACS to cardiology or general medical wards were identified prospectively by trained nurses during 2005. Patients with a death or discharge code of MI were also identified by the hospital information department and, independently, from Myocardial Infarction National Audit Project (MINAP) records. The hospital laboratory identified all patients with an elevated serum troponin-T (TnT) by contemporary criteria (>0.03 µg/L in 2005). The prospective survey identified 1731 admissions (1439 patients) with ACS, including 764 admissions (704 patients) with MIs. The hospital information department reported only 552 admissions (544 patients) with MI and only 206 admissions (203 patients) were reported to the MINAP. Using all 3 strategies, 934 admissions (873 patients) for MI were identified, for which TnT was >1 µg/L in 443, 0.04-1.0 µg/L in 435, ≤0.03 µg/L in 19 and not recorded in 37. A further 823 patients had TnT >0.03 µg/L, but did not have ACS ascertained by any survey method. Of the 873 patients with MI, 146 (16.7%) died during admission and 218 (25.0%) by 1 year, but ranging from 9% for patients enrolled in the MINAP to 27% for those identified by the hospital information department. MINAP and hospital statistics grossly underestimated the incidence of MI managed by our hospital. The 1-year mortality was highly dependent on the method of ascertainment.

  3. Methods for detrending success metrics to account for inflationary and deflationary factors*

    NASA Astrophysics Data System (ADS)

    Petersen, A. M.; Penner, O.; Stanley, H. E.

    2011-01-01

    Time-dependent economic, technological, and social factors can artificially inflate or deflate quantitative measures for career success. Here we develop and test a statistical method for normalizing career success metrics across time dependent factors. In particular, this method addresses the long standing question: how do we compare the career achievements of professional athletes from different historical eras? Developing an objective approach will be of particular importance over the next decade as major league baseball (MLB) players from the "steroids era" become eligible for Hall of Fame induction. Some experts are calling for asterisks (*) to be placed next to the career statistics of athletes found guilty of using performance enhancing drugs (PED). Here we address this issue, as well as the general problem of comparing statistics from distinct eras, by detrending the seasonal statistics of professional baseball players. We detrend player statistics by normalizing achievements to seasonal averages, which accounts for changes in relative player ability resulting from a range of factors. Our methods are general, and can be extended to various arenas of competition where time-dependent factors play a key role. For five statistical categories, we compare the probability density function (pdf) of detrended career statistics to the pdf of raw career statistics calculated for all player careers in the 90-year period 1920-2009. We find that the functional form of these pdfs is stationary under detrending. This stationarity implies that the statistical regularity observed in the right-skewed distributions for longevity and success in professional sports arises from both the wide range of intrinsic talent among athletes and the underlying nature of competition. We fit the pdfs for career success by the Gamma distribution in order to calculate objective benchmarks based on extreme statistics which can be used for the identification of extraordinary careers.
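    One simple version of the detrending step can be written directly: rescale each season's raw tally by the ratio of a long-run league average to that season's league average, so that achievements are expressed relative to their era. The league averages and player totals below are invented numbers, not the paper's dataset; the full method goes on to fit Gamma distributions to the detrended career totals.

      # Toy detrending of season tallies by the seasonal league average.
      import numpy as np

      seasons = np.array([1968, 1998, 2001])
      league_avg_hr = np.array([12.0, 21.0, 23.0])   # hypothetical league-average HR per player
      player_hr = np.array([36.0, 56.0, 60.0])       # one player's raw season totals

      baseline = league_avg_hr.mean()                # long-run reference level
      detrended = player_hr * baseline / league_avg_hr
      for s, raw, adj in zip(seasons, player_hr, detrended):
          print(s, "raw", raw, "-> detrended %.1f" % adj)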

  4. Some low-altitude cusp dependencies on the interplanetary magnetic field

    NASA Technical Reports Server (NTRS)

    Newell, Patrick T.; Meng, Ching-I.; Sibeck, David G.; Lepping, Ronald

    1989-01-01

    The low-altitude cusp dependencies on the interplanetary magnetic field (IMF) were investigated using the algorithm of Newell and Meng (1988) to identify the cusp proper. The algorithm was applied to 12,569 high-latitude dayside passes of the DMSP F7 spacecraft, and the resulting cusp positioning data were correlated with the IMF. It was found that the cusp latitudinal position correlated reasonably well (0.70) with the Bz component when the IMF had a southward component. The correlation for the northward Bz component was only 0.18, suggestive of a half-wave rectifier effect. The ratio of cusp ion number flux precipitation for Bz southward to that for Bz northward was 1.75 ± 0.12. The statistical local time widths of the cusp proper for the northward and the southward Bz components were found to be 2.1 h and 2.8 h, respectively.

  5. Application of Different Statistical Techniques in Integrated Logistics Support of the International Space Station Alpha

    NASA Technical Reports Server (NTRS)

    Sepehry-Fard, F.; Coulthard, Maurice H.

    1995-01-01

    The process used to predict the values of maintenance time-dependent variable parameters, such as mean time between failures (MTBF), over time must be one that will not in turn introduce uncontrolled deviation into the results of the ILS analysis, such as the life cycle cost and spares calculations. A minor deviation in the values of maintenance time-dependent variable parameters such as MTBF over time will have a significant impact on the logistics resources demands, International Space Station availability, and maintenance support costs. It is the objective of this report to identify the magnitude of the expected enhancement in the accuracy of the results for the International Space Station reliability and maintainability data packages by providing examples. These examples partially portray the necessary information by evaluating the impact of the said enhancements on the life cycle cost and the availability of the International Space Station.

  6. Granger Causality Testing with Intensive Longitudinal Data.

    PubMed

    Molenaar, Peter C M

    2018-06-01

    The availability of intensive longitudinal data obtained by means of ambulatory assessment opens up new prospects for prevention research in that it allows the derivation of subject-specific dynamic networks of interacting variables by means of vector autoregressive (VAR) modeling. The dynamic networks thus obtained can be subjected to Granger causality testing in order to identify causal relations among the observed time-dependent variables. VARs have two equivalent representations: standard and structural. Results obtained with Granger causality testing depend upon which representation is chosen, yet no criteria exist on which this important choice can be based. A new equivalent representation is introduced called hybrid VARs with which the best representation can be chosen in a data-driven way. Partial directed coherence, a frequency-domain statistic for Granger causality testing, is shown to perform optimally when based on hybrid VARs. An application to real data is provided.
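    A minimal sketch of the standard-VAR side of this workflow (not the hybrid representation or partial directed coherence introduced in the paper): a bivariate time series is simulated with a known lagged influence and a Granger causality test is run with statsmodels. Variable names and parameter values are assumptions for illustration.

    ```python
    import numpy as np
    import pandas as pd
    from statsmodels.tsa.api import VAR

    # Simulated intensive longitudinal data for one subject:
    # x influences y with a one-step lag, plus noise.
    rng = np.random.default_rng(0)
    n = 300
    x = rng.normal(size=n)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = 0.6 * x[t - 1] + 0.3 * y[t - 1] + rng.normal(scale=0.5)

    data = pd.DataFrame({"x": x, "y": y})

    # Fit a standard VAR, choosing the lag order by AIC.
    results = VAR(data).fit(maxlags=5, ic="aic")

    # Test whether x Granger-causes y (F-type Wald test).
    test = results.test_causality(caused="y", causing=["x"], kind="f")
    print(test.summary())
    ```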

  7. Wavelength dependence of biological damage induced by UV radiation on bacteria.

    PubMed

    Santos, Ana L; Oliveira, Vanessa; Baptista, Inês; Henriques, Isabel; Gomes, Newton C M; Almeida, Adelaide; Correia, António; Cunha, Ângela

    2013-01-01

    The biological effects of UV radiation of different wavelengths (UVA, UVB and UVC) were assessed in nine bacterial isolates displaying different UV sensitivities. Biological effects (survival and activity) and molecular markers of oxidative stress [DNA strand breakage (DSB), generation of reactive oxygen species (ROS), oxidative damage to proteins and lipids, and the activity of antioxidant enzymes catalase and superoxide dismutase] were quantified and statistically analyzed in order to identify the major determinants of cell inactivation under the different spectral regions. Survival and activity followed a clear wavelength dependence, being highest under UVA and lowest under UVC. The generation of ROS, as well as protein and lipid oxidation, followed the same pattern. DNA damage (DSB) showed the inverse trend. Multiple stepwise regression analysis revealed that survival under UVA, UVB and UVC wavelengths was best explained by DSB, oxidative damage to lipids, and intracellular ROS levels, respectively.

  8. Bulk tank somatic cell counts analyzed by statistical process control tools to identify and monitor subclinical mastitis incidence.

    PubMed

    Lukas, J M; Hawkins, D M; Kinsel, M L; Reneau, J K

    2005-11-01

    The objective of this study was to examine the relationship between monthly Dairy Herd Improvement (DHI) subclinical mastitis and new infection rate estimates and daily bulk tank somatic cell count (SCC) summarized by statistical process control tools. Dairy Herd Improvement Association test-day subclinical mastitis and new infection rate estimates along with daily or every other day bulk tank SCC data were collected for 12 mo of 2003 from 275 Upper Midwest dairy herds. Herds were divided into 5 herd production categories. A linear score [LNS = ln(BTSCC/100,000)/0.693147 + 3] was calculated for each individual bulk tank SCC. For both the raw SCC and the transformed data, the mean and sigma were calculated using the statistical quality control individual measurement and moving range chart procedure of Statistical Analysis System. One hundred eighty-three herds of the 275 herds from the study data set were then randomly selected and the raw (method 1) and transformed (method 2) bulk tank SCC mean and sigma were used to develop models for predicting subclinical mastitis and new infection rate estimates. Herd production category was also included in all models as 5 dummy variables. Models were validated by calculating estimates of subclinical mastitis and new infection rates for the remaining 92 herds and plotting them against observed values of each of the dependents. Only herd production category and bulk tank SCC mean were significant and remained in the final models. High R2 values (0.83 and 0.81 for methods 1 and 2, respectively) indicated a strong correlation between the bulk tank SCC and the herd's subclinical mastitis prevalence. The standard errors of the estimate were 4.02 and 4.28% for methods 1 and 2, respectively, and decreased with increasing herd production. As a case study, Shewhart Individual Measurement Charts were plotted from the bulk tank SCC to identify shifts in mastitis incidence. Four of 5 charts examined signaled a change in bulk tank SCC before the DHI test day identified the change in subclinical mastitis prevalence. It can be concluded that statistical process control tools applied to daily bulk tank SCC can be used to estimate subclinical mastitis prevalence in the herd and to monitor changes in subclinical mastitis status. Single DHI test-day estimates of new infection rate were insufficient to describe its dynamics accurately.
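    A small numpy sketch of the linear score transformation quoted above together with individual-measurement (Shewhart I-MR) control limits; the 2.66 factor is the standard I-chart constant, and the bulk tank SCC values are hypothetical. This is a simplified stand-in for the SAS procedure used in the study.

    ```python
    import numpy as np

    # Hypothetical daily bulk tank SCC values (cells/mL).
    scc = np.array([180_000, 210_000, 195_000, 250_000, 400_000, 230_000, 220_000])

    # Linear score as defined in the study: LNS = ln(SCC/100,000)/0.693147 + 3
    lns = np.log(scc / 100_000) / 0.693147 + 3

    # Individual-measurement chart limits: mean +/- 2.66 * average moving range.
    moving_range = np.abs(np.diff(lns))
    center = lns.mean()
    ucl = center + 2.66 * moving_range.mean()
    lcl = center - 2.66 * moving_range.mean()

    out_of_control = (lns > ucl) | (lns < lcl)
    print(f"center={center:.2f}, UCL={ucl:.2f}, LCL={lcl:.2f}")
    print("signal at indices:", np.flatnonzero(out_of_control))
    ```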

  9. What's Missing in Teaching Probability and Statistics: Building Cognitive Schema for Understanding Random Phenomena

    ERIC Educational Resources Information Center

    Kuzmak, Sylvia

    2016-01-01

    Teaching probability and statistics is more than teaching the mathematics itself. Historically, the mathematics of probability and statistics was first developed through analyzing games of chance such as the rolling of dice. This article makes the case that the understanding of probability and statistics is dependent upon building a…

  10. GO-Bayes: Gene Ontology-based overrepresentation analysis using a Bayesian approach.

    PubMed

    Zhang, Song; Cao, Jing; Kong, Y Megan; Scheuermann, Richard H

    2010-04-01

    A typical approach for the interpretation of high-throughput experiments, such as gene expression microarrays, is to produce groups of genes based on certain criteria (e.g. genes that are differentially expressed). To gain more mechanistic insights into the underlying biology, overrepresentation analysis (ORA) is often conducted to investigate whether gene sets associated with particular biological functions, for example, as represented by Gene Ontology (GO) annotations, are statistically overrepresented in the identified gene groups. However, the standard ORA, which is based on the hypergeometric test, analyzes each GO term in isolation and does not take into account the dependence structure of the GO-term hierarchy. We have developed a Bayesian approach (GO-Bayes) to measure overrepresentation of GO terms that incorporates the GO dependence structure by taking into account evidence not only from individual GO terms, but also from their related terms (i.e. parents, children, siblings, etc.). The Bayesian framework borrows information across related GO terms to strengthen the detection of overrepresentation signals. As a result, this method tends to identify sets of closely related GO terms rather than individual isolated GO terms. The advantage of the GO-Bayes approach is demonstrated with a simulation study and an application example.
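    For context, the standard (non-Bayesian) ORA test that GO-Bayes extends is a hypergeometric tail probability computed per GO term in isolation. A minimal sketch with made-up counts, assuming a background of annotated genes and one gene group of interest:

    ```python
    from scipy.stats import hypergeom

    # Hypothetical counts for a single GO term:
    M = 20000   # genes in the annotated background (universe)
    n = 150     # background genes annotated to this GO term
    N = 400     # genes in the differentially expressed group
    k = 12      # group genes annotated to this GO term

    # P(X >= k): probability of observing at least k annotated genes by chance.
    p_value = hypergeom.sf(k - 1, M, n, N)
    print(f"hypergeometric ORA p-value: {p_value:.3g}")
    ```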

  11. 28 CFR 22.28 - Use of data identifiable to a private person for judicial, legislative or administrative purposes.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... DEPARTMENT OF JUSTICE CONFIDENTIALITY OF IDENTIFIABLE RESEARCH AND STATISTICAL INFORMATION § 22.28 Use of...) Research or statistical information identifiable to a private person shall be immune from legal process and...

  12. 28 CFR 22.28 - Use of data identifiable to a private person for judicial, legislative or administrative purposes.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... DEPARTMENT OF JUSTICE CONFIDENTIALITY OF IDENTIFIABLE RESEARCH AND STATISTICAL INFORMATION § 22.28 Use of...) Research or statistical information identifiable to a private person shall be immune from legal process and...

  13. Predictive factors of clinical response in steroid-refractory ulcerative colitis treated with granulocyte-monocyte apheresis

    PubMed Central

    D'Ovidio, Valeria; Meo, Donatella; Viscido, Angelo; Bresci, Giampaolo; Vernia, Piero; Caprilli, Renzo

    2011-01-01

    AIM: To identify factors predicting the clinical response of ulcerative colitis patients to granulocyte-monocyte apheresis (GMA). METHODS: Sixty-nine ulcerative colitis patients (39 F, 30 M) dependent upon/refractory to steroids were treated with GMA. Steroid dependency, baseline values of the clinical activity index (CAI), C-reactive protein (CRP) level, and erythrocyte sedimentation rate (ESR), use of immunosuppressants, duration of disease, age, and extent of disease were considered for statistical analysis as predictive factors of clinical response. Univariate and multivariate logistic regression models were used. RESULTS: In the univariate analysis, CAI (P = 0.039) and ESR (P = 0.017) levels at baseline were singled out as predictive of clinical remission. In the multivariate analysis, steroid dependency [Odds ratio (OR) = 0.390, 95% Confidence interval (CI): 0.176-0.865, Wald 5.361, P = 0.0160] and low CAI levels at baseline (4 < CAI < 7) (OR = 0.770, 95% CI: 0.425-1.394, Wald 3.747, P = 0.028) proved to be effective as factors predicting clinical response. CONCLUSION: GMA may be a valid therapeutic option for steroid-dependent ulcerative colitis patients with mild-moderate disease, and its clinical efficacy seems to persist for 12 mo.

  14. Applications of modern statistical methods to analysis of data in physical science

    NASA Astrophysics Data System (ADS)

    Wicker, James Eric

    Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970s, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960s, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plague this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model, and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information-based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960s and 1970s respectively, formed the basis of multivariate cluster analysis methodology for many years. However, several shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcome the strong dependence on initial seed values. In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic-algorithm-based Expectation-Maximization process that can accurately calculate parameters describing complex clusters in a mixture model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We will showcase how these algorithms can be used to process multivariate data from astronomical observations.
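    The information-scoring idea for choosing the number of mixture components can be illustrated (without the genetic-algorithm search described above) with scikit-learn's Gaussian mixture model and AIC; the data below are simulated and the parameter choices are arbitrary.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Simulate a two-component, elongated (hyperellipsoidal) mixture.
    rng = np.random.default_rng(1)
    c1 = rng.multivariate_normal([0, 0], [[4.0, 1.8], [1.8, 1.0]], size=300)
    c2 = rng.multivariate_normal([6, 3], [[1.0, -0.6], [-0.6, 2.0]], size=300)
    X = np.vstack([c1, c2])

    # Score mixture models with 1..5 components; lower AIC is better.
    aic = []
    for k in range(1, 6):
        gm = GaussianMixture(n_components=k, covariance_type="full",
                             n_init=10, random_state=0).fit(X)
        aic.append(gm.aic(X))

    best_k = int(np.argmin(aic)) + 1
    print("AIC by k:", [round(a, 1) for a in aic])
    print("selected number of components:", best_k)
    ```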

  15. Does Eye Color Depend on Gender? It Might Depend on Who or How You Ask

    ERIC Educational Resources Information Center

    Froelich, Amy G.; Stephenson, W. Robert

    2013-01-01

    As a part of an opening course survey, data on eye color and gender were collected from students enrolled in an introductory statistics course at a large university over a recent four year period. Biologically, eye color and gender are independent traits. However, in the data collected from our students, there is a statistically significant…
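    The underlying check for dependence between two categorical traits like these is a χ² test of independence on the contingency table; a minimal sketch with made-up counts (not the course's actual data):

    ```python
    import numpy as np
    from scipy.stats import chi2_contingency

    # Hypothetical eye-color-by-gender contingency table.
    table = np.array([
        [55, 60, 35],   # blue, brown, other -- female
        [40, 70, 30],   # blue, brown, other -- male
    ])

    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2={chi2:.2f}, dof={dof}, p={p:.3f}")
    ```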

  16. Relational Care for Perinatal Substance Use: A Systematic Review.

    PubMed

    Kramlich, Debra; Kronk, Rebecca

    2015-01-01

    The purpose of this systematic review of the literature is to highlight published studies of perinatal substance use disorder that address relational aspects of various care delivery models to identify opportunities for future studies in this area. Quantitative, qualitative, and mixed-methods studies that included relational variables, such as healthcare provider engagement with pregnant women and facilitation of maternal-infant bonding, were identified using PubMed, Scopus, and EBSCO databases. Key words included neonatal abstinence syndrome, drug, opioid, substance, dependence, and pregnancy. Six studies included in this review identified statistically and/or clinically significant positive maternal and neonatal outcomes thought to be linked to engagement in antenatal care and development of caring relationships with healthcare providers. Comprehensive, integrated multidisciplinary services for pregnant women with substance use disorder aimed at harm reduction are showing positive results. Evidence exists that pregnant women's engagement with comprehensive services facilitated by caring relationships with healthcare providers may improve perinatal outcomes. Gaps in the literature remain; studies have yet to identify the relative contribution of multiple risk factors to adverse outcomes as well as program components most likely to improve outcomes.

  17. An unjustified benefit: immortal time bias in the analysis of time-dependent events.

    PubMed

    Gleiss, Andreas; Oberbauer, Rainer; Heinze, Georg

    2018-02-01

    Immortal time bias is a problem arising from methodologically wrong analyses of time-dependent events in survival analyses. We illustrate the problem by analysis of a kidney transplantation study. Following patients from transplantation to death, groups defined by the occurrence or nonoccurrence of graft failure during follow-up seemingly had equal overall mortality. Such naive analysis assumes that patients were assigned to the two groups at time of transplantation, which actually are a consequence of occurrence of a time-dependent event later during follow-up. We introduce landmark analysis as the method of choice to avoid immortal time bias. Landmark analysis splits the follow-up time at a common, prespecified time point, the so-called landmark. Groups are then defined by time-dependent events having occurred before the landmark, and outcome events are only considered if occurring after the landmark. Landmark analysis can be easily implemented with common statistical software. In our kidney transplantation example, landmark analyses with landmarks set at 30 and 60 months clearly identified graft failure as a risk factor for overall mortality. We give further typical examples from transplantation research and discuss strengths and limitations of landmark analysis and other methods to address immortal time bias such as Cox regression with time-dependent covariables. © 2017 Steunstichting ESOT.
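    A minimal sketch of a landmark analysis in Python (using the lifelines package, with a simulated cohort and hypothetical column names): follow-up is restarted at the landmark, group membership is fixed by whether graft failure occurred before the landmark, and only deaths after the landmark are counted.

    ```python
    import numpy as np
    import pandas as pd
    from lifelines import CoxPHFitter

    # Simulated transplant cohort (all quantities hypothetical); times in months.
    rng = np.random.default_rng(2)
    n = 500
    graft_failure_time = np.where(rng.random(n) < 0.35, rng.exponential(50, n), np.inf)
    death_time = rng.exponential(90, n) + np.where(np.isfinite(graft_failure_time), 0.0, 40.0)
    died = (death_time < 120).astype(int)          # administrative censoring at 120 months
    death_time = np.minimum(death_time, 120)

    df = pd.DataFrame({"death_time": death_time, "died": died,
                       "graft_failure_time": graft_failure_time})

    landmark = 30  # months after transplantation

    # 1) Keep only patients still under follow-up (alive) at the landmark.
    lm = df[df["death_time"] > landmark].copy()

    # 2) Group membership is defined by events occurring BEFORE the landmark only.
    lm["failure_before_landmark"] = (lm["graft_failure_time"] <= landmark).astype(int)

    # 3) The outcome clock restarts at the landmark.
    lm["time_since_landmark"] = lm["death_time"] - landmark

    cph = CoxPHFitter()
    cph.fit(lm[["time_since_landmark", "died", "failure_before_landmark"]],
            duration_col="time_since_landmark", event_col="died")
    cph.print_summary()
    ```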

  18. The Heuristic Value of p in Inductive Statistical Inference

    PubMed Central

    Krueger, Joachim I.; Heck, Patrick R.

    2017-01-01

    Many statistical methods yield the probability of the observed data – or data more extreme – under the assumption that a particular hypothesis is true. This probability is commonly known as ‘the’ p-value. (Null Hypothesis) Significance Testing ([NH]ST) is the most prominent of these methods. The p-value has been subjected to much speculation, analysis, and criticism. We explore how well the p-value predicts what researchers presumably seek: the probability of the hypothesis being true given the evidence, and the probability of reproducing significant results. We also explore the effect of sample size on inferential accuracy, bias, and error. In a series of simulation experiments, we find that the p-value performs quite well as a heuristic cue in inductive inference, although there are identifiable limits to its usefulness. We conclude that despite its general usefulness, the p-value cannot bear the full burden of inductive inference; it is but one of several heuristic cues available to the data analyst. Depending on the inferential challenge at hand, investigators may supplement their reports with effect size estimates, Bayes factors, or other suitable statistics, to communicate what they think the data say. PMID:28649206

  19. The Heuristic Value of p in Inductive Statistical Inference.

    PubMed

    Krueger, Joachim I; Heck, Patrick R

    2017-01-01

    Many statistical methods yield the probability of the observed data - or data more extreme - under the assumption that a particular hypothesis is true. This probability is commonly known as 'the' p-value. (Null Hypothesis) Significance Testing ([NH]ST) is the most prominent of these methods. The p-value has been subjected to much speculation, analysis, and criticism. We explore how well the p-value predicts what researchers presumably seek: the probability of the hypothesis being true given the evidence, and the probability of reproducing significant results. We also explore the effect of sample size on inferential accuracy, bias, and error. In a series of simulation experiments, we find that the p-value performs quite well as a heuristic cue in inductive inference, although there are identifiable limits to its usefulness. We conclude that despite its general usefulness, the p-value cannot bear the full burden of inductive inference; it is but one of several heuristic cues available to the data analyst. Depending on the inferential challenge at hand, investigators may supplement their reports with effect size estimates, Bayes factors, or other suitable statistics, to communicate what they think the data say.

  20. Aggregate and individual replication probability within an explicit model of the research process.

    PubMed

    Miller, Jeff; Schwarz, Wolf

    2011-09-01

    We study a model of the research process in which the true effect size, the replication jitter due to changes in experimental procedure, and the statistical error of effect size measurement are all normally distributed random variables. Within this model, we analyze the probability of successfully replicating an initial experimental result by obtaining either a statistically significant result in the same direction or any effect in that direction. We analyze both the probability of successfully replicating a particular experimental effect (i.e., the individual replication probability) and the average probability of successful replication across different studies within some research context (i.e., the aggregate replication probability), and we identify the conditions under which the latter can be approximated using the formulas of Killeen (2005a, 2007). We show how both of these probabilities depend on parameters of the research context that would rarely be known in practice. In addition, we show that the statistical uncertainty associated with the size of an initial observed effect would often prevent accurate estimation of the desired individual replication probability even if these research context parameters were known exactly. We conclude that accurate estimates of replication probability are generally unattainable.
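    A hedged simulation in the spirit of this model: true effect sizes, replication jitter, and measurement error are all drawn from normal distributions, and the aggregate probability of a same-direction (and of a significant same-direction) replication is estimated by Monte Carlo. All parameter values below are arbitrary assumptions, not the authors' settings.

    ```python
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(3)
    n_pairs = 100_000          # simulated original/replication study pairs
    n = 30                     # assumed per-group sample size in both studies
    se = np.sqrt(2.0 / n)      # standard error of a two-group mean difference (d units)

    delta  = rng.normal(0.4, 0.3, n_pairs)           # true effect sizes across studies
    jitter = rng.normal(0.0, 0.2, n_pairs)           # replication jitter (procedural change)
    d_orig = delta + rng.normal(0.0, se, n_pairs)            # observed original effect
    d_rep  = delta + jitter + rng.normal(0.0, se, n_pairs)   # observed replication effect

    crit = norm.ppf(0.975) * se                      # two-sided 5% significance threshold
    sig_orig = np.abs(d_orig) > crit

    # Aggregate replication probabilities, conditional on a significant original result.
    same_direction = np.sign(d_rep[sig_orig]) == np.sign(d_orig[sig_orig])
    sig_same_dir = same_direction & (np.abs(d_rep[sig_orig]) > crit)

    print("P(replication in same direction)       :", same_direction.mean().round(3))
    print("P(significant same-direction replicate):", sig_same_dir.mean().round(3))
    ```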

  1. Linking Mechanics and Statistics in Epidermal Tissues

    NASA Astrophysics Data System (ADS)

    Kim, Sangwoo; Hilgenfeldt, Sascha

    2015-03-01

    Disordered cellular structures, such as foams, polycrystals, or living tissues, can be characterized by quantitative measurements of domain size and topology. In recent work, we showed that correlations between size and topology in 2D systems are sensitive to the shape (eccentricity) of the individual domains: From a local model of neighbor relations, we derived an analytical justification for the famous empirical Lewis law, confirming the theory with experimental data from cucumber epidermal tissue. Here, we go beyond this purely geometrical model and identify mechanical properties of the tissue as the root cause for the domain eccentricity and thus the statistics of tissue structure. The simple model approach is based on the minimization of an interfacial energy functional. Simulations with Surface Evolver show that the domain statistics depend on a single mechanical parameter, while parameter fluctuations from cell to cell play an important role in simultaneously explaining the shape distribution of cells. The simulations are in excellent agreement with experiments and analytical theory, and establish a general link between the mechanical properties of a tissue and its structure. The model is relevant to diagnostic applications in a variety of animal and plant tissues.

  2. Gender differences in learning physical science concepts: Does computer animation help equalize them?

    NASA Astrophysics Data System (ADS)

    Jacek, Laura Lee

    This dissertation details an experiment designed to identify gender differences in learning using three experimental treatments: animation, static graphics, and verbal instruction alone. Three learning presentations were used in testing of 332 university students. Statistical analysis was performed using ANOVA, binomial tests for differences of proportion, and descriptive statistics. Results showed that animation significantly improved women's long-term learning over static graphics (p = 0.067), but did not significantly improve men's long-term learning over static graphics. In all cases, women's scores improved with animation over both other forms of instruction for long-term testing, indicating that future research should not abandon the study of animation as a tool that may promote gender equity in science. Short-term test differences were smaller, and not statistically significant. Variation present in short-term scores was related more to presentation topic than treatment. This research also details characteristics of each of the three presentations, to identify variables (e.g. level of abstraction in presentation) affecting score differences within treatments. Differences between men's and women's scores were non-standard between presentations, but these differences were not statistically significant (long-term p = 0.2961, short-term p = 0.2893). In future research, experiments might be better designed to test these presentational variables in isolation, possibly yielding more distinctive differences between presentational scores. Differences in confidence interval overlaps between presentations suggested that treatment superiority may be somewhat dependent on the design or topic of the learning presentation. Confidence intervals greatly overlapped in all situations. This undercut, to some degree, the certainty of conclusions indicating superiority of one treatment type over the others. However, confidence intervals for animation were smaller, overlapped nearly completely for men and women (there was less overlap between the genders for the other two treatments), and centered around slightly higher means, lending further support to the conclusion that animation helped equalize men's and women's learning. The most important conclusion identified in this research is that gender is an important variable in experimental populations testing animation as a learning device. Averages indicated that both men and women prefer to work with animation over either static graphics or verbal instruction alone.

  3. Statistics of Statisticians: Critical Mass of Statistics and Operational Research Groups

    NASA Astrophysics Data System (ADS)

    Kenna, Ralph; Berche, Bertrand

    Using a recently developed model, inspired by mean field theory in statistical physics, and data from the UK's Research Assessment Exercise, we analyse the relationship between the qualities of statistics and operational research groups and the quantities of researchers in them. Similar to other academic disciplines, we provide evidence for a linear dependency of quality on quantity up to an upper critical mass, which is interpreted as the average maximum number of colleagues with whom a researcher can communicate meaningfully within a research group. The model also predicts a lower critical mass, which research groups should strive to achieve to avoid extinction. For statistics and operational research, the lower critical mass is estimated to be 9 ± 3. The upper critical mass, beyond which research quality does not significantly depend on group size, is 17 ± 6.

  4. Local sensitivity analysis for inverse problems solved by singular value decomposition

    USGS Publications Warehouse

    Hill, M.C.; Nolan, B.T.

    2010-01-01

    Local sensitivity analysis provides computationally frugal ways to evaluate models commonly used for resource management, risk assessment, and so on. This includes diagnosing inverse model convergence problems caused by parameter insensitivity and(or) parameter interdependence (correlation), understanding what aspects of the model and data contribute to measures of uncertainty, and identifying new data likely to reduce model uncertainty. Here, we consider sensitivity statistics relevant to models in which the process model parameters are transformed using singular value decomposition (SVD) to create SVD parameters for model calibration. The statistics considered include the PEST identifiability statistic, and combined use of the process-model parameter statistics composite scaled sensitivities and parameter correlation coefficients (CSS and PCC). The statistics are complementary in that the identifiability statistic integrates the effects of parameter sensitivity and interdependence, while CSS and PCC provide individual measures of sensitivity and interdependence. PCC quantifies correlations between pairs or larger sets of parameters; when a set of parameters is intercorrelated, the absolute value of PCC is close to 1.00 for all pairs in the set. The number of singular vectors to include in the calculation of the identifiability statistic is somewhat subjective and influences the statistic. To demonstrate the statistics, we use the USDA’s Root Zone Water Quality Model to simulate nitrogen fate and transport in the unsaturated zone of the Merced River Basin, CA. There are 16 log-transformed process-model parameters, including water content at field capacity (WFC) and bulk density (BD) for each of five soil layers. Calibration data consisted of 1,670 observations comprising soil moisture, soil water tension, aqueous nitrate and bromide concentrations, soil nitrate concentration, and organic matter content. All 16 of the SVD parameters could be estimated by regression based on the range of singular values. Identifiability statistic results varied based on the number of SVD parameters included. Identifiability statistics calculated for four SVD parameters indicated the same three most important process-model parameters as CSS/PCC (WFC1, WFC2, and BD2), but the order differed. Additionally, the identifiability statistic showed that BD1 was almost as dominant as WFC1. The CSS/PCC analysis showed that this results from its high correlation with WFC1 (-0.94), and not its individual sensitivity. Such distinctions, combined with analysis of how high correlations and(or) sensitivities result from the constructed model, can produce important insights into, for example, the use of sensitivity analysis to design monitoring networks. In conclusion, the statistics considered identified similar important parameters. They differ because (1) using CSS/PCC can be more awkward because sensitivity and interdependence are considered separately and (2) identifiability requires consideration of how many SVD parameters to include. A continuing challenge is to understand how these computationally efficient methods compare with computationally demanding global methods like Markov-Chain Monte Carlo given common nonlinear processes and the often even more nonlinear models.
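    A simplified numpy sketch of the two process-model parameter statistics mentioned above: composite scaled sensitivities (CSS) as the root-mean-square of scaled sensitivities, and parameter correlation coefficients (PCC) derived from the approximate parameter covariance (J^T W J)^-1. The Jacobian, parameter values, and weights below are made up, and real applications (e.g. with PEST or UCODE) include parameter transformations and weighting details omitted here.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    n_obs, n_par = 50, 4
    J = rng.normal(size=(n_obs, n_par))      # hypothetical Jacobian d(simulated obs)/d(parameter)
    b = np.array([0.3, 1.2, 0.8, 2.5])       # parameter values used for scaling
    w = np.ones(n_obs)                        # observation weights (all equal here)

    # Composite scaled sensitivity: RMS over observations of J_ij * b_j * sqrt(w_i).
    scaled = J * b[None, :] * np.sqrt(w)[:, None]
    css = np.sqrt((scaled ** 2).mean(axis=0))

    # Parameter correlation coefficients from the approximate covariance (J^T W J)^-1.
    cov = np.linalg.inv(J.T @ (w[:, None] * J))
    pcc = cov / np.sqrt(np.outer(np.diag(cov), np.diag(cov)))

    print("CSS:", css.round(3))
    print("PCC:\n", pcc.round(2))
    ```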

  5. Improving the Validity of Activity of Daily Living Dependency Risk Assessment

    PubMed Central

    Clark, Daniel O.; Stump, Timothy E.; Tu, Wanzhu; Miller, Douglas K.

    2015-01-01

    Objectives: Efforts to prevent activity of daily living (ADL) dependency may be improved through models that assess older adults’ dependency risk. We evaluated whether cognition and gait speed measures improve the predictive validity of interview-based models. Method: Participants were 8,095 self-respondents in the 2006 Health and Retirement Survey who were aged 65 years or over and independent in five ADLs. Incident ADL dependency was determined from the 2008 interview. Models were developed using random 2/3rd cohorts and validated in the remaining 1/3rd. Results: Compared to a c-statistic of 0.79 in the best interview model, the model including cognitive measures had c-statistics of 0.82 and 0.80 while the best fitting gait speed model had c-statistics of 0.83 and 0.79 in the development and validation cohorts, respectively. Conclusion: Two relatively brief models, one that requires an in-person assessment and one that does not, had excellent validity for predicting incident ADL dependency but did not significantly improve the predictive validity of the best fitting interview-based models. PMID:24652867

  6. Quality of life of patients from rural and urban areas in Poland with head and neck cancer treated with radiotherapy. A study of the influence of selected socio-demographic factors.

    PubMed

    Depta, Adam; Jewczak, Maciej; Skura-Madziała, Anna

    2017-10-01

    The quality of life (QoL) experienced by cancer patients depends both on their state of health and on sociodemographic factors. Tumours in the head and neck region have a particularly adverse effect on patients psychologically and on their social functioning. The study involved 121 patients receiving radiotherapy treatment for head and neck cancers. They included 72 urban and 49 rural residents. QoL was assessed using the questionnaires EORTC-QLQ-C30 and QLQ-H&N35. The data were analysed using statistical methods: a χ² test for independence and a multinomial logit model. The evaluation of QoL showed a strong, statistically significant, positive dependence on state of health, and a weak dependence on sociodemographic factors and place of residence. Evaluations of financial situation and living conditions were similar for rural and urban residents. Patients from urban areas had the greatest anxiety about deterioration of their state of health. Rural respondents were more often anxious about a worsening of their financial situation, and expressed a fear of loneliness. Studying the QoL of patients with head and neck cancer provides information concerning the areas in which the disease inhibits their lives, and the extent to which it does so. It indicates conditions for the adaptation of treatment and care methods in the healthcare system which might improve the QoL of such patients. A multinomial logit model identifies the factors determining the patients' health assessment and defines the probable values of such assessment.

  7. H/D exchange mass spectrometry and statistical coupling analysis reveal a role for allostery in a ferredoxin-dependent bifurcating transhydrogenase catalytic cycle.

    PubMed

    Berry, Luke; Poudel, Saroj; Tokmina-Lukaszewska, Monika; Colman, Daniel R; Nguyen, Diep M N; Schut, Gerrit J; Adams, Michael W W; Peters, John W; Boyd, Eric S; Bothner, Brian

    2018-01-01

    Recent investigations into ferredoxin-dependent transhydrogenases, a class of enzymes responsible for electron transport, have highlighted the biological importance of flavin-based electron bifurcation (FBEB). FBEB generates biomolecules with very low reduction potential by coupling the oxidation of an electron donor with intermediate potential to the reduction of high and low potential molecules. Bifurcating systems can generate biomolecules with very low reduction potentials, such as reduced ferredoxin (Fd), from species such as NADPH. Metabolic systems that use bifurcation are more efficient and confer a competitive advantage for the organisms that harbor them. Structural models are now available for two NADH-dependent ferredoxin-NADP+ oxidoreductase (Nfn) complexes. These models, together with spectroscopic studies, have provided considerable insight into the catalytic process of FBEB. However, much about the mechanism and regulation of these multi-subunit proteins remains unclear. Using hydrogen/deuterium exchange mass spectrometry (HDX-MS) and statistical coupling analysis (SCA), we identified specific pathways of communication within the model FBEB system, Nfn from Pyrococcus furiosus, under conditions at each step of the catalytic cycle. HDX-MS revealed evidence for allosteric coupling across protein subunits upon nucleotide and ferredoxin binding. SCA uncovered a network of co-evolving residues that can provide connectivity across the complex. Together, the HDX-MS and SCA data show that protein allostery occurs across the ensemble of iron‑sulfur cofactors and ligand binding sites using specific pathways that connect domains allowing them to function as dynamically coordinated units. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Covariance Between Genotypic Effects and its Use for Genomic Inference in Half-Sib Families

    PubMed Central

    Wittenburg, Dörte; Teuscher, Friedrich; Klosa, Jan; Reinsch, Norbert

    2016-01-01

    In livestock, current statistical approaches utilize extensive molecular data, e.g., single nucleotide polymorphisms (SNPs), to improve the genetic evaluation of individuals. The number of model parameters increases with the number of SNPs, so the multicollinearity between covariates can affect the results obtained using whole genome regression methods. In this study, dependencies between SNPs due to linkage and linkage disequilibrium among the chromosome segments were explicitly considered in methods used to estimate the effects of SNPs. The population structure affects the extent of such dependencies, so the covariance among SNP genotypes was derived for half-sib families, which are typical in livestock populations. Conditional on the SNP haplotypes of the common parent (sire), the theoretical covariance was determined using the haplotype frequencies of the population from which the individual parent (dam) was derived. The resulting covariance matrix was included in a statistical model for a trait of interest, and this covariance matrix was then used to specify prior assumptions for SNP effects in a Bayesian framework. The approach was applied to one family in simulated scenarios (few and many quantitative trait loci) and using semireal data obtained from dairy cattle to identify genome segments that affect performance traits, as well as to investigate the impact on predictive ability. Compared with a method that does not explicitly consider any of the relationship among predictor variables, the accuracy of genetic value prediction was improved by 10–22%. The results show that the inclusion of dependence is particularly important for genomic inference based on small sample sizes. PMID:27402363

  9. Development of a statistical model for the determination of the probability of riverbank erosion in a Mediterranean river basin

    NASA Astrophysics Data System (ADS)

    Varouchakis, Emmanouil; Kourgialas, Nektarios; Karatzas, George; Giannakis, Georgios; Lilli, Maria; Nikolaidis, Nikolaos

    2014-05-01

    Riverbank erosion affects the river morphology and the local habitat and results in riparian land loss, damage to property and infrastructures, ultimately weakening flood defences. An important issue concerning riverbank erosion is the identification of the areas vulnerable to erosion, as it allows for predicting changes and assists with stream management and restoration. One way to predict the areas vulnerable to erosion is to determine the erosion probability by identifying the underlying relations between riverbank erosion and the geomorphological and/or hydrological variables that prevent or stimulate erosion. A statistical model for evaluating the probability of erosion based on a series of independent local variables and by using logistic regression is developed in this work. The main variables affecting erosion are vegetation index (stability), the presence or absence of meanders, bank material (classification), stream power, bank height, river bank slope, riverbed slope, cross section width and water velocities (Luppi et al. 2009). In statistics, logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable, e.g. binary response, based on one or more predictor variables (continuous or categorical). The probabilities of the possible outcomes are modelled as a function of independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. 1 = "presence of erosion" and 0 = "no erosion") for any value of the independent variables. The regression coefficients are estimated by using maximum likelihood estimation. The erosion occurrence probability can be calculated in conjunction with the model deviance regarding the independent variables tested (Atkinson et al. 2003). The developed statistical model is applied to the Koiliaris River Basin on the island of Crete, Greece. The aim is to determine the probability of erosion along the Koiliaris' riverbanks considering a series of independent geomorphological and/or hydrological variables. Data for the river bank slope and for the river cross section width are available at ten locations along the river. The riverbank has indications of erosion at six of the ten locations while four have remained stable. Based on a recent work, measurements for the two independent variables and data regarding bank stability are available at eight different locations along the river. These locations were used as validation points for the proposed statistical model. The results show a very close agreement between the observed erosion indications and the statistical model as the probability of erosion was accurately predicted at seven out of the eight locations. The next step is to apply the model at more locations along the riverbanks. In November 2013, stakes were inserted at selected locations in order to be able to identify the presence or absence of erosion after the winter period. In April 2014 the presence or absence of erosion will be identified and the model results will be compared to the field data. Our intent is to extend the model by increasing the number of independent variables in order to identify the key factors favouring erosion along the Koiliaris River.
We aim to develop an easy-to-use statistical tool that will provide a quantified measure of the erosion probability along the riverbanks, which could consequently be used to prevent erosion and flooding events. References: Atkinson, P. M., German, S. E., Sear, D. A. and Clark, M. J. 2003. Exploring the relations between riverbank erosion and geomorphological controls using geographically weighted logistic regression. Geographical Analysis, 35 (1), 58-82. Luppi, L., Rinaldi, M., Teruggi, L. B., Darby, S. E. and Nardi, L. 2009. Monitoring and numerical modelling of riverbank erosion processes: A case study along the Cecina River (central Italy). Earth Surface Processes and Landforms, 34 (4), 530-546. Acknowledgements: This work is part of an on-going THALES project (CYBERSENSORS - High Frequency Monitoring System for Integrated Water Resources Management of Rivers). The project has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: THALES. Investing in knowledge society through the European Social Fund.
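    A minimal sketch of the kind of logistic regression described above, with a made-up binary erosion indicator and two hypothetical predictors (bank slope and cross-section width); statsmodels estimates the coefficients by maximum likelihood and returns predicted erosion probabilities. The data and variable names are illustrative assumptions, not the Koiliaris measurements.

    ```python
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Hypothetical observations at surveyed riverbank locations.
    df = pd.DataFrame({
        "bank_slope":    [0.8, 0.6, 0.9, 0.3, 0.2, 0.7, 0.4, 0.5, 0.55, 0.65],
        "section_width": [12,  15,  10,  25,  30,  14,  22,  18,  20,   26],
        "erosion":       [1,   0,   1,   0,   0,   1,   1,   0,   1,    0],
    })

    X = sm.add_constant(df[["bank_slope", "section_width"]])
    model = sm.Logit(df["erosion"], X).fit(disp=0)   # maximum likelihood estimation
    print(model.summary())

    # Predicted probability of erosion at a new location (hypothetical values).
    new = pd.DataFrame({"const": [1.0], "bank_slope": [0.5], "section_width": [20]})
    print("P(erosion):", float(model.predict(new)[0]))
    ```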

  10. Lower education level is a major risk factor for peritonitis incidence in chronic peritoneal dialysis patients: a retrospective cohort study with 12-year follow-up.

    PubMed

    Chern, Yahn-Bor; Ho, Pei-Shan; Kuo, Li-Chueh; Chen, Jin-Bor

    2013-01-01

    Peritoneal dialysis (PD)-related peritonitis remains an important complication in PD patients, potentially causing technique failure and influencing patient outcome. To date, no comprehensive study in the Taiwanese PD population has used a time-dependent statistical method to analyze the factors associated with PD-related peritonitis. Our single-center retrospective cohort study, conducted in southern Taiwan between February 1999 and July 2010, used time-dependent statistical methods to analyze the factors associated with PD-related peritonitis. The study recruited 404 PD patients for analysis, 150 of whom experienced at least 1 episode of peritonitis during the follow-up period. The incidence rate of peritonitis was highest during the first 6 months after PD start. A comparison of patients in the two groups (peritonitis vs null-peritonitis) by univariate analysis showed that the peritonitis group included fewer men (p = 0.048) and more patients of older age (≥65 years, p = 0.049). In addition, patients who had never received compulsory education showed a statistically higher incidence of PD-related peritonitis in the univariate analysis (p = 0.04). A proportional hazards model identified education level (less than elementary school vs any higher education level) as having an independent association with PD-related peritonitis [hazard ratio (HR): 1.45; 95% confidence interval (CI): 1.01 to 2.06; p = 0.045). Comorbidities measured using the Charlson comorbidity index (score >2 vs ≤2) showed borderline statistical significance (HR: 1.44; 95% CI: 1.00 to 2.13; p = 0.053). A lower education level is a major risk factor for PD-related peritonitis independent of age, sex, hypoalbuminemia, and comorbidities. Our study emphasizes that a comprehensive PD education program is crucial for PD patients with a lower education level.

  11. Accounting for individualized competing mortality risks in estimating postmenopausal breast cancer risk.

    PubMed

    Schonberg, Mara A; Li, Vicky W; Eliassen, A Heather; Davis, Roger B; LaCroix, Andrea Z; McCarthy, Ellen P; Rosner, Bernard A; Chlebowski, Rowan T; Hankinson, Susan E; Marcantonio, Edward R; Ngo, Long H

    2016-12-01

    Accurate risk assessment is necessary for decision-making around breast cancer prevention. We aimed to develop a breast cancer prediction model for postmenopausal women that would take into account their individualized competing risk of non-breast cancer death. We included 73,066 women who completed the 2004 Nurses' Health Study (NHS) questionnaire (all ≥57 years) and followed participants until May 2014. We considered 17 breast cancer risk factors (health behaviors, demographics, family history, reproductive factors) and 7 risk factors for non-breast cancer death (comorbidities, functional dependency) and mammography use. We used competing risk regression to identify factors independently associated with breast cancer. We validated the final model by examining calibration (expected-to-observed ratio of breast cancer incidence, E/O) and discrimination (c-statistic) using 74,887 subjects from the Women's Health Initiative Extension Study (WHI-ES; all were ≥55 years and followed for 5 years). Within 5 years, 1.8 % of NHS participants were diagnosed with breast cancer (vs. 2.0 % in WHI-ES, p = 0.02), and 6.6 % experienced non-breast cancer death (vs. 5.2 % in WHI-ES, p < 0.001). Using a model selection procedure which incorporated the Akaike Information Criterion, c-statistic, statistical significance, and clinical judgement, our final model included 9 breast cancer risk factors, 5 comorbidities, functional dependency, and mammography use. The model's c-statistic was 0.61 (95 % CI [0.60-0.63]) in NHS and 0.57 (0.55-0.58) in WHI-ES. On average, our model under predicted breast cancer in WHI-ES (E/O 0.92 [0.88-0.97]). We developed a novel prediction model that factors in postmenopausal women's individualized competing risks of non-breast cancer death when estimating breast cancer risk.

  12. Accuracy of topographic index models at identifying ephemeral gully trajectories on agricultural fields

    NASA Astrophysics Data System (ADS)

    Sheshukov, Aleksey Y.; Sekaluvu, Lawrence; Hutchinson, Stacy L.

    2018-04-01

    Topographic index (TI) models have been widely used to predict trajectories and initiation points of ephemeral gullies (EGs) in agricultural landscapes. Prediction of EGs strongly relies on the selected value of the critical TI threshold, and the accuracy depends on topographic features, agricultural management, and datasets of observed EGs. This study statistically evaluated the predictions by TI models in two paired watersheds in Central Kansas that had different levels of structural disturbances due to implemented conservation practices. Four TI models with sole dependency on topographic factors of slope, contributing area, and planform curvature were used in this study. The observed EGs were obtained by field reconnaissance and through the process of hydrological reconditioning of digital elevation models (DEMs). The Kernel Density Estimation analysis was used to evaluate TI distribution within a 10-m buffer of the observed EG trajectories. The EG occurrence within catchments was analyzed using kappa statistics of the error matrix approach, while the lengths of predicted EGs were compared with the observed dataset using the Nash-Sutcliffe Efficiency (NSE) statistics. The TI frequency analysis produced a bi-modal distribution of topographic indices, with the pixels within the EG trajectory having a higher peak. The graphs of kappa and NSE versus critical TI threshold showed similar profiles for all four TI models and both watersheds, with the maximum value representing the best agreement with the observed data. The Compound Topographic Index (CTI) model presented the overall best accuracy with NSE of 0.55 and kappa of 0.32. The statistics for the disturbed watershed showed higher best critical TI threshold values than for the undisturbed watershed. Structural conservation practices implemented in the disturbed watershed reduced ephemeral channels in headwater catchments, thus producing less variability in catchments with EGs. The variation in critical thresholds for all TI models suggested that TI models tend to predict EG occurrence and length over a range of thresholds rather than find a single best value.
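    The two accuracy measures used above are straightforward to compute directly; a sketch with hypothetical observed/predicted EG lengths (for NSE) and a hypothetical presence/absence comparison across catchments (for Cohen's kappa):

    ```python
    import numpy as np
    from sklearn.metrics import cohen_kappa_score

    # Hypothetical observed and predicted ephemeral gully lengths (m) per catchment.
    obs  = np.array([120.0, 310.0, 0.0, 95.0, 210.0, 45.0])
    pred = np.array([140.0, 280.0, 20.0, 80.0, 230.0, 60.0])

    # Nash-Sutcliffe Efficiency: 1 - SSE / variance of observations about their mean.
    nse = 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

    # Cohen's kappa for EG occurrence (1 = EG present) across catchments.
    obs_occ  = np.array([1, 1, 0, 1, 1, 0, 0, 1, 0, 0])
    pred_occ = np.array([1, 1, 0, 0, 1, 0, 1, 1, 0, 0])
    kappa = cohen_kappa_score(obs_occ, pred_occ)

    print(f"NSE = {nse:.2f}, kappa = {kappa:.2f}")
    ```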

  13. Accounting for individualized competing mortality risks in estimating postmenopausal breast cancer risk

    PubMed Central

    Schonberg, Mara A.; Li, Vicky W.; Eliassen, A. Heather; Davis, Roger B.; LaCroix, Andrea Z.; McCarthy, Ellen P.; Rosner, Bernard A.; Chlebowski, Rowan T.; Hankinson, Susan E.; Marcantonio, Edward R.; Ngo, Long H.

    2016-01-01

    Purpose: Accurate risk assessment is necessary for decision-making around breast cancer prevention. We aimed to develop a breast cancer prediction model for postmenopausal women that would take into account their individualized competing risk of non-breast cancer death. Methods: We included 73,066 women who completed the 2004 Nurses’ Health Study (NHS) questionnaire (all ≥57 years) and followed participants until May 2014. We considered 17 breast cancer risk factors (health behaviors, demographics, family history, reproductive factors), 7 risk factors for non-breast cancer death (comorbidities, functional dependency), and mammography use. We used competing risk regression to identify factors independently associated with breast cancer. We validated the final model by examining calibration (expected-to-observed ratio of breast cancer incidence, E/O) and discrimination (c-statistic) using 74,887 subjects from the Women’s Health Initiative Extension Study (WHI-ES; all were ≥55 years and followed for 5 years). Results: Within 5 years, 1.8% of NHS participants were diagnosed with breast cancer (vs. 2.0% in WHI-ES, p=0.02) and 6.6% experienced non-breast cancer death (vs. 5.2% in WHI-ES, p<0.001). Using a model selection procedure which incorporated the Akaike Information Criterion, c-statistic, statistical significance, and clinical judgement, our final model included 9 breast cancer risk factors, 5 comorbidities, functional dependency, and mammography use. The model’s c-statistic was 0.61 (95% CI [0.60–0.63]) in NHS and 0.57 (0.55–0.58) in WHI-ES. On average our model under predicted breast cancer in WHI-ES (E/O 0.92 [0.88–0.97]). Conclusions: We developed a novel prediction model that factors in postmenopausal women’s individualized competing risks of non-breast cancer death when estimating breast cancer risk. PMID:27770283

  14. [Contextual indicators to assess social determinants of health and the Spanish economic recession].

    PubMed

    Cabrera-León, Andrés; Daponte Codina, Antonio; Mateo, Inmaculada; Arroyo-Borrell, Elena; Bartoll, Xavier; Bravo, María José; Domínguez-Berjón, María Felicitas; Renart, Gemma; Álvarez-Dardet, Carlos; Marí-Dell'Olmo, Marc; Bolívar Muñoz, Julia; Saez, Marc; Escribà-Agüir, Vicenta; Palència, Laia; López, María José; Saurina, Carme; Puig, Vanessa; Martín, Unai; Gotsens, Mercè; Borrell, Carme; Serra Saurina, Laura; Sordo, Luis; Bacigalupe, Amaia; Rodríguez-Sanz, Maica; Pérez, Glòria; Espelt, Albert; Ruiz, Miguel; Bernal, Mariola

    The aim was to provide indicators to assess the impact of the social context and the recent economic recession in Spain and its autonomous regions on health, its social determinants, and health inequalities. Based on the Spanish conceptual framework for determinants of social inequalities in health, we identified indicators sequentially from key documents, Web of Science, and organisations with official statistics. The information collected resulted in a large directory of indicators which was reviewed by an expert panel. We then selected a set of these indicators according to geographical (availability of data according to autonomous regions) and temporal (from at least 2006 to 2012) criteria. We identified 203 contextual indicators related to social determinants of health and selected 96 (47%) based on the above criteria; 16% of the identified indicators did not satisfy the geographical criteria and 35% did not satisfy the temporal criteria. At least 80% of the indicators related to dependence and healthcare services were excluded. The final selection of indicators covered all areas for social determinants of health, and 62% of these were not available on the Internet. Around 40% of the indicators were extracted from sources related to the Spanish Statistics Institute. We have provided an extensive directory of contextual indicators on social determinants of health and a database to facilitate assessment of the impact of the economic recession on health and health inequalities in Spain and its autonomous regions. Copyright © 2016 SESPAS. Published by Elsevier España, S.L.U. All rights reserved.

  15. A Coalitional Game for Distributed Inference in Sensor Networks With Dependent Observations

    NASA Astrophysics Data System (ADS)

    He, Hao; Varshney, Pramod K.

    2016-04-01

    We consider the problem of collaborative inference in a sensor network with heterogeneous and statistically dependent sensor observations. Each sensor aims to maximize its inference performance by forming a coalition with other sensors and sharing information within the coalition. It is proved that the inference performance is a nondecreasing function of the coalition size. However, in an energy constrained network, the energy consumption of inter-sensor communication also increases with increasing coalition size, which discourages the formation of the grand coalition (the set of all sensors). In this paper, the formation of non-overlapping coalitions with statistically dependent sensors is investigated under a specific communication constraint. We apply a game theoretical approach to fully explore and utilize the information contained in the spatial dependence among sensors to maximize individual sensor performance. Before formulating the distributed inference problem as a coalition formation game, we first quantify the gain and loss in forming a coalition by introducing the concepts of diversity gain and redundancy loss for both estimation and detection problems. These definitions, enabled by the statistical theory of copulas, allow us to characterize the influence of statistical dependence among sensor observations on inference performance. An iterative algorithm based on merge-and-split operations is proposed for the solution and the stability of the proposed algorithm is analyzed. Numerical results are provided to demonstrate the superiority of our proposed game theoretical approach.
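    A small sketch of how statistically dependent sensor observations with heterogeneous marginals can be generated through a Gaussian copula, the kind of construction that copula theory enables; the correlation value and marginal distributions below are arbitrary choices for illustration, not the paper's model.

    ```python
    import numpy as np
    from scipy.stats import norm, expon, gamma, spearmanr

    rng = np.random.default_rng(5)
    n = 10_000
    rho = 0.7                       # latent correlation between two sensors

    # Step 1: correlated standard normals (the Gaussian copula's latent layer).
    cov = np.array([[1.0, rho], [rho, 1.0]])
    z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)

    # Step 2: map to uniforms, preserving the dependence structure.
    u = norm.cdf(z)

    # Step 3: apply heterogeneous marginals via inverse CDFs (sensor 1 vs sensor 2).
    x1 = expon.ppf(u[:, 0], scale=2.0)
    x2 = gamma.ppf(u[:, 1], a=3.0, scale=1.0)

    # The marginals differ, but the rank dependence induced by the copula remains.
    rho_s, _ = spearmanr(x1, x2)
    print("rank correlation between sensor observations:", round(rho_s, 3))
    ```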

  16. Does graded reaming affect the composition of reaming products in intramedullary nailing of long bones?

    PubMed

    Kouzelis, Antonis Th; Kourea, Helen; Megas, Panagiotis; Panagiotopoulos, Elias; Marangos, Markos; Lambiris, Elias

    2004-08-01

    Reaming products taken during intramedullary nailing were examined to identify possible differences in their composition depending on the reaming percentage. Reaming products were taken from 39 fresh closed tibial and femoral diaphyseal fractures in patients with an average age of 29 years. According to histology, reaming products mainly consisted of bone trabeculae, viable or nonviable, and bone marrow stroma. A statistically significant reverse correlation exists between viable bone mass percentage and reaming progress. Reaming 1 mm less than the minimum canal diameter provides a higher viable bone mass percentage, which might be an important factor in the bone healing process.

  17. Epithelial ovarian carcinoma diagnosis by desorption electrospray ionization mass spectrometry imaging

    PubMed Central

    Dória, Maria Luisa; McKenzie, James S.; Mroz, Anna; Phelps, David L.; Speller, Abigail; Rosini, Francesca; Strittmatter, Nicole; Golf, Ottmar; Veselkov, Kirill; Brown, Robert; Ghaem-Maghami, Sadaf; Takats, Zoltan

    2016-01-01

    Ovarian cancer is highly prevalent among European women, and is the leading cause of gynaecological cancer death. Current histopathological diagnoses of tumour severity are based on interpretation of, for example, immunohistochemical staining. Desorption electrospray mass spectrometry imaging (DESI-MSI) generates spatially resolved metabolic profiles of tissues and supports an objective investigation of tumour biology. In this study, various ovarian tissue types were analysed by DESI-MSI and co-registered with their corresponding haematoxylin and eosin (H&E) stained images. The mass spectral data reveal tissue type-dependent lipid profiles which are consistent across the n = 110 samples (n = 107 patients) used in this study. Multivariate statistical methods were used to classify samples and identify molecular features discriminating between tissue types. Three main groups of samples (epithelial ovarian carcinoma, borderline ovarian tumours, normal ovarian stroma) were compared as were the carcinoma histotypes (serous, endometrioid, clear cell). Classification rates >84% were achieved for all analyses, and variables differing statistically between groups were determined and putatively identified. The changes noted in various lipid types help to provide a context in terms of tumour biochemistry. The classification of unseen samples demonstrates the capability of DESI-MSI to characterise ovarian samples and to overcome existing limitations in classical histopathology. PMID:27976698

  18. A Statistical Selection Strategy for Normalization Procedures in LC-MS Proteomics Experiments through Dataset Dependent Ranking of Normalization Scaling Factors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Webb-Robertson, Bobbie-Jo M.; Matzke, Melissa M.; Jacobs, Jon M.

    2011-12-01

    Quantification of LC-MS peak intensities assigned during peptide identification in a typical comparative proteomics experiment will deviate from run-to-run of the instrument due to both technical and biological variation. Thus, normalization of peak intensities across a LC-MS proteomics dataset is a fundamental step in pre-processing. However, the downstream analysis of LC-MS proteomics data can be dramatically affected by the normalization method selected. Current normalization procedures for LC-MS proteomics data are presented in the context of normalization values derived from subsets of the full collection of identified peptides. The distribution of these normalization values is unknown a priori. If they are not independent from the biological factors associated with the experiment, the normalization process can introduce bias into the data, which will affect downstream statistical biomarker discovery. We present a novel approach to evaluate normalization strategies, where a normalization strategy includes the peptide selection component associated with the derivation of normalization values. Our approach evaluates the effect of normalization on the between-group variance structure in order to identify candidate normalization strategies that improve the structure of the data without introducing bias into the normalized peak intensities.
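
    A minimal sketch of the underlying idea, assuming a simple median-scaling normalization and a hypothetical low-variance peptide subset (not the authors' actual procedure): if the per-run scaling factors separate by biological group, normalizing with them would bias downstream comparisons.

```python
# Sketch: check whether normalization scaling factors are confounded with groups.
import numpy as np
from scipy import stats

def median_scaling_factors(log_intensities, peptide_subset):
    """One scaling factor per LC-MS run: median log intensity of the chosen peptides."""
    return np.nanmedian(log_intensities[peptide_subset, :], axis=0)

def bias_check(scaling_factors, group_labels):
    """ANOVA of scaling factors across biological groups; a large F hints at bias."""
    groups = [scaling_factors[group_labels == g] for g in np.unique(group_labels)]
    return stats.f_oneway(*groups)

rng = np.random.default_rng(0)
log_int = rng.normal(20, 2, size=(500, 12))            # 500 peptides x 12 runs (synthetic)
log_int[:, 6:] += 0.5                                  # simulated run-to-run offset
labels = np.array([0] * 6 + [1] * 6)                   # two biological groups

subset = np.argsort(np.nanvar(log_int, axis=1))[:100]  # hypothetical low-variance subset
factors = median_scaling_factors(log_int, subset)
print(bias_check(factors, labels))                     # large F => candidate strategy is suspect
normalized = log_int - factors                         # subtract per-run scaling factor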

  19. [Maternal posture and its influence on birthweight].

    PubMed

    Takito, Monica Yuri; Benício, Maria Helena D'Aquino; Latorre, Maria do Rosário Dias de Oliveira

    2005-06-01

    To analyze the relationship between maternal posture/physical activity and inadequate birthweight. Prospective cohort study involving 152 pregnant women from a public low-risk antenatal care facility. Three interviews evaluating the frequency of physical activity were administered to each pregnant woman during gestation. Birthweight (inadequate when <3,000 g and adequate when > or =3,000 g) was the dependent variable and the frequency of physical activity the independent variable. Statistical analysis was performed using logistic univariate analysis and multiple regression controlling for schooling, smoking, living with spouse, and baseline nutritional status. The practice of walking for at least 50 minutes during the first period of pregnancy was identified as a protective factor against inadequate birthweight (adjusted OR=0.44; 95% CI: 0.20-0.98). Standing for 2.5 hours or longer during the second trimester of pregnancy was associated with increased risk (adjusted OR=3.23; 95% CI: 1.30-7.99). Dose-response relationships were identified for washing clothing by hand and cooking (p-value for linear trend <0.01 and 0.05, respectively). After confounder control, only washing clothing during the second trimester of gestation remained statistically significant. Our results show the importance of medical orientation regarding posture and physical activity during antenatal care, aiming at the reduction of inadequate birthweight.

  20. Alexithymia, depressive experiences, and dependency in addictive disorders.

    PubMed

    Speranza, Mario; Corcos, Maurice; Stéphan, Philippe; Loas, Gwenolé; Pérez-Diaz, Fernando; Lang, François; Venisse, Jean Luc; Bizouard, Paul; Flament, Martine; Halfon, Olivier; Jeammet, Philippe

    2004-03-01

    Alexithymia, depressive feelings, and dependency are interrelated dimensions that are considered potential "risk factors" for addictive disorders. The aim of this study was to investigate the relationships between these dimensions and to define a comprehensive model of addiction in a large sample of addicted subjects, whether affected by an eating disorder or presenting an alcohol- or a drug use-related disorder. The participants in this study were gathered from a multicenter collaborative study on addictive behaviors conducted in several psychiatric departments in France, Switzerland, and Belgium between January 1995 and March 1999. The clinical sample was composed of 564 patients (149 anorexics, 84 bulimics, 208 alcoholics, 123 drug addicts) of both genders with a mean age of 27.3 +/- 8 years. A path analysis was conducted on the 564 dependent patients and 518 matched controls using the scores of the Toronto Alexithymia Scale, the Depressive Experiences Questionnaire, and the Interpersonal Dependency Inventory. Statistical analyses showed good adjustment (Goodness of Fit Index = 0.977) between the observable data and the assumed model, thus supporting the hypothesis that a depressive dimension, whether anaclitic or self-critical, can facilitate the development of dependency in vulnerable alexithymic subjects. This result has interesting clinical implications because identifying specific patterns of relationships leading from alexithymia to dependency can provide clues to the development of targeted strategies for at-risk subjects.

  1. Forecast Verification: Identification of small changes in weather forecasting skill

    NASA Astrophysics Data System (ADS)

    Weatherhead, E. C.; Jensen, T. L.

    2017-12-01

    Global and regional weather forecasts have improved over the past seven decades, most often because of small, incremental improvements. The identification and verification of forecast improvement due to proposed small changes in forecasting can be expensive and, if not carried out efficiently, can slow progress in forecasting development. This presentation will look at the skill of commonly used verification techniques and show how the ability to detect improvements can depend on the magnitude of the improvement, the number of runs used to test the improvement, the location on the Earth and the statistical techniques used. For continuous variables, such as temperature, wind and humidity, the skill of a forecast can be directly compared using a pair-wise statistical test that accommodates the natural autocorrelation and magnitude of variability. For discrete variables, such as tornado outbreaks or icing events, the challenge is to reduce the false alarm rate while improving the rate of correctly identifying the discrete event. For both continuous and discrete verification results, proper statistical approaches can reduce the number of runs needed to identify a small improvement in forecasting skill. Verification within the Next Generation Global Prediction System is an important component of the many small decisions needed to make state-of-the-art improvements to weather forecasting capabilities. The comparison of multiple skill scores with often conflicting results requires not only appropriate testing, but also scientific judgment to assure that the choices are appropriate not only for improvements in today's forecasting capabilities, but also allow for improvements that will come in the future.
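
    For the continuous-variable case, a pair-wise comparison that accounts for autocorrelation might look like the following sketch, which inflates the variance of the mean error difference using the lag-1 autocorrelation (an assumed, simplified adjustment, not the presenters' exact method).

```python
# Sketch: paired comparison of two forecasts' errors with an effective sample size
# correction for lag-1 autocorrelation in the error differences.
import numpy as np
from scipy import stats

def paired_skill_test(err_a, err_b):
    d = np.asarray(err_a) - np.asarray(err_b)          # per-run error differences
    n = d.size
    r1 = np.corrcoef(d[:-1], d[1:])[0, 1]              # lag-1 autocorrelation
    n_eff = n * (1 - r1) / (1 + r1)                    # effective sample size
    t = d.mean() / (d.std(ddof=1) / np.sqrt(n_eff))
    p = 2 * stats.t.sf(abs(t), df=max(n_eff - 1, 1))
    return t, p, n_eff

rng = np.random.default_rng(1)
base = np.abs(rng.normal(1.0, 0.3, 200))               # errors of a reference forecast (synthetic)
improved = base - 0.02 + rng.normal(0, 0.05, 200)      # slightly better candidate
print(paired_skill_test(base, improved))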

  2. A Statistical Study of Eiscat Electron and Ion Temperature Measurements In The E-region

    NASA Astrophysics Data System (ADS)

    Hussey, G.; Haldoupis, C.; Schlegel, K.; Bösinger, T.

    Motivated by the large EISCAT database, which covers over 15 years of common programme operation, and previous statistical work with EISCAT data (e.g., C. Haldoupis, K. Schlegel, and G. Hussey, Auroral E-region electron density gradients measured with EISCAT, Ann. Geophysicae, 18, 1172-1181, 2000), a detailed statistical analysis of electron and ion EISCAT temperature measurements has been undertaken. This study was specifically concerned with the statistical dependence between heating events and other ambient parameters such as the electric field and electron density. The results showed previously reported dependences such as the electron temperature being directly correlated with the ambient electric field and inversely related to the electron density. However, these correlations were found to be also dependent upon altitude. There was also evidence of the so-called "Schlegel effect" (K. Schlegel, Reduced effective recombination coefficient in the disturbed polar E-region, J. Atmos. Terr. Phys., 44, 183-185, 1982); that is, the heated electron gas leads to increases in electron density through a reduction in the recombination rate. This paper will present the statistical heating results and attempt to offer physical explanations and interpretations of the findings.

  3. Methamphetamine use and dependence in vulnerable female populations.

    PubMed

    Kittirattanapaiboon, Phunnapa; Srikosai, Soontaree; Wittayanookulluk, Apisak

    2017-07-01

    This study reviews recent publications on methamphetamine use and dependence in women, covering epidemiology, physical health impact, psychosocial impacts, and identified vulnerability issues. Studies of vulnerable populations of women are wide-ranging and include sex workers, sexual minorities, homeless women, psychiatric patients, suburban women, and pregnant women, among whom amphetamine-type stimulants (ATSs) are the most commonly reported illicit drugs. Prenatal ATS exposure has been associated with small-for-gestational-age infants and low birth weight; however, more long-term research on methamphetamine-exposed children is needed. Intimate partner violence (IPV) is commonly reported by female methamphetamine users, both as perpetrators and as victims; statistics and gendered power dynamics suggest that methamphetamine-related IPV carries a higher risk of femicide. Methamphetamine-abusing women often have unresolved childhood trauma and are introduced to ATS through family members or partners. Vulnerable populations of women remain at risk of methamphetamine abuse and dependence, and impacts on their physical and mental health, IPV, and pregnancy continue to be reported, suggesting that empowering and holistic substance abuse services are necessary for these specific groups.

  4. Uncertainty Quantification in Scale-Dependent Models of Flow in Porous Media: SCALE-DEPENDENT UQ

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tartakovsky, A. M.; Panzeri, M.; Tartakovsky, G. D.

    Equations governing flow and transport in heterogeneous porous media are scale-dependent. We demonstrate that it is possible to identify a support scale η*, such that the typically employed approximate formulations of Moment Equations (ME) yield accurate (statistical) moments of a target environmental state variable. Under these circumstances, the ME approach can be used as an alternative to the Monte Carlo (MC) method for Uncertainty Quantification in diverse fields of Earth and environmental sciences. MEs are directly satisfied by the leading moments of the quantities of interest and are defined on the same support scale as the governing stochastic partial differential equations (PDEs). Computable approximations of the otherwise exact MEs can be obtained through perturbation expansion of moments of the state variables in orders of the standard deviation of the random model parameters. As such, their convergence is guaranteed only for standard deviations smaller than one. We demonstrate our approach in the context of steady-state groundwater flow in a porous medium with a spatially random hydraulic conductivity.

  5. Statistical scaling of pore-scale Lagrangian velocities in natural porous media.

    PubMed

    Siena, M; Guadagnini, A; Riva, M; Bijeljic, B; Pereira Nunes, J P; Blunt, M J

    2014-08-01

    We investigate the scaling behavior of sample statistics of pore-scale Lagrangian velocities in two different rock samples, Bentheimer sandstone and Estaillades limestone. The samples are imaged using X-ray computed tomography with micron-scale resolution. The scaling analysis relies on the study of the way qth-order sample structure functions (statistical moments of order q of absolute increments) of Lagrangian velocities depend on separation distances, or lags, traveled along the mean flow direction. In the sandstone block, sample structure functions of all orders exhibit a power-law scaling within a clearly identifiable intermediate range of lags. Sample structure functions associated with the limestone block display two diverse power-law regimes, which we infer to be related to two overlapping spatially correlated structures. In both rocks and for all orders q, we observe linear relationships between logarithmic structure functions of successive orders at all lags (a phenomenon that is typically known as extended power scaling, or extended self-similarity). The scaling behavior of Lagrangian velocities is compared with the one exhibited by porosity and specific surface area, which constitute two key pore-scale geometric observables. The statistical scaling of the local velocity field reflects the behavior of these geometric observables, with the occurrence of power-law-scaling regimes within the same range of lags for sample structure functions of Lagrangian velocity, porosity, and specific surface area.
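
    A minimal sketch of qth-order structure functions and an extended self-similarity check on a synthetic one-dimensional signal (the real analysis uses Lagrangian velocities sampled along trajectories in imaged pore space).

```python
# Sketch: qth-order structure functions S_q(lag) and an ESS check between orders.
import numpy as np

def structure_functions(v, lags, orders):
    """S_q(lag) = mean(|v(x + lag) - v(x)|**q) along the sampling direction."""
    out = {}
    for q in orders:
        out[q] = np.array([np.mean(np.abs(v[lag:] - v[:-lag]) ** q) for lag in lags])
    return out

rng = np.random.default_rng(2)
v = np.cumsum(rng.normal(size=5000))        # toy correlated "velocity" signal
lags = np.arange(1, 200)
S = structure_functions(v, lags, orders=(1, 2, 3))

# Extended self-similarity: log S_3 vs log S_2 should be close to linear.
slope, intercept = np.polyfit(np.log(S[2]), np.log(S[3]), 1)
print(f"ESS slope of log S3 vs log S2: {slope:.2f}")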

  6. An efficient coding theory for a dynamic trajectory predicts non-uniform allocation of entorhinal grid cells to modules.

    PubMed

    Mosheiff, Noga; Agmon, Haggai; Moriel, Avraham; Burak, Yoram

    2017-06-01

    Grid cells in the entorhinal cortex encode the position of an animal in its environment with spatially periodic tuning curves with different periodicities. Recent experiments established that these cells are functionally organized in discrete modules with uniform grid spacing. Here we develop a theory for efficient coding of position, which takes into account the temporal statistics of the animal's motion. The theory predicts a sharp decrease of module population sizes with grid spacing, in agreement with the trend seen in the experimental data. We identify a simple scheme for readout of the grid cell code by neural circuitry, that can match in accuracy the optimal Bayesian decoder. This readout scheme requires persistence over different timescales, depending on the grid cell module. Thus, we propose that the brain may employ an efficient representation of position which takes advantage of the spatiotemporal statistics of the encoded variable, in similarity to the principles that govern early sensory processing.

  7. EEG Sleep Stages Classification Based on Time Domain Features and Structural Graph Similarity.

    PubMed

    Diykh, Mohammed; Li, Yan; Wen, Peng

    2016-11-01

    The electroencephalogram (EEG) signals are commonly used in diagnosing and treating sleep disorders. Many existing methods for sleep stages classification mainly depend on the analysis of EEG signals in time or frequency domain to obtain a high classification accuracy. In this paper, the statistical features in time domain, the structural graph similarity and the K-means (SGSKM) are combined to identify six sleep stages using single channel EEG signals. Firstly, each EEG segment is partitioned into sub-segments. The size of a sub-segment is determined empirically. Secondly, statistical features are extracted, sorted into different sets of features and forwarded to the SGSKM to classify EEG sleep stages. We have also investigated the relationships between sleep stages and the time domain features of the EEG data used in this paper. The experimental results show that the proposed method yields better classification results than other four existing methods and the support vector machine (SVM) classifier. A 95.93% average classification accuracy is achieved by using the proposed method.
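
    A stripped-down sketch of the feature-extraction-plus-clustering step, assuming scikit-learn is available; the structural graph similarity component of SGSKM is omitted here.

```python
# Sketch: time-domain statistical features per EEG sub-segment, then K-means.
import numpy as np
from scipy import stats
from sklearn.cluster import KMeans

def time_domain_features(segment, n_sub=10):
    """Split an epoch into sub-segments and compute simple statistics for each."""
    subs = np.array_split(segment, n_sub)
    feats = []
    for s in subs:
        feats.extend([s.mean(), s.std(), stats.skew(s), stats.kurtosis(s),
                      np.max(s), np.min(s)])
    return np.array(feats)

rng = np.random.default_rng(3)
epochs = rng.normal(size=(120, 3000))               # 120 toy 30-s EEG epochs (synthetic)
X = np.vstack([time_domain_features(e) for e in epochs])
labels = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(X)
print(np.bincount(labels))                           # epochs assigned to each putative stage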

  8. An efficient coding theory for a dynamic trajectory predicts non-uniform allocation of entorhinal grid cells to modules

    PubMed Central

    Mosheiff, Noga; Agmon, Haggai; Moriel, Avraham

    2017-01-01

    Grid cells in the entorhinal cortex encode the position of an animal in its environment with spatially periodic tuning curves with different periodicities. Recent experiments established that these cells are functionally organized in discrete modules with uniform grid spacing. Here we develop a theory for efficient coding of position, which takes into account the temporal statistics of the animal’s motion. The theory predicts a sharp decrease of module population sizes with grid spacing, in agreement with the trend seen in the experimental data. We identify a simple scheme for readout of the grid cell code by neural circuitry, that can match in accuracy the optimal Bayesian decoder. This readout scheme requires persistence over different timescales, depending on the grid cell module. Thus, we propose that the brain may employ an efficient representation of position which takes advantage of the spatiotemporal statistics of the encoded variable, in similarity to the principles that govern early sensory processing. PMID:28628647

  9. Moving line model and avalanche statistics of Bingham fluid flow in porous media.

    PubMed

    Chevalier, Thibaud; Talon, Laurent

    2015-07-01

    In this article, we propose a simple model to understand the critical behavior of path opening during flow of a yield stress fluid in porous media as numerically observed by Chevalier and Talon (2015). This model can be mapped to the problem of a contact line moving in an heterogeneous field. Close to the critical point, this line presents an avalanche dynamic where the front advances by a succession of waiting time and large burst events. These burst events are then related to the non-flowing (i.e. unyielded) areas. Remarkably, the statistics of these areas reproduce the same properties as in the direct numerical simulations. Furthermore, even if our exponents seem to be close to the mean field universal exponents, we report an unusual bump in the distribution which depends on the disorder. Finally, we identify a scaling invariance of the cluster spatial shape that is well fit, to first order, by a self-affine parabola.

  10. How log-normal is your country? An analysis of the statistical distribution of the exported volumes of products

    NASA Astrophysics Data System (ADS)

    Annunziata, Mario Alberto; Petri, Alberto; Pontuale, Giorgio; Zaccaria, Andrea

    2016-10-01

    We have considered the statistical distributions of the volumes of 1131 products exported by 148 countries. We have found that the form of these distributions is not unique but heavily depends on the level of development of the nation, as expressed by macroeconomic indicators like GDP, GDP per capita, total export and a recently introduced measure for countries' economic complexity called fitness. We have identified three major classes: a) an incomplete log-normal shape, truncated on the left side, for the less developed countries, b) a complete log-normal, with a wider range of volumes, for nations characterized by intermediate economy, and c) a strongly asymmetric shape for countries with a high degree of development. Finally, the log-normality hypothesis has been checked for the distributions of all the 148 countries through different tests, Kolmogorov-Smirnov and Cramér-Von Mises, confirming that it cannot be rejected only for the countries of intermediate economy.
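
    Checking log-normality with the two tests mentioned might look like the following sketch on synthetic volumes (the distribution parameters are fitted from the data, so the resulting p-values are only approximate).

```python
# Sketch: Kolmogorov-Smirnov and Cramér-von Mises tests of log-normality.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
volumes = rng.lognormal(mean=10.0, sigma=2.0, size=1131)   # toy export volumes

log_v = np.log(volumes)
mu, sigma = log_v.mean(), log_v.std(ddof=1)                 # fit normal to log-volumes

print(stats.kstest(log_v, "norm", args=(mu, sigma)))
print(stats.cramervonmises(log_v, "norm", args=(mu, sigma)))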

  11. One Hundred Ways to be Non-Fickian - A Rigorous Multi-Variate Statistical Analysis of Pore-Scale Transport

    NASA Astrophysics Data System (ADS)

    Most, Sebastian; Nowak, Wolfgang; Bijeljic, Branko

    2015-04-01

    Fickian transport in groundwater flow is the exception rather than the rule. Transport in porous media is frequently simulated via particle methods (i.e. particle tracking random walk (PTRW) or continuous time random walk (CTRW)). These methods formulate transport as a stochastic process of particle position increments. At the pore scale, geometry and micro-heterogeneities prohibit the commonly made assumption of independent and normally distributed increments to represent dispersion. Many recent particle methods seek to loosen this assumption. Hence, it is important to get a better understanding of the processes at pore scale. For our analysis we track the positions of 10,000 particles migrating through the pore space over time. The data we use come from micro-CT scans of a homogeneous sandstone and encompass about 10 grain sizes. Based on those images we discretize the pore structure and simulate flow at the pore scale based on the Navier-Stokes equation. This flow field realistically describes flow inside the pore space and we do not need to add artificial dispersion during the transport simulation. Next, we use particle tracking random walk and simulate pore-scale transport. Finally, we use the obtained particle trajectories to do a multivariate statistical analysis of the particle motion at the pore scale. Our analysis is based on copulas. Every multivariate joint distribution is a combination of its univariate marginal distributions. The copula represents the dependence structure of those univariate marginals and is therefore useful to observe correlation and non-Gaussian interactions (i.e. non-Fickian transport). The first goal of this analysis is to better understand the validity regions of commonly made assumptions. We are investigating three different transport distances: 1) The distance where the statistical dependence between particle increments can be modelled as an order-one Markov process. This would be the Markovian distance for the process, where the validity of yet-unexplored non-Gaussian-but-Markovian random walks starts. 2) The distance where bivariate statistical dependence simplifies to a multi-Gaussian dependence based on simple linear correlation (validity of correlated PTRW/CTRW). 3) The distance of complete statistical independence (validity of classical PTRW/CTRW). The second objective is to reveal characteristic dependencies influencing transport the most. Those dependencies can be very complex. Copulas are highly capable of representing linear dependence as well as non-linear dependence. With that tool we are able to detect persistent characteristics dominating transport even across different scales. The results derived from our experimental data set suggest that there are many more non-Fickian aspects of pore-scale transport than the univariate statistics of longitudinal displacements. Non-Fickianity can also be found in transverse displacements, and in the relations between increments at different time steps. Also, the found dependence is non-linear (i.e. beyond simple correlation) and persists over long distances. Thus, our results strongly support the further refinement of techniques like correlated PTRW or correlated CTRW towards non-linear statistical relations.
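
    A toy sketch of the copula-style idea on synthetic trajectories: transform successive displacement increments into rank-based pseudo-observations and track how their dependence decays with lag (the real study uses velocity fields computed from the Navier-Stokes equation in imaged pore space).

```python
# Sketch: rank-based (empirical-copula) dependence between increments at different lags.
import numpy as np
from scipy import stats

def increment_dependence(positions, lag):
    """Spearman rank correlation between increments separated by `lag` steps."""
    inc = np.diff(positions, axis=1)               # per-particle step increments
    x, y = inc[:, :-lag].ravel(), inc[:, lag:].ravel()
    u = stats.rankdata(x) / (x.size + 1)           # pseudo-observations in (0, 1)
    v = stats.rankdata(y) / (y.size + 1)
    return stats.spearmanr(u, v)[0]

rng = np.random.default_rng(5)
steps = rng.standard_t(df=3, size=(1000, 200))     # heavy-tailed toy increments
steps[:, 1:] += 0.5 * steps[:, :-1]                # impose short-range dependence
traj = np.cumsum(steps, axis=1)

for lag in (1, 5, 20):
    print(lag, increment_dependence(traj, lag))    # dependence should decay with lag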

  12. A Stochastic Model of Space-Time Variability of Mesoscale Rainfall: Statistics of Spatial Averages

    NASA Technical Reports Server (NTRS)

    Kundu, Prasun K.; Bell, Thomas L.

    2003-01-01

    A characteristic feature of rainfall statistics is that they depend on the space and time scales over which rain data are averaged. A previously developed spectral model of rain statistics that is designed to capture this property predicts power law scaling behavior for the second moment statistics of area-averaged rain rate on the averaging length scale L as L → 0. In the present work a more efficient method of estimating the model parameters is presented, and used to fit the model to the statistics of area-averaged rain rate derived from gridded radar precipitation data from TOGA COARE. Statistical properties of the data and the model predictions are compared over a wide range of averaging scales. An extension of the spectral model scaling relations to describe the dependence of the average fraction of grid boxes within an area containing nonzero rain (the "rainy area fraction") on the grid scale L is also explored.

  13. Estimating the proportion of true null hypotheses when the statistics are discrete.

    PubMed

    Dialsingh, Isaac; Austin, Stefanie R; Altman, Naomi S

    2015-07-15

    In high-dimensional testing problems, π0, the proportion of null hypotheses that are true, is an important parameter. For discrete test statistics, the P values come from a discrete distribution with finite support and the null distribution may depend on an ancillary statistic, such as a table margin, that varies among the test statistics. Methods for estimating π0 developed for continuous test statistics, which depend on a uniform or identical null distribution of P values, may not perform well when applied to discrete testing problems. This article introduces a number of π0 estimators, the regression and 'T' methods, that perform well with discrete test statistics and also assesses how well methods developed for or adapted from continuous tests perform with discrete tests. We demonstrate the usefulness of these estimators in the analysis of high-throughput biological RNA-seq and single-nucleotide polymorphism data. The methods are implemented in R. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
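
    For context, a simple Storey-type estimator of π0 is sketched below on synthetic p-values; with discrete test statistics this kind of estimator can be biased, which is the situation the article's regression and 'T' methods are designed to handle.

```python
# Sketch: Storey-type estimate of pi0 (proportion of true null hypotheses).
import numpy as np

def storey_pi0(pvals, lam=0.5):
    """Assume p-values above `lam` come mostly from true nulls (uniform on [0, 1])."""
    pvals = np.asarray(pvals)
    return min(1.0, np.mean(pvals > lam) / (1.0 - lam))

rng = np.random.default_rng(6)
null_p = rng.uniform(size=800)                        # true nulls
alt_p = rng.beta(0.3, 4.0, size=200)                  # alternatives, enriched near zero
print(storey_pi0(np.concatenate([null_p, alt_p])))    # roughly 0.8 expected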

  14. Adverse mental health effects of cannabis use in two indigenous communities in Arnhem Land, Northern Territory, Australia: exploratory study.

    PubMed

    Clough, Alan R; d'Abbs, Peter; Cairney, Sheree; Gray, Dennis; Maruff, Paul; Parker, Robert; O'Reilly, Bridie

    2005-07-01

    We investigated adverse mental health effects and their associations with levels of cannabis use among indigenous Australian cannabis users in remote communities in the Northern Territory. Local indigenous health workers and key informants assisted in developing 28 criteria describing mental health symptoms. Five symptom clusters were identified using cluster analysis of data compiled from interviews with 103 cannabis users. Agreement was assessed (method comparison approach, kappa-statistic) with a clinician's classification of the 28 criteria into five groups labelled: 'anxiety', 'dependency', 'mood', 'vegetative' and 'psychosis'. Participants were described as showing 'anxiety', 'dependency' etc., if they reported half or more of the symptoms comprising the cluster. Associations between participants' self-reported cannabis use and each symptom cluster were assessed (logistic regression adjusting for age, sex, other substance use). Agreement between two classifications of 28 criteria into five groups was 'moderate' (64%, kappa = 0.55, p < 0.001). When five clusters were combined into three, 'anxiety-dependency', 'mood-vegetative' and 'psychosis', agreement rose to 71% (kappa = 0.56, p < 0.001). 'Anxiety-dependency' was positively associated with number of 'cones' usually smoked per week and this remained significant when adjusted for confounders (p = 0.020) and tended to remain significant in those who had never sniffed petrol (p = 0.052). Users of more than five cones per week were more likely to display 'anxiety-dependency' symptoms than those who used one cone per week (OR = 15.8, 1.8-141.2, p = 0.013). A crude association between the 'mood-vegetative' symptom cluster and number of cones usually smoked per week (p = 0.014) also remained statistically significant when adjusted for confounders (p = 0.012) but was modified by interactions with petrol sniffing (p = 0.116) and alcohol use (p = 0.276). There were no associations between cannabis use and 'psychosis'. Risks for 'anxiety-dependency' symptoms in cannabis users increased as their level of use increased. Other plausible mental health effects of cannabis in this population of comparatively new users were probably masked by alcohol use and a history of petrol sniffing.
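
    A small illustration of the agreement measure used: Cohen's kappa between two classifications of 28 criteria into five groups (the label assignments below are invented for illustration, not the study's data; scikit-learn is assumed).

```python
# Sketch: Cohen's kappa for agreement between two classifications of 28 criteria.
from sklearn.metrics import cohen_kappa_score

clinician = (["anxiety"] * 6 + ["dependency"] * 6 + ["mood"] * 6 +
             ["vegetative"] * 5 + ["psychosis"] * 5)          # hypothetical labels
cluster = (["anxiety"] * 5 + ["dependency"] * 7 + ["mood"] * 5 +
           ["vegetative"] * 6 + ["psychosis"] * 5)            # hypothetical labels
print(cohen_kappa_score(clinician, cluster))                   # chance-corrected agreement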

  15. Co-occurrence statistics as a language-dependent cue for speech segmentation.

    PubMed

    Saksida, Amanda; Langus, Alan; Nespor, Marina

    2017-05-01

    To what extent can language acquisition be explained in terms of different associative learning mechanisms? It has been hypothesized that distributional regularities in spoken languages are strong enough to elicit statistical learning about dependencies among speech units. Distributional regularities could be a useful cue for word learning even without rich language-specific knowledge. However, it is not clear how strong and reliable the distributional cues are that humans might use to segment speech. We investigate cross-linguistic viability of different statistical learning strategies by analyzing child-directed speech corpora from nine languages and by modeling possible statistics-based speech segmentations. We show that languages vary as to which statistical segmentation strategies are most successful. The variability of the results can be partially explained by systematic differences between languages, such as rhythmical differences. The results confirm previous findings that different statistical learning strategies are successful in different languages and suggest that infants may have to primarily rely on non-statistical cues when they begin their process of speech segmentation. © 2016 John Wiley & Sons Ltd.
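
    A minimal sketch of one such statistic, forward transitional probability, used to place word boundaries at local dips in a toy syllable stream (not the authors' corpus pipeline).

```python
# Sketch: segment a syllable stream at local minima of forward transitional probability.
from collections import Counter

def tp_segment(syllables):
    pair = Counter(zip(syllables, syllables[1:]))
    uni = Counter(syllables)
    # TP(a -> b) = count(a, b) / count(a), one value per adjacent syllable pair
    tp = [pair[(a, b)] / uni[a] for a, b in zip(syllables, syllables[1:])]
    words, start = [], 0
    for i in range(1, len(tp) - 1):
        if tp[i] < tp[i - 1] and tp[i] < tp[i + 1]:   # local TP dip => word boundary
            words.append(syllables[start:i + 1])
            start = i + 1
    words.append(syllables[start:])
    return words

stream = ["ba", "bi", "bu", "go", "la", "tu", "ba", "bi", "bu", "go", "la", "tu",
          "da", "ro", "pi", "ba", "bi", "bu"]
print(tp_segment(stream))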

  16. Statistical properties of excited nuclei in the mass range 47 ≤ A ≤ 59

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhuravlev, B. V., E-mail: zhurav@ippe.ru; Lychagin, A. A., E-mail: Lychagin1@yandex.ru; Titarenko, N. N.

    Level densities and their energy dependences for nuclei in the mass range of 47 ≤ A ≤ 59 were determined from the results obtained by measuring neutron-evaporation spectra in respective (p, n) reactions. The spectra of neutrons originating from the (p, n) reactions on ⁴⁷Ti, ⁴⁸Ti, ⁴⁹Ti, ⁵³Cr, ⁵⁴Cr, ⁵⁷Fe, and ⁵⁹Co nuclei were measured in the proton-energy range of 7-11 MeV. These measurements were performed with the aid of a fast-neutron spectrometer by the time-of-flight method over the base of the EGP-15 pulsed tandem accelerator installed at the Institute for Physics and Power Engineering (Obninsk, Russia). A high resolution of the spectrometer and its stability in the time of flight made it possible to identify reliably discrete low-lying levels along with the continuum part of neutron spectra. Our measured data were analyzed within the statistical equilibrium and preequilibrium models of nuclear reactions. The respective calculations were performed with the aid of the Hauser-Feshbach formalism of statistical theory supplemented with the generalized model of a superfluid nucleus, the back-shifted Fermi gas model, and the Gilbert-Cameron composite formula for nuclear level densities. Nuclear level densities for ⁴⁷V, ⁴⁸V, ⁴⁹V, ⁵³Mn, ⁵⁴Mn, ⁵⁷Co, and ⁵⁹Ni and their energy dependences were determined. The results are discussed and compared with available experimental data and with recommendations of model-based systematics.

  17. Using aggregated, de-identified electronic health record data for multivariate pharmacosurveillance: a case study of azathioprine.

    PubMed

    Patel, Vishal N; Kaelber, David C

    2014-12-01

    To demonstrate the use of aggregated and de-identified electronic health record (EHR) data for multivariate post-marketing pharmacosurveillance in a case study of azathioprine (AZA). Using aggregated, standardized, normalized, and de-identified, population-level data from the Explore platform (Explorys, Inc.) we searched over 10 million individuals, of which 14,580 were prescribed AZA based on RxNorm drug orders. Based on logical observation identifiers names and codes (LOINC) and vital sign data, we examined the following side effects: anemia, cell lysis, fever, hepatotoxicity, hypertension, nephrotoxicity, neutropenia, and neutrophilia. Patients prescribed AZA were compared to patients prescribed one of 11 other anti-rheumatologic drugs to determine the relative risk of side effect pairs. Compared to AZA case report trends, hepatotoxicity (marked by elevated transaminases or elevated bilirubin) did not occur as an isolated event more frequently in patients prescribed AZA than other anti-rheumatic agents. While neutropenia occurred in 24% of patients (RR 1.15, 95% CI 1.07-1.23), neutrophilia was also frequent (45%) and increased in patients prescribed AZA (RR 1.28, 95% CI 1.22-1.34). After constructing a pairwise side effect network, neutropenia had no dependencies. A reduced risk of neutropenia was found in patients with co-existing elevations in total bilirubin or liver transaminases, supporting classic clinical knowledge that agranulocytosis is a largely unpredictable phenomenon. Rounding errors propagated in the statistically de-identified datasets for cohorts as small as 40 patients only contributed marginally to the calculated risk. Our work demonstrates that aggregated, standardized, normalized and de-identified population level EHR data can provide both sufficient insight and statistical power to detect potential patterns of medication side effect associations, serving as a multivariate and generalizable approach to post-marketing drug surveillance. Copyright © 2013 Elsevier Inc. All rights reserved.
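
    A back-of-the-envelope sketch of the relative-risk comparison with a log-scale confidence interval; the counts below are invented for illustration, not taken from the Explorys data.

```python
# Sketch: relative risk with a 95% CI from aggregated exposed/comparator counts.
import math

def relative_risk(a, n1, b, n2):
    """a events among n1 exposed patients, b events among n2 comparator patients."""
    rr = (a / n1) / (b / n2)
    se = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)     # standard error of log(RR)
    lo, hi = (math.exp(math.log(rr) + z * se) for z in (-1.96, 1.96))
    return rr, lo, hi

# Hypothetical counts, e.g. a side effect in 3500/14580 exposed vs. 21000/100000 comparators
print(relative_risk(3500, 14580, 21000, 100000))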

  18. Needle Acupuncture for Substance Use Disorders: A Systematic Review

    DTIC Science & Technology

    2015-01-01

    … (RCTs). We did identify statistically significant, clinically medium effects in favor of acupuncture (as an adjunctive or monotherapy) versus any comparator at postintervention.

  19. DEIVA: a web application for interactive visual analysis of differential gene expression profiles.

    PubMed

    Harshbarger, Jayson; Kratz, Anton; Carninci, Piero

    2017-01-07

    Differential gene expression (DGE) analysis is a technique to identify statistically significant differences in RNA abundance for genes or arbitrary features between different biological states. The result of a DGE test is typically further analyzed using statistical software, spreadsheets or custom ad hoc algorithms. We identified a need for a web-based system to share DGE statistical test results, and locate and identify genes in DGE statistical test results with a very low barrier of entry. We have developed DEIVA, a free and open source, browser-based single page application (SPA) with a strong emphasis on being user friendly that enables locating and identifying single or multiple genes in an immediate, interactive, and intuitive manner. By design, DEIVA scales with very large numbers of users and datasets. Compared to existing software, DEIVA offers a unique combination of design decisions that enable inspection and analysis of DGE statistical test results with an emphasis on ease of use.

  20. Genetic overlap between Alzheimer’s disease and Parkinson’s disease at the MAPT locus

    PubMed Central

    Desikan, Rahul S.; Schork, Andrew J.; Wang, Yunpeng; Witoelar, Aree; Sharma, Manu; McEvoy, Linda K.; Holland, Dominic; Brewer, James B.; Chen, Chi-Hua; Thompson, Wesley K.; Harold, Denise; Williams, Julie; Owen, Michael J.; O’Donovan, Michael C.; Pericak-Vance, Margaret A.; Mayeux, Richard; Haines, Jonathan L.; Farrer, Lindsay A.; Schellenberg, Gerard D.; Heutink, Peter; Singleton, Andrew B.; Brice, Alexis; Wood, Nicolas W.; Hardy, John; Martinez, Maria; Choi, Seung Hoi; DeStefano, Anita; Ikram, M. Arfan; Bis, Joshua C.; Smith, Albert; Fitzpatrick, Annette L.; Launer, Lenore; van Duijn, Cornelia; Seshadri, Sudha; Ulstein, Ingun Dina; Aarsland, Dag; Fladby, Tormod; Djurovic, Srdjan; Hyman, Bradley T.; Snaedal, Jon; Stefansson, Hreinn; Stefansson, Kari; Gasser, Thomas; Andreassen, Ole A.; Dale, Anders M.

    2015-01-01

    We investigated genetic overlap between Alzheimer’s disease (AD) and Parkinson’s disease (PD). Using summary statistics (p-values) from large recent genomewide association studies (GWAS) (total n = 89,904 individuals), we sought to identify single nucleotide polymorphisms (SNPs) associating with both AD and PD. We found and replicated association of both AD and PD with the A allele of rs393152 within the extended MAPT region on chromosome 17 (meta analysis p-value across 5 independent AD cohorts = 1.65 × 10−7). In independent datasets, we found a dose-dependent effect of the A allele of rs393152 on intra-cerebral MAPT transcript levels and volume loss within the entorhinal cortex and hippocampus. Our findings identify the tau-associated MAPT locus as a site of genetic overlap between AD and PD and extending prior work, we show that the MAPT region increases risk of Alzheimer’s neurodegeneration. PMID:25687773

  1. Two statistics for evaluating parameter identifiability and error reduction

    USGS Publications Warehouse

    Doherty, John; Hunt, Randall J.

    2009-01-01

    Two statistics are presented that can be used to rank input parameters utilized by a model in terms of their relative identifiability based on a given or possible future calibration dataset. Identifiability is defined here as the capability of model calibration to constrain parameters used by a model. Both statistics require that the sensitivity of each model parameter be calculated for each model output for which there are actual or presumed field measurements. Singular value decomposition (SVD) of the weighted sensitivity matrix is then undertaken to quantify the relation between the parameters and observations that, in turn, allows selection of calibration solution and null spaces spanned by unit orthogonal vectors. The first statistic presented, "parameter identifiability", is quantitatively defined as the direction cosine between a parameter and its projection onto the calibration solution space. This varies between zero and one, with zero indicating complete non-identifiability and one indicating complete identifiability. The second statistic, "relative error reduction", indicates the extent to which the calibration process reduces error in estimation of a parameter from its pre-calibration level where its value must be assigned purely on the basis of prior expert knowledge. This is more sophisticated than identifiability, in that it takes greater account of the noise associated with the calibration dataset. Like identifiability, it has a maximum value of one (which can only be achieved if there is no measurement noise). Conceptually it can fall to zero; and even below zero if a calibration problem is poorly posed. An example, based on a coupled groundwater/surface-water model, is included that demonstrates the utility of the statistics. © 2009 Elsevier B.V.
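
    A compact sketch of the first statistic under simplifying assumptions: weight the sensitivity (Jacobian) matrix, take its SVD, and compute each parameter's direction cosine onto the leading right singular vectors that span the calibration solution space.

```python
# Sketch: parameter identifiability as a direction cosine onto the SVD solution space.
import numpy as np

def identifiability(jacobian, weights, n_solution_vectors):
    J = np.sqrt(weights)[:, None] * jacobian           # weighted sensitivities
    _, _, vt = np.linalg.svd(J, full_matrices=False)
    V1 = vt[:n_solution_vectors].T                     # solution-space basis (params x k)
    return np.sqrt(np.sum(V1 ** 2, axis=1))            # one value per parameter, in [0, 1]

rng = np.random.default_rng(7)
J = rng.normal(size=(30, 6))                           # 30 observations x 6 parameters (toy)
J[:, 5] = 0.0                                          # a parameter with no sensitivity
w = np.ones(30)                                        # observation weights
print(identifiability(J, w, n_solution_vectors=4))     # last entry should be near zero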

  2. Crossover between the Gaussian orthogonal ensemble, the Gaussian unitary ensemble, and Poissonian statistics.

    PubMed

    Schweiner, Frank; Laturner, Jeanine; Main, Jörg; Wunner, Günter

    2017-11-01

    Until now, analytical formulas for the level spacing distribution function have been derived within random matrix theory only for specific crossovers between Poissonian statistics (P), the statistics of a Gaussian orthogonal ensemble (GOE), and the statistics of a Gaussian unitary ensemble (GUE). We investigate arbitrary crossovers in the triangle between all three statistics. To this aim we propose a corresponding formula for the level spacing distribution function depending on two parameters. Comparing the behavior of our formula for the special cases of P→GUE, P→GOE, and GOE→GUE with the results from random matrix theory, we prove that these crossovers are described reasonably. Recent investigations by F. Schweiner et al. [Phys. Rev. E 95, 062205 (2017)] have shown that the Hamiltonian of magnetoexcitons in cubic semiconductors can exhibit all three statistics depending on the system parameters. Evaluating the numerical results for magnetoexcitons in dependence on the excitation energy and on a parameter connected with the cubic valence band structure, and comparing the results with the proposed formula, allows us to distinguish between regular and chaotic behavior as well as between existent or broken antiunitary symmetries. Increasing one of the two parameters, transitions between different crossovers, e.g., from the P→GOE to the P→GUE crossover, are observed and discussed.

  3. Statistical Data Analyses of Trace Chemical, Biochemical, and Physical Analytical Signatures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Udey, Ruth Norma

    Analytical and bioanalytical chemistry measurement results are most meaningful when interpreted using rigorous statistical treatments of the data. The same data set may provide many dimensions of information depending on the questions asked through the applied statistical methods. Three principal projects illustrated the wealth of information gained through the application of statistical data analyses to diverse problems.

  4. Normality Tests for Statistical Analysis: A Guide for Non-Statisticians

    PubMed Central

    Ghasemi, Asghar; Zahediasl, Saleh

    2012-01-01

    Statistical errors are common in scientific literature and about 50% of the published articles have at least one error. The assumption of normality needs to be checked for many statistical procedures, namely parametric tests, because their validity depends on it. The aim of this commentary is to overview checking for normality in statistical analysis using SPSS. PMID:23843808
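
    For readers working outside SPSS, equivalent checks are available in scipy; a minimal sketch:

```python
# Sketch: common normality checks on a synthetic sample.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
sample = rng.normal(loc=100, scale=15, size=80)

print(stats.shapiro(sample))                          # Shapiro-Wilk test
print(stats.kstest(stats.zscore(sample), "norm"))     # Kolmogorov-Smirnov on z-scores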

  5. Dealing with missing standard deviation and mean values in meta-analysis of continuous outcomes: a systematic review.

    PubMed

    Weir, Christopher J; Butcher, Isabella; Assi, Valentina; Lewis, Stephanie C; Murray, Gordon D; Langhorne, Peter; Brady, Marian C

    2018-03-07

    Rigorous, informative meta-analyses rely on availability of appropriate summary statistics or individual participant data. For continuous outcomes, especially those with naturally skewed distributions, summary information on the mean or variability often goes unreported. While full reporting of original trial data is the ideal, we sought to identify methods for handling unreported mean or variability summary statistics in meta-analysis. We undertook two systematic literature reviews to identify methodological approaches used to deal with missing mean or variability summary statistics. Five electronic databases were searched, in addition to the Cochrane Colloquium abstract books and the Cochrane Statistics Methods Group mailing list archive. We also conducted cited reference searching and emailed topic experts to identify recent methodological developments. Details recorded included the description of the method, the information required to implement the method, any underlying assumptions and whether the method could be readily applied in standard statistical software. We provided a summary description of the methods identified, illustrating selected methods in example meta-analysis scenarios. For missing standard deviations (SDs), following screening of 503 articles, fifteen methods were identified in addition to those reported in a previous review. These included Bayesian hierarchical modelling at the meta-analysis level; summary statistic level imputation based on observed SD values from other trials in the meta-analysis; a practical approximation based on the range; and algebraic estimation of the SD based on other summary statistics. Following screening of 1124 articles for methods estimating the mean, one approximate Bayesian computation approach and three papers based on alternative summary statistics were identified. Illustrative meta-analyses showed that when replacing a missing SD the approximation using the range minimised loss of precision and generally performed better than omitting trials. When estimating missing means, a formula using the median, lower quartile and upper quartile performed best in preserving the precision of the meta-analysis findings, although in some scenarios, omitting trials gave superior results. Methods based on summary statistics (minimum, maximum, lower quartile, upper quartile, median) reported in the literature facilitate more comprehensive inclusion of randomised controlled trials with missing mean or variability summary statistics within meta-analyses.
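
    Two of the simpler approximations mentioned can be sketched directly; the exact constants differ between published methods and depend on sample size, so treat these as illustrative rather than definitive.

```python
# Sketch: recover a missing SD from the range, and a missing mean from quartiles.
def sd_from_range(minimum, maximum):
    """Rough SD approximation: range divided by 4."""
    return (maximum - minimum) / 4.0

def mean_from_quartiles(q1, median, q3):
    """Approximate mean from lower quartile, median and upper quartile."""
    return (q1 + median + q3) / 3.0

print(sd_from_range(2.0, 18.0))              # -> 4.0
print(mean_from_quartiles(5.0, 8.0, 12.0))   # -> 8.33...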

  6. Hypothesis Testing Using Spatially Dependent Heavy Tailed Multisensor Data

    DTIC Science & Technology

    2014-12-01

    Surrogate data consistent with the null hypothesis of linearity can be used to estimate the distribution of a test statistic that discriminates between the null and alternative hypotheses. [Figure: test for nonlinearity; the histogram is generated using the surrogate data, and the statistic of the original time series is represented by the solid line.]

  7. Comment on “Two statistics for evaluating parameter identifiability and error reduction” by John Doherty and Randall J. Hunt

    USGS Publications Warehouse

    Hill, Mary C.

    2010-01-01

    Doherty and Hunt (2009) present important ideas for first-order-second moment sensitivity analysis, but five issues are discussed in this comment. First, considering the composite-scaled sensitivity (CSS) jointly with parameter correlation coefficients (PCC) in a CSS/PCC analysis addresses the difficulties with CSS mentioned in the introduction. Second, their new parameter identifiability statistic is actually likely to do a poor job of evaluating parameter identifiability in common situations. The statistic instead performs the very useful role of showing how model parameters are included in the estimated singular value decomposition (SVD) parameters. Its close relation to CSS is shown. Third, the idea from p. 125 that a suitable truncation point for SVD parameters can be identified using the prediction variance is challenged using results from Moore and Doherty (2005). Fourth, the relative error reduction statistic of Doherty and Hunt is shown to belong to an emerging set of statistics here named perturbed calculated variance statistics. Finally, the perturbed calculated variance statistics OPR and PPR mentioned on p. 121 are shown to explicitly include the parameter null-space component of uncertainty. Indeed, OPR and PPR results that account for null-space uncertainty have appeared in the literature since 2000.

  8. A method for screening active components from Chinese herbs by cell membrane chromatography-offline-high performance liquid chromatography/mass spectrometry and an online statistical tool for data processing.

    PubMed

    Cao, Yan; Wang, Shaozhan; Li, Yinghua; Chen, Xiaofei; Chen, Langdong; Wang, Dongyao; Zhu, Zhenyu; Yuan, Yongfang; Lv, Diya

    2018-03-09

    Cell membrane chromatography (CMC) has been successfully applied to screen bioactive compounds from Chinese herbs for many years, and some offline and online two-dimensional (2D) CMC-high performance liquid chromatography (HPLC) hyphenated systems have been established to perform screening assays. However, the requirement of sample preparation steps for the second-dimensional analysis in offline systems and the need for an interface device and technical expertise in the online system limit their extensive use. In the present study, an offline 2D CMC-HPLC analysis combined with the XCMS (various forms of chromatography coupled to mass spectrometry) Online statistical tool for data processing was established. First, our previously reported online 2D screening system was used to analyze three Chinese herbs that were reported to have potential anti-inflammatory effects, and two binding components were identified. By contrast, the proposed offline 2D screening method with XCMS Online analysis was applied, and three more ingredients were discovered in addition to the two compounds revealed by the online system. Then, cross-validation of the three compounds was performed, and they were confirmed to be included in the online data as well, but were not identified there because of their low concentrations and lack of credible statistical approaches. Last, pharmacological experiments showed that these five ingredients could inhibit IL-6 release and IL-6 gene expression on LPS-induced RAW cells in a dose-dependent manner. Compared with previous 2D CMC screening systems, this newly developed offline 2D method needs no sample preparation steps for the second-dimensional analysis, and it is sensitive, efficient, and convenient. It will be applicable in identifying active components from Chinese herbs and practical in discovery of lead compounds derived from herbs. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. Estimating the probability of rare events: addressing zero failure data.

    PubMed

    Quigley, John; Revie, Matthew

    2011-07-01

    Traditional statistical procedures for estimating the probability of an event result in an estimate of zero when no events are realized. Alternative inferential procedures have been proposed for the situation where zero events have been realized but often these are ad hoc, relying on selecting methods dependent on the data that have been realized. Such data-dependent inference decisions violate fundamental statistical principles, resulting in estimation procedures whose benefits are difficult to assess. In this article, we propose estimating the probability of an event occurring through minimax inference on the probability that future samples of equal size realize no more events than that in the data on which the inference is based. Although motivated by inference on rare events, the method is not restricted to zero event data and closely approximates the maximum likelihood estimate (MLE) for nonzero data. The use of the minimax procedure provides a risk-averse inferential procedure where there are no events realized. A comparison is made with the MLE and regions of the underlying probability are identified where this approach is superior. Moreover, a comparison is made with three standard approaches to supporting inference where no event data are realized, which we argue are unduly pessimistic. We show that for situations of zero events the estimator can be simply approximated with 1/(2.5n), where n is the number of trials. © 2011 Society for Risk Analysis.
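
    The closing approximation is easy to state in code; a small sketch comparing it with the MLE:

```python
# Sketch: the reported zero-event approximation versus the ordinary MLE.
def zero_event_estimate(n_trials):
    """Approximate minimax-style estimate when no events were observed."""
    return 1.0 / (2.5 * n_trials)

def mle(events, n_trials):
    return events / n_trials

print(zero_event_estimate(200))   # ~0.002, instead of the MLE of exactly 0
print(mle(3, 200))                # 0.015 when events are actually observed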

  10. Identifying factors which enhance capacity to engage in clinical education among podiatry practitioners: an action research project.

    PubMed

    Abey, Sally; Lea, Susan; Callaghan, Lynne; Shaw, Steve; Cotton, Debbie

    2015-01-01

    Health profession students develop practical skills whilst integrating theory with practice in a real world environment as an important component of their training. Research in the area of practice placements has identified challenges and barriers to the delivery of effective placement learning. However, there has been little research in podiatry and the question of which factors impact upon clinical educators' capacity to engage with the role remains an under-researched area. This paper presents the second phase of an action research project designed to determine the factors that impact upon clinical educators' capacity to engage with the mentorship role. An online survey was developed and podiatry clinical educators recruited through National Health Service (NHS) Trusts. The survey included socio-demographic items, and questions relating to the factors identified as possible variables influencing clinical educator capacity; the latter was assessed using the 'Clinical Educator Capacity to Engage' scale (CECE). Descriptive statistics were used to explore demographic data whilst the relationship between the CECE and socio-demographic factors were examined using inferential statistics in relation to academic profile, career profile and organisation of the placement. The survey response rate was 42 % (n = 66). Multiple linear regression identified four independent variables which explain a significant proportion of the variability of the dependent variable, 'capacity to engage with clinical education', with an adjusted R2 of 0.428. The four variables were: protected mentorship time, clinical educator relationship with university, sign-off responsibility, and volunteer status. The identification of factors that impact upon clinical educators' capacity to engage in mentoring of students has relevance for strategic planning and policy-making with the emphasis upon capacity-building at an individual level, so that the key attitudes and characteristics that are linked with good clinical supervision are preserved.

  11. Identification of an episignature of human colorectal cancer associated with obesity by genome-wide DNA methylation analysis.

    PubMed

    Crujeiras, Ana B; Morcillo, Sonsoles; Diaz-Lagares, Angel; Sandoval, Juan; Castellano-Castillo, Daniel; Torres, Esperanza; Hervas, David; Moran, Sebastian; Esteller, Manel; Macias-Gonzalez, Manuel; Casanueva, Felipe F; Tinahones, Francisco J

    2018-05-01

    Obesity was established as a relevant modifiable risk factor in the onset and progression of colorectal cancer (CRC). This relationship could be mediated by an epigenetic regulation. The current work aimed to explore the effects of excess body weight on the DNA methylation profile of CRC using a genome-wide DNA methylation approach and to identify an epigenetic signature of obesity-related CRC. Fifty-six CRC-diagnosed patients (50 years) were included in the study and categorized according to their body mass index (BMI) as non-obese (BMI ≤ 25 kg/m²) or overweight/obese (BMI > 25 kg/m²). Data from Infinium 450k array-based methylomes of 28 CRC tumor samples were coupled with information on BMI categories. Additionally, DNA methylation results were validated in 28 CRC tumor samples. The analysis revealed statistically significant differences at 299 CpG sites, and they were mostly characterized as changes towards CpG hypermethylation occurring in the obese group. The 152 identified genes were involved in inflammatory and metabolic functional processes. Among these genes, novel genes were identified as epigenetically regulated in CRC depending on adiposity. ZNF397OS and ZNF543 represented the top scoring associated events that were further validated in an independent cohort and exhibited strong correlation with BMI and excellent and statistically significant efficiency in the discrimination of obese from non-obese CRC patients (area under the curve >0.80; p < 0.05). The present study identifies a potential epigenome mark of obesity-related CRC that could be useful for precision medicine in the management of this disease taking into account adiposity as a relevant risk factor.

  12. Statistical power analysis of cardiovascular safety pharmacology studies in conscious rats.

    PubMed

    Bhatt, Siddhartha; Li, Dingzhou; Flynn, Declan; Wisialowski, Todd; Hemkens, Michelle; Steidl-Nichols, Jill

    2016-01-01

    Cardiovascular (CV) toxicity and related attrition are a major challenge for novel therapeutic entities and identifying CV liability early is critical for effective derisking. CV safety pharmacology studies in rats are a valuable tool for early investigation of CV risk. Thorough understanding of data analysis techniques and statistical power of these studies is currently lacking and is imperative for enabling sound decision-making. Data from 24 crossover and 12 parallel design CV telemetry rat studies were used for statistical power calculations. Average values of telemetry parameters (heart rate, blood pressure, body temperature, and activity) were logged every 60 s (from 1 h predose to 24 h post-dose) and reduced to 15-min mean values. These data were subsequently binned into super intervals for statistical analysis. A repeated measure analysis of variance was used for statistical analysis of crossover studies and a repeated measure analysis of covariance was used for parallel studies. Statistical power analysis was performed to generate power curves and establish relationships between detectable CV (blood pressure and heart rate) changes and statistical power. Additionally, data from a crossover CV study with phentolamine at 4, 20 and 100 mg/kg are reported as a representative example of data analysis methods. Phentolamine produced a CV profile characteristic of alpha adrenergic receptor antagonism, evidenced by a dose-dependent decrease in blood pressure and reflex tachycardia. Detectable blood pressure changes at 80% statistical power for crossover studies (n=8) were 4-5 mmHg. For parallel studies (n=8), detectable changes at 80% power were 6-7 mmHg. Detectable heart rate changes for both study designs were 20-22 bpm. Based on our results, the conscious rat CV model is a sensitive tool to detect and mitigate CV risk in early safety studies. Furthermore, these results will enable informed selection of appropriate models and study design for early stage CV studies. Copyright © 2016 Elsevier Inc. All rights reserved.
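
    A minimal sketch of the kind of power calculation described above, framed as a paired (crossover-like) versus independent-groups (parallel-like) comparison at n = 8. The residual standard deviation of 5 mmHg is an assumed value, and the study itself used repeated-measures ANOVA/ANCOVA rather than simple t-tests, so this is only an illustration of the design contrast.

```python
# Sketch: minimal detectable blood-pressure change at 80% power for n = 8.
# The 5 mmHg residual SD is an assumption, not a value taken from the paper.
from statsmodels.stats.power import TTestPower, TTestIndPower

sd_mmhg = 5.0   # assumed residual standard deviation of the BP endpoint
n = 8
alpha = 0.05

# Solve for the standardized effect size, then convert back to mmHg.
d_cross = TTestPower().solve_power(nobs=n, alpha=alpha, power=0.80)
d_par = TTestIndPower().solve_power(nobs1=n, ratio=1.0, alpha=alpha, power=0.80)

print(f"crossover-style design: ~{d_cross * sd_mmhg:.1f} mmHg detectable")
print(f"parallel-style design:  ~{d_par * sd_mmhg:.1f} mmHg detectable")
```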

  13. The Essential Genome of Escherichia coli K-12.

    PubMed

    Goodall, Emily C A; Robinson, Ashley; Johnston, Iain G; Jabbari, Sara; Turner, Keith A; Cunningham, Adam F; Lund, Peter A; Cole, Jeffrey A; Henderson, Ian R

    2018-02-20

    Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. IMPORTANCE Incentives to define lists of genes that are essential for bacterial survival include the identification of potential targets for antibacterial drug development, genes required for rapid growth for exploitation in biotechnology, and discovery of new biochemical pathways. To identify essential genes in Escherichia coli , we constructed a transposon mutant library of unprecedented density. Initial automated analysis of the resulting data revealed many discrepancies compared to the literature. We now report more extensive statistical analysis supported by both literature searches and detailed inspection of high-density TraDIS sequencing data for each putative essential gene for the E. coli model laboratory organism. This paper is important because it provides a better understanding of the essential genes of E. coli , reveals the limitations of relying on automated analysis alone, and provides a new standard for the analysis of TraDIS data. Copyright © 2018 Goodall et al.

  14. Hard, harder, hardest: principal stratification, statistical identifiability, and the inherent difficulty of finding surrogate endpoints.

    PubMed

    Wolfson, Julian; Henn, Lisa

    2014-01-01

    In many areas of clinical investigation there is great interest in identifying and validating surrogate endpoints, biomarkers that can be measured a relatively short time after a treatment has been administered and that can reliably predict the effect of treatment on the clinical outcome of interest. However, despite dramatic advances in the ability to measure biomarkers, the recent history of clinical research is littered with failed surrogates. In this paper, we present a statistical perspective on why identifying surrogate endpoints is so difficult. We view the problem from the framework of causal inference, with a particular focus on the technique of principal stratification (PS), an approach which is appealing because the resulting estimands are not biased by unmeasured confounding. In many settings, PS estimands are not statistically identifiable and their degree of non-identifiability can be thought of as representing the statistical difficulty of assessing the surrogate value of a biomarker. In this work, we examine the identifiability issue and present key simplifying assumptions and enhanced study designs that enable the partial or full identification of PS estimands. We also present example situations where these assumptions and designs may or may not be feasible, providing insight into the problem characteristics which make the statistical evaluation of surrogate endpoints so challenging.

  15. Hard, harder, hardest: principal stratification, statistical identifiability, and the inherent difficulty of finding surrogate endpoints

    PubMed Central

    2014-01-01

    In many areas of clinical investigation there is great interest in identifying and validating surrogate endpoints, biomarkers that can be measured a relatively short time after a treatment has been administered and that can reliably predict the effect of treatment on the clinical outcome of interest. However, despite dramatic advances in the ability to measure biomarkers, the recent history of clinical research is littered with failed surrogates. In this paper, we present a statistical perspective on why identifying surrogate endpoints is so difficult. We view the problem from the framework of causal inference, with a particular focus on the technique of principal stratification (PS), an approach which is appealing because the resulting estimands are not biased by unmeasured confounding. In many settings, PS estimands are not statistically identifiable and their degree of non-identifiability can be thought of as representing the statistical difficulty of assessing the surrogate value of a biomarker. In this work, we examine the identifiability issue and present key simplifying assumptions and enhanced study designs that enable the partial or full identification of PS estimands. We also present example situations where these assumptions and designs may or may not be feasible, providing insight into the problem characteristics which make the statistical evaluation of surrogate endpoints so challenging. PMID:25342953

  16. Structural Blockage: A Cross-national Study of Economic Dependency, State Efficacy, and Underdevelopment.

    ERIC Educational Resources Information Center

    Delacroix, Jacques; Ragin, Charles C.

    1981-01-01

    Presents a statistical analysis of dependency of developing nations on more highly developed and industrialized nations and relates this dependency to various degrees of economic development. The analysis is based on the structural blockage argument (one of several dependency arguments contained in many versions of dependency theory). Emphasizes…

  17. Compounding approach for univariate time series with nonstationary variances

    NASA Astrophysics Data System (ADS)

    Schäfer, Rudi; Barkhofen, Sonja; Guhr, Thomas; Stöckmann, Hans-Jürgen; Kuhl, Ulrich

    2015-12-01

    A defining feature of nonstationary systems is the time dependence of their statistical parameters. Measured time series may exhibit Gaussian statistics on short time horizons, due to the central limit theorem. The sample statistics for long time horizons, however, average over the time-dependent variances. To model the long-term statistical behavior, we compound the local distribution with the distribution of its parameters. Here, we consider two concrete, but diverse, examples of such nonstationary systems: the turbulent air flow of a fan and a time series of foreign exchange rates. Our main focus is to empirically determine the appropriate parameter distribution for the compounding approach. To this end, we extract the relevant time scales by decomposing the time signals into windows and determine the distribution function of the thus obtained local variances.
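
    A sketch of the windowing step described above: split a long series into windows, estimate the local variance in each window, and inspect the empirical distribution of those variances. The signal is synthetic and the window length is an arbitrary assumption standing in for the "relevant time scale" of the abstract.

```python
# Sketch of the compounding idea: local variances over windows of a
# synthetic nonstationary series. Window length is an assumed parameter.
import numpy as np

rng = np.random.default_rng(0)
# Synthetic series: locally Gaussian with a slowly varying standard deviation.
local_sigma = 1.0 + 0.5 * np.sin(np.linspace(0, 20, 100_000))
x = rng.normal(0.0, local_sigma)

window = 500
n_win = x.size // window
local_var = x[: n_win * window].reshape(n_win, window).var(axis=1, ddof=1)

print("mean of local variances:", local_var.mean())
print("spread of local variances:", local_var.var(ddof=1))
# A histogram of local_var approximates the parameter distribution that the
# compounding approach integrates the local (Gaussian) distribution against.
```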

  18. Compounding approach for univariate time series with nonstationary variances.

    PubMed

    Schäfer, Rudi; Barkhofen, Sonja; Guhr, Thomas; Stöckmann, Hans-Jürgen; Kuhl, Ulrich

    2015-12-01

    A defining feature of nonstationary systems is the time dependence of their statistical parameters. Measured time series may exhibit Gaussian statistics on short time horizons, due to the central limit theorem. The sample statistics for long time horizons, however, average over the time-dependent variances. To model the long-term statistical behavior, we compound the local distribution with the distribution of its parameters. Here, we consider two concrete, but diverse, examples of such nonstationary systems: the turbulent air flow of a fan and a time series of foreign exchange rates. Our main focus is to empirically determine the appropriate parameter distribution for the compounding approach. To this end, we extract the relevant time scales by decomposing the time signals into windows and determine the distribution function of the thus obtained local variances.

  19. Predation and fragmentation portrayed in the statistical structure of prey time series

    PubMed Central

    Hendrichsen, Ditte K; Topping, Chris J; Forchhammer, Mads C

    2009-01-01

    Background Statistical autoregressive analyses of direct and delayed density dependence are widespread in ecological research. The models suggest that changes in ecological factors affecting density dependence, like predation and landscape heterogeneity are directly portrayed in the first and second order autoregressive parameters, and the models are therefore used to decipher complex biological patterns. However, independent tests of model predictions are complicated by the inherent variability of natural populations, where differences in landscape structure, climate or species composition prevent controlled repeated analyses. To circumvent this problem, we applied second-order autoregressive time series analyses to data generated by a realistic agent-based computer model. The model simulated life history decisions of individual field voles under controlled variations in predator pressure and landscape fragmentation. Analyses were made on three levels: comparisons between predated and non-predated populations, between populations exposed to different types of predators and between populations experiencing different degrees of habitat fragmentation. Results The results are unambiguous: Changes in landscape fragmentation and the numerical response of predators are clearly portrayed in the statistical time series structure as predicted by the autoregressive model. Populations without predators displayed significantly stronger negative direct density dependence than did those exposed to predators, where direct density dependence was only moderately negative. The effects of predation versus no predation had an even stronger effect on the delayed density dependence of the simulated prey populations. In non-predated prey populations, the coefficients of delayed density dependence were distinctly positive, whereas they were negative in predated populations. Similarly, increasing the degree of fragmentation of optimal habitat available to the prey was accompanied with a shift in the delayed density dependence, from strongly negative to gradually becoming less negative. Conclusion We conclude that statistical second-order autoregressive time series analyses are capable of deciphering interactions within and across trophic levels and their effect on direct and delayed density dependence. PMID:19419539
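
    A minimal sketch of the second-order autoregressive analysis referred to above, where the lag-1 coefficient is read as direct density dependence and the lag-2 coefficient as delayed density dependence. The series is simulated with assumed coefficients, not vole data.

```python
# Sketch: AR(2) fit to a simulated (log) abundance series; the lag-1 and
# lag-2 coefficients stand for direct and delayed density dependence.
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

rng = np.random.default_rng(1)
n = 200
x = np.zeros(n)
for t in range(2, n):
    # assumed generating coefficients: 0.5 (lag 1) and -0.3 (lag 2)
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + rng.normal(scale=0.1)

fit = AutoReg(x, lags=2).fit()
print(fit.params)   # intercept, lag-1 coefficient, lag-2 coefficient
```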

  20. Investigation of time-dependent risk of mental disorders after infertility diagnosis, through survival analysis and data mining: a nationwide cohort study.

    PubMed

    Wang, Jong-Yi; Chen, Jen-De; Huang, Chun-Chi; Liu, Chiu-Shong; Chung, Tsai-Fang; Hsieh, Ming-Hong; Wang, Chia-Woei

    2018-06-01

    Infertile patients are vulnerable to mental disorders. However, a time-dependent model predicting the onset of mental disorders specific to infertile patients is lacking. This study examined the risk factors for the development of mental disorders in infertile patients and measured the duration until the occurrence of mental disorders after a diagnosis of infertility. A total of 13,317 infertile patients in the 2002-2013 Taiwan National Health Insurance Research Database were observed. The 11 independent variables included in the hypothesised model, together with the dates of infertility and mental disorder diagnoses, were analysed using Cox proportional hazards. Data-mining methods using C5.0 and Apriori supplemented the statistical analyses. The total prevalence rate of mental disorders among infertile patients in Taiwan was 12.41%, including anxiety (4.66%), depression (1.81%) and other mental disorders (5.94%). The average time interval for onset of mental illness identified using survival analysis was 1.67 years. Income, occupation, treatment method, co-morbidity, region and hospital level and ownership were significant predictors of development of mental illness (all p < .05). The four categories of factors associated with time-dependent onset were demographics, health, health care provider and geographical characteristics. Certain patient characteristics may predict a higher likelihood of onset of a specific mental disorder. Clinical practitioners may use the findings to identify high-risk patients and make timely health interventions.
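
    A minimal sketch of a Cox proportional-hazards fit of the kind described above: time from an index diagnosis to a first mental-disorder diagnosis, with administrative censoring. The column names, effect sizes and simulated data are placeholders, not the cohort data.

```python
# Sketch: Cox proportional-hazards model for time to onset of a mental
# disorder after an infertility diagnosis. All values are simulated.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "low_income": rng.integers(0, 2, n),     # hypothetical covariate
    "comorbidity": rng.integers(0, 2, n),    # hypothetical covariate
})
baseline = rng.exponential(scale=5.0, size=n)          # toy event times (years)
time = baseline * np.exp(-0.4 * df["comorbidity"])     # comorbidity hastens onset
df["years_to_event"] = np.minimum(time, 10.0)          # administrative censoring
df["event"] = (time <= 10.0).astype(int)

cph = CoxPHFitter()
cph.fit(df, duration_col="years_to_event", event_col="event")
cph.print_summary()   # hazard ratios with 95% confidence intervals
```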

  1. The Effect Size Statistic: Overview of Various Choices.

    ERIC Educational Resources Information Center

    Mahadevan, Lakshmi

    Over the years, methodologists have been recommending that researchers use magnitude of effect estimates in result interpretation to highlight the distinction between statistical and practical significance (cf. R. Kirk, 1996). A magnitude of effect statistic (i.e., effect size) tells to what degree the dependent variable can be controlled,…

  2. Fisher's method of combining dependent statistics using generalizations of the gamma distribution with applications to genetic pleiotropic associations.

    PubMed

    Li, Qizhai; Hu, Jiyuan; Ding, Juan; Zheng, Gang

    2014-04-01

    A classical approach to combining independent test statistics is Fisher's combination of p-values, which follows the χ² distribution. When the test statistics are dependent, the gamma distribution (GD) is commonly used for the Fisher's combination test (FCT). We propose to use two generalizations of the GD: the generalized and the exponentiated GDs. We study some properties of mis-using the GD for the FCT to combine dependent statistics when one of the two proposed distributions is true. Our results show that both generalizations have better control of type I error rates than the GD, which tends to have inflated type I error rates at more extreme tails. In practice, common model selection criteria (e.g. Akaike information criterion/Bayesian information criterion) can be used to help select a better distribution to use for the FCT. A simple strategy for applying the two generalizations of the GD in genome-wide association studies is discussed. Applications of the results to genetic pleiotropic associations are described, where multiple traits are tested for association with a single marker.
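
    A minimal sketch of the baseline ideas behind the paper: Fisher's combined statistic referred to a χ² distribution under independence, and to a moment-matched gamma distribution when the tests are dependent. The variance-inflation factor c is an assumed illustration of what a dependence correction would supply; it is not the generalized or exponentiated gamma machinery proposed in the paper itself.

```python
# Sketch: Fisher's combination of p-values with a chi-square reference
# (independent case) and a gamma reference with inflated variance
# (dependent case). The inflation factor c is an assumption.
import numpy as np
from scipy import stats

pvals = np.array([0.02, 0.04, 0.10, 0.30])   # toy p-values
T = -2.0 * np.sum(np.log(pvals))              # Fisher's combined statistic
k = pvals.size

# Independent case: T ~ chi-square with 2k degrees of freedom.
p_indep = stats.chi2.sf(T, df=2 * k)

# Dependent case: gamma with mean 2k and variance 4k*c (c > 1 assumed).
c = 1.5
shape, scale = (2 * k) / (2 * c), 2 * c
p_dep = stats.gamma.sf(T, a=shape, scale=scale)

print(p_indep, p_dep)   # the dependence-adjusted p-value is less extreme
```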

  3. Anomalous spin-dependent tunneling statistics in Fe/MgO/Fe junctions induced by disorder at the interface

    NASA Astrophysics Data System (ADS)

    Yan, Jiawei; Wang, Shizhuo; Xia, Ke; Ke, Youqi

    2018-01-01

    We present first-principles analysis of interfacial disorder effects on spin-dependent tunneling statistics in thin Fe/MgO/Fe magnetic tunnel junctions. We find that interfacial disorder scattering can significantly modulate the tunneling statistics in the minority spin of the parallel configuration (PC) while all other spin channels remain dominated by the Poissonian process. For the minority-spin channel of PC, interfacial disorder scattering favors the formation of resonant tunneling channels by lifting the limitation of symmetry conservation at low concentration, presenting an important sub-Poissonian process in PC, but is destructive to the open channels at high concentration. We find that the important modulation of tunneling statistics is independent of the type of interfacial disorder. A bimodal distribution function of transmission with disorder dependence is introduced and fits very well our first-principles results. The increase of MgO thickness can quickly change the tunneling from a sub-Poissonian to Poissonian dominated process in the minority spin of PC with disorder. Our results provide a sensitive detection method of an ultralow concentration of interfacial defects.

  4. Detection of Clostridium difficile infection clusters, using the temporal scan statistic, in a community hospital in southern Ontario, Canada, 2006-2011.

    PubMed

    Faires, Meredith C; Pearl, David L; Ciccotelli, William A; Berke, Olaf; Reid-Smith, Richard J; Weese, J Scott

    2014-05-12

    In hospitals, Clostridium difficile infection (CDI) surveillance relies on unvalidated guidelines or threshold criteria to identify outbreaks. This can result in false-positive and -negative cluster alarms. The application of statistical methods to identify and understand CDI clusters may be a useful alternative or complement to standard surveillance techniques. The objectives of this study were to investigate the utility of the temporal scan statistic for detecting CDI clusters and determine if there are significant differences in the rate of CDI cases by month, season, and year in a community hospital. Bacteriology reports of patients identified with a CDI from August 2006 to February 2011 were collected. For patients detected with CDI from March 2010 to February 2011, stool specimens were obtained. Clostridium difficile isolates were characterized by ribotyping and investigated for the presence of toxin genes by PCR. CDI clusters were investigated using a retrospective temporal scan test statistic. Statistically significant clusters were compared to known CDI outbreaks within the hospital. A negative binomial regression model was used to identify associations between year, season, month and the rate of CDI cases. Overall, 86 CDI cases were identified. Eighteen specimens were analyzed and nine ribotypes were classified with ribotype 027 (n = 6) the most prevalent. The temporal scan statistic identified significant CDI clusters at the hospital (n = 5), service (n = 6), and ward (n = 4) levels (P ≤ 0.05). Three clusters were concordant with the one C. difficile outbreak identified by hospital personnel. Two clusters were identified as potential outbreaks. The negative binomial model indicated years 2007-2010 (P ≤ 0.05) had decreased CDI rates compared to 2006 and spring had an increased CDI rate compared to the fall (P = 0.023). Application of the temporal scan statistic identified several clusters, including potential outbreaks not detected by hospital personnel. The identification of time periods with decreased or increased CDI rates may have been a result of specific hospital events. Understanding the clustering of CDIs can aid in the interpretation of surveillance data and lead to the development of better early detection systems.
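
    A minimal sketch of the negative binomial regression step described above: monthly case counts modelled against year and season. The counts below are simulated placeholders purely to make the call runnable; the temporal scan statistic itself is not implemented here.

```python
# Sketch: negative binomial regression of monthly infection counts on
# year and season. Counts and the season coding are illustrative assumptions.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
months = pd.period_range("2006-08", periods=54, freq="M")
df = pd.DataFrame({
    "cases": rng.poisson(rng.gamma(shape=2.0, scale=1.0, size=54)),  # overdispersed
    "year": months.year,
    "season": (months.month % 12) // 3,   # 0=winter, 1=spring, 2=summer, 3=fall
})

fit = smf.glm("cases ~ C(year) + C(season)", data=df,
              family=sm.families.NegativeBinomial()).fit()
print(fit.summary())   # rate differences by year and season
```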

  5. Systematic review of statistical approaches to quantify, or correct for, measurement error in a continuous exposure in nutritional epidemiology.

    PubMed

    Bennett, Derrick A; Landry, Denise; Little, Julian; Minelli, Cosetta

    2017-09-19

    Several statistical approaches have been proposed to assess and correct for exposure measurement error. We aimed to provide a critical overview of the most common approaches used in nutritional epidemiology. MEDLINE, EMBASE, BIOSIS and CINAHL were searched for reports published in English up to May 2016 in order to ascertain studies that described methods aimed to quantify and/or correct for measurement error for a continuous exposure in nutritional epidemiology using a calibration study. We identified 126 studies, 43 of which described statistical methods and 83 that applied any of these methods to a real dataset. The statistical approaches in the eligible studies were grouped into: a) approaches to quantify the relationship between different dietary assessment instruments and "true intake", which were mostly based on correlation analysis and the method of triads; b) approaches to adjust point and interval estimates of diet-disease associations for measurement error, mostly based on regression calibration analysis and its extensions. Two approaches (multiple imputation and moment reconstruction) were identified that can deal with differential measurement error. For regression calibration, the most common approach to correct for measurement error used in nutritional epidemiology, it is crucial to ensure that its assumptions and requirements are fully met. Analyses that investigate the impact of departures from the classical measurement error model on regression calibration estimates can be helpful to researchers in interpreting their findings. With regard to the possible use of alternative methods when regression calibration is not appropriate, the choice of method should depend on the measurement error model assumed, the availability of suitable calibration study data and the potential for bias due to violation of the classical measurement error model assumptions. On the basis of this review, we provide some practical advice for the use of methods to assess and adjust for measurement error in nutritional epidemiology.
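
    A minimal sketch of regression calibration, the most common correction named above: a calibration substudy relates the error-prone instrument to a reference measure, and the main-study exposure is replaced by its calibrated prediction before fitting the outcome model. Variable names and the simulated data are assumptions.

```python
# Sketch: classical regression calibration for a continuous exposure
# measured with error. All data are simulated for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

# Calibration substudy: reference measure x and error-prone instrument z.
n_cal = 200
x_cal = rng.normal(size=n_cal)
z_cal = x_cal + rng.normal(scale=0.8, size=n_cal)
cal = sm.OLS(x_cal, sm.add_constant(z_cal)).fit()   # estimates E[x | z]

# Main study: only z and the outcome y are observed; true x is latent.
n_main = 2000
x_main = rng.normal(size=n_main)
z_main = x_main + rng.normal(scale=0.8, size=n_main)
y = 0.5 * x_main + rng.normal(size=n_main)

x_hat = cal.predict(sm.add_constant(z_main))        # calibrated exposure
naive = sm.OLS(y, sm.add_constant(z_main)).fit()    # attenuated slope
corrected = sm.OLS(y, sm.add_constant(x_hat)).fit() # corrected slope (~0.5)
print(naive.params)
print(corrected.params)
```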

  6. 28 CFR 22.22 - Revelation of identifiable data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... STATISTICAL INFORMATION § 22.22 Revelation of identifiable data. (a) Except as noted in paragraph (b) of this section, research and statistical information relating to a private person may be revealed in identifiable... sections 223(a)(12)(A), 223(a)(13), 223(a)(14), and 243 of the Juvenile Justice and Delinquency Prevention...

  7. 50 CFR 600.410 - Collection and maintenance of statistics.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 50 Wildlife and Fisheries 10 2011-10-01 2011-10-01 false Collection and maintenance of statistics... of Statistics § 600.410 Collection and maintenance of statistics. (a) General. (1) All statistics..., the Assistant Administrator will remove all identifying particulars from the statistics if doing so is...

  8. 50 CFR 600.410 - Collection and maintenance of statistics.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 50 Wildlife and Fisheries 8 2010-10-01 2010-10-01 false Collection and maintenance of statistics... of Statistics § 600.410 Collection and maintenance of statistics. (a) General. (1) All statistics..., the Assistant Administrator will remove all identifying particulars from the statistics if doing so is...

  9. Noise exposure-response relationships established from repeated binary observations: Modeling approaches and applications.

    PubMed

    Schäffer, Beat; Pieren, Reto; Mendolia, Franco; Basner, Mathias; Brink, Mark

    2017-05-01

    Noise exposure-response relationships are used to estimate the effects of noise on individuals or a population. Such relationships may be derived from independent or repeated binary observations, and modeled by different statistical methods. Depending on the method by which they were established, their application in population risk assessment or estimation of individual responses may yield different results, i.e., predict "weaker" or "stronger" effects. As far as the present body of literature on noise effect studies is concerned, however, the underlying statistical methodology to establish exposure-response relationships has not always been paid sufficient attention. This paper gives an overview on two statistical approaches (subject-specific and population-averaged logistic regression analysis) to establish noise exposure-response relationships from repeated binary observations, and their appropriate applications. The considerations are illustrated with data from three noise effect studies, estimating also the magnitude of differences in results when applying exposure-response relationships derived from the two statistical approaches. Depending on the underlying data set and the probability range of the binary variable it covers, the two approaches yield similar to very different results. The adequate choice of a specific statistical approach and its application in subsequent studies, both depending on the research question, are therefore crucial.
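
    A sketch contrasting the two approaches named above for repeated binary observations: a population-averaged model fitted with generalized estimating equations (GEE) and a subject-specific random-intercept logistic model. The data, column names and effect sizes are simulated placeholders, not data from the cited noise studies.

```python
# Sketch: population-averaged (GEE) vs subject-specific (random-intercept)
# logistic regression for repeated binary annoyance ratings. Simulated data.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

rng = np.random.default_rng(4)
n_subj, n_rep = 100, 8
subject = np.repeat(np.arange(n_subj), n_rep)
leq = rng.uniform(40, 75, size=n_subj * n_rep)             # exposure level, dB
u = np.repeat(rng.normal(scale=1.0, size=n_subj), n_rep)   # subject intercepts
p = 1.0 / (1.0 + np.exp(-(-8.0 + 0.12 * leq + u)))
df = pd.DataFrame({"subject": subject, "leq": leq,
                   "annoyed": rng.binomial(1, p)})

# Population-averaged slope: GEE with an exchangeable working correlation.
gee = smf.gee("annoyed ~ leq", groups="subject", data=df,
              family=sm.families.Binomial(),
              cov_struct=sm.cov_struct.Exchangeable()).fit()

# Subject-specific slope: random-intercept logistic model (variational fit).
glmm = BinomialBayesMixedGLM.from_formula(
    "annoyed ~ leq", {"subject": "0 + C(subject)"}, df).fit_vb()

print(gee.params["leq"])   # population-averaged slope, typically attenuated
print(glmm.summary())      # subject-specific (conditional) slope
```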

  10. Using Claims Data to Predict Dependency in Activities of Daily Living as a Proxy for Frailty

    PubMed Central

    Faurot, Keturah R.; Funk, Michele Jonsson; Pate, Virginia; Brookhart, M. Alan; Patrick, Amanda; Hanson, Laura C.; Castillo, Wendy Camelo; Stürmer, Til

    2014-01-01

    Purpose Estimating drug effectiveness and safety among older adults in population-based studies using administrative healthcare claims can be hampered by unmeasured confounding due to frailty. A claims-based algorithm that identifies patients likely to be dependent, a proxy for frailty, may improve confounding control. Our objective was to develop an algorithm to predict dependency in activities of daily living (ADL) in a sample of Medicare beneficiaries. Methods Community-dwelling respondents to the 2006 Medicare Current Beneficiary Survey, >65 years old, with Medicare Part A, B, home health, and hospice claims were included. ADL dependency was defined as needing help with bathing, eating, walking, dressing, toileting, or transferring. Potential predictors were demographics, ICD-9 diagnosis/procedure and durable medical equipment codes for frailty-associated conditions. Multivariable logistic regression was used to predict ADL dependency. Cox models estimated hazard ratios for death as a function of observed and predicted ADL dependency. Results Of 6391 respondents, 57% were female, 88% white, and 38% were ≥80. The prevalence of ADL dependency was 9.5%. Strong predictors of ADL dependency were charges for a home hospital bed (OR=5.44, 95% CI=3.28–9.03) and wheelchair (OR=3.91, 95% CI=2.78–5.51). The c-statistic of the final model was 0.845. Model-predicted ADL dependency of 20% or greater was associated with a hazard ratio for death of 3.19 (95% CI: 2.78, 3.68). Conclusions An algorithm for predicting ADL dependency using healthcare claims was developed to measure some aspects of frailty. Accounting for variation in frailty among older adults could lead to more valid conclusions about treatment use, safety, and effectiveness. PMID:25335470
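
    A minimal sketch of fitting a claims-based logistic model for ADL dependency and summarizing its discrimination with the c-statistic (area under the ROC curve). The predictor names and simulated data are hypothetical stand-ins for claims-derived indicators, not the published algorithm.

```python
# Sketch: logistic regression for ADL dependency plus the c-statistic.
# Predictors and data are simulated placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)
n = 6000
df = pd.DataFrame({
    "age_80plus": rng.integers(0, 2, n),
    "hospital_bed": rng.integers(0, 2, n),   # home hospital bed charge
    "wheelchair": rng.integers(0, 2, n),
})
logit_p = (-3.0 + 0.6 * df["age_80plus"]
           + 1.7 * df["hospital_bed"] + 1.4 * df["wheelchair"])
df["adl_dependent"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit_p)))

model = smf.logit("adl_dependent ~ age_80plus + hospital_bed + wheelchair",
                  data=df).fit(disp=False)
pred = model.predict(df)
print("c-statistic:", roc_auc_score(df["adl_dependent"], pred))
```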

  11. Multivariate Statistical Modelling of Drought and Heat Wave Events

    NASA Astrophysics Data System (ADS)

    Manning, Colin; Widmann, Martin; Vrac, Mathieu; Maraun, Douglas; Bevaqua, Emanuele

    2016-04-01

    Compound extreme events are a combination of two or more contributing events which in themselves may not be extreme but through their joint occurrence produce an extreme impact. Compound events are noted in the latest IPCC report as an important type of extreme event that has been given little attention so far. As part of the CE:LLO project (Compound Events: muLtivariate statisticaL mOdelling) we are developing a multivariate statistical model to gain an understanding of the dependence structure of certain compound events. One focus of this project is on the interaction between drought and heat wave events. Soil moisture has both a local and non-local effect on the occurrence of heat waves, where it strongly controls the latent heat flux affecting the transfer of sensible heat to the atmosphere. These processes can create a feedback whereby a heat wave may be amplified or suppressed by the soil moisture preconditioning, and vice versa, the heat wave may in turn have an effect on soil conditions. An aim of this project is to capture this dependence in order to correctly describe the joint probabilities of these conditions and the resulting probability of their compound impact. We will show an application of Pair Copula Constructions (PCCs) to study the aforementioned compound event. PCCs allow in theory for the formulation of multivariate dependence structures in any dimension, where the PCC is a decomposition of a multivariate distribution into a product of bivariate components modelled using copulas. A copula is a multivariate distribution function which allows one to model the dependence structure of given variables separately from the marginal behaviour. We first look at the structure of soil moisture drought over the whole of France using the SAFRAN dataset between 1959 and 2009. Soil moisture is represented using the Standardised Precipitation Evapotranspiration Index (SPEI). Drought characteristics are computed at grid-point scale, where drought conditions are identified as those with an SPEI value below -1.0. We model the multivariate dependence structure of drought events defined by certain characteristics and compute return levels of these events. We initially find that drought characteristics such as duration, mean SPEI and the maximum contiguous area to a grid point all have positive correlations, though the degree to which they are correlated can vary considerably spatially. A spatial representation of return levels may then provide insight into the areas most prone to drought conditions. As a next step, we analyse the dependence structure between soil moisture conditions preceding the onset of a heat wave and the heat wave itself.
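
    A minimal sketch of the copula idea underlying the abstract: the dependence between two drought characteristics is estimated separately from their margins, here with a Gaussian copula parameterized via Kendall's tau. The simulated "duration" and "mean SPEI" values are placeholders for the SAFRAN-derived characteristics, and a full pair copula construction is not attempted.

```python
# Sketch: separating marginal behaviour from dependence with a Gaussian
# copula fitted via Kendall's tau. Margins and data are toy assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=1000)
duration = stats.gamma(a=2.0).ppf(stats.norm.cdf(z[:, 0]))     # toy margin
mean_spei = -stats.gamma(a=1.5).ppf(stats.norm.cdf(z[:, 1]))   # toy margin

tau, _ = stats.kendalltau(duration, mean_spei)
rho = np.sin(np.pi * tau / 2.0)   # Gaussian-copula correlation implied by tau
print(f"Kendall tau = {tau:.2f}, implied copula correlation = {rho:.2f}")
```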

  12. An Assessment of Land Surface and Lightning Characteristics Associated with Lightning-Initiated Wildfires

    NASA Technical Reports Server (NTRS)

    Coy, James; Schultz, Christopher J.; Case, Jonathan L.

    2017-01-01

    Can modeled information about the land surface, together with lightning characteristics beyond flash occurrence, improve the identification and prediction of wildfires? The approach combines observed cloud-to-ground (CG) flashes with real-time land surface model output and compares them with areas where lightning did not start a wildfire, to determine which land surface conditions and lightning characteristics were responsible for causing wildfires. Statistical differences between suspected fire-starters and non-fire-starters were peak-current dependent. Comparisons of 0-10 cm volumetric and relative soil moisture were statistically dependent to at least the p = 0.05 level for both flash polarities, with suspected fire-starters typically occurring in areas of lower soil moisture than non-fire-starters. GVF value comparisons were only found to be statistically dependent for -CG flashes; however, random sampling of the -CG non-fire-starter dataset revealed that this relationship may not always hold.

  13. Primary caregivers of in-home oxygen-dependent children: predictors of stress based on characteristics, needs and social support.

    PubMed

    Wang, Kai-Wei K; Lin, Hung-Ching; Lee, Chin-Ting; Lee, Kuo-Sheng

    2016-07-01

    To identify the predictors of primary caregivers' stress in caring for in-home oxygen-dependent children by examining the association between their levels of stress, caregiver needs and social support. Increasing numbers of primary caregivers of oxygen-dependent children experience caregiving stress that warrants investigation. The study used a cross-sectional design with three psychometric scales - Modified-Parenting Stress Index, Caregiver Needs Scale and Social Support Index. The data collected during 2010-2011 were from participants who were responsible for their child's care that included oxygen therapy for ≧6 hours/day; the children's ages ranged from 3 months-16 years. Descriptive statistics and multivariable linear regression were used. A total of 104 participants (M = 34, F = 70) were recruited, with an average age of 39·7 years. The average age of the oxygen-dependent children was 6·68 years and their daily use of oxygen averaged 11·39 hours. The caregivers' overall levels of stress were scored as high and information needs were scored as the highest. The most available support from family and friends was emotional support. Informational support was mostly received from health professionals, but both instrumental and emotional support were important. Levels of stress and caregiver needs were significantly correlated. Multivariable linear regression analyses identified three risk factors predicting stress, namely, the caregiver's poor health status, the child's male gender and the caregiver's greater financial need. To support these caregivers, health professionals can maintain their health status and provide instrumental, emotional, informational and financial support. © 2016 John Wiley & Sons Ltd.

  14. Phosphorylated VEGFR2 and hypertension: potential biomarkers to indicate VEGF-dependency of advanced breast cancer in anti-angiogenic therapy.

    PubMed

    Fan, Minhao; Zhang, Jian; Wang, Zhonghua; Wang, Biyun; Zhang, Qunlin; Zheng, Chunlei; Li, Ting; Ni, Chen; Wu, Zhenhua; Shao, Zhimin; Hu, Xichun

    2014-01-01

    The efficacy of anti-VEGF agents probably depends on VEGF-dependency. Apatinib, a specific tyrosine kinase inhibitor that targets VEGF receptor 2, was assessed in patients with advanced breast cancer (ABC) (ClinicalTrials.gov NCT01176669 and NCT01653561). This substudy aimed to explore potential biomarkers of VEGF-dependency in apatinib-treated breast cancer. Eighty pretreated patients received apatinib 750 or 500 mg/day orally in 4-week cycles. Circulating biomarkers were measured using a multiplex assay, and tissue biomarkers were identified with immunostaining. Baseline characteristics and adverse events (AEs) were included in the analysis. Statistical confirmation of independent predictive factors for anti-tumor efficacy was performed using Cox and logistic regression models. Median progression-free survival (PFS) was 3.8 months, and overall survival (OS) was 10.6 months, with an objective response rate of 17.5 %. Prominent AEs (≥60 %) were hypertension, hand-foot skin reaction (HFSR), and proteinuria. Higher tumor phosphorylated VEGFR2 (p-VEGFR2) expression (P = 0.001), higher baseline serum soluble VEGFR2 (P = 0.031), hypertension (P = 0.011), and HFSR (P = 0.018) were significantly related to longer PFS, whereas hypertension (P = 0.002) and HFSR (P = 0.001) were also related to OS. Based on multivariate analysis, only p-VEGFR2 (adjusted HR, 0.40; P = 0.013) and hypertension (adjusted HR, 0.58; P = 0.038) were independent predictive factors for both PFS and clinical benefit rate. Apatinib had substantial antitumor activity in ABC and manageable toxicity. p-VEGFR2 and hypertension may be surrogate predictors of the VEGF-dependency of breast cancer and may identify an anti-angiogenesis-sensitive population.

  15. Depth-Dependent Glycosaminoglycan Concentration in Articular Cartilage by Quantitative Contrast-Enhanced Micro–Computed Tomography

    PubMed Central

    Mittelstaedt, Daniel

    2015-01-01

    Objective A quantitative contrast-enhanced micro–computed tomography (qCECT) method was developed to investigate the depth dependency and heterogeneity of the glycosaminoglycan (GAG) concentration of ex vivo cartilage equilibrated with an anionic radiographic contrast agent, Hexabrix. Design Full-thickness fresh native (n = 19 in 3 subgroups) and trypsin-degraded (n = 6) articular cartilage blocks were imaged using micro–computed tomography (μCT) at high resolution (13.4 μm3) before and after equilibration with various Hexabrix bathing concentrations. The GAG concentration was calculated depth-dependently based on Gibbs-Donnan equilibrium theory. Analysis of variance with Tukey’s post hoc was used to test for statistical significance (P < 0.05) for effect of Hexabrix bathing concentration, and for differences in bulk and zonal GAG concentrations individually and compared between native and trypsin-degraded cartilage. Results The bulk GAG concentration was calculated to be 74.44 ± 6.09 and 11.99 ± 4.24 mg/mL for native and degraded cartilage, respectively. A statistical difference was demonstrated for bulk and zonal GAG between native and degraded cartilage (P < 0.032). A statistical difference was not demonstrated for bulk GAG when comparing Hexabrix bathing concentrations (P > 0.3214) for neither native nor degraded cartilage. Depth-dependent GAG analysis of native cartilage revealed a statistical difference only in the radial zone between 30% and 50% Hexabrix bathing concentrations. Conclusions This nondestructive qCECT methodology calculated the depth-dependent GAG concentration for both native and trypsin-degraded cartilage at high spatial resolution. qCECT allows for more detailed understanding of the topography and depth dependency, which could help diagnose health, degradation, and repair of native and contrived cartilage. PMID:26425259

  16. Neuropsychological assessment of decision making in alcohol-dependent commercial pilots.

    PubMed

    Georgemiller, Randy; Machizawa, Sayaka; Young, Kathleen M; Martin, Cynthia N

    2013-09-01

    The aim of this exploratory archival study was to discern the utility of the Iowa Gambling Task (IGT) in identifying adaptive decision-making capacities among pilots with a history of alcohol dependence both with and without Cluster B personality features. Participants included 18 male airmen at the rank of captain with a history of receiving alcohol dependence treatment and subsequent referral for a fitness-for-duty evaluation. Data from prior comprehensive neuropsychological evaluations conducted in a private practice setting at the mandate of the FAA utilizing criteria outlined in the HIMS program were used. ANOVA was conducted to compare pilots with (N = 4) and without Cluster B personality features (N = 14) on measures of decision-making capacities, intelligence, and executive functioning. Pilots with Cluster B personality features were found to have a significantly lower Total Net T-Score on IGT (M = 35.00, SD = 9.27) than pilots without features of Cluster B (M = 56.36, SD = 9.55). Furthermore, with the exception of the first 20 cards (i.e., Net 1), the groups significantly differed in their Net scores. No statistically significant difference was found in airmen's intelligence and executive functioning. The present study found that alcohol-dependent airmen with Cluster B personality features evidenced significantly poorer decision-making capacities as measured by the IGT in comparison to alcohol-dependent airmen without Cluster B personality features. Implications and limitations of the study are discussed.

  17. A Probabilistic Framework for Peptide and Protein Quantification from Data-Dependent and Data-Independent LC-MS Proteomics Experiments

    PubMed Central

    Richardson, Keith; Denny, Richard; Hughes, Chris; Skilling, John; Sikora, Jacek; Dadlez, Michał; Manteca, Angel; Jung, Hye Ryung; Jensen, Ole Nørregaard; Redeker, Virginie; Melki, Ronald; Langridge, James I.; Vissers, Johannes P.C.

    2013-01-01

    A probability-based quantification framework is presented for the calculation of relative peptide and protein abundance in label-free and label-dependent LC-MS proteomics data. The results are accompanied by credible intervals and regulation probabilities. The algorithm takes into account data uncertainties via Poisson statistics modified by a noise contribution that is determined automatically during an initial normalization stage. Protein quantification relies on assignments of component peptides to the acquired data. These assignments are generally of variable reliability and may not be present across all of the experiments comprising an analysis. It is also possible for a peptide to be identified to more than one protein in a given mixture. For these reasons the algorithm accepts a prior probability of peptide assignment for each intensity measurement. The model is constructed in such a way that outliers of any type can be automatically reweighted. Two discrete normalization methods can be employed. The first method is based on a user-defined subset of peptides, while the second method relies on the presence of a dominant background of endogenous peptides for which the concentration is assumed to be unaffected. Normalization is performed using the same computational and statistical procedures employed by the main quantification algorithm. The performance of the algorithm will be illustrated on example data sets, and its utility demonstrated for typical proteomics applications. The quantification algorithm supports relative protein quantification based on precursor and product ion intensities acquired by means of data-dependent methods, originating from all common isotopically-labeled approaches, as well as label-free ion intensity-based data-independent methods. PMID:22871168

  18. Quality of life of patients from rural and urban areas in Poland with head and neck cancer treated with radiotherapy. A study of the influence of selected socio-demographic factors

    PubMed Central

    Jewczak, Maciej; Skura-Madziała, Anna

    2017-01-01

    Introduction The quality of life (QoL) experienced by cancer patients depends both on their state of health and on sociodemographic factors. Tumours in the head and neck region have a particularly adverse effect on patients psychologically and on their social functioning. Material and methods The study involved 121 patients receiving radiotherapy treatment for head and neck cancers. They included 72 urban and 49 rural residents. QoL was assessed using the questionnaires EORTC-QLQ-C30 and QLQ-H&N35. The data were analysed using statistical methods: a χ2 test for independence and a multinomial logit model. Results The evaluation of QoL showed a strong, statistically significant, positive dependence on state of health, and a weak dependence on sociodemographic factors and place of residence. Evaluations of financial situation and living conditions were similar for rural and urban residents. Patients from urban areas had the greatest anxiety about deterioration of their state of health. Rural respondents were more often anxious about a worsening of their financial situation, and expressed a fear of loneliness. Conclusions Studying the QoL of patients with head and neck cancer provides information concerning the areas in which the disease inhibits their lives, and the extent to which it does so. It indicates conditions for the adaptation of treatment and care methods in the healthcare system which might improve the QoL of such patients. A multinomial logit model identifies the factors determining the patients’ health assessment and defines the probable values of such assessment. PMID:29181080

  19. Resolution dependence of precipitation statistical fidelity in hindcast simulations

    DOE PAGES

    O'Brien, Travis A.; Collins, William D.; Kashinath, Karthik; ...

    2016-06-19

    This article is a U.S. Government work and is in the public domain in the USA. Numerous studies have shown that atmospheric models with high horizontal resolution better represent the physics and statistics of precipitation in climate models. While it is abundantly clear from these studies that high-resolution increases the rate of extreme precipitation, it is not clear whether these added extreme events are “realistic”; whether they occur in simulations in response to the same forcings that drive similar events in reality. In order to understand whether increasing horizontal resolution results in improved model fidelity, a hindcast-based, multiresolution experimental design has been conceived and implemented: the InitiaLIzed-ensemble, Analyze, and Develop (ILIAD) framework. The ILIAD framework allows direct comparison between observed and simulated weather events across multiple resolutions and assessment of the degree to which increased resolution improves the fidelity of extremes. Analysis of 5 years of daily 5 day hindcasts with the Community Earth System Model at horizontal resolutions of 220, 110, and 28 km shows that: (1) these hindcasts reproduce the resolution-dependent increase of extreme precipitation that has been identified in longer-duration simulations, (2) the correspondence between simulated and observed extreme precipitation improves as resolution increases; and (3) this increase in extremes and precipitation fidelity comes entirely from resolved-scale precipitation. Evidence is presented that this resolution-dependent increase in precipitation intensity can be explained by the theory of Rauscher et al. (), which states that precipitation intensifies at high resolution due to an interaction between the emergent scaling (spectral) properties of the wind field and the constraint of fluid continuity.

  20. Resolution dependence of precipitation statistical fidelity in hindcast simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    O'Brien, Travis A.; Collins, William D.; Kashinath, Karthik

    This article is a U.S. Government work and is in the public domain in the USA. Numerous studies have shown that atmospheric models with high horizontal resolution better represent the physics and statistics of precipitation in climate models. While it is abundantly clear from these studies that high-resolution increases the rate of extreme precipitation, it is not clear whether these added extreme events are “realistic”; whether they occur in simulations in response to the same forcings that drive similar events in reality. In order to understand whether increasing horizontal resolution results in improved model fidelity, a hindcast-based, multiresolution experimental design has been conceived and implemented: the InitiaLIzed-ensemble, Analyze, and Develop (ILIAD) framework. The ILIAD framework allows direct comparison between observed and simulated weather events across multiple resolutions and assessment of the degree to which increased resolution improves the fidelity of extremes. Analysis of 5 years of daily 5 day hindcasts with the Community Earth System Model at horizontal resolutions of 220, 110, and 28 km shows that: (1) these hindcasts reproduce the resolution-dependent increase of extreme precipitation that has been identified in longer-duration simulations, (2) the correspondence between simulated and observed extreme precipitation improves as resolution increases; and (3) this increase in extremes and precipitation fidelity comes entirely from resolved-scale precipitation. Evidence is presented that this resolution-dependent increase in precipitation intensity can be explained by the theory of Rauscher et al. (), which states that precipitation intensifies at high resolution due to an interaction between the emergent scaling (spectral) properties of the wind field and the constraint of fluid continuity.

  1. Piracetam for acute ischaemic stroke.

    PubMed

    Ricci, Stefano; Celani, Maria Grazia; Cantisani, Teresa Anna; Righetti, Enrico

    2012-09-12

    Piracetam has neuroprotective and antithrombotic effects that may help to reduce death and disability in people with acute stroke. This is an update of a Cochrane Review first published in 1999, and previously updated in 2006 and 2009. To assess the effects of piracetam in acute, presumed ischaemic stroke. We searched the Cochrane Stroke Group Trials Register (last searched 15 May 2011), the Cochrane Central Register of Controlled Trials (CENTRAL) (The Cochrane Library 2011, Issue 2), MEDLINE (1966 to May 2011), EMBASE (1980 to May 2011), and ISI Science Citation Index (1981 to May 2011). We also contacted the manufacturer of piracetam to identify further published and unpublished studies. Randomised trials comparing piracetam with control, with at least mortality reported and entry to the trial within three days of stroke onset. Two review authors extracted data and assessed trial quality and this was checked by the other two review authors. We contacted study authors for missing information. We included three trials involving 1002 patients, with one trial contributing 93% of the data. Participants' ages ranged from 40 to 85 years, and both sexes were equally represented. Piracetam was associated with a statistically non-significant increase in death at one month (approximately 31% increase, 95% confidence interval 81% increase to 5% reduction). This trend was no longer apparent in the large trial after correction for imbalance in stroke severity. Limited data showed no difference between the treatment and control groups for functional outcome, dependence or proportion of patients dead or dependent. Adverse effects were not reported. There is some suggestion (but no statistically significant result) of an unfavourable effect of piracetam on early death, but this may have been caused by baseline differences in stroke severity in the trials. There is not enough evidence to assess the effect of piracetam on dependence.

  2. The sumLINK statistic for genetic linkage analysis in the presence of heterogeneity.

    PubMed

    Christensen, G B; Knight, S; Camp, N J

    2009-11-01

    We present the "sumLINK" statistic--the sum of multipoint LOD scores for the subset of pedigrees with nominally significant linkage evidence at a given locus--as an alternative to common methods to identify susceptibility loci in the presence of heterogeneity. We also suggest the "sumLOD" statistic (the sum of positive multipoint LOD scores) as a companion to the sumLINK. sumLINK analysis identifies genetic regions of extreme consistency across pedigrees without regard to negative evidence from unlinked or uninformative pedigrees. Significance is determined by an innovative permutation procedure based on genome shuffling that randomizes linkage information across pedigrees. This procedure for generating the empirical null distribution may be useful for other linkage-based statistics as well. Using 500 genome-wide analyses of simulated null data, we show that the genome shuffling procedure results in the correct type 1 error rates for both the sumLINK and sumLOD. The power of the statistics was tested using 100 sets of simulated genome-wide data from the alternative hypothesis from GAW13. Finally, we illustrate the statistics in an analysis of 190 aggressive prostate cancer pedigrees from the International Consortium for Prostate Cancer Genetics, where we identified a new susceptibility locus. We propose that the sumLINK and sumLOD are ideal for collaborative projects and meta-analyses, as they do not require any sharing of identifiable data between contributing institutions. Further, loci identified with the sumLINK have good potential for gene localization via statistical recombinant mapping, as, by definition, several linked pedigrees contribute to each peak.
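
    A minimal sketch of the two statistics defined above, computed from a matrix of per-pedigree multipoint LOD scores (pedigrees × loci). The LOD matrix here is random placeholder data; the 0.588 threshold is the usual LOD value corresponding to a nominal pointwise p of roughly 0.05, and the genome-shuffling permutation null is only indicated, not implemented.

```python
# Sketch: per-locus sumLINK and sumLOD from a pedigree-by-locus LOD matrix.
# The matrix is random placeholder data, not linkage results.
import numpy as np

rng = np.random.default_rng(7)
lod = rng.normal(0.0, 0.5, size=(190, 400))   # assumed pedigree-by-locus LODs

nominal = 0.588   # LOD corresponding to nominal pointwise p ~= 0.05
sumlink = np.where(lod >= nominal, lod, 0.0).sum(axis=0)   # sumLINK per locus
sumlod = np.where(lod > 0.0, lod, 0.0).sum(axis=0)         # sumLOD per locus

print("max sumLINK:", sumlink.max(), "at locus", int(sumlink.argmax()))
# Significance would come from recomputing these statistics after shuffling
# each pedigree's genome-wide LOD curve (the genome-shuffling null).
```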

  3. Semantic Annotation of Complex Text Structures in Problem Reports

    NASA Technical Reports Server (NTRS)

    Malin, Jane T.; Throop, David R.; Fleming, Land D.

    2011-01-01

    Text analysis is important for effective information retrieval from databases where the critical information is embedded in text fields. Aerospace safety depends on effective retrieval of relevant and related problem reports for the purpose of trend analysis. The complex text syntax in problem descriptions has limited statistical text mining of problem reports. The presentation describes an intelligent tagging approach that applies syntactic and then semantic analysis to overcome this problem. The tags identify types of problems and equipment that are embedded in the text descriptions. The power of these tags is illustrated in a faceted searching and browsing interface for problem report trending that combines automatically generated tags with database code fields and temporal information.

  4. How Mean is the Mean?

    PubMed Central

    Speelman, Craig P.; McGann, Marek

    2013-01-01

    In this paper we voice concerns about the uncritical manner in which the mean is often used as a summary statistic in psychological research. We identify a number of implicit assumptions underlying the use of the mean and argue that the fragility of these assumptions should be more carefully considered. We examine some of the ways in which the potential violation of these assumptions can lead us into significant theoretical and methodological error. Illustrations of alternative models of research already extant within Psychology are used to explore methods of research less mean-dependent and suggest that a critical assessment of the assumptions underlying its use in research play a more explicit role in the process of study design and review. PMID:23888147

  5. Risk Identification in a Smart Monitoring System Used to Preserve Artefacts Based on Textile Materials

    NASA Astrophysics Data System (ADS)

    Diaconescu, V. D.; Scripcariu, L.; Mătăsaru, P. D.; Diaconescu, M. R.; Ignat, C. A.

    2018-06-01

    Exhibited textile-materials-based artefacts can be affected by the environmental conditions. A smart monitoring system that commands an adaptive automatic environment control system is proposed for indoor exhibition spaces containing various textile artefacts. All exhibited objects are monitored by many multi-sensor nodes containing temperature, relative humidity and light sensors. Data collected periodically from the entire sensor network is stored in a database and statistically processed in order to identify and classify the environment risk. Risk consequences are analyzed depending on the risk class and the smart system commands different control measures in order to stabilize the indoor environment conditions to the recommended values and prevent material degradation.

  6. Nonlocal polarization interferometer for entanglement detection

    DOE PAGES

    Williams, Brian P.; Humble, Travis S.; Grice, Warren P.

    2014-10-30

    We report a nonlocal interferometer capable of detecting entanglement and identifying Bell states statistically. This is possible due to the interferometer's unique correlation dependence on the antidiagonal elements of the density matrix, which have distinct bounds for separable states and unique values for the four Bell states. The interferometer consists of two spatially separated balanced Mach-Zehnder or Sagnac interferometers that share a polarization-entangled source. Correlations between these interferometers exhibit nonlocal interference, while single-photon interference is suppressed. This interferometer also allows for a unique version of the Clauser-Horne-Shimony-Holt Bell test where the local reality is the photon polarization. In conclusion, we present the relevant theory and experimental results.

  7. A case report of pornography addiction with dhat syndrome

    PubMed Central

    Darshan, M. S.; Sathyanarayana Rao, T. S.; Manickam, Sam; Tandon, Abhinav; Ram, Dushad

    2014-01-01

    A case of pornography addiction with dhat syndrome was diagnosed applying the existing criteria for substance dependence in the International Classification of Diseases-10 and the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision. There is a lack of clear-cut criteria for identifying and defining such behavioural addictions, and also a lack of medical documentation on pornography addiction. A treatment strategy along the lines of that used for substance addictions was applied, and we found that it helped our patient to gradually de-addict and then completely quit watching pornography. This is one of the few cases being reported scientifically, and we hope more work will be carried out on this ever-increasing problem of pornography addiction. PMID:25568482

  8. Multiscale statistics of trajectories with applications to fluid particles in turbulence and football players

    NASA Astrophysics Data System (ADS)

    Schneider, Kai; Kadoch, Benjamin; Bos, Wouter

    2017-11-01

    The angle between two subsequent particle displacement increments is evaluated as a function of the time lag. The directional change of particles can thus be quantified at different scales and multiscale statistics can be performed. Flow-dependent and geometry-dependent features can be distinguished. The mean angle satisfies scaling behaviors for short time lags based on the smoothness of the trajectories. For intermediate time lags a power-law behavior can be observed for some turbulent flows, which can be related to Kolmogorov scaling. The long-time behavior depends on the confinement geometry of the flow. We show that the shape of the probability distribution function of the directional change can be well described by a Fisher distribution. Results for two-dimensional (direct and inverse cascade) and three-dimensional turbulence, with and without confinement, illustrate the properties of the proposed multiscale statistics. The presented Monte Carlo simulations allow disentangling geometry-dependent and flow-independent features. Finally, we also analyze trajectories of football players, which are, in general, not randomly spaced on a field.
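
    The core quantity here is the angle between two subsequent displacement increments at a given time lag; a minimal NumPy sketch of this multiscale statistic (variable names and the toy random-walk data are illustrative, not from the paper) is given below.

      import numpy as np

      def mean_directional_change(traj, lag):
          """Mean absolute angle between subsequent displacement increments at a time lag.
          traj: array of shape (n_samples, n_dim) with particle (or player) positions."""
          inc = traj[lag:] - traj[:-lag]          # displacement increments over the lag
          a, b = inc[:-lag], inc[lag:]            # pairs of subsequent increments
          cosang = np.sum(a * b, axis=1) / (
              np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
          return np.mean(np.abs(np.arccos(np.clip(cosang, -1.0, 1.0))))

      rng = np.random.default_rng(0)
      walk = np.cumsum(rng.standard_normal((10_000, 2)), axis=0)   # toy 2-D random walk
      for lag in (1, 10, 100):
          print(lag, mean_directional_change(walk, lag))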

  9. The Influence of Age and Sex on Genetic Associations with Adult Body Size and Shape: A Large-Scale Genome-Wide Interaction Study.

    PubMed

    Winkler, Thomas W; Justice, Anne E; Graff, Mariaelisa; Barata, Llilda; Feitosa, Mary F; Chu, Su; Czajkowski, Jacek; Esko, Tõnu; Fall, Tove; Kilpeläinen, Tuomas O; Lu, Yingchang; Mägi, Reedik; Mihailov, Evelin; Pers, Tune H; Rüeger, Sina; Teumer, Alexander; Ehret, Georg B; Ferreira, Teresa; Heard-Costa, Nancy L; Karjalainen, Juha; Lagou, Vasiliki; Mahajan, Anubha; Neinast, Michael D; Prokopenko, Inga; Simino, Jeannette; Teslovich, Tanya M; Jansen, Rick; Westra, Harm-Jan; White, Charles C; Absher, Devin; Ahluwalia, Tarunveer S; Ahmad, Shafqat; Albrecht, Eva; Alves, Alexessander Couto; Bragg-Gresham, Jennifer L; de Craen, Anton J M; Bis, Joshua C; Bonnefond, Amélie; Boucher, Gabrielle; Cadby, Gemma; Cheng, Yu-Ching; Chiang, Charleston W K; Delgado, Graciela; Demirkan, Ayse; Dueker, Nicole; Eklund, Niina; Eiriksdottir, Gudny; Eriksson, Joel; Feenstra, Bjarke; Fischer, Krista; Frau, Francesca; Galesloot, Tessel E; Geller, Frank; Goel, Anuj; Gorski, Mathias; Grammer, Tanja B; Gustafsson, Stefan; Haitjema, Saskia; Hottenga, Jouke-Jan; Huffman, Jennifer E; Jackson, Anne U; Jacobs, Kevin B; Johansson, Åsa; Kaakinen, Marika; Kleber, Marcus E; Lahti, Jari; Mateo Leach, Irene; Lehne, Benjamin; Liu, Youfang; Lo, Ken Sin; Lorentzon, Mattias; Luan, Jian'an; Madden, Pamela A F; Mangino, Massimo; McKnight, Barbara; Medina-Gomez, Carolina; Monda, Keri L; Montasser, May E; Müller, Gabriele; Müller-Nurasyid, Martina; Nolte, Ilja M; Panoutsopoulou, Kalliope; Pascoe, Laura; Paternoster, Lavinia; Rayner, Nigel W; Renström, Frida; Rizzi, Federica; Rose, Lynda M; Ryan, Kathy A; Salo, Perttu; Sanna, Serena; Scharnagl, Hubert; Shi, Jianxin; Smith, Albert Vernon; Southam, Lorraine; Stančáková, Alena; Steinthorsdottir, Valgerdur; Strawbridge, Rona J; Sung, Yun Ju; Tachmazidou, Ioanna; Tanaka, Toshiko; Thorleifsson, Gudmar; Trompet, Stella; Pervjakova, Natalia; Tyrer, Jonathan P; Vandenput, Liesbeth; van der Laan, Sander W; van der Velde, Nathalie; van Setten, Jessica; van Vliet-Ostaptchouk, Jana V; Verweij, Niek; Vlachopoulou, Efthymia; Waite, Lindsay L; Wang, Sophie R; Wang, Zhaoming; Wild, Sarah H; Willenborg, Christina; Wilson, James F; Wong, Andrew; Yang, Jian; Yengo, Loïc; Yerges-Armstrong, Laura M; Yu, Lei; Zhang, Weihua; Zhao, Jing Hua; Andersson, Ehm A; Bakker, Stephan J L; Baldassarre, Damiano; Banasik, Karina; Barcella, Matteo; Barlassina, Cristina; Bellis, Claire; Benaglio, Paola; Blangero, John; Blüher, Matthias; Bonnet, Fabrice; Bonnycastle, Lori L; Boyd, Heather A; Bruinenberg, Marcel; Buchman, Aron S; Campbell, Harry; Chen, Yii-Der Ida; Chines, Peter S; Claudi-Boehm, Simone; Cole, John; Collins, Francis S; de Geus, Eco J C; de Groot, Lisette C P G M; Dimitriou, Maria; Duan, Jubao; Enroth, Stefan; Eury, Elodie; Farmaki, Aliki-Eleni; Forouhi, Nita G; Friedrich, Nele; Gejman, Pablo V; Gigante, Bruna; Glorioso, Nicola; Go, Alan S; Gottesman, Omri; Gräßler, Jürgen; Grallert, Harald; Grarup, Niels; Gu, Yu-Mei; Broer, Linda; Ham, Annelies C; Hansen, Torben; Harris, Tamara B; Hartman, Catharina A; Hassinen, Maija; Hastie, Nicholas; Hattersley, Andrew T; Heath, Andrew C; Henders, Anjali K; Hernandez, Dena; Hillege, Hans; Holmen, Oddgeir; Hovingh, Kees G; Hui, Jennie; Husemoen, Lise L; Hutri-Kähönen, Nina; Hysi, Pirro G; Illig, Thomas; De Jager, Philip L; Jalilzadeh, Shapour; Jørgensen, Torben; Jukema, J Wouter; Juonala, Markus; Kanoni, Stavroula; Karaleftheri, Maria; Khaw, Kay Tee; Kinnunen, Leena; Kittner, Steven J; Koenig, Wolfgang; Kolcic, Ivana; Kovacs, Peter; Krarup, Nikolaj T; Kratzer, Wolfgang; 
Krüger, Janine; Kuh, Diana; Kumari, Meena; Kyriakou, Theodosios; Langenberg, Claudia; Lannfelt, Lars; Lanzani, Chiara; Lotay, Vaneet; Launer, Lenore J; Leander, Karin; Lindström, Jaana; Linneberg, Allan; Liu, Yan-Ping; Lobbens, Stéphane; Luben, Robert; Lyssenko, Valeriya; Männistö, Satu; Magnusson, Patrik K; McArdle, Wendy L; Menni, Cristina; Merger, Sigrun; Milani, Lili; Montgomery, Grant W; Morris, Andrew P; Narisu, Narisu; Nelis, Mari; Ong, Ken K; Palotie, Aarno; Pérusse, Louis; Pichler, Irene; Pilia, Maria G; Pouta, Anneli; Rheinberger, Myriam; Ribel-Madsen, Rasmus; Richards, Marcus; Rice, Kenneth M; Rice, Treva K; Rivolta, Carlo; Salomaa, Veikko; Sanders, Alan R; Sarzynski, Mark A; Scholtens, Salome; Scott, Robert A; Scott, William R; Sebert, Sylvain; Sengupta, Sebanti; Sennblad, Bengt; Seufferlein, Thomas; Silveira, Angela; Slagboom, P Eline; Smit, Jan H; Sparsø, Thomas H; Stirrups, Kathleen; Stolk, Ronald P; Stringham, Heather M; Swertz, Morris A; Swift, Amy J; Syvänen, Ann-Christine; Tan, Sian-Tsung; Thorand, Barbara; Tönjes, Anke; Tremblay, Angelo; Tsafantakis, Emmanouil; van der Most, Peter J; Völker, Uwe; Vohl, Marie-Claude; Vonk, Judith M; Waldenberger, Melanie; Walker, Ryan W; Wennauer, Roman; Widén, Elisabeth; Willemsen, Gonneke; Wilsgaard, Tom; Wright, Alan F; Zillikens, M Carola; van Dijk, Suzanne C; van Schoor, Natasja M; Asselbergs, Folkert W; de Bakker, Paul I W; Beckmann, Jacques S; Beilby, John; Bennett, David A; Bergman, Richard N; Bergmann, Sven; Böger, Carsten A; Boehm, Bernhard O; Boerwinkle, Eric; Boomsma, Dorret I; Bornstein, Stefan R; Bottinger, Erwin P; Bouchard, Claude; Chambers, John C; Chanock, Stephen J; Chasman, Daniel I; Cucca, Francesco; Cusi, Daniele; Dedoussis, George; Erdmann, Jeanette; Eriksson, Johan G; Evans, Denis A; de Faire, Ulf; Farrall, Martin; Ferrucci, Luigi; Ford, Ian; Franke, Lude; Franks, Paul W; Froguel, Philippe; Gansevoort, Ron T; Gieger, Christian; Grönberg, Henrik; Gudnason, Vilmundur; Gyllensten, Ulf; Hall, Per; Hamsten, Anders; van der Harst, Pim; Hayward, Caroline; Heliövaara, Markku; Hengstenberg, Christian; Hicks, Andrew A; Hingorani, Aroon; Hofman, Albert; Hu, Frank; Huikuri, Heikki V; Hveem, Kristian; James, Alan L; Jordan, Joanne M; Jula, Antti; Kähönen, Mika; Kajantie, Eero; Kathiresan, Sekar; Kiemeney, Lambertus A L M; Kivimaki, Mika; Knekt, Paul B; Koistinen, Heikki A; Kooner, Jaspal S; Koskinen, Seppo; Kuusisto, Johanna; Maerz, Winfried; Martin, Nicholas G; Laakso, Markku; Lakka, Timo A; Lehtimäki, Terho; Lettre, Guillaume; Levinson, Douglas F; Lind, Lars; Lokki, Marja-Liisa; Mäntyselkä, Pekka; Melbye, Mads; Metspalu, Andres; Mitchell, Braxton D; Moll, Frans L; Murray, Jeffrey C; Musk, Arthur W; Nieminen, Markku S; Njølstad, Inger; Ohlsson, Claes; Oldehinkel, Albertine J; Oostra, Ben A; Palmer, Lyle J; Pankow, James S; Pasterkamp, Gerard; Pedersen, Nancy L; Pedersen, Oluf; Penninx, Brenda W; Perola, Markus; Peters, Annette; Polašek, Ozren; Pramstaller, Peter P; Psaty, Bruce M; Qi, Lu; Quertermous, Thomas; Raitakari, Olli T; Rankinen, Tuomo; Rauramaa, Rainer; Ridker, Paul M; Rioux, John D; Rivadeneira, Fernando; Rotter, Jerome I; Rudan, Igor; den Ruijter, Hester M; Saltevo, Juha; Sattar, Naveed; Schunkert, Heribert; Schwarz, Peter E H; Shuldiner, Alan R; Sinisalo, Juha; Snieder, Harold; Sørensen, Thorkild I A; Spector, Tim D; Staessen, Jan A; Stefania, Bandinelli; Thorsteinsdottir, Unnur; Stumvoll, Michael; Tardif, Jean-Claude; Tremoli, Elena; Tuomilehto, Jaakko; Uitterlinden, André G; Uusitupa, Matti; Verbeek, André L M; 
Vermeulen, Sita H; Viikari, Jorma S; Vitart, Veronique; Völzke, Henry; Vollenweider, Peter; Waeber, Gérard; Walker, Mark; Wallaschofski, Henri; Wareham, Nicholas J; Watkins, Hugh; Zeggini, Eleftheria; Chakravarti, Aravinda; Clegg, Deborah J; Cupples, L Adrienne; Gordon-Larsen, Penny; Jaquish, Cashell E; Rao, D C; Abecasis, Goncalo R; Assimes, Themistocles L; Barroso, Inês; Berndt, Sonja I; Boehnke, Michael; Deloukas, Panos; Fox, Caroline S; Groop, Leif C; Hunter, David J; Ingelsson, Erik; Kaplan, Robert C; McCarthy, Mark I; Mohlke, Karen L; O'Connell, Jeffrey R; Schlessinger, David; Strachan, David P; Stefansson, Kari; van Duijn, Cornelia M; Hirschhorn, Joel N; Lindgren, Cecilia M; Heid, Iris M; North, Kari E; Borecki, Ingrid B; Kutalik, Zoltán; Loos, Ruth J F

    2015-10-01

    Genome-wide association studies (GWAS) have identified more than 100 genetic variants contributing to BMI, a measure of body size, or waist-to-hip ratio (adjusted for BMI, WHRadjBMI), a measure of body shape. Body size and shape change as people grow older and these changes differ substantially between men and women. To systematically screen for age- and/or sex-specific effects of genetic variants on BMI and WHRadjBMI, we performed meta-analyses of 114 studies (up to 320,485 individuals of European descent) with genome-wide chip and/or Metabochip data by the Genetic Investigation of Anthropometric Traits (GIANT) Consortium. Each study tested the association of up to ~2.8M SNPs with BMI and WHRadjBMI in four strata (men ≤50y, men >50y, women ≤50y, women >50y) and summary statistics were combined in stratum-specific meta-analyses. We then screened for variants that showed age-specific effects (G x AGE), sex-specific effects (G x SEX) or age-specific effects that differed between men and women (G x AGE x SEX). For BMI, we identified 15 loci (11 previously established for main effects, four novel) that showed significant (FDR<5%) age-specific effects, of which 11 had larger effects in younger (<50y) than in older adults (≥50y). No sex-dependent effects were identified for BMI. For WHRadjBMI, we identified 44 loci (27 previously established for main effects, 17 novel) with sex-specific effects, of which 28 showed larger effects in women than in men, five showed larger effects in men than in women, and 11 showed opposite effects between sexes. No age-dependent effects were identified for WHRadjBMI. This is the first genome-wide interaction meta-analysis to report convincing evidence of age-dependent genetic effects on BMI. In addition, we confirm the sex-specificity of genetic effects on WHRadjBMI. These results may provide further insights into the biology that underlies weight change with age or the sexual dimorphism of body shape.

  10. The Influence of Age and Sex on Genetic Associations with Adult Body Size and Shape: A Large-Scale Genome-Wide Interaction Study

    PubMed Central

    Feitosa, Mary F.; Chu, Su; Czajkowski, Jacek; Esko, Tõnu; Fall, Tove; Kilpeläinen, Tuomas O.; Lu, Yingchang; Mägi, Reedik; Mihailov, Evelin; Pers, Tune H.; Rüeger, Sina; Teumer, Alexander; Ehret, Georg B.; Ferreira, Teresa; Heard-Costa, Nancy L.; Karjalainen, Juha; Lagou, Vasiliki; Mahajan, Anubha; Neinast, Michael D.; Prokopenko, Inga; Simino, Jeannette; Teslovich, Tanya M.; Jansen, Rick; Westra, Harm-Jan; White, Charles C.; Absher, Devin; Ahluwalia, Tarunveer S.; Ahmad, Shafqat; Albrecht, Eva; Alves, Alexessander Couto; Bragg-Gresham, Jennifer L.; de Craen, Anton J. M.; Bis, Joshua C.; Bonnefond, Amélie; Boucher, Gabrielle; Cadby, Gemma; Cheng, Yu-Ching; Chiang, Charleston W. K.; Delgado, Graciela; Demirkan, Ayse; Dueker, Nicole; Eklund, Niina; Eiriksdottir, Gudny; Eriksson, Joel; Feenstra, Bjarke; Fischer, Krista; Frau, Francesca; Galesloot, Tessel E.; Geller, Frank; Goel, Anuj; Gorski, Mathias; Grammer, Tanja B.; Gustafsson, Stefan; Haitjema, Saskia; Hottenga, Jouke-Jan; Huffman, Jennifer E.; Jackson, Anne U.; Jacobs, Kevin B.; Johansson, Åsa; Kaakinen, Marika; Kleber, Marcus E.; Lahti, Jari; Leach, Irene Mateo; Lehne, Benjamin; Liu, Youfang; Lo, Ken Sin; Lorentzon, Mattias; Luan, Jian'an; Madden, Pamela A. F.; Mangino, Massimo; McKnight, Barbara; Medina-Gomez, Carolina; Monda, Keri L.; Montasser, May E.; Müller, Gabriele; Müller-Nurasyid, Martina; Nolte, Ilja M.; Panoutsopoulou, Kalliope; Pascoe, Laura; Paternoster, Lavinia; Rayner, Nigel W.; Renström, Frida; Rizzi, Federica; Rose, Lynda M.; Ryan, Kathy A.; Salo, Perttu; Sanna, Serena; Scharnagl, Hubert; Shi, Jianxin; Smith, Albert Vernon; Southam, Lorraine; Stančáková, Alena; Steinthorsdottir, Valgerdur; Strawbridge, Rona J.; Sung, Yun Ju; Tachmazidou, Ioanna; Tanaka, Toshiko; Thorleifsson, Gudmar; Trompet, Stella; Pervjakova, Natalia; Tyrer, Jonathan P.; Vandenput, Liesbeth; van der Laan, Sander W; van der Velde, Nathalie; van Setten, Jessica; van Vliet-Ostaptchouk, Jana V.; Verweij, Niek; Vlachopoulou, Efthymia; Waite, Lindsay L.; Wang, Sophie R.; Wang, Zhaoming; Wild, Sarah H.; Willenborg, Christina; Wilson, James F.; Wong, Andrew; Yang, Jian; Yengo, Loïc; Yerges-Armstrong, Laura M.; Yu, Lei; Zhang, Weihua; Zhao, Jing Hua; Andersson, Ehm A.; Bakker, Stephan J. L.; Baldassarre, Damiano; Banasik, Karina; Barcella, Matteo; Barlassina, Cristina; Bellis, Claire; Benaglio, Paola; Blangero, John; Blüher, Matthias; Bonnet, Fabrice; Bonnycastle, Lori L.; Boyd, Heather A.; Bruinenberg, Marcel; Buchman, Aron S; Campbell, Harry; Chen, Yii-Der Ida; Chines, Peter S.; Claudi-Boehm, Simone; Cole, John; Collins, Francis S.; de Geus, Eco J. C.; de Groot, Lisette C. P. G. M.; Dimitriou, Maria; Duan, Jubao; Enroth, Stefan; Eury, Elodie; Farmaki, Aliki-Eleni; Forouhi, Nita G.; Friedrich, Nele; Gejman, Pablo V.; Gigante, Bruna; Glorioso, Nicola; Go, Alan S.; Gottesman, Omri; Gräßler, Jürgen; Grallert, Harald; Grarup, Niels; Gu, Yu-Mei; Broer, Linda; Ham, Annelies C.; Hansen, Torben; Harris, Tamara B.; Hartman, Catharina A.; Hassinen, Maija; Hastie, Nicholas; Hattersley, Andrew T.; Heath, Andrew C.; Henders, Anjali K.; Hernandez, Dena; Hillege, Hans; Holmen, Oddgeir; Hovingh, Kees G; Hui, Jennie; Husemoen, Lise L.; Hutri-Kähönen, Nina; Hysi, Pirro G.; Illig, Thomas; De Jager, Philip L.; Jalilzadeh, Shapour; Jørgensen, Torben; Jukema, J. 
Wouter; Juonala, Markus; Kanoni, Stavroula; Karaleftheri, Maria; Khaw, Kay Tee; Kinnunen, Leena; Kittner, Steven J.; Koenig, Wolfgang; Kolcic, Ivana; Kovacs, Peter; Krarup, Nikolaj T.; Kratzer, Wolfgang; Krüger, Janine; Kuh, Diana; Kumari, Meena; Kyriakou, Theodosios; Langenberg, Claudia; Lannfelt, Lars; Lanzani, Chiara; Lotay, Vaneet; Launer, Lenore J.; Leander, Karin; Lindström, Jaana; Linneberg, Allan; Liu, Yan-Ping; Lobbens, Stéphane; Luben, Robert; Lyssenko, Valeriya; Männistö, Satu; Magnusson, Patrik K.; McArdle, Wendy L.; Menni, Cristina; Merger, Sigrun; Milani, Lili; Montgomery, Grant W.; Morris, Andrew P.; Narisu, Narisu; Nelis, Mari; Ong, Ken K.; Palotie, Aarno; Pérusse, Louis; Pichler, Irene; Pilia, Maria G.; Pouta, Anneli; Rheinberger, Myriam; Ribel-Madsen, Rasmus; Richards, Marcus; Rice, Kenneth M.; Rice, Treva K.; Rivolta, Carlo; Salomaa, Veikko; Sanders, Alan R.; Sarzynski, Mark A.; Scholtens, Salome; Scott, Robert A.; Scott, William R.; Sebert, Sylvain; Sengupta, Sebanti; Sennblad, Bengt; Seufferlein, Thomas; Silveira, Angela; Slagboom, P. Eline; Smit, Jan H.; Sparsø, Thomas H.; Stirrups, Kathleen; Stolk, Ronald P.; Stringham, Heather M.; Swertz, Morris A; Swift, Amy J.; Syvänen, Ann-Christine; Tan, Sian-Tsung; Thorand, Barbara; Tönjes, Anke; Tremblay, Angelo; Tsafantakis, Emmanouil; van der Most, Peter J.; Völker, Uwe; Vohl, Marie-Claude; Vonk, Judith M.; Waldenberger, Melanie; Walker, Ryan W.; Wennauer, Roman; Widén, Elisabeth; Willemsen, Gonneke; Wilsgaard, Tom; Wright, Alan F.; Zillikens, M. Carola; van Dijk, Suzanne C.; van Schoor, Natasja M.; Asselbergs, Folkert W.; de Bakker, Paul I. W.; Beckmann, Jacques S.; Beilby, John; Bennett, David A.; Bergman, Richard N.; Bergmann, Sven; Böger, Carsten A.; Boehm, Bernhard O.; Boerwinkle, Eric; Boomsma, Dorret I.; Bornstein, Stefan R.; Bottinger, Erwin P.; Bouchard, Claude; Chambers, John C.; Chanock, Stephen J.; Chasman, Daniel I.; Cucca, Francesco; Cusi, Daniele; Dedoussis, George; Erdmann, Jeanette; Eriksson, Johan G.; Evans, Denis A.; de Faire, Ulf; Farrall, Martin; Ferrucci, Luigi; Ford, Ian; Franke, Lude; Franks, Paul W.; Froguel, Philippe; Gansevoort, Ron T.; Gieger, Christian; Grönberg, Henrik; Gudnason, Vilmundur; Gyllensten, Ulf; Hall, Per; Hamsten, Anders; van der Harst, Pim; Hayward, Caroline; Heliövaara, Markku; Hengstenberg, Christian; Hicks, Andrew A; Hingorani, Aroon; Hofman, Albert; Hu, Frank; Huikuri, Heikki V.; Hveem, Kristian; James, Alan L.; Jordan, Joanne M.; Jula, Antti; Kähönen, Mika; Kajantie, Eero; Kathiresan, Sekar; Kiemeney, Lambertus A. L. M.; Kivimaki, Mika; Knekt, Paul B.; Koistinen, Heikki A.; Kooner, Jaspal S.; Koskinen, Seppo; Kuusisto, Johanna; Maerz, Winfried; Martin, Nicholas G; Laakso, Markku; Lakka, Timo A.; Lehtimäki, Terho; Lettre, Guillaume; Levinson, Douglas F.; Lind, Lars; Lokki, Marja-Liisa; Mäntyselkä, Pekka; Melbye, Mads; Metspalu, Andres; Mitchell, Braxton D.; Moll, Frans L.; Murray, Jeffrey C.; Musk, Arthur W.; Nieminen, Markku S.; Njølstad, Inger; Ohlsson, Claes; Oldehinkel, Albertine J.; Oostra, Ben A.; Palmer, Lyle J; Pankow, James S.; Pasterkamp, Gerard; Pedersen, Nancy L.; Pedersen, Oluf; Penninx, Brenda W.; Perola, Markus; Peters, Annette; Polašek, Ozren; Pramstaller, Peter P.; Psaty, Bruce M.; Qi, Lu; Quertermous, Thomas; Raitakari, Olli T.; Rankinen, Tuomo; Rauramaa, Rainer; Ridker, Paul M.; Rioux, John D.; Rivadeneira, Fernando; Rotter, Jerome I.; Rudan, Igor; den Ruijter, Hester M.; Saltevo, Juha; Sattar, Naveed; Schunkert, Heribert; Schwarz, Peter E. 
H.; Shuldiner, Alan R.; Sinisalo, Juha; Snieder, Harold; Sørensen, Thorkild I. A.; Spector, Tim D.; Staessen, Jan A.; Stefania, Bandinelli; Thorsteinsdottir, Unnur; Stumvoll, Michael; Tardif, Jean-Claude; Tremoli, Elena; Tuomilehto, Jaakko; Uitterlinden, André G.; Uusitupa, Matti; Verbeek, André L. M.; Vermeulen, Sita H.; Viikari, Jorma S.; Vitart, Veronique; Völzke, Henry; Vollenweider, Peter; Waeber, Gérard; Walker, Mark; Wallaschofski, Henri; Wareham, Nicholas J.; Watkins, Hugh; Zeggini, Eleftheria; Chakravarti, Aravinda; Clegg, Deborah J.; Cupples, L. Adrienne; Gordon-Larsen, Penny; Jaquish, Cashell E.; Rao, D. C.; Abecasis, Goncalo R.; Assimes, Themistocles L.; Barroso, Inês; Berndt, Sonja I.; Boehnke, Michael; Deloukas, Panos; Fox, Caroline S.; Groop, Leif C.; Hunter, David J.; Ingelsson, Erik; Kaplan, Robert C.; McCarthy, Mark I.; Mohlke, Karen L.; O'Connell, Jeffrey R.; Schlessinger, David; Strachan, David P.; Stefansson, Kari; van Duijn, Cornelia M.; Hirschhorn, Joel N.; Lindgren, Cecilia M.; Heid, Iris M.; North, Kari E.; Borecki, Ingrid B.; Kutalik, Zoltán; Loos, Ruth J. F.

    2015-01-01

    Genome-wide association studies (GWAS) have identified more than 100 genetic variants contributing to BMI, a measure of body size, or waist-to-hip ratio (adjusted for BMI, WHRadjBMI), a measure of body shape. Body size and shape change as people grow older and these changes differ substantially between men and women. To systematically screen for age- and/or sex-specific effects of genetic variants on BMI and WHRadjBMI, we performed meta-analyses of 114 studies (up to 320,485 individuals of European descent) with genome-wide chip and/or Metabochip data by the Genetic Investigation of Anthropometric Traits (GIANT) Consortium. Each study tested the association of up to ~2.8M SNPs with BMI and WHRadjBMI in four strata (men ≤50y, men >50y, women ≤50y, women >50y) and summary statistics were combined in stratum-specific meta-analyses. We then screened for variants that showed age-specific effects (G x AGE), sex-specific effects (G x SEX) or age-specific effects that differed between men and women (G x AGE x SEX). For BMI, we identified 15 loci (11 previously established for main effects, four novel) that showed significant (FDR<5%) age-specific effects, of which 11 had larger effects in younger (<50y) than in older adults (≥50y). No sex-dependent effects were identified for BMI. For WHRadjBMI, we identified 44 loci (27 previously established for main effects, 17 novel) with sex-specific effects, of which 28 showed larger effects in women than in men, five showed larger effects in men than in women, and 11 showed opposite effects between sexes. No age-dependent effects were identified for WHRadjBMI. This is the first genome-wide interaction meta-analysis to report convincing evidence of age-dependent genetic effects on BMI. In addition, we confirm the sex-specificity of genetic effects on WHRadjBMI. These results may provide further insights into the biology that underlies weight change with age or the sexual dimorphism of body shape. PMID:26426971

  11. Research methodology in dentistry: Part II — The relevance of statistics in research

    PubMed Central

    Krithikadatta, Jogikalmat; Valarmathi, Srinivasan

    2012-01-01

    The lifeline of original research depends on adept statistical analysis. However, there have been reports of statistical misconduct in studies that could arise from an inadequate understanding of the fundamentals of statistics. There have been several reports on this across the medical and dental literature. This article aims at encouraging the reader to approach statistics from its logic rather than its theoretical perspective. The article also provides information on statistical misuse in the Journal of Conservative Dentistry between the years 2008 and 2011. PMID:22876003

  12. Teach a Confidence Interval for the Median in the First Statistics Course

    ERIC Educational Resources Information Center

    Howington, Eric B.

    2017-01-01

    Few introductory statistics courses consider statistical inference for the median. This article argues in favour of adding a confidence interval for the median to the first statistics course. Several methods suitable for introductory statistics students are identified and briefly reviewed.
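
    One method suitable for introductory students is the distribution-free interval built from order statistics and the binomial distribution; the sketch below is a generic illustration of that idea (not necessarily one of the specific methods reviewed in the article).

      from math import comb

      def median_ci(data, conf=0.95):
          """Distribution-free confidence interval for the median using order statistics.
          Returns (lower, upper, achieved_coverage)."""
          x = sorted(data)
          n = len(x)
          best = None
          # P(X_(k) <= median <= X_(n+1-k)) = 1 - 2 * P(Binomial(n, 0.5) < k)
          for k in range(1, n // 2 + 1):
              tail = sum(comb(n, i) for i in range(k)) / 2 ** n
              coverage = 1 - 2 * tail
              if coverage >= conf:
                  best = (x[k - 1], x[n - k], coverage)   # narrowest interval so far
          return best

      print(median_ci([2.1, 3.5, 3.9, 4.2, 4.8, 5.0, 5.5, 6.1, 7.3, 9.0]))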

  13. Speech Segmentation by Statistical Learning Depends on Attention

    ERIC Educational Resources Information Center

    Toro, Juan M.; Sinnett, Scott; Soto-Faraco, Salvador

    2005-01-01

    We addressed the hypothesis that word segmentation based on statistical regularities occurs without the need of attention. Participants were presented with a stream of artificial speech in which the only cue to extract the words was the presence of statistical regularities between syllables. Half of the participants were asked to passively listen…

  14. Statistical Literacy: Data Tell a Story

    ERIC Educational Resources Information Center

    Sole, Marla A.

    2016-01-01

    Every day, students collect, organize, and analyze data to make decisions. In this data-driven world, people need to assess how much trust they can place in summary statistics. The results of every survey and the safety of every drug that undergoes a clinical trial depend on the correct application of appropriate statistics. Recognizing the…

  15. A Descriptive Study of Individual and Cross-Cultural Differences in Statistics Anxiety

    ERIC Educational Resources Information Center

    Baloglu, Mustafa; Deniz, M. Engin; Kesici, Sahin

    2011-01-01

    The present study investigated individual and cross-cultural differences in statistics anxiety among 223 Turkish and 237 American college students. A 2 x 2 between-subjects factorial multivariate analysis of covariance (MANCOVA) was performed on the six dependent variables which are the six subscales of the Statistical Anxiety Rating Scale.…

  16. Effect of Task Presentation on Students' Performances in Introductory Statistics Courses

    ERIC Educational Resources Information Center

    Tomasetto, Carlo; Matteucci, Maria Cristina; Carugati, Felice; Selleri, Patrizia

    2009-01-01

    Research on academic learning indicates that many students experience major difficulties with introductory statistics and methodology courses. We hypothesized that students' difficulties may depend in part on the fact that statistics tasks are commonly viewed as related to the threatening domain of math. In two field experiments which we carried…

  17. The delayed rectifier, IKI, is the major conductance in type I vestibular hair cells across vestibular end organs

    NASA Technical Reports Server (NTRS)

    Ricci, A. J.; Rennie, K. J.; Correia, M. J.

    1996-01-01

    Hair cells were dissociated from the semicircular canal, utricle, lagena and saccule of white king pigeons. Type I hair cells were identified morphologically based on the ratios of neck width to cuticular plate width (NPR < 0.72) as well as neck width to cell body width (NBR < 0.64). The perforated patch variant of the whole-cell recording technique was used to measure electrical properties from type I hair cells. In voltage-clamp, the membrane properties of all identified type I cells were dominated by a predominantly outward potassium current, previously characterized in semicircular canal as IKI. Zero-current potential, activation, deactivation, slope conductance, pharmacologic and steady-state properties of the complex currents were not statistically different between type I hair cells of different vestibular end organs. The voltage dependence causes a significant proportion of this conductance to be active about the cell's zero-current potential. The first report of the whole-cell activation kinetics of the conductance is presented, showing a voltage dependence that could be best fit by an equation for a single exponential. Results presented here are the first data from pigeon dissociated type I hair cells from utricle, saccule and lagena suggesting that the basolateral conductances of a morphologically identified population of type I hair cells are conserved between functionally different vestibular end organs; the major conductance being a delayed rectifier characterized previously in semicircular canal hair cells as IKI.

  18. Environmental risk of leptospirosis infections in the Netherlands: Spatial modelling of environmental risk factors of leptospirosis in the Netherlands.

    PubMed

    Rood, Ente J J; Goris, Marga G A; Pijnacker, Roan; Bakker, Mirjam I; Hartskeerl, Rudy A

    2017-01-01

    Leptospirosis is a globally emerging zoonotic disease, associated with various climatic, biotic and abiotic factors. Mapping and quantifying geographical variations in the occurrence of leptospirosis and the surrounding environment offer innovative methods to study disease transmission and to identify associations between the disease and the environment. This study aims to investigate geographic variations in leptospirosis incidence in the Netherlands and to identify associations with environmental factors driving the emergence of the disease. Individual case data derived over the period 1995-2012 in the Netherlands were geocoded and aggregated by municipality. Environmental covariate data were extracted for each municipality and stored in a spatial database. Spatial clusters were identified using kernel density estimations and quantified using local autocorrelation statistics. Associations between the incidence of leptospirosis and the local environment were determined using Simultaneous Autoregressive Models (SAR) explicitly modelling spatial dependence of the model residuals. Leptospirosis incidence rates were found to be spatially clustered, showing a marked spatial pattern. Fitting a spatial autoregressive model significantly improved model fit and revealed significant association between leptospirosis and the coverage of arable land, built up area, grassland and sabulous clay soils. The incidence of leptospirosis in the Netherlands could effectively be modelled using a combination of soil and land-use variables accounting for spatial dependence of incidence rates per municipality. The resulting spatially explicit risk predictions provide an important source of information which will benefit clinical awareness on potential leptospirosis infections in endemic areas.
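
    Spatial clustering of municipality-level incidence rates of this kind is commonly quantified with autocorrelation statistics such as Moran's I before a spatial autoregressive model is fitted; the minimal NumPy sketch below (toy adjacency and rates, not the authors' data or code) computes the global statistic.

      import numpy as np

      def morans_i(values, weights):
          """Global Moran's I for a vector of rates and a binary spatial weight matrix."""
          x = np.asarray(values, dtype=float)
          w = np.asarray(weights, dtype=float)
          z = x - x.mean()
          n = len(x)
          return (n / w.sum()) * (z @ w @ z) / (z @ z)

      # Toy example: four municipalities on a line; neighbours share an edge.
      rates = [1.0, 1.2, 5.0, 5.3]
      W = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
      print(morans_i(rates, W))   # positive value indicates spatial clustering of incidence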

  19. Environmental risk of leptospirosis infections in the Netherlands: Spatial modelling of environmental risk factors of leptospirosis in the Netherlands

    PubMed Central

    Goris, Marga G. A.; Pijnacker, Roan; Bakker, Mirjam I.; Hartskeerl, Rudy A.

    2017-01-01

    Leptospirosis is a globally emerging zoonotic disease, associated with various climatic, biotic and abiotic factors. Mapping and quantifying geographical variations in the occurrence of leptospirosis and the surrounding environment offer innovative methods to study disease transmission and to identify associations between the disease and the environment. This study aims to investigate geographic variations in leptospirosis incidence in the Netherlands and to identify associations with environmental factors driving the emergence of the disease. Individual case data derived over the period 1995–2012 in the Netherlands were geocoded and aggregated by municipality. Environmental covariate data were extracted for each municipality and stored in a spatial database. Spatial clusters were identified using kernel density estimations and quantified using local autocorrelation statistics. Associations between the incidence of leptospirosis and the local environment were determined using Simultaneous Autoregressive Models (SAR) explicitly modelling spatial dependence of the model residuals. Leptospirosis incidence rates were found to be spatially clustered, showing a marked spatial pattern. Fitting a spatial autoregressive model significantly improved model fit and revealed significant association between leptospirosis and the coverage of arable land, built up area, grassland and sabulous clay soils. The incidence of leptospirosis in the Netherlands could effectively be modelled using a combination of soil and land-use variables accounting for spatial dependence of incidence rates per municipality. The resulting spatially explicit risk predictions provide an important source of information which will benefit clinical awareness on potential leptospirosis infections in endemic areas. PMID:29065186

  20. Self-report and longitudinal predictors of violence in Iraq and Afghanistan war era veterans.

    PubMed

    Elbogen, Eric B; Johnson, Sally C; Newton, Virginia M; Fuller, Sara; Wagner, H Ryan; Beckham, Jean C

    2013-10-01

    This study, using a longitudinal design, attempted to identify whether self-reported problems with violence were empirically associated with future violent behavior among Iraq and Afghanistan war veterans and whether and how collateral informant interviews enhanced the risk assessment process. Data were gathered from N = 300 participants (n = 150 dyads of Iraq and Afghanistan war veterans and family/friends). The veterans completed baseline and follow-up interviews 3 years later on average, and family/friends provided collateral data on dependent measures at follow-up. Analyses showed that aggression toward others at follow-up was associated with younger age, posttraumatic stress disorder, combat exposure, and a history of having witnessed parental violence growing up. Self-reported problems controlling violence at baseline had robust statistical power in predicting aggression toward others at follow-up. Collateral report enhanced detection of dependent variables: 20% of cases positive for violence toward others would have been missed relying only on self-report. The results identify a subset of Iraq and Afghanistan war veterans at higher risk for problematic postdeployment adjustment and indicate that the veterans' self-report of violence was useful in predicting future aggression. Underreporting of violence was not evident for most veterans, but detection could be improved by obtaining collateral information.

  1. Knowledge about nicotine among HIV-positive smokers: Implications for tobacco regulatory science policy.

    PubMed

    Pacek, Lauren R; Rass, Olga; Johnson, Matthew W

    2017-02-01

    The present paper describes the general knowledge of smoking and nicotine among a sample of current smokers living with HIV (n=271) who were recruited via Amazon Mechanical Turk. Descriptive statistics were used to report sociodemographic and smoking characteristics, as well as knowledge about smoking and nicotine. The sample consisted of relatively light smokers, both in terms of cigarettes per day (M=8.1, SD=9.7) and dependence (67.5% had low dependence according to the Heaviness of Smoking Index). The majority of participants correctly identified smoking as being a potential cause of various smoking-related conditions and correctly identified constituents in cigarette smoke. However, a majority of participants also misattributed nicotine as being a potential cause of smoking-related illness. Accurate knowledge about nicotine was low. These misperceptions are of particular concern for vulnerable populations, such as persons living with HIV, who are disproportionately burdened by the prevalence of smoking and associated morbidities and mortality. These misperceptions could have unintended consequences in the wake of a potential nicotine reduction policy, such that reduced nicotine content products are perceived as safer than normal nicotine content products currently available for sale. Additionally, incorrect knowledge about nicotine has implications for the uptake and continued use of nicotine replacement therapy. Copyright © 2016 Elsevier Ltd. All rights reserved.

  2. Complications of Transfusion-Dependent β-Thalassemia Patients in Sistan and Baluchistan, South-East of Iran.

    PubMed

    Yaghobi, Maryam; Miri-Moghaddam, Ebrahim; Majid, Naderi; Bazi, Ali; Navidian, Ali; Kalkali, Asiyeh

    2017-10-01

    Background: Thalassemia syndromes are among the prevalent hereditary disorders imposing high expenses on health-care systems worldwide and in Iran. Organ failure represents a life-threatening challenge in transfusion-dependent β-thalassemia (TDT) patients. The purpose of the present study was to determine the frequency of organ dysfunctions among TDT patients in Sistan and Baluchistan province in the South-East of Iran. Materials and Methods: Laboratory and clinical data were extracted from medical records as well as by interviews. Standard criteria were applied to recognize cardiac, gonadal, endocrine and renal dysfunctions. The collected data were analyzed using the SPSS statistics software (Ver. 19). Results: A total of 613 TDT patients (54.3% males and 45.7% females) were included in this study. The mean age of patients was 13.3 ±7.7 years old. Cardiac events comprised the most frequently encountered complications (76.4%), followed by hypogonadism (46.8%), parathyroid dysfunction (22%), thyroid abnormalities (8.3%), diabetes (7.8%) and renal disease (1.8%). Hypogonadism was the most frequently identified complication in patients <15 years old, while cardiac complications were the most frequent sequelae in patients >15 years old (P<0.01). Conclusion: As cardiac events are significantly more common among TDT patients, close monitoring of heart function is recommended for identifying patients with cardiac problems.

  3. [Hydrologic variability and sensitivity based on Hurst coefficient and Bartels statistic].

    PubMed

    Lei, Xu; Xie, Ping; Wu, Zi Yi; Sang, Yan Fang; Zhao, Jiang Yan; Li, Bin Bin

    2018-04-01

    Due to global climate change and frequent human activities in recent years, the purely stochastic component of a hydrological sequence is mixed with one or several variation components, including jump, trend, period and dependency. There is an urgent need to clarify which indices should be used to quantify the degree of this variability. In this study, we defined the hydrological variability based on the Hurst coefficient and the Bartels statistic, and used Monte Carlo statistical tests to analyze their sensitivity to different variants. When the hydrological sequence had a jump or trend variation, both the Hurst coefficient and the Bartels statistic could reflect the variation, with the Hurst coefficient being more sensitive to weak jump or trend variation. When the sequence had a periodic component, only the Bartels statistic could detect the variation. When the sequence had a dependency, both the Hurst coefficient and the Bartels statistic could reflect the variation, with the latter able to detect weaker dependent variations. For all four variation types, both the Hurst variability and the Bartels variability increased as the variation range increased. Thus, they can be used to measure the variation intensity of a hydrological sequence. We analyzed the temperature series of different weather stations in the Lancang River basin. Results showed that the temperature at all stations exhibited an upward trend or jump, indicating that the entire basin has experienced warming in recent years, and that the temperature variability in the upper and lower reaches was much higher. This case study shows the practicability of the proposed method.
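
    A common way to estimate the Hurst coefficient used in such variability indices is rescaled-range (R/S) analysis; the sketch below is a generic textbook implementation (not the authors' specific definition of the variability index), estimating H from the slope of log(R/S) against log(window length).

      import numpy as np

      def hurst_rs(series, window_sizes=(8, 16, 32, 64, 128)):
          """Estimate the Hurst coefficient of a 1-D series by rescaled-range analysis."""
          x = np.asarray(series, dtype=float)
          log_n, log_rs = [], []
          for n in window_sizes:
              rs_values = []
              for start in range(0, len(x) - n + 1, n):
                  chunk = x[start:start + n]
                  dev = np.cumsum(chunk - chunk.mean())
                  r = dev.max() - dev.min()              # range of cumulative deviations
                  s = chunk.std()
                  if s > 0:
                      rs_values.append(r / s)
              log_n.append(np.log(n))
              log_rs.append(np.log(np.mean(rs_values)))
          slope, _ = np.polyfit(log_n, log_rs, 1)
          return slope                                    # roughly 0.5 for a purely random series

      rng = np.random.default_rng(1)
      print(hurst_rs(rng.standard_normal(4096)))          # close to 0.5 (no persistence)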

  4. Fast Identification of Biological Pathways Associated with a Quantitative Trait Using Group Lasso with Overlaps

    PubMed Central

    Silver, Matt; Montana, Giovanni

    2012-01-01

    Where causal SNPs (single nucleotide polymorphisms) tend to accumulate within biological pathways, the incorporation of prior pathways information into a statistical model is expected to increase the power to detect true associations in a genetic association study. Most existing pathways-based methods rely on marginal SNP statistics and do not fully exploit the dependence patterns among SNPs within pathways. We use a sparse regression model, with SNPs grouped into pathways, to identify causal pathways associated with a quantitative trait. Notable features of our “pathways group lasso with adaptive weights” (P-GLAW) algorithm include the incorporation of all pathways in a single regression model, an adaptive pathway weighting procedure that accounts for factors biasing pathway selection, and the use of a bootstrap sampling procedure for the ranking of important pathways. P-GLAW takes account of the presence of overlapping pathways and uses a novel combination of techniques to optimise model estimation, making it fast to run, even on whole genome datasets. In a comparison study with an alternative pathways method based on univariate SNP statistics, our method demonstrates high sensitivity and specificity for the detection of important pathways, showing the greatest relative gains in performance where marginal SNP effect sizes are small. PMID:22499682
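
    The grouped sparsity at the heart of such pathway models comes from the group-lasso penalty, which zeroes out whole blocks of coefficients at once. The sketch below is a simplified proximal-gradient version with non-overlapping groups and made-up data (it is not the adaptive-weighted, overlapping-group P-GLAW algorithm described above).

      import numpy as np

      def group_lasso(X, y, groups, lam=0.15, n_iter=500):
          """Proximal gradient descent for least squares with a non-overlapping group-lasso penalty.
          groups: list of index arrays, one per SNP group (e.g., pathway)."""
          n, p = X.shape
          beta = np.zeros(p)
          lr = n / np.linalg.norm(X, 2) ** 2           # step = 1/L for the (1/2n)||y - Xb||^2 loss
          for _ in range(n_iter):
              grad = X.T @ (X @ beta - y) / n
              beta = beta - lr * grad
              for g in groups:                         # block soft-thresholding per group
                  norm = np.linalg.norm(beta[g])
                  beta[g] = 0.0 if norm == 0 else max(0.0, 1 - lr * lam / norm) * beta[g]
          return beta

      rng = np.random.default_rng(0)
      X = rng.standard_normal((1000, 12))
      true = np.zeros(12)
      true[0:4] = [1.0, -1.0, 0.5, 0.8]                # only the first group is causal
      y = X @ true + 0.1 * rng.standard_normal(1000)
      groups = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
      # Non-causal groups are shrunk to (near) zero; the causal group survives shrinkage.
      print(np.round(group_lasso(X, y, groups), 2))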

  5. A new statistical PCA-ICA algorithm for location of R-peaks in ECG.

    PubMed

    Chawla, M P S; Verma, H K; Kumar, Vinod

    2008-09-16

    The success of ICA in separating the independent components from the mixture depends on the properties of the electrocardiogram (ECG) recordings. This paper discusses some of the conditions of independent component analysis (ICA) that could affect the reliability of the separation, and evaluates issues related to the properties of the signals and the number of sources. Principal component analysis (PCA) scatter plots are plotted to indicate the diagnostic features in the presence and absence of baseline wander in interpreting the ECG signals. In this analysis, a newly developed statistical algorithm by the authors, based on combined PCA-ICA applied to two correlated channels of 12-channel ECG data, is proposed. The ICA technique has been successfully implemented in identifying and removing noise and artifacts from ECG signals. Cleaned ECG signals are obtained using statistical measures such as kurtosis and variance of variance after ICA processing. This paper also deals with the detection of QRS complexes in electrocardiograms using the combined PCA-ICA algorithm. The efficacy of the combined PCA-ICA algorithm lies in the fact that the location of the R-peaks is bounded from above and below by the location of the cross-over points; hence none of the peaks are ignored or missed.
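
    A rough sketch of the combined PCA-ICA idea is shown below, using scikit-learn rather than the authors' own algorithm and synthetic two-channel data: PCA decorrelates the two channels, FastICA separates the sources, and kurtosis picks out the spiky component carrying the QRS-like peaks.

      import numpy as np
      from scipy.stats import kurtosis
      from sklearn.decomposition import PCA, FastICA

      rng = np.random.default_rng(0)
      t = np.arange(0, 10, 0.002)                       # 10 s at 500 Hz
      ecg = np.zeros_like(t)
      ecg[::500] = 1.0                                  # crude spiky "R peaks", one per second
      baseline = 0.5 * np.sin(2 * np.pi * 0.3 * t)      # baseline wander
      ch1 = 0.9 * ecg + 0.4 * baseline + 0.02 * rng.standard_normal(t.size)
      ch2 = 0.5 * ecg + 0.9 * baseline + 0.02 * rng.standard_normal(t.size)
      X = np.column_stack([ch1, ch2])                   # two correlated ECG-like channels

      sources = FastICA(n_components=2, random_state=0).fit_transform(
          PCA(n_components=2).fit_transform(X))
      best = np.argmax(kurtosis(sources, axis=0))       # spiky QRS component has high kurtosis
      peaks = np.where(np.abs(sources[:, best]) > 5 * sources[:, best].std())[0]
      print(len(peaks), "samples exceed the threshold (roughly one per simulated beat)")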

  6. Chloride and salicylate influence prestin-dependent specific membrane capacitance: support for the area motor model.

    PubMed

    Santos-Sacchi, Joseph; Song, Lei

    2014-04-11

    The outer hair cell is electromotile, its membrane motor identified as the protein SLC26a5 (prestin). An area motor model, based on two-state Boltzmann statistics, was developed about two decades ago and derives from the observation that outer hair cell surface area is voltage-dependent. Indeed, aside from the nonlinear capacitance imparted by the voltage sensor charge movement of prestin, linear capacitance (Clin) also displays voltage dependence as motors move between expanded and compact states. Naturally, motor surface area changes alter membrane capacitance. Unit linear motor capacitance fluctuation (δCsa) is on the order of 140 zeptofarads. A recent three-state model of prestin provides an alternative view, suggesting that voltage-dependent linear capacitance changes are not real but only apparent because the two component Boltzmann functions shift their midpoint voltages (Vh) in opposite directions during treatment with salicylate, a known competitor of required chloride binding. We show here using manipulations of nonlinear capacitance with both salicylate and chloride that an enhanced area motor model, including augmented δCsa by salicylate, can accurately account for our novel findings. We also show that although the three-state model implicitly avoids measuring voltage-dependent motor capacitance, it registers δCsa effects as a byproduct of its assessment of Clin, which increases during salicylate treatment as motors are locked in the expanded state. The area motor model, in contrast, captures the characteristics of the voltage dependence of δCsa, leading to a better understanding of prestin.
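
    In the two-state area motor model, nonlinear capacitance is the voltage derivative of a Boltzmann charge-voltage function. The sketch below evaluates that generic textbook expression with illustrative parameter values (these are assumptions for demonstration, not the authors' fitted data).

      import numpy as np

      def nlc(v_mV, q_max=1.0, z=0.8, v_half=-40.0, c_lin=20.0, temp_K=295.0):
          """Two-state Boltzmann nonlinear capacitance plus a fixed linear capacitance.
          q_max in pC, voltages in mV, z the unitary valence; returns capacitance in pF."""
          kT_over_e = 1000 * 1.380649e-23 * temp_K / 1.602177e-19   # thermal voltage in mV
          b = np.exp(-z * (v_mV - v_half) / kT_over_e)
          return c_lin + 1000 * q_max * (z / kT_over_e) * b / (1 + b) ** 2  # peak at v_half

      v = np.linspace(-150, 100, 6)
      print(np.round(nlc(v), 2))   # bell-shaped NLC centred on v_half, riding on c_lin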

  7. Prospective Effects of Adolescent Indicators of Behavioral Disinhibition on DSM-IV Alcohol, Tobacco, and Illicit Drug Dependence in Young Adulthood

    PubMed Central

    Palmer, Rohan H. C.; Knopik, Valerie S.; Rhee, Soo Hyun; Hopfer, Christian J.; Corley, Robin C.; Young, Susan E.; Stallings, Michael C.; Hewitt, John K.

    2013-01-01

    Objective: To identify robust predictors of drug dependence. Methods: This study included 2361 male and female twins from an ongoing longitudinal study at the Center for Antisocial Drug Dependence (CADD) at the University of Colorado Boulder and Denver campuses. Twins were recruited for the CADD project while they were between the ages of 12 and 18. Participants in the current study were on average approximately 15 years of age during the first wave of assessment and approximately 20 years of age at the second wave of assessment. The average time between assessments was five years. A structured interview was administered at each assessment to determine patterns of substance use and Diagnostic and Statistical Manual of Mental Disorders (DSM-IV; Fourth Edition) attention deficit hyperactivity disorder (ADHD), conduct disorder (CD), and drug dependence symptoms. Cloninger's Tridimensional Personality Questionnaire was also used to assess novelty seeking tendencies (NS). At the second wave of assessment, DSM-IV dependence symptoms were reassessed using the same interview. Path analyses were used to examine direct and indirect mechanisms linking psychopathology and drug outcomes. Results: Adolescent substance use, CD, and NS predicted young adult substance dependence, whereas the predictive effects of ADHD were few and inconsistent. Furthermore, CD and NS effects were partially mediated by adolescent substance use. Conclusions: Adolescent conduct problems, novelty seeking, and drug use are important indices of future drug problems. The strongest predictor was novelty seeking. PMID:23685327

  8. Therapeutic whole-body hypothermia reduces mortality in severe traumatic brain injury if the cooling index is sufficiently high: meta-analyses of the effect of single cooling parameters and their integrated measure.

    PubMed

    Olah, Emoke; Poto, Laszlo; Hegyi, Peter; Szabo, Imre; Hartmann, Petra; Solymar, Margit; Petervari, Erika; Balasko, Marta; Habon, Tamas; Rumbus, Zoltan; Tenk, Judit; Rostas, Ildiko; Weinberg, Jordan; Romanovsky, Andrej A; Garami, Andras

    2018-04-21

    Therapeutic hypothermia has been investigated repeatedly as a tool to improve the outcome of severe traumatic brain injury (TBI), but previous clinical trials and meta-analyses found contradictory results. We aimed to determine the effectiveness of therapeutic whole-body hypothermia on the mortality of adult patients with severe TBI by using a novel approach to meta-analysis. We searched the PubMed, EMBASE, and Cochrane Library databases from inception to February 2017. The identified human studies were evaluated regarding statistical, clinical, and methodological designs to ensure inter-study homogeneity. We extracted data on TBI severity, body temperature, mortality, and cooling parameters; we then calculated the cooling index, an integrated measure of therapeutic hypothermia. A forest plot of all identified studies showed no difference in the outcome of TBI between cooled and not-cooled patients, but inter-study heterogeneity was high. In contrast, a meta-analysis of RCTs that were homogeneous with regard to statistical and clinical design and that precisely reported the cooling protocol showed a decreased odds ratio for mortality with therapeutic hypothermia compared to no cooling. As independent factors, milder and longer cooling, and rewarming at <0.25°C/h, were associated with better outcome. Therapeutic hypothermia was beneficial only if the cooling index (a measure combining the cooling parameters) was sufficiently high. We conclude that high methodological and statistical inter-study heterogeneity could underlie the contradictory results obtained in previous studies. By analyzing methodologically homogeneous studies, we show that cooling improves the outcome of severe TBI and that this beneficial effect depends on certain cooling parameters and on their integrated measure, the cooling index.
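
    The pooled effect behind such forest plots is typically an inverse-variance weighted combination of study-level log odds ratios; the minimal fixed-effect sketch below uses made-up 2x2 tables (illustrative only, not the data of this meta-analysis).

      import math

      def pooled_odds_ratio(tables):
          """Fixed-effect inverse-variance pooling of log odds ratios from 2x2 tables
          given as (events_treated, n_treated, events_control, n_control); returns (OR, CI_low, CI_high)."""
          num = den = 0.0
          for et, nt, ec, nc in tables:
              log_or = math.log((et * (nc - ec)) / (ec * (nt - et)))
              var = 1 / et + 1 / (nt - et) + 1 / ec + 1 / (nc - ec)
              num += log_or / var
              den += 1 / var
          pooled = num / den
          se = math.sqrt(1 / den)
          return tuple(math.exp(v) for v in (pooled, pooled - 1.96 * se, pooled + 1.96 * se))

      # Hypothetical (deaths, n) pairs for cooled vs. not-cooled arms of three trials.
      trials = [(12, 50, 20, 50), (8, 40, 11, 42), (15, 60, 22, 58)]
      print(pooled_odds_ratio(trials))   # OR < 1 would favour therapeutic hypothermia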

  9. Public health information and statistics dissemination efforts for Indonesia on the Internet.

    PubMed

    Hanani, Febiana; Kobayashi, Takashi; Jo, Eitetsu; Nakajima, Sawako; Oyama, Hiroshi

    2011-01-01

    To elucidate current issues related to health statistics dissemination efforts on the Internet in Indonesia and to propose a new dissemination website as a solution. A cross-sectional survey was conducted. Sources of statistics were identified using link relationships and Google™ searches. The menus used to locate statistics, the modes of presentation and means of access to statistics, and the available statistics were assessed for each site. Assessment results were used to derive a design specification; a prototype system was developed and evaluated with a usability test. 49 sources were identified on 18 governmental, 8 international and 5 non-governmental websites. Of the 49 menus identified, 33% used non-intuitive titles and led to inefficient searches; 69% of them were on government websites. Of 31 websites, only 39% and 23% used graphs/charts and maps for presentation, respectively. Further, only 32%, 39% and 19% provided query, export and print features, respectively. While >50% of sources reported morbidity, risk factor and service provision statistics, <40% of sources reported health resource and mortality statistics. A statistics portal website was developed using the Joomla!™ content management system. The usability test demonstrated its potential to improve data accessibility. This study shows that the government's efforts to disseminate statistics in Indonesia are supported by non-governmental and international organizations, but the existing information may not be very useful because it is: a) not widely distributed, b) difficult to locate, and c) not effectively communicated. Actions are needed to ensure information usability, and one such action is the development of a statistics portal website.

  10. The application of artificial intelligence to microarray data: identification of a novel gene signature to identify bladder cancer progression.

    PubMed

    Catto, James W F; Abbod, Maysam F; Wild, Peter J; Linkens, Derek A; Pilarsky, Christian; Rehman, Ishtiaq; Rosario, Derek J; Denzinger, Stefan; Burger, Maximilian; Stoehr, Robert; Knuechel, Ruth; Hartmann, Arndt; Hamdy, Freddie C

    2010-03-01

    New methods for identifying bladder cancer (BCa) progression are required. Gene expression microarrays can reveal insights into disease biology and identify novel biomarkers. However, these experiments produce large datasets that are difficult to interpret. To develop a novel method of microarray analysis combining two forms of artificial intelligence (AI), neurofuzzy modelling (NFM) and artificial neural networks (ANN), and to validate it in a BCa cohort. We used AI and statistical analyses to identify progression-related genes in a microarray dataset (n=66 tumours, n=2800 genes). The AI-selected genes were then investigated in a second cohort (n=262 tumours) using immunohistochemistry. We compared the accuracy of AI and statistical approaches to identify tumour progression. AI identified 11 progression-associated genes (odds ratio [OR]: 0.70; 95% confidence interval [CI], 0.56-0.87; p=0.0004), and these were more discriminate than genes chosen using statistical analyses (OR: 1.24; 95% CI, 0.96-1.60; p=0.09). The expression of six AI-selected genes (LIG3, FAS, KRT18, ICAM1, DSG2, and BRCA2) was determined using commercial antibodies and successfully identified tumour progression (concordance index: 0.66; log-rank test: p=0.01). AI-selected genes were more discriminate than pathologic criteria at determining progression (Cox multivariate analysis: p=0.01). Limitations include the use of statistical correlation to identify 200 genes for AI analysis, and the fact that we did not compare regression-identified genes with immunohistochemistry. AI and statistical analyses use different techniques of inference to determine gene-phenotype associations and identify distinct prognostic gene signatures that are equally valid. We have identified a prognostic gene signature whose members reflect a variety of carcinogenic pathways that could identify progression in non-muscle-invasive BCa. 2009 European Association of Urology. Published by Elsevier B.V. All rights reserved.

  11. Statistical analysis of sperm sorting

    NASA Astrophysics Data System (ADS)

    Koh, James; Marcos, Marcos

    2017-11-01

    The success rate of assisted reproduction depends on the proportion of morphologically normal sperm. It is possible to use an external field for manipulation and sorting. Depending on their morphology, the extent of response varies. Due to the wide distribution in sperm morphology even among individuals, the resulting distribution of kinematic behaviour, and consequently the feasibility of sorting, should be analysed statistically. In this theoretical work, Resistive Force Theory and Slender Body Theory will be applied and compared.

  12. Experimental analysis of computer system dependability

    NASA Technical Reports Server (NTRS)

    Iyer, Ravishankar, K.; Tang, Dong

    1993-01-01

    This paper reviews an area which has evolved over the past 15 years: experimental analysis of computer system dependability. Methodologies and advances are discussed for three basic approaches used in the area: simulated fault injection, physical fault injection, and measurement-based analysis. The three approaches are suited, respectively, to dependability evaluation in the three phases of a system's life: design phase, prototype phase, and operational phase. Before the discussion of these phases, several statistical techniques used in the area are introduced. For each phase, a classification of research methods or study topics is outlined, followed by discussion of these methods or topics as well as representative studies. The statistical techniques introduced include the estimation of parameters and confidence intervals, probability distribution characterization, and several multivariate analysis methods. Importance sampling, a statistical technique used to accelerate Monte Carlo simulation, is also introduced. The discussion of simulated fault injection covers electrical-level, logic-level, and function-level fault injection methods as well as representative simulation environments such as FOCUS and DEPEND. The discussion of physical fault injection covers hardware, software, and radiation fault injection methods as well as several software and hybrid tools including FIAT, FERARI, HYBRID, and FINE. The discussion of measurement-based analysis covers measurement and data processing techniques, basic error characterization, dependency analysis, Markov reward modeling, software dependability, and fault diagnosis. The discussion involves several important issues studied in the area, including fault models, fast simulation techniques, workload/failure dependency, correlated failures, and software fault tolerance.

  13. Unsupervised Scalable Statistical Method for Identifying Influential Users in Online Social Networks.

    PubMed

    Azcorra, A; Chiroque, L F; Cuevas, R; Fernández Anta, A; Laniado, H; Lillo, R E; Romo, J; Sguera, C

    2018-05-03

    Billions of users interact intensively every day via Online Social Networks (OSNs) such as Facebook, Twitter, or Google+. This makes OSNs an invaluable source of information, and channel of actuation, for sectors like advertising, marketing, or politics. To get the most out of OSNs, analysts need to identify influential users that can be leveraged for promoting products, distributing messages, or improving the image of companies. In this report we propose a new unsupervised method, Massive Unsupervised Outlier Detection (MUOD), based on outlier detection, for providing support in the identification of influential users. MUOD is scalable, and can hence be used in large OSNs. Moreover, it labels the outliers as of shape, magnitude, or amplitude, depending on their features. This allows classifying the outlier users into multiple different classes, which are likely to include different types of influential users. Applying MUOD to a subset of roughly 400 million Google+ users, it has allowed identifying and discriminating automatically sets of outlier users, which present features associated to different definitions of influential users, like capacity to attract engagement, capacity to attract a large number of followers, or high infection capacity.

  14. Efficient elimination of nonstoichiometric enzyme inhibitors from HTS hit lists.

    PubMed

    Habig, Michael; Blechschmidt, Anke; Dressler, Sigmar; Hess, Barbara; Patel, Viral; Billich, Andreas; Ostermeier, Christian; Beer, David; Klumpp, Martin

    2009-07-01

    High-throughput screening often identifies not only specific, stoichiometrically binding inhibitors but also undesired compounds that unspecifically interfere with the targeted activity by nonstoichiometrically binding, unfolding, and/or inactivating proteins. In this study, the effect of such unwanted inhibitors on several different enzyme targets was assessed based on screening results for over a million compounds. In particular, the shift in potency on variation of enzyme concentration was used as a means to identify nonstoichiometric inhibitors among the screening hits. These potency shifts depended on both compound structure and target enzyme. The approach was confirmed by statistical analysis of thousands of dose-response curves, which showed that the potency of competitive and therefore clearly stoichiometric inhibitors was not affected by increasing enzyme concentration. Light-scattering measurements of thermal protein unfolding further verified that compounds that stabilize protein structure by stoichiometric binding show the same potency irrespective of enzyme concentration. In summary, measuring inhibitor IC(50) values at different enzyme concentrations is a simple, cost-effective, and reliable method to identify and eliminate compounds that inhibit a specific target enzyme via nonstoichiometric mechanisms.
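
    The potency-shift diagnostic amounts to fitting dose-response curves at two enzyme concentrations and comparing the fitted IC50 values; the small SciPy sketch below uses synthetic data (illustrative assumptions, not the screening data described above).

      import numpy as np
      from scipy.optimize import curve_fit

      def hill(conc, ic50, slope):
          """Fractional activity remaining as a function of inhibitor concentration."""
          return 1.0 / (1.0 + (conc / ic50) ** slope)

      conc = np.logspace(-2, 2, 9)                                     # inhibitor, in uM
      rng = np.random.default_rng(0)
      # A nonstoichiometric inhibitor: the apparent IC50 scales with enzyme concentration.
      resp_low_E = hill(conc, 0.5, 1.0) + 0.02 * rng.standard_normal(conc.size)
      resp_high_E = hill(conc, 5.0, 1.0) + 0.02 * rng.standard_normal(conc.size)

      (ic50_low, _), _ = curve_fit(hill, conc, resp_low_E, p0=[1.0, 1.0], bounds=(0, np.inf))
      (ic50_high, _), _ = curve_fit(hill, conc, resp_high_E, p0=[1.0, 1.0], bounds=(0, np.inf))
      # A large shift on raising the enzyme concentration flags a nonstoichiometric mechanism.
      print(f"IC50 shift = {ic50_high / ic50_low:.1f}x")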

  15. The role of the GABRA2 polymorphism in multiplex alcohol dependence families with minimal comorbidity: within-family association and linkage analyses.

    PubMed

    Matthews, Abigail G; Hoffman, Eric K; Zezza, Nicholas; Stiffler, Scott; Hill, Shirley Y

    2007-09-01

    The genes encoding the gamma-aminobutyric acid(A) (GABA(A)) receptor have been the focus of several recent studies investigating the genetic etiology of alcohol dependence. Analyses of multiplex families found a particular gene, GABRA2, to be highly associated with alcohol dependence, using within-family association tests and other methods. Results were confirmed in three case-control studies. The objective of this study was to investigate the GABRA2 gene in another collection of multiplex families. Analyses were based on phenotypic and genotypic data available for 330 individuals from 65 bigenerational pedigrees with a total of 232 alcohol-dependent subjects. A proband pair of same-sex siblings meeting Diagnostic and Statistical Manual of Mental Disorders, Third Edition, criteria for alcohol dependence was required for entry of a family into the study. One member of the proband pair was identified while in treatment for alcohol dependence. Linkage and association of GABRA2 and alcohol dependence were evaluated using SIBPAL (a nonparametric linkage package) and both the Pedigree Disequilibrium Test and the Family-Based Association Test, respectively. We find no evidence of a relationship between GABRA2 and alcohol dependence. Linkage analyses exhibited no linkage using affected/affected, affected/unaffected, and unaffected/unaffected sib pairs (all p's < .13). There was no evidence of a within-family association (all p's > .39). Comorbidity may explain why our results differ from those in the literature. The presence of primary drug dependence and/or other psychiatric disorders is minimal in our pedigrees, although several of the other previously published multiplex family analyses exhibit a greater degree of comorbidity.

  16. Replication of Genome Wide Association Studies of Alcohol Dependence: Support for Association with Variation in ADH1C

    PubMed Central

    Biernacka, Joanna M.; Geske, Jennifer R.; Schneekloth, Terry D.; Frye, Mark A.; Cunningham, Julie M.; Choi, Doo-Sup; Tapp, Courtney L.; Lewis, Bradley R.; Drews, Maureen S.; Pietrzak, Tracy L.; Colby, Colin L.; Hall-Flavin, Daniel K.; Loukianova, Larissa L.; Heit, John A.; Mrazek, David A.; Karpyak, Victor M.

    2013-01-01

    Genome-wide association studies (GWAS) have revealed many single nucleotide polymorphisms (SNPs) associated with complex traits. Although these studies frequently fail to identify statistically significant associations, the top association signals from GWAS may be enriched for true associations. We therefore investigated the association of alcohol dependence with 43 SNPs selected from association signals in the first two published GWAS of alcoholism. Our analysis of 808 alcohol-dependent cases and 1,248 controls provided evidence of association of alcohol dependence with SNP rs1614972 in the ADH1C gene (unadjusted p = 0.0017). Because the GWAS study that originally reported association of alcohol dependence with this SNP [1] included only men, we also performed analyses in sex-specific strata. The results suggest that this SNP has a similar effect in both sexes (men: OR (95%CI) = 0.80 (0.66, 0.95); women: OR (95%CI) = 0.83 (0.66, 1.03)). We also observed marginal evidence of association of the rs1614972 minor allele with lower alcohol consumption in the non-alcoholic controls (p = 0.081), and independently in the alcohol-dependent cases (p = 0.046). Despite a number of potential differences between the samples investigated by the prior GWAS and the current study, data presented here provide additional support for the association of SNP rs1614972 in ADH1C with alcohol dependence and extend this finding by demonstrating association with consumption levels in both non-alcoholic and alcohol-dependent populations. Further studies should investigate the association of other polymorphisms in this gene with alcohol dependence and related alcohol-use phenotypes. PMID:23516558
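
    For readers who want to reproduce the kind of effect estimate quoted here (an odds ratio with a 95% confidence interval for a SNP), the snippet below shows the standard calculation from a 2x2 allele-count table using Woolf's standard error on the log scale; the counts are invented for illustration and are not the study's data.

    ```python
    # Hedged sketch: odds ratio and 95% CI from a 2x2 table of allele counts
    # (minor/major allele in cases vs. controls). Counts are invented.
    import numpy as np
    from scipy.stats import norm

    a, b = 520, 1096    # cases:    minor, major allele counts (assumed)
    c, d = 880, 1616    # controls: minor, major allele counts (assumed)

    or_hat = (a * d) / (b * c)
    se_log_or = np.sqrt(1/a + 1/b + 1/c + 1/d)     # Woolf's standard error of log(OR)
    z = norm.ppf(0.975)
    lo, hi = np.exp(np.log(or_hat) + np.array([-z, z]) * se_log_or)
    print(f"OR = {or_hat:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
    ```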

  17. Knowledge dimensions in hypothesis test problems

    NASA Astrophysics Data System (ADS)

    Krishnan, Saras; Idris, Noraini

    2012-05-01

    The reform in statistics education over the past two decades has predominantly shifted the focus of statistical teaching and learning from procedural understanding to conceptual understanding. The emphasis of procedural understanding is on the formulas and calculation procedures. Meanwhile, conceptual understanding emphasizes students knowing why they are using a particular formula or executing a specific procedure. In addition, the Revised Bloom's Taxonomy offers a two-dimensional framework to describe learning objectives, comprising the six revised cognition levels of the original Bloom's taxonomy and four knowledge dimensions. Depending on the level of complexity, the four knowledge dimensions essentially distinguish basic understanding from the more connected understanding. This study identifies the factual, procedural and conceptual knowledge dimensions in hypothesis test problems. The hypothesis test, being an important tool for making inferences about a population from sample information, is taught in many introductory statistics courses. However, researchers find that students in these courses still have difficulty in understanding the underlying concepts of hypothesis tests. Past studies also show that even though students can perform the hypothesis testing procedure, they may not understand the rationale for executing these steps or know how to apply them in novel contexts. Besides knowing the procedural steps in conducting a hypothesis test, students must have fundamental statistical knowledge and a deep understanding of the underlying inferential concepts such as the sampling distribution and the central limit theorem. By identifying the knowledge dimensions of hypothesis test problems in this study, suitable instructional and assessment strategies can be developed in the future to enhance students' learning of the hypothesis test as a valuable inferential tool.

  18. Identifying On-Orbit Test Targets for Space Fence Operational Testing

    NASA Astrophysics Data System (ADS)

    Pechkis, D.; Pacheco, N.; Botting, T.

    2014-09-01

    Space Fence will be an integrated system of two ground-based, S-band (2 to 4 GHz) phased-array radars located in Kwajalein and perhaps Western Australia [1]. Space Fence will cooperate with other Space Surveillance Network sensors to provide space object tracking and radar characterization data to support U.S. Strategic Command space object catalog maintenance and other space situational awareness needs. We present a rigorous statistical test design intended to test Space Fence to the letter of the program requirements as well as to characterize the system performance across the entire operational envelope. The design uses altitude, size, and inclination as independent factors in statistical tests of dependent variables (e.g., observation accuracy) linked to requirements. The analysis derives the type and number of necessary test targets. Comparing the resulting sample sizes with the number of currently known targets, we identify those areas where modelling and simulation methods are needed. Assuming hypothetical Kwajalein radar coverage and a conservative number of radar passes per object per day, we conclude that tests involving real-world space objects should take no more than 25 days to evaluate all operational requirements; almost 60 percent of the requirements can be tested in a single day and nearly 90 percent can be tested in one week or less. Reference: [1] L. Haines and P. Phu, Space Fence PDR Concept Development Phase, 2011 AMOS Conference Technical Papers.

  19. An Information Theory Approach to Nonlinear, Nonequilibrium Thermodynamics

    NASA Astrophysics Data System (ADS)

    Rogers, David M.; Beck, Thomas L.; Rempe, Susan B.

    2011-10-01

    Using the problem of ion channel thermodynamics as an example, we illustrate the idea of building up complex thermodynamic models by successively adding physical information. We present a new formulation of information algebra that generalizes methods of both information theory and statistical mechanics. From this foundation we derive a theory for ion channel kinetics, identifying a nonequilibrium `process' free energy functional in addition to the well-known integrated work functionals. The Gibbs-Maxwell relation for the free energy functional is a Green-Kubo relation, applicable arbitrarily far from equilibrium, that captures the effect of non-local and time-dependent behavior from transient thermal and mechanical driving forces. Comparing the physical significance of the Lagrange multipliers to the canonical ensemble suggests definitions of nonequilibrium ensembles at constant capacitance or inductance in addition to constant resistance. Our result is that statistical mechanical descriptions derived from a few primitive algebraic operations on information can be used to create experimentally-relevant and computable models. By construction, these models may use information from more detailed atomistic simulations. Two surprising consequences to be explored in further work are that (in)distinguishability factors are automatically predicted from the problem formulation and that a direct analogue of the second law for thermodynamic entropy production is found by considering information loss in stochastic processes. The information loss identifies a novel contribution from the instantaneous information entropy that ensures non-negative loss.

  20. Spatio-temporal dependencies between hospital beds, physicians and health expenditure using visual variables and data classification in statistical table

    NASA Astrophysics Data System (ADS)

    Medyńska-Gulij, Beata; Cybulski, Paweł

    2016-06-01

    This paper analyses the use of visual variables in tables of statistical data on hospital beds as an important tool for revealing spatio-temporal dependencies. It is argued that some of the conclusions from the data about public health and public expenditure on health have a spatio-temporal reference. Different from previous studies, this article adopts a combination of cartographic pragmatics and spatial visualization with previous conclusions made in the public health literature. While significant conclusions about health care and economic factors have been highlighted in research papers, this article is the first to apply visual analysis to a statistical table together with maps, an approach called previsualisation.

  1. 3D Microstructural Characterization of Uranium Oxide as a Surrogate Nuclear Fuel: Effect of Oxygen Stoichiometry on Grain Boundary Distributions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rudman, K.; Dickerson, P.; Byler, Darrin David

    The initial microstructure of an oxide fuel can play a key role in its performance. At low burn-ups, the diffusion of fission products can depend strongly on grain size and grain boundary (GB) characteristics, which in turn depend on processing conditions and oxygen stoichiometry. Serial sectioning techniques using Focused Ion Beam were developed to obtain Electron Backscatter Diffraction (EBSD) data for depleted UO2 pellets that were processed to obtain 3 different oxygen stoichiometries. The EBSD data were used to create 3D microstructure reconstructions and to gather statistical information on the grain and GB crystallography, with emphasis on identifying the character (twist, tilt, mixed) for GBs that meet the Coincident Site Lattice (CSL) criterion as well as GBs with the most common misorientation angles. Data on dihedral angles at triple points were also collected. The results were compared across different samples to understand effects of oxygen content on microstructure evolution.

  2. Metabolomics Approach to Investigate Estrogen Receptor-Dependent and Independent Effects of o,p'-DDT in the Uterus and Brain of Immature Mice.

    PubMed

    Wang, Dezhen; Zhu, Wentao; Wang, Yao; Yan, Jin; Teng, Miaomiao; Miao, Jiyan; Zhou, Zhiqiang

    2017-05-10

    Previous studies have demonstrated the endocrine disruption of o,p'-DDT. In this study, we used a ¹H NMR-based metabolomics approach to investigate the estrogenic effects of o,p'-DDT (300 mg/kg) on the uterus and brain after 3 days of oral gavage administration, and ethynylestradiol (EE, 100 μg/kg) was used as a positive control. A supervised statistical analysis (PLS-DA) indicated that o,p'-DDT exerted both estrogen receptor (ER)-dependent and -independent effects on the uterus but mainly ER-independent effects on the brain at the metabolome level, which was verified by co-exposure with the antiestrogen ICI 182,780. Four changed metabolites (glycine, choline, fumarate, and phenylalanine) were identified as ER-independent alterations in the uterus, while more metabolites, including γ-aminobutyrate, N-acetyl aspartate, and some amino acids, were disturbed through an ER-independent mechanism in the brain. Together with biological end points, metabolomics is a promising approach to study potential estrogenic chemicals.

  3. Online incidental statistical learning of audiovisual word sequences in adults: a registered report.

    PubMed

    Kuppuraj, Sengottuvel; Duta, Mihaela; Thompson, Paul; Bishop, Dorothy

    2018-02-01

    Statistical learning has been proposed as a key mechanism in language learning. Our main goal was to examine whether adults are capable of simultaneously extracting statistical dependencies in a task where stimuli include a range of structures amenable to statistical learning within a single paradigm. We devised an online statistical learning task using real word auditory-picture sequences that vary in two dimensions: (i) predictability and (ii) adjacency of dependent elements. This task was followed by an offline recall task to probe learning of each sequence type. We registered three hypotheses with specific predictions. First, adults would extract regular patterns from a continuous stream (effect of grammaticality). Second, within grammatical conditions, they would show differential speeding up for each condition as a factor of statistical complexity of the condition and exposure. Third, our novel approach to measure online statistical learning would be reliable in showing individual differences in statistical learning ability. Further, we explored the relation between statistical learning and a measure of verbal short-term memory (STM). Forty-two participants were tested and retested after an interval of at least 3 days on our novel statistical learning task. We analysed the reaction time data using a novel regression discontinuity approach. Consistent with prediction, participants showed a grammaticality effect, agreeing with the predicted order of difficulty for learning different statistical structures. Furthermore, a learning index from the task showed acceptable test-retest reliability (r = 0.67). However, STM did not correlate with statistical learning. We discuss the findings noting the benefits of online measures in tracking the learning process.

  4. Online incidental statistical learning of audiovisual word sequences in adults: a registered report

    PubMed Central

    Duta, Mihaela; Thompson, Paul

    2018-01-01

    Statistical learning has been proposed as a key mechanism in language learning. Our main goal was to examine whether adults are capable of simultaneously extracting statistical dependencies in a task where stimuli include a range of structures amenable to statistical learning within a single paradigm. We devised an online statistical learning task using real word auditory–picture sequences that vary in two dimensions: (i) predictability and (ii) adjacency of dependent elements. This task was followed by an offline recall task to probe learning of each sequence type. We registered three hypotheses with specific predictions. First, adults would extract regular patterns from a continuous stream (effect of grammaticality). Second, within grammatical conditions, they would show differential speeding up for each condition as a factor of statistical complexity of the condition and exposure. Third, our novel approach to measure online statistical learning would be reliable in showing individual differences in statistical learning ability. Further, we explored the relation between statistical learning and a measure of verbal short-term memory (STM). Forty-two participants were tested and retested after an interval of at least 3 days on our novel statistical learning task. We analysed the reaction time data using a novel regression discontinuity approach. Consistent with prediction, participants showed a grammaticality effect, agreeing with the predicted order of difficulty for learning different statistical structures. Furthermore, a learning index from the task showed acceptable test–retest reliability (r = 0.67). However, STM did not correlate with statistical learning. We discuss the findings noting the benefits of online measures in tracking the learning process. PMID:29515876

  5. Fundamental structural characteristics of planar granular assemblies: Self-organization and scaling away friction and initial state.

    PubMed

    Matsushima, Takashi; Blumenfeld, Raphael

    2017-03-01

    The microstructural organization of a granular system is the most important determinant of its macroscopic behavior. Here we identify the fundamental factors that determine the statistics of such microstructures, using numerical experiments to gain a general understanding. The experiments consist of preparing and compacting isotropically two-dimensional granular assemblies of polydisperse frictional disks and analyzing the emergent statistical properties of quadrons-the basic structural elements of granular solids. The focus on quadrons is because the statistics of their volumes have been found to display intriguing universal-like features [T. Matsushima and R. Blumenfeld, Phys. Rev. Lett. 112, 098003 (2014)PRLTAO0031-900710.1103/PhysRevLett.112.098003]. The dependence of the structures and of the packing fraction on the intergranular friction and the initial state is analyzed, and a number of significant results are found. (i) An analytical formula is derived for the mean quadron volume in terms of three macroscopic quantities: the mean coordination number, the packing fraction, and the rattlers fraction. (ii) We derive a unique, initial-state-independent relation between the mean coordination number and the rattler-free packing fraction. The relation is supported numerically for a range of different systems. (iii) We collapse the quadron volume distributions from all systems onto one curve, and we verify that they all have an exponential tail. (iv) The nature of the quadron volume distribution is investigated by decomposition into conditional distributions of volumes given the cell order, and we find that each of these also collapses onto a single curve. (v) We find that the mean quadron volume decreases with increasing intergranular friction coefficients, an effect that is prominent in high-order cells. We argue that this phenomenon is due to an increased probability of stable irregularly shaped cells, and we test this using a herewith developed free cell analytical model. We conclude that, in principle, the microstructural characteristics are governed mainly by the packing procedure, while the effects of intergranular friction and initial states are details that can be scaled away. However, mechanical stability constraints suppress slightly the occurrence of small quadron volumes in cells of order ≥6, and the magnitude of this effect does depend on friction. We quantify in detail this dependence and the deviation it causes from an exact collapse for these cells. (vi) We argue that our results support strongly the view that ensemble granular statistical mechanics does not satisfy the uniform measure assumption of conventional statistical mechanics. Results (i)-(iv) have been reported in the aforementioned reference, and they are reviewed and elaborated on here.

  6. Statistical Literacy as a Function of Online versus Hybrid Course Delivery Format for an Introductory Graduate Statistics Course

    ERIC Educational Resources Information Center

    Hahs-Vaughn, Debbie L.; Acquaye, Hannah; Griffith, Matthew D.; Jo, Hang; Matthews, Ken; Acharya, Parul

    2017-01-01

    Statistical literacy refers to understanding fundamental statistical concepts. Assessment of statistical literacy can take the forms of tasks that require students to identify, translate, compute, read, and interpret data. In addition, statistical instruction can take many forms encompassing course delivery format such as face-to-face, hybrid,…

  7. X-ray studies of quasars with the Einstein Observatory. IV - X-ray dependence on radio emission

    NASA Technical Reports Server (NTRS)

    Worrall, D. M.; Tananbaum, H.; Giommi, P.; Zamorani, G.

    1987-01-01

    The X-ray properties of a sample of 114 radio-loud quasars observed with the Einstein Observatory are examined, and the results are compared with those obtained from a large sample of radio-quiet quasars. The results of statistical analysis of the dependence of X-ray luminosity on combined functions of optical and radio luminosity show that the dependence on both luminosities is important. However, statistically significant differences are found between subsamples of flat radio spectra quasars and steep radio spectra quasars with regard to dependence of X-ray luminosity on only radio luminosity. The data are consistent with radio-loud quasars having a physical component, not directly related to the optical luminosity, which produces the core radio luminosity plus 'extra' X-ray emission.

  8. Influence of case definition on incidence and outcome of acute coronary syndromes

    PubMed Central

    Torabi, Azam; Cleland, John G F; Sherwi, Nasser; Atkin, Paul; Panahi, Hossein; Kilpatrick, Eric; Thackray, Simon; Hoye, Angela; Alamgir, Farqad; Goode, Kevin; Rigby, Alan; Clark, Andrew L

    2016-01-01

    Objective Acute coronary syndromes (ACS) are common, but their incidence and outcome might depend greatly on how data are collected. We compared case ascertainment rates for ACS and myocardial infarction (MI) in a single institution using several different strategies. Methods The Hull and East Yorkshire Hospitals serve a population of ∼560 000. Patients admitted with ACS to cardiology or general medical wards were identified prospectively by trained nurses during 2005. Patients with a death or discharge code of MI were also identified by the hospital information department and, independently, from Myocardial Infarction National Audit Project (MINAP) records. The hospital laboratory identified all patients with an elevated serum troponin-T (TnT) by contemporary criteria (>0.03 µg/L in 2005). Results The prospective survey identified 1731 admissions (1439 patients) with ACS, including 764 admissions (704 patients) with MIs. The hospital information department reported only 552 admissions (544 patients) with MI and only 206 admissions (203 patients) were reported to the MINAP. Using all 3 strategies, 934 admissions (873 patients) for MI were identified, for which TnT was >1 µg/L in 443, 0.04–1.0 µg/L in 435, ≤0.03 µg/L in 19 and not recorded in 37. A further 823 patients had TnT >0.03 µg/L, but did not have ACS ascertained by any survey method. Of the 873 patients with MI, 146 (16.7%) died during admission and 218 (25.0%) by 1 year, but ranging from 9% for patients enrolled in the MINAP to 27% for those identified by the hospital information department. Conclusions MINAP and hospital statistics grossly underestimated the incidence of MI managed by our hospital. The 1-year mortality was highly dependent on the method of ascertainment. PMID:28123755

  9. Epigenomic study identifies a novel mesenchyme homeobox2-GLI1 transcription axis involved in cancer drug resistance, overall survival and therapy prognosis in lung cancer patients

    PubMed Central

    Armas-López, Leonel; Piña-Sánchez, Patricia; Arrieta, Oscar; de Alba, Enrique Guzman; Ortiz-Quintero, Blanca; Santillán-Doherty, Patricio; Christiani, David C.; Zúñiga, Joaquín; Ávila-Moreno, Federico

    2017-01-01

    Several homeobox-related gene (HOX) transcription factors such as mesenchyme HOX-2 (MEOX2) have previously been associated with cancer drug resistance, malignant progression and/or clinical prognostic responses in lung cancer patients; however, the mechanisms involved in these responses have yet to be elucidated. Here, an epigenomic strategy was implemented to identify novel MEOX2 gene promoter transcription targets and propose a new molecular mechanism underlying lung cancer drug resistance and poor clinical prognosis. Chromatin immunoprecipitation (ChIP) assays derived from non-small cell lung carcinomas (NSCLC) hybridized on gene promoter tiling arrays and bioinformatics analyses were performed, and quantitative, functional and clinical validation were also carried out. We statistically identified a common profile consisting of 78 gene promoter targets, including Hedgehog-GLI1 gene promoter sequences (FDR≤0.1 and FDR≤0.2). The GLI-1 gene promoter region from −2,192 to −109 was occupied by MEOX2, accompanied by transcriptionally active RNA Pol II and was epigenetically linked to the active histones H3K27Ac and H3K4me3; these associations were quantitatively validated. Moreover, siRNA genetic silencing assays identified a MEOX2-GLI1 axis involved in cellular cytotoxic resistance to cisplatinum in a dose-dependent manner, as well as cellular migration and proliferation. Finally, Kaplan-Meier survival analyses identified significant MEOX2-dependent GLI-1 protein expression associated with clinical progression and poorer overall survival using an independent cohort of NSCLC patients undergoing platinum-based oncological therapy with both epidermal growth factor receptor (EGFR)-non-mutated and EGFR-mutated status. In conclusion, this is the first study to investigate epigenome-wide MEOX2-transcription factor occupation identifying a novel overexpressed MEOX2-GLI1 axis and its clinical association with platinum-based cancer drug resistance and EGFR-tyrosine kinase inhibitor (TKI)-based therapy responses in NSCLC patients. PMID:28978016

  10. Charge and energy dependence of the residence time of cosmic ray nuclei below 15 GeV/nucleon

    NASA Technical Reports Server (NTRS)

    Soutoul, A.; Engelmann, J. J.; Ferrando, P.; Koch-Miramond, L.; Masse, P.; Webber, W. R.

    1985-01-01

    The relative abundance of nuclear species measured in cosmic rays at Earth has often been interpreted with the simple leaky box model. For this model to be consistent, an essential requirement is that the escape length does not depend on the nuclear species. The discrepancy between escape length values derived from iron secondaries and from the B/C ratio was identified by Garcia-Munoz and his co-workers using a large amount of experimental data. Ormes and Protheroe found a similar trend in the HEAO data although they questioned its significance against uncertainties. They also showed that the change in the B/C ratio values implies a decrease of the residence time of cosmic rays at low energies in conflict with the diffusive convective picture. These conclusions crucially depend on the partial cross section values and their uncertainties. Recently new accurate cross sections of key importance for propagation calculations have been measured. Their statistical uncertainties are often better than 4% and their values significantly different from those previously accepted. Here, these new cross sections are used to compare the observed B/C+O and (Sc to Cr)/Fe ratios to those predicted with the simple leaky box model.

  11. Business Statistics Education: Content and Software in Undergraduate Business Statistics Courses.

    ERIC Educational Resources Information Center

    Tabatabai, Manouchehr; Gamble, Ralph

    1997-01-01

    Survey responses from 204 of 500 business schools identified most often topics in business statistics I and II courses. The most popular software at both levels was Minitab. Most schools required both statistics I and II. (SK)

  12. Identifiability of PBPK Models with Applications to Dimethylarsinic Acid Exposure

    EPA Science Inventory

    Any statistical model should be identifiable in order for estimates and tests using it to be meaningful. We consider statistical analysis of physiologically-based pharmacokinetic (PBPK) models in which parameters cannot be estimated precisely from available data, and discuss diff...

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sigeti, David E.; Pelak, Robert A.

    We present a Bayesian statistical methodology for identifying improvement in predictive simulations, including an analysis of the number of (presumably expensive) simulations that will need to be made in order to establish with a given level of confidence that an improvement has been observed. Our analysis assumes the ability to predict (or postdict) the same experiments with legacy and new simulation codes and uses a simple binomial model for the probability, θ, that, in an experiment chosen at random, the new code will provide a better prediction than the old. This model makes it possible to do statistical analysis with an absolute minimum of assumptions about the statistics of the quantities involved, at the price of discarding some potentially important information in the data. In particular, the analysis depends only on whether or not the new code predicts better than the old in any given experiment, and not on the magnitude of the improvement. We show how the posterior distribution for θ may be used, in a kind of Bayesian hypothesis testing, both to decide if an improvement has been observed and to quantify our confidence in that decision. We quantify the predictive probability that should be assigned, prior to taking any data, to the possibility of achieving a given level of confidence, as a function of sample size. We show how this predictive probability depends on the true value of θ and, in particular, how there will always be a region around θ = 1/2 where it is highly improbable that we will be able to identify an improvement in predictive capability, although the width of this region will shrink to zero as the sample size goes to infinity. We show how the posterior standard deviation may be used, as a kind of 'plan B metric' in the case that the analysis shows that θ is close to 1/2 and argue that such a plan B should generally be part of hypothesis testing. All the analysis presented in the paper is done with a general beta-function prior for θ, enabling sequential analysis in which a small number of new simulations may be done and the resulting posterior for θ used as a prior to inform the next stage of power analysis.
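
    To make the described procedure concrete, the sketch below applies the binomial model with a beta prior: given that the new code beat the legacy code in k of n paired experiments, it computes the posterior for θ, the posterior probability that θ > 1/2, and the posterior standard deviation mentioned as the 'plan B metric'. The prior parameters and the counts are placeholders, not values from the report.

    ```python
    # Hedged sketch of the Bayesian binomial analysis described above.
    # Prior Beta(alpha0, beta0) on theta; data: k "new code better" outcomes in n trials.
    from scipy.stats import beta

    alpha0, beta0 = 1.0, 1.0      # assumed uniform prior
    k, n = 14, 20                 # assumed outcome counts (placeholders)

    post = beta(alpha0 + k, beta0 + (n - k))
    p_improved = post.sf(0.5)     # posterior probability that theta > 1/2
    print(f"P(theta > 0.5 | data) = {p_improved:.3f}")
    print(f"posterior mean = {post.mean():.3f}, posterior sd = {post.std():.3f}")

    # Declare "improvement observed" if the posterior probability exceeds a chosen
    # confidence level (e.g. 0.95); otherwise report the posterior sd as a fallback
    # measure of how tightly theta is pinned down near 1/2.
    ```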

  14. Statistical Approaches to Adjusting Weights for Dependent Arms in Network Meta-analysis.

    PubMed

    Su, Yu-Xuan; Tu, Yu-Kang

    2018-05-22

    Network meta-analysis compares multiple treatments in terms of their efficacy and harm by including evidence from randomized controlled trials. Most clinical trials use a parallel design, where patients are randomly allocated to different treatments and receive only one treatment. However, some trials use within-person designs such as split-body, split-mouth and cross-over designs, where each patient may receive more than one treatment. Data from treatment arms within these trials are no longer independent, so the correlations between dependent arms need to be accounted for within the statistical analyses. Ignoring these correlations may result in incorrect conclusions. The main objective of this study is to develop statistical approaches to adjusting weights for dependent arms within special design trials. In this study, we demonstrate the following three approaches: the data augmentation approach, the adjusting variance approach, and the reducing weight approach. These three methods can be readily applied in current statistical tools such as R and Stata. An example of periodontal regeneration was used to demonstrate how these approaches could be undertaken and implemented within statistical software packages, and to compare results from different approaches. The adjusting variance approach can be implemented within the network package in Stata, while the reducing-weight approach requires computer software programming to set up the within-study variance-covariance matrix. This article is protected by copyright. All rights reserved.

  15. The precise time-dependent solution of the Fokker–Planck equation with anomalous diffusion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Guo, Ran; Du, Jiulin, E-mail: jiulindu@aliyun.com

    2015-08-15

    We study the time behavior of the Fokker–Planck equation in Zwanzig's rule (the backward-Ito's rule) based on the Langevin equation of Brownian motion with anomalous diffusion in a complex medium. The diffusion coefficient is a function in momentum space and follows a generalized fluctuation–dissipation relation. We obtain the precise time-dependent analytical solution of the Fokker–Planck equation, and at long times the solution approaches a stationary power-law distribution in nonextensive statistics. As a test, we have numerically demonstrated the accuracy and validity of the time-dependent solution. - Highlights: • The precise time-dependent solution of the Fokker–Planck equation with anomalous diffusion is found. • The anomalous diffusion satisfies a generalized fluctuation–dissipation relation. • At long times the time-dependent solution approaches a power-law distribution in nonextensive statistics. • Numerically we have demonstrated the accuracy and validity of the time-dependent solution.

  16. Age-dependent risk factors for malnutrition in traumatology and orthopedic patients.

    PubMed

    Lambert, Christine; Nüssler, Andreas; Biesalski, Hans Konrad; Freude, Thomas; Bahrs, Christian; Ochs, Gunnar; Flesch, Ingo; Stöckle, Ulrich; Ihle, Christoph

    2017-05-01

    The aim of this study was to investigate the prevalence of risk of malnutrition (RoM) in an orthopedic and traumatology patient cohort with a broad range of ages. In addition to the classical indicators for risk assessment (low body mass index, weight loss, and comorbidity), this study aimed to analyze the effects of lifestyle factors (eating pattern, smoking, physical activity) on RoM. The prospective cohort study included 1053 patients in a level 1 trauma center in Germany. RoM was assessed by Nutritional Risk Screening (NRS) 2002 and for the elderly additionally by Mini Nutritional Assessment (MNA). Age-dependent risk factors identified in univariate statistical analysis were used for multivariate logistic regression models. The prevalence of patients at RoM (NRS ≥3) was 22%. In the three age categories (<50 y, 50-69 y, and ≥70 y), loss of appetite, weight loss, number of comorbidities, drugs and gastrointestinal symptoms significantly increased RoM in univariate statistical analysis. In patients ages ≥70 y, several disease- and lifestyle-related factors (not living at home, less frequent consumption of vegetables and whole meal bread, low physical activity, and smoking) were associated with RoM. Multivariate logistic regression model for the total study population identified weight loss (odds ratio [OR], 6.09; 95% confidence interval [CI], 4.14-8.83), loss of appetite (OR, 3.81; 95% CI, 2.52-5.78), age-specific low BMI (OR, 1.87; 95% CI, 1.18-2.97), number of drugs taken (OR, 1.19; 95% CI, 1.12-1.26), age (OR, 1.03; 95% CI, 1.02-1.04), and days per week with vegetable consumption (OR, 0.938; 95% CI, 0.89-0.99) as risk factors. Malnutrition in trauma and orthopedic patients is not only a problem related to age. Lifestyle-related factors also contribute significantly to malnutrition in geriatric patients. Copyright © 2017 Elsevier Inc. All rights reserved.

  17. On the Spike Train Variability Characterized by Variance-to-Mean Power Relationship.

    PubMed

    Koyama, Shinsuke

    2015-07-01

    We propose a statistical method for modeling the non-Poisson variability of spike trains observed in a wide range of brain regions. Central to our approach is the assumption that the variance and the mean of interspike intervals are related by a power function characterized by two parameters: the scale factor and exponent. It is shown that this single assumption allows the variability of spike trains to have an arbitrary scale and various dependencies on the firing rate in the spike count statistics, as well as in the interval statistics, depending on the two parameters of the power function. We also propose a statistical model for spike trains that exhibits the variance-to-mean power relationship. Based on this, a maximum likelihood method is developed for inferring the parameters from rate-modulated spike trains. The proposed method is illustrated on simulated and experimental spike trains.
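
    A hedged sketch of the core assumption: given interspike intervals grouped by firing-rate condition, the scale factor and exponent of the variance-mean power function can be estimated by least squares on the log-log scale. The grouping and the simulated gamma-distributed intervals below are assumptions made for illustration; the paper's own estimator is a maximum likelihood method for its full spike-train model.

    ```python
    # Hedged sketch: estimate the two parameters of Var(ISI) = phi * Mean(ISI)**alpha
    # by log-log regression across groups of interspike intervals. Data are simulated.
    import numpy as np

    rng = np.random.default_rng(3)
    means, variances = [], []
    for rate in [5.0, 10.0, 20.0, 40.0, 80.0]:           # assumed firing-rate conditions (Hz)
        isi = rng.gamma(shape=2.0, scale=1.0 / (2.0 * rate), size=2000)
        means.append(isi.mean())
        variances.append(isi.var(ddof=1))

    alpha, log_phi = np.polyfit(np.log(means), np.log(variances), 1)
    print(f"exponent alpha = {alpha:.2f}, scale factor phi = {np.exp(log_phi):.3g}")
    # For a renewal gamma process Var = Mean**2 / shape, so alpha should come out near 2
    # and phi near 1/shape = 0.5 in this simulated example.
    ```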

  18. Recurrence of attic cholesteatoma: different methods of estimating recurrence rates.

    PubMed

    Stangerup, S E; Drozdziewicz, D; Tos, M; Hougaard-Jensen, A

    2000-09-01

    One problem in cholesteatoma surgery is recurrence of cholesteatoma, which is reported to vary from 5% to 71%. This great variability can be explained by issues such as the type of cholesteatoma, surgical technique, follow-up rate, length of the postoperative observation period, and statistical method applied. The aim of this study was to illustrate the impact of applying different statistical methods to the same material. Thirty-three children underwent single-stage surgery for attic cholesteatoma during a 15-year period. Thirty patients (94%) attended a re-evaluation. During the observation period of 15 years, recurrence of cholesteatoma occurred in 10 ears. The cumulative total recurrence rate varied from 30% to 67%, depending on the statistical method applied. In conclusion, the choice of statistical method should depend on the number of patients, follow-up rates, length of the postoperative observation period and presence of censored data.

  19. Reaction Event Counting Statistics of Biopolymer Reaction Systems with Dynamic Heterogeneity.

    PubMed

    Lim, Yu Rim; Park, Seong Jun; Park, Bo Jung; Cao, Jianshu; Silbey, Robert J; Sung, Jaeyoung

    2012-04-10

    We investigate the reaction event counting statistics (RECS) of an elementary biopolymer reaction in which the rate coefficient is dependent on states of the biopolymer and the surrounding environment and discover a universal kinetic phase transition in the RECS of the reaction system with dynamic heterogeneity. From an exact analysis for a general model of elementary biopolymer reactions, we find that the variance in the number of reaction events is dependent on the square of the mean number of the reaction events when the size of measurement time is small on the relaxation time scale of rate coefficient fluctuations, which does not conform to renewal statistics. On the other hand, when the size of the measurement time interval is much greater than the relaxation time of rate coefficient fluctuations, the variance becomes linearly proportional to the mean reaction number in accordance with renewal statistics. Gillespie's stochastic simulation method is generalized for the reaction system with a rate coefficient fluctuation. The simulation results confirm the correctness of the analytic results for the time dependent mean and variance of the reaction event number distribution. On the basis of the obtained results, we propose a method of quantitative analysis for the reaction event counting statistics of reaction systems with rate coefficient fluctuations, which enables one to extract information about the magnitude and the relaxation times of the fluctuating reaction rate coefficient, without a bias that can be introduced by assuming a particular kinetic model of conformational dynamics and the conformation dependent reactivity. An exact relationship is established between a higher moment of the reaction event number distribution and the multitime correlation of the reaction rate for the reaction system with a nonequilibrium initial state distribution as well as for the system with the equilibrium initial state distribution.
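
    As a hedged illustration of the counting behaviour described here, the sketch below simulates a single reaction whose rate coefficient switches between two values through a slow telegraph process (a Gillespie-type composite simulation) and reports the mean, variance, and Fano factor of the event counts for short and long counting windows. The two-state rate model, the rates, and the window sizes are assumptions for illustration, not the paper's model.

    ```python
    # Hedged sketch: reaction event counting with a fluctuating rate coefficient.
    import numpy as np

    rng = np.random.default_rng(4)
    k = [2.0, 10.0]        # reaction rate coefficient in state 0 / state 1 (assumed)
    gamma = 0.05           # slow switching rate between states => dynamic heterogeneity
    T_total = 50_000.0

    def simulate_event_times():
        t, state, events = 0.0, 0, []
        while t < T_total:
            a_rxn, a_switch = k[state], gamma
            t += rng.exponential(1.0 / (a_rxn + a_switch))
            if rng.random() < a_rxn / (a_rxn + a_switch):
                events.append(t)          # reaction event
            else:
                state = 1 - state         # rate-coefficient jump
        return np.array(events)

    events = simulate_event_times()
    for window in [0.5, 5.0, 2000.0]:     # counting windows short vs. long relative to 1/gamma
        counts = np.histogram(events, bins=np.arange(0.0, T_total, window))[0]
        m, v = counts.mean(), counts.var(ddof=1)
        print(f"window {window:g}: mean = {m:.2f}, variance = {v:.2f}, Fano = {v / m:.2f}")
    # For windows short relative to the rate-relaxation time the excess variance grows
    # roughly with the square of the mean; for long windows the Fano factor levels off
    # and the variance again grows in proportion to the mean, as the abstract describes.
    ```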

  20. An Assessment Blueprint for EncStat: A Statistics Anxiety Intervention Program.

    ERIC Educational Resources Information Center

    Watson, Freda S.; Lang, Thomas R.; Kromrey, Jeffrey D.; Ferron, John M.; Hess, Melinda R.; Hogarty, Kristine Y.

    EncStat (Encouraged about Statistics) is a multimedia program being developed to identify and assist students with statistics anxiety or negative attitudes about statistics. This study explored the validity of the assessment instruments included in EncStat with respect to their diagnostic value for statistics anxiety and negative attitudes about…

  1. 5 CFR 1630.2 - Definitions.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    .... The participant's Social Security number will remain the identifier for the submission of data and... individual or other identifying particular assigned to the individual; (l) Statistical record means a record in a system of records maintained for statistical research or reporting purposes only and not used in...

  2. 39 CFR 262.5 - Systems (Privacy).

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ..., partnerships or corporations. A business firm identified by the name of one or more persons is not an... computer matches are specifically excluded from the term “matching program”: (i) Statistical matches whose purpose is solely to produce aggregate data stripped of personal identifiers. (ii) Statistical matches...

  3. Statistical prediction of September Arctic Sea Ice minimum based on stable teleconnections with global climate and oceanic patterns

    NASA Astrophysics Data System (ADS)

    Ionita, M.; Grosfeld, K.; Scholz, P.; Lohmann, G.

    2016-12-01

    Sea ice in both Polar Regions is an important indicator for the expression of global climate change and its polar amplification. Consequently, a broad information interest exists on sea ice, its coverage, variability and long term change. Knowledge on sea ice requires high quality data on ice extent, thickness and its dynamics. However, its predictability depends on various climate parameters and conditions. In order to provide insights into the potential development of a monthly/seasonal signal, we developed a robust statistical model based on ocean heat content, sea surface temperature and atmospheric variables to calculate an estimate of the September minimum sea ice extent for every year. Although previous statistical attempts at monthly/seasonal forecasts of the September sea ice minimum show a relatively reduced skill, here it is shown that more than 97% (r = 0.98) of the September sea ice extent can be predicted three months in advance by using the previous months' conditions via a multiple linear regression model based on global sea surface temperature (SST), mean sea level pressure (SLP), air temperature at 850 hPa (TT850), surface winds and sea ice extent persistence. The statistical model is based on the identification of regions with stable teleconnections between the predictors (climatological parameters) and the predictand (here sea ice extent). The results based on our statistical model contribute to the sea ice prediction network for the sea ice outlook report (https://www.arcus.org/sipn) and could provide a tool for identifying relevant regions and climate parameters that are important for the sea ice development in the Arctic and for detecting sensitive and critical regions in global coupled climate models with a focus on sea ice formation.
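
    A minimal, hedged sketch of the kind of multiple linear regression described here, using synthetic placeholder predictor indices (for example area-averaged SST, SLP, and TT850 anomalies plus sea-ice persistence); the actual predictor regions are chosen by the authors' stable-teleconnection screening and are not reproduced here.

    ```python
    # Hedged sketch: multiple linear regression forecast of the September sea-ice
    # minimum from earlier-season predictor indices. All data are synthetic placeholders.
    import numpy as np

    rng = np.random.default_rng(5)
    n_years = 30
    # columns: SST index, SLP index, TT850 index, June sea-ice extent (persistence)
    X = rng.normal(size=(n_years, 4))
    true_coef = np.array([-0.8, 0.3, -0.5, 1.2])
    y = 5.0 + X @ true_coef + rng.normal(0, 0.2, n_years)   # September extent (10^6 km^2)

    X_design = np.column_stack([np.ones(n_years), X])       # intercept + predictors
    coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    y_hat = X_design @ coef

    r = np.corrcoef(y, y_hat)[0, 1]
    print("fitted coefficients:", np.round(coef, 2))
    print(f"in-sample correlation r = {r:.2f}")
    # In practice, forecast skill should be judged with leave-one-out or genuinely
    # out-of-sample validation rather than the in-sample r printed here.
    ```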

  4. Surface Ozone Variability and Trends over the South African Highveld from 1990 to 2007

    NASA Technical Reports Server (NTRS)

    Balashov, Nikolay V.; Thompson, Anne M.; Piketh, Stuart J.; Langerman, Kristy E.

    2014-01-01

    Surface ozone is a secondary air pollutant formed from reactions between nitrogen oxides (NOx = NO + NO2) and volatile organic compounds in the presence of sunlight. In this work we examine effects of the climate pattern known as the El Niño-Southern Oscillation (ENSO) and NOx variability on surface ozone from 1990 to 2007 over the South African Highveld, a heavily populated region in South Africa with numerous industrial facilities. Over summer and autumn (December-May) on the Highveld, El Niño, as signified by positive sea surface temperature (SST) anomalies over the central Pacific Ocean, is typically associated with drier and warmer than normal conditions favoring ozone formation. Conversely, La Niña, or negative SST anomalies over the central Pacific Ocean, is typically associated with cloudier and above normal rainfall conditions, hindering ozone production. We use a generalized regression model to identify any linear dependence that the Highveld ozone, measured at five air quality monitoring stations, may have on ENSO and NOx. Our results indicate that four out of the five stations exhibit a statistically significant sensitivity to ENSO at some point over the December-May period where El Niño amplifies ozone formation and La Niña reduces ozone formation. Three out of the five stations reveal statistically significant sensitivity to NOx variability, primarily in winter and spring. Accounting for ENSO and NOx effects throughout the study period of 18 years, two stations exhibit statistically significant negative ozone trends in spring, one station displays a statistically significant positive trend in August, and two stations show no statistically significant change in surface ozone.

  5. The Brightness of Colour

    PubMed Central

    Corney, David; Haynes, John-Dylan; Rees, Geraint; Lotto, R. Beau

    2009-01-01

    Background: The perception of brightness depends on spatial context: the same stimulus can appear light or dark depending on what surrounds it. A less well-known but equally important contextual phenomenon is that the colour of a stimulus can also alter its brightness. Specifically, stimuli that are more saturated (i.e. purer in colour) appear brighter than stimuli that are less saturated at the same luminance. Similarly, stimuli that are red or blue appear brighter than equiluminant yellow and green stimuli. This non-linear relationship between stimulus intensity and brightness, called the Helmholtz-Kohlrausch (HK) effect, was first described in the nineteenth century but has never been explained. Here, we take advantage of the relative simplicity of this ‘illusion’ to explain it and contextual effects more generally, by using a simple Bayesian ideal observer model of the human visual ecology. We also use fMRI brain scans to identify the neural correlates of brightness without changing the spatial context of the stimulus, which has complicated the interpretation of related fMRI studies. Results: Rather than modelling human vision directly, we use a Bayesian ideal observer to model human visual ecology. We show that the HK effect is a result of encoding the non-linear statistical relationship between retinal images and natural scenes that would have been experienced by the human visual system in the past. We further show that the complexity of this relationship is due to the response functions of the cone photoreceptors, which themselves are thought to represent an efficient solution to encoding the statistics of images. Finally, we show that the locus of the response to the relationship between images and scenes lies in the primary visual cortex (V1), if not earlier in the visual system, since the brightness of colours (as opposed to their luminance) accords with activity in V1 as measured with fMRI. Conclusions: The data suggest that perceptions of brightness represent a robust visual response to the likely sources of stimuli, as determined, in this instance, by the known statistical relationship between scenes and their retinal responses. While the responses of the early visual system (receptors in this case) may represent specifically the statistics of images, post-receptor responses are more likely to represent the statistical relationship between images and scenes. A corollary of this suggestion is that the visual cortex is adapted to relate the retinal image to behaviour given the statistics of its past interactions with the sources of retinal images: the visual cortex is adapted to the signals it receives from the eyes, and not directly to the world beyond. PMID:19333398

  6. Quantifying risk of penile prosthesis infection with elevated glycosylated hemoglobin.

    PubMed

    Wilson, S K; Carson, C C; Cleves, M A; Delk, J R

    1998-05-01

    Elevation of glycosylated hemoglobin above levels of 11.5 mg.% has been considered a contraindication to penile prosthesis implantation in diabetic patients. We determine the predictive value of glycosylated hemoglobin A1C in penile prosthesis infections in diabetic and nondiabetic patients to confirm or deny this prevalent opinion. We conducted a 2-year prospective study of 389 patients, including 114 diabetics, who underwent 3-piece penile prosthesis implantation. All patients had similar preoperative preparation without regard to diabetic status, control or glycosylated hemoglobin A1C level. Risk of infection was statistically analyzed for diabetics versus nondiabetics, glycosylated hemoglobin A1C values above and below 11.5 mg.%, insulin dependent versus oral medication diabetics, and fasting blood sugars above and below 180 mg.%. Prosthesis infections developed in 10 diabetics (8.7%) and 11 nondiabetics (4.0%). No increased infection rate was observed in diabetics with high fasting sugars or diabetics on insulin. There was no statistically significant increased infection risk with increased levels of glycosylated hemoglobin A1C among all patients or among only the diabetics. In fact, there was no meaningful difference in the median or mean level of glycosylated hemoglobin A1C in the infected and noninfected patients regardless of diabetes. Use of glycosylated hemoglobin A1C values to identify and exclude surgical candidates with increased risk of infections is not proved by this study. Elevation of fasting sugar or insulin dependence also does not increase risk of infection in diabetics undergoing prosthesis implantation.
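
    To illustrate the kind of comparison reported here (infection in 10 of 114 diabetic versus 11 of roughly 275 nondiabetic implant patients, where 275 is inferred from the stated totals and should be treated as an assumption), the snippet below runs a Fisher exact test on that 2x2 table.

    ```python
    # Hedged sketch: compare infection proportions in diabetic vs. nondiabetic patients
    # with a Fisher exact test. Counts follow the abstract; 275 = 389 - 114 is inferred.
    from scipy.stats import fisher_exact

    table = [[10, 114 - 10],    # diabetics:    infected, not infected
             [11, 275 - 11]]    # nondiabetics: infected, not infected
    odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
    print(f"odds ratio = {odds_ratio:.2f}, two-sided p = {p_value:.3f}")
    ```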

  7. Application of statistical mining in healthcare data management for allergic diseases

    NASA Astrophysics Data System (ADS)

    Wawrzyniak, Zbigniew M.; Martínez Santolaya, Sara

    2014-11-01

    The paper discusses data mining techniques based on statistical tools for medical data management in the case of long-term diseases. Data collected from a population survey serve as the source for reasoning: identifying the disease processes responsible for a patient's illness and its symptoms, and supporting decisions on the course of action to correct the patient's condition. The case considered, as an example of this constructive approach to data management, is the dependence of chronic allergic diseases on selected symptoms and environmental conditions. The knowledge summarized systematically as accumulated experience constitutes a simplified, experiential model of the diseases, with a feature space built from a small set of indicators. We present a disease-symptom-opinion model with knowledge discovery for data management in healthcare. The model is purely data-driven: it evaluates knowledge of the disease processes and the probabilistic dependence of future disease events on symptoms and other attributes. The example drawn from the outcomes of a survey of long-term (chronic) disease shows that a small set of core indicators, such as four or more symptoms and opinions, can be very helpful in reflecting changes in health status over disease causes. Furthermore, a data-driven understanding of disease mechanisms gives physicians a basis for treatment choices, which underlines the need for data governance in this domain of knowledge discovered from surveys.

  8. Remodeling Pearson's Correlation for Functional Brain Network Estimation and Autism Spectrum Disorder Identification.

    PubMed

    Li, Weikai; Wang, Zhengxia; Zhang, Limei; Qiao, Lishan; Shen, Dinggang

    2017-01-01

    The functional brain network (FBN) has become an increasingly important way to model the statistical dependence among neural time courses of the brain, and it provides effective imaging biomarkers for diagnosis of some neurological or psychological disorders. Currently, Pearson's Correlation (PC) is the simplest and most widely used method for constructing FBNs. Despite its advantages in statistical meaning and computational performance, the PC tends to result in an FBN with dense connections. Therefore, in practice, the PC-based FBN needs to be sparsified by removing weak (potentially noisy) connections. However, such a scheme depends on a hard threshold and lacks flexibility. Different from this traditional strategy, in this paper, we propose a new approach for estimating FBNs by remodeling PC as an optimization problem, which provides a way to incorporate biological/physical priors into the FBNs. In particular, we introduce an L1-norm regularizer into the optimization model for obtaining a sparse solution. Compared with the hard-threshold scheme, the proposed framework gives an elegant mathematical formulation for sparsifying PC-based networks. More importantly, it provides a platform to encode other biological/physical priors into the PC-based FBNs. To further illustrate the flexibility of the proposed method, we extend the model to a weighted counterpart for learning both sparse and scale-free networks, and then conduct experiments to identify autism spectrum disorders (ASD) from normal controls (NC) based on the constructed FBNs. Consequently, we achieved an 81.52% classification accuracy, which outperforms the baseline and state-of-the-art methods.
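
    As a hedged sketch of the idea of remodeling Pearson's correlation as an optimization problem: for the simple form min_W ||W - C||_F^2 + λ||W||_1, where C is the Pearson correlation matrix, the solution is element-wise soft-thresholding of C, which yields a sparse network without a hard cut-off. The synthetic data, the value of λ, and the omission of the paper's additional priors and weighted extension are assumptions made for illustration.

    ```python
    # Hedged sketch: sparse functional network by L1-regularized remodeling of
    # Pearson's correlation (element-wise soft-thresholding of the correlation matrix).
    import numpy as np

    rng = np.random.default_rng(6)
    ts = rng.normal(size=(120, 20))                 # 120 time points, 20 regions (synthetic)
    C = np.corrcoef(ts, rowvar=False)               # dense PC-based network

    lam = 0.3                                       # assumed regularization strength
    W = np.sign(C) * np.maximum(np.abs(C) - lam / 2.0, 0.0)   # soft-threshold each edge
    np.fill_diagonal(W, 0.0)                        # drop self-connections

    density = np.count_nonzero(W) / (W.size - W.shape[0])
    print(f"edge density after sparsification: {density:.2%}")
    ```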

  9. Validity and reliability of three definitions of hip osteoarthritis: cross sectional and longitudinal approach.

    PubMed

    Reijman, M; Hazes, J M W; Pols, H A P; Bernsen, R M D; Koes, B W; Bierma-Zeinstra, S M A

    2004-11-01

    To compare the reliability and validity in a large open population of three frequently used radiological definitions of hip osteoarthritis (OA): Kellgren and Lawrence grade, minimal joint space (MJS), and Croft grade; and to investigate whether the validity of the three definitions of hip OA is sex dependent. Participants from the Rotterdam study (aged ≥55 years, n = 3585) were evaluated. The inter-rater reliability was tested in a random set of 148 x-rays. The validity was expressed as the ability to identify patients who show clinical symptoms of hip OA (construct validity) and as the ability to predict total hip replacement (THR) at follow up (predictive validity). Inter-rater reliability was similar for the Kellgren and Lawrence grade and MJS (kappa statistics 0.68 and 0.62, respectively) but lower for Croft's grade (kappa statistic, 0.51). The Kellgren and Lawrence grade and MJS showed the strongest associations with clinical symptoms of hip OA. Sex appeared to be an effect modifier for the Kellgren and Lawrence and MJS definitions, women showing a stronger association between grading and symptoms than men. However, the sex dependency was attributed to differences in height between women and men. The Kellgren and Lawrence grade showed the highest predictive value for THR at follow up. Based on these findings, Kellgren and Lawrence still appears to be a useful OA definition for epidemiological studies focusing on the presence of hip OA.
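
    For reference, inter-rater agreement of the kind quoted here (kappa of about 0.5-0.7 on 148 radiographs) can be computed from two raters' gradings as below; the gradings are simulated and the implementation is a plain Cohen's kappa, which may differ in detail from the statistic used in the study.

    ```python
    # Hedged sketch: Cohen's kappa for two raters grading the same radiographs.
    # Gradings are simulated; the study's exact kappa variant is not specified here.
    import numpy as np

    def cohen_kappa(r1, r2):
        r1, r2 = np.asarray(r1), np.asarray(r2)
        cats = np.union1d(r1, r2)
        p_o = np.mean(r1 == r2)                                        # observed agreement
        p_e = sum(np.mean(r1 == c) * np.mean(r2 == c) for c in cats)   # chance agreement
        return (p_o - p_e) / (1.0 - p_e)

    rng = np.random.default_rng(7)
    truth = rng.integers(0, 4, size=148)                      # underlying OA grade 0-3
    rater1 = np.where(rng.random(148) < 0.8, truth, rng.integers(0, 4, 148))
    rater2 = np.where(rng.random(148) < 0.8, truth, rng.integers(0, 4, 148))
    print(f"kappa = {cohen_kappa(rater1, rater2):.2f}")
    ```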

  10. Molecular-dynamics study of propane-hydrate dissociation: Fluctuation-dissipation and non-equilibrium analysis.

    PubMed

    Ghaani, Mohammad Reza; English, Niall J

    2018-03-21

    Equilibrium and non-equilibrium molecular-dynamics (MD) simulations have been performed to investigate thermal-driven break-up of planar propane-hydrate interfaces in contact with liquid water over the 260-320 K range. Two types of hydrate-surface water-lattice molecular termination were adopted, at the hydrate edge with water, for comparison: a 001-direct surface cleavage and one with completed cages. Statistically significant differences in melting temperatures and initial break-up rates were observed between both interface types. Dissociation rates were observed to be strongly dependent on temperature, with higher rates at larger over-temperatures vis-à-vis melting. A simple coupled mass and heat transfer model, developed previously, was applied to fit the observed dissociation profiles, and this helps us to identify clearly two distinct hydrate-decomposition régimes; following a highly temperature-dependent break-up phase, a second well-defined stage is essentially independent of temperature, in which the remaining nanoscale, de facto two-dimensional system's lattice framework is intrinsically unstable. Further equilibrium MD-analysis of the two-phase systems at their melting point, with consideration of the relaxation times gleaned from the auto-correlation functions of fluctuations in a number of enclathrated guest molecules, led to statistically significant differences between the two surface-termination cases; a consistent correlation emerged in both cases between the underlying, non-equilibrium, thermal-driven dissociation rates sampled directly from melting with that from an equilibrium-MD fluctuation-dissipation approach.

  11. Molecular-dynamics study of propane-hydrate dissociation: Fluctuation-dissipation and non-equilibrium analysis

    NASA Astrophysics Data System (ADS)

    Ghaani, Mohammad Reza; English, Niall J.

    2018-03-01

    Equilibrium and non-equilibrium molecular-dynamics (MD) simulations have been performed to investigate thermal-driven break-up of planar propane-hydrate interfaces in contact with liquid water over the 260-320 K range. Two types of hydrate-surface water-lattice molecular termination were adopted, at the hydrate edge with water, for comparison: a 001-direct surface cleavage and one with completed cages. Statistically significant differences in melting temperatures and initial break-up rates were observed between both interface types. Dissociation rates were observed to be strongly dependent on temperature, with higher rates at larger over-temperatures vis-à-vis melting. A simple coupled mass and heat transfer model, developed previously, was applied to fit the observed dissociation profiles, and this helps us to identify clearly two distinct hydrate-decomposition régimes; following a highly temperature-dependent break-up phase, a second well-defined stage is essentially independent of temperature, in which the remaining nanoscale, de facto two-dimensional system's lattice framework is intrinsically unstable. Further equilibrium MD-analysis of the two-phase systems at their melting point, with consideration of the relaxation times gleaned from the auto-correlation functions of fluctuations in a number of enclathrated guest molecules, led to statistically significant differences between the two surface-termination cases; a consistent correlation emerged in both cases between the underlying, non-equilibrium, thermal-driven dissociation rates sampled directly from melting with that from an equilibrium-MD fluctuation-dissipation approach.

  12. Maritime Transportation Risk Assessment of Tianjin Port with Bayesian Belief Networks.

    PubMed

    Zhang, Jinfen; Teixeira, Ângelo P; Guedes Soares, C; Yan, Xinping; Liu, Kezhong

    2016-06-01

    This article develops a Bayesian belief network model for the prediction of accident consequences in the Tianjin port. The study starts with a statistical analysis of historical accident data of six years from 2008 to 2013. Then a Bayesian belief network is constructed to express the dependencies between the indicator variables and accident consequences. The statistics and expert knowledge are synthesized in the Bayesian belief network model to obtain the probability distribution of the consequences. A sensitivity analysis identifies several indicator variables that influence the consequences, including navigational area, ship type, and time of day. The results indicate that the consequences are most sensitive to the position where the accidents occurred, followed by time of day and ship length. The results also reflect that the navigational risk of the Tianjin port is at an acceptable level, although there is still room for improvement. These results can be used by the Maritime Safety Administration to take effective measures to enhance maritime safety in the Tianjin port. © 2016 Society for Risk Analysis.

  13. The accurate assessment of small-angle X-ray scattering data

    DOE PAGES

    Grant, Thomas D.; Luft, Joseph R.; Carter, Lester G.; ...

    2015-01-23

    Small-angle X-ray scattering (SAXS) has grown in popularity in recent times with the advent of bright synchrotron X-ray sources, powerful computational resources and algorithms enabling the calculation of increasingly complex models. However, the lack of standardized data-quality metrics presents difficulties for the growing user community in accurately assessing the quality of experimental SAXS data. Here, a series of metrics to quantitatively describe SAXS data in an objective manner using statistical evaluations are defined. These metrics are applied to identify the effects of radiation damage, concentration dependence and interparticle interactions on SAXS data from a set of 27 previously described targets for which high-resolution structures have been determined via X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. Studies show that these metrics are sufficient to characterize SAXS data quality on a small sample set with statistical rigor and sensitivity similar to or better than manual analysis. The development of data-quality analysis strategies such as these initial efforts is needed to enable the accurate and unbiased assessment of SAXS data quality.

  14. A statistical model to estimate the local vulnerability to severe weather

    NASA Astrophysics Data System (ADS)

    Pardowitz, Tobias

    2018-06-01

    We present a spatial analysis of weather-related fire brigade operations in Berlin. By comparing operation occurrences to insured losses for a set of severe weather events we demonstrate the representativeness and usefulness of such data in the analysis of weather impacts on local scales. We investigate factors influencing the local rate of operation occurrence. While depending on multiple factors - which are often not available - we focus on publicly available quantities. These include topographic features, land use information based on satellite data and information on urban structure based on data from the OpenStreetMap project. After identifying suitable predictors such as housing coverage or local density of the road network we set up a statistical model to be able to predict the average occurrence frequency of local fire brigade operations. Such model can be used to determine potential hotspots for weather impacts even in areas or cities where no systematic records are available and can thus serve as a basis for a broad range of tools or applications in emergency management and planning.
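
    The count model described above can be sketched as a Poisson regression; the predictor names and simulated counts below are hypothetical stand-ins for the Berlin data, and the statsmodels package is assumed.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(0)
      n = 500
      housing_coverage = rng.uniform(0, 1, n)      # hypothetical predictor
      road_density = rng.uniform(0, 5, n)          # hypothetical predictor
      # Simulated operation counts per grid cell (illustrative only).
      counts = rng.poisson(np.exp(-1.0 + 1.5 * housing_coverage + 0.2 * road_density))

      X = sm.add_constant(np.column_stack([housing_coverage, road_density]))
      fit = sm.GLM(counts, X, family=sm.families.Poisson()).fit()
      print(fit.summary())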

  15. Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred; Volden, Thomas R.

    2010-01-01

    The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.
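
    A minimal sketch of the kind of linear-algebra check mentioned above for near-linear dependencies between regression model terms (condition number and variance inflation factors); the design matrix is synthetic, not balance calibration data.

      import numpy as np

      # Hypothetical design matrix of regression model terms (columns).
      rng = np.random.default_rng(1)
      x1 = rng.normal(size=200)
      x2 = rng.normal(size=200)
      x3 = 0.98 * x1 + 0.02 * rng.normal(size=200)   # nearly collinear with x1
      X = np.column_stack([np.ones(200), x1, x2, x3])

      # A large condition number flags near-linear dependence among the columns.
      print("condition number:", np.linalg.cond(X))

      # Variance inflation factor for each non-intercept term.
      for j in range(1, X.shape[1]):
          others = np.delete(X, j, axis=1)
          beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
          resid = X[:, j] - others @ beta
          r2 = 1 - resid.var() / X[:, j].var()
          print(f"term {j}: VIF = {1 / (1 - r2):.1f}")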

  16. Use of statistical design of experiments for surface modification of Kapton films by CF4-O2 microwave plasma treatment

    NASA Astrophysics Data System (ADS)

    Grandoni, Andrea; Mannini, Giacomo; Glisenti, Antonella; Manariti, Antonella; Galli, Giancarlo

    2017-10-01

    A statistical design of experiments (DoE) was used to evaluate the effects of CF4-O2 plasma on Kapton films in which the duration of treatment, volume ratio of plasma gases, and microwave power were selected as effective experimental factors for systematic investigation of surface modification. Static water contact angle (θW), polar component of surface free energy (γSp) and surface O/C atomic ratio were analyzed as response variables. A significant enhancement in wettability and polarity of the treated films compared to untreated Kapton films was observed; depending on the experimental conditions, θW very significantly decreased, showing full wettability, and γSp rose dramatically, up to ten times. Within the DoE the conditions of plasma treatment were identified that resulted in selected optimal values of θW, γSp and O/C responses. Surface chemical changes were detected by XPS and ATR-IR investigations that evidenced both the introduction of fluorinated groups and the opening of the imide ring in the plasma-treated films.
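
    A minimal sketch of how main effects are estimated from a two-level factorial design of experiments; the coded factors and contact-angle responses below are hypothetical, not the measured Kapton values.

      import numpy as np
      from itertools import product

      # Coded factor levels (-1/+1) for treatment time, gas ratio and microwave power.
      design = np.array(list(product([-1, 1], repeat=3)), dtype=float)

      # Hypothetical water-contact-angle responses for the eight runs.
      theta_w = np.array([78, 55, 74, 50, 60, 35, 58, 30], dtype=float)

      factors = ["time", "CF4/O2 ratio", "power"]
      for j, name in enumerate(factors):
          effect = theta_w[design[:, j] == 1].mean() - theta_w[design[:, j] == -1].mean()
          print(f"main effect of {name}: {effect:+.1f} deg")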

  17. Probabilistic biological network alignment.

    PubMed

    Todor, Andrei; Dobra, Alin; Kahveci, Tamer

    2013-01-01

    Interactions between molecules are probabilistic events. An interaction may or may not happen with some probability, depending on a variety of factors such as the size, abundance, or proximity of the interacting molecules. In this paper, we consider the problem of aligning two biological networks. Unlike existing methods, we allow one of the two networks to contain probabilistic interactions. Allowing interaction probabilities makes the alignment more biologically relevant at the expense of explosive growth in the number of alternative topologies that may arise from different subsets of interactions that take place. We develop a novel method that efficiently and precisely characterizes this massive search space. We represent the topological similarity between pairs of aligned molecules (i.e., proteins) with the help of random variables and compute their expected values. We validate our method showing that, without sacrificing the running time performance, it can produce novel alignments. Our results also demonstrate that our method identifies biologically meaningful mappings under a comprehensive set of criteria used in the literature as well as the statistical coherence measure that we developed to analyze the statistical significance of the similarity of the functions of the aligned protein pairs.

  18. Statistical models for RNA-seq data derived from a two-condition 48-replicate experiment.

    PubMed

    Gierliński, Marek; Cole, Christian; Schofield, Pietà; Schurch, Nicholas J; Sherstnev, Alexander; Singh, Vijender; Wrobel, Nicola; Gharbi, Karim; Simpson, Gordon; Owen-Hughes, Tom; Blaxter, Mark; Barton, Geoffrey J

    2015-11-15

    High-throughput RNA sequencing (RNA-seq) is now the standard method to determine differential gene expression. Identifying differentially expressed genes crucially depends on estimates of read-count variability. These estimates are typically based on statistical models such as the negative binomial distribution, which is employed by the tools edgeR, DESeq and cuffdiff. Until now, the validity of these models has usually been tested on either low-replicate RNA-seq data or simulations. A 48-replicate RNA-seq experiment in yeast was performed and data tested against theoretical models. The observed gene read counts were consistent with both log-normal and negative binomial distributions, while the mean-variance relation followed the line of constant dispersion parameter of ∼0.01. The high-replicate data also allowed for strict quality control and screening of 'bad' replicates, which can drastically affect the gene read-count distribution. RNA-seq data have been submitted to ENA archive with project ID PRJEB5348. g.j.barton@dundee.ac.uk. © The Author 2015. Published by Oxford University Press.
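
    The constant-dispersion relation mentioned above corresponds to variance = mu + phi*mu^2 for a negative binomial model; a short sketch using the reported dispersion of about 0.01 with hypothetical mean read counts.

      import numpy as np
      from scipy.stats import nbinom

      phi = 0.01                                      # dispersion reported in the abstract
      mu = np.array([10.0, 100.0, 1000.0, 10000.0])   # hypothetical mean read counts
      model_var = mu + phi * mu**2                    # negative binomial mean-variance relation

      # The same relation recovered from simulated counts
      # (scipy parameterisation: n = 1/phi, p = n / (n + mu)).
      n = 1.0 / phi
      for m, v in zip(mu, model_var):
          sample = nbinom.rvs(n, n / (n + m), size=50000, random_state=0)
          print(f"mu={m:8.0f}  model var={v:12.0f}  sample var={sample.var():12.0f}")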

  19. Statistical models for RNA-seq data derived from a two-condition 48-replicate experiment

    PubMed Central

    Cole, Christian; Schofield, Pietà; Schurch, Nicholas J.; Sherstnev, Alexander; Singh, Vijender; Wrobel, Nicola; Gharbi, Karim; Simpson, Gordon; Owen-Hughes, Tom; Blaxter, Mark; Barton, Geoffrey J.

    2015-01-01

    Motivation: High-throughput RNA sequencing (RNA-seq) is now the standard method to determine differential gene expression. Identifying differentially expressed genes crucially depends on estimates of read-count variability. These estimates are typically based on statistical models such as the negative binomial distribution, which is employed by the tools edgeR, DESeq and cuffdiff. Until now, the validity of these models has usually been tested on either low-replicate RNA-seq data or simulations. Results: A 48-replicate RNA-seq experiment in yeast was performed and data tested against theoretical models. The observed gene read counts were consistent with both log-normal and negative binomial distributions, while the mean-variance relation followed the line of constant dispersion parameter of ∼0.01. The high-replicate data also allowed for strict quality control and screening of ‘bad’ replicates, which can drastically affect the gene read-count distribution. Availability and implementation: RNA-seq data have been submitted to ENA archive with project ID PRJEB5348. Contact: g.j.barton@dundee.ac.uk PMID:26206307

  20. Self-efficacy: a mediator of smoking behavior and depression among college students.

    PubMed

    Mee, Susan

    2014-01-01

    Cigarette smoking is a growing problem among adolescents. This correlational study tested theoretical relationships between the dependent variable (smoking behavior) and the independent variables (depression and smoking resistance self-efficacy) in a convenience sample of 364 college students ages 18 to 21 years recruited from a large urban public college. An a priori mediational model tested the role of smoking resistance self-efficacy as a mediator in the relationship between smoking behavior and depression. Findings showed there was a statistically significant positive relationship between depression and smoking behavior (r = 0.122, p = 0.01). There was a statistically significant negative relationship between smoking resistance self-efficacy and smoking behavior (r = -0.744, p = 0.01). Additionally, smoking resistance self-efficacy was a mediator of the relationship between depression and smoking behavior (beta = -0.757, p = 0.001). This study identifies a need for further theory-driven study of the relation of adolescent depression and smoking behavior. The findings of this study have implications for nursing interventions targeted to both current smokers and smoking initiation prevention programs.
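
    A minimal sketch of the product-of-coefficients approach to a mediation model like the one described above (self-efficacy mediating the depression-smoking link); the data are simulated, not the study sample, and statsmodels is assumed.

      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(2)
      n = 364
      depression = rng.normal(size=n)                          # hypothetical scores
      self_efficacy = -0.4 * depression + rng.normal(size=n)   # mediator
      smoking = -0.7 * self_efficacy + 0.05 * depression + rng.normal(size=n)

      # Path a: predictor -> mediator; path b: mediator -> outcome, adjusting for the predictor.
      a = sm.OLS(self_efficacy, sm.add_constant(depression)).fit().params[1]
      Xb = sm.add_constant(np.column_stack([self_efficacy, depression]))
      b = sm.OLS(smoking, Xb).fit().params[1]
      print(f"indirect (mediated) effect a*b = {a * b:.3f}")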

  1. Statistics of initial density perturbations in heavy ion collisions and their fluid dynamic response

    NASA Astrophysics Data System (ADS)

    Floerchinger, Stefan; Wiedemann, Urs Achim

    2014-08-01

    An interesting opportunity to determine thermodynamic and transport properties in more detail is to identify generic statistical properties of initial density perturbations. Here we study event-by-event fluctuations in terms of correlation functions for two models that can be solved analytically. The first assumes Gaussian fluctuations around a distribution that is fixed by the collision geometry but leads to non-Gaussian features after averaging over the reaction plane orientation at non-zero impact parameter. In this context, we derive a three-parameter extension of the commonly used Bessel-Gaussian event-by-event distribution of harmonic flow coefficients. Secondly, we study a model of N independent point sources for which connected n-point correlation functions of initial perturbations scale like 1/N^(n-1). This scaling is violated for non-central collisions in a way that can be characterized by its impact parameter dependence. We discuss to what extent these are generic properties that can be expected to hold for any model of initial conditions, and how this can improve the fluid dynamical analysis of heavy ion collisions.

  2. The socio-spatial context as a risk factor for hospitalization due to mental illness in the metropolitan areas of Portugal.

    PubMed

    Loureiro, Adriana; Costa, Cláudia; Almendra, Ricardo; Freitas, Ângela; Santana, Paula

    2015-11-01

    This study's aims are: (i) identifying spatial patterns for the risk of hospitalization due to mental illness and for the potential risk resulting from contextual factors with influence on mental health; and (ii) analyzing the spatial association between risk of hospitalization due to mental illness and potential risk resulting from contextual factors in the metropolitan areas of Lisbon and Porto, Portugal. A cross-sectional ecological study was conducted by applying statistical methods for assessing spatial dependency and heterogeneity. Results reveal a spatial association of moderate intensity between risk of hospitalization due to mental illness and potential risk resulting from contextual factors. Twenty percent of the population under study lives in areas with a simultaneously high potential risk resulting from contextual factors and risk of hospitalization due to mental illness. The Porto Metropolitan Area shows the highest percentage of the population living in parishes with a significantly high risk of hospitalization due to mental illness, highlighting the need for interventions on territory-adjusted contextual factors influencing mental health.
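
    One common statistic for the kind of spatial dependency assessed above is Moran's I; a minimal sketch on a hypothetical grid of parishes with rook-contiguity weights.

      import numpy as np

      # Hypothetical hospitalization risk values on a small 5x5 grid of parishes.
      rng = np.random.default_rng(3)
      values = rng.normal(size=25).reshape(5, 5)

      # Rook-contiguity spatial weights on the grid.
      n = values.size
      W = np.zeros((n, n))
      for i in range(5):
          for j in range(5):
              for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
                  if 0 <= i + di < 5 and 0 <= j + dj < 5:
                      W[i * 5 + j, (i + di) * 5 + (j + dj)] = 1

      z = values.ravel() - values.mean()
      moran_i = (n / W.sum()) * (z @ W @ z) / (z @ z)
      print(f"Moran's I = {moran_i:.3f}  (expected under no dependence: {-1 / (n - 1):.3f})")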

  3. The Earthquake‐Source Inversion Validation (SIV) Project

    USGS Publications Warehouse

    Mai, P. Martin; Schorlemmer, Danijel; Page, Morgan T.; Ampuero, Jean-Paul; Asano, Kimiyuki; Causse, Mathieu; Custodio, Susana; Fan, Wenyuan; Festa, Gaetano; Galis, Martin; Gallovic, Frantisek; Imperatori, Walter; Käser, Martin; Malytskyy, Dmytro; Okuwaki, Ryo; Pollitz, Fred; Passone, Luca; Razafindrakoto, Hoby N. T.; Sekiguchi, Haruko; Song, Seok Goo; Somala, Surendra N.; Thingbaijam, Kiran K. S.; Twardzik, Cedric; van Driel, Martin; Vyas, Jagdish C.; Wang, Rongjiang; Yagi, Yuji; Zielke, Olaf

    2016-01-01

    Finite‐fault earthquake source inversions infer the (time‐dependent) displacement on the rupture surface from geophysical data. The resulting earthquake source models document the complexity of the rupture process. However, multiple source models for the same earthquake, obtained by different research teams, often exhibit remarkable dissimilarities. To address the uncertainties in earthquake‐source inversion methods and to understand strengths and weaknesses of the various approaches used, the Source Inversion Validation (SIV) project conducts a set of forward‐modeling exercises and inversion benchmarks. In this article, we describe the SIV strategy, the initial benchmarks, and current SIV results. Furthermore, we apply statistical tools for quantitative waveform comparison and for investigating source‐model (dis)similarities that enable us to rank the solutions, and to identify particularly promising source inversion approaches. All SIV exercises (with related data and descriptions) and statistical comparison tools are available via an online collaboration platform, and we encourage source modelers to use the SIV benchmarks for developing and testing new methods. We envision that the SIV efforts will lead to new developments for tackling the earthquake‐source imaging problem.

  4. Mechanical and statistical evidence of the causality of human-made mass shifts on the Earth's upper crust and the occurrence of earthquakes

    NASA Astrophysics Data System (ADS)

    Klose, Christian D.

    2013-01-01

    A global catalog of small- to large-sized earthquakes was systematically analyzed to identify causal and correlative relationships between human-made mass shifts in the upper Earth's crust and the occurrence of earthquakes. The mass shifts, ranging between 1 kt and 1 Tt, result from large-scale geoengineering operations, including mining, water reservoirs, hydrocarbon production, fluid injection/extractions, deep geothermal energy production and coastal management. This article shows evidence that geomechanical relationships exist with statistical significance between (a) seismic moment magnitudes M of observed earthquakes, (b) lateral distances of the earthquake hypocenters to the geoengineering "operation points" and (c) mass removals or accumulations on the Earth's crust. Statistical findings depend on uncertainties, in particular, of source parameter estimations of seismic events before instrumental recording. Statistical observations, however, indicate that every second seismic event tends to occur after a decade. The chance that an earthquake nucleates after 2 or 20 years near an area with a significant mass shift is 25 or 75%, respectively. Moreover, causative effects of seismic activities highly depend on the tectonic stress regime in which the operations take place (i.e., extensive, transverse or compressive). Results are summarized as follows: First, seismic moment magnitudes increase the more mass is locally shifted on the Earth's crust. Second, seismic moment magnitudes increase the larger the area in the crust is geomechanically polluted. Third, reverse faults tend to be more trigger-sensitive than normal faults due to a stronger alteration of the minimum vertical principal stress component. Pure strike-slip faults seem to rupture randomly and independently of the magnitude of the mass changes. Finally, mainly due to high estimation uncertainties of source parameters and, in particular, of shallow seismic events (<10 km), it still remains very difficult to discriminate between induced and triggered earthquakes with respect to the data catalog of this study. However, first analyses indicate that small- to medium-sized earthquakes (M6) seem to be triggered. The rupture propagation of triggered events might be dominated by pre-existing tectonic stress conditions.

  5. Prediction of hemoglobin in blood donors using a latent class mixed-effects transition model.

    PubMed

    Nasserinejad, Kazem; van Rosmalen, Joost; de Kort, Wim; Rizopoulos, Dimitris; Lesaffre, Emmanuel

    2016-02-20

    Blood donors experience a temporary reduction in their hemoglobin (Hb) value after donation. At each visit, the Hb value is measured, and a too low Hb value leads to a deferral for donation. Because of the recovery process after each donation as well as state dependence and unobserved heterogeneity, longitudinal data of Hb values of blood donors provide unique statistical challenges. To estimate the shape and duration of the recovery process and to predict future Hb values, we employed three models for the Hb value: (i) a mixed-effects model; (ii) a latent-class mixed-effects model; and (iii) a latent-class mixed-effects transition model. In each model, a flexible function was used to model the recovery process after donation. The latent classes identify groups of donors with fast or slow recovery times and donors whose recovery time increases with the number of donations. The transition effect accounts for possible state dependence in the observed data. All models were estimated in a Bayesian way, using data of new entrant donors from the Donor InSight study. Informative priors were used for parameters of the recovery process that were not identified using the observed data, based on results from the clinical literature. The results show that the latent-class mixed-effects transition model fits the data best, which illustrates the importance of modeling state dependence, unobserved heterogeneity, and the recovery process after donation. The estimated recovery time is much longer than the current minimum interval between donations, suggesting that an increase of this interval may be warranted. Copyright © 2015 John Wiley & Sons, Ltd.

  6. Detecting subtle hydrochemical anomalies with multivariate statistics: an example from homogeneous groundwaters in the Great Artesian Basin, Australia

    NASA Astrophysics Data System (ADS)

    O'Shea, Bethany; Jankowski, Jerzy

    2006-12-01

    The major ion composition of Great Artesian Basin groundwater in the lower Namoi River valley is relatively homogeneous in chemical composition. Traditional graphical techniques have been combined with multivariate statistical methods to determine whether subtle differences in the chemical composition of these waters can be delineated. Hierarchical cluster analysis and principal components analysis were successful in delineating minor variations within the groundwaters of the study area that were not visually identified in the graphical techniques applied. Hydrochemical interpretation allowed geochemical processes to be identified in each statistically defined water type and illustrated how these groundwaters differ from one another. Three main geochemical processes were identified in the groundwaters: ion exchange, precipitation, and mixing between waters from different sources. Both statistical methods delineated an anomalous sample suspected of being influenced by magmatic CO2 input. The use of statistical methods to complement traditional graphical techniques for waters appearing homogeneous is emphasized for all investigations of this type.
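
    A minimal sketch of combining hierarchical cluster analysis and principal components analysis on hydrochemical data of the kind described above; the ion concentrations are simulated, and scikit-learn and SciPy are assumed.

      import numpy as np
      from scipy.cluster.hierarchy import linkage, fcluster
      from sklearn.decomposition import PCA
      from sklearn.preprocessing import StandardScaler

      # Hypothetical major-ion concentrations (rows = samples; columns = Na, Ca, Mg, Cl, HCO3).
      rng = np.random.default_rng(4)
      ions = rng.lognormal(mean=2.0, sigma=0.3, size=(60, 5))

      X = StandardScaler().fit_transform(np.log(ions))   # log-transform and standardize

      # Hierarchical cluster analysis (Ward linkage) to group chemically similar waters.
      clusters = fcluster(linkage(X, method="ward"), t=3, criterion="maxclust")

      # Principal components analysis to see which ions drive the variation.
      pca = PCA(n_components=2).fit(X)
      print("explained variance ratios:", pca.explained_variance_ratio_)
      print("cluster sizes:", np.bincount(clusters)[1:])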

  7. Andreev Bound States Formation and Quasiparticle Trapping in Quench Dynamics Revealed by Time-Dependent Counting Statistics.

    PubMed

    Souto, R Seoane; Martín-Rodero, A; Yeyati, A Levy

    2016-12-23

    We analyze the quantum quench dynamics in the formation of a phase-biased superconducting nanojunction. We find that in the absence of an external relaxation mechanism and for very general conditions the system gets trapped in a metastable state, corresponding to a nonequilibrium population of the Andreev bound states. The use of the time-dependent full counting statistics analysis allows us to extract information on the asymptotic population of even and odd many-body states, demonstrating that a universal behavior, dependent only on the Andreev state energy, is reached in the quantum point contact limit. These results shed light on recent experimental observations on quasiparticle trapping in superconducting atomic contacts.

  8. The Global Statistical Response of the Outer Radiation Belt During Geomagnetic Storms

    NASA Astrophysics Data System (ADS)

    Murphy, K. R.; Watt, C. E. J.; Mann, I. R.; Jonathan Rae, I.; Sibeck, D. G.; Boyd, A. J.; Forsyth, C. F.; Turner, D. L.; Claudepierre, S. G.; Baker, D. N.; Spence, H. E.; Reeves, G. D.; Blake, J. B.; Fennell, J.

    2018-05-01

    Using the total radiation belt electron content calculated from Van Allen Probe phase space density, the time-dependent and global response of the outer radiation belt during storms is statistically studied. Using phase space density reduces the impacts of adiabatic changes in the main phase, allowing a separation of adiabatic and nonadiabatic effects and revealing a clear modality and repeatable sequence of events in storm time radiation belt electron dynamics. This sequence exhibits an important first adiabatic invariant (μ)-dependent behavior in the seed (150 MeV/G), relativistic (1,000 MeV/G), and ultrarelativistic (4,000 MeV/G) populations. The outer radiation belt statistically shows an initial phase dominated by loss followed by a second phase of rapid acceleration, while the seed population shows little loss and immediate enhancement. The time sequence of the transition to the acceleration is also strongly μ dependent and occurs at low μ first, appearing to be repeatable from storm to storm.

  9. Statistical Analysis of Human Body Movement and Group Interactions in Response to Music

    NASA Astrophysics Data System (ADS)

    Desmet, Frank; Leman, Marc; Lesaffre, Micheline; de Bruyn, Leen

    Quantification of time series that relate to physiological data is challenging for empirical music research. Up to now, most studies have focused on time-dependent responses of individual subjects in controlled environments. However, little is known about time-dependent responses of between-subject interactions in an ecological context. This paper provides new findings on the statistical analysis of group synchronicity in response to musical stimuli. Different statistical techniques were applied to time-dependent data obtained from an experiment on embodied listening in individual and group settings. Analyses of inter-group synchronicity are described. Dynamic Time Warping (DTW) and Cross Correlation Function (CCF) were found to be valid methods to estimate group coherence of the resulting movements. It was found that synchronicity of movements between individuals (human-human interactions) increases significantly in the social context. Moreover, Analysis of Variance (ANOVA) revealed that the type of music is the predominant factor in both the individual and the social context.
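
    A minimal sketch of the two synchrony measures named above, dynamic time warping and the cross-correlation function, applied to two simulated movement signals rather than the experimental recordings.

      import numpy as np

      def dtw_distance(a, b):
          # Classic dynamic-time-warping distance between two 1-D movement signals.
          n, m = len(a), len(b)
          D = np.full((n + 1, m + 1), np.inf)
          D[0, 0] = 0.0
          for i in range(1, n + 1):
              for j in range(1, m + 1):
                  cost = abs(a[i - 1] - b[j - 1])
                  D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
          return D[n, m]

      rng = np.random.default_rng(5)
      t = np.linspace(0, 10, 400)
      subject1 = np.sin(2 * np.pi * 0.5 * t) + 0.1 * rng.normal(size=t.size)
      subject2 = np.sin(2 * np.pi * 0.5 * (t - 0.3)) + 0.1 * rng.normal(size=t.size)  # lagged copy

      # Cross-correlation: the location of the peak indicates the lag between the two subjects.
      x1, x2 = subject1 - subject1.mean(), subject2 - subject2.mean()
      ccf = np.correlate(x1, x2, mode="full")
      lag = np.argmax(ccf) - (t.size - 1)
      print(f"DTW distance = {dtw_distance(subject1, subject2):.1f}, peak lag = {lag} samples")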

  10. Validating Future Force Performance Measures (Army Class): Concluding Analyses

    DTIC Science & Technology

    2016-06-01

    [Full abstract not indexed; the available excerpt lists report tables of descriptive statistics and intercorrelations for final predictor factor scores and analysis criteria, and six criterion dimensions for Soldier attrition and performance: Dependability (Non-Delinquency), Adjustment, Physical Conditioning, Leadership, Work Orientation, and Agreeableness.]

  11. Comparing Methods for Item Analysis: The Impact of Different Item-Selection Statistics on Test Difficulty

    ERIC Educational Resources Information Center

    Jones, Andrew T.

    2011-01-01

    Practitioners often depend on item analysis to select items for exam forms and have a variety of options available to them. These include the point-biserial correlation, the agreement statistic, the B index, and the phi coefficient. Although research has demonstrated that these statistics can be useful for item selection, no research as of yet has…
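
    Of the item-selection statistics listed above, the point-biserial correlation is the simplest to illustrate; the item responses and total scores below are simulated, and SciPy is assumed.

      import numpy as np
      from scipy.stats import pointbiserialr

      rng = np.random.default_rng(6)
      total_score = rng.normal(50, 10, size=300)          # hypothetical exam scores
      # Simulated item responses: higher-ability examinees answer correctly more often.
      p_correct = 1 / (1 + np.exp(-(total_score - 50) / 8))
      item = (rng.uniform(size=300) < p_correct).astype(int)

      r_pb, p_value = pointbiserialr(item, total_score)
      print(f"point-biserial correlation = {r_pb:.2f} (p = {p_value:.3g})")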

  12. Comparing Trend and Gap Statistics across Tests: Distributional Change Using Ordinal Methods and Bayesian Inference

    ERIC Educational Resources Information Center

    Denbleyker, John Nickolas

    2012-01-01

    The shortcomings of the proportion above cut (PAC) statistic used so prominently in the educational landscape renders it a very problematic measure for making correct inferences with student test data. The limitations of PAC-based statistics are more pronounced with cross-test comparisons due to their dependency on cut-score locations. A better…

  13. Co-acting gene networks predict TRAIL responsiveness of tumour cells with high accuracy.

    PubMed

    O'Reilly, Paul; Ortutay, Csaba; Gernon, Grainne; O'Connell, Enda; Seoighe, Cathal; Boyce, Susan; Serrano, Luis; Szegezdi, Eva

    2014-12-19

    Identification of differentially expressed genes from transcriptomic studies is one of the most common mechanisms to identify tumor biomarkers. This approach, however, is not well suited to identify interactions between genes whose protein products potentially influence each other, which limits its power to identify the molecular wiring of tumour cells dictating response to a drug. Due to the fact that signal transduction pathways are not linear and highly interlinked, the biological response they drive may be better described by the relative amount of their components and their functional relationships than by their individual, absolute expression. Gene expression microarray data for 109 tumor cell lines with known sensitivity to the death ligand cytokine tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) was used to identify genes with potential functional relationships determining responsiveness to TRAIL-induced apoptosis. The machine learning technique Random Forest in the statistical environment "R" with backward elimination was used to identify the key predictors of TRAIL sensitivity, and differentially expressed genes were identified using the software GeneSpring. Gene co-regulation and statistical interaction was assessed with q-order partial correlation analysis and non-rejection rate. Biological (functional) interactions amongst the co-acting genes were studied with Ingenuity network analysis. Prediction accuracy was assessed by calculating the area under the receiver operator curve using an independent dataset. We show that the gene panel identified could predict TRAIL-sensitivity with a very high degree of sensitivity and specificity (AUC = 0.84). The genes in the panel are co-regulated and at least 40% of them functionally interact in signal transduction pathways that regulate cell death and cell survival, cellular differentiation and morphogenesis. Importantly, only 12% of the TRAIL-predictor genes were differentially expressed, highlighting the importance of functional interactions in predicting the biological response. The advantage of co-acting gene clusters is that this analysis does not depend on differential expression and is able to incorporate direct and indirect gene interactions as well as tissue- and cell-specific characteristics. This approach (1) identified a descriptor of TRAIL sensitivity which performs significantly better as a predictor of TRAIL sensitivity than any previously reported gene signatures, (2) identified potential novel regulators of TRAIL-responsiveness and (3) provided a systematic view highlighting fundamental differences between the molecular wiring of sensitive and resistant cell types.
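
    A rough sketch of the workflow described above (a random forest with backward elimination of features, evaluated by the area under the ROC curve), written with scikit-learn on synthetic data; the dataset sizes and settings are illustrative and do not reproduce the study's R/GeneSpring pipeline.

      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.feature_selection import RFE
      from sklearn.metrics import roc_auc_score
      from sklearn.model_selection import train_test_split

      # Synthetic stand-in for expression data: 109 "cell lines" x 500 "genes".
      X, y = make_classification(n_samples=109, n_features=500, n_informative=20,
                                 random_state=0)
      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                          random_state=0)

      # Backward elimination of features driven by random-forest importances.
      rf = RandomForestClassifier(n_estimators=200, random_state=0)
      selector = RFE(rf, n_features_to_select=25, step=0.1).fit(X_train, y_train)

      probs = selector.predict_proba(X_test)[:, 1]
      print("AUC on held-out samples:", round(roc_auc_score(y_test, probs), 2))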

  14. Understanding Statistics and Statistics Education: A Chinese Perspective

    ERIC Educational Resources Information Center

    Shi, Ning-Zhong; He, Xuming; Tao, Jian

    2009-01-01

    In recent years, statistics education in China has made great strides. However, there still exists a fairly large gap with the advanced levels of statistics education in more developed countries. In this paper, we identify some existing problems in statistics education in Chinese schools and make some proposals as to how they may be overcome. We…

  15. Are the Nonparametric Person-Fit Statistics More Powerful than Their Parametric Counterparts? Revisiting the Simulations in Karabatsos (2003)

    ERIC Educational Resources Information Center

    Sinharay, Sandip

    2017-01-01

    Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristics curves and found the "H[superscript T]" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…

  16. Nuclear magnetic resonance (NMR) study of the effect of cisplatin on the metabolic profile of MG-63 osteosarcoma cells.

    PubMed

    Duarte, Iola F; Lamego, Ines; Marques, Joana; Marques, M Paula M; Blaise, Benjamin J; Gil, Ana M

    2010-11-05

    In the present study, (1)H HRMAS NMR spectroscopy was used to assess the changes in the intracellular metabolic profile of MG-63 human osteosarcoma (OS) cells induced by the chemotherapy agent cisplatin (CDDP) at different times of exposure. Multivariate analysis was applied to the cells spectra, enabling consistent variation patterns to be detected and drug-specific metabolic effects to be identified. Statistical recoupling of variables (SRV) analysis and spectral integration enabled the most relevant spectral changes to be evaluated, revealing significant time-dependent alterations in lipids, choline-containing compounds, some amino acids, polyalcohols, and nitrogenated bases. The metabolic relevance of these compounds in the response of MG-63 cells to CDDP treatment is discussed.

  17. A parametric multiclass Bayes error estimator for the multispectral scanner spatial model performance evaluation

    NASA Technical Reports Server (NTRS)

    Mobasseri, B. G.; Mcgillem, C. D.; Anuta, P. E. (Principal Investigator)

    1978-01-01

    The author has identified the following significant results. The probability of correct classification of various populations in data was defined as the primary performance index. The multispectral data, being of a multiclass nature as well, required a Bayes error estimation procedure dependent on a set of class statistics alone. The classification error was expressed in terms of an N dimensional integral, where N was the dimensionality of the feature space. The multispectral scanner spatial model was represented by a linear, shift-invariant, multiple-port system in which the N spectral bands comprised the input processes. The scanner characteristic function, the relationship governing the transformation of the input spatial (and hence spectral) correlation matrices through the system, was developed.

  18. Evaluation of the quality of the teaching-learning process in undergraduate courses in Nursing.

    PubMed

    González-Chordá, Víctor Manuel; Maciá-Soler, María Loreto

    2015-01-01

    To identify aspects of improvement of the quality of the teaching-learning process through the analysis of tools that evaluated the acquisition of skills by undergraduate students of Nursing. A prospective longitudinal study was conducted in a population of 60 second-year Nursing students based on registration data, from which quality indicators that evaluate the acquisition of skills were obtained, with descriptive and inferential analysis. Nine items were identified and nine learning activities included in the assessment tools that did not reach the established quality indicators (p<0.05). There are statistically significant differences depending on the hospital and clinical practices unit (p<0.05). The analysis of the evaluation tools used in the article "Nursing Care in Welfare Processes" of the analyzed university undergraduate course enabled the detection of the areas for improvement in the teaching-learning process. The challenge of education in nursing is to reach the best clinical research and educational results, in order to provide improvements to the quality of education and health care.

  19. Correlation approach to identify coding regions in DNA sequences

    NASA Technical Reports Server (NTRS)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.

  20. Entropy of Leukemia on Multidimensional Morphological and Molecular Landscapes

    NASA Astrophysics Data System (ADS)

    Vilar, Jose M. G.

    2014-04-01

    Leukemia epitomizes the class of highly complex diseases that new technologies aim to tackle by using large sets of single-cell-level information. Achieving such a goal depends critically not only on experimental techniques but also on approaches to interpret the data. A most pressing issue is to identify the salient quantitative features of the disease from the resulting massive amounts of information. Here, I show that the entropies of cell-population distributions on specific multidimensional molecular and morphological landscapes provide a set of measures for the precise characterization of normal and pathological states, such as those corresponding to healthy individuals and acute myeloid leukemia (AML) patients. I provide a systematic procedure to identify the specific landscapes and illustrate how, applied to cell samples from peripheral blood and bone marrow aspirates, this characterization accurately diagnoses AML from just flow cytometry data. The methodology can generally be applied to other types of cell populations and establishes a straightforward link between the traditional statistical thermodynamics methodology and biomedical applications.
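
    A minimal sketch of computing the entropy of a cell-population distribution on a binned two-dimensional marker landscape, in the spirit of the measure described above; the flow-cytometry-like data are simulated.

      import numpy as np

      rng = np.random.default_rng(7)
      # Hypothetical flow-cytometry readout: two marker intensities per cell.
      cells = rng.normal(loc=[2.0, 3.0], scale=[0.5, 0.8], size=(10000, 2))

      # Entropy of the cell-population distribution on a binned 2-D marker landscape.
      hist, _ = np.histogramdd(cells, bins=(30, 30))
      p = hist.ravel() / hist.sum()
      p = p[p > 0]
      entropy = -np.sum(p * np.log2(p))
      print(f"population entropy = {entropy:.2f} bits")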

  1. Genome-wide association study in discordant sibships identifies multiple inherited susceptibility alleles linked to lung cancer.

    PubMed

    Galvan, Antonella; Falvella, Felicia S; Frullanti, Elisa; Spinola, Monica; Incarbone, Matteo; Nosotti, Mario; Santambrogio, Luigi; Conti, Barbara; Pastorino, Ugo; Gonzalez-Neira, Anna; Dragani, Tommaso A

    2010-03-01

    We analyzed a series of young (median age = 52 years) non-smoker lung cancer patients and their unaffected siblings as controls, using a genome-wide 620 901 single-nucleotide polymorphism (SNP) array analysis and a case-control DNA pooling approach. We identified 82 putatively associated SNPs that were retested by individual genotyping followed by use of the sib transmission disequilibrium test, pointing to 36 SNPs associated with lung cancer risk in the discordant sibs series. Analysis of these 36 SNPs in a polygenic model characterized by additive and interchangeable effects of rare alleles revealed a highly statistically significant dosage-dependent association between risk allele carrier status and proportion of cancer cases. Replication of the same 36 SNPs in a population-based series confirmed the association with lung cancer for three SNPs, suggesting that phenocopies and genetic heterogeneity can play a major role in the complex genetics of lung cancer risk in the general population.

  2. An Entropy-Based Measure of Dependence between Two Groups of Random Variables. Research Report. ETS RR-07-20

    ERIC Educational Resources Information Center

    Kong, Nan

    2007-01-01

    In multivariate statistics, the linear relationship among random variables has been fully explored in the past. This paper looks into the dependence of one group of random variables on another group of random variables using (conditional) entropy. A new measure, called the K-dependence coefficient or dependence coefficient, is defined using…

  3. Caregivers' burden in patients with COPD.

    PubMed

    Miravitlles, Marc; Peña-Longobardo, Luz María; Oliva-Moreno, Juan; Hidalgo-Vega, Álvaro

    2015-01-01

    Chronic obstructive pulmonary disease (COPD) is a very prevalent and invalidating disease. The aim of this study was to analyze the burden borne by informal caregivers of patients with COPD. We used the Survey on Disabilities, Personal Autonomy, and Dependency Situations (Encuesta sobre Discapacidad, Autonomía personal y Situaciones de Dependencia [EDAD]-2008) to obtain information on the characteristics of disabled individuals with COPD and their caregivers in Spain. Additionally, statistical multivariate analyses were performed to analyze the impact that an increase in dependence would have on the problems for which caregivers provide support, in terms of health, professional, and leisure/social dimensions. A total of 461,884 individuals with one or more disabilities and with COPD were identified, and 220,892 informal caregivers were estimated. Results showed that 35% of informal caregivers had health-related problems due to the caregiving provided; 83% had leisure/social-related problems; and among caregivers of working age, 38% recognized having profession-related problems. The probability of a problem arising was significantly associated with the degree of dependence of the patient receiving care. Caregivers of patients with great dependence showed a 39% higher probability of presenting health-related problems, 27% more professional problems, and 23% more leisure problems compared with those with nondependent patients. The results show the large impact on society in terms of the welfare of informal caregivers of patients with COPD. A higher level of dependence was associated with more severe problems in caregivers, in all dimensions.

  4. Caregivers’ burden in patients with COPD

    PubMed Central

    Miravitlles, Marc; Peña-Longobardo, Luz María; Oliva-Moreno, Juan; Hidalgo-Vega, Álvaro

    2015-01-01

    Objective Chronic obstructive pulmonary disease (COPD) is a very prevalent and invalidating disease. The aim of this study was to analyze the burden borne by informal caregivers of patients with COPD. Methods We used the Survey on Disabilities, Personal Autonomy, and Dependency Situations (Encuesta sobre Discapacidad, Autonomía personal y Situaciones de Dependencia [EDAD]-2008) to obtain information on the characteristics of disabled individuals with COPD and their caregivers in Spain. Additionally, statistical multivariate analyses were performed to analyze the impact that an increase in dependence would have on the problems for which caregivers provide support, in terms of health, professional, and leisure/social dimensions. Results A total of 461,884 individuals with one or more disabilities and with COPD were identified, and 220,892 informal caregivers were estimated. Results showed that 35% of informal caregivers had health-related problems due to the caregiving provided; 83% had leisure/social-related problems; and among caregivers of working age, 38% recognized having profession-related problems. The probability of a problem arising was significantly associated with the degree of dependence of the patient receiving care. Caregivers of patients with great dependence showed a 39% higher probability of presenting health-related problems, 27% more professional problems, and 23% more leisure problems compared with those with nondependent patients. Conclusion The results show the large impact on society in terms of the welfare of informal caregivers of patients with COPD. A higher level of dependence was associated with more severe problems in caregivers, in all dimensions. PMID:25709429

  5. A Bifactor Approach to Model Multifaceted Constructs in Statistical Mediation Analysis

    ERIC Educational Resources Information Center

    Gonzalez, Oscar; MacKinnon, David P.

    2018-01-01

    Statistical mediation analysis allows researchers to identify the most important mediating constructs in the causal process studied. Identifying specific mediators is especially relevant when the hypothesized mediating construct consists of multiple related facets. The general definition of the construct and its facets might relate differently to…

  6. CAGE, RAPS4, RAPS4-QF and AUDIT Screening Tests for Men and Women Admitted for Acute Alcohol Intoxication to an Emergency Department: Are Standard Thresholds Appropriate?

    PubMed Central

    Geneste, J.; Pereira, B.; Arnaud, B.; Christol, N.; Liotier, J.; Blanc, O.; Teissedre, F.; Hope, S.; Schwan, R.; Llorca, P.M.; Schmidt, J.; Cherpitel, C.J.; Malet, L.; Brousse, G.

    2012-01-01

    Aims: A number of screening instruments are routinely used in Emergency Department (ED) situations to identify alcohol-use disorders (AUD). We wished to study the psychometric features, particularly concerning optimal threshold scores (TSs), of four assessment scales frequently used to screen for abuse and/or dependence, the cut-down annoyed guilty eye-opener (CAGE), Rapid Alcohol Problem Screen 4 (RAPS4), RAPS4-quantity-frequency and AUD Identification Test (AUDIT) questionnaires, particularly in the sub-group of people admitted for acute alcohol intoxication (AAI). Methods: All included patients [AAI admitted to ED (blood alcohol level ≥0.8 g/l)] were assessed by the four scales, and with a gold standard (alcohol dependence/abuse section of the Mini International Neuropsychiatric Interview), to determine AUD status. To investigate the TSs of the scales, we used Youden's index, efficiency, receiver operating characteristic (ROC) curve techniques and quality ROC curve technique for optimized TS (indices of quality). Results: A total of 164 persons (122 males, 42 females) were included in the study. Nineteen (11.60%) were identified as alcohol abusers alone and 128 (78.1%) as alcohol dependents (DSM-IV). Results suggest a statistically significant difference between men and women (P < 0.05) in performance of the screening tests RAPS4 (≥1) and CAGE (≥2) for detecting abuse. Also, in this population, we show an increase in TSs of RAPS4 (≥2) and CAGE (≥3) for detecting dependence compared with those typically accepted in non-intoxicated individuals. The AUDIT test demonstrates good performance for detecting alcohol abuse and/or alcohol-dependent patients (≥7 for women and ≥12 for men) and for distinguishing alcohol dependence (≥11 for women and ≥14 for men) from other conditions. Conclusion: Our study underscores for the first time the need to adapt, taking into account gender, the thresholds of tests typically used for detection of abuse and dependence in this population. PMID:22414922
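
    A minimal sketch of choosing an optimal threshold score with Youden's index from a ROC curve, one of the techniques named above; the AUDIT-like scores and case labels are simulated, and scikit-learn is assumed.

      import numpy as np
      from sklearn.metrics import roc_curve

      rng = np.random.default_rng(8)
      # Hypothetical AUDIT scores: dependent patients tend to score higher.
      dependent = rng.binomial(1, 0.6, size=400)
      audit = np.where(dependent == 1,
                       rng.normal(16, 4, size=400),
                       rng.normal(9, 4, size=400))

      fpr, tpr, thresholds = roc_curve(dependent, audit)
      youden = tpr - fpr                     # Youden's J at each candidate cut-off
      best = np.argmax(youden)
      print(f"optimal threshold ~ {thresholds[best]:.1f}, J = {youden[best]:.2f}")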

  7. Publication of statistically significant research findings in prosthodontics & implant dentistry in the context of other dental specialties.

    PubMed

    Papageorgiou, Spyridon N; Kloukos, Dimitrios; Petridis, Haralampos; Pandis, Nikolaos

    2015-10-01

    To assess the hypothesis that there is excessive reporting of statistically significant studies published in prosthodontic and implantology journals, which could indicate selective publication. The last 30 issues of 9 journals in prosthodontics and implant dentistry were hand-searched for articles with statistical analyses. The percentages of significant and non-significant results were tabulated by parameter of interest. Univariable/multivariable logistic regression analyses were applied to identify possible predictors of reporting statistically significant findings. The results of this study were compared with similar studies in dentistry with random-effects meta-analyses. Of the 2323 included studies, 71% reported statistically significant results, with the significant results ranging from 47% to 86%. Multivariable modeling identified that geographical area and involvement of a statistician were predictors of statistically significant results. Compared to interventional studies, the odds that in vitro and observational studies would report statistically significant results were increased by 1.20 times (OR: 2.20, 95% CI: 1.66-2.92) and 0.35 times (OR: 1.35, 95% CI: 1.05-1.73), respectively. The probability of statistically significant results from randomized controlled trials was significantly lower compared to various study designs (difference: 30%, 95% CI: 11-49%). Likewise, the probability of statistically significant results in prosthodontics and implant dentistry was lower compared to other dental specialties, but this result did not reach statistical significance (P>0.05). The majority of studies identified in the fields of prosthodontics and implant dentistry presented statistically significant results. The same trend existed in publications of other specialties in dentistry. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Risk assessment of student performance in the International Foundations of Medicine Clinical Science Examination by the use of statistical modeling.

    PubMed

    David, Michael C; Eley, Diann S; Schafer, Jennifer; Davies, Leo

    2016-01-01

    The primary aim of this study was to assess the predictive validity of cumulative grade point average (GPA) for performance in the International Foundations of Medicine (IFOM) Clinical Science Examination (CSE). A secondary aim was to develop a strategy for identifying students at risk of performing poorly in the IFOM CSE as determined by the National Board of Medical Examiners' International Standard of Competence. Final year medical students from an Australian university medical school took the IFOM CSE as a formative assessment. Measures included overall IFOM CSE score as the dependent variable, cumulative GPA as the predictor, and the factors age, gender, year of enrollment, international or domestic status of student, and language spoken at home as covariates. Multivariable linear regression was used to measure predictor and covariate effects. Optimal thresholds of risk assessment were based on receiver-operating characteristic (ROC) curves. Cumulative GPA (nonstandardized regression coefficient [B]: 81.83; 95% confidence interval [CI]: 68.13 to 95.53) and international status (B: -37.40; 95% CI: -57.85 to -16.96) from 427 students were found to be statistically associated with increased IFOM CSE performance. Cumulative GPAs of 5.30 (area under ROC [AROC]: 0.77; 95% CI: 0.72 to 0.82) and 4.90 (AROC: 0.72; 95% CI: 0.66 to 0.78) were identified as being thresholds of significant risk for domestic and international students, respectively. Using cumulative GPA as a predictor of IFOM CSE performance and accommodating for differences in international status, it is possible to identify students who are at risk of failing to satisfy the National Board of Medical Examiners' International Standard of Competence.

  9. Detecting disease-predisposing variants: the haplotype method.

    PubMed Central

    Valdes, A M; Thomson, G

    1997-01-01

    For many HLA-associated diseases, multiple alleles-- and, in some cases, multiple loci--have been suggested as the causative agents. The haplotype method for identifying disease-predisposing amino acids in a genetic region is a stratification analysis. We show that, for each haplotype combination containing all the amino acid sites involved in the disease process, the relative frequencies of amino acid variants at sites not involved in disease but in linkage disequilibrium with the disease-predisposing sites are expected to be the same in patients and controls. The haplotype method is robust to mode of inheritance and penetrance of the disease and can be used to determine unequivocally whether all amino acid sites involved in the disease have not been identified. Using a resampling technique, we developed a statistical test that takes account of the nonindependence of the sites sampled. Further, when multiple sites in the genetic region are involved in disease, the test statistic gives a closer fit to the null expectation when some--compared with none--of the true predisposing factors are included in the haplotype analysis. Although the haplotype method cannot distinguish between very highly correlated sites in one population, ethnic comparisons may help identify the true predisposing factors. The haplotype method was applied to insulin-dependent diabetes mellitus (IDDM) HLA class II DQA1-DQB1 data from Caucasian, African, and Japanese populations. Our results indicate that the combination DQA1#52 (Arg predisposing) DQB1#57 (Asp protective), which has been proposed as an important IDDM agent, does not include all the predisposing elements. With rheumatoid arthritis HLA class II DRB1 data, the results were consistent with the shared-epitope hypothesis. PMID:9042931

  10. Breast cancer screening in women at increased risk according to different family histories: an update of the Modena Study Group experience

    PubMed Central

    Cortesi, Laura; Turchetti, Daniela; Marchi, Isabella; Fracca, Antonella; Canossi, Barbara; Rachele, Battista; Silvia, Ruscelli; Rita, Pecchi Anna; Pietro, Torricelli; Massimo, Federico

    2006-01-01

    Background Breast cancer (BC) detection in women with a genetic susceptibility or strong family history is considered mandatory compared with BC screening in the general population. However, screening modalities depend on the level of risk. Here we present an update of our screening programs based on risk classification. Methods We defined different risk categories and surveillance strategies to identify early BC in 1325 healthy women recruited by the Modena Study Group for familial breast and ovarian cancer. Four BC risk categories included BRCA1/2 carriers, increased, intermediate, and slightly increased risk. Women who developed BC from January 1, 1994, through December 31, 2005 (N = 44) were compared with the number of expected cases matched for age and period. BRCA1/2 carriers were identified by mutational analysis. Other risk groups were defined by different levels of family history for breast or ovarian cancer (OC). The standardized incidence ratio (SIR) was used to evaluate the observed and expected ratio among groups. All statistical tests were two-sided. Results After a median follow-up of 55 months, there was a statistically significant difference between observed and expected incidence [SIR = 4.9; 95% confidence interval (CI) = 1.6 to 7.6; p < 0.001]. The incidence observed among BRCA carriers (SIR = 20.3; 95% CI = 3.1 to 83.9; P < 0.001), women at increased (SIR = 4.5; 95% CI = 1.5 to 8.3; P < 0.001) or intermediate risk (SIR = 7.0, 95% CI = 2.0 to 17.1; P = 0.0018) was higher than expected, while the difference between observed and expected among women at slightly increased risk was not statistically significant (SIR = 2.4, 95% CI = 0.9 to 8.3; P = .74). Conclusion The rate of cancers detected in women at high risk according to BRCA status or strong family history, as defined according to our operational criteria, was significantly higher than expected in an age-matched general population. However, we failed to identify a greater incidence of BC in the slightly increased risk group. These results support the effectiveness of the proposed program to identify and monitor individuals at high risk, whereas prospective trials are needed for women belonging to families with sporadic BC or OC. PMID:16916448
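
    A minimal sketch of a standardized incidence ratio with exact Poisson confidence limits; the observed count is taken from the abstract, while the expected count is a hypothetical stand-in chosen only to be consistent with the reported overall SIR of 4.9.

      from scipy.stats import chi2

      observed = 44        # cancers detected during follow-up (from the abstract)
      expected = 9.0       # hypothetical age-matched expected count

      sir = observed / expected
      # Exact Poisson confidence limits for the observed count, scaled by the expectation.
      lower = chi2.ppf(0.025, 2 * observed) / (2 * expected)
      upper = chi2.ppf(0.975, 2 * (observed + 1)) / (2 * expected)
      print(f"SIR = {sir:.1f} (95% CI {lower:.1f} to {upper:.1f})")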

  11. The Dependence of Strength in Plastics upon Polymer Chain Length and Chain Orientation: An Experiment Emphasizing the Statistical Handling and Evaluation of Data.

    ERIC Educational Resources Information Center

    Spencer, R. Donald

    1984-01-01

    Describes an experiment (using plastic bags) designed to give students a practical understanding of using statistics to evaluate data and of how statistical treatment of experimental results can enhance their value in solving scientific problems. Students also gain insight into the orientation and structure of polymers by examining the plastic bags.…

  12. Sub-poissonian photon statistics in the coherent state Jaynes-Cummings model in non-resonance

    NASA Astrophysics Data System (ADS)

    Zhang, Jia-tai; Fan, An-fu

    1992-03-01

    We study a model with a two-level atom (TLA) interacting non-resonantly with a single-mode quantized cavity field (QCF). The photon number probability function, the mean photon number and Mandel's fluctuation parameter are calculated. Sub-Poissonian photon statistics are obtained in the non-resonant interaction. These statistical properties are strongly dependent on the detuning parameters.
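
    For reference, Mandel's fluctuation parameter mentioned above can be computed from any photon-number distribution as in this small sketch (negative Q indicates sub-Poissonian statistics); the example distribution is illustrative only, not the model's output.

      import numpy as np

      def mandel_q(p):
          """Mandel Q parameter of a photon-number distribution p(n), n = 0, 1, 2, ..."""
          n = np.arange(len(p))
          mean = np.sum(n * p)
          var = np.sum((n - mean) ** 2 * p)
          return (var - mean) / mean

      # Illustrative distribution concentrated near n = 2: Q < 0, i.e. sub-Poissonian.
      print(mandel_q(np.array([0.05, 0.2, 0.5, 0.2, 0.05])))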

  13. Modeling spiking behavior of neurons with time-dependent Poisson processes.

    PubMed

    Shinomoto, S; Tsubo, Y

    2001-10-01

    Three kinds of interval statistics, as represented by the coefficient of variation, the skewness coefficient, and the correlation coefficient of consecutive intervals, are evaluated for three kinds of time-dependent Poisson processes: pulse regulated, sinusoidally regulated, and doubly stochastic. Among these three processes, the sinusoidally regulated and doubly stochastic Poisson processes, in the case when the spike rate varies slowly compared with the mean interval between spikes, are found to be consistent with the three statistical coefficients exhibited by data recorded from neurons in the prefrontal cortex of monkeys.
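
    The three interval statistics discussed above can be estimated from simulated spike trains; this sketch generates a sinusoidally regulated Poisson process by thinning and computes the coefficient of variation, the skewness coefficient, and the lag-1 correlation of consecutive inter-spike intervals. The rate parameters are arbitrary choices for illustration, not those of the recorded neurons.

      import numpy as np
      from scipy.stats import skew, pearsonr

      rng = np.random.default_rng(0)

      def sinusoidal_poisson_spikes(r0=30.0, amp=0.5, freq=1.0, T=200.0):
          """Inhomogeneous Poisson spikes with rate r(t) = r0*(1 + amp*sin(2*pi*freq*t)), by thinning."""
          rmax = r0 * (1 + amp)
          t, spikes = 0.0, []
          while t < T:
              t += rng.exponential(1.0 / rmax)
              if rng.random() < r0 * (1 + amp * np.sin(2 * np.pi * freq * t)) / rmax:
                  spikes.append(t)
          return np.array(spikes)

      isi = np.diff(sinusoidal_poisson_spikes())
      cv = isi.std() / isi.mean()                  # coefficient of variation
      sk = skew(isi)                               # skewness coefficient
      rho = pearsonr(isi[:-1], isi[1:])[0]         # correlation of consecutive intervals
      print(cv, sk, rho)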

  14. Acceleration techniques for dependability simulation. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Barnette, James David

    1995-01-01

    As computer systems increase in complexity, the need to project system performance from the earliest design and development stages increases. We have to employ simulation for detailed dependability studies of large systems. However, as the complexity of the simulation model increases, the time required to obtain statistically significant results also increases. This paper discusses an approach that is application independent and can be readily applied to any process-based simulation model. Topics include background on classical discrete event simulation and techniques for random variate generation and statistics gathering to support simulation.

  15. Phase dependence of the unnormalized second-order photon correlation function

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ciornea, V.; Bardetski, P.; Macovei, M. A., E-mail: macovei@phys.asm.md

    2016-10-15

    We investigate the resonant quantum dynamics of a multi-qubit ensemble in a microcavity. Both the quantum-dot subsystem and the microcavity mode are pumped coherently. We find that the microcavity photon statistics depends on the phase difference of the driving lasers, which is not the case for the photon intensity at resonant driving. This way, one can manipulate the two-photon correlations. In particular, higher degrees of photon correlations and, eventually, stronger intensities are obtained. Furthermore, the microcavity photon statistics exhibits steady-state oscillatory behaviors as well as asymmetries.

  16. An information-theoretic approach to the modeling and analysis of whole-genome bisulfite sequencing data.

    PubMed

    Jenkinson, Garrett; Abante, Jordi; Feinberg, Andrew P; Goutsias, John

    2018-03-07

    DNA methylation is a stable form of epigenetic memory used by cells to control gene expression. Whole genome bisulfite sequencing (WGBS) has emerged as a gold-standard experimental technique for studying DNA methylation by producing high resolution genome-wide methylation profiles. Statistical modeling and analysis are employed to computationally extract and quantify information from these profiles in an effort to identify regions of the genome that demonstrate crucial or aberrant epigenetic behavior. However, the performance of most currently available methods for methylation analysis is hampered by their inability to directly account for statistical dependencies between neighboring methylation sites, thus ignoring significant information available in WGBS reads. We present a powerful information-theoretic approach for genome-wide modeling and analysis of WGBS data based on the 1D Ising model of statistical physics. This approach takes into account correlations in methylation by utilizing a joint probability model that encapsulates all information available in WGBS methylation reads and produces accurate results even when applied to single WGBS samples with low coverage. Using the Shannon entropy, our approach provides a rigorous quantification of methylation stochasticity in individual WGBS samples genome-wide. Furthermore, it utilizes the Jensen-Shannon distance to evaluate differences in methylation distributions between a test and a reference sample. Differential performance assessment using simulated and real human lung normal/cancer data demonstrates a clear superiority of our approach over DSS, a recently proposed method for WGBS data analysis. Critically, these results demonstrate that marginal methods become statistically invalid when correlations are present in the data. This contribution demonstrates clear benefits and the necessity of modeling joint probability distributions of methylation using the 1D Ising model of statistical physics and of quantifying methylation stochasticity using concepts from information theory. By employing this methodology, substantial improvement of DNA methylation analysis can be achieved by effectively taking into account the massive amount of statistical information available in WGBS data, which is largely ignored by existing methods.
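
    As a minimal illustration of the information-theoretic quantities named above (not the authors' full Ising-model pipeline), the sketch below computes the Shannon entropy of a methylation-level distribution and the Jensen-Shannon distance between a test and a reference distribution; the histograms are made-up examples.

      import numpy as np
      from scipy.stats import entropy
      from scipy.spatial.distance import jensenshannon

      # Hypothetical distributions of methylation levels within one genomic region
      # (probabilities over discretized methylation fractions 0, 0.25, 0.5, 0.75, 1).
      reference = np.array([0.70, 0.15, 0.05, 0.05, 0.05])
      test      = np.array([0.10, 0.10, 0.20, 0.25, 0.35])

      print("Shannon entropy (bits):", entropy(test, base=2))
      print("Jensen-Shannon distance:", jensenshannon(reference, test, base=2))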

  17. Prediction of rainfall anomalies during the dry to wet transition season over the Southern Amazonia using machine learning tools

    NASA Astrophysics Data System (ADS)

    Shan, X.; Zhang, K.; Zhuang, Y.; Fu, R.; Hong, Y.

    2017-12-01

    Seasonal prediction of rainfall during the dry-to-wet transition season in austral spring (September-November) over southern Amazonia is central to improving crop planting and fire mitigation in that region. Previous studies have identified the key large-scale atmospheric dynamic and thermodynamic pre-conditions during the dry season (June-August) that influence the rainfall anomalies during the dry to wet transition season over Southern Amazonia. Based on these key pre-conditions during the dry season, we have evaluated several statistical models and developed a Neural Network-based statistical prediction system to predict rainfall during the dry to wet transition for Southern Amazonia (5-15°S, 50-70°W). Multivariate Empirical Orthogonal Function (EOF) Analysis is applied to the following four fields during JJA from the ECMWF Reanalysis (ERA-Interim) spanning 1979 to 2015: geopotential height at 200 hPa, surface relative humidity, convective inhibition energy (CIN) index and convective available potential energy (CAPE), to filter out noise and highlight the most coherent spatial and temporal variations. The first 10 EOF modes are retained for inputs to the statistical models, accounting for at least 70% of the total variance in the predictor fields. We have tested several linear and non-linear statistical methods. While the regularized Ridge Regression and Lasso Regression can generally capture the spatial pattern and magnitude of rainfall anomalies, we found that the Neural Network performs best with an accuracy greater than 80%, as expected from the non-linear dependence of the rainfall on the large-scale atmospheric thermodynamic conditions and circulation. Further tests of various prediction skill metrics and hindcasts also suggest this Neural Network prediction approach can significantly improve seasonal prediction skill relative to dynamical predictions and regression-based statistical predictions. Thus, this statistical prediction system shows potential to improve real-time seasonal rainfall predictions in the future.
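
    A minimal sketch of the kind of pipeline described above: reduce the predictor fields to their leading EOF-like modes via PCA and fit a small neural network to predict a rainfall anomaly index. The arrays are random stand-ins for the ERA-Interim predictors and the observed rainfall, used only to make the example runnable, so the printed score is not meaningful.

      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.neural_network import MLPRegressor
      from sklearn.model_selection import train_test_split

      rng = np.random.default_rng(1)
      X = rng.normal(size=(37, 500))   # 37 JJA seasons x flattened predictor fields (stand-in data)
      y = rng.normal(size=37)          # SON rainfall anomaly index (stand-in data)

      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

      pca = PCA(n_components=10).fit(X_train)          # retain the first 10 EOF-like modes
      model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=5000, random_state=0)
      model.fit(pca.transform(X_train), y_train)
      print("R^2 on held-out seasons:", model.score(pca.transform(X_test), y_test))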

  18. Statistical assessment of the learning curves of health technologies.

    PubMed

    Ramsay, C R; Grant, A M; Wallace, S A; Garthwaite, P H; Monk, A F; Russell, I T

    2001-01-01

    (1) To describe systematically studies that directly assessed the learning curve effect of health technologies. (2) Systematically to identify 'novel' statistical techniques applied to learning curve data in other fields, such as psychology and manufacturing. (3) To test these statistical techniques in data sets from studies of varying designs to assess health technologies in which learning curve effects are known to exist. METHODS - STUDY SELECTION (HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW): For a study to be included, it had to include a formal analysis of the learning curve of a health technology using a graphical, tabular or statistical technique. METHODS - STUDY SELECTION (NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH): For a study to be included, it had to include a formal assessment of a learning curve using a statistical technique that had not been identified in the previous search. METHODS - DATA SOURCES: Six clinical and 16 non-clinical biomedical databases were searched. A limited amount of handsearching and scanning of reference lists was also undertaken. METHODS - DATA EXTRACTION (HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW): A number of study characteristics were abstracted from the papers such as study design, study size, number of operators and the statistical method used. METHODS - DATA EXTRACTION (NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH): The new statistical techniques identified were categorised into four subgroups of increasing complexity: exploratory data analysis; simple series data analysis; complex data structure analysis, generic techniques. METHODS - TESTING OF STATISTICAL METHODS: Some of the statistical methods identified in the systematic searches for single (simple) operator series data and for multiple (complex) operator series data were illustrated and explored using three data sets. The first was a case series of 190 consecutive laparoscopic fundoplication procedures performed by a single surgeon; the second was a case series of consecutive laparoscopic cholecystectomy procedures performed by ten surgeons; the third was randomised trial data derived from the laparoscopic procedure arm of a multicentre trial of groin hernia repair, supplemented by data from non-randomised operations performed during the trial. RESULTS - HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW: Of 4571 abstracts identified, 272 (6%) were later included in the study after review of the full paper. Some 51% of studies assessed a surgical minimal access technique and 95% were case series. The statistical method used most often (60%) was splitting the data into consecutive parts (such as halves or thirds), with only 14% attempting a more formal statistical analysis. The reporting of the studies was poor, with 31% giving no details of data collection methods. RESULTS - NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH: Of 9431 abstracts assessed, 115 (1%) were deemed appropriate for further investigation and, of these, 18 were included in the study. All of the methods for complex data sets were identified in the non-clinical literature. These were discriminant analysis, two-stage estimation of learning rates, generalised estimating equations, multilevel models, latent curve models, time series models and stochastic parameter models. In addition, eight new shapes of learning curves were identified. RESULTS - TESTING OF STATISTICAL METHODS: No one particular shape of learning curve performed significantly better than another. 
The performance of 'operation time' as a proxy for learning differed between the three procedures. Multilevel modelling using the laparoscopic cholecystectomy data demonstrated and measured surgeon-specific and confounding effects. The inclusion of non-randomised cases, despite the possible limitations of the method, enhanced the interpretation of learning effects. CONCLUSIONS - HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW: The statistical methods used for assessing learning effects in health technology assessment have been crude and the reporting of studies poor. CONCLUSIONS - NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH: A number of statistical methods for assessing learning effects were identified that had not hitherto been used in health technology assessment. There was a hierarchy of methods for the identification and measurement of learning, and the more sophisticated methods for both have had little if any use in health technology assessment. This demonstrated the value of considering fields outside clinical research when addressing methodological issues in health technology assessment. CONCLUSIONS - TESTING OF STATISTICAL METHODS: It has been demonstrated that the portfolio of techniques identified can enhance investigations of learning curve effects. (ABSTRACT TRUNCATED)
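
    One of the simpler single-operator analyses discussed above is fitting a parametric learning curve to consecutive operation times. The sketch below assumes an exponential-decay curve, which is only one of many possible shapes and not a specific model from the report, and fits it to simulated data.

      import numpy as np
      from scipy.optimize import curve_fit

      rng = np.random.default_rng(2)

      def learning_curve(case, t_plateau, t_gain, rate):
          """Operation time approaching a plateau as experience accumulates."""
          return t_plateau + t_gain * np.exp(-rate * case)

      cases = np.arange(1, 191)                                   # 190 consecutive procedures
      times = learning_curve(cases, 60, 45, 0.03) + rng.normal(0, 8, cases.size)

      params, _ = curve_fit(learning_curve, cases, times, p0=(50, 50, 0.05))
      print("plateau time, initial excess, learning rate:", params)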

  19. Spatio-Temporal Analysis of Smear-Positive Tuberculosis in the Sidama Zone, Southern Ethiopia

    PubMed Central

    Dangisso, Mesay Hailu; Datiko, Daniel Gemechu; Lindtjørn, Bernt

    2015-01-01

    Background Tuberculosis (TB) is a disease of public health concern, with a varying distribution across settings depending on socio-economic status, HIV burden, availability and performance of the health system. Ethiopia is a country with a high burden of TB, with regional variations in TB case notification rates (CNRs). However, TB program reports are often compiled and reported at higher administrative units that do not show the burden at lower units, so there is limited information about the spatial distribution of the disease. We therefore aim to assess the spatial distribution and presence of the spatio-temporal clustering of the disease in different geographic settings over 10 years in the Sidama Zone in southern Ethiopia. Methods Retrospective space-time and spatial analyses were carried out at the kebele level (the lowest administrative unit within a district) to identify spatial and space-time clusters of smear-positive pulmonary TB (PTB). Scan statistics, Global Moran's I, and Getis-Ord (Gi*) statistics were all used to help analyze the spatial distribution and clusters of the disease across settings. Results A total of 22,545 smear-positive PTB cases notified over 10 years were used for spatial analysis. In a purely spatial analysis, we identified the most likely cluster of smear-positive PTB in 192 kebeles in eight districts (RR = 2, P < 0.001), with 12,155 observed and 8,668 expected cases. The Gi* statistic also identified the clusters in the same areas, and the spatial clusters showed stability in most areas in each year during the study period. The space-time analysis also detected the most likely cluster in 193 kebeles in the same eight districts (RR = 1.92, P < 0.001), with 7,584 observed and 4,738 expected cases in 2003-2012. Conclusion The study found variations in CNRs and significant spatio-temporal clusters of smear-positive PTB in the Sidama Zone. The findings can be used to guide TB control programs to devise effective TB control strategies for the geographic areas characterized by the highest CNRs. Further studies are required to understand the factors associated with clustering based on individual level locations and investigation of cases. PMID:26030162
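
    For reference, global Moran's I, one of the cluster statistics used above, can be computed from area-level rates and a spatial weights matrix as in this short sketch; the toy values below are not the Sidama data.

      import numpy as np

      def morans_i(x, w):
          """Global Moran's I for values x and a spatial weights matrix w (zero diagonal)."""
          x = np.asarray(x, dtype=float)
          z = x - x.mean()
          num = np.sum(w * np.outer(z, z))
          return len(x) / w.sum() * num / np.sum(z ** 2)

      # Toy example: 4 areas on a line, rook-contiguity weights, higher rates clustered at one end.
      rates = [120, 110, 40, 30]
      w = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]])
      print(morans_i(rates, w))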

  20. Inferring general relations between network characteristics from specific network ensembles.

    PubMed

    Cardanobile, Stefano; Pernice, Volker; Deger, Moritz; Rotter, Stefan

    2012-01-01

    Different network models have been suggested for the topology underlying complex interactions in natural systems. These models are aimed at replicating specific statistical features encountered in real-world networks. However, it is rarely considered to which degree the results obtained for one particular network class can be extrapolated to real-world networks. We address this issue by comparing different classical and more recently developed network models with respect to their ability to generate networks with large structural variability. In particular, we consider the statistical constraints which the respective construction scheme imposes on the generated networks. After having identified the most variable networks, we address the issue of which constraints are common to all network classes and are thus suitable candidates for being generic statistical laws of complex networks. In fact, we find that generic, not model-related dependencies between different network characteristics do exist. This makes it possible to infer global features from local ones using regression models trained on networks with high generalization power. Our results confirm and extend previous findings regarding the synchronization properties of neural networks. Our method seems especially relevant for large networks, which are difficult to map completely, like the neural networks in the brain. The structure of such large networks cannot be fully sampled with the present technology. Our approach provides a method to estimate global properties of under-sampled networks in good approximation. Finally, we demonstrate on three different data sets (C. elegans neuronal network, R. prowazekii metabolic network, and a network of synonyms extracted from Roget's Thesaurus) that real-world networks have statistical relations compatible with those obtained using regression models.

  1. Adjusted scaling of FDG positron emission tomography images for statistical evaluation in patients with suspected Alzheimer's disease.

    PubMed

    Buchert, Ralph; Wilke, Florian; Chakrabarti, Bhismadev; Martin, Brigitte; Brenner, Winfried; Mester, Janos; Clausen, Malte

    2005-10-01

    Statistical parametric mapping (SPM) has gained increasing acceptance for the voxel-based statistical evaluation of brain positron emission tomography (PET) with the glucose analog 2-[18F]-fluoro-2-deoxy-d-glucose (FDG) in patients with suspected Alzheimer's disease (AD). To increase the sensitivity for detection of local changes, individual differences of total brain FDG uptake are usually compensated for by proportional scaling. However, in cases of extensive hypometabolic areas, proportional scaling overestimates scaled uptake. This may cause significant underestimation of the extent of hypometabolic areas by the statistical test. To detect this problem, the authors tested for hypermetabolism. In patients with no visual evidence of true focal hypermetabolism, significant clusters of hypermetabolism in the presence of extended hypometabolism were interpreted as false-positive findings, indicating relevant overestimation of scaled uptake. In this case, scaled uptake was reduced step by step until there were no more significant clusters of hypermetabolism. In 22 consecutive patients with suspected AD, proportional scaling resulted in relevant overestimation of scaled uptake in 9 patients. Scaled uptake had to be reduced by 11.1% +/- 5.3% in these cases to eliminate the artifacts. Adjusted scaling resulted in extension of existing and appearance of new clusters of hypometabolism. Total volume of the additional voxels with significant hypometabolism depended linearly on the extent of the additional scaling and was 202 +/- 118 mL on average. Adjusted scaling helps to identify characteristic metabolic patterns in patients with suspected AD. It is expected to increase the specificity of FDG PET in this group of patients.

  2. Analysis of in vivo corrosion of 316L stainless steel posterior thoracolumbar plate systems: a retrieval study.

    PubMed

    Majid, Kamran; Crowder, Terence; Baker, Erin; Baker, Kevin; Koueiter, Denise; Shields, Edward; Herkowitz, Harry N

    2011-12-01

    One hundred eighteen retrieved 316L stainless steel thoracolumbar plates of 3 different designs, used for fusion in 60 patients, were examined for evidence of corrosion. A medical record review and statistical analysis were also carried out. This study aims to identify types of corrosion and examine preferential metal ion release and the possibility of statistical correlation to clinical effects. Earlier studies have found that stainless steel spine devices showed evidence of mild-to-severe corrosion; fretting and crevice corrosion were the most commonly reported types. Studies have also shown the toxicity of metal ions released from stainless steel corrosion and how the ions may adversely affect bone formation and/or induce granulomatous foreign body responses. The retrieved plates were visually inspected and graded based on the degree of corrosion. The plates were then analyzed with optical microscopy, scanning electron microscopy, and energy dispersive x-ray spectroscopy. A retrospective medical record review was performed and statistical analysis was carried out to determine any correlations between experimental findings and patient data. More than 70% of the plates exhibited some degree of corrosion. Both fretting and crevice corrosion mechanisms were observed, primarily at the screw-plate interface. Energy dispersive x-ray spectroscopy analysis indicated reductions in nickel content in corroded areas, suggestive of nickel ion release to the surrounding biological environment. The incidence and severity of corrosion were significantly correlated with the design of the implant. Stainless steel thoracolumbar plates show a high incidence of corrosion, with statistical dependence on device design.

  3. A family of nonlinear Schrödinger equations admitting q-plane wave solutions

    NASA Astrophysics Data System (ADS)

    Nobre, F. D.; Plastino, A. R.

    2017-08-01

    Nonlinear Schrödinger equations with power-law nonlinearities have attracted considerable attention recently. Two previous proposals for these types of equations, corresponding respectively to the Gross-Pitaievsky equation and to the one associated with nonextensive statistical mechanics, are here unified into a single, parameterized family of nonlinear Schrödinger equations. Power-law nonlinear terms characterized by exponents depending on a real index q, typical of nonextensive statistical mechanics, are considered in such a way that the Gross-Pitaievsky equation is recovered in the limit q → 1. A classical field theory shows that, due to these nonlinearities, an extra field Φ (x → , t) (besides the usual one Ψ (x → , t)) must be introduced for consistency. The new field can be identified with Ψ* (x → , t) only when q → 1. For q ≠ 1 one has a pair of coupled nonlinear wave equations governing the joint evolution of the complex valued fields Ψ (x → , t) and Φ (x → , t). These equations reduce to the usual pair of complex-conjugate ones only in the q → 1 limit. Interestingly, the nonlinear equations obeyed by Ψ (x → , t) and Φ (x → , t) exhibit a common, soliton-like, traveling solution, which is expressible in terms of the q-exponential function that naturally emerges within nonextensive statistical mechanics.

  4. A statistical model including age to predict passenger postures in the rear seats of automobiles.

    PubMed

    Park, Jangwoon; Ebert, Sheila M; Reed, Matthew P; Hallman, Jason J

    2016-06-01

    Few statistical models of rear seat passenger posture have been published, and none has taken into account the effects of occupant age. This study developed new statistical models for predicting passenger postures in the rear seats of automobiles. Postures of 89 adults with a wide range of age and body size were measured in a laboratory mock-up in seven seat configurations. Posture-prediction models for female and male passengers were separately developed by stepwise regression using age, body dimensions, seat configurations and two-way interactions as potential predictors. Passenger posture was significantly associated with age and the effects of other two-way interaction variables depended on age. A set of posture-prediction models are presented for women and men, and the prediction results are compared with previously published models. This study is the first study of passenger posture to include a large cohort of older passengers and the first to report a significant effect of age for adults. The presented models can be used to position computational and physical human models for vehicle design and assessment. Practitioner Summary: The significant effects of age, body dimensions and seat configuration on rear seat passenger posture were identified. The models can be used to accurately position computational human models or crash test dummies for older passengers in known rear seat configurations.

  5. Subjective evaluation of compressed image quality

    NASA Astrophysics Data System (ADS)

    Lee, Heesub; Rowberg, Alan H.; Frank, Mark S.; Choi, Hyung-Sik; Kim, Yongmin

    1992-05-01

    Lossy data compression generates distortion or error in the reconstructed image, and the distortion becomes visible as the compression ratio increases. Even at the same compression ratio, the distortion appears differently depending on the compression method used. Because of the nonlinearity of the human visual system and of lossy data compression methods, we subjectively evaluated the quality of medical images compressed with two different methods: an intraframe and an interframe coding algorithm. The evaluated raw data were analyzed statistically to measure interrater reliability and reliability of an individual reader. Also, the analysis of variance was used to identify which compression method is better statistically, and from what compression ratio the quality of a compressed image is evaluated as poorer than that of the original. Nine x-ray CT head images from three patients were used as test cases. Six radiologists participated in reading the 99 images (some were duplicates) compressed at four different levels: original, 5:1, 10:1, and 15:1. The six readers agreed more than by chance alone and their agreement was statistically significant, but there were large variations among readers as well as within a reader. The displacement-estimated interframe coding algorithm produced significantly better quality than the 2-D block DCT at significance level 0.05. Also, 10:1 compressed images with the interframe coding algorithm do not show any significant differences from the original at level 0.05.

  6. Quantitative approaches in climate change ecology

    PubMed Central

    Brown, Christopher J; Schoeman, David S; Sydeman, William J; Brander, Keith; Buckley, Lauren B; Burrows, Michael; Duarte, Carlos M; Moore, Pippa J; Pandolfi, John M; Poloczanska, Elvira; Venables, William; Richardson, Anthony J

    2011-01-01

    Contemporary impacts of anthropogenic climate change on ecosystems are increasingly being recognized. Documenting the extent of these impacts requires quantitative tools for analyses of ecological observations to distinguish climate impacts in noisy data and to understand interactions between climate variability and other drivers of change. To assist the development of reliable statistical approaches, we review the marine climate change literature and provide suggestions for quantitative approaches in climate change ecology. We compiled 267 peer-reviewed articles that examined relationships between climate change and marine ecological variables. Of the articles with time series data (n = 186), 75% used statistics to test for a dependency of ecological variables on climate variables. We identified several common weaknesses in statistical approaches, including marginalizing other important non-climate drivers of change, ignoring temporal and spatial autocorrelation, averaging across spatial patterns and not reporting key metrics. We provide a list of issues that need to be addressed to make inferences more defensible, including the consideration of (i) data limitations and the comparability of data sets; (ii) alternative mechanisms for change; (iii) appropriate response variables; (iv) a suitable model for the process under study; (v) temporal autocorrelation; (vi) spatial autocorrelation and patterns; and (vii) the reporting of rates of change. While the focus of our review was marine studies, these suggestions are equally applicable to terrestrial studies. Consideration of these suggestions will help advance global knowledge of climate impacts and understanding of the processes driving ecological change.

  7. On the insufficiency of arbitrarily precise covariance matrices: non-Gaussian weak-lensing likelihoods

    NASA Astrophysics Data System (ADS)

    Sellentin, Elena; Heavens, Alan F.

    2018-01-01

    We investigate whether a Gaussian likelihood, as routinely assumed in the analysis of cosmological data, is supported by simulated survey data. We define test statistics, based on a novel method that first destroys Gaussian correlations in a data set, and then measures the non-Gaussian correlations that remain. This procedure flags pairs of data points that depend on each other in a non-Gaussian fashion, and thereby identifies where the assumption of a Gaussian likelihood breaks down. Using this diagnosis, we find that non-Gaussian correlations in the CFHTLenS cosmic shear correlation functions are significant. With a simple exclusion of the most contaminated data points, the posterior for σ8 is shifted without broadening, but we find no significant reduction in the tension with σ8 derived from Planck cosmic microwave background data. However, we also show that the one-point distributions of the correlation statistics are noticeably skewed, such that sound weak-lensing data sets are intrinsically likely to lead to a systematically low lensing amplitude being inferred. The detected non-Gaussianities get larger with increasing angular scale such that for future wide-angle surveys such as Euclid or LSST, with their very small statistical errors, the large-scale modes are expected to be increasingly affected. The shifts in posteriors may then not be negligible and we recommend that these diagnostic tests be run as part of future analyses.

  8. Efficacy of a composite biological age score to predict ten-year survival among Kansas and Nebraska Mennonites.

    PubMed

    Uttley, M; Crawford, M H

    1994-02-01

    In 1980 and 1981 Mennonite descendants of a group of Russian immigrants participated in a multidisciplinary study of biological aging. The Mennonites live in Goessel, Kansas, and Henderson, Nebraska. In 1991 the survival status of the participants was documented by each church secretary. Data are available for 1009 individuals, 177 of whom are now deceased. They ranged from 20 to 95 years in age when the data were collected. Biological ages were computed using a stepwise multiple regression procedure based on 38 variables previously identified as being related to survival, with chronological age as the dependent variable. Standardized residuals place participants in either a predicted-younger or a predicted-older group. The independence of the variables biological age and survival status is tested with the chi-square statistic. The significance of biological age differences between surviving and deceased Mennonites is determined by t test values. The two statistics provide consistent results. Predicted age group classification and survival status are related. The group of deceased participants is generally predicted to be older than the group of surviving participants, although neither statistic is significant for all subgroups of Mennonites. In most cases, however, individuals in the predicted-older groups are at a relatively higher risk of dying compared with those in the predicted-younger groups, although the increased risk is not always significant.
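
    The chi-square test of independence between predicted age group and survival status referred to above can be run as in this brief sketch; the 2 x 2 counts are invented for illustration and are not the Mennonite data.

      from scipy.stats import chi2_contingency

      # Rows: predicted-younger, predicted-older; columns: surviving, deceased (hypothetical counts).
      table = [[450, 60],
               [382, 117]]
      chi2, p, dof, expected = chi2_contingency(table)
      print(f"chi-square = {chi2:.2f}, dof = {dof}, p = {p:.4f}")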

  9. Improving information retrieval in functional analysis.

    PubMed

    Rodriguez, Juan C; González, Germán A; Fresno, Cristóbal; Llera, Andrea S; Fernández, Elmer A

    2016-12-01

    Transcriptome analysis is essential to understand the mechanisms regulating key biological processes and functions. The first step usually consists of identifying candidate genes; to find out which pathways are affected by those genes, however, functional analysis (FA) is mandatory. The most frequently used strategies for this purpose are Gene Set and Singular Enrichment Analysis (GSEA and SEA) over Gene Ontology. Several statistical methods have been developed and compared in terms of computational efficiency and/or statistical appropriateness. However, whether their results are similar or complementary, the sensitivity to parameter settings, or possible bias in the analyzed terms has not been addressed so far. Here, two GSEA and four SEA methods and their parameter combinations were evaluated in six datasets by comparing two breast cancer subtypes with well-known differences in genetic background and patient outcomes. We show that GSEA and SEA lead to different results depending on the chosen statistic, model and/or parameters. Both approaches provide complementary results from a biological perspective. Hence, an Integrative Functional Analysis (IFA) tool is proposed to improve information retrieval in FA. It provides a common gene expression analytic framework that grants a comprehensive and coherent analysis. Only a minimal user parameter setting is required, since the best SEA/GSEA alternatives are integrated. IFA utility was demonstrated by evaluating four prostate cancer and the TCGA breast cancer microarray datasets, which showed its biological generalization capabilities. Copyright © 2016 Elsevier Ltd. All rights reserved.
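
    A minimal sketch of the SEA-style enrichment test underlying such tools: a one-sided Fisher's exact test on the overlap between a candidate gene list and a Gene Ontology term. The counts are made up for illustration and do not come from the datasets analyzed above.

      from scipy.stats import fisher_exact

      # Hypothetical counts: 40 of 300 candidate genes annotated to a GO term,
      # versus 400 of 19700 remaining background genes.
      table = [[40, 260],
               [400, 19300]]
      odds_ratio, p_value = fisher_exact(table, alternative="greater")
      print(f"odds ratio = {odds_ratio:.2f}, enrichment p = {p_value:.2e}")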

  10. Using Network Analysis to Characterize Biogeographic Data in a Community Archive

    NASA Astrophysics Data System (ADS)

    Wellman, T. P.; Bristol, S.

    2017-12-01

    Informative measures are needed to evaluate and compare data from multiple providers in a community-driven data archive. This study explores insights from network theory and other descriptive and inferential statistics to examine data content and application across an assemblage of publically available biogeographic data sets. The data are archived in ScienceBase, a collaborative catalog of scientific data supported by the U.S Geological Survey to enhance scientific inquiry and acuity. In gaining understanding through this investigation and other scientific venues our goal is to improve scientific insight and data use across a spectrum of scientific applications. Network analysis is a tool to reveal patterns of non-trivial topological features in the data that do not exhibit complete regularity or randomness. In this work, network analyses are used to explore shared events and dependencies between measures of data content and application derived from metadata and catalog information and measures relevant to biogeographic study. Descriptive statistical tools are used to explore relations between network analysis properties, while inferential statistics are used to evaluate the degree of confidence in these assessments. Network analyses have been used successfully in related fields to examine social awareness of scientific issues, taxonomic structures of biological organisms, and ecosystem resilience to environmental change. Use of network analysis also shows promising potential to identify relationships in biogeographic data that inform programmatic goals and scientific interests.

  11. Brain fingerprinting field studies comparing P300-MERMER and P300 brainwave responses in the detection of concealed information.

    PubMed

    Farwell, Lawrence A; Richardson, Drew C; Richardson, Graham M

    2013-08-01

    Brain fingerprinting detects concealed information stored in the brain by measuring brainwave responses. We compared P300 and P300-MERMER event-related brain potentials for error rate/accuracy and statistical confidence in four field/real-life studies. 76 tests detected presence or absence of information regarding (1) real-life events including felony crimes; (2) real crimes with substantial consequences (either a judicial outcome, i.e., evidence admitted in court, or a $100,000 reward for beating the test); (3) knowledge unique to FBI agents; and (4) knowledge unique to explosives (EOD/IED) experts. With both P300 and P300-MERMER, error rate was 0 %: determinations were 100 % accurate, no false negatives or false positives; also no indeterminates. Countermeasures had no effect. Median statistical confidence for determinations was 99.9 % with P300-MERMER and 99.6 % with P300. Brain fingerprinting methods and scientific standards for laboratory and field applications are discussed. Major differences in methods that produce different results are identified. Markedly different methods in other studies have produced over 10 times higher error rates and markedly lower statistical confidences than those of these, our previous studies, and independent replications. Data support the hypothesis that accuracy, reliability, and validity depend on following the brain fingerprinting scientific standards outlined herein.

  12. Incidence of iatrogenic opioid dependence or abuse in patients with pain who were exposed to opioid analgesic therapy: a systematic review and meta-analysis.

    PubMed

    Higgins, C; Smith, B H; Matthews, K

    2018-06-01

    The prevalence and incidence of chronic conditions, such as pain and opioid dependence, have implications for policy development, resource allocation, and healthcare delivery. The primary objective of the current review was to estimate the incidence of iatrogenic opioid dependence or abuse after treatment with opioid analgesics. Systematic electronic searches utilised six research databases (Embase, Medline, PubMed, Cinahl Plus, Web of Science, OpenGrey). A 'grey' literature search and a reference search of included articles were also undertaken. The PICOS framework was used to develop search strategies and the findings are reported in accordance with the PRISMA Statement. After eligibility reviews of 6164 articles, 12 studies (involving 310 408 participants) were retained for inclusion in the meta-analyses. A random effects model (DerSimonian-Laird method) generated a pooled incidence of opioid dependence or abuse of 4.7%. There was little within-study risk of bias and no significant publication bias; however, substantial heterogeneity was found among study effects (99.78%). Sensitivity analyses indicated that the diagnostic criteria selected for identifying opioid dependence or abuse (Diagnostic and Statistical Manual (DSM-IV) vs International Classification of Diseases (ICD-9)) accounted for 20% and duration of exposure to opioid analgesics accounted for 18% of variance in study effects. Longer-term opioid analgesic exposure, and prescription of strong rather than weak opioids, were associated with a significantly lower incidence of opioid dependence or abuse. The incidence of iatrogenic opioid dependence or abuse was 4.7% among those prescribed opioids for pain. Further research is required to confirm the potential for our findings to inform prevention of this serious adverse event. Copyright © 2018 British Journal of Anaesthesia. Published by Elsevier Ltd. All rights reserved.
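
    The random-effects pooling named above can be sketched as follows: study-level proportions are combined with DerSimonian-Laird weights, here using inverse-variance weighting on the raw proportions for simplicity (published analyses often work on a transformed scale). The study counts are fabricated placeholders, not the reviewed studies.

      import numpy as np

      def dersimonian_laird(events, totals):
          """Pooled proportion from a DerSimonian-Laird random-effects model."""
          p = events / totals
          v = p * (1 - p) / totals                 # within-study variance of each proportion
          w = 1 / v                                # fixed-effect (inverse-variance) weights
          p_fixed = np.sum(w * p) / np.sum(w)
          q = np.sum(w * (p - p_fixed) ** 2)       # Cochran's Q
          c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
          tau2 = max(0.0, (q - (len(p) - 1)) / c)  # between-study variance
          w_star = 1 / (v + tau2)                  # random-effects weights
          return np.sum(w_star * p) / np.sum(w_star)

      # Placeholder studies: (events of dependence/abuse, participants per study).
      events = np.array([12, 150, 40, 8])
      totals = np.array([300, 2500, 1200, 500])
      print(dersimonian_laird(events, totals))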

  13. The effects of intraspecific competition and stabilizing selection on a polygenic trait.

    PubMed Central

    Bürger, Reinhard; Gimelfarb, Alexander

    2004-01-01

    The equilibrium properties of an additive multilocus model of a quantitative trait under frequency- and density-dependent selection are investigated. Two opposing evolutionary forces are assumed to act: (i) stabilizing selection on the trait, which favors genotypes with an intermediate phenotype, and (ii) intraspecific competition mediated by that trait, which favors genotypes whose effect on the trait deviates most from that of the prevailing genotypes. Accordingly, fitnesses of genotypes have a frequency-independent component describing stabilizing selection and a frequency- and density-dependent component modeling competition. We study how the equilibrium structure, in particular, number, degree of polymorphism, and genetic variance of stable equilibria, is affected by the strength of frequency dependence, and what role the number of loci, the amount of recombination, and the demographic parameters play. To this end, we employ a statistical and numerical approach, complemented by analytical results, and explore how the equilibrium properties averaged over a large number of genetic systems with a given number of loci and average amount of recombination depend on the ecological and demographic parameters. We identify two parameter regions with a transitory region in between, in which the equilibrium properties of genetic systems are distinctively different. These regions depend on the strength of frequency dependence relative to pure stabilizing selection and on the demographic parameters, but not on the number of loci or the amount of recombination. We further study the shape of the fitness function observed at equilibrium and the extent to which the dynamics in this model are adaptive, and we present examples of equilibrium distributions of genotypic values under strong frequency dependence. Consequences for the maintenance of genetic variation, the detection of disruptive selection, and models of sympatric speciation are discussed. PMID:15280253

  14. Informatic support for processing the data regarding the environment factors possibly involved in the etiopathogenesis of insulin-dependent diabetes mellitus ETIODIAB.

    PubMed

    Alecu, S; Dadarlat, V; Stanciu, E; Ionescu-Tirgoviste, C; Konerth, A M

    1997-01-01

    Diabetes represents a heterogeneous group of disorders which can have different aetiologies but share disturbances of carbohydrate, lipid, and protein metabolism. Insulin-dependent diabetes appears in genetically susceptible persons as an autoimmune disease triggered by environmental factors. Epidemiological studies performed in different countries have noted an increase in diabetes cases over recent decades. The informatic system EtioDiab (from Etiopathological Diabetes) was therefore developed. Its purpose is to assist medical research into the environmental factors involved in the etiopathogenesis of insulin-dependent diabetes. The system supports the calculation of many statistical indicators, graphical representation of the recorded data, and testing of statistical hypotheses.

  15. Origin of the spike-timing-dependent plasticity rule

    NASA Astrophysics Data System (ADS)

    Cho, Myoung Won; Choi, M. Y.

    2016-08-01

    A biological synapse changes its efficacy depending on the difference between pre- and post-synaptic spike timings. Formulating spike-timing-dependent interactions in terms of the path integral, we establish a neural-network model, which makes it possible to predict relevant quantities rigorously by means of standard methods in statistical mechanics and field theory. In particular, the biological synaptic plasticity rule is shown to emerge as the optimal form for minimizing the free energy. It is further revealed that maximization of the entropy of neural activities gives rise to the competitive behavior of biological learning. This demonstrates that statistical mechanics helps to understand rigorously key characteristic behaviors of a neural network, thus providing the possibility of physics serving as a useful and relevant framework for probing life.
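
    The widely used double-exponential form of the spike-timing-dependent plasticity rule discussed above can be written down directly; the amplitudes and time constants in this sketch are typical textbook values, not the quantities derived in the paper.

      import numpy as np

      def stdp(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
          """Weight change for a spike-time difference dt = t_post - t_pre (ms).

          Potentiation when the presynaptic spike precedes the postsynaptic one (dt > 0),
          depression otherwise.
          """
          dt = np.asarray(dt, dtype=float)
          return np.where(dt > 0,
                          a_plus * np.exp(-dt / tau_plus),
                          -a_minus * np.exp(dt / tau_minus))

      print(stdp([-40, -10, 10, 40]))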

  16. Geometry of the q-exponential distribution with dependent competing risks and accelerated life testing

    NASA Astrophysics Data System (ADS)

    Zhang, Fode; Shi, Yimin; Wang, Ruibing

    2017-02-01

    In the information geometry suggested by Amari (1985) and Amari et al. (1987), a parametric statistical model can be regarded as a differentiable manifold with the parameter space as a coordinate system. Noting that the q-exponential distribution plays an important role in Tsallis statistics (see Tsallis, 2009), this paper investigates the geometry of the q-exponential distribution with dependent competing risks and accelerated life testing (ALT). A copula function based on the q-exponential function, which can be considered a generalized Gumbel copula, is discussed to describe the structure of the dependent random variables. Employing two iterative algorithms, simulation results are given to compare the performance of the estimates and the levels of association under different hybrid progressive censoring schemes (HPCSs).
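
    For orientation, the q-exponential function that generalizes the ordinary exponential in Tsallis statistics, and underlies the q-exponential distribution discussed above, can be evaluated as in this sketch; it shows only the function itself, not the authors' copula or censoring-scheme analysis.

      import numpy as np

      def exp_q(x, q):
          """Tsallis q-exponential: reduces to exp(x) as q -> 1; set to 0 where 1 + (1-q)x <= 0."""
          x = np.asarray(x, dtype=float)
          if np.isclose(q, 1.0):
              return np.exp(x)
          base = 1.0 + (1.0 - q) * x
          safe = np.where(base > 0, base, 1.0)          # avoid invalid powers outside the support
          return np.where(base > 0, safe ** (1.0 / (1.0 - q)), 0.0)

      x = np.linspace(0, 3, 4)
      print(exp_q(-x, 1.01))   # close to the ordinary exp(-x)
      print(exp_q(-x, 1.5))    # q > 1: heavier (power-law) tail than exp(-x)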

  17. DRAGO (KIAA0247), a new DNA damage-responsive, p53-inducible gene that cooperates with p53 as oncosuppressor. [Corrected].

    PubMed

    Polato, Federica; Rusconi, Paolo; Zangrossi, Stefano; Morelli, Federica; Boeri, Mattia; Musi, Alberto; Marchini, Sergio; Castiglioni, Vittoria; Scanziani, Eugenio; Torri, Valter; Broggini, Massimo

    2014-04-01

    p53 influences genomic stability, apoptosis, autophagy, response to stress, and DNA damage. New p53-target genes could elucidate mechanisms through which p53 controls cell integrity and response to damage. DRAGO (drug-activated gene overexpressed, KIAA0247) was characterized by bioinformatics methods as well as by real-time polymerase chain reaction, chromatin immunoprecipitation and luciferase assays, time-lapse microscopy, and cell viability assays. Transgenic mice (94 p53(-/-) and 107 p53(+/-) mice on a C57BL/6J background) were used to assess DRAGO activity in vivo. Survival analyses were performed using Kaplan-Meier curves and the Mantel-Haenszel test. All statistical tests were two-sided. We identified DRAGO as a new p53-responsive gene induced upon treatment with DNA-damaging agents. DRAGO is highly conserved, and its ectopic overexpression resulted in growth suppression and cell death. DRAGO(-/-) mice are viable without macroscopic alterations. However, in p53(-/-) or p53(+/-) mice, the deletion of both DRAGO alleles statistically significantly accelerated tumor development and shortened lifespan compared with p53(-/-) or p53(+/-) mice bearing wild-type DRAGO alleles (p53(-/-), DRAGO(-/-) mice: hazard ratio [HR] = 3.25, 95% confidence interval [CI] = 1.7 to 6.1, P < .001; p53(+/-), DRAGO(-/-) mice: HR = 2.35, 95% CI = 1.3 to 4.0, P < .001; both groups compared with DRAGO(+/+) counterparts). DRAGO mRNA levels were statistically significantly reduced in advanced-stage, compared with early-stage, ovarian tumors, but no mutations were found in several human tumors. We show that DRAGO expression is regulated both at transcriptional-through p53 (and p73) and methylation-dependent control-and post-transcriptional levels by miRNAs. DRAGO represents a new p53-dependent gene highly regulated in human cells and whose expression cooperates with p53 in tumor suppressor functions.

  18. DRAGO (KIAA0247), a New DNA Damage–Responsive, p53-Inducible Gene That Cooperates With p53 as Oncosuppressor

    PubMed Central

    Polato, Federica; Rusconi, Paolo

    2014-01-01

    Background p53 influences genomic stability, apoptosis, autophagy, response to stress, and DNA damage. New p53-target genes could elucidate mechanisms through which p53 controls cell integrity and response to damage. Methods DRAGO (drug-activated gene overexpressed, KIAA0247) was characterized by bioinformatics methods as well as by real-time polymerase chain reaction, chromatin immunoprecipitation and luciferase assays, time-lapse microscopy, and cell viability assays. Transgenic mice (94 p53−/− and 107 p53+/− mice on a C57BL/6J background) were used to assess DRAGO activity in vivo. Survival analyses were performed using Kaplan–Meier curves and the Mantel–Haenszel test. All statistical tests were two-sided. Results We identified DRAGO as a new p53-responsive gene induced upon treatment with DNA-damaging agents. DRAGO is highly conserved, and its ectopic overexpression resulted in growth suppression and cell death. DRAGO−/− mice are viable without macroscopic alterations. However, in p53−/− or p53+/− mice, the deletion of both DRAGO alleles statistically significantly accelerated tumor development and shortened lifespan compared with p53−/− or p53+/− mice bearing wild-type DRAGO alleles (p53−/−, DRAGO−/− mice: hazard ratio [HR] = 3.25, 95% confidence interval [CI] = 1.7 to 6.1, P < .001; p53+/−, DRAGO−/− mice: HR = 2.35, 95% CI = 1.3 to 4.0, P < .001; both groups compared with DRAGO+/+ counterparts). DRAGO mRNA levels were statistically significantly reduced in advanced-stage, compared with early-stage, ovarian tumors, but no mutations were found in several human tumors. We show that DRAGO expression is regulated both at transcriptional—through p53 (and p73) and methylation-dependent control—and post-transcriptional levels by miRNAs. Conclusions DRAGO represents a new p53-dependent gene highly regulated in human cells and whose expression cooperates with p53 in tumor suppressor functions. PMID:24652652

  19. Identifying Dynamic Functional Connectivity Changes in Dementia with Lewy Bodies Based on Product Hidden Markov Models.

    PubMed

    Sourty, Marion; Thoraval, Laurent; Roquet, Daniel; Armspach, Jean-Paul; Foucher, Jack; Blanc, Frédéric

    2016-01-01

    Exploring time-varying connectivity networks in neurodegenerative disorders is a recent field of research in functional MRI. Dementia with Lewy bodies (DLB) represents 20% of the neurodegenerative forms of dementia. Fluctuations of cognition and vigilance are the key symptoms of DLB. To date, no dynamic functional connectivity (DFC) investigations of this disorder have been performed. In this paper, we refer to the concept of connectivity state as a piecewise stationary configuration of functional connectivity between brain networks. From this concept, we propose a new method for group-level as well as for subject-level studies to compare and characterize connectivity state changes between a set of resting-state networks (RSNs). Dynamic Bayesian networks, statistical and graph theory-based models, enable one to learn dependencies between interacting state-based processes. Product hidden Markov models (PHMM), an instance of dynamic Bayesian networks, are introduced here to capture both statistical and temporal aspects of DFC of a set of RSNs. This analysis was based on sliding-window cross-correlations between seven RSNs extracted from a group independent component analysis performed on 20 healthy elderly subjects and 16 patients with DLB. Statistical models of DFC differed in patients compared to healthy subjects for the occipito-parieto-frontal network, the medial occipital network and the right fronto-parietal network. In addition, pairwise comparisons of DFC of RSNs revealed a decrease of dependency between these two visual networks (occipito-parieto-frontal and medial occipital networks) and the right fronto-parietal control network. The analysis of DFC state changes thus pointed out networks related to the cognitive functions that are known to be impaired in DLB: visual processing as well as attentional and executive functions. Besides this context, product HMM applied to RSNs cross-correlations offers a promising new approach to investigate structural and temporal aspects of brain DFC.
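
    The sliding-window cross-correlations that feed the product HMMs above can be computed as in this minimal sketch for two resting-state network time courses; the signals and the window length are arbitrary illustrations, not the study's fMRI data.

      import numpy as np

      def sliding_window_corr(x, y, window, step=1):
          """Pearson correlation between x and y in overlapping windows."""
          return np.array([np.corrcoef(x[s:s + window], y[s:s + window])[0, 1]
                           for s in range(0, len(x) - window + 1, step)])

      rng = np.random.default_rng(3)
      n = 300                                    # number of fMRI volumes (illustrative)
      shared = rng.normal(size=n)                # common fluctuation shared by both networks
      rsn_a = shared + rng.normal(scale=0.8, size=n)
      rsn_b = shared + rng.normal(scale=0.8, size=n)

      dfc = sliding_window_corr(rsn_a, rsn_b, window=30)
      print(dfc.mean(), dfc.min(), dfc.max())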

  20. A novel predictive pharmacokinetic/pharmacodynamic model of repolarization prolongation derived from the effects of terfenadine, cisapride and E-4031 in the conscious chronic av node--ablated, His bundle-paced dog.

    PubMed

    Nolan, Emily R; Feng, Meihua Rose; Koup, Jeffrey R; Liu, Jing; Turluck, Daniel; Zhang, Yiqun; Paulissen, Jerome B; Olivier, N Bari; Miller, Teresa; Bailie, Marc B

    2006-01-01

    Terfenadine, cisapride, and E-4031, three drugs that prolong ventricular repolarization, were selected to evaluate the sensitivity of the conscious chronic atrioventricular node--ablated, His bundle-paced Dog for defining drug induced cardiac repolarization prolongation. A novel predictive pharmacokinetic/pharmacodynamic model of repolarization prolongation was generated from these data. Three male beagle dogs underwent radiofrequency AV nodal ablation, and placement of a His bundle-pacing lead and programmable pacemaker under anesthesia. Each dog was restrained in a sling for a series of increasing dose infusions of each drug while maintained at a constant heart rate of 80 beats/min. RT interval, a surrogate for QT interval in His bundle-paced dogs, was recorded throughout the experiment. E-4031 induced a statistically significant RT prolongation at the highest three doses. Cisapride resulted in a dose-dependent increase in RT interval, which was statistically significant at the two highest doses. Terfenadine induced a dose-dependent RT interval prolongation with a statistically significant change occurring only at the highest dose. The relationship between drug concentration and RT interval change was described by a sigmoid E(max) model with an effect site. Maximum RT change (E(max)), free drug concentration at half of the maximum effect (EC(50)), and free drug concentration associated with a 10 ms RT prolongation (EC(10 ms)) were estimated. A linear correlation between EC(10 ms) and HERG IC(50) values was identified. The conscious dog with His bundle-pacing detects delayed cardiac repolarization related to I(Kr) inhibition, and detects repolarization change induced by drugs with activity at multiple ion channels. A clinically relevant sensitivity and a linear correlation with in vitro HERG data make the conscious His bundle-paced dog a valuable tool for detecting repolarization effect of new chemical entities.
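
    The concentration-effect relationship described above follows a sigmoid Emax form; a minimal sketch of fitting such a model to concentration and RT-change data is given below, with fabricated observations and without the effect-site compartment used in the full PK/PD model.

      import numpy as np
      from scipy.optimize import curve_fit

      def sigmoid_emax(c, emax, ec50, hill):
          """Effect (RT-interval change, ms) as a function of free drug concentration c."""
          return emax * c ** hill / (ec50 ** hill + c ** hill)

      # Fabricated concentration (ng/mL) and RT prolongation (ms) observations.
      conc = np.array([0.5, 1, 2, 5, 10, 20, 50, 100])
      rt_change = np.array([1.2, 2.5, 5.1, 11.0, 18.2, 26.0, 33.5, 36.8])

      (emax, ec50, hill), _ = curve_fit(sigmoid_emax, conc, rt_change, p0=(40, 10, 1))
      ec_10ms = ec50 * (10 / (emax - 10)) ** (1 / hill)   # concentration giving a 10 ms prolongation
      print(emax, ec50, hill, ec_10ms)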
