These are representative sample records from Science.gov related to your search topic.
For comprehensive and current results, perform a real-time search at Science.gov.
1

Jackknifing Disattenuated Correlations  

ERIC Educational Resources Information Center

The utility of the jackknife for constructing confidence intervals and testing hypotheses about the disattenuated correlation is evaluated for small samples. Results of computer simulations support the claim that the jackknife can be used to construct confidence intervals but has limited utility for testing hypotheses about the disattenuated…

Rogers, W. Todd

1976-01-01

2

ESTIMATING SPECIES RICHNESS USING THE JACKKNIFE PROCEDURE  

EPA Science Inventory

An exact expression is given for the jackknife estimate of the number of species in a community and its variance when one uses quadrat sampling procedures. The jackknife estimate is a function of the number of species that occur in one and only one quadrat. The variance of the nu...
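
The dependence on species seen in exactly one quadrat can be illustrated with the commonly cited first-order jackknife richness formula, S + k(n - 1)/n, where S is the observed species count, k the number of species found in only one quadrat, and n the number of quadrats. This is a minimal sketch, assuming that formula is the one the record refers to; the function name and the toy quadrat data are hypothetical.

```python
def jackknife_richness(quadrats):
    """First-order jackknife estimate of species richness.

    quadrats: list of sets, each holding the species seen in one quadrat.
    """
    n = len(quadrats)
    # Count in how many quadrats each species occurs.
    occurrences = {}
    for quadrat in quadrats:
        for species in set(quadrat):
            occurrences[species] = occurrences.get(species, 0) + 1
    observed = len(occurrences)
    # "Unique" species occur in one and only one quadrat.
    uniques = sum(1 for count in occurrences.values() if count == 1)
    return observed + uniques * (n - 1) / n


# Hypothetical toy data: 4 quadrats, 4 observed species, 2 of them unique.
toy_quadrats = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}, {"b"}]
estimate = jackknife_richness(toy_quadrats)  # 4 + 2 * 3 / 4 = 5.5
```

The estimate exceeds the observed count only when some species are unique to a single quadrat, which is the behaviour the abstract describes.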

3

BOOTSTRAP AND JACKKNIFE RESAMPLING ALGORITHMS FOR ESTIMATION OF REGRESSION PARAMETERS  

Microsoft Academic Search

In this paper, the hierarchical ways for building a regression model by using bootstrap and jackknife resampling methods were presented. Bootstrap approaches based on the observations and errors resampling, and jackknife approaches based on the delete-one and delete-d observations were considered. And also we consider estimating bootstrap and jackknife bias, standard errors and confidence intervals of the regression coefficients, and

Suat SAHINLER; Dervis TOPUZ
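
As an illustration of the delete-one approach mentioned in this record, the sketch below computes a jackknife standard error for the slope of a simple least squares line. It is not the authors' procedure; the function names are hypothetical and the standard unweighted delete-one formula is assumed.

```python
import math


def ols_slope(xs, ys):
    """Least squares slope for a simple linear regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def jackknife_slope_se(xs, ys):
    """Delete-one jackknife standard error of the slope (xs, ys are lists)."""
    n = len(xs)
    # Refit the line n times, leaving out one observation each time.
    leave_one_out = [
        ols_slope(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]) for i in range(n)
    ]
    mean_slope = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((b - mean_slope) ** 2 for b in leave_one_out))


# Hypothetical data: a noisy line with slope near 2.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]
slope_se = jackknife_slope_se(xs, ys)
```

A delete-d variant would refit on every subset with d observations removed and needs a different scaling factor, so it is left out of this sketch.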

4

Bootstrap Methods: Another Look at the Jackknife  

Microsoft Academic Search

We discuss the following problem: given a random sample $\mathbf{X} = (X_1, X_2, \cdots, X_n)$ from an unknown probability distribution $F$, estimate the sampling distribution of some prespecified random variable $R(\mathbf{X}, F)$, on the basis of the observed data $\mathbf{x}$. (Standard jackknife theory gives an approximate mean and variance in the case $R(\mathbf{X}, F) = \theta(\hat{F}) - \theta(F), \theta$ some

B. Efron

1979-01-01
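
The idea of estimating a sampling distribution from the observed data alone can be sketched with a nonparametric bootstrap: resample the data with replacement many times and look at the spread of the recomputed statistic. This is a minimal sketch under that assumption; the function name, replicate count, and seed are hypothetical choices, not taken from the paper.

```python
import random
import statistics


def bootstrap_se(data, statistic, n_boot=2000, seed=0):
    """Bootstrap standard error of `statistic` for the sample `data`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    replicates = []
    for _ in range(n_boot):
        # Draw n observations with replacement from the observed sample.
        resample = [rng.choice(data) for _ in range(n)]
        replicates.append(statistic(resample))
    return statistics.stdev(replicates)


# Hypothetical sample: the integers 1..10 as floats.
sample = [float(i) for i in range(1, 11)]
se_of_mean = bootstrap_se(sample, statistics.fmean)
```

For the sample mean the result should land close to the plug-in value sqrt(8.25 / 10), roughly 0.91, up to Monte Carlo noise.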

5

Jackknife, Bootstrap and Other Resampling Methods in Regression Analysis  

Microsoft Academic Search

Motivated by a representation for the least squares estimator, we propose a class of weighted jackknife variance estimators for the least squares estimator by deleting any fixed number of observations at a time. They are unbiased for homoscedastic errors and a special case, the delete-one jackknife, is almost unbiased for heteroscedastic errors. The method is extended to cover nonlinear parameters,

C. F. J. Wu

1986-01-01

6

Jackknife, Bootstrap and other Resampling Methods in Regression Analysis.  

National Technical Information Service (NTIS)

Motivated by a representation for the least squares estimator, we propose a class of weighted jackknife variance estimators for the least squares estimator by deleting any fixed number of observations at a time. They are unbiased for homoscedastic errors ...

C. F. Wu

1986-01-01

7

Jackknife free-response ROC methodology  

NASA Astrophysics Data System (ADS)

Although ROC analysis is the accepted methodology for evaluation of diagnostic imaging systems, it has some serious shortcomings. By contrast, FROC methodology allows the observer to report multiple abnormalities per case, and uses the location of reported abnormalities to improve the measurement. Because ROC methodology has no way to allow multiple responses or use the location information, its statistical power will suffer. The FROC method has not enjoyed widespread acceptance because of concern about whether responses made to the same case can be treated as independent. We propose a new jackknife FROC method (JAFROC) that does not make the independence assumption. The new method combines elements of FROC and the Dorfman-Berbaum-Metz (DBM) multi-reader ROC methods. To compare the JAFROC method to an earlier free-response method (alternative free-response or AFROC method), and to the DBM method, which uses conventional ROC scoring, we developed a model for generating simulated FROC detection and location data. The simulation model is quite general and can be used to evaluate any method for analysis of multiple-response detection-and-localization data. It allowed us to examine null hypothesis (NH) behavior and statistical power of analytic methods. We found that AFROC analysis did not pass the NH test, being unduly conservative. Both the JAFROC method and the DBM passed the NH test, but JAFROC had more statistical power than the DBM method. The results of this comparison suggest that future studies of diagnostic performance may enjoy improved statistical power or reduced sample size requirements through the use of the JAFROC method.

Chakraborty, Dev P.; Berbaum, Kevin S.

2004-05-01

8

Delta and Jackknife Estimators with Low Bias for Functions of Binomial and  

E-print Network

Probability and Statistics Group, School of Mathematics, The University of Manchester ... Delta and jackknife ... of Mathematics, University of Manchester, Manchester M13 9PL, UK ... Abstract: An estimator is said to be of order ... for the first division Spanish soccer league (Diaz-Emparanza and Nunez-Anton, 2010). The aim of this note

Sidorov, Nikita

9

Exhausted jackknife validation exemplified by prediction of temperature optimum in enzymatic reaction of cellulases.  

PubMed

This was the continuation of our previous study along the same line with more focus on technical details because the data are usually divided into two datasets, one for model development and the other for model validation during the development of predictive model. The widely used validation method is the delete-1 jackknife validation. However, no systematical studies were conducted to determine whether the jackknife validation with different deletions works better because the number of validations with different deletions increases in a factorial fashion. Therefore it is only small dataset that can be used for such an exhausted study. Cellulase is an enzyme playing an important role in modern industry, and many parameters related to cellulase in enzymatic reactions were poorly documented. With increased interests in cellulases in bio-fuel industry, the prediction of parameters in enzymatic reactions is listed on agenda. In this study, two aims were defined (a) which amino acid property works better to predict the temperature optimum and (b) with which deletion the jackknife validation works. The results showed that the amino acid distribution probability works better in predicting the optimum temperature of catalytic reaction by cellulase, and the delete-4, more precisely one-fifth deletion, jackknife validation works better. PMID:22207587

Yan, Shaomin; Wu, Guang

2012-02-01
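
The rapid growth in the number of validations with larger deletions can be made concrete by enumerating every delete-d split of a small dataset. This is a minimal sketch; the generator name is hypothetical and it only produces index splits, leaving the predictive model itself out.

```python
from itertools import combinations
from math import comb


def delete_d_splits(n, d):
    """Yield (kept, held_out) index tuples for every way of deleting d of n items."""
    all_indices = set(range(n))
    for held_out in combinations(range(n), d):
        kept = tuple(sorted(all_indices - set(held_out)))
        yield kept, held_out


# The number of splits is the binomial coefficient C(n, d).
splits = list(delete_d_splits(5, 2))
n_splits_small = len(splits)   # C(5, 2) = 10
n_splits_larger = comb(20, 4)  # 4845 splits for delete-4 on 20 items
```

Each split would be used by fitting the model on the kept indices and scoring it on the held-out ones, which is why an exhaustive study is only practical for small datasets.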

10

A Note on Allocating Items to Subtests in Multiple Matrix Sampling and Approximating Standard Errors of Estimate with the Jackknife.  

ERIC Educational Resources Information Center

Investigated empirically through post mortem item-examinee sampling were the relative merits of two alternative procedures for allocating items to subtests in multiple matrix sampling and the feasibility of using the jackknife in approximating standard errors of estimate. The results indicate clearly that a partially balanced incomplete block…

Shoemaker, David M.

11

Regional Seismic Tomography in Brazil and Uncertainty Evaluation Through Jackknife Re-Sampling Method  

NASA Astrophysics Data System (ADS)

We used the regional seismic tomography to study the upper mantle beneath SE and Central Brazil. This method is based on the inversion of P- and S-wave relative travel time residuals (VanDecar, 1991) obtained from more than 80 stations in an area of 20 x 20 degrees. The ~11000 P and PKP residuals and ~8000 S, ScS, SKS, and SKKS residuals have been obtained from waveform cross-correlations for up to 12 simultaneous stations. Our results show correlations of seismic anomalies with the main tectonic structures and reveal new anomalies not yet observed in previous works. High velocity anomalies in the western portion of the Sao Francisco Craton support the hypothesis that this craton was part of a major Neoproterozoic plate. Low velocity anomalies beneath the Tocantins Province (mainly fold belts between the Amazon and Sao Francisco cratons) are interpreted as due to lithospheric thinning. Assumpcao et al. (2004) showed a good correlation between intraplate seismicity and low velocity anomalies in this region. The slab of the Nazca Plate is observed as a high velocity anomaly beneath the Parana basin (at 700-1200 km depths). At these depths, large low velocity anomalies appear accompanying the slab. Synthetic tests show that these anomalies are artifacts of the inversion generated by the presence of the slab. We use the Jackknife re-sampling method to evaluate the robustness of the tomographic results with respect to the data. The main advantage is that it is not necessary to assume a particular error distribution, since the model variability is assessed directly from the data variability. The approach is based on a random removal of a small percentage of the data (1%) to generate various new subsets, which are inverted to evaluate the model variability. These local estimates include inherently the highly variable ray coverage and measurement errors and can provide confidence in the interpretation of anomalies. This measure should not be interpreted as the resolution. As expected, the Jackknife approach shows that the inversions are less robust at shallow depth and at the margins of the study volume. The model variability is also used as an additional criterion to determine the optimum number of iterations in the inversion.

Rocha, M. P.; Schimmel, M.; Assumpcao, M.

2007-05-01
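
The random-removal scheme described in this abstract can be sketched generically: drop a small fraction of the data at random, re-run the estimation on each subset, and measure how much the result varies. In this minimal sketch the `fit` argument stands in for the tomographic inversion, which is not reproduced here; all names, the subset count, and the seed are hypothetical.

```python
import random
import statistics


def subset_variability(data, fit, drop_fraction=0.01, n_subsets=100, seed=0):
    """Spread of `fit` results over subsets with a random fraction of data removed."""
    rng = random.Random(seed)
    n_drop = max(1, round(drop_fraction * len(data)))
    estimates = []
    for _ in range(n_subsets):
        # Pick the indices to remove for this subset.
        dropped = set(rng.sample(range(len(data)), n_drop))
        subset = [value for i, value in enumerate(data) if i not in dropped]
        estimates.append(fit(subset))
    return statistics.stdev(estimates)


# Hypothetical stand-in data and "inversion" (a plain average).
residuals = [float(i) for i in range(200)]
variability = subset_variability(residuals, statistics.fmean)
```

In a real application `fit` would return a full velocity model and the spread would be computed per model cell, giving a map of where the result is sensitive to the data.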

12

Electromyographic activity of the rectus abdominis during a traditional crunch and the basic jackknife exercise with the Ab Lounge™.  

PubMed

The use of nontraditional exercise devices such as the Ab Lounge™ has been promoted as being as effective as the traditional abdominal crunch in strengthening the abdominal musculature. Evidence for this is lacking, however. The purpose of this study was to compare the degree of activation of the upper and lower rectus abdominis using electromyography (EMG) during a traditional crunch with the basic jackknife using the Ab Lounge™. Twenty-two subjects (6 men and 16 women) were randomly selected from the student population at the University of the West Indies (Mona Campus). The mean age of the participants was 20.5 ± 1.5 years, height 166.4 ± 6.2 cm, weight 64 ± 10.3 kg, and waist-hip ratio 0.7 ± 0.1. Surface EMG was used to assess the muscle activity from the upper and lower rectus abdominis while each exercise was performed. The EMG data were full-wave rectified and normalized using a mathematical model that was set up in Microsoft Excel for Windows XP. Statistical analysis was performed on the data using a univariate analysis of variance with gender as a covariate. Significance was determined by p < 0.05. The mean EMG data recorded for the upper rectus abdominis was significantly higher with the traditional crunch when compared with the basic jackknife performed on the Ab Lounge™ (F = 4.39, p = 0.04). The traditional crunch produced a higher level of activity in the lower rectus abdominis when compared with the basic jackknife, but this was not statistically significant (F = 0.249, p = 0.62). There was no significant interaction between gender and the effect of the type of exercise on upper and lower rectus abdominis activation. These results suggest that the traditional abdominal crunch is more effective than the basic jackknife is in activating the rectus abdominis musculature. PMID:21912295

Nelson, Gail A; Bent-Forsythe, Denise A; Roopchand-Martin, Sharmella C

2012-06-01

13

Jackknife-corrected parametric bootstrap estimates of growth rates in bivalve mollusks using nearest living relatives.  

PubMed

Quantitative estimates of growth rates can augment ecological and paleontological applications of body-size data. However, in contrast to body-size estimates, assessing growth rates is often time-consuming, expensive, or unattainable. Here we use an indirect approach, a jackknife-corrected parametric bootstrap, for efficient approximation of growth rates using nearest living relatives with known age-size relationships. The estimate is developed by (1) collecting a sample of published growth rates of closely related species, (2) calculating the average growth curve using those published age-size relationships, (3) resampling iteratively these empirically known growth curves to estimate the standard errors and confidence bands around the average growth curve, and (4) applying the resulting estimate of uncertainty to bracket age-size relationships of the species of interest. This approach was applied to three monophyletic families (Donacidae, Mactridae, and Semelidae) of mollusk bivalves, a group characterized by indeterministic shell growth, but widely used in ecological, paleontological, and geochemical research. The resulting indirect estimates were tested against two previously published geochemical studies and, in both cases, yielded highly congruent age estimates. In addition, a case study in applied fisheries was used to illustrate the potential of the proposed approach for augmenting aquaculture management practices. The resulting estimates of growth rates place body size data in a constrained temporal context and confidence intervals associated with resampling estimates allow for assessing the statistical uncertainty around derived temporal ranges. The indirect approach should allow for improved evaluation of diverse research questions, from sustainability of industrial shellfish harvesting to climatic interpretations of stable isotope proxies extracted from fossil skeletons. PMID:24071629

Dexter, Troy A; Kowalewski, Michał

2013-12-01

14

Pre-collapse identification of sinkholes in unconsolidated media at Dead Sea area by 'nanoseismic monitoring' (graphical jackknife location of weak sources by few, low-SNR records)  

NASA Astrophysics Data System (ADS)

The sudden failure of near-surface cavities and the resulting sinkholes have constituted a recent hazard affecting the populations, lifelines and the economy of the Dead Sea region. This paper describes how seismic monitoring techniques could detect the extremely low-energy signals produced by cavitation in unconsolidated, layered media. Dozens of such events were recorded within a radius of 200 m during several night-time experiments carried out along the western Dead Sea shores. The absence of prior knowledge about cavitation-induced events in unconsolidated media required an initial signal characterization, for which a series of source processes were simulated in the field under controlled conditions. The waveform analysis by sonograms recognizes two main groups of seismic events: impacts on dry material and impacts in liquid. Our analysis demonstrates that the discrimination between both types of source functions is robust despite the extreme nature of the scatter media. In addition to their association with specific source processes, these events can be precisely located by a graphical, error-resistant jackknifing approach. Using an extended ML scale, their source energy can be quantified, and related to standard seismic activity. In summary, it is now possible to monitor subsurface material failures before sinkhole collapse since the discrimination of impact signals on the basis of their frequency content is indicative of the maturity of the cavitation process.

Wust-Bloch, Gilles Hillel; Joswig, Manfred

2006-12-01

15

The many structural faces of calmodulin: a multitasking molecular jackknife.  

PubMed

Calmodulin (CaM) is a highly conserved protein and a crucial calcium sensor in eukaryotes. CaM is a regulator of hundreds of diverse target proteins. A wealth of studies has been carried out on the structure of CaM, both in the unliganded form and in complexes with target proteins and peptides. The outcome of these studies points toward a high propensity to attain various conformational states, depending on the binding partner. The purpose of this review is to provide examples of different conformations of CaM trapped in the crystal state. In addition, comparisons are made to corresponding studies in solution. The different CaM conformations in crystal structures are also compared based on the positions of the metal ions bound to their EF hands, in terms of distances, angles, and pseudo-torsion angles. Possible caveats and artifacts in CaM crystal structures are discussed, as well as the possibilities of trapping biologically relevant CaM conformations in the crystal state. PMID:25005783

Kursula, Petri

2014-10-01

16

Electric motor based steering for jackknife avoidance in large trucks  

Microsoft Academic Search

Electric motor steering systems have recently been introduced for passenger vehicles to improve handling stability under adverse road conditions. A new advanced steering system is examined where the handwheel angle is augmented by an electronically controlled electric motor. This is referred to as active front steering (AFS). Given the increased emphasis on developing hybrid electric-diesel trucks with electric driven accessories,

Roy McCann; Anh Le

2005-01-01

17

ROCView: prototype software for data collection in jackknife alternative free-response receiver operating characteristic analysis  

PubMed Central

ROCView has been developed as an image display and response capture (IDRC) solution to image display and consistent recording of reader responses in relation to the free-response receiver operating characteristic paradigm. A web-based solution to IDRC for observer response studies allows observations to be completed from any location, assuming that display performance and viewing conditions are consistent with the study being completed. The simplistic functionality of the software allows observations to be completed without supervision. ROCView can display images from multiple modalities, in a randomised order if required. Following registration, observers are prompted to begin their image evaluation. All data are recorded via mouse clicks, one to localise (mark) and one to score confidence (rate) using either an ordinal or continuous rating scale. Up to nine “mark-rating” pairs can be made per image. Unmarked images are given a default score of zero. Upon completion of the study, both true-positive and false-positive reports can be downloaded and adapted for analysis. ROCView has the potential to be a useful tool in the assessment of modality performance difference for a range of imaging methods. PMID:22573294

Thompson, J; Hogg, P; Thompson, S; Manning, D; Szczepura, K

2012-01-01

18

The Examining of Generalization Quantitative Scientific Findings by Using the Jackknife Method: An Application  

ERIC Educational Resources Information Center

The outcomes which cannot be generalized are specific for a sample but are unable to be reflected to the rest of the population. The parameters that are reached at the end of the statistics that are scarce in sample arise doubts in the aspect of generalization. In these cases, parameter estimation may not be very stable and outlier values can…

Kyari, Murat; Buyukozturk, Sener

2009-01-01

19

A Leisurely Look at the Bootstrap, the Jackknife, and Cross-Validation  

Microsoft Academic Search

This is an invited expository article for The American Statistician. It reviews the nonparametric estimation of statistical error, mainly the bias and standard error of an estimator, or the error rate of a prediction rule. The presentation is written at a relaxed mathematical level, omitting most proofs, regularity conditions, and technical details.

Bradley Efron; Gail Gong

1983-01-01
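
The two quantities this record highlights, the bias and standard error of an estimator, have simple delete-one jackknife estimates. The sketch below uses the standard textbook formulas; the function name and the example data are hypothetical.

```python
import math
import statistics


def jackknife(data, statistic):
    """Return the delete-one jackknife (bias, standard error) of `statistic`."""
    n = len(data)
    theta_hat = statistic(data)
    # Recompute the statistic with each observation left out in turn.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    theta_dot = sum(leave_one_out) / n
    bias = (n - 1) * (theta_dot - theta_hat)
    se = math.sqrt((n - 1) / n * sum((t - theta_dot) ** 2 for t in leave_one_out))
    return bias, se


values = [1.0, 2.0, 3.0, 4.0, 5.0]
mean_bias, mean_se = jackknife(values, statistics.fmean)
var_bias, var_se = jackknife(values, statistics.pvariance)
# Subtracting the estimated bias from the plug-in variance recovers
# the usual unbiased sample variance.
corrected_variance = statistics.pvariance(values) - var_bias
```

For the sample mean the jackknife standard error coincides with the familiar s / sqrt(n), which makes it a convenient sanity check.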

20

Iterative multi-atlas based segmentation with multi-channel image registration and Jackknife Context Model  

Microsoft Academic Search

For medical image segmentation, multi-atlas based segmentation methods have attracted great attention recently. Within the multi-atlas segmentation framework, labels of all atlases are propagated to the target image by means of image registration and then fused to achieve segmentation of the target image. While most multi-atlas based segmentation methods focus on developing effective label fusion strategies, few of them make

Yongfu Hao; Tianzi Jiang; Yong Fan

2012-01-01

21

Cross-Validation, the Jackknife, and the Bootstrap: Excess Error Estimation in Forward Logistic Regression  

Microsoft Academic Search

Given a prediction rule based on a set of patients, what is the probability of incorrectly predicting the outcome of a new patient? Call this probability the true error. An optimistic estimate is the apparent error, or the proportion of incorrect predictions on the original set of patients, and it is the goal of this article to study estimates of

Gail Gong

1986-01-01
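
The gap between the apparent error and an honest estimate of the true error can be shown with leave-one-out cross-validation. This sketch swaps in a 1-nearest-neighbour rule on one feature as a stand-in for the forward logistic regression studied in the article; the function names and the toy patients are hypothetical.

```python
def nn_predict(train, x):
    """Predict the label of x from its nearest neighbour in train, a list of (feature, label)."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def apparent_error(data):
    """Error rate when the rule is evaluated on the data it was built from."""
    return sum(nn_predict(data, x) != y for x, y in data) / len(data)


def loo_cv_error(data):
    """Leave-one-out cross-validated error rate."""
    errors = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]  # rebuild the rule without case i
        errors += nn_predict(train, x) != y
    return errors / len(data)


# Hypothetical patients: (measurement, outcome).
patients = [(0.0, 0), (1.1, 0), (2.0, 1), (3.2, 0), (4.0, 1), (5.1, 1)]
optimistic = apparent_error(patients)
cross_validated = loo_cv_error(patients)
```

Here every case is its own nearest neighbour, so the apparent error is zero, while the cross-validated error is much larger; the difference is the kind of excess error the article sets out to estimate.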

22

The complete mitochondrial genome of the grand jackknife clam, Solen grandis (Bivalvia: Solenidae): a novel gene order and unusual non-coding region  

Microsoft Academic Search

Molluscs in general, and bivalves in particular, exhibit an extraordinary degree of mitochondrial gene order variation when compared with other metazoans. The complete mitochondrial genome of Solen grandis (Bivalvia: Solenidae) was determined using long-PCR and genome walking techniques. The entire mitochondrial genome sequence of S. grandis is 16,784 bp in length, and contains 36 genes including 12 protein-coding genes (atp8 is

Yang Yuan; Qi Li; Lingfeng Kong; Hong Yu

23

The complete mitochondrial genome of the grand jackknife clam, Solen grandis (Bivalvia: Solenidae): a novel gene order and unusual non-coding region.  

PubMed

Molluscs in general, and bivalves in particular, exhibit an extraordinary degree of mitochondrial gene order variation when compared with other metazoans. The complete mitochondrial genome of Solen grandis (Bivalvia: Solenidae) was determined using long-PCR and genome walking techniques. The entire mitochondrial genome sequence of S. grandis is 16,784 bp in length, and contains 36 genes including 12 protein-coding genes (atp8 is absent), 2 ribosomal RNAs, and 22 tRNAs. All genes are encoded on the same strand. Compared with other species, it bears a novel gene order. Besides these, we find a peculiar non-coding region of 435 bp with a microsatellite-like (TA)(12) element, poly-structures and many hairpin structures. In contrast to the available heterodont mitochondrial genomes from GenBank, the complete mtDNA of S. grandis has the shortest cox3 gene, and the longest atp6, nad4, nad5 genes. PMID:21598108

Yuan, Yang; Li, Qi; Kong, Lingfeng; Yu, Hong

2012-02-01

24

Pre-collapse identification of sinkholes in unconsolidated media at Dead Sea area by 'nanoseismic monitoring' (graphical jackknife location of weak sources by few, low-SNR records)  

Microsoft Academic Search

The sudden failure of near-surface cavities and the resulting sinkholes have constituted a recent hazard affecting the populations, lifelines and the economy of the Dead Sea region. This paper describes how seismic monitoring techniques could detect the extremely low-energy signals produced by cavitation in unconsolidated, layered media. Dozens of such events were recorded within a radius of 200 m during

Gilles Hillel Wust-Bloch; Manfred Joswig

2006-01-01

25

MOLECULAR PHYLOGENY OF THE LICHEN FAMILIES CLADONIACEAE, SPHAEROPHORACEAE, AND STEREOCAULACEAE (LECANORALES, ASCOMYCOTINA)  

Microsoft Academic Search

Abstract: Maximum parsimony analysis of nuclear SSU rDNA sequences was utilized to infer the phylogenetic relationships of representatives of the macrolichen families Cladoniaceae, Sphaerophoraceae, and Stereocaulaceae (Lecanorales subord. Cladoniineae, Ascomycotina). Farris' parsimony jackknifing, and a similar jackknife strategy with branch-swapping and multiple addition sequences in PAUP*, were performed to assess branch support. The results indicate that the Sphaerophoraceae should be emended

Mats Wedin; Heidi Döring; Stefan Ekman

2000-01-01

26

Comparison of several non-linear-regression methods for fitting the Michaelis-Menten equation.  

PubMed Central

The known jackknife methods (i.e. standard jackknife, weighted jackknife, linear jackknife and weighted linear jackknife) for the determination of the parameters (as well as of their confidence regions) were tested and compared with the simple Marquardt's technique (comprising the calculation of confidence intervals from the variance-co-variance matrix). The simulated data corresponding to the Michaelis-Menten equation with defined structure and magnitude of error of the dependent variable were used for fitting. There were no essential differences between the results of both point and interval parameter estimations by the tested methods. Marquardt's procedure yielded slightly better results than the jackknives for five scattered data points (the use of this method is advisable for routine analyses). The classical jackknife was slightly superior to the other methods for 20 data points (this method can be recommended for very precise calculations if great numbers of data are available). The weighting does not seem to be necessary in this type of equation because the parameter estimates obtained with all methods with the use of constant weights were comparable with those calculated with the weights corresponding exactly to the real error structure whereas the relative weighting led to rather worse results. PMID:4062884

Matyska, L; Kovar, J

1985-01-01

27

Statistical Inference for Regression Models with Covariate Measurement Error and Auxiliary Information  

PubMed Central

We consider statistical inference on a regression model in which some covariables are measured with errors together with an auxiliary variable. The proposed estimation for the regression coefficients is based on some estimating equations. This new method alleviates some drawbacks of previously proposed estimations. This includes the requirement of undersmoothing the regressor functions over the auxiliary variable, the restriction on other covariables which can be observed exactly, among others. The large sample properties of the proposed estimator are established. We further propose a jackknife estimation, which consists of deleting one estimating equation (instead of one observation) at a time. We show that the jackknife estimator of the regression coefficients and the estimating equations based estimator are asymptotically equivalent. Simulations show that the jackknife estimator has smaller biases when sample size is small or moderate. In addition, the jackknife estimation can also provide a consistent estimator of the asymptotic covariance matrix, which is robust to the heteroscedasticity. We illustrate these methods by applying them to a real data set from marketing science. PMID:22199460

You, Jinhong; Zhou, Haibo

2011-01-01

28

Amphibians in a human-dominated landscape: the community structure is related to habitat features and isolation  

Microsoft Academic Search

We studied amphibian populations in a human-dominated landscape, in Northern Italy, to evaluate the effects of patch quality and isolation on each species distribution and community structure. We used logistic and linear multiple regression to relate amphibian presence during the breeding season in 84 wetlands to wetland features and isolation. Jackknife procedure was used to evaluate predictive capability of

Gentile Francesco Ficetola; Fiorenza De Bernardi

29

A Sparse Kernel Density Estimation Algorithm using Forward Constrained Regression  

E-print Network

... estimators with comparable accuracy to that of the classical Parzen window estimate. Key words: cross validation, jackknife parameter estimator, Parzen window, probability density function, sparse modelling. 1 ... is to estimate the probability density function (pdf) from observed data samples [1-4]. A general and powerful

Chen, Sheng

30

Using Nonlinear Energy Operator Index As Pseudo Amino Acid Compositions for Predicting Protein Subcellular Location  

Microsoft Academic Search

The functions of proteins are closely correlated with their subcellular localizations. Based on the concept of pseudo amino acid composition, the nonlinear energy operator (NEO) approach is introduced to incorporate the sequence order effect on protein subcellular localization. Results obtained through self-consistency, jackknife and independent dataset tests indicate that the prediction accuracies by the current algorithm are significantly higher

Xiaoli Guo; Xiaoming Chen; Yihong Qiu; Zhende Huang; Yisheng Zhu

2009-01-01

31

Resampling Methods Revisited: Advancing the Understanding and Applications in Educational Research  

ERIC Educational Resources Information Center

Resampling methods including randomization test, cross-validation, the jackknife and the bootstrap are widely employed in the research areas of natural science, engineering and medicine, but they lack appreciation in educational research. The purpose of the present review is to revisit and highlight the key principles and developments of…

Bai, Haiyan; Pan, Wei

2008-01-01

32

A Demonstration of a Systematic Item-Reduction Approach Using Structural Equation Modeling  

ERIC Educational Resources Information Center

Establishing model parsimony is an important component of structural equation modeling (SEM). Unfortunately, little attention has been given to developing systematic procedures to accomplish this goal. To this end, the current study introduces an innovative application of the jackknife approach first presented in Rensvold and Cheung (1999). Unlike…

Larwin, Karen; Harvey, Milton

2012-01-01

33

Variance Estimation for NAEP Data Using a Resampling-Based Approach: An Application of Cognitive Diagnostic Models. Research Report. ETS RR-10-26  

ERIC Educational Resources Information Center

This paper presents an application of a jackknifing approach to variance estimation of ability inferences for groups of students, using a multidimensional discrete model for item response data. The data utilized to demonstrate the approach come from the National Assessment of Educational Progress (NAEP). In contrast to the operational approach…

Hsieh, Chueh-an; Xu, Xueli; von Davier, Matthias

2010-01-01

34

Estimated genotype error rates from bowhead whale microsatellite data  

Microsoft Academic Search

We calculate error rates using opportunistic replicate samples in the microsatellite data for bowhead whales. The estimated rate (1%/genotype) falls within normal ranges reviewed in this paper. The results of a jackknife analysis identified five individuals that were highly influential on estimates of Hardy-Weinberg equilibrium for four different markers. In each case, the influential individual was homozygous for a rare

Phillip A Morin; Richard G LeDuc; Eric Archer; Karen K Martien; Barbara L Taylor; Ryan Huebinger; John W. Bickham

35

The Beginner's Guide to the Bootstrap Method of Resampling.  

ERIC Educational Resources Information Center

The bootstrap method of resampling can be useful in estimating the replicability of study results. The bootstrap procedure creates a mock population from a given sample of data from which multiple samples are then drawn. The method extends the usefulness of the jackknife procedure as it allows for computation of a given statistic across a maximal…

Lane, Ginny G.

36

Bootstrapping a Regression Equation: Some Empirical Results  

Microsoft Academic Search

The bootstrap, like the jackknife, is a technique for estimating standard errors. The idea is to use Monte Carlo simulation based on a nonparametric estimate of the underlying error distribution. The main object of this article is to present the bootstrap in the context of an econometric equation describing the demand for energy by industry. As it turns out, the

David A. Freedman; Stephen C. Peters

1984-01-01
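
The idea of simulating from a nonparametric estimate of the error distribution can be sketched as a residual bootstrap for a simple straight-line fit. This is a minimal illustration under stated assumptions: it uses a one-predictor least-squares line rather than the econometric energy-demand equation of the article, and the helper names `fit_line` and `residual_bootstrap_slope_se` are hypothetical.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def residual_bootstrap_slope_se(x, y, n_boot=1000, seed=0):
    """Standard error of the slope from resampling the fitted residuals.

    The empirical distribution of the residuals stands in for the unknown
    error distribution; n_boot and seed are illustrative defaults.
    """
    rng = random.Random(seed)
    a, b = fit_line(x, y)
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    slopes = []
    for _ in range(n_boot):
        # Build a synthetic response: fitted value plus a resampled residual.
        y_star = [fi + rng.choice(resid) for fi in fitted]
        slopes.append(fit_line(x, y_star)[1])
    return statistics.stdev(slopes)
```

Resampling residuals keeps the design points fixed; resampling whole (x, y) pairs is the usual alternative when the predictors are themselves regarded as random.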

37

Censored Data and the Bootstrap  

Microsoft Academic Search

This article concerns setting standard errors and confidence intervals for the parameters of an unknown distribution when the data is subject to right censoring. The bootstrap, which is an elaboration of the jackknife, provides a general method for answering such questions. The validity of bootstrap methods is investigated using real data, computer simulations, and, in the final section, brief theoretical

Bradley Efron

1981-01-01

38

The Influence Curve and its Role in Robust Estimation  

Microsoft Academic Search

This paper treats essentially the first derivative of an estimator viewed as functional and the ways in which it can be used to study local robustness properties. A theory of robust estimation “near” strict parametric models is briefly sketched and applied to some classical situations. Relations between von Mises functionals, the jackknife and U-statistics are indicated. A number of classical

Frank R. Hampel

1974-01-01

39

Research Article Received 14 November 2008, Accepted 8 September 2010 Published online 6 November 2010 in Wiley Online Library  

E-print Network

in conditional logistic regression Jenny X. Sun, Samiran Sinha, Suojin Wang and Tapabrata Maiti We employ John Wiley & Sons, Ltd. Keywords: bias; conditional logistic regression; jackknife method; matched case of Cordeiro and McCullagh [4] for correcting bias in the conditional logistic regression setting. One main

Sinha, Samiran

40

ESTIMATION OF THE COANCESTRY COEFFICIENT: BASIS FOR A SHORT-TERM GENETIC DISTANCE  

Microsoft Academic Search

A distance measure for populations diverging by drift only is based on the coancestry coefficient $\theta$, and three estimators of the distance $\mathcal{D} = -\ln(1 - \theta)$ are constructed for multiallelic, multilocus data. Simulations of a monoecious population mating at random showed that a weighted ratio of single-locus estimators performed better than an unweighted average or a least squares estimator. Jackknifing

JOHN REYNOLDS; B. S. WEIR; C. CLARK COCKERHAM

1983-01-01

41

Wavelet Support Vector Machine and Particle Swarm Optimizer for Prediction of Protein Structural Class  

Microsoft Academic Search

Determination of protein structural class is a quite meaningful topic in protein science. In this paper a wavelet support vector machine (WSVM) coupled with particle swarm optimizer (PSO) is presented for prediction of protein structural class, which is featured by introducing wavelet as a kernel and using PSO to optimize kernel parameters. As a demonstration, the rigorous jackknife cross-validation test

Chao Chen; Xiao-yong Zou

2011-01-01

42

Geographic variation in the G matrices of wild populations of the barn swallow  

Microsoft Academic Search

In this paper, we present an analysis of genetic variation in three wild populations of the barn swallow, Hirundo rustica. We estimated the P, E, and G matrices for six linear morphological measurements and tested for variation among populations using the Flury hierarchical method and the jackknife followed by MANOVA method. Because of nonpositive-definite matrices, we had to employ ‘bending’

D A Roff; T Mousseau; A P Møller; F de Lope; N Saino

2004-01-01

43

A Review of Published Analyses of Case-Cohort Studies and Recommendations for Future Reporting  

E-print Network

are not valid in the weighted versions, and should be replaced by alternatives such as a robust jack-knife estimator [4]. Weighted Cox regression models can be fit using standard statistical software packages, including Stata [5] and R [6]. The STROBE...

Sharp, Stephen J.; Poulaliou, Manon; Thompson, Simon G.; White, Ian R.; Wood, Angela M.

2014-06-27

44

PROTEOMICS & BIOINFORMATICS  

E-print Network

, China; 2 Shanghai Institute of Applied Physics, Chinese Academy of Sciences, Shanghai 201800, China; 3 Applied Biosystems, Inc., Beijing 100027, China; Zuo et al. / Jackknife and Bootstrap Tests of CVTrees organism is represented by a composition vector made of K-peptide counts obtained from the organism

Hao, Bailin

45

A Note on Allocating Items to Subtests in Multiple Matrix Sampling.  

ERIC Educational Resources Information Center

Investigated empirically through post mortem item-examinee sampling were the relative merits of two alternative procedures for allocating items to subtests in multiple matrix sampling and the feasibility of using the jackknife in approximating standard errors of estimate. The results indicate clearly that a partially balanced incomplete block…

Shoemaker, David M.

46

Actual averting expenditure versus stated willingness to pay  

Microsoft Academic Search

The purpose of this study is to perform a complete comparison of actual averting expenditure and stated willingness to pay measures, and to determine if the averting expenditure is a lower bound of the willingness to pay measured from contingent valuation experiment as suggested by literature. In addition to the single value comparison, Bootstrap, Krinsky and Robb, Jackknife, and Cameron

Pei-Ing Wu; Chu-Li Huang

2001-01-01

47

Vehicle System Dynamics, 32 (1999), pp. 389-408 0042-3114/99/3204-389$15.00 Swets & Zeitlinger  

E-print Network

Vehicle System Dynamics, 32 (1999), pp. 389-408 0042-3114/99/3204-389$15.00 ©Swets & Zeitlinger Worst-Case Vehicle Evaluation Methodology-- Examples on Truck Rollover/Jackknifing and Active Yaw Control Systems WEN-HOU MA* AND HUEI PENG SUMMARY A worst-case vehicle evaluation methodology is presented

Peng, Huei

48

Variance Estimation Using Replication Methods in Structural Equation Modeling with Complex Sample Data  

ERIC Educational Resources Information Center

This article discusses replication sampling variance estimation techniques that are often applied in analyses using data from complex sampling designs: jackknife repeated replication, balanced repeated replication, and bootstrapping. These techniques are used with traditional analyses such as regression, but are currently not used with structural…

Stapleton, Laura M.

2008-01-01

49

Examining the relationships among academic self-concept, instrumental motivation, and TIMSS 2007 science scores: a cross-cultural comparison of five East Asian countries/regions and the United States  

Microsoft Academic Search

Many American authors expressed their concern that US competitiveness in science, technology, engineering, and mathematics (STEM) is losing ground. Using the Trends in International Mathematics and Science Study (TIMSS) 2007 data, this study investigated how academic self-concept and instrumental motivation influence science test performance among East Asian and American students. Jackknife regression modelling indicated that in East Asia science competency

Chong Ho Yu

2012-01-01

50

International Journal of Medical Microbiology 298 (2008) 245-252 Differentiation of fecal Escherichia coli from poultry and free-living birds  

E-print Network

analysis with the jack-knife algorithm of (GTG)5-PCR DNA fingerprints revealed that 95%, 94.1%, 93.2%, 84; Genotyping; (GTG)5-PCR; DNA fingerprinting Introduction Fecal pollution from non-point sources Escherichia coli from poultry and free-living birds by (GTG)5-PCR genomic fingerprinting Bidyut R. Mohapatraa

Mazumder, Asit

51

Estimation of the Cross-Product of Two Mean Vectors  

ERIC Educational Resources Information Center

The problem of estimating the cross-product of two mean vectors in three-dimensional Euclidian space is considered. Two "natural" estimators are developed, both of which turn out to be biased. A third, unbiased estimator, resulting from a jackknife procedure, is also investigated. It is shown that, under normality, the latter is best among all the…

Neudecker, Heinz; Zmyslony, Roman; Trenkler, Gotz

2003-01-01
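
The bias-removal step behind a jackknife-derived estimator can be sketched generically: the corrected estimate is n times the full-sample statistic minus (n - 1) times the average of the leave-one-out statistics, which cancels a bias term of order 1/n. The sketch below is an illustration under stated assumptions; it uses the plug-in (divide-by-n) variance as a stand-in statistic rather than the cross-product estimator studied in the article, and the function name is hypothetical.

```python
import statistics


def jackknife_bias_corrected(data, stat):
    """Jackknife bias-corrected estimate: n*theta_hat - (n-1)*mean(leave-one-out)."""
    n = len(data)
    full = stat(data)
    loo_mean = sum(stat(data[:i] + data[i + 1:]) for i in range(n)) / n
    return n * full - (n - 1) * loo_mean


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 7.0, 11.0]  # made-up data
    print("plug-in variance:        ", statistics.pvariance(sample))
    print("jackknife-corrected:     ", jackknife_bias_corrected(sample, statistics.pvariance))
    print("unbiased sample variance:", statistics.variance(sample))
```

For the plug-in variance the correction reproduces the usual unbiased (divide-by-n-1) sample variance, which makes it a convenient check on the formula.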

52

Possession, Transportation, and Use of Firearms by Older Youth in 4-H Shooting Sports Programs  

ERIC Educational Resources Information Center

Thirty years ago we would think nothing of driving to school with a jackknife in our pocket or rifle in the gun rack. Since then, the practices of possessing, transporting, and using firearms have been limited by laws, rules, and public perception. Despite restrictions on youth, the Youth Handgun Safety Act does afford 4-H shooting sports members…

White, David J.; Williver, S. Todd

2014-01-01

53

PER UNIT COSTS TO OWN AND OPERATE FARM MACHINERY  

Microsoft Academic Search

Entropy and jackknife estimation procedures were used to find that custom rates are 20.3% lower than the true cost to own and operate machinery for an average size Kansas farm. A method was then developed to estimate a farm's total machinery costs with which to benchmark machinery costs.

Aaron J. Beaton; Kevin C. Dhuyvetter; Terry L. Kastens

2003-01-01

54

Life Table and Consumption Capacity of Corn Earworm, Helicoverpa armigera, Fed Asparagus, Asparagus officinalis  

PubMed Central

The life table and consumption rate of Helicoverpa armigera (Hübner) (Lepidoptera: Noctuidae) reared on asparagus, Asparagus officinalis L. (Asparagales: Asparagaceae) were studied under laboratory conditions to assess their interaction. Development, survival, fecundity, and consumption data were analyzed by the age-stage, two-sex life table. This study indicated that asparagus is a natural host of H. armigera. However, the poor nutritional content in asparagus foliage and the poor fitness of H. armigera that fed on asparagus indicated that asparagus is a suboptimal host in comparison to hybrid sweet corn. The uncertainty associated with life table parameters was estimated by using jackknife and bootstrap techniques, and the results were compared for statistical inference. The intrinsic rate of increase (r), finite rate of increase (λ), net reproductive rate (R0), and mean generation time (T) were estimated by the jackknife technique to be 0.0780 day⁻¹, 1.0811 day⁻¹, 67.4 offspring, and 54.8 days, respectively, while those estimated by the bootstrap technique were 0.0752 day⁻¹, 1.0781 day⁻¹, 68.0 offspring, and 55.3 days, respectively. The net consumption rate of H. armigera, as estimated by the jackknife and bootstrap technique, was 1183.02 and 1132.9 mg per individual, respectively. The frequency distribution of sample means obtained by the jackknife technique failed the normality test, while the bootstrap results fit the normal distribution well. By contrast, the relationship between the mean fecundity and the net reproductive rate, as estimated by the bootstrap technique, was slightly inconsistent with the relationship found by mathematical proof. The application of the jackknife and bootstrap techniques in estimating population parameters requires further examination.

Jha, Ratna Kumar; Tuan, Shu-Jen; Chi, Hsin; Tang, Li-Cheng

2014-01-01

55

Growth estimation of mangrove cockle Anadara tuberculosa (Mollusca: Bivalvia): application and evaluation of length-based methods.  

PubMed

Growth is one of the key processes in the dynamics of exploited resources, since it provides part of the information required for structured population models. Growth of mangrove cockle, Anadara tuberculosa, was estimated through length-based methods (ELEFAN I and NSLCA) and using diverse shell length intervals (SLI). The variability of L(infinity), k and phi prime (phi') estimates and the effect of each sample were quantified by jackknife techniques. Results showed the same L(infinity) estimates from ELEFAN I and NSLCA across each SLI used, and all L(infinity) were within the expected range. On the contrary, k estimates differed between methods. Jackknife estimations uncovered the tendency of ELEFAN I to overestimate k with increases in SLI, and allowed the identification of differences in uncertainty (PE and CV) between both methods. The average values of phi' derived from NSLCA1.5 and length-age sources were similar and corresponded to ranges reported by other authors. Estimates of L(infinity), k and phi' from NSLCA1.5 were 85.97 mm, 0.124/year and 2.953 with jackknife, and 86.36 mm, 0.110/year and 2.914 without jackknife, respectively. Based on the observed evidence and according to the biology of the species, NSLCA is suggested to be used with jackknife and a SLI of 1.5 mm as an ad hoc approach to estimate the growth parameters of mangrove cockle. PMID:21513195

Flores, Luis A

2011-03-01

56

Life table and consumption capacity of corn earworm, Helicoverpa armigera, fed asparagus, Asparagus officinalis.  

PubMed

The life table and consumption rate of Helicoverpa armigera (Hübner) (Lepidoptera: Noctuidae) reared on asparagus, Asparagus officinalis L. (Asparagales: Asparagaceae) were studied under laboratory conditions to assess their interaction. Development, survival, fecundity, and consumption data were analyzed by the age-stage, two-sex life table. This study indicated that asparagus is a natural host of H. armigera. However, the poor nutritional content in asparagus foliage and the poor fitness of H. armigera that fed on asparagus indicated that asparagus is a suboptimal host in comparison to hybrid sweet corn. The uncertainty associated with life table parameters was estimated by using jackknife and bootstrap techniques, and the results were compared for statistical inference. The intrinsic rate of increase (r), finite rate of increase (λ), net reproductive rate (R0), and mean generation time (T) were estimated by the jackknife technique to be 0.0780 day(-1), 1.0811 day(-1), 67.4 offspring, and 54.8 days, respectively, while those estimated by the bootstrap technique were 0.0752 day(-1), 1.0781 day(-1), 68.0 offspring, and 55.3 days, respectively. The net consumption rate of H. armigera, as estimated by the jackknife and bootstrap technique, was 1183.02 and 1132.9 mg per individual, respectively. The frequency distribution of sample means obtained by the jackknife technique failed the normality test, while the bootstrap results fit the normal distribution well. By contrast, the relationship between the mean fecundity and the net reproductive rate, as estimated by the bootstrap technique, was slightly inconsistent with the relationship found by mathematical proof. The application of the jackknife and bootstrap techniques in estimating population parameters requires further examination. PMID:25373181

Jha, Ratna Kumar; Tuan, Shu-Jen; Chi, Hsin; Tang, Li-Cheng

2014-01-01

57

Inferring phylogenetic networks from gene order data.  

PubMed

Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary), sets of trees, and distance matrices, but there are no methods to build them using the gene order data as an input. Here we describe several methods to build split networks from the gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from genome structures under study and used as an input for network construction algorithms. Three intermediates are used: set of jackknife trees, distance matrix, and binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and distance matrix (when used with Neighbor-Net algorithm). Binary encoding can also be useful, but only when the methods mentioned above cannot be used. PMID:24069602

Morozov, Alexey Anatolievich; Galachyants, Yuri Pavlovich; Likhoshway, Yelena Valentinovna

2013-01-01

58

On Isoscalar magnetic moments of excited states  

Microsoft Academic Search

We first review the isoscalar magnetic moments of odd A mirror pairs of closed major shells plus or minus one nucleon. We note systematic deviations in experiment-Schmidt. For j=l+1/2 the deviation is positive (stretch) but for j=l-1/2 it is negative (jackknife). But the main emphasis is on 2+ states of even-even N=Z nuclei which have isospin T=0 and hence isoscalar

Yitzhak Sharon; Larry Zamick; Sean Yeager

2008-01-01

59

Optimal weights for local multi-atlas fusion using supervised learning and dynamic information (SuperDyn): Validation on hippocampus segmentation  

Microsoft Academic Search

We developed a novel method for spatially-local selection of atlas-weights in multi-atlas segmentation that combines supervised learning on a training set and dynamic information in the form of local registration accuracy estimates (SuperDyn). Supervised learning was applied using a jackknife learning approach and the methods were evaluated using leave-N-out cross-validation. We applied our segmentation method to hippocampal segmentation in 1.5T

Ali R. Khan; Nicolas Cherbuin; Wei Wen; Kaarin J. Anstey; Perminder Sachdev; Mirza Faisal Beg

2011-01-01

60

Resampling Methods: Concepts, Applications, and Justification  

NSDL National Science Digital Library

Created by Chong Ho Yu for Cisco Systems, this journal article is a summary of resampling methods such as the jackknife, bootstrap, and permutation tests. It summarizes the tests, describes various software to perform the tests, and has a list of references. The author provides an introduction, a description of resampling methods, software for resampling, the rationale supporting it, criticisms of resampling, a conclusion and references. This is an expansive resource which goes into considerable depth on the study of resampling methods.

Yu, Chong H.

2009-02-24

61

Calibration of Littoral Diatoms to Water Chemistry in Standing Fresh Waters (Flanders, Lower Belgium): Inference Models for Historical Sediment Assemblages  

Microsoft Academic Search

Relationships between littoral surface-sediment diatom assemblages and ambient limnological conditions were examined in 186 lentic fresh waters throughout lower Belgium (Flanders). Most of these waters were small, unstratified, alkaline and rich in nutrients. Using weighted-averaging techniques, robust and accurate transfer functions were developed for median pH-values ranging from 3.4 to 9.3 and dissolved inorganic carbon concentrations from ?1 (jackknifed r

Luc Denys

2006-01-01

62

Changes in aquatic plant communities on the island of Valaam due to invasion by the muskrat Ondatra zibethicus L. (Rodentia, Mammalia)  

Microsoft Academic Search

Muskrat invaded Valaam Island (Northern part of European Russia) in the 1970s. Aquatic plant communities of 1962 and 1993 were compared on the same plots. Quantitative changes were tested with the help of jack-knifing estimates of most known inventory (α-) diversity indicators. Qualitative transformations were assessed using β-diversity values. The results demonstrated substantially more discriminant ability of diversity measures than

Vladimir V. Smirnov; Kirill Tretyakov

1998-01-01

63

Sample size effects in multivariate fitting of correlated data  

E-print Network

A common problem in analysis of experiments or in lattice QCD simulations is fitting a parameterized model to the average over a number of samples of correlated data values. If the number of samples is not infinite, estimates of the variance of the parameters ("error bars") and of the goodness of fit are affected. We illustrate these problems with numerical simulations, and calculate approximate corrections to the variance of the parameters for estimates made in the standard way from derivatives of the parameters' probability distribution as well as from jackknife and bootstrap estimates.

D. Toussaint; W. Freeman

2008-08-15

64

The Purley train crash mechanism: injuries and prevention.  

PubMed Central

On the afternoon of Saturday 4th March 1989 two trains, both bound for London Victoria Station, collided. Part of the rear train rolled down a steep railway embankment and jack-knifed against a tree. The mechanism of the crash and the injuries sustained by the 55 victims who were seen in the A&E Department of the Mayday University Hospital are described. Improvements in signalling technology and design of rolling stock which may reduce both the risk of collision and severity of injury in future accidents are discussed. PMID:1388485

Fothergill, N J; Ebbs, S R; Reese, A; Partridge, R J; Mowbray, M; Southcott, R D; Hashemi, K

1992-01-01

65

Comparison of confidence intervals for adjusted attributable risk estimates under multinomial sampling.  

PubMed

The epidemiologic concept of the adjusted attributable risk is a useful approach to quantitatively describe the importance of risk factors on the population level. It measures the proportional reduction in disease probability when a risk factor is eliminated from the population, accounting for effects of confounding and effect-modification by nuisance variables. The computation of asymptotic variance estimates for estimates of the adjusted attributable risk is often done by applying the delta method. Investigations on the delta method have shown, however, that the delta method generally tends to underestimate the standard error, leading to biased confidence intervals. We compare confidence intervals for the adjusted attributable risk derived by applying computer intensive methods like the bootstrap or jackknife to confidence intervals based on asymptotic variance estimates using an extensive Monte Carlo simulation and within a real data example from a cohort study in cardiovascular disease epidemiology. Our results show that confidence intervals based on bootstrap and jackknife methods outperform intervals based on asymptotic theory. Best variants of computer intensive confidence intervals are indicated for different situations. PMID:17094345

Lehnert-Batar, Andrea; Pfahlberg, Annette; Gefeller, Olaf

2006-08-01

66

Phylogenetic relationships among New Caledonian Sapotaceae (Ericales): molecular evidence for generic polyphyly and repeated dispersal.  

PubMed

The phylogeny of a representative group of genera and species from the Sapotaceae tribe Chrysophylleae, mainly from Australia and New Caledonia, was studied by jackknife analyses of sequences of nuclear ribosomal DNA. The phylogeny conflicts with current opinions on generic delimitation in Sapotaceae. Pouteria and Niemeyera, as presently circumscribed, are both shown to be nonmonophyletic. In contrast, all species currently assigned to these and other segregate genera confined to Australia, New Caledonia, or neighboring islands, form a supported clade. Earlier classifications in which more genera are recognized may better reflect relationships among New Caledonian taxa. Hence, there is need for a revision of generic boundaries in Chrysophylleae, and particularly within the Pouteria complex, including Leptostylis, Niemeyera, Pichonia, Pouteria pro parte (the main part of section Oligotheca), and Pycnandra. Section Oligotheca has been recognized as the separate genus Planchonella, a monophyletic group that needs to be resurrected. Three clades with strong support in our jackknife analysis have one Australian species that is sister to a relatively large group of New Caledonian endemics, suggesting multiple dispersal events between this small and isolated tropical island and Australia. The phylogeny also suggests an interesting case of a relatively recent and rapid radiation of several lineages of Sapotaceae within New Caledonia. PMID:21652444

Bartish, Igor V; Swenson, Ulf; Munzinger, Jérôme; Anderberg, Arne A

2005-04-01

67

Variance estimation for clustered recurrent event data with a small number of clusters.  

PubMed

Often in biomedical studies, the event of interest is recurrent and within-subject events cannot usually be assumed independent. In semi-parametric estimation of the proportional rates model, a working independence assumption leads to an estimating equation for the regression parameter vector, with within-subject correlation accounted for through a robust (sandwich) variance estimator; these methods have been extended to the case of clustered subjects. We consider variance estimation in the setting where subjects are clustered and the study consists of a small number of moderate-to-large-sized clusters. We demonstrate through simulation that the robust estimator is quite inaccurate in this setting. We propose a corrected version of the robust variance estimator, as well as jackknife and bootstrap estimators. Simulation studies reveal that the corrected variance is considerably more accurate than the robust estimator, and slightly more accurate than the jackknife and bootstrap variance. The proposed methods are used to compare hospitalization rates between Canada and the U.S. in a multi-centre dialysis study. PMID:16149126

Schaubel, Douglas E

2005-10-15

68

Reweighting estimators for Cox regression with missing covariate data: analysis of insulin resistance and risk of stroke in the Northern Manhattan Study.  

PubMed

Incomplete covariates often obscure analysis results from a Cox regression. In an analysis of the Northern Manhattan Study (NOMAS) to determine the influence of insulin resistance on the incidence of stroke in nondiabetic individuals, insulin level is unknown for 34.1% of the subjects. The available data suggest that the missingness mechanism depends on outcome variables, which may generate biases in estimating the parameters of interest if only using the complete observations. This article aimed to introduce practical strategies to analyze the NOMAS data and present sensitivity analyses by using the reweighting method in standard statistical packages. When the data set structure is in counting process style, the reweighting estimates can be obtained by built-in procedures with variance estimated by the jackknife method. Simulation results indicate that the jackknife variance estimate provides reasonable coverage probability in moderate sample sizes. We subsequently conducted sensitivity analyses for the NOMAS data, showing that the risk estimates are robust to a variety of missingness mechanisms. At the end of this article, we present the core SAS and R programs used in the analysis. PMID:21965165

Xu, Qiang; Paik, Myunghee Cho; Rundek, Tatjana; Elkind, Mitchell S V; Sacco, Ralph L

2011-12-10

69

Statistical discrimination of liquid gasoline samples from casework.  

PubMed

The intention of this study was to differentiate liquid gasoline samples from casework by utilizing multivariate pattern recognition procedures on data from gas chromatography-mass spectrometry. A supervised learning approach was undertaken to achieve this goal employing the methods of principal component analysis (PCA), canonical variate analysis (CVA), orthogonal canonical variate analysis (OCVA), and linear discriminant analysis. The study revealed that the variability in the sample population was sufficient to distinguish all the samples from one another knowing their groups a priori. CVA was able to differentiate all samples in the population using only three dimensions, while OCVA required four dimensions. PCA required 10 dimensions of data in order to predict the correct groupings. These results were all cross-validated using the "jackknife" method to confirm the classification functions and compute estimates of error rates. The results of this initial study have helped to develop procedures for the application of multivariate analysis to fire debris casework. PMID:18643865

Petraco, Nicholas D K; Gil, Mark; Pizzola, Peter A; Kubic, T A

2008-09-01

70

Geographic variation in the G matrices of wild populations of the barn swallow.  

PubMed

In this paper, we present an analysis of genetic variation in three wild populations of the barn swallow, Hirundo rustica. We estimated the P, E, and G matrices for six linear morphological measurements and tested for variation among populations using the Flury hierarchical method and the jackknife followed by MANOVA method. Because of nonpositive-definite matrices, we had to employ 'bending' to analyse the G and E matrices with the Flury method. Both statistical methods agree in finding that the P and G matrices are significantly different but comparison between the analysis of the P matrices and pairwise analyses of the P, E, and G matrices suggests caution in interpreting the Flury results concerning differences in matrix structure. The significant variation among the populations in the G matrices appears to be due in large measure to the most geographically distant population. PMID:15218508

Roff, D A; Mousseau, T; Møller, A P; de Lope, F; Saino, N

2004-07-01

71

Prediction of protein structural class using a complexity-based distance measure.  

PubMed

Knowledge of structural class plays an important role in understanding protein folding patterns. So it is necessary to develop effective and reliable computational methods for prediction of protein structural class. To this end, we present a new method called NN-CDM, a nearest neighbor classifier with a complexity-based distance measure. Instead of extracting features from protein sequences as done previously, distance between each pair of protein sequences is directly evaluated by a complexity measure of symbol sequences. Then the nearest neighbor classifier is adopted as the predictive engine. To verify the performance of this method, jackknife cross-validation tests are performed on several benchmark datasets. Results show that our approach achieves higher prediction accuracy than some classical methods. PMID:19330425

Liu, Taigang; Zheng, Xiaoqi; Wang, Jun

2010-03-01
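
A jackknife (leave-one-out) cross-validation test of a nearest neighbor classifier, as described in the abstract above, can be sketched as follows: each sequence is held out in turn, assigned the label of its nearest remaining neighbor, and the share of correct assignments is reported. This is only a sketch under stated assumptions; the distance is passed in as a function, and the `hamming` helper is a hypothetical placeholder, not the complexity-based measure of the article.

```python
def jackknife_accuracy(samples, labels, distance):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier.

    `distance` is any function of two samples returning a non-negative number.
    """
    n = len(samples)
    correct = 0
    for i in range(n):
        # Hold out sample i and find its nearest neighbor among the rest.
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: distance(samples[i], samples[j]),
        )
        correct += labels[nearest] == labels[i]
    return correct / n


def hamming(a, b):
    """Placeholder distance for equal-length strings (illustrative only)."""
    return sum(ca != cb for ca, cb in zip(a, b))


if __name__ == "__main__":
    # Made-up toy sequences and class labels.
    seqs = ["AAAA", "AAAT", "AATA", "GGGG", "GGGC", "GGCG"]
    classes = ["alpha", "alpha", "alpha", "beta", "beta", "beta"]
    print("jackknife accuracy:", jackknife_accuracy(seqs, classes, hamming))
```

Because every sample is tested exactly once against a model built from all the others, the result does not depend on a random split, which is one reason this test is popular in the protein-classification literature listed here.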

72

Predicting peroxidase subcellular location by hybridizing different descriptors of Chou' pseudo amino acid patterns.  

PubMed

Peroxidases as universal enzymes are essential for the regulation of reactive oxygen species levels and play major roles in both disease prevention and human pathologies. Automated prediction of functional protein localization is rarely reported and also is important for designing new drugs and drug targets. In this study, we first propose a support vector machine (SVM)-based method to predict peroxidase subcellular localization. Various Chou' pseudo amino acid descriptors and gene ontology (GO)-homology patterns were selected as input features to multiclass SVM. Prediction results showed that the smoothed PSSM encoding pattern performed better than the other approaches. The best overall prediction accuracy was 87.0% in a jackknife test using a PSSM profile of pattern with width=5. We also demonstrate that the present GO annotation is far from complete or deep enough for annotating proteins with a specific function. PMID:24802134

Zuo, Yong-Chun; Peng, Yong; Liu, Li; Chen, Wei; Yang, Lei; Fan, Guo-Liang

2014-08-01

73

Bootstrapped MRMC confidence intervals  

NASA Astrophysics Data System (ADS)

The multiple-reader, multiple-case (MRMC) paradigm of Swets and Pickett (1982) for ROC analysis was expressed as a components of variance model by Dorfman, Berbaum, and Metz (1992) and validated by Roe and Metz (1997) for Type I error rates. Our group proposed an analysis of the MRMC components of variance model using bootstrap (Beiden, Wagner, and Campbell, 2000) experiments instead of jackknife pseudo-values. These approaches have been challenged by some contemporary authors (e.g. Zhou, Obuchowski, and McClish, 2002). The purpose of the present paper is to formally compare the models and to carry out validation tests of their performance. We investigate different approaches to statistical inference, including several types of nonparametric bootstrap confidence intervals and report on validation and simulation experiments of Type I errors.

Samuelson, Frank W.; Wagner, Robert F.

2005-04-01

74

Ontogeny of the barley plant as related to mutation expression and detection of pollen mutations  

SciTech Connect

Clustering of mutant pollen grains in a population of normal pollen due to premeiotic mutational events complicates translating mutation frequencies into rates. Embryo ontogeny in barley will be described and used to illustrate the formation of such mutant clusters. The nature of the statistics for mutation frequency will be described from a study of the reversion frequencies of various waxy mutants in barley. Computer analysis by a jackknife method of the reversion frequencies of a waxy mutant treated with the mutagen sodium azide showed a significantly higher reversion frequency than untreated material. Problems of the computer analysis suggest a better experimental design for pollen mutation experiments. Preliminary work on computer modeling for pollen development and mutation will be described.

Hodgdon, A.L.; Marcus, A.H.; Arenaz, P.; Rosichan, J.L.; Bogyo, T.P.; Nilan, R.A.

1980-05-29

75

Estimating contaminant loads in rivers: An application of adjusted maximum likelihood to type 1 censored data  

USGS Publications Warehouse

This paper presents an adjusted maximum likelihood estimator (AMLE) that can be used to estimate fluvial transport of contaminants, like phosphorus, that are subject to censoring because of analytical detection limits. The AMLE is a generalization of the widely accepted minimum variance unbiased estimator (MVUE), and Monte Carlo experiments confirm that it shares essentially all of the MVUE's desirable properties, including high efficiency and negligible bias. In particular, the AMLE exhibits substantially less bias than alternative censored-data estimators such as the MLE (Tobit) or the MLE followed by a jackknife. As with the MLE and the MVUE, the AMLE comes close to achieving the theoretical Fréchet-Cramér-Rao bounds on its variance. This paper also presents a statistical framework, applicable to both censored and complete data, for understanding and estimating the components of uncertainty associated with load estimates. This can serve to lower the cost and improve the efficiency of both traditional and real-time water quality monitoring.

Cohn, T. A.

2005-01-01

76

Identification of Antioxidants from Sequence Information Using Naïve Bayes  

PubMed Central

Antioxidant proteins are substances that protect cells from the damage caused by free radicals. Accurate identification of new antioxidant proteins is important in understanding their roles in delaying aging. Therefore, it is highly desirable to develop computational methods to identify antioxidant proteins. In this study, a Naïve Bayes-based method was proposed to predict antioxidant proteins using amino acid compositions and dipeptide compositions. In order to remove redundant information, a novel feature selection technique was employed to single out optimized features. In the jackknife test, the proposed method achieved an accuracy of 66.88% for the discrimination between antioxidant and nonantioxidant proteins, which is superior to that of other state-of-the-art classifiers. These results suggest that the proposed method could be an effective and promising high-throughput method for antioxidant protein identification. PMID:24062796

Feng, Peng-Mian; Chen, Wei

2013-01-01

77

Naïve Bayes Classifier with Feature Selection to Identify Phage Virion Proteins  

PubMed Central

Knowledge about the protein composition of phage virions is a key step to understand the functions of phage virion proteins. However, the experimental method to identify virion proteins is time consuming and expensive. Thus, it is highly desirable to develop novel computational methods for phage virion protein identification. In this study, a Naïve Bayes based method was proposed to predict phage virion proteins using amino acid composition and dipeptide composition. In order to remove redundant information, a novel feature selection technique was employed to single out optimized features. In the jackknife test, the proposed method achieved an accuracy of 79.15% for phage virion and nonvirion proteins classification, which is superior to that of other state-of-the-art classifiers. These results indicate that the proposed method could be an effective and promising high-throughput method in phage proteomics research. PMID:23762187

Feng, Peng-Mian; Ding, Hui; Chen, Wei

2013-01-01

78

Prediction of Cancer Drugs by Chemical-Chemical Interactions  

PubMed Central

Cancer, which is a leading cause of death worldwide, places a big burden on the health-care system. In this study, an order-prediction model was built to predict a series of cancer drug indications based on chemical-chemical interactions. According to the confidence scores of their interactions, the order from the most likely cancer to the least one was obtained for each query drug. The 1st order prediction accuracy of the training dataset was 55.93%, evaluated by Jackknife test, while it was 55.56% and 59.09% on a validation test dataset and an independent test dataset, respectively. The proposed method outperformed a popular method based on molecular descriptors. Moreover, it was verified that some drugs were effective for the ‘wrong’ predicted indications, indicating that some ‘wrong’ drug indications were actually correct indications. Encouraged by the promising results, the method may become a useful tool for the prediction of drug indications. PMID:24498372

Li, Hai-Peng; Feng, Kai-Yan; Chen, Lei; Zheng, Ming-Yue; Cai, Yu-Dong

2014-01-01

79

Prediction of protein structural classes using hybrid properties.  

PubMed

In this paper, amino acid compositions are combined with some protein sequence properties (physiochemical properties) to predict protein structural classes. We are able to predict protein structural classes using a mathematical model that combines the nearest neighbor algorithm (NNA), mRMR (minimum redundancy, maximum relevance), and feature forward searching strategy. Jackknife cross-validation is used to evaluate the prediction accuracy. As a result, the prediction success rate improves to 68.8%, which is better than the 62.2% obtained when using only amino acid compositions. Therefore, we conclude that the physiochemical properties are factors that contribute to the protein folding phenomena and the most contributing features are found to be the amino acid composition. We expect that prediction accuracy will improve further as more sequence information comes to light. A web server for predicting the protein structural classes is available at http://app3.biosino.org:8080/liwenjin/index.jsp. PMID:18953662

Li, Wenjin; Lin, Kao; Feng, Kaiyan; Cai, Yudong

2008-01-01
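
The jackknife (leave-one-out) test named in the record above, and in several neighbouring records, can be sketched as follows. This is a minimal illustration only: the plain 1-nearest-neighbour rule, the Euclidean distance, and the toy two-dimensional features are assumptions made here, not the feature set or model used by the authors.

```python
import math

def nearest_neighbor_label(query, samples):
    """Return the label of the sample whose features are closest to query."""
    return min(samples, key=lambda s: math.dist(query, s[0]))[1]

def jackknife_accuracy(samples):
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # hold out sample i
        if nearest_neighbor_label(features, rest) == label:
            correct += 1
    return correct / len(samples)

# Hypothetical toy data: (feature vector, class label)
toy = [
    ((0.0, 0.1), "a"), ((0.1, 0.0), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.1), "b"), ((1.1, 1.0), "b"), ((0.9, 1.0), "b"),
]
accuracy = jackknife_accuracy(toy)
```

Because every sample is held out exactly once, the result does not depend on a random split, which is one reason this test is popular for small benchmark sets.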

80

Pine Hollow Watershed Project : FY 2000 Projects.  

SciTech Connect

The Pine Hollow Project (1999-010-00) is an on-going watershed restoration effort administered by Sherman County Soil and Water Conservation District and spearheaded by Pine Hollow/Jackknife Watershed Council. The headwaters are located near Shaniko in Wasco County, and the mouth is in Sherman County on the John Day River. Pine Hollow provides more than 20 miles of potential summer steelhead spawning and rearing habitat. The watershed is 92,000 acres. Land use is mostly range, with some dryland grain. There are no water rights on Pine Hollow. Due to shallow soils, the watershed is prone to rapid runoff events which scour out the streambed and the riparian vegetation. This project seeks to improve the quality of upland, riparian and in-stream habitat by restoring the natural hydrologic function of the entire watershed. Project implementation to date has consisted of construction of water/sediment control basins, gradient terraces on croplands, pasture cross-fences, upland water sources, and grass seeding on degraded sites, many of which were crop fields in the early part of the century. The project is expected to continue through about 2007. From March 2000 to June 2001, the Pine Hollow Project built 6 sediment basins, 1 cross-fence, 2 spring developments, 1 well development, 1 solar pump, 50 acres of native range seeding and 1 livestock waterline. FY2000 projects were funded by BPA, Oregon Watershed Enhancement Board, US Fish and Wildlife Service and landowners. In-kind services were provided by Sherman County Soil and Water Conservation District, USDA Natural Resources Conservation Service, USDI Bureau of Land Management, Oregon Department of Fish and Wildlife, Pine Hollow/Jackknife Watershed Council, landowners and Wasco County Soil and Water Conservation District.

Sherman County Soil and Water Conservation District

2001-06-01

81

Tracing the pathways of neotropical migratory shorebirds using stable isotopes: a pilot study.  

PubMed

We evaluated the potential use of stable isotopes to establish linkages between the wintering grounds and the breeding grounds of the Pectoral Sandpiper (Calidris melanotos), the White-rumped Sandpiper (Calidris fuscicollis), the Baird's Sandpiper (Calidris bairdii), and other Neotropical migratory shorebird species (e.g., Tringa spp.). These species molt their flight feathers on the wintering grounds and hence their flight feathers carry chemical signatures that are characteristic of their winter habitat. The objective of our pilot study was to assess the feasibility of identifying the winter origin of individual birds by: (1) collecting shorebird flight feathers from several widely separated Argentine sites and analyzing these for a suite of stable isotopes; and (2) analyzing the deuterium and 18O isotope data that were available from precipitation measurement stations in Argentina. Isotopic ratios (delta13C, delta15N and delta34S) of flight feathers were significantly different among three widely separated sites in Argentina during January 2001. In terms of relative importance in separating the sites, delta34S was most important, followed by delta15N, and then delta13C. In the complete discriminant analysis, the classification function correctly predicted group membership in 85% of the cases (jackknifed classification matrix). In a stepwise analysis delta13C was dropped from the solution, and site membership was correctly predicted in 92% of cases (jackknifed matrix). Analysis of precipitation data showed that both deltaD and delta18O were significantly related to both latitude and longitude on a countrywide scale (p < 0.001). Other variables (month, altitude) explained little additional variation in these isotope ratios. Several issues were identified that will likely constrain the degree of accuracy one can expect in predicting the geographic origin of birds from Argentina. 
There was unexplained variation in isotope ratios within and among the different wing feathers from individual birds. Such variation may indicate that birds are not faithful to a local site during their winter stay in Argentina. There was significant interannual variation in the deltaD and delta18O of precipitation. Hence, specific locations may not have a constant signature for some isotopes. Moreover, the fractionation that occurs in wetlands due to evaporation significantly skews local deltaD and delta18O values, which may undermine the strong large-scale gradients seen in the precipitation data. We are continuing the research with universities in Argentina with a focus on expanding the breadth of feather collection and attempting to resolve the identified issues. PMID:14521278

Farmer, A; Rye, R; Landis, G; Bern, C; Kester, C; Ridley, I

2003-09-01

82

Spectral estimation for geophysical time-series with inconvenient gaps  

NASA Astrophysics Data System (ADS)

The power of spectral estimation as a tool for studying geophysical processes is often limited by short records or breaks in available time-series. Direct spectral estimation using multitaper techniques designed to reduce variance and minimize leakage can help alleviate the first problem. For records with gaps, systematic interpolation or averaging of multitaper spectra derived from record fragments may prove adequate in some cases, but can be cumbersome to implement. Alternatively, multitapers can be modified for use in direct spectral estimation with intermittently sampled data. However, their performance has not been adequately studied. We investigate reliability and resolution of techniques that adapt prolate and minimum bias (MB) multitapers to accommodate the longest breaks in sampling, comparing the tapering functions (referred to as PRG or MBG tapers) with the standard prolate and MB tapers used for complete data series, and with the section-averaging approach. Using a synthetic data set, we test both jackknife and bootstrap methods to calculate confidence intervals for PRG and MBG multitaper spectral estimates and find the jackknife is both more accurate and faster to compute. To implement these techniques for a variety of data sets, we provide an algorithm that allows the user to balance judicious interpolation against the use of suitably adapted tapers, providing empirical measures of both bias and frequency resolution for candidate sets of tapers. These techniques are tested on diverse geophysical data sets: a record of change in the length of day, a model of the external dipole part of the geomagnetic field produced by the magnetospheric ring current, and a 12 Myr long irregularly sampled relative geomagnetic palaeointensity record with pernicious gaps. We conclude that both PRG and MBG tapers generally perform as well as, or better than, an optimized form of the commonly used section averaging approach. 
The greatest improvements seem to occur when the gap structure creates data segments of very unequal lengths. Ease of computation and more robust behaviour can make MBG tapers a better choice than PRG except when very fine-scale frequency resolution is required. These techniques could readily be applied for cross-spectral and transfer function estimation and are a useful addition to the geophysical toolbox.

Smith-Boughner, L. T.; Constable, C. G.

2012-09-01

83

Use of remote sensing for analysis and estimation of vector-borne disease  

NASA Astrophysics Data System (ADS)

Epidemiological data on malaria cases were correlated with satellite-based vegetation health (VH) indices to investigate whether they can be used as a proxy for monitoring the number of malaria cases. Mosquitoes, which spread malaria in Bangladesh, are very sensitive to environmental conditions, especially to changes in weather. Therefore, VH indices, which characterize weather conditions, were tested as indicators of mosquitoes' activities in the spread of malaria. Satellite data were represented by the following VH indices: Vegetation Condition Index (VCI), Temperature Condition Index (TCI), and Vegetation Health Index (VHI). They were derived from radiances measured by the Advanced Very High Resolution Radiometer (AVHRR) flown on NOAA afternoon polar orbiting satellites. Assessment of sensitivity of the VH was performed using correlation and regression analysis. Estimation models were validated using a jackknife cross-validation procedure. Results show that the VH indices can be used for detection and numerical estimation of the number of malaria cases. During the cooler months (January--April), when mosquitoes are less active, the correlation is low; it increases considerably during the warm and wet season (April--November), for TCI in early October and for VCI in mid September. All analyses and estimation models developed here are based on data obtained for Bangladesh.

Rahman, Atiqur

84

Orthology Inference in Nonmodel Organisms Using Transcriptomes and Low-Coverage Genomes: Improving Accuracy and Matrix Occupancy for Phylogenomics  

PubMed Central

Orthology inference is central to phylogenomic analyses. Phylogenomic data sets commonly include transcriptomes and low-coverage genomes that are incomplete and contain errors and isoforms. These properties can severely violate the underlying assumptions of orthology inference with existing heuristics. We present a procedure that uses phylogenies for both homology and orthology assignment. The procedure first uses similarity scores to infer putative homologs that are then aligned, constructed into phylogenies, and pruned of spurious branches caused by deep paralogs, misassembly, frameshifts, or recombination. These final homologs are then used to identify orthologs. We explore four alternative tree-based orthology inference approaches, of which two are new. These accommodate gene and genome duplications as well as gene tree discordance. We demonstrate these methods in three published data sets including the grape family, Hymenoptera, and millipedes with divergence times ranging from approximately 100 to over 400 Ma. The procedure significantly increased the completeness and accuracy of the inferred homologs and orthologs. We also found that data sets that are more recently diverged and/or include more high-coverage genomes had more complete sets of orthologs. To explicitly evaluate sources of conflicting phylogenetic signals, we applied serial jackknife analyses of gene regions keeping each locus intact. The methods described here can scale to over 100 taxa. They have been implemented in python with independent scripts for each step, making it easy to modify or incorporate them into existing pipelines. All scripts are available from https://bitbucket.org/yangya/phylogenomic_dataset_construction. PMID:25158799

Yang, Ya; Smith, Stephen A.

2014-01-01

85

Tomographic imaging of local earthquake delay times for three-dimensional velocity variation in western Washington  

SciTech Connect

Tomographic inversion is applied to delay times from local earthquakes to image three dimensional velocity variations in the Puget Sound region of western Washington. The 37,500 square km region is represented by nearly cubic blocks of 5 km per side. P-wave arrival time observations from 4,387 crustal earthquakes, with depths of 0 to 40 km, were used as sources producing 36,865 rays covering the target region. A conjugate gradient method (LSQR) is used to invert the large, sparse system of equations. To diminish the effects of noisy data, the Laplacian is constrained to be zero within horizontal layers, providing smoothing of the model. The resolution is estimated by calculating impulse responses at blocks of interest and estimates of standard errors are calculated by the jackknife statistical procedure. Results of the inversion are correlated with some known geologic features and independent geophysical measurements. High P-wave velocities along the eastern flank of the Olympic Peninsula are interpreted to reflect the subsurface extension of Crescent terrane. Low velocities beneath the Puget Sound further to the east are inferred to reflect thick sediment accumulations. The Crescent terrane appears to extend beneath Puget Sound, consistent with its interpretation as a major accretionary unit. In the southern Puget Sound basin, high velocity anomalies at depths of 10-20 km are interpreted as Crescent terrane and are correlated with a region of low seismicity. Near Mt. Rainier, high velocity anomalies may reflect buried plutons.

Lees, J.M.; Crosson, R.S. (Univ. of Washington, Seattle (United States))

1990-04-10
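
Jackknife standard errors, as mentioned in the record above, follow a simple delete-one recipe. The sketch below is a generic illustration and not the tomographic implementation: the function name `jackknife_se` and the sample values are hypothetical, and the statistic is applied to a plain list rather than to an inverted velocity model.

```python
import math
from statistics import mean, stdev

def jackknife_se(data, statistic):
    """Delete-one jackknife standard error of statistic(data)."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    # Recompute the statistic n times, each time leaving one observation out.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    center = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((t - center) ** 2 for t in leave_one_out)
    return math.sqrt(variance)

# Hypothetical sample values
values = [2.0, 4.0, 4.0, 5.0, 9.0]
se_mean = jackknife_se(values, mean)
```

For the sample mean, this estimate coincides with the textbook standard error s/sqrt(n), which gives a convenient sanity check before applying the same routine to a statistic with no closed-form error.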

86

Heritable changes in regional cortical thickness with age.  

PubMed

It is now well established that regional indices of brain structure such as cortical thickness, surface area or grey matter volume exhibit spatially variable patterns of heritability. However, a recent study found these patterns to change with age during development, a result supported by gene expression studies. Changes in heritability have not been investigated in adulthood so far and could have important implications in the study of heritability and genetic correlations in the brain as well as in the discovery of specific genes explaining them. Herein, we tested for genotype by age (G × A) interactions, an extension of genotype by environment interactions, through adulthood and healthy aging in 902 subjects from the Genetics of Brain Structure (GOBS) study. A "jackknife" based method for the analysis of stable cortical thickness clusters (JASC) and scale selection is also introduced. Although additive genetic variance remained constant throughout adulthood, we found evidence for incomplete pleiotropy across age in the cortical thickness of paralimbic and parieto-temporal areas. This suggests that different genetic factors account for cortical thickness heritability at different ages in these regions. PMID:24752552

Chouinard-Decorte, Francois; McKay, D Reese; Reid, Andrew; Khundrakpam, Budhachandra; Zhao, Lu; Karama, Sherif; Rioux, Pierre; Sprooten, Emma; Knowles, Emma; Kent, Jack W; Curran, Joanne E; Göring, Harald H H; Dyer, Thomas D; Olvera, Rene L; Kochunov, Peter; Duggirala, Ravi; Fox, Peter T; Almasy, Laura; Blangero, John; Bellec, Pierre; Evans, Alan C; Glahn, David C

2014-06-01

87

Orthology inference in nonmodel organisms using transcriptomes and low-coverage genomes: improving accuracy and matrix occupancy for phylogenomics.  

PubMed

Orthology inference is central to phylogenomic analyses. Phylogenomic data sets commonly include transcriptomes and low-coverage genomes that are incomplete and contain errors and isoforms. These properties can severely violate the underlying assumptions of orthology inference with existing heuristics. We present a procedure that uses phylogenies for both homology and orthology assignment. The procedure first uses similarity scores to infer putative homologs that are then aligned, constructed into phylogenies, and pruned of spurious branches caused by deep paralogs, misassembly, frameshifts, or recombination. These final homologs are then used to identify orthologs. We explore four alternative tree-based orthology inference approaches, of which two are new. These accommodate gene and genome duplications as well as gene tree discordance. We demonstrate these methods in three published data sets including the grape family, Hymenoptera, and millipedes with divergence times ranging from approximately 100 to over 400 Ma. The procedure significantly increased the completeness and accuracy of the inferred homologs and orthologs. We also found that data sets that are more recently diverged and/or include more high-coverage genomes had more complete sets of orthologs. To explicitly evaluate sources of conflicting phylogenetic signals, we applied serial jackknife analyses of gene regions keeping each locus intact. The methods described here can scale to over 100 taxa. They have been implemented in python with independent scripts for each step, making it easy to modify or incorporate them into existing pipelines. All scripts are available from https://bitbucket.org/yangya/phylogenomic_dataset_construction. PMID:25158799

Yang, Ya; Smith, Stephen A

2014-11-01

88

Hinged Plakin Domains Provide Specialized Degrees of Articulation in Envoplakin, Periplakin and Desmoplakin  

PubMed Central

Envoplakin, periplakin and desmoplakin are cytoskeletal proteins that provide structural integrity within the skin and heart by resisting shear forces. Here we reveal the nature of unique hinges within their plakin domains that provide divergent degrees of flexibility between rigid long and short arms composed of spectrin repeats. The range of mobility of the two arms about the hinge is revealed by applying the ensemble optimization method to small-angle X-ray scattering data. Envoplakin and periplakin adopt ‘L’ shaped conformations exhibiting a ‘helicopter propeller’-like mobility about the hinge. By contrast desmoplakin exhibits essentially unrestricted mobility by ‘jack-knifing’ about the hinge. Thus the diversity of molecular jointing that can occur about plakin hinges includes ‘L’ shaped bends, ‘U’ turns and fully extended ‘I’ orientations between rigid blocks of spectrin repeats. This establishes specialised hinges in plakin domains as a key source of flexibility that may allow sweeping of cellular spaces during assembly of cellular structures and could impart adaptability, so preventing irreversible damage to desmosomes and the cell cytoskeleton upon exposure to mechanical stress. PMID:23922795

Al-Jassar, Caezar; Bernadó, Pau; Chidgey, Martyn; Overduin, Michael

2013-01-01

89

RaPiDS: an algorithm for rapid expression profile database search.  

PubMed

In this paper we present a fast algorithm and implementation for computing the Spearman rank correlation (SRC) between a query expression profile and each expression profile in a database of profiles. The algorithm is linear in the size of the profile database with a very small constant factor. It is designed to efficiently handle multiple profile platforms and missing values. We show that our specialized algorithm and C++ implementation can achieve an approximately 100-fold speed-up over a reasonable baseline implementation using Perl hash tables. RaPiDS is designed for general similarity search rather than classification - but in order to attempt to classify the usefulness of SRC as a similarity measure we investigate the usefulness of this program as a classifier for classifying normal human cell types based on gene expression. Specifically we use the k nearest neighbor classifier with a t statistic derived from SRC as the similarity measure for profile pairs. We estimate the accuracy using a jackknife test on the microarray data with manually checked cell type annotation. Preliminary results suggest the measure is useful (64% accuracy on 1,685 profiles vs. the majority class classifier's 17.5%) for profiles measured under similar conditions (same laboratory and chip platform); but requires improvement when comparing profiles from different experimental series. PMID:17503380

Horton, Paul B; Kiseleva, Larisa; Fujibuchi, Wataru

2006-01-01

90

Bacterial community structure and soil properties of a subarctic tundra soil in Council, Alaska  

PubMed Central

The subarctic region is highly responsive and vulnerable to climate change. Understanding the structure of subarctic soil microbial communities is essential for predicting the response of the subarctic soil environment to climate change. To determine the composition of the bacterial community and its relationship with soil properties, we investigated the bacterial community structure and properties of surface soil from the moist acidic tussock tundra in Council, Alaska. We collected 70 soil samples with 25-m intervals between sampling points from 0–10 cm to 10–20 cm depths. The bacterial community was analyzed by pyrosequencing of 16S rRNA genes, and the following soil properties were analyzed: soil moisture content (MC), pH, total carbon (TC), total nitrogen (TN), and inorganic nitrogen (NH4+ and NO3−). The community compositions of the two different depths showed that Alphaproteobacteria decreased with soil depth. Among the soil properties measured, soil pH was the most significant factor correlating with bacterial community in both upper and lower-layer soils. Bacterial community similarity based on jackknifed unweighted UniFrac distance showed greater similarity across horizontal layers than through the vertical depth. This study showed that soil depth and pH were the most important soil properties determining bacterial community structure of the subarctic tundra soil in Council, Alaska. PMID:24893754

Kim, Hye Min; Jung, Ji Young; Yergeau, Etienne; Hwang, Chung Yeon; Hinzman, Larry; Nam, Sungjin; Hong, Soon Gyu; Kim, Ok-Sun; Chun, Jongsik; Lee, Yoo Kyung

2014-01-01

91

How many nucleotides are required to resolve a phylogenetic problem? The use of a new statistical method applicable to available sequences.  

PubMed

The evolution of bootstrap proportions (BP) with sequence length was studied using a 28S ribosomal RNA data set. For different sequence lengths, informative sites were jackknifed several times. Bootstrapping was subsequently performed on each of these subsamples. For each node, BPs so obtained were plotted against sequence length, showing the evolution of the robustness with increasing number of informative sites. For robust nodes (BP of 100%), the pattern of BPs is unvarying and is described by a simple function $\mathrm{BP} = 100\,(1 - e^{-b(x - x')})$, where $x$ is the number of informative sites and $b$ and $x'$ are two parameters estimated using a nonlinear regression procedure. When a node has a BP < 100% and the pattern of BPs fits this function, it is possible to estimate the number of informative sites required to obtain a given average BP. The method also identifies nonrobust nodes (nonascending clusters of BP dots), for which it seems to be more cost effective and fruitful to turn to other species and/or genes rather than to continue sequencing longer gene lengths from the same species to reach a BP of 95%. PMID:7697188

Lecointre, G; Philippe, H; Vân Lê, H L; Le Guyader, H

1994-12-01
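
The saturation curve in the record above can be inverted to estimate how many informative sites a target BP would require. The sketch below assumes the exponent reads -b(x - x'), and the parameter values are hypothetical placeholders, since the abstract reports no fitted values.

```python
import math

def bootstrap_proportion(x, b, x_prime):
    """BP (percent) predicted at x informative sites by 100 * (1 - exp(-b (x - x')))."""
    return 100.0 * (1.0 - math.exp(-b * (x - x_prime)))

def sites_needed(target_bp, b, x_prime):
    """Solve the curve for x: x = x' - ln(1 - BP/100) / b."""
    if not 0.0 <= target_bp < 100.0:
        raise ValueError("target_bp must lie in [0, 100)")
    return x_prime - math.log(1.0 - target_bp / 100.0) / b

# Hypothetical fitted parameters for one node (not taken from the paper)
b_hat, x_prime_hat = 0.01, 20.0
x_for_95 = sites_needed(95.0, b_hat, x_prime_hat)
```

A target of exactly 100% is rejected because the curve only approaches that value asymptotically.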

92

Using logistic regression to estimate delay-discounting functions.  

PubMed

The monetary choice questionnaire (MCQ) and similar computer tasks ask preference questions in order to ascertain indifference, the perceived equivalence of immediate versus larger delayed rewards. Indifference data are then fitted with a hyperbolic function, summarizing the decline in perceived value with delay time. We present a fitting method that estimates the hyperbolic parameter k directly from survey responses. Binary preferences are modeled as a function of time (X2) and a transformed reward ratio (X1), yielding logistic regression coefficients beta 2 and beta 1. The hyperbolic parameter emerges as k = beta 2/beta 1, where the logistic predicted p = .5 (the definition of indifference). The MCQ was administered to 1,073 adolescents and was scored using both standard and logistic methods. The means for ln(k) were similar (standard, -4.53; logistic, -4.51), and the results were highly correlated (rho = .973). Simulated MCQ data showed that k was unbiased, except where beta 1 ≥ -1, indicating a vague survey response. Jackknife standard errors provided excellent coverage. PMID:15190698

Wileyto, E Paul; Audrain-McGovern, Janet; Epstein, Leonard H; Lerman, Caryn

2004-02-01
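
The relation k = beta 2/beta 1 in the record above can be recovered with a short derivation. The coding of X1 below, and the absence of an intercept, are assumptions chosen to be consistent with the abstract, which does not spell out the transformation; A denotes the delayed reward, I the immediate reward, and D the delay.

```latex
\begin{align*}
V &= \frac{A}{1 + kD}
  && \text{(hyperbolic value of a reward } A \text{ delayed by } D\text{)} \\
I &= \frac{A}{1 + kD}
  \;\Longrightarrow\; kD = \frac{A}{I} - 1
  && \text{(indifference with immediate reward } I\text{)} \\
\operatorname{logit}(p) &= \beta_1 X_1 + \beta_2 X_2,
  \quad X_1 = 1 - \frac{A}{I}, \quad X_2 = D
  && \text{(assumed coding, no intercept)} \\
p = .5 &\;\Longrightarrow\;
  \beta_1\!\left(1 - \frac{A}{I}\right) + \beta_2 D = 0
  \;\Longrightarrow\; \frac{A}{I} - 1 = \frac{\beta_2}{\beta_1}\, D \\
k &= \frac{\beta_2}{\beta_1}
\end{align*}
```

Under this coding both coefficients are negative when p is the probability of choosing the delayed reward, so their ratio k is positive.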

93

Predictability of Zimbabwe summer rainfall  

NASA Astrophysics Data System (ADS)

Predictors of Zimbabwe summer rainfall are investigated with a view to improved long-range forecasts. Teleconnectivity is assessed in respect of sea-surface temperatures, the Southern Oscillation index, the Quasi-biennial Oscillation (QBO), outgoing longwave radiation (OLR) and wind. Spectral analyses of historical rainfall give an indication of cycles in the range 2.3, 18 and 3.8 years, possibly associated with the QBO, the luni-solar tide and the El Niño-Southern Oscillation (ENSO), respectively. Pair-wise correlations are found between Zimbabwe summer rainfall and SST in the central Indian Ocean (r<-0.5) in austral spring. Below normal OLR values in September over southern Africa correspond with good rains in the following summer. Rainfall-upper-wind correlations are optimum (r<-0.7) over the equatorial Atlantic in spring. Comparatively weak correlation with the QBO may also reflect biennial adjustment of monsoon and global ENSO teleconnections. Additional predictor variables are utilized and multivariate models are formulated for early and late summer rainfall and maize yield in Zimbabwe. The models use three to five predictors, are trained over a 22-year period and perform well in jack-knife skill tests. Summer rainfall forecasts with one season lead times are viable and could ameliorate hardship caused by drought.

Makarau, Amos; Jury, Mark R.

1997-11-01

94

Combination of statistical approaches for analysis of 2-DE data gives complementary results.  

PubMed

Five methods for finding significant changes in proteome data have been used to analyze a two-dimensional gel electrophoresis data set. We used both univariate (ANOVA) and multivariate (Partial Least Squares with jackknife, Cross Model Validation, Power-PLS and CovProc) methods. The gels were taken from a time-series experiment exploring the changes in metabolic enzymes in bovine muscle at five time-points after slaughter. The data set consisted of 1377 protein spots, and for each analysis, the data set was preprocessed to fit the requirements of the chosen method. Each analysis method generated one list of proteins found to be significantly changed according to the experimental design. Although the number of selected variables varied between the methods, we found that this was dependent on the specific aim of each method. CovProc and P-PLS focused more on getting the minimum necessary subset of proteins to explain properties of the samples. These methods ended up with fewer selected proteins. There was also a correlation between level of significance and frequency of selection for the selected proteins. PMID:19367717

Grove, Harald; Jørgensen, Bo M; Jessen, Flemming; Søndergaard, Ib; Jacobsen, Susanne; Hollung, Kristin; Indahl, Ulf; Faergestad, Ellen M

2008-12-01

95

iPro54-PseKNC: a sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition  

PubMed Central

The σ54 promoters are unique in prokaryotic genomes and responsible for transcribing carbon and nitrogen-related genes. With the avalanche of genome sequences generated in the postgenomic age, it is highly desired to develop automated methods for rapidly and effectively identifying the σ54 promoters. Here, a predictor called ‘iPro54-PseKNC’ was developed. In the predictor, the samples of DNA sequences were formulated by a novel feature vector called ‘pseudo k-tuple nucleotide composition’, which was further optimized by the incremental feature selection procedure. The performance of iPro54-PseKNC was examined by the rigorous jackknife cross-validation tests on a stringent benchmark data set. As a user-friendly web-server, iPro54-PseKNC is freely accessible at http://lin.uestc.edu.cn/server/iPro54-PseKNC. For the convenience of the vast majority of experimental scientists, a step-by-step protocol guide was provided on how to use the web-server to get the desired results without the need to follow the complicated mathematics that were presented in this paper just for its integrity. Meanwhile, we also discovered through an in-depth statistical analysis that the distribution of distances between the transcription start sites and the translation initiation sites was governed by the gamma distribution, which may provide a fundamental physical principle for studying the σ54 promoters. PMID:25361964

Lin, Hao; Deng, En-Ze; Ding, Hui; Chen, Wei; Chou, Kuo-Chen

2014-01-01
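Several records in this listing, including the one above, report "jackknife cross-validation" of a classifier. In that usage the term usually means leave-one-out testing: each sample is held out in turn, the model is trained on the rest, and the held-out sample is predicted. The sketch below illustrates the procedure only. The function names `loo_accuracy` and `nearest_neighbor`, the 1-nearest-neighbour rule and the toy data are all assumptions made for illustration and are not taken from any of the cited papers.

```python
def nearest_neighbor(train_x, train_y, query):
    """Hypothetical stand-in classifier: label of the closest training point."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[dists.index(min(dists))]


def loo_accuracy(samples, labels, classify):
    """Leave-one-out ("jackknife") test: hold out each sample once."""
    correct = 0
    for i in range(len(samples)):
        # Train on everything except sample i.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if classify(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Toy data (assumed): two well-separated clusters in a 2-D feature space.
samples = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (1.0, 1.1), (0.9, 1.0), (1.1, 0.9)]
labels = [0, 0, 0, 1, 1, 1]
accuracy = loo_accuracy(samples, labels, nearest_neighbor)
```

Because every sample is tested exactly once and the split is deterministic, the result does not depend on a random partition, which is one reason this test is often preferred on small benchmark sets.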

96

Diagnostic performance of detecting breast cancer on computed radiographic (CR) mammograms: comparison of hard copy film, 3-megapixel liquid-crystal-display (LCD) monitor and 5-megapixel LCD monitor.  

PubMed

The purpose was to compare observer performance in the detection of breast cancer using hard-copy film, and 3-megapixel (3-MP) and 5-megapixel (5-MP) liquid crystal display (LCD) monitors in a simulated screening setting. We amassed 100 sample sets, including 32 patients with surgically proven breast cancer (masses present, N = 12; microcalcifications, N = 10; other types, N = 10) and 68 normal controls. All the mammograms were obtained using computed radiography (CR; sampling pitch of 50 μm). Twelve mammographers independently assessed CR mammograms presented in random order for hard-copy and soft-copy reading at intervals of at least 4 weeks. Observers rated the images on seven-point (1 to 7) and continuous (0 to 100) malignancy scales. Receiver-operating-characteristics analysis was performed, and the average area under the curve (AUC) was calculated for each modality. The jackknife method with the Bonferroni correction was applied to multireader/multicase analysis. The average AUC values for the 3-MP LCD, 5-MP LCD, and hard-copy film were 0.954, 0.947, and 0.956 on the seven-point scale and 0.943, 0.923, and 0.944 on the continuous scale, respectively. There were no significant differences among the three modalities on either scale. Soft-copy reading using 3-MP and 5-MP LCDs is comparable to hard-copy reading for detecting breast cancer. PMID:18491108

Yamada, Takayuki; Suzuki, Akihiko; Uchiyama, Nachiko; Ohuchi, Noriaki; Takahashi, Shoki

2008-11-01

97

KinasePhos 2.0: a web server for identifying protein kinase-specific phosphorylation sites based on sequences and coupling patterns  

PubMed Central

Due to the importance of protein phosphorylation in cellular control, much research has been undertaken to predict kinase-specific phosphorylation sites. Our previous work, KinasePhos 1.0, incorporated profile hidden Markov models (HMMs) with the flanking residues of the kinase-specific phosphorylation sites. Herein, a new web server, KinasePhos 2.0, incorporates support vector machines (SVM) with the protein sequence profile and protein coupling pattern, which is a novel feature used for identifying phosphorylation sites. The coupling pattern [XdZ] denotes the amino acid coupling-pattern of amino acid types X and Z that are separated by d amino acids. The differences or quotients of coupling strength CXdZ between the positive set of phosphorylation sites and the background set of whole protein sequences from Swiss-Prot are computed to determine the number of coupling patterns for training SVM models. After evaluation based on k-fold cross-validation and jackknife cross-validation, the average predictive accuracies for phosphorylated serine, threonine, tyrosine and histidine are 90, 93, 88 and 93%, respectively. KinasePhos 2.0 performs better than other tools previously developed. The proposed web server is freely available at http://KinasePhos2.mbc.nctu.edu.tw/. PMID:17517770

Wong, Yung-Hao; Lee, Tzong-Yi; Liang, Han-Kuen; Huang, Chia-Mao; Wang, Ting-Yuan; Yang, Yi-Huan; Chu, Chia-Huei; Huang, Hsien-Da; Ko, Ming-Tat; Hwang, Jenn-Kang

2007-01-01
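The coupling pattern [XdZ] defined in the record above can be illustrated by counting residue pairs in a sequence. This is a minimal sketch under one stated assumption: "separated by d amino acids" is read as exactly d residues lying between X and Z, so the two positions differ by d + 1. The function name `coupling_pattern_counts` and the example peptide are hypothetical. The sketch only counts occurrences and does not attempt the coupling strength used by KinasePhos 2.0, which the abstract does not define.

```python
from collections import Counter


def coupling_pattern_counts(sequence, d):
    """Count pairs (X, Z) with exactly d residues between X and Z (assumed reading)."""
    counts = Counter()
    gap = d + 1  # index offset between the X and Z positions
    for i in range(len(sequence) - gap):
        counts[(sequence[i], sequence[i + gap])] += 1
    return counts


# Hypothetical 7-residue peptide; d=1 pairs each residue with the one two positions on.
counts = coupling_pattern_counts("SAKSPKS", d=1)
s1k = counts[("S", "K")]  # occurrences of the pattern [S1K]
```

Comparing such counts between a positive set and a background set is the kind of contrast the abstract describes, although the exact normalization is left open here.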

98

Predicting subcellular location of proteins using integrated-algorithm method.  

PubMed

A protein's subcellular location, which indicates where the protein resides in a cell, is an important characteristic of the protein. Correctly assigning proteins to their subcellular locations would be of great help to the prediction of proteins' function, genome annotation, and drug design. Yet, in spite of great technical advance in the past decades, it is still time-consuming and laborious to experimentally determine protein subcellular locations on a high throughput scale. Hence, four integrated-algorithm methods were developed to fulfill such high throughput prediction in this article. Two data sets taken from the literature (Chou and Elrod, Protein Eng 12:107-118, 1999) were used as training set and test set, which consisted of 2,391 and 2,598 proteins, respectively. Amino acid composition was applied to represent the protein sequences. The jackknife cross-validation was used to test the training set. The final best integrated-algorithm predictor was constructed by integrating 10 algorithms in Weka (a software tool for tackling data mining tasks, http://www.cs.waikato.ac.nz/ml/weka/ ) based on an mRMR (Minimum Redundancy Maximum Relevance, http://research.janelia.org/peng/proj/mRMR/ ) method. It achieves correct rates of 77.83 and 80.56% for the training set and test set, respectively, which is better than all of the 60 algorithms collected in Weka. This predicting software is available upon request. PMID:19662505

Cai, Yu-Dong; Lu, Lin; Chen, Lei; He, Jian-Feng

2010-08-01

99

Prediction of the types of ion channel-targeted conotoxins based on radial basis function network.  

PubMed

Conotoxins are small disulfide-rich peptide toxins, which have the exceptional diversity of sequences. Because conotoxins are able to specifically bind to ion channels and interfere with neurotransmission, they are considered as the excellent pharmacological candidates in drug design. Appropriate type assignment of newly sequenced mature ion channel-targeted conotoxins with computational method is conducive to explore the biological and pharmacological functions of conotoxins. In this paper, we developed a novel method based on binomial distribution and radial basis function network to predict the types of ion-channel targeted conotoxins. We achieved the overall accuracy of 89.3% with average accuracy of 89.7% in the prediction of three types of ion channel-targeted conotoxins in jackknife cross-validation test, indicating that the method is superior to other state-of-the-art methods. In addition, we evaluated the proposed model with an independent dataset including 77 conotoxins. The overall accuracy of 85.7% was achieved, validating that our model is reliable. Moreover, we used the proposed method to annotate 336 function-undefined mature conotoxins in the UniProt Database. The model provides the valuable instructions for theoretical and experimental research on conotoxins. PMID:23280100

Yuan, Lu-Feng; Ding, Chen; Guo, Shou-Hui; Ding, Hui; Chen, Wei; Lin, Hao

2013-03-01

100

Analysis and identification of essential genes in humans using topological properties and biological information.  

PubMed

Genes that are indispensable for survival are termed essential genes. The analysis and identification of essential genes are very important for understanding the minimal requirements of cellular survival and for practical purposes. Proteins do not exert their function in isolation of one another but rather interact together in PPI networks. A global analysis of protein interaction networks provides an effective way to elucidate the relationships between proteins. With the recent large-scale identifications of essential genes and the production of large amounts of PPIs in humans, we are able to investigate the topological properties and biological properties of essential genes. However, until recently, no one has ever investigated human essential genes using topological and biological properties. In this study, for the first time, 28 topological properties and 22 biological properties were used to investigate the characteristics of essential and non-essential genes in humans. Most of the properties were statistically discriminative between essential and non-essential genes. The F-score was used to estimate the essentiality of each property. The GO-enrichment analysis was performed to investigate the functions of the essential and non-essential genes. Finally, based on the topological features and the biological characteristics, a machine-learning classifier was constructed to predict the essential genes. The results of the jackknife test and 10-fold cross validation test are encouraging, indicating that our classifier is an effective human essential gene discovery method. PMID:25168893

Yang, Lei; Wang, Jizhe; Wang, Huiping; Lv, Yingli; Zuo, Yongchun; Li, Xiang; Jiang, Wei

2014-11-10

101

Relationship between amino acid properties and functional parameters in olfactory receptors and discrimination of mutants with enhanced specificity  

PubMed Central

Background: Olfactory receptors are key components in signal transduction. Mutations in olfactory receptors alter the odor response, which is a fundamental response of organisms to their immediate environment. Understanding the relationship between odorant response and mutations in olfactory receptors is an important problem in bioinformatics and computational biology. In this work, we have systematically analyzed the relationship between various physical, chemical, energetic and conformational properties of amino acid residues, and the change of odor response/compound's potency/half maximal effective concentration (EC50) due to amino acid substitutions. Results: We observed that both the characteristics of odorant molecule (ligand) and amino acid properties are important for odor response and EC50. Additional information on neighboring and surrounding residues of the mutants enhanced the correlation between amino acid properties and EC50. Further, amino acid properties have been combined systematically using multiple regression techniques and we obtained a correlation of 0.90-0.98 with odor response/EC50 of goldfish, mouse and human olfactory receptors. In addition, we have utilized machine learning methods to discriminate the mutants, which enhance or reduce EC50 values upon mutation and we obtained an accuracy of 93% and 79% for self-consistency and jack-knife tests, respectively. Conclusions: Our analysis provides deep insights for understanding the odor response of olfactory receptor mutants and the present method could be used for identifying the mutants with enhanced specificity. PMID:22594995

2012-01-01

102

A method to distinguish between lysine acetylation and lysine methylation from protein sequences.  

PubMed

Lysine acetylation and methylation are two major post-translational modifications of lysine residues. They play vital roles in both biological and pathological processes. Specific lysine residues in H3 histone protein tails appear to be targeted for either acetylation or methylation. Hence it is very challenging to distinguish between acetylated and methylated lysine residues using computational methods. This work presents a method that incorporates protein sequence information, secondary structure and amino acid properties to differentiate acetyl-lysine from methyl-lysine. We apply an encoding scheme based on grouped weight and position weight amino acid composition to extract sequence information and physicochemical properties around lysine sites. The proposed method achieves an accuracy of 93.3% using a jackknife test. Feature analysis demonstrates that the prediction model with multiple features can take full advantage of the supplementary information from different features to improve classification performance and prediction robustness. Analysis of the characteristics of lysine residues which can be either methylated or acetylated shows that they are more similar to methyl-lysine than to acetyl-lysine. PMID:22796329

Shi, Shao-Ping; Qiu, Jian-Ding; Sun, Xing-Yu; Suo, Sheng-Bao; Huang, Shu-Yun; Liang, Ru-Ping

2012-10-01

103

Optimizing the feature set for a Bayesian network for breast cancer diagnosis using genetic algorithm techniques  

NASA Astrophysics Data System (ADS)

This study investigates the degree to which the performance of Bayesian belief networks (BBNs), for computer-assisted diagnosis of breast cancer, can be improved by optimizing their input feature sets using a genetic algorithm (GA). 421 cases (all women) were used in this study, of which 92 were positive for breast cancer. Each case contained both non-image information and image information derived from mammograms by radiologists. A GA was used to select an optimal subset of features, from a total of 21, to use as the basis for a BBN classifier. The figure-of-merit used in the GA's evaluation of feature subsets was Az, the area under the ROC curve produced by the corresponding BBN classifier. For each feature subset evaluated by the GA, a BBN was developed to classify positive and negative cases. Overall performance of the BBNs was evaluated using a jackknife testing method to calculate Az for their respective ROC curves. The Az value of the BBN incorporating all 21 features was 0.851 ± 0.012. After a 93-generation search, the GA found an optimal feature set with four non-image and four mammographic features, which achieved an Az value of 0.927 ± 0.009. This study suggests that GAs are a viable means to optimize feature sets, and optimizing feature sets can result in significant performance improvements.

Wang, Xiao Hui; Zheng, Bin; Chang, Yuan-Hsiang; Good, Walter F.

1999-05-01
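The figure of merit in the record above is the area under the ROC curve. As a rough illustration of what such an area measures, the sketch below computes the empirical (nonparametric) area: the fraction of positive/negative case pairs in which the positive case receives the higher score, with ties counted as one half. This is an assumption-laden stand-in, since Az in the ROC literature commonly denotes the area under a fitted binormal curve, which this code does not fit. The function name `empirical_auc` and the scores are hypothetical.

```python
def empirical_auc(pos_scores, neg_scores):
    """Empirical ROC area: P(positive score > negative score), ties count 0.5."""
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                total += 1.0
            elif p == n:
                total += 0.5
    return total / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier outputs for 3 positive and 4 negative cases.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2, 0.4]
auc = empirical_auc(pos, neg)  # 10.5 of 12 pairs are ordered correctly
```

In a jackknife evaluation, the scores fed to such a function would come from models that never saw the case being scored.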

104

Parallel Worldline Numerics: Implementation and Error Analysis  

E-print Network

We give an overview of the worldline numerics technique, and discuss the parallel CUDA implementation of a worldline numerics algorithm. In the worldline numerics technique, we wish to generate an ensemble of representative closed-loop particle trajectories, and use these to compute an approximate average value for Wilson loops. We show how this can be done with a specific emphasis on cylindrically symmetric magnetic fields. The fine-grained, massive parallelism provided by the GPU architecture results in considerable speedup in computing Wilson loop averages. Furthermore, we give a brief overview of uncertainty analysis in the worldline numerics method. There are uncertainties from discretizing each loop, and from using a statistical ensemble of representative loops. The former can be minimized so that the latter dominates. However, determining the statistical uncertainties is complicated by two subtleties. Firstly, the distributions generated by the worldline ensembles are highly non-Gaussian, and so the standard error in the mean is not a good measure of the statistical uncertainty. Secondly, because the same ensemble of worldlines is used to compute the Wilson loops at different values of $T$ and $x_\mathrm{cm}$, the uncertainties associated with each computed value of the integrand are strongly correlated. We recommend a form of jackknife analysis which deals with both of these problems.

Dan Mazur; Jeremy S. Heyl

2014-07-28

105

Bacterial community structure and soil properties of a subarctic tundra soil in Council, Alaska.  

PubMed

The subarctic region is highly responsive and vulnerable to climate change. Understanding the structure of subarctic soil microbial communities is essential for predicting the response of the subarctic soil environment to climate change. To determine the composition of the bacterial community and its relationship with soil properties, we investigated the bacterial community structure and properties of surface soil from the moist acidic tussock tundra in Council, Alaska. We collected 70 soil samples with 25-m intervals between sampling points from 0-10 cm to 10-20 cm depths. The bacterial community was analyzed by pyrosequencing of 16S rRNA genes, and the following soil properties were analyzed: soil moisture content (MC), pH, total carbon (TC), total nitrogen (TN), and inorganic nitrogen (NH4+ and NO3-). The community compositions of the two different depths showed that Alphaproteobacteria decreased with soil depth. Among the soil properties measured, soil pH was the most significant factor correlating with bacterial community in both upper and lower-layer soils. Bacterial community similarity based on jackknifed unweighted unifrac distance showed greater similarity across horizontal layers than through the vertical depth. This study showed that soil depth and pH were the most important soil properties determining bacterial community structure of the subarctic tundra soil in Council, Alaska. PMID:24893754

Kim, Hye Min; Jung, Ji Young; Yergeau, Etienne; Hwang, Chung Yeon; Hinzman, Larry; Nam, Sungjin; Hong, Soon Gyu; Kim, Ok-Sun; Chun, Jongsik; Lee, Yoo Kyung

2014-08-01

106

Multivariate seismic calibration for the Novaya Zemlya test site. Report No. 2, 27 June 1991-22 June 1992  

SciTech Connect

Within the last year, Soviet yield data have been acquired by DARPA for over 40 underground nuclear explosions at the Novaya Zemlya Test Site between 1964 and 1990. These yields are compared to previous estimates by other authors, based on observed seismic magnitudes and magnitude-log yield relations transported from other test sites. Several discrepancies in the yield data are noted. Seismic magnitude data, based on NORSAR Lg and P coda, Grafenberg Lg, and a world-wide m_b, have been published by Ringdal and Fyen (1991) for 18 of these events. A similar set of Soviet network magnitudes has been published by Israelsson (1992). Using these data, estimates of the multivariate calibration parameters of the magnitude-log yield relations are computed. An outlier test is applied to the residuals to the lines of best fit. One of the two smallest events is identified as an outlier for every multivariate magnitude combination. A classical confidence interval is presented to estimate future yields, based on estimates of the unknown multivariate calibration parameters. A test of TTBT compliance and a definition of the F-number, based on the confidence interval, are also provided. F-number estimates are obtained for various magnitude combinations by jackknifing. The reliability of the results is discussed, in light of the fact that the data are tightly clustered for 16 of the 18 events.

Fisk, M.D.; Gray, H.L.; Alewine, R.W.; McCartor, G.D.

1992-09-30

107

Oxydemeton-methyl resistance, mechanisms, and associated fitness cost in green peach aphids (Hemiptera: Aphididae).  

PubMed

Susceptibility to oxydemeton-methyl and imidacloprid, and the inhibitory effects of oxydemeton-methyl and some organophosphate compounds on acetylcholinesterase (AChE) and carboxylesterase activity were studied in two populations (Karaj and Rasht) of green peach aphids, Myzus persicae (Sulzer). Results show that the Karaj population was resistant to oxydemeton-methyl but susceptible to imidacloprid. The esterase activity of the resistant and susceptible populations suggests that one of the resistance mechanisms to oxydemeton-methyl was esterase-based. The inhibition assay shows that the AChE of the Karaj population is less sensitive to oxydemeton-methyl and paraoxon derivatives. Regarding the paraoxon derivatives, the smaller paraoxon side chain is more potent against the modified AChE than against the AChE from the susceptible strain. Fertility life table parameters of green peach aphid populations resistant and susceptible to oxydemeton-methyl also were studied under laboratory conditions. The standard errors of the population growth parameters were calculated using the Jackknife method. Results showed that susceptible strain exhibits a significantly higher r(m) than the resistant strain, probably because the resistant strain had a higher generation time than the susceptible strain. These results suggested that the resistant Karaj strain may be less fit than the susceptible strain. PMID:18767757

Ghadamyari, M; Talebi, K; Mizuno, H; Kono, Y

2008-08-01
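The record above uses the jackknife to attach standard errors to population growth parameters. The general delete-one recipe can be sketched as follows: recompute the statistic with each observation removed, then scale the spread of those leave-one-out values by (n - 1)/n. The function name `jackknife_estimate` and the sample values are assumptions for illustration. The statistic here is a plain mean rather than a life-table parameter such as r(m), which would require the full fertility data.

```python
import math


def jackknife_estimate(data, estimator):
    """Delete-one jackknife: return (bias-corrected estimate, standard error)."""
    n = len(data)
    theta_full = estimator(data)
    # Recompute the statistic with each observation left out in turn.
    loo = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    variance = (n - 1) / n * sum((t - mean_loo) ** 2 for t in loo)
    bias_corrected = n * theta_full - (n - 1) * mean_loo
    return bias_corrected, math.sqrt(variance)


def mean(values):
    return sum(values) / len(values)


# Hypothetical observations; for the mean, the jackknife SE equals s / sqrt(n).
estimate, std_err = jackknife_estimate([1.0, 2.0, 3.0, 4.0, 5.0], mean)
```

Passing a different `estimator` (any function of the sample) reuses the same machinery, which is what makes the method attractive for statistics with no closed-form variance.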

108

Modeling habitat suitability for complex species distributions by environmental-distance geometric mean.  

PubMed

This paper presents a new habitat suitability modeling method whose main properties are as follows: (1) It is based on the density of observation points in the environmental space, which enables it to fit complex distributions (e.g. non-Gaussian, bimodal, asymmetrical, etc.). (2) This density is modeled by computing the geometric mean to all observation points, which we show to be a good trade-off between goodness of fit and prediction power. (3) It does not need any absence information, which is generally difficult to collect and of dubious reliability. (4) The environmental space is represented either by an expert-selection of standardized variables or the axes of a factor analysis [in this paper we used the Ecological Niche Factor Analysis (ENFA)]. We first explain the details of the geometric mean algorithm and then we apply it to the bearded vulture (Gypaetus barbatus) habitat in the Swiss Alps. The results are compared to those obtained by the "median algorithm" and tested by jack-knife cross-validation. We also discuss other related algorithms (BIOCLIM, HABITAT, and DOMAIN). All these analyses were implemented into and performed with the ecology-oriented GIS software BIOMAPPER 2.0. The results show the geometric mean to perform better than the median algorithm, as it produces a tighter fit to the bimodal distribution of the bearded vulture in the environmental space. However, the "median algorithm" being quicker, it could be preferred when modeling more usual distributions. PMID:15015699

Hirzel, Alexandre H; Arlettaz, Raphaël

2003-11-01

109

Stock structure of Lake Baikal omul as determined by whole-body morphology  

USGS Publications Warehouse

In Lake Baikal, three morphotypes of omul Coregonus autumnalis migratorius are recognized; the littoral, pelagic, and deep-water forms. Morphotype assignment is difficult, and similar to that encountered in pelagic and deep-water coregonines in the Laurentian Great Lakes. Principal component analysis revealed separation of all three morphotypes based on caudal peduncle length and depth, length and depth of the body between the dorsal and anal fin, and distance between the pectoral and pelvic fins. Strong negative loadings were associated with head measurements. Omul of the same morphotype captured at different locations were classified to location of capture using step-wise discriminant function analysis. Jackknife correct classifications ranged from 43 to 78% for littoral omul from five locations, and 45-86% for pelagic omul from four locations. Patterns of local misclassification of littoral omul suggested that the sub-population structure, hence stock affinity, may be influenced by movements and intermixing of individuals among areas that are joined bathymetrically. Pelagic omul were more distinguishable by site and may support a previous hypothesis of a spawning based rather than a foraging-based sub-population structure. Omul morphotypes may reflect adaptations to both ecological and local environmental conditions, and may have a genetic basis.

Bronte, Charles R.; Fleischer, G.W.; Maistrenko, S.G.; Pronin, N.M.

1999-01-01

110

Differentiating prenatal exposure to methamphetamine and alcohol versus alcohol and not methamphetamine using tensor based brain morphometry and discriminant analysis  

PubMed Central

Here we investigate the effects of prenatal exposure to methamphetamine (MA) on local brain volume using magnetic resonance imaging. Because many who use MA during pregnancy also use alcohol, a known teratogen, we examined whether local brain volumes differed among 61 children (ages 5 to 15), 21 with prenatal MA exposure, 18 with concomitant prenatal alcohol exposure (the MAA group), 13 with heavy prenatal alcohol but not MA exposure (ALC group), and 27 unexposed controls (CON group). Volume reductions were observed in both exposure groups relative to controls in striatal and thalamic regions bilaterally, and right prefrontal and left occipitoparietal cortices. Striatal volume reductions were more severe in the MAA group than in the ALC group, and within the MAA group, a negative correlation between full-scale IQ (FSIQ) scores and caudate volume was observed. Limbic structures including the anterior and posterior cingulate, the inferior frontal gyrus (IFG) and ventral and lateral temporal lobes bilaterally were increased in volume in both exposure groups. Further, cingulate and right IFG volume increases were more pronounced in the MAA than ALC group. Discriminant function analyses using local volume measurements and FSIQ were used to predict group membership, yielding factor scores that correctly classified 72% of participants in jackknife analyses. These findings suggest that striatal and limbic structures, known to be sites of neurotoxicity in adult MA abusers, may be more vulnerable to prenatal MA exposure than alcohol exposure, and that more severe striatal damage is associated with more severe cognitive deficit. PMID:20237258

Sowell, Elizabeth R.; Leow, Alex D.; Bookheimer, Susan Y.; Smith, Lynne M.; O’Connor, Mary J.; Kan, Eric; Rosso, Carly; Houston, Suzanne; Dinov, Ivo D.; Thompson, Paul M.

2010-01-01

111

Protein location prediction using atomic composition and global features of the amino acid sequence  

SciTech Connect

The subcellular location of a protein is useful information in determining its function, screening for drug candidates, vaccine design, annotation of gene products and in selecting relevant proteins for further studies. Computational prediction of subcellular localization deals with predicting the location of a protein from its amino acid sequence. For a computational localization prediction method to be more accurate, it should exploit all possible relevant biological features that contribute to the subcellular localization. In this work, we extracted the biological features from the full length protein sequence to incorporate more biological information. A new biological feature, distribution of atomic composition, is effectively used with multiple physicochemical properties, amino acid composition, three part amino acid composition, and sequence similarity for predicting the subcellular location of the protein. Support Vector Machines are designed for four modules and prediction is made by a weighted voting system. Our system makes predictions with an accuracy of 100, 82.47 and 88.81 for the self-consistency test, jackknife test and independent data test, respectively. Our results provide evidence that the prediction based on the biological features derived from the full length amino acid sequence gives better accuracy than those derived from the N-terminal alone. Considering the features as a distribution within the entire sequence will bring out the underlying property distribution in greater detail to enhance the prediction accuracy.

Cherian, Betsy Sheena, E-mail: betsy.skb@gmail.com [Centre for Bioinformatics, University of Kerala, Kariyavattom Campus, Thiruvananthapuram, Kerala (India)]; Nair, Achuthsankar S. [Centre for Bioinformatics, University of Kerala, Kariyavattom Campus, Thiruvananthapuram, Kerala (India)]

2010-01-22

112

Estimation of age at death from the pubic symphysis and the auricular surface of the ilium using a smoothing procedure.  

PubMed

We discuss here the estimation of age at death from two indicators (pubic symphysis and the sacro-pelvic surface of the ilium) based on four different osteological series from Portugal, Great Britain, South Africa or USA (European origin). These samples and the scoring system of the two indicators were used by Schmitt et al. (2002), applying the methodology proposed by Lucy et al. (1996). In the present work, the same data were processed using a modification of the empirical method proposed by Lucy et al. (2002). The various probability distributions are estimated from training data by using kernel density procedures and jackknife methodology. Bayes's theorem is then used to produce the posterior distribution from which point and interval estimates may be made. This statistical approach reduces the bias of the estimates to less than 70% of what was obtained by the initial method. This reduction goes up to 52% if the sex of the individual is known. The approach also produces an age for all the individuals, which improves age-at-death assessment. PMID:22206714

Martins, Rui; Oliveira, Paulo Eduardo; Schmitt, Aurore

2012-06-10

113

Variables influencing the presence of subyearling fall Chinook salmon in shoreline habitats of the Hanford Reach, Columbia River  

USGS Publications Warehouse

Little information currently exists on habitat use by subyearling fall Chinook salmon Oncorhynchus tshawytscha rearing in large, main-stem habitats. We collected habitat use information on subyearlings in the Hanford Reach of the Columbia River during May 1994 and April-May 1995 using point abundance electrofishing. We analyzed measures of physical habitat using logistic regression to predict fish presence and absence in shoreline habitats. The difference between water temperature at the point of sampling and in the main river channel was the most important variable for predicting the presence and absence of subyearlings. Mean water velocities of 45 cm/s or less and habitats with low lateral bank slopes were also associated with a greater likelihood of subyearling presence. Intermediate-sized gravel and cobble substrates were significant predictors of fish presence, but small (256-mm) substrates were not. Our rearing model was accurate at predicting fish presence and absence using jackknifing (80% correct) and classification of observations from an independent data set (76% correct). The habitat requirements of fall Chinook salmon in the Hanford Reach are similar to those reported for juvenile Chinook salmon in smaller systems but are met in functionally different ways in a large river.

Tiffan, K. F.; Clark, L. O.; Garland, R. D.; Rondorf, D. W.

2006-01-01

114

Analysis of clustered data in receiver operating characteristic studies.  

PubMed

Clustered data is not simply correlated data, but has its own unique aspects. In this paper, various methods for correlated receiver operating characteristic (ROC) curve data that have been extended specifically to clustered data are reviewed. For those methods that have not yet been extended, suggestions for their application to clustered ROC studies are provided. Various methods with respect to their ability to meet either of two objectives of the analysis of clustered ROC data are compared to consider a variety of ROC indices and their accessibility to researchers. The available statistical methods for clustered data vary in the range of indices that can be considered and in their accessibility to researchers. Parametric models permit all indices to be considered but, owing to computational complexity, are the least accessible of available methods. Nonparametric methods are much more accessible, but only permit estimation and inference about ROC curve area. The jackknife method is the most accessible and permits any index to be considered. Future development of methods for clustered ROC studies should consider the continuation ratio model, which will permit the application of widely available software for the analysis of mixed generalized linear models. Another area of development should be in the adoption of bootstrapping methods to clustered ROC data. PMID:9871950

Beam, C A

1998-12-01

115

A pseudo-Thellier relative palaeointensity record, and rock magnetic and geochemical parameters in relation to climate during the last 276 kyr in the Azores region  

NASA Astrophysics Data System (ADS)

In the pseudo-Thellier method for relative palaeointensity determinations (Tauxe et al. 1995) the slope of the NRM intensity left after AF demagnetization versus ARM intensity gained at the same peak field is used as a palaeointensity measure. We tested this method on a marine core from the Azores, spanning the last 276 kyr. We compared the pseudo-Thellier palaeointensity record with the conventional record obtained earlier by Lehman et al. (1996), who normalized NRM by SIRM. The two records show similar features: intensity lows with deviating palaeomagnetic directions at 40-45 ka and at 180-190 ka. The first interval is associated with the Laschamps excursion, while the 180-190 ka low represents the Iceland Basin excursion (Channell et al. 1997). The pseudo-Thellier method, in combination with a jackknife resampling scheme, provides error estimates on the palaeointensity. Spectral analysis of the rock magnetic parameters and the palaeointensity estimates shows orbitally forced periods, particularly 23 kyr for climatic precession. This suggests that palaeointensity is still slightly contaminated by climate. Fuzzy c-means cluster analysis of rock magnetic and geochemical parameters yields a seven-cluster model of predominantly calcareous clusters and detrital clusters. The clusters show a strong correlation with climate, for example samples from detrital clusters predominantly appear during rapid warming. Although both the pseudo-Thellier palaeointensity m_a and fuzzy clusters show climatic influences, we have not been able to find an unambiguous connection between the clusters and m_a.

Kruiver, P. P.; Kok, Y. S.; Dekkers, M. J.; Langereis, C. G.; Laj, C.

1999-03-01
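
The abstract above describes a slope used as a palaeointensity measure, with jackknife resampling supplying its error estimate. A minimal sketch of that idea follows, assuming a delete-one jackknife on an ordinary least-squares slope; the abstract does not specify the authors' resampling scheme or fitting method, and the function names and demagnetization values here are hypothetical.

```python
import math


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def jackknife_slope(x, y):
    """Return (slope on all points, delete-one jackknife standard error)."""
    n = len(x)
    full_slope = ols_slope(x, y)
    # Refit the slope n times, each time leaving out one (x, y) pair.
    loo = [ols_slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    variance = (n - 1) / n * sum((s - mean_loo) ** 2 for s in loo)
    return full_slope, math.sqrt(variance)


# Hypothetical data: ARM gained (x) and NRM left (y) at matching AF steps.
arm_gained = [0.0, 1.0, 2.0, 3.0, 4.0]
nrm_left = [10.0, 9.6, 8.9, 8.6, 8.0]
slope, std_err = jackknife_slope(arm_gained, nrm_left)
```

The sign of the slope is negative here because NRM decreases as ARM is gained; taking its absolute value as the intensity measure is a convention this sketch leaves to the caller.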

116

PSSP-RFE: accurate prediction of protein structural class by recursive feature extraction from PSI-BLAST profile, physical-chemical property and functional annotations.  

PubMed

Protein structure prediction is critical to functional annotation of the massively accumulated biological sequences, which prompts an imperative need for the development of high-throughput technologies. As a first and key step in protein structure prediction, protein structural class prediction becomes an increasingly challenging task. Amongst most homological-based approaches, the accuracies of protein structural class prediction are sufficiently high for high similarity datasets, but still far from being satisfactory for low similarity datasets, i.e., below 40% in pairwise sequence similarity. Therefore, we present a novel method for accurate and reliable protein structural class prediction for both high and low similarity datasets. This method is based on Support Vector Machine (SVM) in conjunction with integrated features from position-specific score matrix (PSSM), PROFEAT and Gene Ontology (GO). A feature selection approach, SVM-RFE, is also used to rank the integrated feature vectors through recursively removing the feature with the lowest ranking score. The definitive top features selected by SVM-RFE are input into the SVM engines to predict the structural class of a query protein. To validate our method, jackknife tests were applied to seven widely used benchmark datasets, reaching overall accuracies between 84.61% and 99.79%, which are significantly higher than those achieved by state-of-the-art tools. These results suggest that our method could serve as an accurate and cost-effective alternative to existing methods in protein structural classification, especially for low similarity datasets. PMID:24675610

Li, Liqi; Cui, Xiang; Yu, Sanjiu; Zhang, Yuan; Luo, Zhong; Yang, Hua; Zhou, Yue; Zheng, Xiaoqi

2014-01-01
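
The "jackknife test" reported in the abstract above, and in several other records in this listing, is commonly understood as leave-one-out validation: each sample is held out in turn, the predictor is trained on the rest, and the held-out sample is classified. The sketch below illustrates only that evaluation loop. The nearest-centroid classifier is a stand-in chosen for brevity, not the SVM/SVM-RFE pipeline of the paper, and all names and data are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, query):
    """Predict the label whose class centroid is closest to `query`."""
    sums, counts = {}, {}
    for features, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [value / counts[label] for value in acc]
        dist = sum((q - c) ** 2 for q, c in zip(query, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(X, y, predict=nearest_centroid_predict):
    """Overall accuracy when every sample is held out exactly once."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            hits += 1
    return hits / len(X)


# Hypothetical two-class feature vectors.
X = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y = ["a", "a", "a", "b", "b", "b"]
accuracy = jackknife_accuracy(X, y)
```

Any predictor with the same `(train_X, train_y, query)` signature can be passed as `predict`, which keeps the evaluation loop independent of the model being tested.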

117

Human DNA Ligase III Recognizes DNA Ends by Dynamic Switching between Two DNA-Bound States  

SciTech Connect

Human DNA ligase III has essential functions in nuclear and mitochondrial DNA replication and repair and contains a PARP-like zinc finger (ZnF) that increases the extent of DNA nick joining and intermolecular DNA ligation, yet the bases for ligase III specificity and structural variation among human ligases are not understood. Here combined crystal structure and small-angle X-ray scattering results reveal dynamic switching between two nick-binding components of ligase III: the ZnF-DNA binding domain (DBD) forms a crescent-shaped surface used for DNA end recognition which switches to a ring formed by the nucleotidyl transferase (NTase) and OB-fold (OBD) domains for catalysis. Structural and mutational analyses indicate that high flexibility and distinct DNA binding domain features in ligase III assist both nick sensing and the transition from nick sensing by the ZnF to nick joining by the catalytic core. The collective results support a 'jackknife model' in which the ZnF loads ligase III onto nicked DNA and conformational changes deliver DNA into the active site. This work has implications for the biological specificity of DNA ligases and functions of PARP-like zinc fingers.

Cotner-Gohara, Elizabeth; Kim, In-Kwon; Hammel, Michal; Tainer, John A.; Tomkinson, Alan E.; Ellenberger, Tom (Scripps); (Maryland-MED); (WU-MED); (LBNL)

2010-09-13

118

Identification of Essential Proteins Based on Ranking Edge-Weights in Protein-Protein Interaction Networks  

PubMed Central

Essential proteins are those that are indispensable to cellular survival and development. Existing methods for essential protein identification generally rely on knock-out experiments and/or the relative density of their interactions (edges) with other proteins in a Protein-Protein Interaction (PPI) network. Here, we present a computational method, called EW, to first rank protein-protein interactions in terms of their Edge Weights, and then identify sub-PPI-networks consisting of only the highly-ranked edges and predict their proteins as essential proteins. We have applied this method to publicly-available PPI data on Saccharomyces cerevisiae (Yeast) and Escherichia coli (E. coli) for essential protein identification, and demonstrated that EW achieves better performance than the state-of-the-art methods in terms of the precision-recall and Jackknife measures. The highly-ranked protein-protein interactions by our prediction tend to be biologically significant in both the Yeast and E. coli PPI networks. Further analyses on systematically perturbed Yeast and E. coli PPI networks through randomly deleting edges demonstrate that the proposed method is robust and the top-ranked edges tend to be more associated with known essential proteins than the lowly-ranked edges. PMID:25268881

Wang, Yan; Sun, Huiyan; Du, Wei; Blanzieri, Enrico; Viero, Gabriella; Xu, Ying; Liang, Yanchun

2014-01-01

119

Quality assessment of High Angular Resolution Diffusion Imaging data using bootstrap on Q-ball reconstruction  

PubMed Central

Purpose: To develop a bootstrap method to assess the quality of High Angular Resolution Diffusion Imaging (HARDI) data using Q-Ball imaging (QBI) reconstruction. Materials and Methods: HARDI data were re-shuffled using regular bootstrap with jackknife sampling. For each bootstrap dataset, the diffusion orientation distribution function (ODF) was estimated voxel-wise using QBI reconstruction based on spherical harmonics functions. The reproducibility of the ODF was assessed using the Jensen-Shannon divergence (JSD) and the angular confidence interval was derived for the first and the second ODF maxima. The sensitivity of the bootstrap method was evaluated on a human subject by adding synthetic noise to the data, by acquiring a map of image signal-to-noise ratio (SNR) and by varying the echo time and the b-value. Results: The JSD was directly linked to the image SNR. The impact of echo times and b-values was reflected by both the JSD and the angular confidence interval, proving the usefulness of the bootstrap method to evaluate specific features of HARDI data. Conclusion: The bootstrap method can effectively assess the quality of HARDI data and can be used to evaluate new hardware and pulse sequences, perform multi-fiber probabilistic tractography, and provide reliability metrics to support clinical studies. PMID:21509879

Cohen-Adad, J.; Descoteaux, M.; Wald, L.L.

2011-01-01
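
The abstract above uses the Jensen-Shannon divergence (JSD) to measure how reproducible the ODF is across resampled datasets. As a rough illustration, the sketch below computes the JSD between two discrete distributions. It assumes each ODF has been sampled on the same set of directions and can be normalized to sum to one; the base-2 logarithm (which bounds the result by 1) and the function names are choices made here, not details taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD between two non-negative vectors of equal length."""
    total_p, total_q = sum(p), sum(q)
    p = [value / total_p for value in p]
    q = [value / total_q for value in q]
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical ODF samples from two resampled datasets at the same directions.
odf_a = [0.10, 0.40, 0.30, 0.20]
odf_b = [0.15, 0.35, 0.30, 0.20]
divergence = jensen_shannon_divergence(odf_a, odf_b)
```

Identical inputs give 0 and distributions with no overlap give 1, so smaller values indicate a more reproducible ODF under this measure.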

120

Analysis of secondary outcomes in nested case-control study designs.  

PubMed

One of the main perceived advantages of using a case-cohort design compared with a nested case-control design in an epidemiologic study is the ability to evaluate with the same subcohort outcomes other than the primary outcome of interest. In this paper, we show that valid inferences about secondary outcomes can also be achieved in nested case-control studies by using the inclusion probability weighting method in combination with an approximate jackknife standard error that can be computed using existing software. Simulation studies demonstrate that when the sample size is sufficient, this approach yields valid type 1 error and coverage rates for the analysis of secondary outcomes in nested case-control designs. Interestingly, the statistical power of the nested case-control design was comparable with that of the case-cohort design when the primary and secondary outcomes were positively correlated. The proposed method is illustrated with the data from a cohort in Cardiovascular Health Study to study the association of C-reactive protein levels and the incidence of congestive heart failure. Copyright © 2014 John Wiley & Sons, Ltd. PMID:24919979

Kim, Ryung S; Kaplan, Robert C

2014-10-30

121

Prediction of Drugs Target Groups Based on ChEBI Ontology  

PubMed Central

Most drugs have beneficial as well as adverse effects and exert their biological functions by adjusting and altering the functions of their target proteins. Thus, knowledge of drug target proteins is essential for the improvement of therapeutic effects and mitigation of undesirable side effects. In this study, we proposed a novel prediction method based on drug/compound ontology information extracted from ChEBI to identify drug target groups, from which the kinds of functions of a drug may be deduced. By collecting data in KEGG, a benchmark dataset consisting of 876 drugs, categorized into four target groups, was constructed. To evaluate the method more thoroughly, the benchmark dataset was divided into a training dataset and an independent test dataset. By the jackknife test, the overall prediction accuracy on the training dataset was 83.12%, while it was 87.50% on the test dataset, indicating that the predictor generalizes well. The good performance of the method indicates that the ontology information of the drugs contains rich information about their target groups, and the study may inspire solutions to problems of this sort and help bridge the gap between ChEBI ontology and drug target groups. PMID:24350241

Gao, Yu-Fei; Chen, Lei; Huang, Guo-Hua; Zhang, Tao; Feng, Kai-Yan; Li, Hai-Peng; Jiang, Yang

2013-01-01

122

Prediction of space sickness in astronauts from preflight fluid, electrolyte, and cardiovascular variables and Weightless Environmental Training Facility (WETF) training  

NASA Technical Reports Server (NTRS)

Nine preflight variables related to fluid, electrolyte, and cardiovascular status from 64 first-time Shuttle crewmembers were differentially weighted by discrimination analysis to predict the incidence and severity of each crewmember's space sickness as rated by NASA flight surgeons. The nine variables are serum uric acid, red cell count, environmental temperature at the launch site, serum phosphate, urine osmolality, serum thyroxine, sitting systolic blood pressure, calculated blood volume, and serum chloride. Using two methods of cross-validation on the original samples (jackknife and a stratified random subsample), these variables enable the prediction of space sickness incidence (NONE or SICK) with 80 percent success and space sickness severity (NONE, MILD, MODERATE, or SEVERE) with 59 percent success by one method of cross-validation and 67 percent by another method. Addition of a tenth variable, hours spent in the Weightlessness Environment Training Facility (WETF), did not improve the prediction of space sickness incidence but did improve the prediction of space sickness severity to 66 percent success by the first method of cross-validation of original samples and to 71 percent by the second method. Results to date suggest the presence of predisposing physiologic factors to space sickness that implicate fluid shift etiology. The data also suggest that prior exposure to fluid shift during WETF training may produce some circulatory pre-adaptation to fluid shifts in weightlessness that results in a reduction of space sickness severity.

Simanonok, K.; Mosely, E.; Charles, J.

1992-01-01

123

Driver assistance system for passive multi-trailer vehicles with haptic steering limitations on the leading unit.  

PubMed

Driving vehicles with one or more passive trailers has difficulties in both forward and backward motion due to inter-unit collisions, jackknife, and lack of visibility. Consequently, advanced driver assistance systems (ADAS) for multi-trailer combinations can be beneficial to accident avoidance as well as to driver comfort. The ADAS proposed in this paper aims to prevent unsafe steering commands by means of a haptic handwheel. Furthermore, when driving in reverse, the steering-wheel and pedals can be used as if the vehicle was driven from the back of the last trailer with visual aid from a rear-view camera. This solution, which can be implemented in drive-by-wire vehicles with hitch angle sensors, profits from two methods previously developed by the authors: safe steering by applying a curvature limitation to the leading unit, and a virtual tractor concept for backward motion that includes the complex case of set-point propagation through on-axle hitches. The paper addresses system requirements and provides implementation details to tele-operate two different off- and on-axle combinations of a tracked mobile robot pulling and pushing two dissimilar trailers. PMID:23552102

Morales, Jesús; Mandow, Anthony; Martínez, Jorge L; Reina, Antonio J; García-Cerezo, Alfonso

2013-01-01

124

Prospective Clinical Study of 551 Cases of Liposuction and Abdominoplasty Performed Individually and in Combination  

PubMed Central

Background: Despite the popularity of these procedures, there are limited published prospective studies evaluating liposuction and abdominoplasty. Lipoabdominoplasty is a subject of recent attention. Several investigators have recommended alternative techniques that preserve the Scarpa fascia in an effort to reduce complications, particularly the risk of seromas. Methods: Over a 5-year period, 551 consecutive patients were treated with ultrasonic liposuction alone (n = 384), liposuction/abdominoplasty (n = 150), or abdominoplasty alone (n = 17). In lipoabdominoplasties, the abdomen and flanks were first treated with liposuction. A traditional flap dissection was used for all abdominoplasties. Scalpel dissection was used rather than electrodissection. A supine “jackknife” position was used in surgery to provide maximum hip flexion, allowing a secure deep fascial repair. Results: The complication rate after liposuction was 4.2% vs 50% for patients treated with an abdominoplasty. Approximately half of the abdominoplasty complications were minor scar deformities, including widened umbilical scars (17.3%) that were revised. The seroma rate after abdominoplasties was 5.4%; there were no seromas after liposuction alone. Conclusions: Lipoabdominoplasty may be performed safely, so that patients may benefit from both modalities. The seroma rate is reduced by avoiding electrodissection, making Scarpa fascia preservation a moot point. A deep fascial repair keeps the abdominoplasty scar within the bikini line. Deep venous thrombosis and other complications may be minimized with precautions that do not include anticoagulation.

2013-01-01

125

Interpolation of Global Monthly Rain-Gauge Observations for Climate Change Analysis  

NASA Astrophysics Data System (ADS)

Monthly precipitation sums are observed at thousands of meteorological stations worldwide. Different institutes (e.g. the Global Precipitation Climatology Centre, GPCC, and the Climatic Research Unit, CRU, of the University of East Anglia) interpolate these observations to regular grids. These data are used widely in climate research, e.g. for the investigation of the hydrological cycle and climate change. Results of the interpolation depend on the station density, which varies considerably around the globe. They also depend on the interpolation method used (e.g. Ordinary Kriging and Shepard's Method). These methods are general interpolation methods that do not take into account the specifics of precipitation. The question discussed in this presentation is whether we can do better by using an interpolation strategy especially designed for monthly precipitation observations. Based on a dense local dataset (one station per 109 km²) and a less dense global dataset (one station per 27,000 km²) of 50 years of monthly precipitation observations, various interpolation strategies are compared. These include the interpolation of transformed variables and the consideration of local spatial correlation of precipitation as well as data quality. The jackknife error is used to compare the different strategies. The major result is that some strategies used so far are far from optimal.

Grieser, Jürgen

2014-05-01

126

A Multi-label Classifier for Prediction Membrane Protein Functional Types in Animal.  

PubMed

Membrane proteins are important components of the cell membrane. Identifying the type(s) of a membrane protein from its sequence is very important because the type is closely correlated with the protein's functions. According to previous studies, membrane proteins can be divided into the following eight types: single-pass type I, single-pass type II, single-pass type III, single-pass type IV, multipass, lipid-anchor, GPI-anchor, and peripheral membrane protein. With the avalanche of newly found protein sequences in the post-genomic age, it is urgent to develop an automatic and effective computational method for rapid and reliable prediction of the types of membrane proteins. At present, most existing methods are based on the assumption that a membrane protein belongs to only one type. In fact, a membrane protein may simultaneously belong to two or more functional types. In this study, a new method that hybridizes the pseudo amino acid composition with a multi-label algorithm called LIFT (multi-label learning with label-specific features) was proposed to predict the functional types of both singleplex and multiplex animal membrane proteins. Experimental results of the jackknife test on a stringent benchmark dataset of membrane proteins show that the absolute-true rate obtained was 0.6342, indicating that our approach is quite promising. It may become a useful high-throughput tool, or at least play a complementary role to the existing predictors in identifying functional types of membrane proteins. PMID:25107302

Zou, Hong-Liang

2014-11-01

127

Prediction of Body Fluids where Proteins are Secreted into Based on Protein Interaction Network  

PubMed Central

Determining the body fluids into which proteins can be secreted is important for protein function annotation and disease biomarker discovery. In this study, we developed a network-based method to predict which kinds of body fluids human proteins can be secreted into. For a newly constructed benchmark dataset that consists of 529 human-secreted proteins, the prediction accuracy for the most likely body fluid location predicted by our method via the jackknife test was 79.02%, significantly higher than the success rate by a random guess (29.36%). The likelihood that the predicted body fluids of the first four orders contain all the true body fluids into which the proteins can be secreted is 62.94%. Our method was further demonstrated with two independent datasets: one contains 57 proteins that can be secreted into blood, while the other contains 61 proteins that can be secreted into plasma/serum and are possible biomarkers associated with various cancers. Of the 57 proteins in the first dataset, 55 were correctly predicted as blood-secreted proteins. Of the 61 proteins in the second dataset, 58 were predicted to be most likely secreted into plasma/serum. These encouraging results indicate that the network-based prediction method is quite promising. It is anticipated that the method will benefit the relevant areas for both basic research and drug development. PMID:21829572

Hu, Le-Le; Huang, Tao; Cai, Yu-Dong; Chou, Kuo-Chen

2011-01-01

128

Tributaries under Mediterranean climate: their role in macrobenthos diversity maintenance.  

PubMed

The taxonomic richness erosion and the role of tributaries in the maintenance of the taxonomic richness were considered in a Mediterranean catchment in southeastern France. Nine stations were chosen along the Arc stream (three stations downstream from an organic effluent and one station upstream from the pollution source) and on two groups of tributaries (three intermittent and two perennial). High biodiversity erosion was noticed in the main stem, revealing diffuse sources of pollution added to the expected effect of the localized organic pollution. Jackknife richness estimator and beta diversity indicated that the intermittent tributaries had the highest richness values and harboured 70% of the taxa recorded at the catchment scale. The intermittent flow tributaries seem to play a major role in maintaining the taxonomic richness in such catchments, highly impacted by anthropogenic activities. The detailed examination and the preservation of these ecosystems should be an important step in catchment management, and support the need for catchment-scale conservation of freshwater invertebrates. PMID:18558378

Maasri, Alain; Dumont, Bernard; Claret, Cécile; Archambaud-Suard, Gaït; Gandouin, Emmanuel; Franquet, Evelyne

2008-07-01
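
The abstract above relies on a jackknife richness estimator but does not say which order was used. A minimal sketch of the common first-order form follows, in which the estimate is the observed richness plus a correction based on the number of taxa found in exactly one sample: S_jack1 = S_obs + Q1 (n - 1) / n, with n the number of samples. This matches the description in the species-richness record near the top of this listing; the function name and taxa below are hypothetical.

```python
def jackknife1_richness(samples):
    """First-order jackknife richness from per-sample taxon lists."""
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    # Count in how many samples each taxon occurs (incidence, not abundance).
    incidence = {}
    for sample in samples:
        for taxon in set(sample):
            incidence[taxon] = incidence.get(taxon, 0) + 1
    observed = len(incidence)
    uniques = sum(1 for count in incidence.values() if count == 1)
    return observed + uniques * (n - 1) / n


# Hypothetical taxa recorded at three stations.
stations = [
    ["Baetis", "Gammarus"],
    ["Baetis", "Hydropsyche"],
    ["Baetis", "Gammarus", "Simulium"],
]
estimate = jackknife1_richness(stations)
```

With a single sample the correction term vanishes and the estimate equals the observed richness, which is why several samples per site are needed for the estimator to be informative.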

129

Phylogenetic analysis of the Triticeae (Poaceae) based on rpoA sequence data.  

PubMed

A phylogenetic analysis was conducted on 31 diploid species representing 21 of the 24 monogenomic genera of the Triticeae. The data used were derived from a 1343- to 1358-bp region of the plastid genome spanning the entire rpoA gene plus minor parts of the petD and rps11 genes and the two intergenic spacers surrounding rpoA. Bromus inermis (Bromeae) was used as an outgroup. A total of 68 variable sites, 25 of them phylogenetically informative, and seven length mutations were detected. The length mutations occurred in the noncoding regions. Phylogenetic analyses were performed on the whole data set and on various subsets. The analysis of the unweighted data resulted in 48 equally parsimonious trees (length 98, CI = 0.88, RI = 0.92, ti = 0.25). A parsimony jackknife analysis proved several clades to be well supported. The effect of transition/transversion weighting was also investigated. In general, congruence with other data sets was negatively affected by weighting. The preferred phylogenetic hypothesis is congruent with a phylogeny based on plastid RFLP data including the same taxa, but it is largely incongruent with phylogenies derived from nuclear rDNA and morphology. PMID:9126564

Petersen, G; Seberg, O

1997-04-01

130

North American Tropical Cyclone Landfall and SST: A Statistical Model Study  

NASA Technical Reports Server (NTRS)

A statistical-stochastic model of the complete life cycle of North Atlantic (NA) tropical cyclones (TCs) is used to examine the relationship between climate and landfall rates along the North American Atlantic and Gulf Coasts. The model draws on archived data of TCs throughout the North Atlantic to estimate landfall rates at high geographic resolution as a function of the ENSO state and one of two different measures of sea surface temperature (SST): 1) SST averaged over the NA subtropics and the hurricane season and 2) this SST relative to the seasonal global subtropical mean SST (termed relSST). Here, the authors focus on SST by holding ENSO to a neutral state. Jackknife uncertainty tests are employed to test the significance of SST and relSST landfall relationships. There are more TC and major hurricane landfalls overall in warm years than cold, using either SST or relSST, primarily due to a basinwide increase in the number of storms. The signal along the coast, however, is complex. Some regions have large and significant sensitivity (e.g., an approximate doubling of annual major hurricane landfall probability on Texas from -2 to +2 standard deviations in relSST), while other regions have no significant sensitivity (e.g., the U.S. mid-Atlantic and Northeast coasts). This geographic structure is due to both shifts in the regions of primary TC genesis and shifts in TC propagation.

Hall, Timothy; Yonekura, Emmi

2013-01-01

131

Accurate prediction of protein structural class.  

PubMed

Because of the increasing gap between the data from sequencing and structural genomics, the accurate prediction of the structural class of a protein domain solely from the primary sequence has remained a challenging problem in structural biology. Traditional sequence-based predictors generally select several sequence features and then feed them directly into a classification program to identify the structural class. The current best sequence-based predictor achieved an overall accuracy of 74.1% when tested on a widely used, non-homologous benchmark dataset 25PDB. In the present work, we built a multiple linear regression (MLR) model to convert the 440-dimensional (440D) sequence feature vector extracted from the Position Specific Scoring Matrix (PSSM) of a protein domain to a 4-dimensional (4D) structural feature vector, which could then be used to predict the four major structural classes. We performed 10-fold cross-validation and jackknife tests of the method on a large non-homologous dataset containing 8,244 domains distributed among the four major classes. Our approach outperformed all of the existing sequence-based methods and had an overall accuracy of 83.1%, which is even higher than the results of those predicted secondary structure-based methods. PMID:22723837

Xia, Xia-Yu; Ge, Meng; Wang, Zhi-Xin; Pan, Xian-Ming

2012-01-01

132

Using near-infrared overtone regions to determine biodiesel content and adulteration of diesel/biodiesel blends with vegetable oils.  

PubMed

This work evaluates the use of near-infrared (NIR) overtone regions to determine biodiesel content, as well as potential adulteration with vegetable oil, in diesel/biodiesel blends. For this purpose, NIR spectra (12,000-6300 cm⁻¹) were obtained using three different optical path lengths: 10 mm, 20 mm and 50 mm. Two strategies of regression with variable selection were evaluated: partial least squares (PLS) with significant regression coefficients selected by the jackknife algorithm (PLS/JK) and multiple linear regression (MLR) with wavenumber selection by the successive projections algorithm (MLR/SPA). For comparison, the results obtained by using PLS full-spectrum models are also presented. In addition, the performance of models using NIR (1.0 mm optical path length, 9000-4000 cm⁻¹) and MIR (UATR - universal attenuated total reflectance, 4000-650 cm⁻¹) spectral regions was also investigated. The results demonstrated the potential of overtone regions with the MLR/SPA regression strategy to determine biodiesel content in diesel/biodiesel blends, considering the possible presence of raw oil as a contaminant. This strategy is simple, fast and uses fewer spectral variables. Considering this, the overtone regions can be useful to develop low-cost instruments for quality control of diesel/biodiesel blends, given the lower cost of optical components for this spectral region. PMID:22284883

de Vasconcelos, Fernanda Vera Cruz; de Souza, Paulo Fernandes Barbosa; Pimentel, Maria Fernanda; Pontes, Márcio José Coelho; Pereira, Claudete Fernandes

2012-02-24

133

Discriminating lysosomal membrane protein types using dynamic neural network.  

PubMed

This work presents a dynamic artificial neural network methodology that classifies proteins into their classes from their sequences alone: the lysosomal membrane protein classes and the various other membrane protein classes. In this paper, a neural network-based lysosomal-associated membrane protein type prediction system is proposed. Different protein sequence representations are fused to extract the features of a protein sequence, comprising seven feature sets: amino acid (AA) composition, sequence length, hydrophobic group, electronic group, sum of hydrophobicity, R-group, and dipeptide composition. To reduce the dimensionality of the large feature vector, we applied principal component analysis. The probabilistic neural network, generalized regression neural network, and Elman regression neural network (RNN) are used as classifiers and compared with the layer recurrent network (LRN), a dynamic network. Dynamic networks have memory, i.e., their output depends not only on the input but also on previous outputs. Accordingly, the LRN classifier achieves the highest accuracy among the artificial neural networks compared. The overall accuracy of jackknife cross-validation is 93.2% for the data-set. These predicted results suggest that the method can be effectively applied to discriminate lysosomal-associated membrane proteins from other membrane proteins (Type-I, Outer membrane proteins, GPI-Anchored) and globular proteins, and they also indicate that the fused protein sequence representation can better reflect the core features of membrane proteins than the classical AA composition. PMID:23968467

Tripathi, Vijay; Gupta, Dwijendra Kumar

2014-01-01

134

Predicting Chemical Toxicity Effects Based on Chemical-Chemical Interactions  

PubMed Central

Toxicity is a major contributor to high attrition rates of new chemical entities in drug discoveries. In this study, an order-classifier was built to predict a series of toxic effects based on data concerning chemical-chemical interactions under the assumption that interactive compounds are more likely to share similar toxicity profiles. According to their interaction confidence scores, the order from the most likely toxicity to the least was obtained for each compound. Ten test groups, each of them containing one training dataset and one test dataset, were constructed from a benchmark dataset consisting of 17,233 compounds. By a Jackknife test on each of these test groups, the 1st order prediction accuracies of the training dataset and the test dataset were all approximately 79.50%, substantially higher than the rate of 25.43% achieved by random guesses. Encouraged by the promising results, we expect that our method will become a useful tool in screening out drugs with high toxicity. PMID:23457578

Zhang, Jian; Feng, Kai-Rui; Zheng, Ming-Yue; Cai, Yu-Dong

2013-01-01

135

[Suitable distribution area of Eriosoma lanigerum (Hausmann) in China and related affecting factors].  

PubMed

Eriosoma lanigerum (Hausmann) is an important quarantine insect of apple trees, and usually causes serious economic losses in apple production areas every year. Predicting the suitable distribution area of E. lanigerum and the environmental factors affecting the insect's colonization and dispersal could provide references for forecasting the insect's distribution area, establishing effective quarantine measures, and making control decisions. In this study, the niche model MaxEnt and ArcGIS were applied to analyze and predict the suitable distribution area of E. lanigerum, ROC was used to evaluate the prediction model and the prediction results, and jackknife analysis was used to identify the most important environmental factors affecting the occurrence of E. lanigerum. The results showed that E. lanigerum had a wide distribution area in China, its suitable distribution index was the highest in Liaoning, Shandong, Henan, Hebei, Anhui, Jiangsu, and Shaanxi provinces, and the most important environmental factors affecting the occurrence of E. lanigerum were temperature-dependent factors. PMID:22803484

Hong, Bo; Wang, Ying-Lun; Zhao, Hui-Yan

2012-04-01

136

Predicting Metabolic Pathways of Small Molecules and Enzymes Based on Interaction Information of Chemicals and Proteins  

PubMed Central

Metabolic pathway analysis, one of the most important fields in biochemistry, is pivotal to understanding the maintenance and modulation of the functions of an organism. Good comprehension of metabolic pathways is critical to understanding the mechanisms of some fundamental biological processes. Given a small molecule or an enzyme, how may one identify the metabolic pathways in which it may participate? Answering such a question is a first important step in understanding a metabolic pathway system. By utilizing the information provided by chemical-chemical interactions, chemical-protein interactions, and protein-protein interactions, a novel method was proposed by which to allocate small molecules and enzymes to 11 major classes of metabolic pathways. A benchmark dataset consisting of 3,348 small molecules and 654 enzymes of yeast was constructed to test the method. It was observed that the first order prediction accuracy evaluated by the jackknife test was 79.56% in identifying the small molecules and enzymes in a benchmark dataset. Our method may become a useful vehicle in predicting the metabolic pathways of small molecules and enzymes, providing a basis for some further analysis of the pathway systems. PMID:23029334

Feng, Kai-Yan; Huang, Tao; Jiang, Yang

2012-01-01

137

Multiple Subject Barycentric Discriminant Analysis (MUSUBADA): How to Assign Scans to Categories without Using Spatial Normalization  

PubMed Central

We present a new discriminant analysis (DA) method called Multiple Subject Barycentric Discriminant Analysis (MUSUBADA) suited for analyzing fMRI data because it handles datasets with multiple participants that each provide a different number of variables (i.e., voxels) that are themselves grouped into regions of interest (ROIs). Like DA, MUSUBADA (1) assigns observations to predefined categories, (2) gives factorial maps displaying observations and categories, and (3) optimally assigns observations to categories. MUSUBADA handles cases with more variables than observations and can project portions of the data table (e.g., subtables, which can represent participants or ROIs) on the factorial maps. Therefore MUSUBADA can analyze datasets with different voxel numbers per participant and so does not require spatial normalization. MUSUBADA statistical inferences are implemented with cross-validation techniques (e.g., jackknife and bootstrap); its performance is evaluated with confusion matrices (for fixed and random models) and represented with prediction, tolerance, and confidence intervals. We present an example where we predict the categories (houses, shoes, chairs, and human, monkey, and dog faces) of images watched by participants whose brains were scanned. This example corresponds to a DA question in which the data table is made of subtables (one per subject) and with more variables than observations. PMID:22548125

Abdi, Hervé; Williams, Lynne J.; Connolly, Andrew C.; Gobbini, M. Ida; Dunlop, Joseph P.; Haxby, James V.

2012-01-01

138

Limited sampling strategy for estimation of amikacin optimal sampling time in critically ill adults.  

PubMed

Aminoglycosides are a class of antibiotics that are commonly used in the treatment of gram-negative pathogens in the critically ill population. Unfortunately, dosing of aminoglycosides in critically ill patients is difficult because of their altered pharmacokinetics in this population and their narrow therapeutic index. In this study, we evaluated whether a limited sampling strategy can be used to predict the area under the concentration (AUC) curve of amikacin concentrations over a 24-hour period after a single dose of intravenous amikacin (25 mg/kg). This open-labelled, non-comparative prospective study recruited 20 adult critically ill trauma patients with a diagnosis of hospital-acquired infection. We assessed the best estimate of plasma amikacin concentrations over a 24-hour period by multiple stepwise regression, using nine blood samples during this study period as the gold standard. Using a jackknife procedure, the AUC of amikacin over a 24-hour period was estimated by choosing a combination of the amikacin concentrations measured at different time-points. Overall, the mean prediction error of all models was not statistically different from zero (P > 0.05). Based on bias and imprecision, all models gave good estimates of the AUC of amikacin over a 24-hour period, but a two-point sampling strategy at 1.5 and 6 hours post-dose appeared to offer the best compromise between accuracy and cost-effectiveness in optimising the dosing of amikacin in critically ill patients. PMID:24580389

Mahmoudi, L; Mohammadpour, A H; Niknam, R; Ahmadi, A; Mojtahedzdeh, M

2014-03-01

139

An elusive search for regional flood frequency estimates in the River Nile basin  

NASA Astrophysics Data System (ADS)

Estimation of peak flow quantiles in ungauged catchments is a challenge often faced by water professionals in many parts of the world. Approaches to address this problem exist, but widely used techniques such as flood frequency regionalization are often not subjected to performance evaluation. In this study we used the jack-knifing principle to assess the performance of the flood frequency regionalization in the complex and data scarce River Nile basin by examining the error (regionalization error) between locally and regionally estimated peak flow quantiles for different return periods (QT). Agglomerative hierarchical clustering based algorithms were used to search for regions with similar hydrological characteristics taking into account the huge catchment area and strong climatic differences across the area. Hydrological data sets employed were from 180 gauged catchments and several physical characteristics in order to regionalize 365 identified catchments. The GEV distribution, selected using an L-moment based approach, was used to construct regional growth curves from which peak flow growth factors (QT/MAF) could be derived and mapped through interpolation. Inside each region, variations in at-site flood frequency distribution were modeled by regression of the mean annual maximum peak flow (MAF) versus catchment area. The results show that the performance of the regionalization is heavily dependent on the historical flow record length and the similarity of the hydrological characteristics inside the regions. The flood frequency regionalization of the River Nile basin can be improved if sufficient flow data of longer record length (≥ 40 yr) become available.

Nyeko-Ogiramoi, P.; Willems, P.; Mutua, F. M.; Moges, S. A.

2012-03-01

140

An elusive search for regional flood frequency estimates in the River Nile basin  

NASA Astrophysics Data System (ADS)

Estimation of peak flow quantiles in ungauged catchments is a challenge often faced by water professionals in many parts of the world. Approaches to address this problem exist, but widely used techniques such as flood frequency regionalisation are often not subjected to performance evaluation. In this study, the jack-knifing principle is used to assess the performance of the flood frequency regionalisation in the complex and data-scarce River Nile basin by examining the error (regionalisation error) between locally and regionally estimated peak flow quantiles for different return periods (QT). Agglomerative hierarchical clustering based algorithms were used to search for regions with similar hydrological characteristics. Hydrological data employed were from 180 gauged catchments and several physical characteristics in order to regionalise 365 identified catchments. The Generalised Extreme Value (GEV) distribution, selected using an L-moment based approach, was used to construct regional growth curves from which peak flow growth factors could be derived and mapped through interpolation. Inside each region, variations in at-site flood frequency distribution were modelled by regression of the mean annual maximum peak flow (MAF) versus catchment area. The results showed that the performance of the regionalisation is heavily dependent on the historical flow record length and the similarity of the hydrological characteristics inside the regions. The flood frequency regionalisation of the River Nile basin can be improved if sufficient flow data of longer record length of at least 40 yr become available.

Nyeko-Ogiramoi, P.; Willems, P.; Mutua, F. M.; Moges, S. A.

2012-09-01
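
The jack-knifing check described in the abstract above (leave one gauged site out, estimate its quantile from the rest of the region, compare with the at-site value) can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: it assumes the regional growth factor is simply the mean of the at-site ratios QT/MAF of the remaining sites, whereas the study derived growth curves from a GEV distribution fitted with L-moments. The function name and the numbers are made up for the example.

```python
def regionalisation_errors(at_site_quantiles, maf):
    """Relative regionalisation error for each site, leaving that site out.

    at_site_quantiles: locally estimated Q_T per gauged site
    maf: mean annual maximum flow per site (same order)
    Assumption: regional growth factor = mean of Q_T / MAF over the other sites.
    """
    growth = [q / m for q, m in zip(at_site_quantiles, maf)]
    errors = []
    for i in range(len(growth)):
        others = growth[:i] + growth[i + 1:]          # drop site i
        regional_growth = sum(others) / len(others)   # regional Q_T / MAF
        regional_q = regional_growth * maf[i]         # regional estimate at site i
        errors.append((regional_q - at_site_quantiles[i]) / at_site_quantiles[i])
    return errors


# Hypothetical values for three sites in one region (not data from the study).
q_t = [200.0, 300.0, 450.0]
maf = [100.0, 150.0, 200.0]
errors = regionalisation_errors(q_t, maf)
```

A region whose sites share the same growth factor gives zero error everywhere; the third site here, with a higher at-site ratio, is underestimated by the regional value.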

141

EFICAz: a comprehensive approach for accurate genome-scale enzyme function inference  

PubMed Central

EFICAz (Enzyme Function Inference by Combined Approach) is an automatic engine for large-scale enzyme function inference that combines predictions from four different methods developed and optimized to achieve high prediction accuracy: (i) recognition of functionally discriminating residues (FDRs) in enzyme families obtained by a Conservation-controlled HMM Iterative procedure for Enzyme Family classification (CHIEFc), (ii) pairwise sequence comparison using a family specific Sequence Identity Threshold, (iii) recognition of FDRs in Multiple Pfam enzyme families, and (iv) recognition of multiple Prosite patterns of high specificity. For FDR (i.e. conserved positions in an enzyme family that discriminate between true and false members of the family) identification, we have developed an Evolutionary Footprinting method that uses evolutionary information from homofunctional and heterofunctional multiple sequence alignments associated with an enzyme family. The FDRs show a significant correlation with annotated active site residues. In a jackknife test, EFICAz shows high accuracy (92%) and sensitivity (82%) for predicting four EC digits in testing sequences that are <40% identical to any member of the corresponding training set. Applied to Escherichia coli genome, EFICAz assigns more detailed enzymatic function than KEGG, and generates numerous novel predictions. PMID:15576349

Tian, Weidong; Arakaki, Adrian K.; Skolnick, Jeffrey

2004-01-01

142

Identifying the subfamilies of voltage-gated potassium channels using feature selection technique.  

PubMed

Voltage-gated K+ channels (VKCs) play important roles in biological processes, especially in the nervous system. Different subfamilies of VKCs have different biological functions. Thus, identifying VKCs' subfamilies is valuable because it can guide disease diagnosis and drug design. However, the traditional wet-experimental methods are costly and time-consuming. It is highly desirable to develop an effective and powerful computational tool for identifying different subfamilies of VKCs. In this study, a predictor, called iVKC-OTC, has been developed by incorporating the optimized tripeptide composition (OTC) generated by a feature selection technique into the general form of pseudo-amino acid composition to identify six subfamilies of VKCs. One of the remarkable advantages of introducing the optimized tripeptide composition is being able to avoid the notorious dimension disaster or overfitting problems in statistical predictions. It was observed on a benchmark dataset, by using a jackknife test, that the overall accuracy achieved by iVKC-OTC reaches 96.77% in identifying the six subfamilies of VKCs, indicating that the new predictor is promising or at least may become a complementary tool to the existing methods in this area. It has not escaped our notice that the optimized tripeptide composition can also be used to investigate other protein classification problems. PMID:25054318

Liu, Wei-Xin; Deng, En-Ze; Chen, Wei; Lin, Hao

2014-01-01

143

MEASUREMENT OF COSMIC MICROWAVE BACKGROUND POLARIZATION POWER SPECTRA FROM TWO YEARS OF BICEP DATA  

SciTech Connect

Background Imaging of Cosmic Extragalactic Polarization (BICEP) is a bolometric polarimeter designed to measure the inflationary B-mode polarization of the cosmic microwave background (CMB) at degree angular scales. During three seasons of observing at the South Pole (2006 through 2008), BICEP mapped $\sim$2% of the sky chosen to be uniquely clean of polarized foreground emission. Here, we present initial results derived from a subset of the data acquired during the first two years. We present maps of temperature, Stokes Q and U, E and B modes, and associated angular power spectra. We demonstrate that the polarization data are self-consistent by performing a series of jackknife tests. We study potential systematic errors in detail and show that they are sub-dominant to the statistical errors. We measure the E-mode angular power spectrum with high precision at $21 \le \ell \le 335$, detecting for the first time the peak expected at $\ell \sim 140$. The measured E-mode spectrum is consistent with expectations from a $\Lambda$CDM model, and the B-mode spectrum is consistent with zero. The tensor-to-scalar ratio derived from the B-mode spectrum is $r = 0.02^{+0.31}_{-0.26}$, or $r < 0.72$ at 95% confidence, the first meaningful constraint on the inflationary gravitational wave background to come directly from CMB B-mode polarization.

Chiang, H. C.; Barkats, D.; Bock, J. J.; Hristov, V. V.; Jones, W. C.; Kovac, J. M.; Lange, A. E.; Mason, P. V.; Matsumura, T. [Department of Physics, California Institute of Technology, Pasadena, CA 91125 (United States); Ade, P. A. R. [Department of Physics and Astronomy, University of Wales, Cardiff, CF24 3YB, Wales (United Kingdom); Battle, J. O.; Dowell, C. D.; Nguyen, H. T. [Jet Propulsion Laboratory, Pasadena, CA 91109 (United States); Bierman, E. M.; Keating, B. G. [Department of Physics, University of California at San Diego, La Jolla, CA 92093 (United States); Duband, L. [SBT, Commissariat a l'Energie Atomique, Grenoble (France); Hivon, E. F. [Institut d'Astrophysique de Paris, Paris (France); Holzapfel, W. L. [Department of Physics, University of California at Berkeley, Berkeley, CA 94720 (United States); Kuo, C. L. [Stanford University, Palo Alto, CA 94305 (United States); Leitch, E. M. [University of Chicago, Chicago, IL 60637 (United States)

2010-03-10

144

Swfoldrate: predicting protein folding rates from amino acid sequence with sliding window method.  

PubMed

Protein folding is the process by which a protein progresses from its denatured state to its specific biologically active conformation. Understanding the relationship between sequences and the folding rates of proteins remains an important challenge. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. In this study, the long-range and short-range contacts in proteins were used to derive an extended version of the pseudo amino acid composition based on a sliding window method. This method is capable of predicting protein folding rates from the amino acid sequence alone, without the aid of any structural class information. We systematically studied the contributions of individual features to folding rate prediction. The optimal feature selection procedure combines forward feature selection with sequential backward selection. The method was demonstrated on a large dataset using the jackknife cross-validation test. The predictor was built on numerous physicochemical and statistical features of proteins using a nonlinear support vector machine (SVM) regression model, and it obtained excellent agreement between predicted and experimentally observed folding rates of proteins. The correlation coefficient is 0.9313 and the standard error is 2.2692. The prediction server is freely available at http://www.jci-bioinfo.cn/swfrate/input.jsp. PMID:22933332

Cheng, Xiang; Xiao, Xuan; Wu, Zhi-cheng; Wang, Pu; Lin, Wei-zhong

2013-01-01

145

Predicting miRNA's target from primary structure by the nearest neighbor algorithm.  

PubMed

We used a machine learning method, the nearest neighbor algorithm (NNA), to learn the relationship between miRNAs and their target proteins, generating a predictor which can then judge whether a new miRNA-target pair is true or not. We acquired 198 positive (true) miRNA-target pairs from Tarbase and the literature, and generated 4,888 negative (false) pairs through random combination. A 0/1 system and the frequencies of single nucleotides and di-nucleotides were used to encode miRNAs into vectors while various physicochemical parameters were used to encode the targets. The NNA was then applied, learning from these data to produce a predictor. We implemented minimum redundancy maximum relevance (mRMR) and properties forward selection (PFS) to reduce the redundancy of our encoding system, obtaining 91 most efficient properties. Finally, via the Jackknife cross-validation test, we got a positive accuracy of 69.2% and an overall accuracy of 96.0% with all the 253 properties. Besides, we got a positive accuracy of 83.8% and an overall accuracy of 97.2% with the 91 most efficient properties. A web-server for predictions is also made available at http://app3.biosino.org:8080/miRTP/index.jsp. PMID:20041294

Lin, Kao; Qian, Ziliang; Lu, Lin; Lu, Lingyi; Lai, Lihui; Gu, Jieyi; Zeng, Zhenbing; Li, Haipeng; Cai, Yudong

2010-11-01
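
The jackknife cross-validation test reported in the abstract above amounts to predicting each sample from all the others and then scoring the predictions. The sketch below shows that loop with a plain 1-nearest-neighbor rule on Euclidean distance. It is an illustration only: the vectors are made up (they are not the miRNA-target encoding), the function names are hypothetical, and the distance used in the paper may differ.

```python
import math


def nearest_neighbor_label(query, points, labels):
    """Label of the training point closest to `query` (Euclidean distance)."""
    best = min(range(len(points)), key=lambda i: math.dist(query, points[i]))
    return labels[best]


def jackknife_test(points, labels):
    """Leave-one-out evaluation: predict each sample from all the others."""
    predictions = []
    for i in range(len(points)):
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        predictions.append(nearest_neighbor_label(points[i], rest_points, rest_labels))
    return predictions


def accuracies(labels, predictions, positive=1):
    """Overall accuracy and accuracy restricted to the positive class."""
    correct = [t == p for t, p in zip(labels, predictions)]
    overall = sum(correct) / len(labels)
    on_positives = [c for c, t in zip(correct, labels) if t == positive]
    positive_acc = sum(on_positives) / len(on_positives) if on_positives else float("nan")
    return overall, positive_acc


# Toy, made-up feature vectors: two positive pairs (1) and four negative pairs (0).
X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.9, 1.2), (0.2, 0.1)]
y = [1, 1, 0, 0, 0, 0]
preds = jackknife_test(X, y)
overall, positive_acc = accuracies(y, preds)
```

Reporting the positive accuracy next to the overall accuracy matters when negatives greatly outnumber positives, as in the 198 versus 4,888 pairs above, because the overall figure is then dominated by the negative class.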

146

Separation of malignant and benign masses using maximum-likelihood modeling and neural networks  

NASA Astrophysics Data System (ADS)

This study attempted to accurately segment the masses and distinguish malignant from benign tumors. The masses were segmented using a technique that combines pixel aggregation with maximum likelihood analysis. We found that the segmentation method can delineate the tumor body as well as tumor peripheral regions covering typical mass boundaries and some spiculation patterns. We have developed a Multiple Circular Path Convolution Neural Network (MCPCNN) to analyze a set of mass intensity, shape, and texture features for determination of the tumors as malignant or benign. The features were also fed into a conventional neural network for comparison. We also used values obtained from the maximum likelihood values as inputs into a conventional backpropagation neural network. We have tested these methods on 51 mammograms using a grouped Jackknife experiment incorporated with the ROC method. Tumor sizes ranged from 6mm to 3cm. The conventional neural network whose inputs were image features achieved an Az of 0.66. However the MCPCNN achieved an Az value of 0.71. The conventional neural network whose inputs were maximum likelihood values achieved an Az value of 0.84. In addition, the maximum likelihood segmentation method can identify the mass body and boundary regions, which is essential to the analysis of mammographic masses.

Kinnard, Lisa M.; Lo, Shih-Chung B.; Wang, Paul C.; Freedman, Matthew T.; Chouikha, Mohammed F.

2002-05-01

147

Identifying the Subfamilies of Voltage-Gated Potassium Channels Using Feature Selection Technique  

PubMed Central

Voltage-gated K+ channels (VKCs) play important roles in biological processes, especially in the nervous system. Different subfamilies of VKCs have different biological functions. Thus, identifying VKCs’ subfamilies is valuable because it can guide disease diagnosis and drug design. However, the traditional wet-experimental methods are costly and time-consuming. It is highly desirable to develop an effective and powerful computational tool for identifying different subfamilies of VKCs. In this study, a predictor, called iVKC-OTC, has been developed by incorporating the optimized tripeptide composition (OTC) generated by a feature selection technique into the general form of pseudo-amino acid composition to identify six subfamilies of VKCs. One of the remarkable advantages of introducing the optimized tripeptide composition is being able to avoid the notorious dimension disaster or overfitting problems in statistical predictions. It was observed on a benchmark dataset, by using a jackknife test, that the overall accuracy achieved by iVKC-OTC reaches 96.77% in identifying the six subfamilies of VKCs, indicating that the new predictor is promising or at least may become a complementary tool to the existing methods in this area. It has not escaped our notice that the optimized tripeptide composition can also be used to investigate other protein classification problems. PMID:25054318

Liu, Wei-Xin; Deng, En-Ze; Chen, Wei; Lin, Hao

2014-01-01

148

The Effects of Topography on Shortwave solar radiation modelling: The JGrass-NewAge System way  

NASA Astrophysics Data System (ADS)

NewAGE-SwRB and NewAGE-DEC-MOD's are the two components of the JGrass-NewAge hydrological modeling system that estimate incident shortwave radiation. Shortwave solar radiation at the land surface is influenced by topographic parameters such as slope, aspect, altitude, and sky-view factor; hence, detailed analysis and discussion of their effects is a way to improve the modeling approach. NewAGE-SwRB accounts for slope, aspect, shadow, and the topographical information of the sites to estimate the cloudless irradiance. The first part of the paper presents a topographic parameter analysis using the uDig GIS spatial toolbox, which is integrated in the JGrass-NewAge system, and indicates the effect of each topographic parameter on the shortwave radiation. A statistical study of station topographic geometry (slope, aspect, altitude, and sky-view factor) and of the correlation between pairs of station measurements was carried out to take a closer look at the impact of rugged topography. Jackknife correlation coefficients were used to analyze the estimation bias between shortwave radiation at different topographic positions, thereby helping to develop generalized linear models that explain the impacts of those topographic features. In addition to the topographical parameters that NewAGE-SwRB accounts for, three parameters are needed to run the NewAGE-DEC-MOD's component: an estimate of the visibility extent (V), the single-scattering albedo, i.e. the fraction of incident energy scattered to total attenuation by aerosols (Wo), and the fraction of forward scattering to total scattering (Fs). Sufficient knowledge of the magnitude and spatial distribution of these parameters is crucial. In this paper, the particle swarm component of the NewAge system is used for automatic calibration of the NewAGE-DEC-MOD's parameters for each station, based on different optimization and objective functions.
Finally, the estimated parameters for all measurement stations are interpolated in space, and kriging spatial interpolation techniques are applied to give their spatial structure. Different variogram models were fitted to explain the spatial correlogram of the parameters, and in turn used to estimate spatially distributed parameters by kriging. Jackknife kriging, which is a rekriging of each station by eliminating one sample from the original sample set and then taking the average of the rekriged estimates, was used to test the practical validity of the model. The method gives better estimates and also yields a standard deviation as a useful indicator of the uncertainty associated with station estimates. This analysis helps to understand the spatial variability of radiative transmittance with position, height, aspect, slope, and other topographic features. Shortwave radiation data sets from two basins (one in flat topography and the other in mountainous topography) are used to test the statistical analysis of the modeling components of the JGrass-NewAGE model system.

Abera, Wuletawu; Formetta, Giuseppe; Rigon, Riccardo

2013-04-01
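
The delete-one re-estimation that the abstract above calls jackknife kriging can be sketched as below. To keep the example short and dependency-free, inverse-distance weighting stands in for kriging; that substitution is an assumption of this sketch, not the interpolator used in JGrass-NewAge. All names and station values are hypothetical.

```python
import math
import statistics


def idw_estimate(target, stations, power=2.0):
    """Inverse-distance-weighted estimate at `target` from (x, y, value) stations.

    Stand-in for kriging in this sketch.
    """
    num = den = 0.0
    for x, y, value in stations:
        d = math.hypot(target[0] - x, target[1] - y)
        if d == 0.0:
            return value  # target coincides with a station
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def jackknife_estimate(target, stations, estimator=idw_estimate):
    """Re-estimate with each station removed once.

    Returns the mean of the delete-one estimates and their sample
    standard deviation, used here as a rough uncertainty indicator.
    """
    replicates = [
        estimator(target, stations[:i] + stations[i + 1:])
        for i in range(len(stations))
    ]
    return statistics.mean(replicates), statistics.stdev(replicates)


# Four made-up stations at the corners of a square, target at the centre.
stations = [(0.0, 0.0, 1.0), (0.0, 2.0, 2.0), (2.0, 0.0, 3.0), (2.0, 2.0, 4.0)]
mean_est, spread = jackknife_estimate((1.0, 1.0), stations)
```

The spread returned here is the plain standard deviation of the delete-one estimates, matching the wording of the abstract. The classical jackknife standard error would rescale it by (n - 1)/sqrt(n) for n stations.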

149

Demographic history and rare allele sharing among human populations  

PubMed Central

High-throughput sequencing technology enables population-level surveys of human genomic variation. Here, we examine the joint allele frequency distributions across continental human populations and present an approach for combining complementary aspects of whole-genome, low-coverage data and targeted high-coverage data. We apply this approach to data generated by the pilot phase of the Thousand Genomes Project, including whole-genome 2–4× coverage data for 179 samples from HapMap European, Asian, and African panels as well as high-coverage target sequencing of the exons of 800 genes from 697 individuals in seven populations. We use the site frequency spectra obtained from these data to infer demographic parameters for an Out-of-Africa model for populations of African, European, and Asian descent and to predict, by a jackknife-based approach, the amount of genetic diversity that will be discovered as sample sizes are increased. We predict that the number of discovered nonsynonymous coding variants will reach 100,000 in each population after ~1,000 sequenced chromosomes per population, whereas ~2,500 chromosomes will be needed for the same number of synonymous variants. Beyond this point, the number of segregating sites in the European and Asian panel populations is expected to overcome that of the African panel because of faster recent population growth. Overall, we find that the majority of human genomic variable sites are rare and exhibit little sharing among diverged populations. Our results emphasize that replication of disease association for specific rare genetic variants across diverged populations must overcome both reduced statistical power because of rarity and higher population divergence. PMID:21730125

Gravel, Simon; Henn, Brenna M.; Gutenkunst, Ryan N.; Indap, Amit R.; Marth, Gabor T.; Clark, Andrew G.; Yu, Fuli; Gibbs, Richard A.; Bustamante, Carlos D.; Altshuler, David L.; Durbin, Richard M.; Abecasis, Goncalo R.; Bentley, David R.; Chakravarti, Aravinda; Clark, Andrew G.; Collins, Francis S.; De La Vega, Francisco M.; Donnelly, Peter; Egholm, Michael; Flicek, Paul; Gabriel, Stacey B.; Gibbs, Richard A.; Knoppers, Bartha M.; Lander, Eric S.; Lehrach, Hans; Mardis, Elaine R.; McVean, Gil A.; Nickerson, Debbie A.; Peltonen, Leena; Schafer, Alan J.; Sherry, Stephen T.; Wang, Jun; Wilson, Richard K.; Gibbs, Richard A.; Deiros, David; Metzker, Mike; Muzny, Donna; Reid, Jeff; Wheeler, David; Wang, Jun; Li, Jingxiang; Jian, Min; Li, Guoqing; Li, Ruiqiang; Liang, Huiqing; Tian, Geng; Wang, Bo; Wang, Jian; Wang, Wei; Yang, Huanming; Zhang, Xiuqing; Zheng, Huisong; Lander, Eric S.; Altshuler, David L.; Ambrogio, Lauren; Bloom, Toby; Cibulskis, Kristian; Fennell, Tim J.; Gabriel, Stacey B.; Jaffe, David B.; Shefler, Erica; Sougnez, Carrie L.; Bentley, David R.; Gormley, Niall; Humphray, Sean; Kingsbury, Zoya; Koko-Gonzales, Paula; Stone, Jennifer; McKernan, Kevin J.; Costa, Gina L.; Ichikawa, Jeffry K.; Lee, Clarence C.; Sudbrak, Ralf; Lehrach, Hans; Borodina, Tatiana A.; Dahl, Andreas; Davydov, Alexey N.; Marquardt, Peter; Mertes, Florian; Nietfeld, Wilfiried; Rosenstiel, Philip; Schreiber, Stefan; Soldatov, Aleksey V.; Timmermann, Bernd; Tolzmann, Marius; Egholm, Michael; Affourtit, Jason; Ashworth, Dana; Attiya, Said; Bachorski, Melissa; Buglione, Eli; Burke, Adam; Caprio, Amanda; Celone, Christopher; Clark, Shauna; Conners, David; Desany, Brian; Gu, Lisa; Guccione, Lorri; Kao, Kalvin; Kebbel, Andrew; Knowlton, Jennifer; Labrecque, Matthew; McDade, Louise; Mealmaker, Craig; Minderman, Melissa; Nawrocki, Anne; Niazi, Faheem; Pareja, Kristen; Ramenani, Ravi; Riches, David; Song, Wanmin; Turcotte, Cynthia; Wang, Shally; Mardis, Elaine R.; Wilson, Richard K.; Dooling, 
David; Fulton, Lucinda; Fulton, Robert; Weinstock, George; Durbin, Richard M.; Burton, John; Carter, David M.; Churcher, Carol; Coffey, Alison; Cox, Anthony; Palotie, Aarno; Quail, Michael; Skelly, Tom; Stalker, James; Swerdlow, Harold P.; Turner, Daniel; De Witte, Anniek; Giles, Shane; Gibbs, Richard A.; Wheeler, David; Bainbridge, Matthew; Challis, Danny; Sabo, Aniko; Yu, Fuli; Yu, Jin; Wang, Jun; Fang, Xiaodong; Guo, Xiaosen; Li, Ruiqiang; Li, Yingrui; Luo, Ruibang; Tai, Shuaishuai; Wu, Honglong; Zheng, Hancheng; Zheng, Xiaole; Zhou, Yan; Li, Guoqing; Wang, Jian; Yang, Huanming; Marth, Gabor T.; Garrison, Erik P.; Huang, Weichun; Indap, Amit; Kural, Deniz; Lee, Wan-Ping; Leong, Wen Fung; Quinlan, Aaron R.; Stewart, Chip; Stromberg, Michael P.; Ward, Alistair N.; Wu, Jiantao; Lee, Charles; Mills, Ryan E.; Shi, Xinghua; Daly, Mark J.; DePristo, Mark A.; Altshuler, David L.; Ball, Aaron D.; Banks, Eric; Bloom, Toby; Browning, Brian L.; Cibulskis, Kristian; Fennell, Tim J.; Garimella, Kiran V.; Grossman, Sharon R.; Handsaker, Robert E.; Hanna, Matt; Hartl, Chris; Jaffe, David B.; Kernytsky, Andrew M.; Korn, Joshua M.; Li, Heng; Maguire, Jared R.; McCarroll, Steven A.; McKenna, Aaron; Nemesh, James C.; Philippakis, Anthony A.; Poplin, Ryan E.; Price, Alkes; Rivas, Manuel A.; Sabeti, Pardis C.; Schaffner, Stephen F.; Shefler, Erica; Shlyakhter, Ilya A.; Cooper, David N.; Ball, Edward V.; Mort, Matthew; Phillips, Andrew D.; Stenson, Peter D.; Sebat, Jonathan; Makarov, Vladimir; Ye, Kenny; Yoon, Seungtai C.; Bustamante, Carlos D.; Clark, Andrew G.; Boyko, Adam; Degenhardt, Jeremiah; Gravel, Simon; Gutenkunst, Ryan N.; Kaganovich, Mark; Keinan, Alon; Lacroute, Phil; Ma, Xin; Reynolds, Andy; Clarke, Laura; Flicek, Paul; Cunningham, Fiona; Herrero, Javier; Keenen, Stephen; Kulesha, Eugene; Leinonen, Rasko; McLaren, William M.; Radhakrishnan, Rajesh; Smith, Richard E.; Zalunin, Vadim; Zheng-Bradley, Xiangqun; Korbel, Jan O.; Stutz, Adrian M.; Humphray, Sean; Bauer, Markus; 
Cheetham, R. Keira; Cox, Tony; Eberle, Michael; James, Terena; Kahn, Scott; Murray, Lisa; Chakravarti, Aravinda; Ye, Kai; De La Vega, Francisco M.; Fu, Yutao; Hyland, Fiona C. L.; Manning, Jonathan M.; McLaughlin, Stephen F.; Peckham, Heather E.; Sakarya, Onur; Sun, Yongming A.; Tsung, Eric F.; Batzer, Mark A.; Konkel, Miriam K.; Walker, Jerilyn A.; Sudbrak, Ralf; Albrecht, Marcus W.; Amstislavskiy, Vyacheslav S.; Herwig, Ralf; Parkhomchuk, Dimitri V.; Sherry, Stephen T.; Agarwala, Richa; Khouri, Hoda M.; Morgulis, Aleksandr O.; Paschall, Justin E.; Phan, Lon D.; Rotmistrovsky, Kirill E.; Sanders, Robert D.; Shumway, Martin F.

2011-01-01

150

Development of Pneumatic Aerodynamic Devices to Improve the Performance, Economics, and Safety of Heavy Vehicles  

SciTech Connect

Under contract to the DOE Office of Heavy Vehicle Technologies, the Georgia Tech Research Institute (GTRI) is developing and evaluating pneumatic (blown) aerodynamic devices to improve the performance, economics, stability and safety of operation of Heavy Vehicles. The objective of this program is to apply the pneumatic aerodynamic aircraft technology previously developed and flight-tested by GTRI personnel to the design of an efficient blown tractor-trailer configuration. Recent experimental results obtained by GTRI using blowing have shown drag reductions of 35% on a streamlined automobile wind-tunnel model. Also measured were lift or down-load increases of 100-150% and the ability to control aerodynamic moments about all 3 axes without any moving control surfaces. Similar drag reductions yielded by blowing on bluff afterbody trailers in current US trucking fleet operations are anticipated to reduce yearly fuel consumption by more than 1.2 billion gallons, while even further reduction is possible using pneumatic lift to reduce tire rolling resistance. Conversely, increased drag and down force generated instantaneously by blowing can greatly improve braking characteristics and control in wet/icy weather due to effective "weight" increases on the tires. Safety is also enhanced by controlling side loads and moments caused on these Heavy Vehicles by winds, gusts and other vehicles passing. This may also help to eliminate the jack-knifing problem if caused by extreme wind side loads on the trailer. Lastly, reduction of the turbulent wake behind the trailer can reduce splash and spray patterns and rough air being experienced by following vehicles. This paper presents results developed by GTRI during the early portion of this effort, including a preliminary systems study, CFD prediction of the blown flowfields, and design of the baseline conventional tractor-trailer model and the pneumatic wind-tunnel model.

Robert J. Englar

2000-06-19

151

Validation and statistical power comparison of methods for analyzing free-response observer performance studies  

PubMed Central

Rationale and Objectives: The aim of this work was to validate and compare the statistical powers of proposed methods for analyzing free-response data using a search-model based simulator. Materials and Methods: A free-response data simulator is described that can model a single reader interpreting the same cases in two modalities, or two CAD algorithms, or two human observers, interpreting the same cases in one modality. A variance components model, analogous to the Roe and Metz receiver operating characteristic (ROC) data simulator, is described, that models intra-case and inter-modality correlations in free-response studies. Two generic observers were simulated: a quasi-human observer and a quasi-CAD algorithm. Null hypothesis (NH) validity and statistical powers of ROC, jackknife alternative free-response operating characteristic (JAFROC), a variant of JAFROC termed JAFROC-1, initial detection and candidate analysis (IDCA) and a non-parametric (NP) approach were investigated. Results: All methods had valid NH behavior over a wide range of simulator parameters. For equal numbers of normal and abnormal cases, for the human observer, the statistical power ranking of the methods was JAFROC-1 > JAFROC > (IDCA ~ NP) > ROC. For the CAD algorithm the ranking was (NP ~ IDCA) > (JAFROC-1 ~ JAFROC) > ROC. In either case the statistical power of the highest ranked method exceeded that of the lowest ranked method by about a factor of two. Dependence of statistical power on simulator parameters followed expected trends. For data sets with more abnormal cases than normal cases, JAFROC-1 power significantly exceeded JAFROC power. Conclusion: Based on this work the recommendation is to use JAFROC-1 for human observers (including human-observers with CAD assist) and the NP method for evaluating CAD algorithms. PMID:19000872

Chakraborty, Dev P.

2009-01-01

152

Asymmetric Constriction of Dividing Escherichia coli Cells Induced by Expression of a Fusion between Two Min Proteins  

PubMed Central

The Min system, consisting of MinC, MinD, and MinE, plays an important role in localizing the Escherichia coli cell division machinery to midcell by preventing FtsZ ring (Z ring) formation at cell poles. MinC has two domains, MinCn and MinCc, which both bind to FtsZ and act synergistically to inhibit FtsZ polymerization. Binary fission of E. coli usually proceeds symmetrically, with daughter cells at roughly 180° to each other. In contrast, we discovered that overproduction of an artificial MinCc-MinD fusion protein in the absence of other Min proteins induced frequent and dramatic jackknife-like bending of cells at division septa, with cell constriction predominantly on the outside of the bend. Mutations in the fusion known to disrupt MinCc-FtsZ, MinCc-MinD, or MinD-membrane interactions largely suppressed bending division. Imaging of FtsZ-green fluorescent protein (GFP) showed no obvious asymmetric localization of FtsZ during MinCc-MinD overproduction, suggesting that a downstream activity of the Z ring was inhibited asymmetrically. Consistent with this, MinCc-MinD fusions localized predominantly to segments of the Z ring at the inside of developing cell bends, while FtsA (but not ZipA) tended to localize to the outside. As FtsA is required for ring constriction, we propose that this asymmetric localization pattern blocks constriction of the inside of the septal ring while permitting continued constriction of the outside portion. PMID:24682325

Rowlett, Veronica Wells

2014-01-01

153

Comparative assessment of GIS-based methods and metrics for estimating long-term exposures to air pollution  

NASA Astrophysics Data System (ADS)

The development of geographical information system techniques has opened up a wide array of methods for air pollution exposure assessment. The extent to which these provide reliable estimates of air pollution concentrations is nevertheless not clearly established. Nor is it clear which methods or metrics should be preferred in epidemiological studies. This paper compares the performance of ten different methods and metrics in terms of their ability to predict mean annual PM10 concentrations across 52 monitoring sites in London, UK. Metrics analysed include indicators (distance to nearest road, traffic volume on nearest road, heavy duty vehicle (HDV) volume on nearest road, road density within 150 m, traffic volume within 150 m and HDV volume within 150 m) and four modelling approaches: based on the nearest monitoring site, kriging, dispersion modelling and land use regression (LUR). Measures were computed in a GIS, and resulting metrics calibrated and validated against monitoring data using a form of grouped jack-knife analysis. The results show that PM10 concentrations across London show little spatial variation. As a consequence, most methods can predict the average without serious bias. Few of the approaches, however, show good correlations with monitored PM10 concentrations, and most predict no better than a simple classification based on site type. Only land use regression reaches acceptable levels of correlation (R² = 0.47), though this can be improved by also including information on site type. This might therefore be taken as a recommended approach in many studies, though care is needed in developing meaningful land use regression models, and like any method they need to be validated against local data before their application as part of epidemiological studies.

Gulliver, John; de Hoogh, Kees; Fecht, Daniela; Vienneau, Danielle; Briggs, David

2011-12-01

154

Demographic history and rare allele sharing among human populations.  

PubMed

High-throughput sequencing technology enables population-level surveys of human genomic variation. Here, we examine the joint allele frequency distributions across continental human populations and present an approach for combining complementary aspects of whole-genome, low-coverage data and targeted high-coverage data. We apply this approach to data generated by the pilot phase of the Thousand Genomes Project, including whole-genome 2-4× coverage data for 179 samples from HapMap European, Asian, and African panels as well as high-coverage target sequencing of the exons of 800 genes from 697 individuals in seven populations. We use the site frequency spectra obtained from these data to infer demographic parameters for an Out-of-Africa model for populations of African, European, and Asian descent and to predict, by a jackknife-based approach, the amount of genetic diversity that will be discovered as sample sizes are increased. We predict that the number of discovered nonsynonymous coding variants will reach 100,000 in each population after ~1,000 sequenced chromosomes per population, whereas ~2,500 chromosomes will be needed for the same number of synonymous variants. Beyond this point, the number of segregating sites in the European and Asian panel populations is expected to overcome that of the African panel because of faster recent population growth. Overall, we find that the majority of human genomic variable sites are rare and exhibit little sharing among diverged populations. Our results emphasize that replication of disease association for specific rare genetic variants across diverged populations must overcome both reduced statistical power because of rarity and higher population divergence. PMID:21730125

Gravel, Simon; Henn, Brenna M; Gutenkunst, Ryan N; Indap, Amit R; Marth, Gabor T; Clark, Andrew G; Yu, Fuli; Gibbs, Richard A; Bustamante, Carlos D

2011-07-19

155

Modelling temperature, photoperiod and vernalization responses of Brunonia australis (Goodeniaceae) and Calandrinia sp. (Portulacaceae) to predict flowering time  

PubMed Central

Background and Aims: Crop models for herbaceous ornamental species typically include functions for temperature and photoperiod responses, but very few incorporate vernalization, which is a requirement of many traditional crops. This study investigated the development of floriculture crop models, which describe temperature responses, plus photoperiod or vernalization requirements, using Australian native ephemerals Brunonia australis and Calandrinia sp. Methods: A novel approach involved the use of a field crop modelling tool, DEVEL2. This optimization program estimates the parameters of selected functions within the development rate models using an iterative process that minimizes sum of squares residual between estimated and observed days for the phenological event. Parameter profiling and jack-knifing are included in DEVEL2 to remove bias from parameter estimates and introduce rigour into the parameter selection process. Key Results: Development rate of B. australis from planting to first visible floral bud (VFB) was predicted using a multiplicative approach with a curvilinear function to describe temperature responses and a broken linear function to explain photoperiod responses. A similar model was used to describe the development rate of Calandrinia sp., except the photoperiod function was replaced with an exponential vernalization function, which explained a facultative cold requirement and included a coefficient for determining the vernalization ceiling temperature. Temperature was the main environmental factor influencing development rate for VFB to anthesis of both species and was predicted using a linear model. Conclusions: The phenology models for B. australis and Calandrinia sp. described development rate from planting to VFB and from VFB to anthesis in response to temperature and photoperiod or vernalization and may assist modelling efforts of other herbaceous ornamental plants. In addition to crop management, the vernalization function could be used to identify plant communities most at risk from predicted increases in temperature due to global warming. PMID:23404991

Cave, Robyn L.; Hammer, Graeme L.; McLean, Greg; Birch, Colin J.; Erwin, John E.; Johnston, Margaret E.

2013-01-01

156

Morphometric analysis of pelvic sexual dimorphism in a contemporary Western Australian population.  

PubMed

Requisite to routine casework involving unidentified skeletal remains is the formulation of an accurate biological profile, including sex estimation. Choice of method(s) is invariably related to preservation and by association, available bones. It is vital that the method applied affords statistical quantification of accuracy rates and predictive confidence so that evidentiary requirements for legal submission are satisfied. Achieving the latter necessitates the application of contemporary population-specific standards. This study examines skeletal pelvic dimorphism in contemporary Western Australian individuals to quantify the accuracy of using pelvic measurements to estimate sex and to formulate a series of morphometric standards. The sample comprises pelvic multi-slice computer tomography (MSCT) scans from 200 male and 200 female adults. Following 3D rendering, the 3D coordinates of 24 landmarks are acquired using OsiriX® (v.4.1.1) with 12 inter-landmark linear measurements and two angles acquired using MorphDb. Measurements are analysed using basic descriptive statistics and discriminant functions analyses employing jackknife validation of classification results. All except two linear measurements are dimorphic with sex differences explaining up to 65 % of sample variance. Transverse pelvic outlet and subpubic angle contribute most significantly to sex discrimination with accuracy rates between 100 % (complete pelvis-10 variables) and 81.2 % (ischial length). This study represents the initial forensic research into pelvic sexual dimorphism in a Western Australian population. Given these methods, we conclude that this highly dimorphic bone can be used to classify sex with a high degree of expected accuracy. PMID:24789357

Franklin, Daniel; Cardini, Andrea; Flavel, Ambika; Marks, Murray K

2014-09-01
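
The "jackknife validation of classification results" mentioned in the abstract above is commonly a leave-one-out scheme: each case is classified by a rule built from all the other cases. The sketch below illustrates that idea only. It swaps in a simple nearest-class-mean rule on a single measurement rather than the discriminant function analysis used in the study, and the function name `loo_accuracy` and all data values are hypothetical.

```python
from statistics import mean

def loo_accuracy(values, labels):
    """Leave-one-out ('jackknife') accuracy of a nearest-class-mean rule."""
    correct = 0
    for i, (x, true_label) in enumerate(zip(values, labels)):
        # Build class means from every case except the held-out one.
        groups = {}
        for j, (v, lab) in enumerate(zip(values, labels)):
            if j != i:
                groups.setdefault(lab, []).append(v)
        centroids = {lab: mean(vs) for lab, vs in groups.items()}
        # Assign the held-out case to the class with the closest mean.
        predicted = min(centroids, key=lambda lab: abs(x - centroids[lab]))
        correct += predicted == true_label
    return correct / len(values)

# Hypothetical subpubic angles in degrees, invented for illustration only.
angles = [62.0, 65.5, 58.0, 70.0, 88.0, 92.5, 85.0, 79.0]
sexes = ["M", "M", "M", "M", "F", "F", "F", "F"]
accuracy = loo_accuracy(angles, sexes)
print(f"leave-one-out accuracy = {accuracy:.2f}")
```

Because the held-out case never contributes to the class means used to classify it, the resulting accuracy is less optimistic than one computed by re-classifying the training sample.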

157

The Internal Pudendal Artery Perforator Thigh Flap: A New Freestyle Pedicle Flap for the Ischial Region  

PubMed Central

Background: Recurrence and complication rates of pressure sores are highest in the ischial region, and other donor sites are needed for recurrent pressure sores. The potential of a new freestyle pedicle flap for ischial lesions, an internal pudendal artery perforator (iPap) thigh flap, was examined through anatomical and theoretical analyses and a case series using computed tomography angiography. Methods: The skin flap was designed in the thigh region based on an iPap. The skin perforators were marked with a Doppler probe. One patient underwent computed tomography angiography with fistulography to identify the damage to or effects on the pedicle vessels of the flap. Debridement of ischial lesions and flap elevation were performed in the jackknife position. Results: The iPap thigh flaps were performed in 5 patients, 4 with ischial pressure sores and 1 with calcinosis cutis of the ischial region. The width and length of the flaps ranged from 5 to 8 cm (mean, 6.6 cm) and 10 to 17 cm (mean, 12.6 cm), respectively. Three patients underwent partial osteotomy of the ischial bone. No complications, including flap necrosis or wound dehiscence of the donor and reconstructed sites, were observed. Conclusions: The perforator vessels of the internal pudendal artery are very close to the ischial tuberosity. Blood flow to the flap is reliable when careful debridement of the pressure sore is performed. The iPap thigh flap is a new option for soft-tissue defects in the ischial region, including ischial pressure sores. PMID:25289335

Goishi, Keiichi; Abe, Yoshiro; Takaku, Mitsuru; Seike, Takuya; Harada, Hiroshi; Nakanishi, Hideki

2014-01-01

158

Early detection of production deficit hot spots in semi-arid environment using FAPAR time series and a probabilistic approach  

NASA Astrophysics Data System (ADS)

Timely information on vegetation development at regional scale is needed in arid and semiarid African regions where rainfall variability leads to high inter-annual fluctuations in crop and pasture productivity, as well as to high risk of food crisis in the presence of severe drought events. The present study aims at developing and testing an automatic procedure to estimate the probability of experiencing a seasonal biomass production deficit solely on the basis of historical and near real-time remote sensing observations. The method is based on the extraction of vegetation phenology from SPOT-VEGETATION time series of the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) and the subsequent computation of seasonally cumulated FAPAR as a proxy for vegetation gross primary production. Within-season forecasts of the overall seasonal performance, expressed in terms of probability of experiencing a critical deficit, are based on a statistical approach taking into account two factors: i) the similarity between the current FAPAR profile and past profiles observable in the 15-year FAPAR time series; ii) the uncertainty of past predictions of season outcome as derived using a jack-knifing technique. The method is applicable at the regional to continental scale and can be updated regularly during the season (whenever a new satellite observation is made available) to provide a synoptic view of the hot spots of likely production deficit. The specific objective of the procedure described here is to deliver to the food security analyst, as early as possible within the season, only the relevant information (e.g., masking out areas without active vegetation at the time of analysis), expressed through a reliable and easily interpretable measure of impending risk. Evaluation of method performance and examples of application in the Sahel region are discussed.

Meroni, M.; Fasbender, D.; Kayitakire, F.; Pini, G.; Rembold, F.; Urbano, F.; Verstraete, M. M.

2013-12-01

159

A New Estimate of the Earth's Land Surface Temperature History  

NASA Astrophysics Data System (ADS)

The Berkeley Earth Surface Temperature team has re-evaluated the world's atmospheric land surface temperature record using a linear least-squares method that allows the use of all the digitized records back to 1800, including short records that had been excluded by prior groups. We use the Kriging method to estimate an optimal weighting of stations to give a world average based on uniform weighting of the land surface. We have assembled a record of the available data by merging 1.6 billion temperature reports from 16 pre-existing data archives; this data base will be made available for public use. The former Global Historic Climatology Network (GHCN) monthly data base shows a sudden drop in the number of stations reporting monthly records from 1980 to the present; we avoid this drop by calculating monthly averages from the daily records. By using all the data, we reduce the effects of potential data selection bias. We make an independent estimate of the urban heat island effect by calculating the world land temperature trends based on stations chosen to be far from urban sites. We calculate the effect of poor station quality, as documented in the US by the team led by Anthony Watts, by estimating the temperature trends based solely on the stations ranked good (1,2 or 1,2,3 in the NOAA ranking scheme). We avoid issues of homogenization bias by using raw data; at times when the records are discontinuous (e.g. due to station moves) we break the record into smaller segments and analyze those, rather than attempt to correct the discontinuity. We estimate the uncertainties in the final results using the jackknife procedure developed by J. Tukey. We calculate spatial uncertainties by measuring the effects of geographical exclusion on recent data that have good world coverage. The results we obtain are compared to those published by the groups at NOAA, NASA-GISS, and Hadley-CRU in the UK.

Muller, R. A.; Curry, J. A.; Groom, D.; Jacobsen, B.; Perlmutter, S.; Rohde, R. A.; Rosenfeld, A.; Wickham, C.; Wurtele, J.

2011-12-01
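
As background to the jackknife uncertainty estimate mentioned in the abstract above, the following is a minimal sketch of the textbook delete-one jackknife standard error. It is not the Berkeley Earth implementation, whose grouping and averaging scheme the abstract does not describe; the function name `jackknife_se` and the data values are hypothetical.

```python
import math
from statistics import mean

def jackknife_se(samples, estimator=mean):
    """Delete-one jackknife estimate and standard error of `estimator`."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    # Recompute the statistic with each observation left out in turn.
    leave_one_out = [estimator(samples[:i] + samples[i + 1:]) for i in range(n)]
    loo_mean = mean(leave_one_out)
    # Jackknife variance: (n - 1) / n times the sum of squared deviations.
    variance = (n - 1) / n * sum((t - loo_mean) ** 2 for t in leave_one_out)
    return estimator(samples), math.sqrt(variance)

# Hypothetical station anomalies (degrees C), invented for illustration only.
anomalies = [0.42, 0.51, 0.38, 0.47, 0.55, 0.40]
estimate, std_err = jackknife_se(anomalies)
print(f"mean = {estimate:.3f}, jackknife SE = {std_err:.3f}")
```

For the sample mean this reduces to the familiar s/sqrt(n); the value of the procedure is that the same recipe applies to statistics with no closed-form standard error.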

160

Virus-PLoc: a fusion classifier for predicting the subcellular localization of viral proteins within host and virus-infected cells.  

PubMed

Viruses can reproduce their progenies only within a host cell, and their actions depend both on its destructive tendencies toward a specific host cell and on environmental conditions. Therefore, knowledge of the subcellular localization of viral proteins in a host cell or virus-infected cell is very useful for in-depth study of their functions and mechanisms as well as for designing antiviral drugs. An analysis of the Swiss-Prot database (version 50.0, released on May 30, 2006) indicates that only 23.5% of viral protein entries are annotated for their subcellular locations in this regard. As for the gene ontology database, the corresponding percentage is 23.8%. Such a gap calls for the development of high-throughput tools for timely annotation of the localization of viral proteins within host and virus-infected cells. In this article, a predictor called "Virus-PLoc" has been developed that fuses many basic classifiers, each engineered according to the K-nearest neighbor rule. The overall jackknife success rate obtained by Virus-PLoc in identifying the subcellular compartments of viral proteins was 80% for a benchmark dataset in which none of the proteins has more than 25% sequence identity to any other in the same location site. Virus-PLoc will be freely available as a web-server at http://202.120.37.186/bioinf/virus for public use. Furthermore, Virus-PLoc has been used to provide large-scale predictions of all viral protein entries in the Swiss-Prot database that do not have subcellular location annotations or are annotated as being uncertain. The results thus obtained have been deposited in a downloadable file prepared with Microsoft Excel and named "Tab_Virus-PLoc.xls." This file is available at the same website and will be updated twice a year to include the new entries of viral proteins and reflect the continuous development of Virus-PLoc. PMID:17120237

Shen, Hong-Bin; Chou, Kuo-Chen

2007-02-15

161

Benthic macrofauna habitat associations in Willapa Bay, Washington, USA  

NASA Astrophysics Data System (ADS)

Estuary-wide benthic macrofauna-habitat associations in Willapa Bay, Washington, United States, were determined for 4 habitats (eelgrass [Zostera marina], Atlantic cordgrass [Spartina alterniflora], mud shrimp [Upogebia pugettensis], ghost shrimp [Neotrypaea californiensis]) in 1996 and 7 habitats (eelgrass, Atlantic cordgrass, mud shrimp, ghost shrimp, oyster [Crassostrea gigas], bare mud/sand, subtidal) in 1998. Most benthic macrofaunal species inhabited multiple habitats; however, 2 dominants, a fanworm, Manayunkia aestuarina, in Spartina, and a sand dollar, Dendraster excentricus, in subtidal, were rare or absent in all other habitats. Benthic macrofaunal Bray-Curtis similarity varied among all habitats except eelgrass and oyster. There were significant differences among habitats within- and between-years on several of the following ecological indicators: mean number of species (S), abundance (A), biomass (B), abundance of deposit (AD), suspension (AS), and facultative (AF) feeders, Swartz's index (SI), Brillouin's index (H), and jackknife estimates of habitat species richness (HSR). In the 4 habitats sampled in both years, A was about 2.5× greater in 1996 (a La Niña year) than 1998 (a strong El Niño year) yet relative values of S, A, B, AD, AS, SI, and H among the habitats were not significantly different, indicating strong benthic macrofauna-habitat associations despite considerable climatic and environmental variability. In general, the rank order of habitats on indicators associated with high diversity and productivity (high S, A, B, SI, H, HSR) was eelgrass = oyster ≥ Atlantic cordgrass ≥ mud shrimp ≥ bare mud/sand ≥ ghost shrimp = subtidal. Vegetation, burrowing shrimp, and oyster density and sediment %silt + clay and %total organic carbon were generally poor, temporally inconsistent predictors of ecological indicator variability within habitats. The benthic macrofauna-habitat associations in this study can be used to help identify critical habitats, prioritize habitats for environmental protection, index habitat suitability, assess habitat equivalency, and as habitat value criteria in ecological risk assessments in Willapa Bay.

Ferraro, Steven P.; Cole, Faith A.

2007-02-01
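
The abstract above reports jackknife estimates of habitat species richness without giving the formula. One widely used form is the first-order jackknife, which adds to the observed species count a correction based on the number of species found in exactly one sample. The sketch below assumes that form; whether the study used it is not stated, and the function name `jackknife_richness` and the species data are hypothetical.

```python
def jackknife_richness(quadrats):
    """First-order jackknife estimate of species richness.

    `quadrats` is a list of sets, one set of species names per sample.
    """
    n = len(quadrats)
    if n == 0:
        raise ValueError("need at least one quadrat")
    observed = set().union(*quadrats)
    # Species recorded in exactly one quadrat ("uniques").
    uniques = sum(1 for sp in observed if sum(sp in q for q in quadrats) == 1)
    # Observed richness plus the jackknife correction uniques * (n - 1) / n.
    return len(observed) + uniques * (n - 1) / n

# Hypothetical presence data for four samples, invented for illustration only.
samples = [{"a", "b", "c"}, {"a", "b"}, {"a", "d"}, {"b", "e"}]
richness = jackknife_richness(samples)
print(f"first-order jackknife richness = {richness:.2f}")
```

The correction grows with the number of species seen only once, on the reasoning that many singletons signal further species still undetected.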

162

Ammonia- and methane-oxidizing microorganisms in high-altitude wetland sediments and adjacent agricultural soils.  

PubMed

Ammonia oxidation is known to be carried out by ammonia-oxidizing bacteria (AOB) and archaea (AOA), while methanotrophs (methane-oxidizing bacteria (MOB)) play an important role in mitigating methane emissions from the environment. However, the difference of AOA, AOB, and MOB distribution in wetland sediment and adjacent upland soil remains unclear. The present study investigated the abundances and community structures of AOA, AOB, and MOB in sediments of a high-altitude freshwater wetland in Yunnan Province (China) and adjacent agricultural soils. Variations of AOA, AOB, and MOB community sizes and structures were found in water lily-vegetated and Acorus calamus-vegetated sediments and agricultural soils (unflooded rice soil, cabbage soil, and garlic soil and flooded rice soil). AOB community size was higher than AOA in agricultural soils and lily-vegetated sediment, but lower in A. calamus-vegetated sediment. MOB showed a much higher abundance than AOA and AOB. Flooded rice soil had the largest AOA, AOB, and MOB community sizes. Principal coordinate analyses and Jackknife Environment Clusters analyses suggested that unflooded and flooded rice soils had relatively similar AOA, AOB, and MOB structures. Cabbage soil and A. calamus-vegetated sediment had relatively similar AOA and AOB structures, but their MOB structures showed a large difference. Nitrososphaera-like microorganisms were the predominant AOA species in garlic soil but were present with a low abundance in unflooded rice soil and cabbage soil. Nitrosospira-like AOB were dominant in wetland sediments and agricultural soils. Type I MOB Methylocaldum and type II MOB Methylocystis were dominant in wetland sediments and agricultural soils. Moreover, Pearson's correlation analysis indicated that AOA Shannon diversity was positively correlated with the ratio of organic carbon to nitrogen (p?

Yang, Yuyin; Shan, Jingwen; Zhang, Jingxu; Zhang, Xiaoling; Xie, Shuguang; Liu, Yong

2014-12-01

163

Technical efficiency of district hospitals: Evidence from Namibia using Data Envelopment Analysis  

PubMed Central

Background: In most countries of sub-Saharan Africa, health care needs have been increasing due to emerging and re-emerging health problems. However, the supply of health care resources to address the problems has been continuously declining, thus jeopardizing the progress towards achieving the health-related Millennium Development Goals. Namibia is no exception to this. It is therefore necessary to quantify the level of technical inefficiency in the countries so as to alert policy makers of the potential resource gains to the health system if the hospitals that absorb a lion's share of the available resources are technically efficient. Method: All public sector hospitals (N = 30) were included in the study. Hospital capacity utilization ratios and the data envelopment analysis (DEA) technique were used to assess technical efficiency. The DEA model used three inputs and two outputs. Data for four financial years (1997/98 to 2000/2001) was used for the analysis. To test for the robustness of the DEA technical efficiency scores the Jackknife analysis was used. Results: The findings suggest the presence of substantial degree of pure technical and scale inefficiency. The average technical efficiency level during the given period was less than 75%. Less than half of the hospitals included in the study were located on the technically efficient frontier. Increasing returns to scale is observed to be the predominant form of scale inefficiency. Conclusion: It is concluded that the existing level of pure technical and scale inefficiency of the district hospitals is considerably high and may negatively affect the government's initiatives to improve access to quality health care and scaling up of interventions that are necessary to achieve the health-related Millennium Development Goals. It is recommended that the inefficient hospitals learn from their efficient peers identified by the DEA model so as to improve the overall performance of the health system. PMID:16566818

Zere, Eyob; Mbeeli, Thomas; Shangula, Kalumbi; Mandlhate, Custodia; Mutirua, Kautoo; Tjivambi, Ben; Kapenambili, William

2006-01-01

164

THREE-POINT CORRELATION FUNCTIONS OF SDSS GALAXIES: CONSTRAINING GALAXY-MASS BIAS  

SciTech Connect

We constrain the linear and quadratic bias parameters from the configuration dependence of the three-point correlation function (3PCF) in both redshift and projected space, utilizing measurements of spectroscopic galaxies in the Sloan Digital Sky Survey Main Galaxy Sample. We show that bright galaxies (M_r < -21.5) are biased tracers of mass, measured at a significance of 4.5σ in redshift space and 2.5σ in projected space by using a thorough error analysis in the quasi-linear regime (9-27 h^-1 Mpc). Measurements on a fainter galaxy sample are consistent with an unbiased model. We demonstrate that a linear bias model appears sufficient to explain the galaxy-mass bias of our samples, although a model using both linear and quadratic terms results in a better fit. In contrast, the bias values obtained from the linear model appear in better agreement with the data by inspection of the relative bias and yield implied values of σ_8 that are more consistent with current constraints. We investigate the covariance of the 3PCF, which itself is a measurement of galaxy clustering. We assess the accuracy of our error estimates by comparing results from mock galaxy catalogs to jackknife re-sampling methods. We identify significant differences in the structure of the covariance. However, the impact of these discrepancies appears to be mitigated by an eigenmode analysis that can account for the noisy, unresolved modes. Our joint analysis of both redshift space and projected measurements allows us to identify systematic effects affecting constraints from the 3PCF.

McBride, Cameron K. [Department of Physics and Astronomy, University of Pittsburgh, Pittsburgh, PA 15260 (United States); Connolly, Andrew J. [Department of Astronomy, University of Washington, Seattle, WA 98195-1580 (United States); Gardner, Jeffrey P. [Department of Physics, University of Washington, Seattle, WA 98195-1560 (United States); Scranton, Ryan [Department of Physics, University of California, Davis, CA 95616 (United States); Scoccimarro, Roman [Center for Cosmology and Particle Physics, New York University, New York, NY 10003 (United States); Berlind, Andreas A. [Department of Physics and Astronomy, Vanderbilt University, Nashville, TN 37235 (United States); Marín, Felipe [Department of Astronomy and Astrophysics, Kavli Institute for Cosmological Physics, University of Chicago, Chicago, IL 60637 (United States); Schneider, Donald P., E-mail: cameron.mcbride@vanderbilt.edu [Department of Astronomy and Astrophysics, Pennsylvania State University, University Park, PA 16802 (United States)

2011-10-01

165

A comparison of Australian and USA radiologists' performance in detection of breast cancer  

NASA Astrophysics Data System (ADS)

The aim of the current work was to compare the performance of radiologists that read a higher number of cases to those that read a lower number, as well as examine the effect of number of years of experience on performance. This study compares Australian and USA radiologists with differing levels of experience when reading mammograms. Thirty mammographic cases were presented to 41 radiologists, 21 from Australia and 20 from the USA. Readers were asked to locate and visualize cancer and assign a mark-rating pair with confidence levels from 1 to 5. A jackknife free-response receiver operating characteristic (JAFROC), inferred receiver operating characteristic (ROC), sensitivity, specificity and location sensitivity were calculated. A Mann-Whitney test was used to compare the performance of Australian and USA radiologists using SPSS software. The results showed that the USA radiologists sampled had more years of experience (p?0.01) but read fewer mammograms per year (p?0.03). Significantly higher sensitivity and location sensitivity (p? 0.001) were found for the Australian radiologists when experience and the number of mammograms read per year were taken into account. There were no differences between the two countries in overall performance measured by JAFROC and inferred ROC. For the most experienced radiologists within the Australian sample experienced ROC and location sensitivity were higher when compared to the least experienced. The increased number of years of experience of the USA radiologists did not result in an increase in any performance metrics. The number of cases per year is a better predictor of improved diagnostic performance.

Suleiman, Wasfi I.; Georgian-Smith, Dianne; Evanoff, Michael G.; Lewis, Sarah; McEntee, Mark F.

2014-03-01

166

Absolute and relative locations of earthquakes at Mount St. Helens, Washington, using continuous data: implications for magmatic processes: Chapter 4 in A volcano rekindled: the renewed eruption of Mount St. Helens, 2004-2006  

USGS Publications Warehouse

This study uses a combination of absolute and relative locations from earthquake multiplets to investigate the seismicity associated with the eruptive sequence at Mount St. Helens between September 23, 2004, and November 20, 2004. Multiplets, a prominent feature of seismicity during this time period, occurred as volcano-tectonic, hybrid, and low-frequency earthquakes spanning a large range of magnitudes and lifespans. Absolute locations were improved through the use of a new one-dimensional velocity model with excellent shallow constraints on P-wave velocities. We used jackknife tests to minimize possible biases in absolute and relative locations resulting from station outages and changing station configurations. In this paper, we show that earthquake hypocenters shallowed before the October 1 explosion along a north-dipping structure under the 1980-86 dome. Relative relocations of multiplets during the initial seismic unrest and ensuing eruption showed rather small source volumes before the October 1 explosion and larger tabular source volumes after October 5. All multiplets possess absolute locations very close to each other. However, the highly dissimilar waveforms displayed by each of the multiplets analyzed suggest that different sources and mechanisms were present within a very small source volume. We suggest that multiplets were related to pressurization of the conduit system that produced a stationary source that was highly stable over long time periods. On the basis of their response to explosions occurring in October 2004, earthquakes not associated with multiplets also appeared to be pressure dependent. The pressure source for these earthquakes appeared, however, to be different from the pressure source of the multiplets.

Thelen, Weston A.; Crosson, Robert S.; Creager, Kenneth C.

2008-01-01

167

Evaluating decadal predictions - some considerations on bias correction and cross-validation  

NASA Astrophysics Data System (ADS)

Substantial efforts are currently under way to improve prediction capability on time scales of a few years to a decade. In seasonal forecasting, validation techniques based on past model performance are well-established tools. So far there is no consensus on the degree to which these are also applicable to decadal predictions. We contribute to this discussion by assessing the effects of drift correction and cross-validation on the skill estimates. The study employs decadal hindcasts of 2 m temperature from the EU FP6 ENSEMBLES project and a synthetic toy model. Decadal predictions can be subject to substantial lead-time-dependent model drifts. The conventional drift-correction method has a considerable sampling uncertainty, taking up to 40% of the potentially predictable signal. Introducing a smooth drift curve reduces this uncertainty by about 30% for annual values. The typical leave-one-out cross-validation, as recommended for seasonal forecasting, may lead to biased skill estimates for decadal prediction due to the small number of hindcasts available. We identify this effect and show that "Jackknifing" represents a suitable technique to estimate potential skill without bias and to estimate sampling uncertainty. Results indicate significant correlation skill on the order of 0.7-0.9 for predicting global annual mean temperature on all lead-times. The strong sampling uncertainty due to the lack of independent and representative decadal hindcasts remains the key problem for the evaluation. It prohibits drawing a final conclusion, by means of verification, on whether or not decadal predictions have skill in predicting climate variability beyond a simpler trend estimate.

Gangsto, R.; Weigel, A. P.; Appenzeller, C.; Fischer, A.; Liniger, M. A.

2012-04-01
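
The abstract above does not spell out the authors' jackknife procedure. As a minimal sketch only, the standard delete-one jackknife applied to a correlation skill score could look like the following; the function names and the hindcast/observation values are hypothetical and are not taken from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def jackknife(stat, x, y):
    """Delete-one jackknife: (bias-corrected estimate, standard error)."""
    n = len(x)
    full = stat(x, y)
    # Recompute the statistic n times, leaving out one pair each time.
    loo = [stat(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(n)]
    mean_loo = sum(loo) / n
    bias_corrected = n * full - (n - 1) * mean_loo
    se = math.sqrt((n - 1) / n * sum((v - mean_loo) ** 2 for v in loo))
    return bias_corrected, se

# Hypothetical hindcast and observed anomalies (illustration only).
hindcast = [0.1, 0.3, 0.2, 0.5, 0.4, 0.7, 0.6, 0.9]
observed = [0.0, 0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 0.8]
skill, skill_se = jackknife(pearson, hindcast, observed)
print(f"jackknife correlation skill: {skill:.3f} +/- {skill_se:.3f}")
```

With only a handful of hindcasts the standard error is large relative to the estimate, which is the sampling-uncertainty problem the abstract describes.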

168

Comparison between chest digital tomosynthesis and CT as a screening method to detect artificial pulmonary nodules: a phantom study  

PubMed Central

Objectives The objective of this study was to evaluate the imaging capabilities of chest digital tomosynthesis (DT) as a screening method for the detection of artificial pulmonary nodules, and to compare its efficiency with that of CT. Methods DT and CT were used to detect artificial pulmonary nodules (5 mm and 8 mm in diameter, ground-glass opacities) placed in a chest phantom. Using a three-dimensional filtered back-projection algorithm at acquisition angles of 8°, 20°, 30° and 40°, DT images of the desired layer thicknesses were reconstructed from the image data acquired during a single tomographic scan. Both standard and sharp CT reconstruction kernels were used, and the detectability index (DI) values computed for both the DT scan acquisition angles and CT reconstruction kernel types were considered. For the observer study, we examined 50 samples of artificial pulmonary nodules using both DT and CT imaging. On the basis of evaluations made by five thoracic radiologists, a jackknife free-response receiver operating characteristic (JAFROC) study was performed to compare and assess the differences in detection accuracy between CT and DT imaging. Results For each increased acquisition angle, DI obtained by DT imaging was similar to that obtained by CT imaging. The difference in the observer-averaged JAFROC figure of merit for the five readings was 0.0363 (95% confidence interval: −0.18, 0.26; F=0.101; p=0.75). Conclusion With the advantages of a decreased radiation dose and the practical accessibility of examination, DT may be a useful alternative to CT for the detection of artificial pulmonary nodules. PMID:22422390

Gomi, T; Nakajima, M; Fujiwara, H; Takeda, T; Saito, K; Umeda, T; Sakaguchi, K

2012-01-01

169

Polynesian ant (Hymenoptera: Formicidae) species richness and distribution: a regional survey  

NASA Astrophysics Data System (ADS)

Thirteen Polynesian islands, including five true atolls, an uplifted atoll, and seven high volcanic islands of varying ages, were surveyed for ants by hand collecting techniques. Ten of the thirteen islands had been surveyed previously, and more species were found in the present survey than were known from all earlier surveys combined, with two exceptions (Ducie Atoll and Easter Island). This represents the first report of the Argentine ant, Linepithema humile Mayr, from Easter Island. L. humile is a very successful pest species which has only recently invaded Easter Island, and is now very abundant and widespread, occurring at 16 of the 17 sample sites scattered across the island. The introduction of this species is almost certainly responsible for the apparent decline in species richness on Easter Island. In general, more species were present on high islands than atolls of a similar size, and elevation was significant while log (area) and latitude were not in a multiple linear regression with ant species number as the dependent variable. Not enough time was spent on the islands to survey their ant faunas completely, and extrapolations from species effort curves and jackknife estimators of earlier, thorough surveys for ants in the Society Islands suggest that only about 50% of the total species were collected in the present survey, at least on the high islands. My collections were probably more complete on the atolls. The increase in species numbers from the present survey relative to known species richnesses (particularly when a large fraction of the species actually present were probably not included in the present survey) supports the hypothesis that remote Polynesian islands are not as depauperate in terms of ant species numbers as previously thought.

Morrison, Lloyd W.

1997-11-01

170

An Ancient Origin for the Enigmatic Flat-Headed Frogs (Bombinatoridae: Barbourula) from the Islands of Southeast Asia  

PubMed Central

Background The complex history of Southeast Asian islands has long been of interest to biogeographers. Dispersal and vicariance events in the Pleistocene have received the most attention, though recent studies suggest a potentially more ancient history to components of the terrestrial fauna. Among this fauna is the enigmatic archaeobatrachian frog genus Barbourula, which only occurs on the islands of Borneo and Palawan. We utilize this lineage to gain unique insight into the temporal history of lineage diversification in Southeast Asian islands. Methodology/Principal Findings Using mitochondrial and nuclear genetic data, multiple fossil calibration points, and likelihood and Bayesian methods, we estimate phylogenetic relationships and divergence times for Barbourula. We determine the sensitivity of focal divergence times to specific calibration points using a jackknife approach in which each calibration point is excluded from analysis in turn. We find that relevant divergence time estimates are robust to the exclusion of specific calibration points. Barbourula is recovered as a monophyletic lineage nested within a monophyletic Costata. Barbourula diverged from its sister taxon Bombina in the Paleogene and the two species of Barbourula diverged in the Late Miocene. Conclusions/Significance The divergences within Barbourula and between it and Bombina are surprisingly old and represent the oldest estimates for a cladogenetic event resulting in living taxa endemic to Southeast Asian islands. Moreover, these divergence time estimates are consistent with a new biogeographic scenario: the Palawan Ark Hypothesis. We suggest that components of Palawan's terrestrial fauna might have “rafted” on emergent portions of the North Palawan Block during its migration from the Asian mainland to its present-day position near Borneo. Further, dispersal from Palawan to Borneo (rather than Borneo to Palawan) may explain the current day disjunct distribution of this ancient lineage. PMID:20711504

Blackburn, David C.; Bickford, David P.; Diesmos, Arvin C.; Iskandar, Djoko T.; Brown, Rafe M.

2010-01-01

171

The effect of image processing on the detection of cancers in digital mammography.  

PubMed

OBJECTIVE. The objective of our study was to investigate the effect of image processing on the detection of cancers in digital mammography images. MATERIALS AND METHODS. Two hundred seventy pairs of breast images (both breasts, one view) were collected from eight systems using Hologic amorphous selenium detectors: 80 image pairs showed breasts containing subtle malignant masses; 30 image pairs, biopsy-proven benign lesions; 80 image pairs, simulated calcification clusters; and 80 image pairs, no cancer (normal). The 270 image pairs were processed with three types of image processing: standard (full enhancement), low contrast (intermediate enhancement), and pseudo-film-screen (no enhancement). Seven experienced observers inspected the images, locating and rating regions they suspected to be cancer for likelihood of malignancy. The results were analyzed using a jackknife-alternative free-response receiver operating characteristic (JAFROC) analysis. RESULTS. The detection of calcification clusters was significantly affected by the type of image processing: The JAFROC figure of merit (FOM) decreased from 0.65 with standard image processing to 0.63 with low-contrast image processing (p = 0.04) and from 0.65 with standard image processing to 0.61 with film-screen image processing (p = 0.0005). The detection of noncalcification cancers was not significantly different among the image-processing types investigated (p > 0.40). CONCLUSION. These results suggest that image processing has a significant impact on the detection of calcification clusters in digital mammography. For the three image-processing versions and the system investigated, standard image processing was optimal for the detection of calcification clusters. The effect on cancer detection should be considered when selecting the type of image processing in the future. PMID:25055275

Warren, Lucy M; Given-Wilson, Rosalind M; Wallis, Matthew G; Cooke, Julie; Halling-Brown, Mark D; Mackenzie, Alistair; Chakraborty, Dev P; Bosmans, Hilde; Dance, David R; Young, Kenneth C

2014-08-01

172

Do adolescent Ecstasy users have different attitudes towards drugs when compared to Marijuana users?  

PubMed Central

Background Perceived risk and attitudes about the consequences of drug use, perceptions of others' expectations and self-efficacy influence the intent to try drugs and continue drug use once use has started. We examine associations between adolescents’ attitudes and beliefs towards ecstasy use; because most ecstasy users have a history of marijuana use, we estimate the association for three groups of adolescents: non-marijuana/ecstasy users, marijuana users (used marijuana at least once but never used ecstasy) and ecstasy users (used ecstasy at least once). Methods Data were drawn from 5,049 adolescents aged 12–18 years who had complete weighted data information in Round 2 of the Restricted Use Files (RUF) of the National Survey of Parents and Youth (NSPY). Data were analyzed using jackknife weighted multinomial logistic regression models. Results Adolescent marijuana and ecstasy users were more likely to approve of marijuana and ecstasy use as compared to non-drug using youth. Adolescent marijuana and ecstasy users were more likely to have close friends who approved of ecstasy as compared to non-drug using youth. The magnitudes of these two associations were stronger for ecstasy use than for marijuana use in the final adjusted model. Our final adjusted model shows that approval of marijuana and ecstasy use was more strongly associated with marijuana and ecstasy use in adolescence than perceived risk in using both drugs. Conclusion Information about the risks and consequences of ecstasy use needs to be presented to adolescents in order to attempt to reduce adolescents’ approval of ecstasy use as well as ecstasy experimentation. PMID:18068314

Martins, Silvia S.; Storr, Carla L.; Alexandre, Pierre K.; Chilcoat, Howard D.

2008-01-01

173

Measuring modality ordering consistency of observer performance paradigms  

NASA Astrophysics Data System (ADS)

Two observer performance paradigms applied to the same modalities, readers and cases are said to order the modalities consistently if both confirm the same sign (positive or negative) of the figure of merit difference. The aim of this work was to develop a modality ordering consistency measure. The paradigms considered were receiver operating characteristic (ROC) and jackknife alternative free-response ROC (JAFROC). Clinical FROC data from a previous study was used. Using the highest rating method ROC ratings were inferred from FROC ratings. JAFROC analyses of the FROC data and Dorfman-Berbaum-Metz multiple-reader multiple-case (DBM-MRMC) analysis of the inferred ROC data showed significant and consistent differences in the two figures of merit. Additionally 2000 bootstrap data sets were sampled and analyzed by JAFROC and DBM-MRMC. It was found that a positive JAFROC figure of merit difference was 101 times more likely when the ROC difference was positive than when the ROC difference was negative (odds ratio = 101). Valid modality ordering consistency (or inconsistency) claims are possible only when both figures of merit differences are statistically significant. For those bootstraps where both JAFROC and ROC yielded significant differences there were no inconsistent orderings. The effect of artificially degrading JAFROC performance was investigated. It was found that the odds ratio was more sensitive to the degradation. The results in this work are likely to be optimistic. A more realistic test of modality ordering consistency would require two separate studies (FROC and ROC) using the same readers and cases.

Chakraborty, D. P.; Zanca, Federica

2010-02-01

174

Testate Amoebae as Paleohydrological Proxies in the Florida Everglades  

NASA Astrophysics Data System (ADS)

The largest wetland restoration effort ever attempted, the Comprehensive Everglades Restoration Plan (CERP), is currently underway in the Florida Everglades, and a critical goal of CERP is reestablishment of the pre-drainage (pre-AD 1880) hydrology. Paleoecological research in the greater Everglades ecosystem is underway to reconstruct past water levels and variability throughout the system, providing a basis for restoration targets. Testate amoebae, a group of unicellular organisms that form decay-resistant tests, have been successfully used in northern-latitude bogs to reconstruct past wetland hydrology; however, their application in other peatland types, particularly at lower latitudes, has not been well studied. We assessed the potential use of testate amoebae as tools to reconstruct the past hydrology of the Everglades. Modern surface samples were collected from the Everglades National Park and Water Conservation Areas, across a water table gradient that included four vegetation types (tree island interior, tree island edge, sawgrass transition, slough). Community composition was quantified and compared to environmental conditions (water table, pH, vegetation) using ordination and gradient-analysis approaches. Results of nonmetric multidimensional scaling revealed that the most important pattern of community change, representing about 30% of the variance in the dataset, was related to water-table depth (r2=0.32). Jackknifed cross-validation of a transfer function for water table depth, based on a simple weighted average model, indicated the potential for testate amoebae in studies of past Everglades hydrology (RMSEP = 9 cm, r2=0.47). Although the performance of the transfer function was not as good as those from northern-latitude bogs, our results suggest that testate amoebae could be a valuable tool in paleohydrological studies of the Everglades, particularly when used with other hydrological proxies (e.g., pollen, plant macrofossils, diatoms).

Andrews, T.; Booth, R.; Bernhardt, C. E.; Willard, D. A.

2011-12-01
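
The abstract above reports a jackknifed (leave-one-out) RMSEP for a simple weighted-average transfer function. The sketch below illustrates that general technique under stated assumptions: no deshrinking or tolerance weighting is applied, and the abundance table and water-table depths are hypothetical numbers, not the Everglades data.

```python
import math

def wa_optima(abund, env):
    """Taxon optima: abundance-weighted mean of the environmental variable."""
    optima = []
    for k in range(len(abund[0])):
        total = sum(row[k] for row in abund)
        weighted = sum(row[k] * x for row, x in zip(abund, env))
        # A taxon absent from every training sample has no optimum.
        optima.append(weighted / total if total > 0 else None)
    return optima

def wa_predict(row, optima):
    """Predict the variable for one sample from its taxa and their optima."""
    pairs = [(a, u) for a, u in zip(row, optima) if u is not None and a > 0]
    # Assumes at least one taxon in the sample also occurs in the training set.
    return sum(a * u for a, u in pairs) / sum(a for a, _ in pairs)

def loo_rmsep(abund, env):
    """Leave-one-out root-mean-square error of prediction."""
    sq_errors = []
    for i in range(len(env)):
        train_abund = abund[:i] + abund[i + 1:]
        train_env = env[:i] + env[i + 1:]
        pred = wa_predict(abund[i], wa_optima(train_abund, train_env))
        sq_errors.append((pred - env[i]) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Hypothetical counts for three taxa in six samples, with depths in cm.
abund = [[8, 2, 0], [6, 4, 0], [3, 5, 2], [1, 5, 4], [0, 3, 7], [0, 1, 9]]
depth = [5.0, 10.0, 20.0, 30.0, 40.0, 50.0]
rmsep = loo_rmsep(abund, depth)
print(f"leave-one-out RMSEP: {rmsep:.1f} cm")
```

Because each sample is predicted from optima fitted without it, the resulting RMSEP is a less optimistic error measure than the apparent (in-sample) fit.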

175

Grid search modeling of receiver functions: Implications for crustal structure in the Middle East and North Africa  

SciTech Connect

A grid search is used to estimate average crustal thickness and shear wave velocity structure beneath 12 three-component broadband seismic stations in the Middle East, North Africa, and nearby regions. The crustal thickness in these regions is found to vary from a minimum of 8.0 ± 1.5 km in East Africa (Afar) region to possibly a maximum of 64 ± 4.8 km in the lesser Caucasus. Stations located within the stable African platform indicate a crustal thickness of about 40 km. Teleseismic three-component waveform data produced by 165 earthquakes are used to create receiver function stacks for each station. Using a grid search, we have solved for the optimal and most simple shear velocity models beneath all 12 stations. Unlike other techniques (linearized least squares or forward modeling), the grid search methodology guarantees that we solve for the global minimum within our defined model parameter space. Using the grid search, we also qualitatively estimate the least number of layers required to model the observed receiver functions' major seismic phases (e.g., PS_Moho). A jackknife error estimation method is used to test the stability of our receiver function inversions for all 12 stations in the region that had recorded a sufficient number of high-quality broadband teleseismic waveforms. Five of the 12 estimates of crustal thicknesses are consistent with what is known of crustal structure from prior geophysical work. Furthermore, the remaining seven estimates of crustal structure are in regions for which previously there were few or no data about crustal thickness. © 1998 American Geophysical Union

Sandvol, E.; Seber, D.; Calvert, A.; Barazangi, M. [Institute for the Study of the Continents, Cornell University, Ithaca, New York (United States)]

1998-11-01

176

EMPeror: a tool for visualizing high-throughput microbial community data  

PubMed Central

Background As microbial ecologists take advantage of high-throughput sequencing technologies to describe microbial communities across ever-increasing numbers of samples, new analysis tools are required to relate the distribution of microbes among larger numbers of communities, and to use increasingly rich and standards-compliant metadata to understand the biological factors driving these relationships. In particular, the Earth Microbiome Project drives these needs by profiling the genomic content of tens of thousands of samples across multiple environment types. Findings Features of EMPeror include: ability to visualize gradients and categorical data, visualize different principal coordinates axes, present the data in the form of parallel coordinates, show taxa as well as environmental samples, dynamically adjust the size and transparency of the spheres representing the communities on a per-category basis, dynamically scale the axes according to the fraction of variance each explains, show, hide or recolor points according to arbitrary metadata including that compliant with the MIxS family of standards developed by the Genomic Standards Consortium, display jackknifed-resampled data to assess statistical confidence in clustering, perform coordinate comparisons (useful for procrustes analysis plots), and greatly reduce loading times and overall memory footprint compared with existing approaches. Additionally, ease of sharing, given EMPeror’s small output file size, enables agile collaboration by allowing users to embed these visualizations via emails or web pages without the need for extra plugins. Conclusions Here we present EMPeror, an open source and web browser enabled tool with a versatile command line interface that allows researchers to perform rapid exploratory investigations of 3D visualizations of microbial community data, such as the widely used principal coordinates plots. 
EMPeror includes a rich set of controllers to modify features as a function of the metadata. By being specifically tailored to the requirements of microbial ecologists, EMPeror thus increases the speed with which insight can be gained from large microbiome datasets. PMID:24280061

2013-01-01

177

Efficacy of digital breast tomosynthesis for breast cancer diagnosis  

NASA Astrophysics Data System (ADS)

Purpose: To compare the diagnostic performance of digital breast tomosynthesis (DBT) in combination with digital mammography (DM) with that of digital mammography alone. Materials and Methods: Twenty-six experienced radiologists who specialized in breast imaging read 50 cases (27 cancers and 23 non-cancer cases) of patients who underwent DM and DBT. Both exams included the craniocaudal (CC) and mediolateral oblique (MLO) views. Histopathologic examination established truth in all lesions. Each case was interpreted in two modes, once with DM alone followed by DM+DBT, and the observers were asked to mark the location of any lesions, if present, and give each a score based on a five-category assessment by the Royal Australian and New Zealand College of Radiologists (RANZCR). The diagnostic performance of DM compared with that of DM+DBT was evaluated in terms of the difference between areas under receiver-operating characteristic curves (AUCs), Jackknife free-response receiver operator characteristics (JAFROC) figure-of-merit, sensitivity, location sensitivity and specificity. Results: Average AUC and JAFROC figure of merit for DM versus DM+DBT were significantly different (AUC 0.690 vs. 0.781, p < 0.0001; JAFROC 0.618 vs. 0.732, p < 0.0001). In addition, the use of DM+DBT resulted in an improvement in sensitivity (0.629 vs. 0.701, p=0.0011), location sensitivity (0.548 vs. 0.690, p < 0.0001) and specificity (0.656 vs. 0.758, p=0.0015) when compared to DM alone. Conclusion: Adding DBT to the standard DM significantly improved radiologists' performance in terms of AUCs, JAFROC figure of merit, sensitivity, location sensitivity and specificity values.

Alakhras, M.; Mello-Thoms, C.; Rickard, M.; Bourne, R.; Brennan, P. C.

2014-03-01

178

Phylogenetic studies favour the unification of Pennisetum, Cenchrus and Odontelytrum (Poaceae): a combined nuclear, plastid and morphological analysis, and nomenclatural combinations in Cenchrus  

PubMed Central

Background and Aims Twenty-five genera having sterile inflorescence branches were recognized as the bristle clade within the x = 9 Paniceae (Panicoideae). Within the bristle clade, taxonomic circumscription of Cenchrus (20–25 species), Pennisetum (80–140) and the monotypic Odontelytrum is still unclear. Several criteria have been applied to characterize Cenchrus and Pennisetum, but none of these has proved satisfactory as the diagnostic characters, such as fusion of bristles in the inflorescences, show continuous variation. Methods A phylogenetic analysis based on morphological, plastid (trnL-F, ndhF) and nuclear (knotted) data is presented for a representative species sampling of the genera. All analyses were conducted under parsimony, using heuristic searches with TBR branch swapping. Branch support was assessed with parsimony jackknifing. Key Results Based on plastid and morphological data, Pennisetum, Cenchrus and Odontelytrum were supported as a monophyletic group: the PCO clade. Only one section of Pennisetum (Brevivalvula) was supported as monophyletic. The position of P. lanatum differed among data partitions, although the combined plastid and morphology and nuclear analyses showed this species to be a member of the PCO clade. The basic chromosome number x = 9 was found to be plesiomorphic, and x = 5, 7, 8, 10 and 17 were derived states. The nuclear phylogenetic analysis revealed a reticulate pattern of relationships among Pennisetum and Cenchrus, suggesting that there are at least three different genomes. Because apomixis can be transferred among species through hybridization, its history most likely reflects crossing relationships, rather than multiple independent appearances. 
Conclusions Due to the consistency between the present results and different phylogenetic hypotheses (including morphological, developmental and multilocus approaches), and the high support found for the PCO clade, also including the type species of the three genera, we propose unification of Pennisetum, Cenchrus and Odontelytrum. Species of Pennisetum and Odontelytrum are here transferred into Cenchrus, which has priority. Sixty-six new combinations are made here. PMID:20570830

Chemisquy, M. Amelia; Giussani, Liliana M.; Scataglini, Maria A.; Kellogg, Elizabeth A.; Morrone, Osvaldo

2010-01-01

179

External Quality Assessment (EQA) program for the preanalytical and analytical immunohistochemical determination of HER2 in breast cancer: an experience on a regional scale  

PubMed Central

Background An External Quality Assessment (EQA) program was developed to investigate the state of the art of HER2 immunohistochemical determination in breast cancer (BC) in 16 Pathology Departments in the Lazio Region (Italy). This program was implemented through two specific steps to evaluate HER2 staining (step 1) and interpretation (step 2) reproducibility among participants. Methods The management activities of this EQA program were assigned to the Coordinating Center (CC), the Revising Centers (RCs) and the Participating Centers (PCs). In step 1, 4 BC sections, selected by RCs, were stained by each PC using their own procedures. In step 2, each PC interpreted HER2 score in 10 BC sections stained by the CC. The concordance pattern was evaluated by using the kappa category-specific statistic and/or the weighted kappa statistic with the corresponding 95% Jackknife confidence interval. Results In step 1, a substantial/almost perfect agreement was reached between the PCs for scores 0 and 3+ whereas a moderate and fair agreement was observed for scores 1+ and 2+, respectively. In step 2, a fully satisfactory agreement was observed for 6 out of the 16 PCs and a quite satisfactory agreement was obtained for the remaining 10 PCs. Conclusions Our findings highlight that in the whole HER2 evaluation process the two intermediate categories, scores 1+ and 2+, are less reproducible than scores 0 and 3+. These findings are relevant in clinical practice where the choice of treatment is based on HER2 positivity, suggesting the need to share evaluation procedures within laboratories and implement educational programs. PMID:23965490

2013-01-01
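
The abstract above reports weighted kappa with a 95% jackknife confidence interval but does not give the computation. The following is a minimal sketch of one common way to do this for two raters: linearly weighted kappa, delete-one pseudo-values, and a normal-approximation interval. The two-rater setup, the linear weights, the z value of 1.96 and the ratings themselves are all assumptions made for illustration; the study involved 16 centers and its exact estimator is not stated.

```python
import math

def weighted_kappa(r1, r2, n_cat=4):
    """Linearly weighted Cohen's kappa for ordinal scores 0..n_cat-1."""
    n = len(r1)
    # Observed mean disagreement, scaled to [0, 1].
    observed = sum(abs(a - b) for a, b in zip(r1, r2)) / (n * (n_cat - 1))
    p1 = [r1.count(c) / n for c in range(n_cat)]
    p2 = [r2.count(c) / n for c in range(n_cat)]
    # Disagreement expected by chance from the two raters' marginals.
    expected = sum(p1[i] * p2[j] * abs(i - j)
                   for i in range(n_cat) for j in range(n_cat)) / (n_cat - 1)
    return 1.0 - observed / expected

def jackknife_ci(r1, r2, z=1.96):
    """Return (lower, estimate, upper) from delete-one pseudo-values."""
    n = len(r1)
    full = weighted_kappa(r1, r2)
    pseudo = []
    for i in range(n):
        loo = weighted_kappa(r1[:i] + r1[i + 1:], r2[:i] + r2[i + 1:])
        pseudo.append(n * full - (n - 1) * loo)
    mean = sum(pseudo) / n
    var = sum((p - mean) ** 2 for p in pseudo) / (n - 1)
    se = math.sqrt(var / n)
    return mean - z * se, mean, mean + z * se

# Hypothetical HER2 scores (0, 1+, 2+, 3+ coded as 0..3) from two readers.
reader_a = [0, 0, 1, 1, 2, 2, 3, 3, 3, 0]
reader_b = [0, 1, 1, 2, 2, 2, 3, 3, 2, 0]
lower, kappa_jk, upper = jackknife_ci(reader_a, reader_b)
print(f"weighted kappa: {kappa_jk:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

For small numbers of cases a t quantile is often used in place of 1.96, and the interval can extend beyond 1, so it is usually truncated when reported.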

180

Differences in population size variability among populations and species of the family Salmonidae.  

PubMed

1. How population sizes vary with time is an important ecological question with both practical and theoretical implications. Because population size variability corresponds to the operation of density-dependent mechanisms and the presence of stable states, numerous researchers have attempted to conduct broad taxonomic comparisons of population size variability. 2. Most comparisons of population size variability suggest a general lack of taxonomic differences. However, these comparisons may conflate differences within taxonomic levels with differences among taxonomic levels. Further, the degree to which intraspecific differences may affect broader inferences has generally not been estimated and has largely been ignored. 3. To address this uncertainty, we examined intraspecific differences in population size variability for a total of 131 populations distributed among nine species of the Salmonidae. We extended this comparison to the interspecific level by developing species level estimates of population size variability. 4. We used a jackknife (re-sampling) approach to estimate intra- and interspecific variation in population size variability. We found significant intraspecific differences in how population sizes vary with time in all six species of salmonids where it could be tested as well as clear interspecific differences. Further, despite significant interspecific variation, the majority of variation present was at the intraspecific level. Finally, we found that classic and recently developed measures of population variability lead to concordant inferences. 5. The presence of significant intraspecific differences in all species examined suggests that the ability to detect broad taxonomic patterns in how population sizes change over time may be limited if variance is not properly partitioned among and within taxonomic levels. PMID:20412345

Dochtermann, Ned A; Peacock, Mary M

2010-07-01

181

Lower-extremity strength profiles and gender-based classification of basketball players ages 9-22 years.  

PubMed

Despite an increase in women sports participants and recognition of gender differences in injury patterns (e.g., knee), few normative strength data exist beyond hamstrings and quadriceps measures. This study had 2 purposes: to assess the lower-extremity strength of women (W) and men (M) basketball players who were 9-22 years old, and to determine which strength measures most correctly classify the gender of 12- to 22-year-old athletes. Fifty basketball players (26 W, 24 M) without ligamentous or meniscal injury performed concentric isokinetic testing of bilateral hip, knee, and ankle musculature. We identified maximal peak torques for the hip (flexors, extensors, abductors, adductors), knee (flexors and extensors), and ankle (plantar flexors and dorsiflexors), and we formed periarticular (hip, knee, and ankle), antigravity, and total leg strength composite measures. We calculated mean and 95% confidence intervals. With body mass-height normalization, most age and gender differences were small. Mean values were typically higher for older vs. younger players and for men vs. women players. Mean values were often lower for girls 12-13 years vs. those 9-10 years. In the age group of 16-22 years, men had stronger knee flexors, hip flexors, plantar flexors, and total leg strength than women. Men who were 16-22 years old had stronger knee flexors and hip flexors than did younger men and women players. Based on discriminant function, knee strength measures did not adequately classify gender. Instead, total leg strength measures had correct gender classifications of 74 and 69% (jackknifed) with significant multivariate tests (p = 0.025). For researchers and practitioners, these results support strength assessment and training of the whole lower extremity, not just knee musculature. Limited strength differences between girls 9-10 years old and those 12-13 years old suggest that the peripubertal period is an important time to target strength development. PMID:19209081

Buchanan, Patricia A; Vardaxis, Vassilios G

2009-03-01

182

iLoc-Euk: A Multi-Label Classifier for Predicting the Subcellular Localization of Singleplex and Multiplex Eukaryotic Proteins  

PubMed Central

Predicting protein subcellular localization is an important and difficult problem, particularly when query proteins may have the multiplex character, i.e., simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing protein subcellular location predictors can only be used to deal with the single-location or “singleplex” proteins. Actually, multiple-location or “multiplex” proteins should not be ignored because they usually possess some unique biological functions worthy of our special notice. By introducing the “multi-labeled learning” and “accumulation-layer scale”, a new predictor, called iLoc-Euk, has been developed that can be used to deal with the systems containing both singleplex and multiplex proteins. As a demonstration, the jackknife cross-validation was performed with iLoc-Euk on a benchmark dataset of eukaryotic proteins classified into the following 22 location sites: (1) acrosome, (2) cell membrane, (3) cell wall, (4) centriole, (5) chloroplast, (6) cyanelle, (7) cytoplasm, (8) cytoskeleton, (9) endoplasmic reticulum, (10) endosome, (11) extracellular, (12) Golgi apparatus, (13) hydrogenosome, (14) lysosome, (15) melanosome, (16) microsome, (17) mitochondrion, (18) nucleus, (19) peroxisome, (20) spindle pole body, (21) synapse, and (22) vacuole, where none of the proteins included has pairwise sequence identity to any other in the same subset. The overall success rate thus obtained by iLoc-Euk was 79%, which is significantly higher than that by any of the existing predictors that also have the capacity to deal with such a complicated and stringent system. As a user-friendly web-server, iLoc-Euk is freely accessible to the public at the web-site http://icpr.jci.edu.cn/bioinfo/iLoc-Euk. 
It is anticipated that iLoc-Euk may become a useful bioinformatics tool for Molecular Cell Biology, Proteomics, System Biology, and Drug Development Also, its novel approach will further stimulate the development of predicting other protein attributes. PMID:21483473

Chou, Kuo-Chen; Wu, Zhi-Cheng; Xiao, Xuan

2011-01-01

183

Classification and Analysis of Regulatory Pathways Using Graph Property, Biochemical and Physicochemical Property, and Functional Property  

PubMed Central

Given a regulatory pathway system consisting of a set of proteins, can we predict which pathway class it belongs to? Such a problem is closely related to the biological function of the pathway in cells and hence is quite fundamental and essential in systems biology and proteomics. This is also an extremely difficult and challenging problem due to its complexity. To address this problem, a novel approach was developed that can be used to predict query pathways among the following six functional categories: (i) “Metabolism”, (ii) “Genetic Information Processing”, (iii) “Environmental Information Processing”, (iv) “Cellular Processes”, (v) “Organismal Systems”, and (vi) “Human Diseases”. The prediction method was established through the following procedures: (i) according to the general form of pseudo amino acid composition (PseAAC), each of the pathways concerned is formulated as a 5570-D (dimensional) vector; (ii) each of the components in the 5570-D vector was derived by a series of feature extractions from the pathway system according to its graphic property, biochemical and physicochemical property, as well as functional property; (iii) the minimum redundancy maximum relevance (mRMR) method was adopted to operate the prediction. A cross-validation by the jackknife test on a benchmark dataset consisting of 146 regulatory pathways indicated that an overall success rate of 78.8% was achieved by our method in identifying query pathways among the above six classes, indicating the outcome is quite promising and encouraging. To the best of our knowledge, the current study represents the first effort in attempting to identify the type of a pathway system or its biological function. It is anticipated that our report may stimulate a series of follow-up investigations in this new and challenging area. PMID:21980418

Cai, Yu-Dong; Chou, Kuo-Chen

2011-01-01

184

iLoc-Gpos: a multi-layer classifier for predicting the subcellular localization of singleplex and multiplex Gram-positive bacterial proteins.  

PubMed

By introducing the "multi-layer scale", as well as hybridizing the information of gene ontology and the sequential evolution information, a novel predictor, called iLoc-Gpos, has been developed for predicting the subcellular localization of Gram positive bacterial proteins with both single-location and multiple-location sites. For facilitating comparison, the same stringent benchmark dataset used to estimate the accuracy of Gpos-mPLoc was adopted to demonstrate the power of iLoc-Gpos. The dataset contains 519 Gram-positive bacterial proteins classified into the following four subcellular locations: (1) cell membrane, (2) cell wall, (3) cytoplasm, and (4) extracell; none of proteins included has ≥25% pairwise sequence identity to any other in a same subset (subcellular location). The overall success rate by jackknife test on such a stringent benchmark dataset by iLoc-Gpos was over 93%, which is about 11% higher than that by Gpos-mPLoc. As a user-friendly web-server, iLoc-Gpos is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Gpos or http://www.jci-bioinfo.cn/iLoc-Gpos. Meanwhile, a step-by-step guide is provided on how to use the web-server to get the desired results. Furthermore, for the user's convenience, the iLoc-Gpos web-server also has the function to accept the batch job submission, which is not available in the existing version of Gpos-mPLoc web-server. PMID:21919865

Wu, Zhi-Cheng; Xiao, Xuan; Chou, Kuo-Chen

2012-01-01

185

iDNA-Prot: Identification of DNA Binding Proteins Using Random Forest with Grey Model  

PubMed Central

DNA-binding proteins play crucial roles in various cellular processes. Developing high throughput tools for rapidly and effectively identifying DNA-binding proteins is one of the major challenges in the field of genome annotation. Although many efforts have been made in this regard, further effort is needed to enhance the prediction power. By incorporating the features into the general form of pseudo amino acid composition that were extracted from protein sequences via the “grey model” and by adopting the random forest operation engine, we proposed a new predictor, called iDNA-Prot, for identifying uncharacterized proteins as DNA-binding proteins or non-DNA binding proteins based on their amino acid sequence information alone. The overall success rate by iDNA-Prot was 83.96%, obtained via jackknife tests on a newly constructed stringent benchmark dataset in which none of the proteins included has pairwise sequence identity to any other in a same subset. In addition to achieving high success rate, the computational time for iDNA-Prot is remarkably shorter in comparison with the relevant existing predictors. Hence it is anticipated that iDNA-Prot may become a useful high throughput tool for large-scale analysis of DNA-binding proteins. As a user-friendly web-server, iDNA-Prot is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iDNA-Prot or http://www.jci-bioinfo.cn/iDNA-Prot. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results. PMID:21935457

Lin, Wei-Zhong; Fang, Jian-An; Xiao, Xuan; Chou, Kuo-Chen

2011-01-01

186

iLoc-Euk: a multi-label classifier for predicting the subcellular localization of singleplex and multiplex eukaryotic proteins.  

PubMed

Predicting protein subcellular localization is an important and difficult problem, particularly when query proteins may have the multiplex character, i.e., simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing protein subcellular location predictors can only be used to deal with the single-location or "singleplex" proteins. Actually, multiple-location or "multiplex" proteins should not be ignored because they usually possess some unique biological functions worthy of our special notice. By introducing the "multi-labeled learning" and "accumulation-layer scale", a new predictor, called iLoc-Euk, has been developed that can be used to deal with the systems containing both singleplex and multiplex proteins. As a demonstration, the jackknife cross-validation was performed with iLoc-Euk on a benchmark dataset of eukaryotic proteins classified into the following 22 location sites: (1) acrosome, (2) cell membrane, (3) cell wall, (4) centriole, (5) chloroplast, (6) cyanelle, (7) cytoplasm, (8) cytoskeleton, (9) endoplasmic reticulum, (10) endosome, (11) extracellular, (12) Golgi apparatus, (13) hydrogenosome, (14) lysosome, (15) melanosome, (16) microsome, (17) mitochondrion, (18) nucleus, (19) peroxisome, (20) spindle pole body, (21) synapse, and (22) vacuole, where none of proteins included has ≥25% pairwise sequence identity to any other in a same subset. The overall success rate thus obtained by iLoc-Euk was 79%, which is significantly higher than that by any of the existing predictors that also have the capacity to deal with such a complicated and stringent system. As a user-friendly web-server, iLoc-Euk is freely accessible to the public at the web-site http://icpr.jci.edu.cn/bioinfo/iLoc-Euk. It is anticipated that iLoc-Euk may become a useful bioinformatics tool for Molecular Cell Biology, Proteomics, System Biology, and Drug Development. Also, its novel approach will further stimulate the development of predicting other protein attributes. PMID:21483473

Chou, Kuo-Chen; Wu, Zhi-Cheng; Xiao, Xuan

2011-01-01

187

Predicting Secretory Proteins of Malaria Parasite by Incorporating Sequence Evolution Information into Pseudo Amino Acid Composition via Grey System Model  

PubMed Central

The malaria disease has become a cause of poverty and a major hindrance to economic development. The culprit of the disease is the parasite, which secretes an array of proteins within the host erythrocyte to facilitate its own survival. Accordingly, the secretory proteins of malaria parasite have become a logical target for drug design against malaria. Unfortunately, with the increasing resistance to the drugs thus developed, the situation has become more complicated. To cope with the drug resistance problem, one strategy is to timely identify the secreted proteins by malaria parasite, which can serve as potential drug targets. However, it is both expensive and time-consuming to identify the secretory proteins of malaria parasite by experiments alone. To expedite the process for developing effective drugs against malaria, a computational predictor called “iSMP-Grey” was developed that can be used to identify the secretory proteins of malaria parasite based on the protein sequence information alone. During the prediction process a protein sample was formulated with a 60D (dimensional) feature vector formed by incorporating the sequence evolution information into the general form of PseAAC (pseudo amino acid composition) via a grey system model, which is particularly useful for solving complicated problems that lack sufficient information or need to process uncertain information. It was observed by the jackknife test that iSMP-Grey achieved an overall success rate of 94.8%, remarkably higher than those by the existing predictors in this area. As a user-friendly web-server, iSMP-Grey is freely accessible to the public at http://www.jci-bioinfo.cn/iSMP-Grey. Moreover, for the convenience of most experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results without the need to follow the complicated mathematical equations involved in this paper. PMID:23189138

Lin, Wei-Zhong; Fang, Jian-An; Xiao, Xuan; Chou, Kuo-Chen

2012-01-01

188

iLoc-Plant: a multi-label classifier for predicting the subcellular localization of plant proteins with both single and multiple sites.  

PubMed

Predicting protein subcellular localization is a challenging problem, particularly when query proteins may simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing methods can only be used to deal with the single-location proteins. Actually, multiple-location proteins should not be ignored because they usually bear some special functions worthy of our notice. By introducing the "multi-labeled learning" approach, a new predictor, called iLoc-Plant, has been developed that can be used to deal with the systems containing both single- and multiple-location plant proteins. As a demonstration, the jackknife cross-validation was performed with iLoc-Plant on a benchmark dataset of plant proteins classified into the following 12 location sites: (1) cell membrane, (2) cell wall, (3) chloroplast, (4) cytoplasm, (5) endoplasmic reticulum, (6) extracellular, (7) Golgi apparatus, (8) mitochondrion, (9) nucleus, (10) peroxisome, (11) plastid, and (12) vacuole, where some proteins belong to two or three locations but none has ≥25% pairwise sequence identity to any other in a same subset. The overall success rate thus obtained by iLoc-Plant was 71%, which is remarkably higher than those achieved by any existing predictors that also have the capacity to deal with such a stringent and complicated plant protein system. As a user-friendly web-server, iLoc-Plant is freely accessible to the public at the web-site or . Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results without the need to follow the complicated mathematic equations presented in this paper for its integrity. It is anticipated that iLoc-Plant may become a useful bioinformatics tool for Molecular Cell Biology, Proteomics, Systems Biology, and Drug Development. PMID:21984117

Wu, Zhi-Cheng; Xiao, Xuan; Chou, Kuo-Chen

2011-12-01

189

Predicting Transcriptional Activity of Multiple Site p53 Mutants Based on Hybrid Properties  

PubMed Central

Mutated forms of the important tumor suppressor protein p53 have been found in many kinds of human cancers, and restoring active p53 has been found to lead to tumor regression. In this work, we developed a new computational method to predict the transcriptional activity for one-, two-, three- and four-site p53 mutants, respectively. With the approach from the general form of pseudo amino acid composition, we used eight types of features to represent the mutation and then selected the optimal prediction features based on the maximum relevance, minimum redundancy, and incremental feature selection methods. The Matthews correlation coefficients (MCC) obtained by using nearest neighbor algorithm and jackknife cross validation for one-, two-, three- and four-site p53 mutants were 0.678, 0.314, 0.705, and 0.907, respectively. It was revealed by the further optimal feature set analysis that the 2D (two-dimensional) structure features composed the largest part of the optimal feature set and may have played the most important roles in all four types of p53 mutant active status prediction. It was also demonstrated by the optimal feature sets, especially those at the top level, that the 3D structure features, conservation, physicochemical and biochemical properties of amino acid near the mutation site, also played quite important roles for p53 mutant active status prediction. Our study has provided a new and promising approach for finding functionally important sites and the relevant features for in-depth study of p53 protein and its action mechanism. PMID:21857971

Xu, Zhongping; Huang, Yun; Kong, Xiangyin; Cai, Yu-Dong; Chou, Kuo-Chen

2011-01-01
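For readers unfamiliar with the evaluation named in the record above, the following is a minimal sketch of jackknife (leave-one-out) cross-validation with a nearest-neighbor classifier, scored by the Matthews correlation coefficient. It is not the authors' pipeline: the function names, the Euclidean distance, and the toy two-feature data are assumptions made only for illustration.

```python
import math

def nearest_neighbor_label(train_x, train_y, query):
    """Return the label of the training point closest to query (Euclidean distance)."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]

def jackknife_predictions(xs, ys):
    """Leave-one-out: predict each sample using all the other samples as training data."""
    preds = []
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        preds.append(nearest_neighbor_label(rest_x, rest_y, xs[i]))
    return preds

def matthews_cc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels coded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is reported as 0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical toy data: two well-separated clusters (not from the paper).
xs = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 1.1), (0.9, 1.0), (1.2, 0.8)]
ys = [0, 0, 0, 1, 1, 1]
preds = jackknife_predictions(xs, ys)
mcc = matthews_cc(ys, preds)
```

Because every sample is held out exactly once, the result does not depend on a random split, which is the usual reason given for preferring the jackknife test on small benchmark datasets.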

190

Hepatitis C Virus Network Based Classification of Hepatocellular Cirrhosis and Carcinoma  

PubMed Central

Hepatitis C virus (HCV) is a main risk factor for liver cirrhosis and hepatocellular carcinoma, particularly for patients with chronic liver disease or injury. The similar etiology leads to a high correlation of the patients suffering from the disease of liver cirrhosis with those suffering from the disease of hepatocellular carcinoma. However, the biological mechanism for the relationship between these two kinds of diseases is not clear. The present study was initiated in an attempt to investigate the HCV infection protein network, in the hope of finding good biomarkers for diagnosing the two diseases as well as gaining insights into their progression mechanisms. To realize this, two potential biomarker pools were defined: (i) the target genes of HCV, and (ii) the between genes on the shortest paths among the target genes of HCV. Meanwhile, a predictor was developed for identifying the liver tissue samples among the following three categories: (i) normal, (ii) cirrhosis, and (iii) hepatocellular carcinoma. Interestingly, it was observed that the identification accuracy was higher with the tissue samples defined by extracting the features from the second biomarker pool than that with the samples defined based on the first biomarker pool. The identification accuracy by the jackknife validation for the between-genes approach was 0.960, indicating that the novel approach holds a quite promising potential in helping find effective biomarkers for diagnosing the liver cirrhosis disease and the hepatocellular carcinoma disease. It may also provide useful insights for in-depth study of the biological mechanisms of HCV-induced cirrhosis and hepatocellular carcinoma. PMID:22493692

Huang, Tao; Wang, Junjie; Cai, Yu-Dong; Yu, Hanry; Chou, Kuo-Chen

2012-01-01

191

Predicting Anatomical Therapeutic Chemical (ATC) Classification of Drugs by Integrating Chemical-Chemical Interactions and Similarities  

PubMed Central

The Anatomical Therapeutic Chemical (ATC) classification system, recommended by the World Health Organization, categorizes drugs into different classes according to their therapeutic and chemical characteristics. For a set of query compounds, how can we identify which ATC-class (or classes) they belong to? It is an important and challenging problem because the information thus obtained would be quite useful for drug development and utilization. By hybridizing the information of chemical-chemical interactions and chemical-chemical similarities, a novel method was developed for such purpose. It was observed by the jackknife test on a benchmark dataset of 3,883 drug compounds that the overall success rate achieved by the prediction method was about 73% in identifying the drugs among the following 14 main ATC-classes: (1) alimentary tract and metabolism; (2) blood and blood forming organs; (3) cardiovascular system; (4) dermatologicals; (5) genitourinary system and sex hormones; (6) systemic hormonal preparations, excluding sex hormones and insulins; (7) anti-infectives for systemic use; (8) antineoplastic and immunomodulating agents; (9) musculoskeletal system; (10) nervous system; (11) antiparasitic products, insecticides and repellents; (12) respiratory system; (13) sensory organs; (14) various. Such a success rate is substantially higher than 7% by the random guess. It has not escaped our notice that the current method can be straightforwardly extended to identify the drugs for their 2nd-level, 3rd-level, 4th-level, and 5th-level ATC-classifications once the statistically significant benchmark data are available for these lower levels. PMID:22514724

Chen, Lei; Zeng, Wei-Ming; Cai, Yu-Dong; Feng, Kai-Yan; Chou, Kuo-Chen

2012-01-01

192

iLoc-Animal: a multi-label learning classifier for predicting subcellular localization of animal proteins.  

PubMed

Predicting protein subcellular localization is a challenging problem, particularly when query proteins have multi-label features meaning that they may simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing methods can only be used to deal with the single-label proteins. Actually, multi-label proteins should not be ignored because they usually bear some special function worthy of in-depth studies. By introducing the "multi-label learning" approach, a new predictor, called iLoc-Animal, has been developed that can be used to deal with the systems containing both single- and multi-label animal (metazoan except human) proteins. Meanwhile, to measure the prediction quality of a multi-label system in a rigorous way, five indices were introduced; they are "Absolute-True", "Absolute-False" (or "Hamming-Loss"), "Accuracy", "Precision", and "Recall". As a demonstration, the jackknife cross-validation was performed with iLoc-Animal on a benchmark dataset of animal proteins classified into the following 20 location sites: (1) acrosome, (2) cell membrane, (3) centriole, (4) centrosome, (5) cell cortex, (6) cytoplasm, (7) cytoskeleton, (8) endoplasmic reticulum, (9) endosome, (10) extracellular, (11) Golgi apparatus, (12) lysosome, (13) mitochondrion, (14) melanosome, (15) microsome, (16) nucleus, (17) peroxisome, (18) plasma membrane, (19) spindle, and (20) synapse, where many proteins belong to two or more locations. For such a complicated system, the outcomes achieved by iLoc-Animal for all the aforementioned five indices were quite encouraging, indicating that the predictor may become a useful tool in this area. It has not escaped our notice that the multi-label approach and the rigorous measurement metrics can also be used to investigate many other multi-label problems in molecular biology. As a user-friendly web-server, iLoc-Animal is freely accessible to the public at the web-site . PMID:23370050

Lin, Wei-Zhong; Fang, Jian-An; Xiao, Xuan; Chou, Kuo-Chen

2013-04-01
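The record above names five multi-label indices without defining them. The sketch below uses the set-based definitions commonly given for these names in the multi-label literature; whether they match the paper's exact formulas is an assumption, and the function name `multilabel_scores` and the toy label sets are hypothetical.

```python
def multilabel_scores(true_sets, pred_sets, n_labels):
    """Average five set-based scores over all samples.

    Assumes every sample has at least one true label and at least one
    predicted label, so none of the divisions below can hit zero.
    """
    n = len(true_sets)
    pairs = list(zip(true_sets, pred_sets))
    return {
        # Fraction of samples whose predicted label set is exactly right.
        "absolute_true": sum(t == p for t, p in pairs) / n,
        # Hamming loss: wrongly assigned or missed labels, per sample and per label.
        "absolute_false": sum(len(t | p) - len(t & p) for t, p in pairs) / (n * n_labels),
        # Overlap of predicted and true sets relative to their union.
        "accuracy": sum(len(t & p) / len(t | p) for t, p in pairs) / n,
        # Share of predicted labels that are correct.
        "precision": sum(len(t & p) / len(p) for t, p in pairs) / n,
        # Share of true labels that were recovered.
        "recall": sum(len(t & p) / len(t) for t, p in pairs) / n,
    }

# Hypothetical example: three proteins, four possible location labels.
true_sets = [{1}, {1, 2}, {3}]
pred_sets = [{1}, {1}, {2, 3}]
scores = multilabel_scores(true_sets, pred_sets, n_labels=4)
```

In this toy case only the first protein is predicted exactly, so "absolute_true" is 1/3, while the softer overlap-based scores give partial credit to the other two.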

193

Demographic analysis of the fitness of Problepsis superans (Lepidoptera: Geometridae) feeding on three ligustrum (Lamiales: Oleaceae) species.  

PubMed

Using the age-stage, two-sex life table, the effects of three ligustrum species, Ligustrum x vicaryi Hort., Ligustrum quihoui Carrière, and Ligustrum lucidum Aiton, on the fitness of Problepsis superans (Butler, 1885) (Lepidoptera: Geometridae) were assayed by considering life table parameters of P. superans at 27 ± 1 °C, 70 ± 5% relative humidity, and a photoperiod of 16:8 (L:D) h. The means and SEs of population parameters were calculated using the jackknife and bootstrap methods. The total developmental time of larval stage of P. superans on L. x vicaryi was significantly shorter than that on L. x vicaryi and L. quihoui, whereas higher fecundity was observed on L. x vicaryi. The highest value of the finite rate of increase was observed on L. x vicaryi. The intrinsic rate of increase of P. superans on L. x vicaryi, L. quihoui, and L. lucidum, was 0.147 ± 0.004, 0.130 ± 0.004, and 0.112 ± 0.005, respectively, which differed significantly among the three ligustrum species. The net reproductive rate varied from 122.8 ± 24.7 female offspring on L. lucidum to 242.2 ± 36.2 female offspring on L. x vicaryi. The lowest mean generation time was observed on L. x vicaryi. The gross reproductive rate of P. superans on the three ligustrum species did not significantly differ. Based on growth and population parameters, the suitability of the three ligustrum species to P. superans is ranked from high to low in the order as L. x vicaryi, L. quihoui, and L. lucidum. PMID:25026663

Hu, Liang-Xiong; Chen, Ye; He, Zheng-Sheng; Zou, Zhi-Wen; Xia, Bin

2014-06-01
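The record above reports means and SEs obtained with the jackknife. As a generic illustration (not the life-table software used in that study), the sketch below computes a delete-one jackknife estimate and standard error through pseudo-values; the function name `jackknife` and the sample numbers are hypothetical.

```python
import math
import statistics

def jackknife(values, statistic):
    """Delete-one jackknife estimate and standard error of statistic(values)."""
    n = len(values)
    full = statistic(values)
    # Recompute the statistic n times, each time omitting one observation.
    leave_one_out = [statistic(values[:i] + values[i + 1:]) for i in range(n)]
    # Pseudo-values: n * (full estimate) - (n - 1) * (leave-one-out estimate).
    pseudo = [n * full - (n - 1) * v for v in leave_one_out]
    estimate = statistics.fmean(pseudo)
    std_error = statistics.stdev(pseudo) / math.sqrt(n)
    return estimate, std_error

# Hypothetical sample; for the mean, the pseudo-values equal the raw data.
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
estimate, std_error = jackknife(sample, statistics.fmean)
```

For the sample mean the procedure reproduces the familiar result (mean 5.0, SE equal to the sample standard deviation divided by the square root of n); its practical use is for statistics, such as population growth rates, that have no simple closed-form SE.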

194

DNA hybridization evidence for the principal lineages of hummingbirds (Aves:Trochilidae).  

PubMed

The spectacular evolutionary radiation of hummingbirds (Trochilidae) has served as a model system for many biological studies. To begin to provide a historical context for these investigations, we generated a complete matrix of DNA hybridization distances among 26 hummingbirds and an outgroup swift (Chaetura pelagica) to determine the principal hummingbird lineages. FITCH topologies estimated from symmetrized delta TmH-C values and subjected to various validation methods (bootstrapping, weighted jackknifing, branch length significance) indicated a fundamental split between hermit (Eutoxeres aquila, Threnetes ruckeri; Phaethornithinae) and nonhermit (Trochilinae) hummingbirds, and provided strong support for six principal nonhermit clades with the following branching order: (1) a predominantly lowland group comprising caribs (Eulampis holosericeus) and relatives (Androdon aequatorialis and Heliothryx barroti) with violet-ears (Colibri coruscans) and relatives (Doryfera ludovicae); (2) an Andean-associated clade of highly polytypic taxa (Eriocnemis, Heliodoxa, and Coeligena); (3) a second endemic Andean clade (Oreotrochilus chimborazo, Aglaiocercus coelestis, and Lesbia victoriae) paired with thorntails (Popelairia conversii); (4) emeralds and relatives (Chlorostilbon mellisugus, Amazilia tzacatl, Thalurania colombica, Orthorhyncus cristatus and Campylopterus villaviscensio); (5) mountain-gems (Lampornis clemenciae and Eugenes fulgens); and (6) tiny bee-like forms (Archilochus colubris, Myrtis fanny, Acestrura mulsant, and Philodice mitchellii). Corresponding analyses on a matrix of unsymmetrized delta values gave similar support for these relationships except that the branching order of the two Andean clades (2, 3 above) was unresolved. In general, subsidiary relationships were consistent and well supported by both matrices, sometimes revealing surprising associations between forms that differ dramatically in plumage and bill morphology. 
Our results also reveal some basic aspects of hummingbird ecologic and morphologic evolution. For example, most of the diverse endemic Andean assemblage apparently comprises two genetically divergent clades, whereas the majority of North American hummingbirds belong to a single third clade. Genetic distances separating some morphologically distinct genera (Oreotrochilus, Aglaiocercus, Lesbia; Myrtis, Acestrura, Philodice) were no greater than among congeneric (Coeligena) species, indicating that, in hummingbirds, morphological divergence does not necessarily reflect level of genetic divergence. PMID:9066799

Bleiweiss, R; Kirsch, J A; Matheus, J C

1997-03-01

195

Does more sequence data improve estimates of galliform phylogeny? Analyses of a rapid radiation using a complete data matrix  

PubMed Central

The resolution of rapid evolutionary radiations or “bushes” in the tree of life has been one of the most difficult and interesting problems in phylogenetics. The avian order Galliformes appears to have undergone several rapid radiations that have limited the resolution of prior studies and obscured the position of taxa important both agriculturally and as model systems (chicken, turkey, Japanese quail). Here we present analyses of a multi-locus data matrix comprising over 15,000 sites, primarily from nuclear introns but also including three mitochondrial regions, from 46 galliform taxa with all gene regions sampled for all taxa. The increased sampling of unlinked nuclear genes provided strong bootstrap support for all but a small number of relationships. Coalescent-based methods to combine individual gene trees and analyses of datasets that are independent of published data indicated that this well-supported topology is likely to reflect the galliform species tree. The inclusion or exclusion of mitochondrial data had a limited impact upon analyses using either concatenated data or multispecies coalescent methods. Some of the key phylogenetic findings include support for a second major clade within the core phasianids that includes the chicken and Japanese quail and clarification of the phylogenetic relationships of turkey. Jackknifed datasets suggested that there is an advantage to sampling many independent regions across the genome rather than obtaining long sequences for a small number of loci, possibly reflecting the differences among gene trees that differ due to incomplete lineage sorting. Despite the novel insights we obtained using this increased sampling of gene regions, some nodes remain unresolved, likely due to periods of rapid diversification. Resolving these remaining groups will likely require sequencing a very large number of gene regions, but our analyses now appear to support a robust backbone for this order. PMID:24795852

Braun, Edward L.

2014-01-01

196

The fate of an immigrant: Ensis directus in the eastern German Bight  

NASA Astrophysics Data System (ADS)

We studied Ensis directus in the subtidal (7-16 m depth) of the eastern German Bight. The jack-knife clam that invaded the German Bight in 1978 has all characteristics of a successful immigrant: Ensis directus has a high reproductive capacity (juveniles, July 2001: Amrumbank 1,914 m⁻², Eiderstedt/Vogelsand: 11,638 m⁻²), short generation times and grows rapidly: maximum growth rates were higher than in former studies (mean: 3 mm month⁻¹, 2nd year: up to 14 mm month⁻¹). Ensis directus uses natural mechanisms for rapid dispersal, occurs gregariously and exhibits a wide environmental tolerance. However, optimal growth and population-structure annual gaps might be influenced by reduced salinity: at Vogelsand (transition area of Elbe river), maximum growth was lower (164 mm) than at the Eiderstedt site (outer range of Elbe river, L∞ = 174 mm). Mass mortalities of the clams are probably caused by washout (video inspections), low winter temperature and strong storms. Ensis directus immigrated into the community finding its own habitat on mobile sands with strong tidal currents. Recent studies on E. directus found that the species neither suppresses native species nor takes over the position of an established one which backs up our study findings over rather short time scales. On the contrary, E. directus seems to favour the settlement of some deposit feeders. Dense clam mats might stabilise the sediment and function as a sediment-trap for organic matter. Ensis directus has neither become a nuisance to other species nor developed according to the 'boom-and-bust' theory. The fate of the immigrant E. directus rather is a story of a successful trans-ocean invasion which still holds on 23 years after the first findings in the outer Elbe estuary off Vogelsand.

Dannheim, Jennifer; Rumohr, Heye

2012-09-01

197

Solution structure of tRNAVal from refinement of homology model against residual dipolar coupling and SAXS data  

PubMed Central

A procedure is presented for refinement of a homology model of E. coli tRNAVal, originally based on the X-ray structure of yeast tRNAPhe, using experimental residual dipolar coupling (RDC) and small angle X-ray scattering (SAXS) data. A spherical sampling algorithm is described for refinement against SAXS data that does not require a globbic approximation, which is particularly important for nucleic acids where such approximations are less appropriate. Substantially higher speed of the algorithm also makes its application favorable for proteins. In addition to the SAXS data, the structure refinement employed a sparse set of NMR data consisting of 24 imino N-HN RDCs measured with Pf1 phage alignment, and 20 imino N-HN RDCs obtained from magnetic field dependent alignment of tRNAVal. The refinement strategy aims to largely retain the local geometry of the 58% identical tRNAPhe by ensuring that the atomic coordinates for short, overlapping segments of the ribose-phosphate backbone and the conserved base pairs remain close to those of the starting model. Local coordinate restraints are enforced using the non-crystallographic symmetry (NCS) term in the XPLOR-NIH or CNS software package, while still permitting modest movements of adjacent segments. The RDCs mainly drive the relative orientation of the helical arms, whereas the SAXS restraints ensure an overall molecular shape compatible with experimental scattering data. The resulting structure exhibits good cross-validation statistics (jack-knifed Qfree = 14% for the Pf1 RDCs, compared to 25% for the starting model) and exhibits a larger angle between the two helical arms than observed in the X-ray structure of tRNAPhe, in agreement with previous NMR-based tRNAVal models. PMID:18787959

Grishaev, Alexander; Ying, Jinfa; Canny, Marella D.; Pardi, Arthur; Bax, Ad

2008-01-01

198

Food selection among Atlantic Coast seaducks in relation to historic food habits  

USGS Publications Warehouse

Food selection among Atlantic Coast seaducks during 1999-2005 was determined from hunter-killed ducks and compared to data from historic food habits file (1885-1985) for major migrational and wintering areas in the Atlantic Flyway. Food selection was determined by analyses of the gullet (esophagus and proventriculus) and gizzard of 860 ducks and summarized by aggregate percent for each species. When sample size was adequate, comparisons were made among age and sex groupings and also among local sites in major habitat areas. Common eiders in Maine and the Canadian Maritimes fed predominantly (53%) on the blue mussel (Mytilus edulis). Scoters in Massachusetts, Maine, and the Canadian Maritimes fed predominantly on the blue mussel (46%), Atlantic jackknife clam (Ensis directus; 19%), and Atlantic surf clam (Spisula solidissima; 15%), whereas scoters in the Chesapeake Bay fed predominantly on hooked mussel (Ischadium recurvum; 42%), the stout razor clam (Tagelus plebeius; 22%), and dwarf surf clam (Mulinia lateralis; 15%). The amethyst gem clam (Gemma gemma) was the predominant food (45%) of long-tailed ducks in Chesapeake Bay. Buffleheads and common goldeneyes fed on a mixed diet of mollusks and soft bodied invertebrates (amphipods, isopods and polychaetes). No major differences were noticed between the sexes in regard to food selection in any of the wintering areas. Comparisons to historic food habits in all areas failed to detect major differences. However, several invertebrate species recorded in historic samples were not found in current samples and two invasive species (Atlantic Rangia, Rangia cuneata and green crab, Carcinus maenas) were recorded in modern samples, but not in historic samples. Benthic sampling in areas where seaducks were collected showed a close correlation between consumption and availability. Each seaduck species appears to fill a unique niche in regard to feeding ecology, although there is much overlap of prey species selected. 
Understanding the trophic relationships of seaducks in coastal wintering areas will give managers a better understanding of habitat changes in regard to future environmental perturbations.

Perry, M.C.; Osenton, P.C.; Wells-Berlin, A. M.; Kidwell, D.M.

2005-01-01

199

Estimation of Low Streamflow Statistics at Ungauged Sites Using Baseflow Correlation  

NASA Astrophysics Data System (ADS)

Low streamflow estimates are required for water quality and quantity management purposes. This study focuses on estimation of the 7-day 10-year low flow (Q7,10), an extensively employed low flow statistic in the United States. The baseflow correlation method is an information transfer technique that can be used to estimate low flow statistics at an ungauged site by correlating a nominal number of measured baseflows at the ungauged site with those at nearby gauged sites. A national assessment of baseflow correlation estimators is made via a jackknife simulation with daily streamflow values at more than 1300 USGS HCDN gauged river sites. It is shown that the chosen performance metric is important when evaluating the method across a large range of Q7,10 values. Results confirm that baseflow measurements should be obtained from different baseflow recessions. The method performance is sensitive to the correlation coefficient between baseflows at gauged and ungauged sites when the number of baseflow measurements is 5. When the number of baseflow measurements is 10 or more, the method performs adequately if the correlation coefficient is greater than 0.6. The performance of the baseflow correlation method improves as the number of baseflow measurements increases, but levels off dramatically when one has more than 10 measurements. This research also investigates a number of different baseflow correlation methods that employ information from multiple gauged sites to estimate the Q7,10 at a single ungauged site. Results show that the performance can be improved by using multiple site information, especially when less than 10 baseflow measurements are used.

Zhang, Z.; Kroll, C. N.

2005-05-01

200

PRIMUS: Galaxy Clustering as a Function of Luminosity and Color at 0.2 < z < 1  

NASA Astrophysics Data System (ADS)

We present measurements of the luminosity and color-dependence of galaxy clustering at 0.2 < z < 1.0 in the Prism Multi-object Survey. We quantify the clustering with the redshift-space and projected two-point correlation functions, ξ(r_p, π) and w_p(r_p), using volume-limited samples constructed from a parent sample of over ~130,000 galaxies with robust redshifts in seven independent fields covering 9 deg² of sky. We quantify how the scale-dependent clustering amplitude increases with increasing luminosity and redder color, with relatively small errors over large volumes. We find that red galaxies have stronger small-scale (0.1 Mpc h⁻¹ < r_p < 1 Mpc h⁻¹) clustering and steeper correlation functions compared to blue galaxies, as well as a strong color-dependent clustering within the red sequence alone. We interpret our measured clustering trends in terms of galaxy bias and obtain values of b_gal ≈ 0.9-2.5, quantifying how galaxies are biased tracers of dark matter depending on their luminosity and color. We also interpret the color dependence with mock catalogs, and find that the clustering of blue galaxies is nearly constant with color, while redder galaxies have stronger clustering in the one-halo term due to a higher satellite galaxy fraction. In addition, we measure the evolution of the clustering strength and bias, and we do not detect statistically significant departures from passive evolution. We argue that the luminosity- and color-environment (or halo mass) relations of galaxies have not significantly evolved since z ~ 1. Finally, using jackknife subsampling methods, we find that sampling fluctuations are important and that the COSMOS field is generally an outlier, due to having more overdense structures than other fields; we find that "cosmic variance" can be a significant source of uncertainty for high-redshift clustering measurements.

Skibba, Ramin A.; Smith, M. Stephen M.; Coil, Alison L.; Moustakas, John; Aird, James; Blanton, Michael R.; Bray, Aaron D.; Cool, Richard J.; Eisenstein, Daniel J.; Mendez, Alexander J.; Wong, Kenneth C.; Zhu, Guangtun

2014-04-01

201

Aboveground biomass and leaf area index (LAI) mapping for Niassa Reserve, northern Mozambique  

NASA Astrophysics Data System (ADS)

Estimations of biomass are critical in miombo woodlands because they represent the primary source of goods and services for over 80% of the population in southern Africa. This study was carried out in Niassa Reserve, northern Mozambique. The main objectives were first to estimate woody biomass and Leaf Area Index (LAI) using remotely sensed data [RADARSAT (C-band, λ = 5.7 cm)] and Landsat ETM+ derived Normalized Difference Vegetation Index (NDVI) and Simple Ratio (SR) calibrated by field measurements and, second, to determine, at both landscape and plot scales, the environmental controls (precipitation, woody cover density, fire and elephants) of biomass and LAI. A land-cover map (72% overall accuracy) was derived from the June 2004 ETM+ mosaic. Field biomass and LAI were correlated with RADARSAT backscatter (r_biomass = 0.65, r_LAI = 0.57, p < 0.0001) from July 2004, NDVI (r_biomass = 0.30, r_LAI = 0.35; p < 0.0001) and SR (r_biomass = 0.36, r_LAI = 0.40, p < 0.0001). A jackknife stepwise regression technique was used to develop the best predictive models for biomass (biomass = -5.19 + 0.074 * radarsat + 1.56 * SR, r² = 0.55) and LAI (LAI = -0.66 + 0.01 * radarsat + 0.22 * SR, r² = 0.45). Biomass and LAI maps were produced with an estimated peak of 18 kg m⁻² and 2.80 m² m⁻², respectively. On the landscape scale, both biomass and LAI were strongly determined by mean annual precipitation (F = 13.91, p = 0.0002). On the plot spatial scale, woody biomass was significantly determined by fire frequency, and LAI by vegetation type.

Ribeiro, Natasha S.; Saatchi, Sassan S.; Shugart, Herman H.; Washington-Allen, Robert A.

2008-09-01

202

BICEP2 I: Detection Of B-mode Polarization at Degree Angular Scales  

E-print Network

(abridged for arXiv) We report results from the BICEP2 experiment, a cosmic microwave background (CMB) polarimeter specifically designed to search for the signal of inflationary gravitational waves in the B-mode power spectrum around $\ell\sim80$. The telescope comprised a 26 cm aperture all-cold refracting optical system equipped with a focal plane of 512 antenna coupled transition edge sensor 150 GHz bolometers each with temperature sensitivity of $\approx300\mu\mathrm{K}_\mathrm{CMB}\sqrt{s}$. BICEP2 observed from the South Pole for three seasons from 2010 to 2012. A low-foreground region of sky with an effective area of 380 square deg was observed to a depth of 87 nK deg in Stokes $Q$ and $U$. We find an excess of $B$-mode power over the base lensed-LCDM expectation in the range $30<\ell<150$, inconsistent with the null hypothesis at a significance of $>5\sigma$. Through jackknife tests and simulations we show that systematic contamination is much smaller than the observed excess. We also examine a number of available models of polarized dust emission and find that at their default parameter values they predict power $\sim(5-10)\times$ smaller than the observed excess signal. However, these models are not sufficiently constrained to exclude the possibility of dust emission bright enough to explain the entire excess signal. Cross correlating BICEP2 against 100 GHz maps from the BICEP1 experiment, the excess signal is confirmed and its spectral index is found to be consistent with that of the CMB, disfavoring dust at $1.7\sigma$. The observed $B$-mode power spectrum is well fit by a lensed-LCDM + tensor theoretical model with tensor-to-scalar ratio $r=0.20^{+0.07}_{-0.05}$, with $r=0$ disfavored at $7.0\sigma$. Accounting for the contribution of foreground dust will shift this value downward by an amount which will be better constrained with upcoming data sets.

P. A. R Ade; R. W. Aikin; D. Barkats; S. J. Benton; C. A. Bischoff; J. J. Bock; J. A. Brevik; I. Buder; E. Bullock; C. D. Dowell; L. Duband; J. P. Filippini; S. Fliescher; S. R. Golwala; M. Halpern; M. Hasselfield; S. R. Hildebrandt; G. C. Hilton; V. V. Hristov; K. D. Irwin; K. S. Karkare; J. P. Kaufman; B. G. Keating; S. A. Kernasovskiy; J. M. Kovac; C. L. Kuo; E. M. Leitch; M. Lueker; P. Mason; C. B. Netterfield; H. T. Nguyen; R. O'Brient; R. W. Ogburn IV; A. Orlando; C. Pryke; C. D. Reintsema; S. Richter; R. Schwarz; C. D. Sheehy; Z. K. Staniszewski; R. V. Sudiwala; G. P. Teply; J. E. Tolan; A. D. Turner; A. G. Vieregg; C. L. Wong; K. W. Yoon

2014-03-17

203

MR Imaging in Patients with Suspected Liver Metastases: Value of Liver-Specific Contrast Agent Gadoxetic Acid  

PubMed Central

Objective: To compare the diagnostic performance of gadoxetic acid-enhanced magnetic resonance (MR) imaging with that of triple-phase multidetector-row computed tomography (MDCT) in the detection of liver metastasis. Materials and Methods: Our institutional review board approved this retrospective study and waived informed consent. The study population consisted of 51 patients with hepatic metastases and 62 patients with benign hepatic lesions, who underwent triple-phase MDCT and gadoxetic acid-enhanced MRI within one month. Two radiologists independently and randomly reviewed MDCT and MRI images regarding the presence and probability of liver metastasis. In order to determine the additional value of the hepatobiliary phase (HBP), the dynamic-MRI set alone and the combined dynamic-and-HBP set were evaluated, respectively. The standard of reference was a combination of pathology diagnosis and follow-up imaging. For each reader, diagnostic accuracy was compared using the jackknife alternative free-response receiver-operating-characteristic (JAFROC) method. Results: For both readers, average JAFROC figure-of-merit (FOM) was significantly higher on the MR image sets than on the MDCT images: average FOM was 0.582 on the MDCT, 0.788 on the dynamic-MRI set and 0.847 on the combined HBP set, respectively (p < 0.0001). The differences were more prominent for small (≤ 1 cm) lesions: average FOM values were 0.433 on MDCT, 0.711 on the dynamic-MRI set and 0.828 on the combined HBP set, respectively (p < 0.0001). Sensitivity increased significantly with the addition of HBP in gadoxetic acid-enhanced MR imaging (p < 0.0001). Conclusion: Gadoxetic acid-enhanced MRI shows a better performance than triple-phase MDCT for the detection of hepatic metastasis, especially for small (≤ 1 cm) lesions. PMID:24265564

Lee, Kyung Hee; Park, Ji Hoon; Kim, Jung Hoon; Park, Hee Sun; Yu, Mi Hye; Yoon, Jeong-Hee; Han, Joon Koo; Choi, Byung Ihn

2013-01-01

204

Test-Retest Intervisit Variability of Functional and Structural Parameters in X-Linked Retinoschisis  

PubMed Central

Purpose: To examine the variability of four outcome measures that could be used to address safety and efficacy in therapeutic trials with X-linked juvenile retinoschisis. Methods: Seven men with confirmed mutations in the RS1 gene were evaluated over four visits spanning 6 months. Assessments included visual acuity, full-field electroretinograms (ERG), microperimetric macular sensitivity, and retinal thickness measured by optical coherence tomography (OCT). Eyes were separated into Better or Worse Eye groups based on acuity at baseline. Repeatability coefficients were calculated for each parameter and jackknife resampling was used to derive 95% confidence intervals (CIs). Results: The threshold for statistically significant change in visual acuity ranged from three to eight letters. For ERG a-wave, an amplitude reduction greater than 56% would be considered significant. For other parameters, variabilities were lower in the Worse Eye group, likely a result of floor effects due to collapse of the schisis pockets and/or retinal atrophy. The criteria for significant change (Better/Worse Eye) for three important parameters were: ERG b/a-wave ratio (0.44/0.23), pointwise sensitivity (10.4/7.0 dB), and central retinal thickness (31%/18%). Conclusions: The 95% CI range for visual acuity, ERG, retinal sensitivity, and central retinal thickness relative to baseline are described for this cohort of participants with X-linked juvenile retinoschisis (XLRS). Translational Relevance: A quantitative understanding of the variability of outcome measures is vital to establishing the safety and efficacy limits for therapeutic trials of XLRS patients.

Jeffrey, Brett G.; Cukras, Catherine A.; Vitale, Susan; Turriff, Amy; Bowles, Kristin; Sieving, Paul A.

2014-01-01
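
The jackknife step in the retinoschisis record above (repeatability coefficients with 95% confidence intervals from seven subjects) can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not say which repeatability formula was used, so the Bland-Altman form (1.96 × √2 × within-subject SD) is an assumption, and the function names and acuity values are hypothetical.

```python
import math
from statistics import mean, variance


def repeatability_coefficient(visits_by_subject):
    """Bland-Altman style coefficient: 1.96 * sqrt(2) * within-subject SD.

    Assumption: the abstract does not state which formula the study used.
    """
    # Pooled within-subject variance; averaging per-subject variances is
    # valid when every subject has the same number of visits.
    within_var = mean(variance(visits) for visits in visits_by_subject)
    return 1.96 * math.sqrt(2.0) * math.sqrt(within_var)


def jackknife_ci(data, statistic, t_crit):
    """Delete-one jackknife estimate and confidence interval via pseudovalues."""
    n = len(data)
    full = statistic(data)
    # Recompute the statistic with each observation (here: each subject) left out.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    pseudo = [n * full - (n - 1) * est for est in leave_one_out]
    estimate = mean(pseudo)
    std_err = math.sqrt(variance(pseudo) / n)
    return estimate, estimate - t_crit * std_err, estimate + t_crit * std_err


# Hypothetical data: 7 subjects x 4 visits (letters read on an acuity chart).
acuity = [
    [52, 55, 53, 54], [61, 60, 63, 62], [45, 48, 44, 47], [70, 69, 72, 71],
    [58, 57, 60, 59], [49, 52, 50, 51], [66, 64, 67, 65],
]
# 2.447 is the two-sided 95% Student-t critical value for 7 - 1 = 6 degrees of freedom.
estimate, lower, upper = jackknife_ci(acuity, repeatability_coefficient, t_crit=2.447)
print(f"repeatability coefficient: {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Subjects, not visits, are the units left out, because visits within a subject are not independent.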

205

Population pharmacokinetics of clofarabine, a second-generation nucleoside analog, in pediatric patients with acute leukemia.  

PubMed

The population pharmacokinetics of plasma clofarabine and intracellular clofarabine triphosphate were characterized in pediatric patients with acute leukemias. Traditional model-building techniques with NONMEM were used. Covariates were entered into the base model using a forward selection significance level of .05 and a backwards deletion criterion of .005. Model performance, stability, and influence analysis were assessed using the nonparametric bootstrap and n-1 jackknife. Simulations were used to understand the relationship between important covariates and exposure. A 2-compartment model with weight (scaled to a 40-kg reference patient) modeled as a power function on all pharmacokinetic parameters (0.75 on clearance-related terms and 1.0 on volume-related terms) was fit to plasma clofarabine concentrations (n = 32). White blood cell (WBC) count, modeled as a power function (scaled to a WBC count of 10 × 10³/µL), was a significant predictor of central volume with power term 0.128 ± 0.0314. A reference patient had a systemic clearance of 32.8 L/h (27% between-subject variability [BSV]), a central volume of 115 L (56% BSV), an intercompartmental clearance of 20.5 L/h (27% BSV), and a peripheral volume of 94.5 L (39% BSV). Intracellular clofarabine triphosphate concentrations were modeled using a random intercept model without any covariates. The average predicted concentration was 11.6 ± 2.62 µM (80% BSV), and although clofarabine triphosphate half-life could not be definitively estimated, its value was taken to be longer than 24 hours. The results confirm that clofarabine should continue being dosed on a per-square-meter or per-body-weight basis. PMID:15496649

Bonate, Peter L; Craig, Adam; Gaynon, Paul; Gandhi, Varsha; Jeha, Sima; Kadota, Richard; Lam, Gilbert N; Plunkett, William; Razzouk, Bassem; Rytting, Michael; Steinherz, Peter; Weitman, Steve

2004-11-01
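
The covariate model in the clofarabine record above can be written out as a small calculation. The reference values and exponents are taken from the abstract; that the weight and WBC terms combine multiplicatively is an assumption (the usual power-model form), and the function name and the example patient are hypothetical. Between-subject variability is ignored, so these are typical values only.

```python
def typical_parameters(weight_kg, wbc_thousands_per_ul):
    """Typical-value parameters for a patient, scaled from the 40-kg reference."""
    weight_ratio = weight_kg / 40.0            # reference weight: 40 kg
    wbc_ratio = wbc_thousands_per_ul / 10.0    # reference WBC: 10 x 10^3 per microliter
    return {
        # Clearance-related terms scale with weight to the power 0.75.
        "clearance_l_per_h": 32.8 * weight_ratio ** 0.75,
        "intercompartmental_clearance_l_per_h": 20.5 * weight_ratio ** 0.75,
        # Volume-related terms scale with weight to the power 1.0;
        # central volume also depends on WBC count (power 0.128).
        "central_volume_l": 115.0 * weight_ratio ** 1.0 * wbc_ratio ** 0.128,
        "peripheral_volume_l": 94.5 * weight_ratio ** 1.0,
    }


# The reference patient reproduces the values quoted in the abstract.
print(typical_parameters(40.0, 10.0))
# Hypothetical 20-kg patient with a WBC count of 50 x 10^3 per microliter.
print(typical_parameters(20.0, 50.0))
```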

206

A genetic programming approach for Burkholderia Pseudomallei diagnostic pattern discovery  

PubMed Central

Motivation: Finding diagnostic patterns for fighting diseases like Burkholderia pseudomallei using biomarkers involves two key issues. First, exhausting all subsets of testable biomarkers (antigens in this context) to find a best one is computationally infeasible. Therefore, a proper optimization approach like evolutionary computation should be investigated. Second, a properly selected function of the antigens as the diagnostic pattern, which is commonly unknown, is key to the diagnostic accuracy and the diagnostic effectiveness in clinical use. Results: A conversion function is proposed to convert serum tests of antigens on patients to binary values, based on which Boolean functions as the diagnostic patterns are developed. A genetic programming approach is designed for optimizing the diagnostic patterns in terms of their accuracy and effectiveness. During optimization, the aim is to maximize the coverage (the rate of positive response to antigens) in the infected patients and minimize the coverage in the non-infected patients while using as few testable antigens in the Boolean functions as possible. The final coverage in the infected patients is 96.55% using 17 of 215 (7.4%) antigens with zero coverage in the non-infected patients. Among these 17 antigens, BPSL2697 is the most frequently selected one for the diagnosis of Burkholderia pseudomallei. The approach has been evaluated using both the cross-validation and the jackknife simulation methods, with prediction accuracies of 93% and 92%, respectively. A novel approach is also proposed in this study to evaluate a model with binary data using ROC analysis. Contact: z.r.yang@ex.ac.uk PMID:19561021

Yang, Zheng Rong; Lertmemongkolchai, Ganjana; Tan, Gladys; Felgner, Philip L.; Titball, Richard

2009-01-01

207

Spring flood reconstruction from continuous and discrete tree ring series  

NASA Astrophysics Data System (ADS)

This study proposes a method to reconstruct past spring flood discharge from continuous and discrete tree ring chronologies, since both have their respective strengths and weaknesses in northern environments. Ring width or density series provide uninterrupted records that are indirectly linked to regional discharge through a concomitant effect of climate on tree growth and streamflow. Conversely, discrete event chronologies constitute conspicuous records of past high water levels since they are constructed from trees that are directly damaged by the flood. However, the uncertainty of discrete series increases toward the past, and their relationships with spring discharge are often nonlinear. To take advantage of these two sources of information, we introduce a new transfer model technique on the basis of generalized additive model (GAM) theory. The incorporation of discrete predictors and the evaluation of the robustness of the nonlinear relationships are assessed using a jackknife procedure. We exemplify our approach in a reconstruction of May water supplies to the Caniapiscau hydroelectric reservoir in northern Quebec, Canada. We used earlywood density measurements as continuous variables and ice-scar dates around Lake Montausier in the James Bay area as a discrete variable. Strong calibration (0.57 < 0.61 < 0.75) and validation (0.27 < 0.44 < 0.58) R2 statistics were obtained, thus highlighting the usefulness of the model. Our reconstruction suggests that, since ~1965, spring floods have become more intense and variable in comparison with the last 150 years. We argue that a similar procedure can be used in each case where discrete and continuous tree ring proxies are used together to reconstruct past spring floods.

Boucher, Étienne; Ouarda, Taha B. M. J.; Bégin, Yves; Nicault, Antoine

2011-07-01

208

Historical extension of operational NDVI products for livestock insurance in Kenya  

NASA Astrophysics Data System (ADS)

Droughts induce livestock losses that severely affect Kenyan pastoralists. Recent index insurance schemes have the potential of being a viable tool for insuring pastoralists against drought-related risk. Such schemes require as input a forage scarcity (or drought) index that can be reliably updated in near real-time, and that strongly relates to livestock mortality. Generally, a long record (>25 years) of the index is needed to correctly estimate mortality risk and calculate the related insurance premium. Data from current operational satellites used for large-scale vegetation monitoring span over a maximum of 15 years, a time period that is considered insufficient for accurate premium computation. This study examines how operational NDVI datasets compare to, and could be combined with the non-operational recently constructed 30-year GIMMS AVHRR record (1981-2011) to provide a near-real time drought index with a long term archive for the arid lands of Kenya. We compared six freely available, near-real time NDVI products: five from MODIS and one from SPOT-VEGETATION. Prior to comparison, all datasets were averaged in time for the two vegetative seasons in Kenya, and aggregated spatially at the administrative division level at which the insurance is offered. The feasibility of extending the resulting aggregated drought indices back in time was assessed using jackknifed R2 statistics (leave-one-year-out) for the overlapping period 2002-2011. We found that division-specific models were more effective than a global model for linking the division-level temporal variability of the index between NDVI products. Based on our results, good scope exists for historically extending the aggregated drought index, thus providing a longer operational record for insurance purposes. We showed that this extension may have large effects on the calculated insurance premium. Finally, we discuss several possible improvements to the drought index.

Vrieling, Anton; Meroni, Michele; Shee, Apurba; Mude, Andrew G.; Woodard, Joshua; de Bie, C. A. J. M. (Kees); Rembold, Felix

2014-05-01
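
The leave-one-year-out R2 used in the NDVI record above can be illustrated with a short sketch. The abstract does not define the statistic, so one common form is assumed here: each year is predicted from a linear model fitted to the remaining years, and R2 is computed as 1 minus the ratio of the squared prediction errors to the total sum of squares. The function names and index values are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def jackknifed_r2(x, y):
    """Leave-one-out R2: 1 - (sum of squared held-out errors) / (total sum of squares)."""
    press = 0.0
    for i in range(len(x)):
        # Fit on all years except year i, then predict the held-out year.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (intercept + slope * x[i])) ** 2
    mean_y = mean(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot


# Hypothetical seasonal index values for one division, 2002-2011.
long_record = [0.31, 0.28, 0.35, 0.22, 0.30, 0.40, 0.33, 0.19, 0.37, 0.25]
operational = [0.36, 0.33, 0.41, 0.25, 0.34, 0.47, 0.38, 0.23, 0.42, 0.30]
print(f"leave-one-year-out R2: {jackknifed_r2(long_record, operational):.3f}")
```

Unlike the ordinary R2, this value can be negative when the held-out predictions are worse than simply using the mean.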

209

Rapid field identification of subjects involved in firearm-related crimes based on electroanalysis coupled with advanced chemometric data treatment.  

PubMed

We demonstrate a novel system for the detection and discrimination of varying levels of exposure to gunshot residue from subjects in various control scenarios. Our aim is to address the key challenge of minimizing the false positive identification of individuals suspected of discharging a firearm. The chemometric treatment of voltammetric data from different controls using Canonical Variate Analysis (CVA) provides several distinct clusters for each scenario examined. Multiple samples were taken from subjects in controlled tests such as secondary contact with gunshot residue (GSR), loading a firearm, and postdischarge of a firearm. These controls were examined at both bare carbon and gold-modified screen-printed electrodes using different sampling methods: the 'swipe' method with integrated sampling and electroanalysis and a more traditional acid-assisted q-tip swabbing method. The electroanalytical fingerprint of each sample was examined using square-wave voltammetry; the resulting data were preprocessed with Fast Fourier Transform (FFT), followed by CVA treatment. High levels of discrimination were thus achieved in each case over 3 classes of samples (reflecting different levels of involvement), achieving maximum accuracy, sensitivity, and specificity values of 100% employing the leave-one-out validation method. Further validation with the 'jack-knife' technique was performed, and the resulting values were in good agreement with the former method. Additionally, samples from subjects in daily contact with relevant metallic constituents were analyzed to assess possible false positives. This system may serve as a potential method for a portable, field-deployable system aimed at rapidly identifying a subject who has loaded or discharged a firearm to verify involvement in a crime, hence providing law enforcement personnel with an invaluable forensic tool in the field. PMID:23121395

Cetó, Xavier; O'Mahony, Aoife M; Samek, Izabela A; Windmiller, Joshua R; del Valle, Manel; Wang, Joseph

2012-12-01

210

Measuring agreement between ratings interpretations and binary clinical interpretations of images: a simulation study of methods for quantifying the clinical relevance of an observer performance paradigm  

PubMed Central

Laboratory receiver operating characteristic (ROC) studies, that are often used to evaluate medical imaging systems, differ from “live” clinical interpretations in several respects which could compromise their clinical relevance. The aim was to develop methodology for quantifying the clinical relevance of a laboratory ROC study. A simulator was developed to generate ROC ratings data and binary clinical interpretations classified as correct or incorrect for a common set of images interpreted under clinical and laboratory conditions. The area under the trapezoidal ROC curve was used as the laboratory figure-of-merit and the fraction of correct clinical decisions as the clinical figure-of-merit. Conventional agreement measures (Pearson, Spearman, Kendall and kappa) between the bootstrap-induced fluctuations of the two figures-of-merit were estimated. A jackknife pseudovalue transformation applied to the figures-of-merit was also investigated as a way to capture agreement existing at the individual image level that could be lost at the figure-of-merit level. It is shown that the pseudovalues define a relevance ROC curve the area under which (rAUC) measures the ability of the laboratory figure-of-merit based pseudovalues to correctly classify incorrect vs. correct clinical interpretations, and is a measure of the clinical relevance of an ROC study. The conventional measures and rAUC were compared under varying simulator conditions. It was found that design details of the ROC study, namely the number of bins, the difficulty level of the images, the ratio of disease-present to disease-absent images, and the unavoidable difference between laboratory and clinical performance levels, can seriously underestimate the agreement as indicated by conventional agreement measures, even for perfectly correlated data, while rAUC showed high agreement and was relatively immune to these details. 
At the same time rAUC was sensitive to factors such as intrinsic correlation between the laboratory and clinical decision variables and differences in reporting thresholds that are expected to influence agreement both at the individual image level and at the figure-of-merit level. Suggestions are made for how to conduct relevance ROC studies aimed at assessing agreement between laboratory and clinical interpretations. PMID:22516804

Chakraborty, Dev P.

2012-01-01
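
The jackknife pseudovalue transformation named in the record above can be sketched for the trapezoidal ROC area, which equals the fraction of (diseased, non-diseased) rating pairs ordered correctly, with ties counted as one half. This is a minimal illustration of the standard pseudovalue definition, not the simulator described in the abstract; the function names and ratings are hypothetical.

```python
def trapezoidal_auc(normal, diseased):
    """Area under the trapezoidal ROC curve (Mann-Whitney form, ties count 0.5)."""
    wins = 0.0
    for d in diseased:
        for h in normal:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(normal) * len(diseased))


def auc_pseudovalues(normal, diseased):
    """One pseudovalue per image: n * AUC - (n - 1) * AUC with that image removed."""
    n = len(normal) + len(diseased)
    full = trapezoidal_auc(normal, diseased)
    pseudo = []
    for i in range(len(normal)):
        reduced = trapezoidal_auc(normal[:i] + normal[i + 1:], diseased)
        pseudo.append(n * full - (n - 1) * reduced)
    for j in range(len(diseased)):
        reduced = trapezoidal_auc(normal, diseased[:j] + diseased[j + 1:])
        pseudo.append(n * full - (n - 1) * reduced)
    return pseudo


# Hypothetical 5-point confidence ratings for disease-absent and disease-present images.
normal_ratings = [1, 2, 2, 3, 1, 4]
diseased_ratings = [3, 4, 5, 2, 5, 4]
print(trapezoidal_auc(normal_ratings, diseased_ratings))
print(auc_pseudovalues(normal_ratings, diseased_ratings))
```

Each pseudovalue attaches a per-image quantity to a figure-of-merit that is otherwise defined only for the whole image set, which is what lets agreement be examined at the individual image level.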

211

Computer-aided detection of breast masses: Four-view strategy for screening mammography  

SciTech Connect

Purpose: To improve the performance of a computer-aided detection (CAD) system for mass detection by using four-view information in screening mammography. Methods: The authors developed a four-view CAD system that emulates radiologists' reading by using the craniocaudal and mediolateral oblique views of the ipsilateral breast to reduce false positives (FPs) and the corresponding views of the contralateral breast to detect asymmetry. The CAD system consists of four major components: (1) Initial detection of breast masses on individual views, (2) information fusion of the ipsilateral views of the breast (referred to as two-view analysis), (3) information fusion of the corresponding views of the contralateral breast (referred to as bilateral analysis), and (4) fusion of the four-view information with a decision tree. The authors collected two data sets for training and testing of the CAD system: A mass set containing 389 patients with 389 biopsy-proven masses and a normal set containing 200 normal subjects. All cases had four-view mammograms. The true locations of the masses on the mammograms were identified by an experienced MQSA radiologist. The authors randomly divided the mass set into two independent sets for cross validation training and testing. The overall test performance was assessed by averaging the free response receiver operating characteristic (FROC) curves of the two test subsets. The FP rates during the FROC analysis were estimated by using the normal set only. The jackknife free-response ROC (JAFROC) method was used to estimate the statistical significance of the difference between the test FROC curves obtained with the single-view and the four-view CAD systems. Results: Using the single-view CAD system, the breast-based test sensitivities were 58% and 77% at the FP rates of 0.5 and 1.0 per image, respectively. With the four-view CAD system, the breast-based test sensitivities were improved to 76% and 87% at the corresponding FP rates, respectively. 
The improvement was found to be statistically significant (p<0.0001) by JAFROC analysis. Conclusions: The four-view information fusion approach that emulates radiologists' reading strategy significantly improves the performance of breast mass detection of the CAD system in comparison with the single-view approach.

Wei Jun; Chan Heangping; Zhou Chuan; Wu Yita; Sahiner, Berkman; Hadjiiski, Lubomir M.; Roubidoux, Marilyn A.; Helvie, Mark A. [Department of Radiology, University of Michigan, 1500 East Medical Center Drive, C478 Med-Inn Building, Ann Arbor, Michigan 48109-5842 (United States)

2011-04-15

212

A stable pattern of EEG spectral coherence distinguishes children with autism from neuro-typical controls - a large case control study  

PubMed Central

Background The autism rate has recently increased to 1 in 100 children. Genetic studies demonstrate poorly understood complexity. Environmental factors apparently also play a role. Magnetic resonance imaging (MRI) studies demonstrate increased brain sizes and altered connectivity. Electroencephalogram (EEG) coherence studies confirm connectivity changes. However, genetic-, MRI- and/or EEG-based diagnostic tests are not yet available. The varied study results likely reflect methodological and population differences, small samples and, for EEG, lack of attention to group-specific artifact. Methods Of the 1,304 subjects who participated in this study, with ages ranging from 1 to 18 years old and assessed with comparable EEG studies, 463 children were diagnosed with autism spectrum disorder (ASD); 571 children were neuro-typical controls (C). After artifact management, principal components analysis (PCA) identified EEG spectral coherence factors with corresponding loading patterns. The 2- to 12-year-old subsample consisted of 430 ASD- and 554 C-group subjects (n = 984). Discriminant function analysis (DFA) determined the spectral coherence factors' discrimination success for the two groups. Loading patterns on the DFA-selected coherence factors described ASD-specific coherence differences when compared to controls. Results Total sample PCA of coherence data identified 40 factors which explained 50.8% of the total population variance. For the 2- to 12-year-olds, the 40 factors showed highly significant group differences (P < 0.0001). Ten randomly generated split half replications demonstrated high-average classification success (C, 88.5%; ASD, 86.0%). Still higher success was obtained in the more restricted age sub-samples using the jackknifing technique: 2- to 4-year-olds (C, 90.6%; ASD, 98.1%); 4- to 6-year-olds (C, 90.9%; ASD 99.1%); and 6- to 12-year-olds (C, 98.7%; ASD, 93.9%). 
Coherence loadings demonstrated reduced short-distance and reduced, as well as increased, long-distance coherences for the ASD-groups, when compared to the controls. Average spectral loading per factor was wide (10.1 Hz). Conclusions Classification success suggests a stable coherence loading pattern that differentiates ASD- from C-group subjects. This might constitute an EEG coherence-based phenotype of childhood autism. The predominantly reduced short-distance coherences may indicate poor local network function. The increased long-distance coherences may represent compensatory processes or reduced neural pruning. The wide average spectral range of factor loadings may suggest over-damped neural networks. PMID:22730909

2012-01-01

213

Two-Phase Analysis in Consensus Genetic Mapping  

PubMed Central

Numerous mapping projects conducted on different species have generated an abundance of mapping data. Consequently, many multilocus maps have been constructed using diverse mapping populations and marker sets for the same organism. The quality of maps varies broadly among populations, marker sets, and software used, necessitating efforts to integrate the mapping information and generate consensus maps. The problem of consensus genetic mapping (MCGM) is by far more challenging compared with genetic mapping based on a single dataset, which by itself is also cumbersome. The additional complications introduced by consensus analysis include inter-population differences in recombination rate and exchange distribution along chromosomes; variations in dominance of the employed markers; and use of different subsets of markers in different labs. Hence, it is necessary to handle arbitrary patterns of shared sets of markers and different level of mapping data quality. In this article, we introduce a two-phase approach for solving MCGM. In phase 1, for each dataset, multilocus ordering is performed combined with iterative jackknife resampling to evaluate the stability of marker orders. In this phase, the ordering problem is reduced to the well-known traveling salesperson problem (TSP). Namely, for each dataset, we look for order that gives minimum sum of recombination distances between adjacent markers. In phase 2, the optimal consensus order of shared markers is selected from the set of allowed orders and gives the minimal sum of total lengths of nonconflicting maps of the chromosome. This criterion may be used in different modifications to take into account the variation in quality of the original data (population size, marker quality, etc.). 
In the foregoing formulation, consensus mapping is considered as a specific version of TSP that can be referred to as “synchronized TSP.” The conflicts detected after phase 1 are resolved using either a heuristic algorithm over the entire chromosome or an exact/heuristic algorithm applied subsequently to the revealed small non-overlapping regions with conflicts separated by non-conflicting regions. The proposed approach was tested on a wide range of simulated data and real datasets from maize. PMID:22670224
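
Phase 1 of the approach above (ordering markers by minimum total adjacent distance, then jackknifing to check stability) can be illustrated on toy data. The following is a minimal sketch, not the authors' implementation: it assumes binary genotype strings, uses the fraction of individuals whose genotypes differ at two markers as a stand-in for recombination distance, searches all orders by brute force instead of a TSP heuristic, and uses a delete-one jackknife over individuals. The function names and the data are hypothetical.

```python
from itertools import permutations

def mismatch_fraction(genotypes, a, b):
    """Fraction of individuals whose genotypes differ at markers a and b
    (a crude stand-in for recombination distance; an assumption of this sketch)."""
    return sum(1 for g in genotypes if g[a] != g[b]) / len(genotypes)

def best_order(genotypes):
    """Brute-force the marker order with the smallest sum of adjacent distances."""
    n_markers = len(genotypes[0])
    dist = [[mismatch_fraction(genotypes, a, b) for b in range(n_markers)]
            for a in range(n_markers)]
    best, best_len = None, float("inf")
    for order in permutations(range(n_markers)):
        if order[0] > order[-1]:
            continue  # an order and its mirror image describe the same map
        length = sum(dist[order[i]][order[i + 1]] for i in range(n_markers - 1))
        if length < best_len:
            best, best_len = order, length
    return best

def jackknife_stability(genotypes):
    """Delete one individual at a time and report how often the order is unchanged."""
    full = best_order(genotypes)
    same = sum(1 for i in range(len(genotypes))
               if best_order(genotypes[:i] + genotypes[i + 1:]) == full)
    return full, same / len(genotypes)

# Hypothetical data: 12 individuals, 4 markers stored in scrambled column order.
toy = ["0000"] * 3 + ["1111"] * 3 + ["1011"] * 2 + ["1010"] * 2 + ["0010"] * 2
order, stability = jackknife_stability(toy)
print(order, stability)
```

Brute force is only workable for a handful of markers; real maps need the heuristic TSP solvers the abstract refers to.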

Ronin, Y.; Mester, D.; Minkov, D.; Belotserkovski, R.; Jackson, B. N.; Schnable, P. S.; Aluru, S.; Korol, A.

2012-01-01

214

Evaluation of optical remote sensing to estimate evapotranspiration and canopy conductance  

NASA Astrophysics Data System (ADS)

We compared evapotranspiration (ET) estimates produced with six different vegetation measures derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) and three contrasting estimation approaches using measurements from eddy covariance flux towers at 16 FLUXNET sites located over six different land cover types. The aim was to assess optimal approaches in using optical remote sensing to estimate ET. The first two approaches directly regressed various MODIS vegetation indices (VIs) and leaf area index (LAI) and fraction of photosynthetically active radiation (fPAR) products with ET and evaporative fraction. In the third approach, the Penman-Monteith (PM) equation was inverted to obtain surface conductance (Gs), represented by days dominated by dry canopy conductance (Gc). The Gc values were then regressed against the MODIS vegetation products and used to parameterize the PM equation for retrievals of ET. Jackknife cross-validation was used to evaluate the various regression models and assess their performance across all land cover types and sites. Our analysis shows that the PM-Gc approach leads to the lowest root mean square errors and highest determination coefficients globally across all sites. The MODIS LAI and fPAR products produced the poorest estimates of ET, while the VIs each performed best for some of the land cover types. The enhanced vegetation index (EVI) produced considerably better ET estimates for evergreen needleleaf forest, the normalized difference vegetation index (NDVI) best estimated ET in grassland, cropland and woody savannas, and the VI-based crop coefficient (Kc) yielded the best estimates for evergreen and deciduous broadleaf forests. Using the mean of the Gc estimates derived from NDVI, EVI and Kc, we computed global grids of Gc from which annual statistics were extracted to characterise different functional types. 
The resulting values can be used to parameterize land surface models. Figure caption: Mean global Gc for 2001-2011, estimated as the average of values predicted based on NDVI, EVI and Kc calculated from MCD43C4 data (downloaded from ftp://e4ftl01.cr.usgs.gov/MOTA/MCD43C4.005/ in February 2012).
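
The jackknife cross-validation of a regression model mentioned above can be sketched as a leave-one-out loop. This is an illustration under stated assumptions only: a single-predictor ordinary least squares fit, a delete-one scheme (the abstract does not say how observations were grouped), and invented vegetation-index and ET values. The names `fit_line` and `loo_rmse` are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_rmse(xs, ys):
    """Refit with each observation held out and score it on that observation."""
    squared_errors = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        squared_errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Invented values: a vegetation index and ET in arbitrary units.
vi = [0.2, 0.3, 0.4, 0.5, 0.6]
et_exact = [1.0, 1.5, 2.0, 2.5, 3.0]   # exactly linear, so the held-out error vanishes
et_noisy = [1.0, 1.6, 2.0, 2.4, 3.0]   # small departures from the line
print(loo_rmse(vi, et_exact), loo_rmse(vi, et_noisy))
```

Comparing the held-out RMSE across candidate predictors is one way to rank regression models without reusing the data each model was fitted on.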

Yebra, M.; van Dijk, A. I.; Leuning, R.; Huete, A. R.; Guerschman, J. P.

2012-12-01

215

A multi-site and multi-variable calibration and validation approach for hydrological modeling of mountainous catchments  

NASA Astrophysics Data System (ADS)

Hydrological modeling in mountainous regions, where catchment hydrology is heavily influenced by snow (and possibly ice) processes, is a challenging task. The intrinsic complexity of local processes is added to the difficulty of estimating spatially-distributed inputs such as rainfall and temperature, which often exhibit a high spatial heterogeneity that cannot be fully captured by the measurement network. Hence, an interpolation step is often required prior to the hydrological modeling step. In most cases, the reconstruction of meteorological forcings and the calibration of the hydrological model are done sequentially. The outputs of the hydrological model (discharge estimates) may give some insight into the quality of the reconstructed forcings used to feed it, but in this two-step approach it is not possible to easily feed the interpolation scheme back with the discrepancies between observed and simulated discharges. Yet, despite having undergone the rainfall-runoff (or snow-runoff) transformation, discharge at the outlet of a (sub)catchment is still an interesting integrator (spatial low-pass filter) of the forcing fields and is ancillary areal information complementing the direct, point-scale data collected at raingages. In this perspective, choosing the best interpolation scheme partly becomes an inverse hydrological problem. In this study, we present a one-step calibration strategy where the parameters of both the interpolation model (i.e., reconstruction procedure of meteorological forcings) and of the hydrological model (i.e., snow cover evolution, soil moisture accounting, and flow routing schemes) are jointly inferred in a multi-site and multi-variable approach, using a multi-objective evolutionary algorithm. 
Interpolated fields are rainfall and temperature, whereas hydrological prognostic variables consist of discharge and snow water equivalent (SWE) time series at several locations in the 3,600-square kilometer Upper Durance River catchment (French Alps). Using cross-validation and jack-knife procedures, we show that the parameters inferred in such a way are much more robust and mutually consistent. The identifiability of each block of functional parameters, i.e. controlling each individual prognostic variable (SWE, discharge, or estimated rainfall / temperature at an ungaged location), is shown to benefit from the one-step, joint inference making use of the other variables.

Le Moine, N.; Hendrickx, F.; Bourqui, M.

2012-12-01

216

Modelling and mapping the local distribution of representative species on the Le Danois Bank, El Cachucho Marine Protected Area (Cantabrian Sea)  

NASA Astrophysics Data System (ADS)

The management and protection of potentially vulnerable species and habitats require the availability of detailed spatial data. However, such data are often not readily available in particular areas that are challenging for sampling by traditional sampling techniques, for example seamounts. Within this study habitat modelling techniques were used to create predictive maps of six species of conservation concern for the Le Danois Bank (El Cachucho Marine Protected Area in the South of the Bay of Biscay). The study used data from ECOMARG multidisciplinary surveys that aimed to create a representative picture of the physical and biological composition of the area. Classical fishing gear (otter trawl and beam trawl) was used to sample benthic communities that inhabit sedimentary areas, and non-destructive visual sampling techniques (ROV and photogrammetric sled) were used to determine the presence of epibenthic macrofauna in complex and vulnerable habitats. Multibeam echosounder data, high-resolution seismic profiles (TOPAS system) and geological data from box-corer were used to characterize the benthic terrain. ArcGIS software was used to produce high-resolution maps (75×75 m2) of such variables in the entire area. The Maximum Entropy (MAXENT) technique was used to process these data and create Habitat Suitability maps for six species of special conservation interest. The model used seven environmental variables (depth, rugosity, aspect, slope, Bathymetric Position Index (BPI) in fine and broad scale and morphosedimentary characteristics) to identify the most suitable habitats for such species and indicates which environmental factors determine their distribution. The six species models performed highly significantly better than random (p<0.0001; Mann-Whitney test) when Area Under the Curve (AUC) values were tested. This indicates that the environmental variables chosen are relevant to distinguish the distribution of these species. 
The Jackknife test estimated depth to be the key factor structuring their distribution, followed by the seabed morpho-sedimentary characteristics and rugosity variables. Three of the species studied (Asconema setubalense, Callogorgia verticillata and Helicolenus dactylopterus) were found to have small suitable areas as a result of being restrictive species related to the environmental characteristics of the top of the bank. The other species (Pheronema carpenteri, Phycis blennoides and Trachyscorpia cristulata), which were species less restrictive to the environmental variables used, had highly suitable areas of distribution. The study provides high-resolution maps of species that characterize the habitat of two communities included in OSPAR and NATURA networks, whose distributions corroborate the adequate protection of this area by the management measures applied at present.

García-Alegre, Ana; Sánchez, Francisco; Gómez-Ballesteros, María; Hinz, Hilmar; Serrano, Alberto; Parra, Santiago

2014-08-01

217

Establishing macroecological trait datasets: digitalization, extrapolation, and validation of diet preferences in terrestrial mammals worldwide.  

PubMed

Ecological trait data are essential for understanding the broad-scale distribution of biodiversity and its response to global change. For animals, diet represents a fundamental aspect of species' evolutionary adaptations, ecological and functional roles, and trophic interactions. However, the importance of diet for macroevolutionary and macroecological dynamics remains little explored, partly because of the lack of comprehensive trait datasets. We compiled and evaluated a comprehensive global dataset of diet preferences of mammals ("MammalDIET"). Diet information was digitized from two global and cladewide data sources and errors of data entry by multiple data recorders were assessed. We then developed a hierarchical extrapolation procedure to fill-in diet information for species with missing information. Missing data were extrapolated with information from other taxonomic levels (genus, other species within the same genus, or family) and this extrapolation was subsequently validated both internally (with a jack-knife approach applied to the compiled species-level diet data) and externally (using independent species-level diet information from a comprehensive continentwide data source). Finally, we grouped mammal species into trophic levels and dietary guilds, and their species richness as well as their proportion of total richness were mapped at a global scale for those diet categories with good validation results. The success rate of correctly digitizing data was 94%, indicating that the consistency in data entry among multiple recorders was high. Data sources provided species-level diet information for a total of 2033 species (38% of all 5364 terrestrial mammal species, based on the IUCN taxonomy). For the remaining 3331 species, diet information was mostly extrapolated from genus-level diet information (48% of all terrestrial mammal species), and only rarely from other species within the same genus (6%) or from family level (8%). 
Internal and external validation showed that: (1) extrapolations were most reliable for primary food items; (2) several diet categories ("Animal", "Mammal", "Invertebrate", "Plant", "Seed", "Fruit", and "Leaf") had high proportions of correctly predicted diet ranks; and (3) the potential of correctly extrapolating specific diet categories varied both within and among clades. Global maps of species richness and proportion showed congruence among trophic levels, but also substantial discrepancies between dietary guilds. MammalDIET provides a comprehensive, unique and freely available dataset on diet preferences for all terrestrial mammals worldwide. It enables broad-scale analyses for specific trophic levels and dietary guilds, and a first assessment of trait conservatism in mammalian diet preferences at a global scale. The digitalization, extrapolation and validation procedures could be transferable to other trait data and taxa. PMID:25165528
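
The hierarchical extrapolation and its internal jackknife validation can be sketched in a few lines. This is a simplified illustration, not the MammalDIET procedure itself: it assumes one diet label per species, takes a majority vote among congeners and falls back to the family when no congener has data, and validates by hiding each known species in turn. Taxon names, labels, and function names are hypothetical.

```python
from collections import Counter

def majority_diet(records):
    """Most frequent known diet among the given records, or None if none is known."""
    diets = [r["diet"] for r in records if r["diet"] is not None]
    return Counter(diets).most_common(1)[0][0] if diets else None

def extrapolate(name, table):
    """Guess a species' diet from congeners first, then from its family.
    The target's own record is never consulted."""
    target = table[name]
    others = [r for n, r in table.items() if n != name]
    guess = majority_diet([r for r in others if r["genus"] == target["genus"]])
    if guess is None:
        guess = majority_diet([r for r in others if r["family"] == target["family"]])
    return guess

def jackknife_success(table):
    """Share of known species whose diet is recovered when that species is held out."""
    known = [n for n, r in table.items() if r["diet"] is not None]
    hits = sum(1 for n in known if extrapolate(n, table) == table[n]["diet"])
    return hits / len(known)

# Hypothetical species table.
table = {
    "a1": {"genus": "GenusA", "family": "Family1", "diet": "Plant"},
    "a2": {"genus": "GenusA", "family": "Family1", "diet": "Plant"},
    "a3": {"genus": "GenusA", "family": "Family1", "diet": None},
    "b1": {"genus": "GenusB", "family": "Family1", "diet": "Animal"},
    "c1": {"genus": "GenusC", "family": "Family2", "diet": "Animal"},
    "c2": {"genus": "GenusC", "family": "Family2", "diet": "Animal"},
}
print(extrapolate("a3", table), jackknife_success(table))
```

In this toy table the only miss is the species with no congeners, whose family-level guess is wrong, which mirrors the abstract's point that reliability varies within and among clades.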

Kissling, Wilm Daniel; Dalby, Lars; Fløjgaard, Camilla; Lenoir, Jonathan; Sandel, Brody; Sandom, Christopher; Trøjelsgaard, Kristian; Svenning, Jens-Christian

2014-07-01

218

European multicentre study to define disease activity criteria for systemic sclerosis.* II. Identification of disease activity variables and development of preliminary activity indexes  

PubMed Central

OBJECTIVE: To develop criteria for disease activity in systemic sclerosis (SSc) that are valid, reliable, and easy to use. METHODS: Investigators from 19 European centres completed a standardised clinical chart for a consecutive number of patients with SSc. Three protocol management members blindly evaluated each chart and assigned a disease activity score on a semiquantitative scale of 0-10. Two of them, in addition, gave a blinded, qualitative evaluation of disease activity ("inactive to moderately active" or "active to very active" disease). Both these evaluations were found to be reliable. A final disease activity score and qualitative evaluation of disease activity were arrived at by consensus for each patient; the former represented the gold standard for subsequent analyses. The correlations between individual items in the chart and this gold standard were then analysed. RESULTS: A total of 290 patients with SSc (117 with diffuse SSc (dSSc) and 173 with limited SSc (lSSc)) were enrolled in the study. The items (including Δ-factors, that is, worsening according to the patient report) that were found to correlate with the gold standard on multiple regression were used to construct three separate 10-point indices of disease activity: (a) Δ-cardiopulmonary (4.0), Δ-skin (3.0), Δ-vascular (2.0), and Δ-articular/muscular (1.0) for patients with dSSc; (b) Δ-skin (2.5), erythrocyte sedimentation rate (ESR) >30 mm/1st h (2.5), Δ-cardiopulmonary (1.5), Δ-vascular (1.0), arthritis (1.0), hypocomplementaemia (1.0), and scleredema (0.5) for lSSc; (c) Δ-cardiopulmonary (2.0), Δ-skin (2.0), ESR >30 mm/1st h (1.5), total skin score >20 (1.0), hypocomplementaemia (1.0), scleredema (0.5), digital necrosis (0.5), Δ-vascular (0.5), arthritis (0.5), TLCO <80% (0.5) for all patients with SSc. The three indices were validated by the jackknife technique. 
Finally, receiver operating characteristic curves were constructed in order to define the value of the index with the best discriminant capacity for "active to very active" patients. CONCLUSIONS: Three feasible, reliable, and valid preliminary indices to define disease activity in SSc were constructed. PMID:11350848
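
Each of the three indices is a weighted checklist, so a score is the sum of the weights of the items present. The sketch below transcribes the weights listed in the abstract and checks that each index totals 10 points; the dictionary keys (with `delta_` standing for the patient-reported worsening items) and the function name are hypothetical, and the ROC-derived cut-off is not reproduced because the abstract does not give it.

```python
# Item weights as listed in the abstract; key names are illustrative only.
INDEX_WEIGHTS = {
    "dSSc": {
        "delta_cardiopulmonary": 4.0, "delta_skin": 3.0,
        "delta_vascular": 2.0, "delta_articular_muscular": 1.0,
    },
    "lSSc": {
        "delta_skin": 2.5, "esr_over_30": 2.5, "delta_cardiopulmonary": 1.5,
        "delta_vascular": 1.0, "arthritis": 1.0, "hypocomplementaemia": 1.0,
        "scleredema": 0.5,
    },
    "all": {
        "delta_cardiopulmonary": 2.0, "delta_skin": 2.0, "esr_over_30": 1.5,
        "total_skin_score_over_20": 1.0, "hypocomplementaemia": 1.0,
        "scleredema": 0.5, "digital_necrosis": 0.5, "delta_vascular": 0.5,
        "arthritis": 0.5, "tlco_under_80_percent": 0.5,
    },
}

def activity_score(subset, findings):
    """Sum the weights of the items recorded as present for one patient."""
    weights = INDEX_WEIGHTS[subset]
    unknown = set(findings) - set(weights)
    if unknown:
        raise KeyError(f"items not in the {subset} index: {sorted(unknown)}")
    return sum(w for item, w in weights.items() if findings.get(item, False))

# Each index is a 10-point scale.
totals = {name: sum(w.values()) for name, w in INDEX_WEIGHTS.items()}
example = activity_score("dSSc", {"delta_skin": True, "delta_vascular": True})
print(totals, example)
```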

Valentini, G; Della, R; Bombardieri, S; Bencivelli, W; Silman, A; D'Angelo, S; Cerinic, M; Belch, J; Black, C; Bruhlmann, P; Czirjak, L; De Luca, A; Drosos, A; Ferri, C; Gabrielli, A; Giacomelli, R; Hayem, G; Inanc, M; McHugh, N; Nielsen, H; Rosada, M; Scorza, R; Stork, J; Sysa, A; van den Hoogen, F H J; Vlachoyiannopoulo..., P

2001-01-01

219

A New Method for Predicting the Subcellular Localization of Eukaryotic Proteins with Both Single and Multiple Sites: Euk-mPLoc 2.0  

PubMed Central

Information on the subcellular locations of proteins is important for in-depth studies of cell biology. It is very useful for proteomics, system biology and drug development as well. However, most existing methods for predicting protein subcellular location can only cover 5 to 12 location sites. Also, they are limited to dealing with single-location proteins and hence fail to work for multiplex proteins, which can simultaneously exist at, or move between, two or more location sites. Actually, multiplex proteins of this kind usually possess some important biological functions worthy of our special notice. A new predictor called “Euk-mPLoc 2.0” is developed by hybridizing the gene ontology information, functional domain information, and sequential evolutionary information through three different modes of pseudo amino acid composition. It can be used to identify eukaryotic proteins among the following 22 locations: (1) acrosome, (2) cell wall, (3) centriole, (4) chloroplast, (5) cyanelle, (6) cytoplasm, (7) cytoskeleton, (8) endoplasmic reticulum, (9) endosome, (10) extracell, (11) Golgi apparatus, (12) hydrogenosome, (13) lysosome, (14) melanosome, (15) microsome, (16) mitochondria, (17) nucleus, (18) peroxisome, (19) plasma membrane, (20) plastid, (21) spindle pole body, and (22) vacuole. Compared with the existing methods for predicting eukaryotic protein subcellular localization, the new predictor is much more powerful and flexible, particularly in dealing with proteins with multiple locations and proteins without available accession numbers. For a newly-constructed stringent benchmark dataset which contains both single- and multiple-location proteins and in which none of the proteins has pairwise sequence identity to any other in the same location, the overall jackknife success rate achieved by Euk-mPLoc 2.0 is more than 24% higher than those by any of the existing predictors. 
As a user-friendly web-server, Euk-mPLoc 2.0 is freely accessible at http://www.csbio.sjtu.edu.cn/bioinf/euk-multi-2/. For a query protein sequence of 400 amino acids, it will take about 15 seconds for the web-server to yield the predicted result; the longer the sequence is, the more time it may usually need. It is anticipated that the novel approach and the powerful predictor as presented in this paper will have a significant impact to Molecular Cell Biology, System Biology, Proteomics, Bioinformatics, and Drug Development. PMID:20368981
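
The "jackknife success rate" reported for predictors of this kind is a leave-one-out test: each benchmark sample is removed, predicted from the remaining samples, and counted as a hit if the prediction matches. The sketch below illustrates that bookkeeping only. It assumes a single label per sample and a plain 1-nearest-neighbour rule on toy 2-D features, whereas Euk-mPLoc 2.0 handles multiple locations and uses pseudo amino acid composition features; all names and data are hypothetical.

```python
def nearest_label(query, samples):
    """Label of the training sample closest to `query` (squared Euclidean distance)."""
    closest = min(samples, key=lambda s: sum((a - b) ** 2 for a, b in zip(query, s[0])))
    return closest[1]

def jackknife_success_rate(samples):
    """Hold out each sample in turn, predict it from the rest, and count the hits."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_label(features, rest) == label:
            hits += 1
    return hits / len(samples)

# Toy benchmark: (feature vector, location label); one "A" sample sits near the "B" cluster.
benchmark = [
    ((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"), ((4, 4), "A"),
    ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B"),
]
rate = jackknife_success_rate(benchmark)
print(rate)
```

Because every sample is tested exactly once and the split is fixed by the data, the result does not depend on a random partition, which is one reason the test is popular for benchmark comparisons.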

Chou, Kuo-Chen; Shen, Hong-Bin

2010-01-01

220

A Multi-Label Classifier for Predicting the Subcellular Localization of Gram-Negative Bacterial Proteins with Both Single and Multiple Sites  

PubMed Central

Prediction of protein subcellular localization is a challenging problem, particularly when the system concerned contains both singleplex and multiplex proteins. In this paper, by introducing the “multi-label scale” and hybridizing the information of gene ontology with the sequential evolution information, a novel predictor called iLoc-Gneg is developed for predicting the subcellular localization of Gram-negative bacterial proteins with both single-location and multiple-location sites. For facilitating comparison, the same stringent benchmark dataset used to estimate the accuracy of Gneg-mPLoc was adopted to demonstrate the power of iLoc-Gneg. The dataset contains 1,392 Gram-negative bacterial proteins classified into the following eight locations: (1) cytoplasm, (2) extracellular, (3) fimbrium, (4) flagellum, (5) inner membrane, (6) nucleoid, (7) outer membrane, and (8) periplasm. Of the 1,392 proteins, 1,328 are each with only one subcellular location and the other 64 are each with two subcellular locations, but none of the proteins included has pairwise sequence identity to any other in the same subset (subcellular location). It was observed that the overall success rate by jackknife test on such a stringent benchmark dataset by iLoc-Gneg was over 91%, which is about 6% higher than that by Gneg-mPLoc. As a user-friendly web-server, iLoc-Gneg is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Gneg. Meanwhile, a step-by-step guide is provided on how to use the web-server to get the desired results. Furthermore, for the user's convenience, the iLoc-Gneg web-server also has the function to accept the batch job submission, which is not available in the existing version of Gneg-mPLoc web-server. It is anticipated that iLoc-Gneg may become a useful high throughput tool for Molecular Cell Biology, Proteomics, System Biology, and Drug Development. PMID:21698097

Xiao, Xuan; Wu, Zhi-Cheng; Chou, Kuo-Chen

2011-01-01

221

A DEEP SEARCH FOR EXTENDED RADIO CONTINUUM EMISSION FROM DWARF SPHEROIDAL GALAXIES: IMPLICATIONS FOR PARTICLE DARK MATTER  

SciTech Connect

We present deep radio observations of four nearby dwarf spheroidal (dSph) galaxies, designed to detect extended synchrotron emission resulting from weakly interacting massive particle (WIMP) dark matter annihilations in their halos. Models by Colafrancesco et al. (CPU07) predict the existence of angularly large, smoothly distributed radio halos in such systems, which stem from electron and positron annihilation products spiraling in a turbulent magnetic field. We map a total of 40.5 deg$^2$ around the Draco, Ursa Major II, Coma Berenices, and Willman 1 dSphs with the Green Bank Telescope (GBT) at 1.4 GHz to detect this annihilation signature, greatly reducing discrete-source confusion using the NVSS catalog. We achieve a sensitivity of $\sigma_{\rm sub} \lesssim 7$ mJy beam$^{-1}$ in our discrete source-subtracted maps, implying that the NVSS is highly effective at removing background sources from GBT maps. For Draco we obtained approximately concurrent Very Large Array observations to quantify the variability of the discrete source background, and find it to have a negligible effect on our results. We construct radial surface brightness profiles from each of the subtracted maps, and jackknife the data to quantify the significance of the features therein. At the $\sim$10' resolution of our observations, foregrounds contribute a standard deviation of 1.8 mJy beam$^{-1}$ $\leq \sigma_{\rm ast} \leq$ 5.7 mJy beam$^{-1}$ to our high-latitude maps, with the emission in Draco and Coma dominated by foregrounds. On the other hand, we find no significant emission in the Ursa Major II and Willman 1 fields, and explore the implications of non-detections in these fields for particle dark matter using the fiducial models of CPU07. 
For a WIMP mass $M_\chi = 100$ GeV annihilating into $b\bar{b}$ final states and $B = 1\,\mu$G, upper limits on the annihilation cross-section for Ursa Major II and Willman 1 are $\log(\langle\sigma v\rangle_\chi,\ \mathrm{cm^3\,s^{-1}}) \lesssim -25$ for the preferred set of charged particle propagation parameters adopted by CPU07; this is comparable to that inferred at $\gamma$-ray energies from the two-year Fermi Large Area Telescope data. We discuss three avenues for improving the constraints on $\langle\sigma v\rangle_\chi$ presented here, and conclude that deep radio observations of dSphs are highly complementary to indirect WIMP searches at higher energies.

Spekkens, Kristine [Department of Physics, Royal Military College of Canada, P.O. Box 17000, Station Forces, Kingston, Ontario K7K 7B4 (Canada); Mason, Brian S. [National Radio Astronomy Observatory, 520 Edgemont Road, Charlottesville, VA 22903-2475 (United States); Aguirre, James E. [Department of Physics and Astronomy, University of Pennsylvania, 209 South 33rd Street, Philadelphia, PA 19104 (United States); Nhan, Bang, E-mail: kristine.spekkens@rmc.ca [Department of Astrophysical and Planetary Sciences, University of Colorado, 391 UCB, Boulder, CO 80309 (United States)

2013-08-10

222

A Deep Search for Extended Radio Continuum Emission from Dwarf Spheroidal Galaxies: Implications for Particle Dark Matter  

NASA Astrophysics Data System (ADS)

We present deep radio observations of four nearby dwarf spheroidal (dSph) galaxies, designed to detect extended synchrotron emission resulting from weakly interacting massive particle (WIMP) dark matter annihilations in their halos. Models by Colafrancesco et al. (CPU07) predict the existence of angularly large, smoothly distributed radio halos in such systems, which stem from electron and positron annihilation products spiraling in a turbulent magnetic field. We map a total of 40.5 deg$^2$ around the Draco, Ursa Major II, Coma Berenices, and Willman 1 dSphs with the Green Bank Telescope (GBT) at 1.4 GHz to detect this annihilation signature, greatly reducing discrete-source confusion using the NVSS catalog. We achieve a sensitivity of $\sigma_{\rm sub} \lesssim 7$ mJy beam$^{-1}$ in our discrete source-subtracted maps, implying that the NVSS is highly effective at removing background sources from GBT maps. For Draco we obtained approximately concurrent Very Large Array observations to quantify the variability of the discrete source background, and find it to have a negligible effect on our results. We construct radial surface brightness profiles from each of the subtracted maps, and jackknife the data to quantify the significance of the features therein. At the $\sim$10' resolution of our observations, foregrounds contribute a standard deviation of 1.8 mJy beam$^{-1}$ $\leq \sigma_{\rm ast} \leq$ 5.7 mJy beam$^{-1}$ to our high-latitude maps, with the emission in Draco and Coma dominated by foregrounds. On the other hand, we find no significant emission in the Ursa Major II and Willman 1 fields, and explore the implications of non-detections in these fields for particle dark matter using the fiducial models of CPU07. 
For a WIMP mass $M_\chi = 100$ GeV annihilating into $b\bar{b}$ final states and $B = 1\,\mu$G, upper limits on the annihilation cross-section for Ursa Major II and Willman 1 are $\log(\langle\sigma v\rangle_\chi,\ \mathrm{cm^3\,s^{-1}}) \lesssim -25$ for the preferred set of charged particle propagation parameters adopted by CPU07; this is comparable to that inferred at $\gamma$-ray energies from the two-year Fermi Large Area Telescope data. We discuss three avenues for improving the constraints on $\langle\sigma v\rangle_\chi$ presented here, and conclude that deep radio observations of dSphs are highly complementary to indirect WIMP searches at higher energies.

Spekkens, Kristine; Mason, Brian S.; Aguirre, James E.; Nhan, Bang

2013-08-01

223

iRSpot-PseDNC: identify recombination spots with pseudo dinucleotide composition.  

PubMed

Meiotic recombination is an important biological process. As a main driving force of evolution, recombination provides natural new combinations of genetic variations. Rather than randomly occurring across a genome, meiotic recombination takes place in some genomic regions (the so-called 'hotspots') with higher frequencies, and in the other regions (the so-called 'coldspots') with lower frequencies. Therefore, the information of the hotspots and coldspots would provide useful insights for in-depth studying of the mechanism of recombination and the genome evolution process as well. So far, the recombination regions have been mainly determined by experiments, which are both expensive and time-consuming. With the avalanche of genome sequences generated in the postgenomic age, it is highly desired to develop automated methods for rapidly and effectively identifying the recombination regions. In this study, a predictor, called 'iRSpot-PseDNC', was developed for identifying the recombination hotspots and coldspots. In the new predictor, the samples of DNA sequences are formulated by a novel feature vector, the so-called 'pseudo dinucleotide composition' (PseDNC), into which six local DNA structural properties, i.e. three angular parameters (twist, tilt and roll) and three translational parameters (shift, slide and rise), are incorporated. It was observed by the rigorous jackknife test that the overall success rate achieved by iRSpot-PseDNC was >82% in identifying recombination spots in Saccharomyces cerevisiae, indicating the new predictor is promising or at least may become a complementary tool to the existing methods in this area. Although the benchmark data set used to train and test the current method was from S. cerevisiae, the basic approaches can also be extended to deal with all the other genomes. Particularly, it has not escaped our notice that the PseDNC approach can be also used to study many other DNA-related problems. 
As a user-friendly web-server, iRSpot-PseDNC is freely accessible at http://lin.uestc.edu.cn/server/iRSpot-PseDNC. PMID:23303794
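
The sequence-order part of the feature vector starts from the ordinary dinucleotide composition: the normalized frequencies of the 16 possible adjacent base pairs. The sketch below computes only that 16-component part. The additional "pseudo" components, which fold in the six local structural properties (twist, tilt, roll, shift, slide, rise), are deliberately omitted because the abstract does not give the property values or the weighting; the function name is hypothetical.

```python
from itertools import product

# The 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ..., TT.
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]

def dinucleotide_composition(sequence):
    """Normalized frequencies of overlapping dinucleotides in a DNA sequence.
    Pairs containing characters other than A, C, G, T are skipped."""
    sequence = sequence.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no valid dinucleotide")
    return [counts[d] / total for d in DINUCLEOTIDES]

vector = dinucleotide_composition("ACGT")  # contains AC, CG and GT once each
print(dict(zip(DINUCLEOTIDES, vector)))
```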

Chen, Wei; Feng, Peng-Mian; Lin, Hao; Chou, Kuo-Chen

2013-04-01

224

Automated determination of P-phase arrival times at regional and local distances using higher order statistics  

NASA Astrophysics Data System (ADS)

We present an algorithm for automatic P-phase arrival time determination for local and regional seismic events based on higher order statistics (HOS). Using skewness or kurtosis, a characteristic function is determined to which a new iterative picking algorithm is applied. For P-phase identification we apply the Akaike Information Criterion to the characteristic function, while for a precise determination of the P-phase arrival time a pragmatic picking algorithm is applied to a recalculated characteristic function. In addition, an automatic quality estimate is obtained, based on the slope and the signal-to-noise ratio, both calculated from the characteristic function. To get rid of erroneous picks, a jackknife procedure and an envelope function analysis are used. The algorithm is applied to a large data set with very heterogeneous qualities of P-onsets acquired by a temporary, regional seismic network of the EGELADOS-project in the southern Aegean. The reliability and robustness of the proposed algorithm are tested by comparing more than 3000 manually derived P readings, serving as reference picks, with the corresponding automatically estimated P-wave arrival times. We find an average deviation from the reference picks of 0.26 +/- 0.64 s when using kurtosis and 0.38 +/- 0.75 s when using skewness. If only picks automatically classified as excellent are considered, the average difference from the reference picks is 0.07 +/- 0.31 s and 0.07 +/- 0.41 s, respectively. However, substantially more P-arrival times are determined when using kurtosis, indicating that the characteristic function derived from kurtosis estimation is to be preferred. Since the characteristic function is calculated recursively, the algorithm is very fast and hence suited for earthquake early warning purposes. 
Furthermore, a comparative study with automatically derived P-readings using Allen's and Baer & Kradolfer's picking algorithms applied to the same data set demonstrates better quantitative and qualitative performance of the HOS approach. This study shows that precise automatic P-onset determination is feasible, even when using data sets with very heterogeneous signal-to-noise ratios.
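
The Akaike Information Criterion step can be illustrated with the variance-based AIC form that is widely used for onset picking: the series is split at every candidate sample, and the split that best separates a low-variance segment from a high-variance one minimizes the criterion. This is a sketch under assumptions, not the published algorithm: it applies the criterion directly to a synthetic trace rather than to a kurtosis- or skewness-based characteristic function, and it omits the iterative refinement, quality estimate, and jackknife outlier rejection. The trace and the names are invented.

```python
import math

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def aic_pick(series):
    """Index k minimizing k*log(var(series[:k])) + (n-k-1)*log(var(series[k:]))."""
    n = len(series)
    best_k, best_aic = None, float("inf")
    for k in range(2, n - 1):  # keep at least two samples on each side
        v_before = max(variance(series[:k]), 1e-12)  # guard against log(0)
        v_after = max(variance(series[k:]), 1e-12)
        aic = k * math.log(v_before) + (n - k - 1) * math.log(v_after)
        if aic < best_aic:
            best_k, best_aic = k, aic
    return best_k

# Synthetic trace: a weak oscillation standing in for noise, with a strong
# oscillation switched on at sample 200 standing in for the P onset.
onset = 200
trace = [0.1 * math.sin(1.3 * i) + (5.0 * math.sin(0.9 * (i - onset)) if i >= onset else 0.0)
         for i in range(400)]
pick = aic_pick(trace)
print(pick)
```

This direct version recomputes both variances for every split and so costs quadratic time; running sums would make it linear, in the spirit of the recursive calculation the abstract credits for the method's speed.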

Küperkoch, L.; Meier, T.; Lee, J.; Friederich, W.; Working Group, EGELADOS

2010-05-01

225

Effective transcription factor binding site prediction using a combination of optimization, a genetic algorithm and discriminant analysis to capture distant interactions  

PubMed Central

Background Reliable transcription factor binding site (TFBS) prediction methods are essential for computer annotation of large amounts of genome sequence data. However, current methods to predict TFBSs are hampered by the high false-positive rates that occur when only sequence conservation at the core binding-sites is considered. Results To improve this situation, we have quantified the performance of several Position Weight Matrix (PWM) algorithms, using exhaustive approaches to find their optimal length and position. We applied these approaches to bio-medically important TFBSs involved in the regulation of cell growth and proliferation as well as in inflammatory, immune, and antiviral responses (NF-κB, ISGF3, IRF1, STAT1), obesity and lipid metabolism (PPAR, SREBP, HNF4), and regulation of steroidogenic (SF-1) and cell cycle (E2F) gene expression. We have also gained extra specificity using a method, entitled SiteGA, which takes into account structural interactions within TFBS core and flanking regions, using a genetic algorithm (GA) with a discriminant function of locally positioned dinucleotide (LPD) frequencies. To ensure a higher confidence in our approach, we applied resampling (jackknife and bootstrap) tests for the comparison; it appears that optimized PWM and SiteGA have shown similar recognition performances. Then we applied SiteGA and optimized PWMs (both separately and together) to sequences in the Eukaryotic Promoter Database (EPD). The resulting SiteGA recognition models can now be used to search sequences for BSs using the web tool, SiteGA. Analysis of dependencies between close and distant LPDs revealed by SiteGA models has shown that the most significant correlations are between close LPDs, and are generally located in the core (footprint) region. A greater number of less significant correlations are mainly between distant LPDs, which spanned both core and flanking regions. 
When SiteGA and optimized PWM models were applied together, this substantially reduced false positives at least at higher stringencies. Conclusion Based on this analysis, SiteGA adds substantial specificity even to optimized PWMs and may be considered for large-scale genome analysis. It adds to the range of techniques available for TFBS prediction, and EPD analysis has led to a list of genes which appear to be regulated by the above TFs. PMID:18093302

Levitsky, Victor G; Ignatieva, Elena V; Ananko, Elena A; Turnaev, Igor I; Merkulova, Tatyana I; Kolchanov, Nikolay A; Hodgman, TC

2007-01-01

226

NR-2L: A Two-Level Predictor for Identifying Nuclear Receptor Subfamilies Based on Sequence-Derived Features  

PubMed Central

Nuclear receptors (NRs) are one of the most abundant classes of transcriptional regulators in animals. They regulate diverse functions, such as homeostasis, reproduction, development and metabolism. Therefore, NRs are a very important target for drug development. Nuclear receptors form a superfamily of phylogenetically related proteins and have been subdivided into different subfamilies due to their domain diversity. In this study, a two-level predictor, called NR-2L, was developed that can be used to identify a query protein as a nuclear receptor or not based on its sequence information alone; if it is, the prediction will be automatically continued to further identify it among the following seven subfamilies: (1) thyroid hormone like (NR1), (2) HNF4-like (NR2), (3) estrogen like (NR3), (4) nerve growth factor IB-like (NR4), (5) fushi tarazu-F1 like (NR5), (6) germ cell nuclear factor like (NR6), and (7) knirps like (NR0). The identification was made by the Fuzzy K nearest neighbor (FK-NN) classifier based on the pseudo amino acid composition formed by incorporating various physicochemical and statistical features derived from the protein sequences, such as amino acid composition, dipeptide composition, complexity factor, and low-frequency Fourier spectrum components. As a demonstration, it was shown through some benchmark datasets derived from the NucleaRDB and UniProt with low redundancy that the overall success rates achieved by the jackknife test were about 93% and 89% in the first and second level, respectively. The high success rates indicate that the novel two-level predictor can be a useful vehicle for identifying NRs and their subfamilies. As a user-friendly web server, NR-2L is freely accessible at either http://icpr.jci.edu.cn/bioinfo/NR2L or http://www.jci-bioinfo.cn/NR2L. Each job submitted to NR-2L can contain up to 500 query protein sequences and be finished in less than 2 minutes. 
The fewer the query proteins, the shorter the time will usually be. All the program codes for NR-2L are available for non-commercial purpose upon request. PMID:21858146
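
The "jackknife test" reported in this and several neighbouring records is a leave-one-out procedure: each benchmark sample is held out in turn, predicted from all the others, and the fraction of correct predictions is the overall success rate. The following is a minimal sketch of that idea only. It assumes a plain K-nearest-neighbour vote in place of the fuzzy K-NN classifier named in the abstract, and the feature vectors, labels and function names are made up for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Majority vote among the k nearest training samples.

    Plain KNN is used here as a stand-in; the abstract's fuzzy KNN is not reproduced.
    """
    nearest = sorted(train, key=lambda sample: dist(sample[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def jackknife_success_rate(samples, k=3):
    """Hold out each sample once, predict it from the rest, return the fraction correct."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if knn_predict(rest, features, k) == label:
            hits += 1
    return hits / len(samples)


# Hypothetical two-dimensional feature vectors; real predictors use PseAAC features.
toy = [((0.0, 0.1), "NR"), ((0.1, 0.0), "NR"), ((0.2, 0.1), "NR"),
       ((1.0, 1.1), "non-NR"), ((1.1, 0.9), "non-NR"), ((0.9, 1.0), "non-NR")]
rate = jackknife_success_rate(toy, k=1)
print(f"jackknife success rate: {rate:.2f}")
```

Because every sample is used for testing exactly once and never appears in its own training set, the result does not depend on a random split, which is presumably why these abstracts describe the test as rigorous.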

Wang, Pu; Xiao, Xuan; Chou, Kuo-Chen

2011-01-01

227

A multi-label classifier for predicting the subcellular localization of gram-negative bacterial proteins with both single and multiple sites.  

PubMed

Prediction of protein subcellular localization is a challenging problem, particularly when the system concerned contains both singleplex and multiplex proteins. In this paper, by introducing the "multi-label scale" and hybridizing the information of gene ontology with the sequential evolution information, a novel predictor called iLoc-Gneg is developed for predicting the subcellular localization of gram-negative bacterial proteins with both single-location and multiple-location sites. For facilitating comparison, the same stringent benchmark dataset used to estimate the accuracy of Gneg-mPLoc was adopted to demonstrate the power of iLoc-Gneg. The dataset contains 1,392 gram-negative bacterial proteins classified into the following eight locations: (1) cytoplasm, (2) extracellular, (3) fimbrium, (4) flagellum, (5) inner membrane, (6) nucleoid, (7) outer membrane, and (8) periplasm. Of the 1,392 proteins, 1,328 are each with only one subcellular location and the other 64 are each with two subcellular locations, but none of the proteins included has pairwise sequence identity to any other in a same subset (subcellular location). It was observed that the overall success rate by jackknife test on such a stringent benchmark dataset by iLoc-Gneg was over 91%, which is about 6% higher than that by Gneg-mPLoc. As a user-friendly web-server, iLoc-Gneg is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Gneg. Meanwhile, a step-by-step guide is provided on how to use the web-server to get the desired results. Furthermore, for the user's convenience, the iLoc-Gneg web-server also has the function to accept the batch job submission, which is not available in the existing version of Gneg-mPLoc web-server. It is anticipated that iLoc-Gneg may become a useful high throughput tool for Molecular Cell Biology, Proteomics, System Biology, and Drug Development. PMID:21698097

Xiao, Xuan; Wu, Zhi-Cheng; Chou, Kuo-Chen

2011-01-01

228

iRSpot-PseDNC: identify recombination spots with pseudo dinucleotide composition  

PubMed Central

Meiotic recombination is an important biological process. As a main driving force of evolution, recombination provides natural new combinations of genetic variations. Rather than randomly occurring across a genome, meiotic recombination takes place in some genomic regions (the so-called ‘hotspots’) with higher frequencies, and in the other regions (the so-called ‘coldspots’) with lower frequencies. Therefore, the information of the hotspots and coldspots would provide useful insights for in-depth studying of the mechanism of recombination and the genome evolution process as well. So far, the recombination regions have been mainly determined by experiments, which are both expensive and time-consuming. With the avalanche of genome sequences generated in the postgenomic age, it is highly desired to develop automated methods for rapidly and effectively identifying the recombination regions. In this study, a predictor, called ‘iRSpot-PseDNC’, was developed for identifying the recombination hotspots and coldspots. In the new predictor, the samples of DNA sequences are formulated by a novel feature vector, the so-called ‘pseudo dinucleotide composition’ (PseDNC), into which six local DNA structural properties, i.e. three angular parameters (twist, tilt and roll) and three translational parameters (shift, slide and rise), are incorporated. It was observed by the rigorous jackknife test that the overall success rate achieved by iRSpot-PseDNC was >82% in identifying recombination spots in Saccharomyces cerevisiae, indicating the new predictor is promising or at least may become a complementary tool to the existing methods in this area. Although the benchmark data set used to train and test the current method was from S. cerevisiae, the basic approaches can also be extended to deal with all the other genomes. Particularly, it has not escaped our notice that the PseDNC approach can be also used to study many other DNA-related problems. 
As a user-friendly web-server, iRSpot-PseDNC is freely accessible at http://lin.uestc.edu.cn/server/iRSpot-PseDNC. PMID:23303794
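
As a rough illustration of the feature vector described above, the sketch below computes only the conventional part of a dinucleotide composition, that is, the 16 normalized dinucleotide frequencies of a DNA sequence. The additional "pseudo" components that fold in the six structural properties (twist, tilt, roll, shift, slide, rise) need property tables not given in the abstract and are deliberately left out; the function name and example sequence are hypothetical.

```python
from itertools import product

# The 16 possible dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCS = ["".join(pair) for pair in product("ACGT", repeat=2)]


def dinucleotide_composition(seq):
    """Return the 16 normalized dinucleotide frequencies of a DNA sequence.

    Overlapping pairs are counted; pairs containing characters other than
    A, C, G, T (for example N) are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DINUCS, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in DINUCS]


vec = dinucleotide_composition("ACGTAC")  # made-up sequence
print(dict(zip(DINUCS, vec)))
```

A vector of this kind has a fixed length regardless of sequence length, which is what lets sequences of different sizes be fed to the same classifier.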

Chen, Wei; Feng, Peng-Mian; Lin, Hao; Chou, Kuo-Chen

2013-01-01

229

iNR-PhysChem: A Sequence-Based Predictor for Identifying Nuclear Receptors and Their Subfamilies via Physical-Chemical Property Matrix  

PubMed Central

Nuclear receptors (NRs) form a family of ligand-activated transcription factors that regulate a wide variety of biological processes, such as homeostasis, reproduction, development, and metabolism. The human genome contains 48 genes encoding NRs. These receptors have become one of the most important targets for therapeutic drug development. According to their different action mechanisms or functions, NRs have been classified into seven subfamilies. With the avalanche of protein sequences generated in the postgenomic age, we are facing the following challenging problems. Given an uncharacterized protein sequence, how can we identify whether it is a nuclear receptor? If it is, to which subfamily does it belong? To address these problems, we developed a predictor called iNR-PhysChem in which the protein samples were expressed by a novel mode of pseudo amino acid composition (PseAAC) whose components were derived from a physical-chemical matrix via a series of auto-covariance and cross-covariance transformations. It was observed that the overall success rate achieved by iNR-PhysChem was over 98% in identifying NRs or non-NRs, and over 92% in identifying NRs among the following seven subfamilies: NR1 (thyroid hormone like), NR2 (HNF4-like), NR3 (estrogen like), NR4 (nerve growth factor IB-like), NR5 (fushi tarazu-F1 like), NR6 (germ cell nuclear factor like), and NR0 (knirps like). These rates were derived by the jackknife tests on a stringent benchmark dataset in which none of protein sequences included has pairwise sequence identity to any other in a same subset. As a user-friendly web-server, iNR-PhysChem is freely accessible to the public at either http://www.jci-bioinfo.cn/iNR-PhysChem or http://icpr.jci.edu.cn/bioinfo/iNR-PhysChem. Also a step-by-step guide is provided on how to use the web-server to get the desired results without the need to follow the complicated mathematics involved in developing the predictor. 
It is anticipated that iNR-PhysChem may become a useful high throughput tool for both basic research and drug design. PMID:22363503

Xiao, Xuan; Wang, Pu; Chou, Kuo-Chen

2012-01-01

230

iLoc-Hum: using the accumulation-label scale to predict subcellular locations of human proteins with both single and multiple sites.  

PubMed

Although numerous efforts have been made for predicting the subcellular locations of proteins based on their sequence information, it still remains as a challenging problem, particularly when query proteins may have the multiplex character, i.e., they simultaneously exist, or move between, two or more different subcellular location sites. Most of the existing methods were established on the assumption: a protein has one, and only one, subcellular location. Actually, recent evidence has indicated an increasing number of human proteins having multiple subcellular locations. This kind of multiplex proteins should not be ignored because they may bear some special biological functions worthy of our attention. Based on the accumulation-label scale, a new predictor, called iLoc-Hum, was developed for identifying the subcellular localization of human proteins with both single and multiple location sites. As a demonstration, the jackknife cross-validation was performed with iLoc-Hum on a benchmark dataset of human proteins that covers the following 14 location sites: centrosome, cytoplasm, cytoskeleton, endoplasmic reticulum, endosome, extracellular, Golgi apparatus, lysosome, microsome, mitochondrion, nucleus, peroxisome, plasma membrane, and synapse, where some proteins belong to two, three or four locations but none has 25% or higher pairwise sequence identity to any other in the same subset. For such a complicated and stringent system, the overall success rate achieved by iLoc-Hum was 76%, which is remarkably higher than that by any of the existing predictors that also have the capacity to deal with this kind of system. Further comparisons were also made via two independent datasets; all indicated that the success rates by iLoc-Hum were even more significantly higher than its counterparts. As a user-friendly web-server, iLoc-Hum is freely accessible to the public at or . 
For the convenience of most experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results by choosing either a straightforward submission or a batch submission, without the need to follow the complicated mathematical equations involved. PMID:22134333

Chou, Kuo-Chen; Wu, Zhi-Cheng; Xiao, Xuan

2012-02-01

231

A Multi-Label Predictor for Identifying the Subcellular Locations of Singleplex and Multiplex Eukaryotic Proteins  

PubMed Central

Subcellular locations of proteins are important functional attributes. An effective and efficient subcellular localization predictor is necessary for rapidly and reliably annotating subcellular locations of proteins. Most of existing subcellular localization methods are only used to deal with single-location proteins. Actually, proteins may simultaneously exist at, or move between, two or more different subcellular locations. To better reflect characteristics of multiplex proteins, it is highly desired to develop new methods for dealing with them. In this paper, a new predictor, called Euk-ECC-mPLoc, by introducing a powerful multi-label learning approach which exploits correlations between subcellular locations and hybridizing gene ontology with dipeptide composition information, has been developed that can be used to deal with systems containing both singleplex and multiplex eukaryotic proteins. It can be utilized to identify eukaryotic proteins among the following 22 locations: (1) acrosome, (2) cell membrane, (3) cell wall, (4) centrosome, (5) chloroplast, (6) cyanelle, (7) cytoplasm, (8) cytoskeleton, (9) endoplasmic reticulum, (10) endosome, (11) extracellular, (12) Golgi apparatus, (13) hydrogenosome, (14) lysosome, (15) melanosome, (16) microsome, (17) mitochondrion, (18) nucleus, (19) peroxisome, (20) spindle pole body, (21) synapse, and (22) vacuole. Experimental results on a stringent benchmark dataset of eukaryotic proteins by jackknife cross validation test show that the average success rate and overall success rate obtained by Euk-ECC-mPLoc were 69.70% and 81.54%, respectively, indicating that our approach is quite promising. Particularly, the success rates achieved by Euk-ECC-mPLoc for small subsets were remarkably improved, indicating that it holds a high potential for simulating the development of the area. 
As a user-friendly web-server, Euk-ECC-mPLoc is freely accessible to the public at the website http://levis.tongji.edu.cn:8080/bioinfo/Euk-ECC-mPLoc/. We believe that Euk-ECC-mPLoc may become a useful high-throughput tool, or at least play a complementary role to the existing predictors in identifying subcellular locations of eukaryotic proteins. PMID:22629314

Wang, Xiao; Li, Guo-Zheng

2012-01-01

232

EEG spectral coherence data distinguish chronic fatigue syndrome patients from healthy controls and depressed patients-A case control study  

PubMed Central

Background Previous studies suggest central nervous system involvement in chronic fatigue syndrome (CFS), yet there are no established diagnostic criteria. CFS may be difficult to differentiate from clinical depression. The study's objective was to determine if spectral coherence, a computational derivative of spectral analysis of the electroencephalogram (EEG), could distinguish patients with CFS from healthy control subjects and not erroneously classify depressed patients as having CFS. Methods This is a study, conducted in an academic medical center electroencephalography laboratory, of 632 subjects: 390 healthy normal controls, 70 patients with carefully defined CFS, 24 with major depression, and 148 with general fatigue. Aside from fatigue, all patients were medically healthy by history and examination. EEGs were obtained and spectral coherences calculated after extensive artifact removal. Principal Components Analysis identified coherence factors and corresponding factor loading patterns. Discriminant analysis determined whether spectral coherence factors could reliably discriminate CFS patients from healthy control subjects without misclassifying depression as CFS. Results Analysis of EEG coherence data from a large sample (n = 632) of patients and healthy controls identified 40 factors explaining 55.6% total variance. Factors showed highly significant group differentiation (p < .0004) identifying 89.5% of unmedicated female CFS patients and 92.4% of healthy female controls. Recursive jackknifing showed predictions were stable. A conservative 10-factor discriminant function model was subsequently applied, and also showed highly significant group discrimination (p < .001), accurately classifying 88.9% unmedicated males with CFS, and 82.4% unmedicated male healthy controls. No patient with depression was classified as having CFS. The model was less accurate (73.9%) in identifying CFS patients taking psychoactive medications. 
Factors involving the temporal lobes were of primary importance. Conclusions EEG spectral coherence analysis identified unmedicated patients with CFS and healthy control subjects without misclassifying depressed patients as CFS, providing evidence that CFS patients demonstrate brain physiology that is not observed in healthy normals or patients with major depression. Studies of new CFS patients and comparison groups are required to determine the possible clinical utility of this test. The results concur with other studies finding neurological abnormalities in CFS, and implicate temporal lobe involvement in CFS pathophysiology. PMID:21722376

2011-01-01

233

Application of threshold-bias independent analysis to eye-tracking and FROC data  

PubMed Central

Rationale and Objectives Studies of medical image interpretation have focused on either assessing radiologists’ performance using, for example, the receiver operating characteristic (ROC) paradigm, or assessing the interpretive process by analyzing eye-tracking (ET) data. Analysis of ET data has not benefited from threshold-bias independent figures-of-merit (FOMs) analogous to the area under the ROC curve. The aim was to demonstrate the feasibility of such FOMs and to measure the agreement between figures-of-merit derived from free-response ROC (FROC) and ET data. Methods Eight expert breast radiologists interpreted a case set of 120 two-view mammograms while eye-position data and FROC data were continuously collected during the interpretation interval. Regions that attract prolonged (>800 ms) visual attention were considered to be virtual marks, and ratings based on the dwell and approach-rate (inverse of time-to-hit) were assigned to them. The virtual ratings were used to define threshold-bias independent FOMs in a manner analogous to the area under the trapezoidal alternative FROC (AFROC) curve (0 = worst, 1 = best). Agreement at the case level (0.5 = chance, 1 = perfect) was measured using the jackknife and 95% confidence intervals (CI) for the FOMs and agreement were estimated using the bootstrap. Results The AFROC mark-ratings FOM was largest, 0.734, CI = (0.65, 0.81), followed by the dwell 0.460 (0.34, 0.59) and then by the approach-rate FOM 0.336 (0.25, 0.46). The differences between the FROC mark-ratings FOM and the perceptual FOMs were significant (p < 0.05). All pairwise agreements were significantly better than chance: ratings vs. dwell 0.707 (0.63, 0.88), dwell vs. approach-rate 0.703 (0.60, 0.79) and rating vs. approach-rate 0.606 (0.53, 0.68). The ratings vs. approach-rate agreement was significantly smaller than the dwell vs. approach-rate agreement (p = 0.008). 
Conclusions Leveraging current methods developed for analyzing observer performance data could complement current ways of analyzing ET data and lead to new insights. PMID:23040503

Chakraborty, Dev P.; Yoon, Hong-Jun; Mello-Thoms, Claudia

2012-01-01

234

Measuring agreement between rating interpretations and binary clinical interpretations of images: a simulation study of methods for quantifying the clinical relevance of an observer performance paradigm  

NASA Astrophysics Data System (ADS)

Laboratory receiver operating characteristic (ROC) studies, that are often used to evaluate medical imaging systems, differ from ‘live’ clinical interpretations in several respects which could compromise their clinical relevance. The aim was to develop methodology for quantifying the clinical relevance of a laboratory ROC study. A simulator was developed to generate ROC ratings data and binary clinical interpretations classified as correct or incorrect for a common set of images interpreted under clinical and laboratory conditions. The area under the trapezoidal ROC curve (AUC) was used as the laboratory figure-of-merit and the fraction of correct clinical decisions as the clinical figure-of-merit. Conventional agreement measures (Pearson, Spearman, Kendall and kappa) between the bootstrap-induced fluctuations of the two figures of merit were estimated. A jackknife pseudovalue transformation applied to the figures of merit was also investigated as a way to capture agreement existing at the individual image level that could be lost at the figure-of-merit level. It is shown that the pseudovalues define a relevance-ROC curve. The area under this curve (rAUC) measures the ability of the laboratory figure-of-merit-based pseudovalues to correctly classify incorrect versus correct clinical interpretations. Therefore, rAUC is a measure of the clinical relevance of an ROC study. The conventional measures and rAUC were compared under varying simulator conditions. It was found that design details of the ROC study, namely the number of bins, the difficulty level of the images, the ratio of disease-present to disease-absent images and the unavoidable difference between laboratory and clinical performance levels, can lead to serious underestimation of the agreement as indicated by conventional agreement measures, even for perfectly correlated data, while rAUC showed high agreement and was relatively immune to these details. 
At the same time rAUC was sensitive to factors such as intrinsic correlation between the laboratory and clinical decision variables and differences in reporting thresholds that are expected to influence agreement both at the individual image level and at the figure-of-merit level. Suggestions are made for how to conduct relevance-ROC studies aimed at assessing agreement between laboratory and clinical interpretations. The method could be used to evaluate the clinical relevance of alternative scalar figures of merit, such as the sensitivity at a predefined specificity.
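
The jackknife pseudovalue transformation mentioned above can be sketched for the trapezoidal AUC as follows. This assumes the standard definition, pseudovalue_i = n·θ − (n−1)·θ₍₋ᵢ₎, where θ is the figure of merit on all n images and θ₍₋ᵢ₎ is the figure of merit with image i removed; the abstract does not spell out its exact formula, so treat this as an illustration rather than the author's implementation. The ratings are invented.

```python
def auc(neg, pos):
    """Trapezoidal (Wilcoxon) AUC: P(pos rating > neg rating) + 0.5 * P(tie)."""
    total = 0.0
    for x in neg:
        for y in pos:
            total += 1.0 if y > x else 0.5 if y == x else 0.0
    return total / (len(neg) * len(pos))


def jackknife_pseudovalues(neg, pos):
    """Return the full-sample AUC and one pseudovalue per image.

    Each pseudovalue is n * AUC_all - (n - 1) * AUC_without_image_i,
    with n the total number of images (disease-absent plus disease-present).
    """
    n = len(neg) + len(pos)
    full = auc(neg, pos)
    pv_neg = [n * full - (n - 1) * auc(neg[:i] + neg[i + 1:], pos)
              for i in range(len(neg))]
    pv_pos = [n * full - (n - 1) * auc(neg, pos[:j] + pos[j + 1:])
              for j in range(len(pos))]
    return full, pv_neg, pv_pos


# Hypothetical ratings on a 1-5 scale for disease-absent and disease-present images.
absent = [1, 2, 3, 2]
present = [3, 4, 2, 5]
full_auc, pv_absent, pv_present = jackknife_pseudovalues(absent, present)
print(full_auc, pv_absent, pv_present)
```

The point of the transformation is that it assigns each individual image a number reflecting its contribution to the overall figure of merit, so agreement can be examined image by image rather than only between two summary statistics.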

Chakraborty, Dev P.

2012-05-01

235

Computer-aided mass detection in mammography: false positive reduction via gray-scale invariant ranklet texture features.  

PubMed

In this work, gray-scale invariant ranklet texture features are proposed for false positive reduction (FPR) in computer-aided detection (CAD) of breast masses. Two main considerations are at the basis of this proposal. First, false positive (FP) marks surviving our previous CAD system seem to be characterized by specific texture properties that can be used to discriminate them from masses. Second, our previous CAD system achieves invariance to linear/nonlinear monotonic gray-scale transformations by encoding regions of interest into ranklet images through the ranklet transform, an image transformation similar to the wavelet transform, yet dealing with pixels' ranks rather than with their gray-scale values. Therefore, the new FPR approach proposed herein defines a set of texture features which are calculated directly from the ranklet images corresponding to the regions of interest surviving our previous CAD system, hence, ranklet texture features; then, a support vector machine (SVM) classifier is used for discrimination. As a result of this approach, texture-based information is used to discriminate FP marks surviving our previous CAD system; at the same time, invariance to linear/nonlinear monotonic gray-scale transformations of the new CAD system is guaranteed, as ranklet texture features are calculated from ranklet images that have this property themselves by construction. To emphasize the gray-scale invariance of both the previous and new CAD systems, training and testing are carried out without any in-between parameters' adjustment on mammograms having different gray-scale dynamics; in particular, training is carried out on analog digitized mammograms taken from a publicly available digital database, whereas testing is performed on full-field digital mammograms taken from an in-house database. 
Free-response receiver operating characteristic (FROC) curve analysis of the two CAD systems demonstrates that the new approach achieves a higher reduction of FP marks when compared to the previous one. Specifically, at 60%, 65%, and 70% per-mammogram sensitivity, the new CAD system achieves 0.50, 0.68, and 0.92 FP marks per mammogram, whereas at 70%, 75%, and 80% per-case sensitivity it achieves 0.37, 0.48, and 0.71 FP marks per mammogram, respectively. Conversely, at the same sensitivities, the previous CAD system reached 0.71, 0.87, and 1.15 FP marks per mammogram, and 0.57, 0.73, and 0.92 FPs per mammogram. Also, statistical significance of the difference between the two per-mammogram and per-case FROC curves is demonstrated by the p-value < 0.001 returned by jackknife FROC analysis performed on the two CAD systems. PMID:19291970

Masotti, Matteo; Lanconelli, Nico; Campanini, Renato

2009-02-01

236

Growing Season Temperatures in Europe and Climate Forcings Over the Past 1400 Years  

PubMed Central

Background The lack of instrumental data before the mid-19th century limits our understanding of present warming trends. In the absence of direct measurements, we used proxies that are natural or historical archives recording past climatic changes. A gridded reconstruction of spring-summer temperature was produced for Europe based on tree-rings, documentary records, pollen assemblages and ice cores. The majority of proxy series have an annual resolution. For a better inference of long-term climate variation, they were completed by low-resolution data (decadal or more), mostly on pollen and ice-core data. Methodology/Principal Findings An original spectral analog method was devised to deal with this heterogeneous dataset, and to preserve long-term variations and the variability of temperature series, so that the recent climate changes can be placed in the broader context of the past 1400 years. This preservation is possible because the method is not based on a calibration (regression) but on similarities between assemblages of proxies. The reconstruction of the April-September temperatures was validated with a jackknife technique. It was also compared to other spatially gridded temperature reconstructions, literature data, and glacier advance and retreat curves. We also attempted to relate the spatial distribution of European temperature anomalies to known solar and volcanic forcings. Conclusions We found that our results were accurate back to 750. Cold periods prior to the 20th century can be explained partly by low solar activity and/or high volcanic activity. The Medieval Warm Period (MWP) could be correlated to higher solar activity. During the 20th century, however, only anthropogenic forcing can explain the exceptionally high temperature rise. Warm periods of the Middle Ages were spatially more heterogeneous than the last decades, so locally it could have been warmer then. 
However, at the continental scale, the last decades were clearly warmer than any period of the last 1400 years. The heterogeneity of the MWP versus the homogeneity of the last decades is likely an argument that different forcings could have operated. These results support the view that Europe is experiencing a climate change not seen in the past 1400 years. PMID:20376366

Guiot, Joel; Corona, Christophe

2010-01-01

237

MSLoc-DT: a new method for predicting the protein subcellular location of multispecies based on decision templates.  

PubMed

Revealing the subcellular location of newly discovered protein sequences can bring insight to their function and guide research at the cellular level. The rapidly increasing number of sequences entering the genome databanks has called for the development of automated analysis methods. Currently, most existing methods used to predict protein subcellular locations cover only one, or a very limited number of species. Therefore, it is necessary to develop reliable and effective computational approaches to further improve the performance of protein subcellular prediction and, at the same time, cover more species. The current study reports the development of a novel predictor called MSLoc-DT to predict the protein subcellular locations of human, animal, plant, bacteria, virus, fungi, and archaea by introducing a novel feature extraction approach termed Amino Acid Index Distribution (AAID) and then fusing gene ontology information, sequential evolutionary information, and sequence statistical information through four different modes of pseudo amino acid composition (PseAAC) with a decision template rule. Using the jackknife test, MSLoc-DT can achieve 86.5, 98.3, 90.3, 98.5, 95.9, 98.1, and 99.3% overall accuracy for human, animal, plant, bacteria, virus, fungi, and archaea, respectively, on seven stringent benchmark datasets. Compared with other predictors (e.g., Gpos-PLoc, Gneg-PLoc, Virus-PLoc, Plant-PLoc, Plant-mPLoc, ProLoc-Go, Hum-PLoc, GOASVM) on the gram-positive, gram-negative, virus, plant, eukaryotic, and human datasets, the new MSLoc-DT predictor is much more effective and robust. Although the MSLoc-DT predictor is designed to predict the single location of proteins, our method can be extended to multiple locations of proteins by introducing multilabel machine learning approaches, such as the support vector machine and deep learning, as substitutes for the K-nearest neighbor (KNN) method. 
As a user-friendly web server, MSLoc-DT is freely accessible at http://bioinfo.ibp.ac.cn/MSLOC_DT/index.html. PMID:24361712

Zhang, Shao-Wu; Liu, Yan-Fang; Yu, Yong; Zhang, Ting-He; Fan, Xiao-Nan

2014-03-15

238

One year survival of ART and conventional restorations in patients with disability  

PubMed Central

Background Providing restorative treatment for persons with disability may be challenging and has been related to the patient’s ability to cope with the anxiety engendered by treatment and to cooperate fully with the demands of the clinical situation. The aim of the present study was to assess the survival rate of ART restorations compared to conventional restorations in people with disability referred for special care dentistry. Methods Three treatment protocols were distinguished: ART (hand instruments/high-viscosity glass-ionomer); conventional restorative treatment (rotary instrumentation/resin composite) in the clinic (CRT/clinic) and under general anaesthesia (CRT/GA). Patients were referred for restorative care to a special care centre and treated by one of two specialists. Patients and/or their caregivers were provided with written and verbal information regarding the proposed techniques, and selected the type of treatment they were to receive. Treatment was provided as selected but if this option proved clinically unfeasible one of the alternative techniques was subsequently proposed. Evaluation of restoration survival was performed by two independent trained and calibrated examiners using established ART restoration assessment codes at 6 months and 12 months. The Proportional Hazard model with frailty corrections was applied to calculate survival estimates over a one year period. Results 66 patients (13.6 ± 7.8 years) with 16 different medical disorders participated. CRT/clinic proved feasible for 5 patients (7.5%), the ART approach for 47 patients (71.2%), and 14 patients received CRT/GA (21.2%). In all, 298 dentine carious lesions were restored in primary and permanent teeth, 182 (ART), 21 (CRT/clinic) and 95 (CRT/GA). The 1-year survival rates and jackknife standard error of ART and CRT restorations were 97.8 ± 1.0% and 90.5 ± 3.2%, respectively (p = 0.01). 
Conclusions These short-term results indicate that ART appears to be an effective treatment protocol for treating patients with disability restoratively, many of whom have difficulty coping with the conventional restorative treatment. Trial registration number Netherlands Trial Registration: NTR 4400 PMID:24885938
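
The jackknife standard error quoted with the survival rates above follows the usual delete-one recipe: recompute the estimate n times, each time leaving out one unit, and scale the spread of those n values by (n − 1)/n. The sketch below applies that recipe to a simple survival proportion with invented outcomes. The study itself used a proportional hazards model with frailty corrections, and with several restorations per patient the deleted unit would more plausibly be a patient than a restoration, so this is an illustration of the formula only.

```python
from math import sqrt
from statistics import mean, stdev


def jackknife_se(data, estimator):
    """Delete-one jackknife standard error of estimator(data).

    SE = sqrt((n - 1) / n * sum((theta_i - theta_bar)^2)), where theta_i is
    the estimate with observation i removed and theta_bar is their mean.
    """
    n = len(data)
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    centre = mean(leave_one_out)
    return sqrt((n - 1) / n * sum((t - centre) ** 2 for t in leave_one_out))


# Hypothetical outcomes: 1 = restoration survived one year, 0 = failed.
outcomes = [1] * 18 + [0] * 2
rate = mean(outcomes)
se = jackknife_se(outcomes, mean)
print(f"survival {100 * rate:.1f}% ± {100 * se:.1f}%")
```

For the sample mean the jackknife standard error coincides with the textbook s/√n, which makes a convenient sanity check; the method earns its keep for estimators, such as model-based survival estimates, that have no simple closed-form standard error.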

2014-01-01
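
The record above reports survival rates with a jackknife standard error. As a rough illustration only, the delete-one jackknife for a simple proportion can be sketched as follows. The outcome data here are hypothetical, and the sketch ignores both the clustering of restorations within patients and the proportional hazard model with frailty corrections that the study actually used.

```python
import math

def jackknife_se(values, estimator):
    """Delete-one jackknife standard error of estimator(values)."""
    n = len(values)
    # Recompute the estimate n times, each time leaving one observation out.
    leave_one_out = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((v - mean_loo) ** 2 for v in leave_one_out)
    return math.sqrt(variance)

def proportion(xs):
    """Fraction of successes in a list of 0/1 outcomes."""
    return sum(xs) / len(xs)

# Hypothetical outcomes (NOT the study data): 1 = survived, 0 = failed.
outcomes = [1] * 45 + [0] * 5
rate = proportion(outcomes)
se = jackknife_se(outcomes, proportion)
print(f"survival rate = {rate:.3f}, jackknife SE = {se:.3f}")
```

For a plain proportion the delete-one jackknife reduces to sqrt(p(1 - p)/(n - 1)), so the resampling adds nothing there; it becomes useful for estimators, such as model-based survival estimates, that have no simple variance formula.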

239

Fossil Chironomidae (Insecta: Diptera) as quantitative indicators of past salinity in African lakes  

NASA Astrophysics Data System (ADS)

We surveyed sub-fossil chironomid assemblages in surface sediments of 73 low- to mid-elevation lakes in tropical East Africa (Uganda, Kenya, Tanzania, Ethiopia) to develop inference models for quantitative paleosalinity reconstruction. Using a calibration data set of 67 lakes with surface-water conductivity between 34 and 68,800 µS/cm, trial models based on partial least squares (PLS), weighted-averaging (WA), weighted-averaging partial least squares (WA-PLS), maximum likelihood (ML), and the weighted modern analogue technique (WMAT) produced jack-knifed coefficients of determination (r²) between 0.83 and 0.87, and root-mean-squared errors of prediction (RMSEP) between 0.27 and 0.31 log10 conductivity units, values indicating that fossil assemblages of African Chironomidae can be valuable indicators of past salinity change. The new inference models improve on previous models, which were calibrated with presence-absence data from live collections, by the much greater information content of the calibration data set, and greater probability of finding good modern analogues for fossil assemblages. However, inferences still suffered to a greater (WA, WMAT) or lesser (WA-PLS, PLS and ML) extent from weak correlation between chironomid species distribution and salinity in a broad range of fresh waters, and apparent threshold response of African chironomid communities to salinity change near 3000 µS/cm. To improve model sensitivity in freshwater lakes we expanded the calibration data set with 11 dilute (6-61 µS/cm) high-elevation lakes on Mt. Kenya (Kenya) and the Ruwenzori Mts. (Uganda). This did not appreciably improve models' error statistics, in part because it introduced a secondary environmental gradient to the faunal data, probably temperature.
To evaluate whether a chironomid-based salinity inference model calibrated in East African lakes could be meaningfully used for environmental reconstruction elsewhere on the continent, we expanded the calibration data set with 8 fresh (15-168 µS/cm) lakes in Cameroon, West Africa, and one hypersaline desert lake in Chad. This experiment yielded poorer error statistics, primarily because the need to amalgamate East and West African sister taxa reduced overall taxonomic resolution and increased the mean tolerance range of retained taxa. However, the merged data set constrained better the salinity optimum of several freshwater taxa, and further increased the probability of finding good modern analogues. We then used chironomid stratigraphic data and independent proxy reconstructions from two fluctuating lakes in Kenya to compare the performance of new and previous African salinity-inference models. This analysis revealed significant differences between the various numerical techniques in reconstructed salinity trends through time, due to their different sensitivity to the presence or relative abundance of certain key taxa, combined with the above-mentioned threshold faunal response to salinity change. Simple WA and WMAT produced ecologically sensible reconstructions because their step-like change in inferred conductivity near 3000 µS/cm mirrors the relatively rapid transitions between fresh and saline lake phases associated with climate-driven lake-level change in shallow tropical closed-basin lakes. Statistical camouflaging of this threshold faunal response in WA-PLS and ML models resulted in less trustworthy reconstructions of past salinity in lakes crossing the freshwater-saline boundary.
We conclude that selection of a particular inference model should be based not only on statistical performance measures, but should also consider chironomid community ecology in the study region, and the amplitude of reconstructed environmental change relative to the modern environmental gradient represented in the calibration data set.

Eggermont, Hilde; Heiri, Oliver; Verschuren, Dirk

2006-08-01
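
The jack-knifed RMSEP quoted in the record above is a leave-one-out prediction error: each lake is removed in turn, the model is calibrated on the remaining lakes, and the removed lake's conductivity is predicted. A minimal sketch for simple weighted averaging (WA) follows. The abundance table and log10 conductivity values are hypothetical toy data, the deshrinking step usually applied in WA is omitted, and none of the study's other models (PLS, WA-PLS, ML, WMAT) is reproduced.

```python
import math

def wa_optima(abund, env):
    """WA optimum of each taxon: abundance-weighted mean of env."""
    optima = []
    for k in range(len(abund[0])):
        weight = sum(row[k] for row in abund)
        if weight == 0:
            optima.append(None)  # taxon absent from the training lakes
        else:
            optima.append(sum(row[k] * x for row, x in zip(abund, env)) / weight)
    return optima

def wa_predict(sample, optima):
    """Inferred env value: abundance-weighted mean of the taxon optima."""
    pairs = [(a, u) for a, u in zip(sample, optima) if u is not None and a > 0]
    total = sum(a for a, _ in pairs)
    if total == 0:
        raise ValueError("sample shares no taxa with the training set")
    return sum(a * u for a, u in pairs) / total

def jackknife_rmsep(abund, env):
    """Leave-one-out predictions and their root-mean-squared error."""
    preds = []
    for i in range(len(env)):
        train_abund = abund[:i] + abund[i + 1:]
        train_env = env[:i] + env[i + 1:]
        preds.append(wa_predict(abund[i], wa_optima(train_abund, train_env)))
    mse = sum((p - x) ** 2 for p, x in zip(preds, env)) / len(env)
    return math.sqrt(mse), preds

# Hypothetical toy calibration set: 5 lakes x 3 taxa (counts), log10 conductivity.
abund = [[8, 2, 0], [6, 4, 0], [3, 5, 2], [1, 4, 5], [0, 2, 8]]
env = [1.5, 2.0, 2.8, 3.5, 4.2]
rmsep, preds = jackknife_rmsep(abund, env)
print(f"jackknifed RMSEP = {rmsep:.2f} log10 units")
```

Because every WA prediction is an average of averages, inferred values are pulled toward the middle of the calibration gradient, which is why a deshrinking regression is normally added in practice.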

240

Program package for multicanonical simulations of U(1) lattice gauge theory-Second version  

NASA Astrophysics Data System (ADS)

A new version STMCMUCA_V1_1 of our program package is available. It eliminates compatibility problems of our Fortran 77 code, originally developed for the g77 compiler, with Fortran 90 and 95 compilers.
New version program summary
Program title: STMC_U1MUCA_v1_1
Catalogue identifier: AEET_v1_1
Licensing provisions: Standard CPC license, http://cpc.cs.qub.ac.uk/licence/licence.html
Programming language: Fortran 77 compatible with Fortran 90 and 95
Computers: Any capable of compiling and executing Fortran code
Operating systems: Any capable of compiling and executing Fortran code
RAM: 10 MB and up depending on lattice size used
No. of lines in distributed program, including test data, etc.: 15059
No. of bytes in distributed program, including test data, etc.: 215733
Keywords: Markov chain Monte Carlo, multicanonical, Wang-Landau recursion, Fortran, lattice gauge theory, U(1) gauge group, phase transitions of continuous systems
Classification: 11.5
Catalogue identifier of previous version: AEET_v1_0
Journal Reference of previous version: Computer Physics Communications 180 (2009) 2339-2347
Does the new version supersede the previous version?: Yes
Nature of problem: Efficient Markov chain Monte Carlo simulation of U(1) lattice gauge theory (or other continuous systems) close to its phase transition. Measurements and analysis of the action per plaquette, the specific heat, Polyakov loops and their structure factors.
Solution method: Multicanonical simulations with an initial Wang-Landau recursion to determine suitable weight factors. Reweighting to physical values using logarithmic coding and calculating jackknife error bars.
Reasons for the new version: The previous version was developed for the g77 compiler Fortran 77 version. Compiler errors were encountered with Fortran 90 and Fortran 95 compilers (specified below).
Summary of revisions: epsilon=one/10**10 is replaced by epsilon/10.0D10 in the parameter statements of the subroutines u1_bmha.f, u1_mucabmha.f, u1wl_backup.f, u1wlread_backup.f of the folder Libs/U1_par. For the tested compilers, script files are added in the folder ExampleRuns, and readme.txt files are now provided in all subfolders of ExampleRuns. The gnuplot driver files produced by the routine hist_gnu.f of Libs/Fortran are adapted to the syntax required by gnuplot version 4.0 and higher.
Restrictions: Due to the use of explicit real*8 initialization, the conversion into real*4 will require extra changes besides replacing the implicit.sta file by its real*4 version.
Unusual features: The programs have to be compiled with script files like those contained in the folder ExampleRuns, as explained in the original paper.
Running time: The prepared test runs took up to 74 minutes to execute on a 2 GHz PC.

Bazavov, Alexei; Berg, Bernd A.

2013-03-01
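
The solution method in the record above combines reweighting with jackknife error bars. The package itself is Fortran and its routines are not shown here; the following Python sketch only illustrates the general idea. It assumes that "logarithmic coding" means keeping the weights as logarithms so that large exponents do not overflow, and the names `reweighted_mean`, `jackknife_error` and the `(observable, log_weight)` sample layout are hypothetical, not taken from the package.

```python
import math

def reweighted_mean(samples):
    """Weighted mean of observables given (observable, log_weight) pairs.

    The largest log weight is subtracted before exponentiating, so the
    ratio is unchanged but exp() cannot overflow.
    """
    shift = max(log_w for _, log_w in samples)
    num = sum(obs * math.exp(log_w - shift) for obs, log_w in samples)
    den = sum(math.exp(log_w - shift) for _, log_w in samples)
    return num / den

def jackknife_error(samples, n_bins, estimator):
    """Binned jackknife: drop one bin at a time and re-run the estimator.

    Returns the mean of the jackknife estimates and its error bar.
    Trailing samples that do not fill a whole bin are ignored.
    """
    size = len(samples) // n_bins
    data = samples[: size * n_bins]
    estimates = []
    for b in range(n_bins):
        kept = data[: b * size] + data[(b + 1) * size:]
        estimates.append(estimator(kept))
    mean = sum(estimates) / n_bins
    variance = (n_bins - 1) / n_bins * sum((e - mean) ** 2 for e in estimates)
    return mean, math.sqrt(variance)

# Hypothetical toy time series; exp(1000.0) alone would overflow a float.
samples = [(1.0, 1000.0), (2.0, 1000.0), (3.0, 1000.0), (4.0, 1000.0)]
mean, err = jackknife_error(samples, 4, reweighted_mean)
print(f"{mean:.3f} +/- {err:.3f}")
```

The reweighted mean is a ratio of two sums, so its error cannot be read off from a simple standard deviation; recomputing the full ratio on each jackknife sample handles that nonlinearity, and binning reduces the effect of autocorrelation between successive Markov chain measurements.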

241

Dynamics of interfaces and detergency  

NASA Astrophysics Data System (ADS)

Laser light scattering (LLS) methods have been used to study the surface wave dispersion and surface viscoelasticity of aqueous solutions of the zwitterionic surfactant n-hexadecyl-n,n-dimethyl-3-ammonio-1-propanesulphonate (HDPS), which has a CMC of 0.027 mM. These studies have probed a surface wavenumber range of 400 [...] jackknife approach. It is demonstrated that the contribution of the empirical term to the total error associated with the drop volume method is typically smaller than the systematic percentage error.

Johnson, Edward George

242

Modeling deformation associated with the 2004-2008 dome-building eruption of Mount St. Helens  

NASA Astrophysics Data System (ADS)

We estimate deformation sources active during and after the 2004-2008 dome-building eruption of Mount St. Helens (MSH) by inverting campaign and continuous GPS (CGPS) measured deformation between 2000 and 2011. All data are corrected for background deformation using a tectonic model that includes block rotation and uniform strain accumulation. The campaign GPS surveys characterize the deformation over a large area, and the CGPS data allow estimates of time-dependent changes in the rate of deformation. Only one CGPS station, JRO1, was operating near MSH prior to the start of unrest on September 23, 2004. Most other CGPS stations, installed by the Plate Boundary Observatory and Cascade Volcano Observatory, were operating by mid-October 2004. The inward displacement of JRO1 started with the seismic unrest on September 23, 2004, and continued at a rate of 0.5 mm/day until the last phreatic explosion on October 5, 2004 (note there was another explosion in March 2005). The deformation then decayed exponentially until activity ceased in January 2008. The rate of decay was estimated using a number of clean CGPS time series, and then it was fixed to estimate amplitudes for all CGPS station displacements. The inward and downward movements (deflation) observed at all stations during the eruption (2004-2008) were best fit by a prolate spheroid with geometric aspect ratio 0.19 ± 0.6, a depth of 7.4 ± 1.7 km, and a cavity volume decrease of 0.028 ± 0.005 cubic km. This source is practically vertical (dip angle: 84 ± 5; strike angle 298 ± 84) and is located beneath the dome. All errors are 95% bounds and have been estimated using the jackknife. The post-eruption deformation (2008-present) is characterized by deflation in the near field (within 2 km from the dome) and inflation in the far field.
The near-field deflation signal is best fit by a very shallow sill-like source (~0.18 ± 0.05 km below the crater floor) with a radius of 0.5 ± 0.3 km and a cavity volume decrease of 0.010 ± 0.001 cubic km. The best-fitting source for the far-field inflation is a prolate spheroid of geometric aspect ratio 0.12 ± 0.2, a depth of 7.3 ± 0.6 km, and a cavity volume increase of 0.006 ± 0.001 cubic km. The source dips slightly to the north (dip angle: 75 ± 4; strike angle 357 ± 8). Both sources are located beneath the dome. These results suggest that the same deep magma source has been active beneath the volcano for the past 7 years. This source fed the dome eruption and is now slowly being filled. The shallow source controlling the near-field, post-eruption deformation is probably due to the cooling and contraction of the lava dome within the crater.

Lisowski, M.; Battaglia, M.

2011-12-01
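
The record above describes a two-stage fit: the exponential decay rate is estimated from a few clean CGPS time series, then held fixed while an amplitude is estimated for every station. A minimal sketch of that idea follows. The functional form d(t) = A(1 - exp(-t/tau)), the grid search over tau, and all data are assumptions made for illustration; the abstract does not state the authors' exact model or fitting procedure.

```python
import math

def basis(t, tau):
    """Assumed decay shape: displacement saturating with time constant tau."""
    return 1.0 - math.exp(-t / tau)

def fit_amplitude(times, disp, tau):
    """Least-squares amplitude A for disp ~ A * basis(t, tau), tau fixed."""
    g = [basis(t, tau) for t in times]
    return sum(d * gi for d, gi in zip(disp, g)) / sum(gi * gi for gi in g)

def misfit(times, disp, tau):
    """Sum of squared residuals at the best amplitude for this tau."""
    a = fit_amplitude(times, disp, tau)
    return sum((d - a * basis(t, tau)) ** 2 for t, d in zip(times, disp))

def fit_tau(clean_series, tau_grid):
    """Stage 1: pick the tau that minimizes total misfit over clean series."""
    return min(tau_grid,
               key=lambda tau: sum(misfit(t, d, tau) for t, d in clean_series))

# Hypothetical noise-free synthetic data (days, mm), NOT the MSH observations.
times = list(range(0, 1200, 30))
true_tau = 300.0
clean_series = [(times, [a * basis(t, true_tau) for t in times])
                for a in (-10.0, -6.0)]
tau_hat = fit_tau(clean_series, [100.0, 200.0, 300.0, 400.0, 500.0])

# Stage 2: with tau fixed, the amplitude of any other station is a linear fit.
station = [-3.5 * basis(t, true_tau) for t in times]
amp = fit_amplitude(times, station, tau_hat)
print(f"tau = {tau_hat:.0f} days, amplitude = {amp:.2f} mm")
```

Fixing the decay rate turns the per-station problem into a one-parameter linear fit, which stays stable even for noisy or short time series that could not constrain a decay rate on their own.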