Science.gov

Sample records for jackknifing

  1. The Infinitesimal Jackknife with Exploratory Factor Analysis

    ERIC Educational Resources Information Center

    Zhang, Guangjian; Preacher, Kristopher J.; Jennrich, Robert I.

    2012-01-01

    The infinitesimal jackknife, a nonparametric method for estimating standard errors, has been used to obtain standard error estimates in covariance structure analysis. In this article, we adapt it for obtaining standard errors for rotated factor loadings and factor correlations in exploratory factor analysis with sample correlation matrices. Both…

  2. Jackknife and bootstrap inferential procedures for censored survival data

    NASA Astrophysics Data System (ADS)

    Fang, Loh Yue; Arasan, Jayanthi; Midi, Habshah; Bakar, Mohd Rizam Abu

    2015-10-01

    A confidence interval is an interval estimate of a parameter. The classical construction of confidence intervals based on asymptotic normality (Wald) often produces misleading inferences when dealing with censored data, especially in small samples. Alternative techniques allow us to construct confidence intervals without relying on this assumption. In this paper, we compare the performance of the jackknife and several bootstrap confidence interval estimates for the parameters of a log-logistic model with censored data and a covariate. We investigate their performance at two nominal error probability levels and several levels of censoring proportion. Conclusions were then drawn based on the results of the coverage probability study.

  3. Nonparametric Estimation of Standard Errors in Covariance Analysis Using the Infinitesimal Jackknife

    ERIC Educational Resources Information Center

    Jennrich, Robert I.

    2008-01-01

    The infinitesimal jackknife provides a simple general method for estimating standard errors in covariance structure analysis. Beyond its simplicity and generality what makes the infinitesimal jackknife method attractive is that essentially no assumptions are required to produce consistent standard error estimates, not even the requirement that the…

  4. Fatal accidental asphyxia in a jack-knife position.

    PubMed

    Benomran, F A

    2010-10-01

    Accidental death from postural or positional asphyxia takes place when the abnormal position of the victim's body compromises the process of respiration. Diagnosis is largely made by circumstantial evidence supported by absence of any other significant pathology or trauma explaining death. This case report is about a 50-year-old male who had been drinking the previous night and was found dead in the morning inside a tire repair shop. His jack-knifed body had been encompassed, buttocks-down, within the hollow core made by 3 big tires stacked on top of each other. The author was called to the scene of death and had hands-on encounter with the body in-situ where scene photographs were taken. Apart from a blood alcohol of 290 mg/100 ml, marked congestion of the face, petechial hemorrhages on the conjunctivae and lung edema and congestion, autopsy findings were unremarkable. Abrasions on shoulders, lateral aspects of arms and posterior aspects of lower legs indicated friction with internal rims of tires while slipping down. There were no other injuries or pathology to account for his death. Death was determined to be due to accidental postural asphyxia secondary to intoxication by alcohol. PMID:20851361

  5. The Infinitesimal Jackknife and Moment Structure Analysis Using Higher Order Moments.

    PubMed

    Jennrich, Robert; Satorra, Albert

    2016-03-01

    Mean corrected higher order sample moments are asymptotically normally distributed. It is shown that both in the literature and popular software the estimates of their asymptotic covariance matrices are incorrect. An introduction to the infinitesimal jackknife is given and it is shown how to use it to correctly estimate the asymptotic covariance matrices of higher order sample moments. Another advantage in using the infinitesimal jackknife is the ease with which it may be used when stacking or sub-setting estimators. The estimates given are used to test the goodness of fit of a non-linear factor analysis model. A computationally accelerated form for infinitesimal jackknife estimates is given. PMID:25361618

  6. Morphological characterization via light and electron microscopy of Atlantic jackknife clam (Ensis directus) hemocytes.

    PubMed

    Preziosi, Brian M; Bowden, Timothy J

    2016-05-01

    The Atlantic jackknife clam, Ensis directus, is currently being researched as a potential species for aquaculture operations in Maine. The goal of this study was to describe the hemocytes of this species for the first time and provide a morphological classification scheme. We viewed hemocytes under light microscopy (using Hemacolor, neutral red, and Pappenheim's stains) as well as transmission electron microscopy (TEM). The 2 main types of hemocytes found were granulocytes and hyalinocytes (agranular cells). The granulocytes were subdivided into large and small granulocytes while the hyalinocytes were subdivided into large and small hyalinocytes. The large hemocytes had both a larger diameter and smaller nucleus to cell diameter ratio than their smaller counterparts. A rare cell type, the vesicular cell, was also observed and it possessed many vesicles but few or no granules. Using TEM, granulocytes were found to contain both electron-lucent and electron-dense granules of various sizes. These numerous granules were the only structures that took up the neutral red stain. Hyalinocytes had few of these granules relative to granulocytes. Large hyalinocytes had both various organelles and large vesicles in their abundant cytoplasm while small hyalinocytes had little room for organelles in their scant cytoplasm. Total hemocyte counts averaged 1.96×10⁶ cells mL⁻¹ while differential hemocyte counts averaged 11% for small hyalinocytes, 12% for large hyalinocytes, 59% for small granulocytes, and 18% for large granulocytes. The results of this study provide a starting point for future studies on E. directus immune function. PMID:27015289

  7. Jackknife-corrected parametric bootstrap estimates of growth rates in bivalve mollusks using nearest living relatives.

    PubMed

    Dexter, Troy A; Kowalewski, Michał

    2013-12-01

    Quantitative estimates of growth rates can augment ecological and paleontological applications of body-size data. However, in contrast to body-size estimates, assessing growth rates is often time-consuming, expensive, or unattainable. Here we use an indirect approach, a jackknife-corrected parametric bootstrap, for efficient approximation of growth rates using nearest living relatives with known age-size relationships. The estimate is developed by (1) collecting a sample of published growth rates of closely related species, (2) calculating the average growth curve using those published age-size relationships, (3) resampling iteratively these empirically known growth curves to estimate the standard errors and confidence bands around the average growth curve, and (4) applying the resulting estimate of uncertainty to bracket age-size relationships of the species of interest. This approach was applied to three monophyletic families (Donacidae, Mactridae, and Semelidae) of mollusk bivalves, a group characterized by indeterministic shell growth, but widely used in ecological, paleontological, and geochemical research. The resulting indirect estimates were tested against two previously published geochemical studies and, in both cases, yielded highly congruent age estimates. In addition, a case study in applied fisheries was used to illustrate the potential of the proposed approach for augmenting aquaculture management practices. The resulting estimates of growth rates place body size data in a constrained temporal context and confidence intervals associated with resampling estimates allow for assessing the statistical uncertainty around derived temporal ranges. The indirect approach should allow for improved evaluation of diverse research questions, from sustainability of industrial shellfish harvesting to climatic interpretations of stable isotope proxies extracted from fossil skeletons. PMID:24071629

  8. Is It Useful and Safe to Maintain the Sitting Position During Only One Minute before Position Change to the Jack-knife Position?

    PubMed Central

    Park, Soo Young; Park, Jong Cook

    2010-01-01

    Background: Conventional spinal saddle block is performed with the patient in a sitting position, keeping the patient sitting for 3 to 10 min after injection of a drug. This amount of time, however, is long enough to cause prolonged postoperative urinary retention. The trend in this block is to lower the dose of local anesthetics, providing a selective segmental block; however, an optimal dose and method are needed for adequate anesthesia in variable situations. Therefore, in this study, we evaluated the question of whether only 1 min of sitting after drug injection would be sufficient and safe for minor anorectal surgery. Methods: Two hundred and sixteen patients undergoing minor anorectal surgery under spinal anesthesia remained sitting for 1 min after completion of subarachnoid administration of 1 ml of a 0.5% hyperbaric bupivacaine solution (5 mg). They were then placed in the jack-knife position. After surgery, analgesia levels were assessed using loss of cold sensation in the supine position. The next day, urination and 11-point numeric rating scale (NRS) for postoperative pain were assessed. Results: None of the patients required additional analgesics during surgical manipulation. Postoperative sensory levels were T10 [T8-T12] in patients, and no significant differences were observed between sex (P = 0.857), height (P = 0.065), obesity (P = 0.873), or age (P = 0.138). Urinary retention developed in only 7 patients (3.2%). In this group, NRS was 5.0 ± 2.4 (P = 0.014). Conclusions: The one-minute sitting position for spinal saddle block before the jack-knife position is a safe method for use with minor anorectal surgery and can reduce development of postoperative urinary retention. PMID:20830265

  9. The effect of temperature and wing morphology on quantitative genetic variation in the cricket Gryllus firmus, with an appendix examining the statistical properties of the Jackknife-MANOVA method of matrix comparison.

    PubMed

    Bégin, M; Roff, D A; Debat, V

    2004-11-01

    We investigated the effect of temperature and wing morphology on the quantitative genetic variances and covariances of five size-related traits in the sand cricket, Gryllus firmus. Micropterous and macropterous crickets were reared in the laboratory at 24, 28 and 32 degrees C. Quantitative genetic parameters were estimated using a nested full-sib family design, and (co)variance matrices were compared using the T method, Flury hierarchy and Jackknife-MANOVA method. The results revealed that the mean phenotypic value of each trait varied significantly among temperatures and wing morphs, but temperature reaction norms were not similar across all traits. Micropterous individuals were always smaller than macropterous individuals while expressing more phenotypic variation, a finding discussed in terms of canalization and life-history trade-offs. We observed little variation between the matrices of among-family (co)variation corresponding to each combination of temperature and wing morphology, with only one matrix of six differing in structure from the others. The implications of this result are discussed with respect to the prediction of evolutionary trajectories. PMID:15525410

  10. Prone jackknife position is not necessary to achieve a cylindrical abdominoperineal resection: demonstration of the lithotomy position.

    PubMed

    Keller, Deborah S; Lawrence, Justin K; Delaney, Conor P

    2014-02-01

    This video demonstrates a laparoscopic abdominal perineal resection for a fixed 4.8-cm mass involving the posterior and left rectal walls and left puborectalis, 2 cm from the anal verge (see Video, Supplemental Digital Content 1, http://links.lww.com/DCR/A127). We detail the steps of the procedure, all completed in lithotomy, including lateral-to-medial dissection; identification and protection of the left ureter and presacral nerves; division of the inferior mesenteric artery; medial-to-lateral dissection, with meeting the previous dissection plane; total mesorectal excision and pelvic dissection; perineal dissection and layered closure; and abdominal inspection and colostomy creation. Total operative time was 181 minutes. The specimen total mesorectal excision was complete with a negative circumferential radial margin (greater than 1 cm). Final pathology was T3N2M0. PMID:24401888

  11. A Fortran IV Program for Estimating Parameters through Multiple Matrix Sampling with Standard Errors of Estimate Approximated by the Jackknife.

    ERIC Educational Resources Information Center

    Shoemaker, David M.

    Described and listed herein with concomitant sample input and output is the Fortran IV program which estimates parameters and standard errors of estimate per parameters for parameters estimated through multiple matrix sampling. The specific program is an improved and expanded version of an earlier version. (Author/BJG)

  12. The complete mitochondrial genome of the grand jackknife clam, Solen grandis (Bivalvia: Solenidae): a novel gene order and unusual non-coding region.

    PubMed

    Yuan, Yang; Li, Qi; Kong, Lingfeng; Yu, Hong

    2012-02-01

    Molluscs in general, and bivalves in particular, exhibit an extraordinary degree of mitochondrial gene order variation when compared with other metazoans. The complete mitochondrial genome of Solen grandis (Bivalvia: Solenidae) was determined using long-PCR and genome walking techniques. The entire mitochondrial genome sequence of S. grandis is 16,784 bp in length, and contains 36 genes including 12 protein-coding genes (atp8 is absent), 2 ribosomal RNAs, and 22 tRNAs. All genes are encoded on the same strand. Compared with other species, it bears a novel gene order. Besides these, we find a peculiar non-coding region of 435 bp with a microsatellite-like (TA)12 element, poly-structures and many hairpin structures. In contrast to the available heterodont mitochondrial genomes from GenBank, the complete mtDNA of S. grandis has the shortest cox3 gene, and the longest atp6, nad4, nad5 genes. PMID:21598108

  13. Variance Estimation Using Replication Methods in Structural Equation Modeling with Complex Sample Data

    ERIC Educational Resources Information Center

    Stapleton, Laura M.

    2008-01-01

    This article discusses replication sampling variance estimation techniques that are often applied in analyses using data from complex sampling designs: jackknife repeated replication, balanced repeated replication, and bootstrapping. These techniques are used with traditional analyses such as regression, but are currently not used with structural…

  14. Possession, Transportation, and Use of Firearms by Older Youth in 4-H Shooting Sports Programs

    ERIC Educational Resources Information Center

    White, David J.; Williver, S. Todd

    2014-01-01

    Thirty years ago we would think nothing of driving to school with a jackknife in our pocket or rifle in the gun rack. Since then, the practices of possessing, transporting, and using firearms have been limited by laws, rules, and public perception. Despite restrictions on youth, the Youth Handgun Safety Act does afford 4-H shooting sports members…

  15. 46 CFR 160.043-4 - Construction and workmanship.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 46 Shipping 6 2010-10-01 2010-10-01 false Construction and workmanship. 160.043-4 Section 160.043-4 Shipping COAST GUARD, DEPARTMENT OF HOMELAND SECURITY (CONTINUED) EQUIPMENT, CONSTRUCTION, AND MATERIALS: SPECIFICATIONS AND APPROVAL LIFESAVING EQUIPMENT Jackknife (With Can Opener) for Merchant Vessels § 160.043-4 Construction and workmanship....

  16. A Comparison of Rasch Person Analysis and Robust Estimators.

    ERIC Educational Resources Information Center

    Smith, Richard M.

    1985-01-01

    Standard maximum likelihood estimation was compared using two forms of robust estimation, BIWEIGHT (based on Tukey's Biweight) and AMTJACK (AMT-Robustified Jackknife), and Rasch model person analysis. The two procedures recovered the generating parameters, but Rasch person analysis also helped to identify the nature of a response disturbance. (GDC)

  17. Captain M. A. Ainslie (1869-1951): his observations and telescopes

    NASA Astrophysics Data System (ADS)

    Mobberley, M. P.

    2010-02-01

    The astronomical career of one of the BAA's most enthusiastic planetary observers, who contributed observations in the first five decades of the twentieth century, is described. In addition, his pioneering observation of the occultation of a star by Saturn's rings in 1917 is examined and the full story of his unique 'Jack-Knife telescope', designed by Horace Dall, is given.

  18. Resampling Methods Revisited: Advancing the Understanding and Applications in Educational Research

    ERIC Educational Resources Information Center

    Bai, Haiyan; Pan, Wei

    2008-01-01

    Resampling methods including randomization test, cross-validation, the jackknife and the bootstrap are widely employed in the research areas of natural science, engineering and medicine, but they lack appreciation in educational research. The purpose of the present review is to revisit and highlight the key principles and developments of…

  19. 46 CFR 160.043-4 - Construction and workmanship.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 46 Shipping 6 2013-10-01 2013-10-01 false Construction and workmanship. 160.043-4 Section 160.043-4 Shipping COAST GUARD, DEPARTMENT OF HOMELAND SECURITY (CONTINUED) EQUIPMENT, CONSTRUCTION, AND MATERIALS: SPECIFICATIONS AND APPROVAL LIFESAVING EQUIPMENT Jackknife (With Can Opener) for Merchant Vessels § 160.043-4 Construction and workmanship....

  20. Sample Design for Educational Survey Research.

    ERIC Educational Resources Information Center

    Ross, Kenneth N.

    1978-01-01

    Student's empirical sampling approach is used to assess the magnitude of the sampling errors of statistics describing a recursive causal model. The data were gathered with four complex sample designs commonly used in educational surveys. Jackknife and half-sample error estimates are applied to the data. (Author/CTM)

  2. The use and misuse of statistics in space physics

    NASA Technical Reports Server (NTRS)

    Reiff, Patricia H.

    1990-01-01

    This paper presents several statistical techniques most commonly used in space physics, including Fourier analysis, linear correlation, auto- and cross-correlation, power spectral density and superimposed epoch analysis, and presents tests to assess the significance of the results. New techniques such as bootstrapping and jackknifing are presented. When no test of significance is in common usage, a plausible test is suggested.
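
    As a rough illustration of the jackknifing mentioned in this abstract, the sketch below computes a delete-one jackknife standard error. It is a generic textbook form, not the procedure from the paper; the function name jackknife_se and the sample values are hypothetical.

```python
import math
from statistics import mean


def jackknife_se(data, statistic):
    """Delete-one jackknife estimate of the standard error of `statistic`.

    `data` is a list of observations; `statistic` maps a list to a number.
    (Hypothetical helper for illustration only.)
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    # Recompute the statistic n times, leaving out one observation each time.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    center = mean(leave_one_out)
    # Jackknife variance: (n - 1) / n times the sum of squared deviations.
    variance = (n - 1) / n * sum((t - center) ** 2 for t in leave_one_out)
    return math.sqrt(variance)


# Made-up sample values, used only to exercise the function.
sample = [2.0, 4.0, 4.0, 5.0, 7.0]
se = jackknife_se(sample, mean)
```

    For the sample mean, this estimate coincides with the usual s / sqrt(n), which makes a convenient sanity check.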

  3. Life table and consumption capacity of corn earworm, Helicoverpa armigera, fed asparagus, Asparagus officinalis.

    PubMed

    Jha, Ratna Kumar; Tuan, Shu-Jen; Chi, Hsin; Tang, Li-Cheng

    2014-01-01

    The life table and consumption rate of Helicoverpa armigera (Hübner) (Lepidoptera: Noctuidae) reared on asparagus, Asparagus officinalis L. (Asparagales: Asparagaceae) were studied under laboratory conditions to assess their interaction. Development, survival, fecundity, and consumption data were analyzed by the age-stage, two-sex life table. This study indicated that asparagus is a natural host of H. armigera. However, the poor nutritional content in asparagus foliage and the poor fitness of H. armigera that fed on asparagus indicated that asparagus is a suboptimal host in comparison to hybrid sweet corn. The uncertainty associated with life table parameters was estimated by using jackknife and bootstrap techniques, and the results were compared for statistical inference. The intrinsic rate of increase (r), finite rate of increase (λ), net reproductive rate (R0), and mean generation time (T) were estimated by the jackknife technique to be 0.0780 day⁻¹, 1.0811 day⁻¹, 67.4 offspring, and 54.8 days, respectively, while those estimated by the bootstrap technique were 0.0752 day⁻¹, 1.0781 day⁻¹, 68.0 offspring, and 55.3 days, respectively. The net consumption rate of H. armigera, as estimated by the jackknife and bootstrap technique, was 1183.02 and 1132.9 mg per individual, respectively. The frequency distribution of sample means obtained by the jackknife technique failed the normality test, while the bootstrap results fit the normal distribution well. By contrast, the relationship between the mean fecundity and the net reproductive rate, as estimated by the bootstrap technique, was slightly inconsistent with the relationship found by mathematical proof. The application of the jackknife and bootstrap techniques in estimating population parameters requires further examination. PMID:25373181

  4. Sampling effort and estimates of species richness based on prepositioned area electrofisher samples

    USGS Publications Warehouse

    Bowen, Z.H.; Freeman, Mary C.

    1998-01-01

    Estimates of species richness based on electrofishing data are commonly used to describe the structure of fish communities. One electrofishing method for sampling riverine fishes that has become popular in the last decade is the prepositioned area electrofisher (PAE). We investigated the relationship between sampling effort and fish species richness at seven sites in the Tallapoosa River system, USA based on 1,400 PAE samples collected during 1994 and 1995. First, we estimated species richness at each site using the first-order jackknife and compared observed values for species richness and jackknife estimates of species richness to estimates based on historical collection data. Second, we used a permutation procedure and nonlinear regression to examine rates of species accumulation. Third, we used regression to predict the number of PAE samples required to collect the jackknife estimate of species richness at each site during 1994 and 1995. We found that jackknife estimates of species richness generally were less than or equal to estimates based on historical collection data. The relationship between PAE electrofishing effort and species richness in the Tallapoosa River was described by a positive asymptotic curve as found in other studies using different electrofishing gears in wadable streams. Results from nonlinear regression analyses indicated that rates of species accumulation were variable among sites and between years. Across sites and years, predictions of sampling effort required to collect jackknife estimates of species richness suggested that doubling sampling effort (to 200 PAEs) would typically increase observed species richness by not more than six species. However, sampling effort beyond about 60 PAE samples typically increased observed species richness by < 10%. We recommend using historical collection data in conjunction with a preliminary sample size of at least 70 PAE samples to evaluate estimates of species richness in medium-sized rivers. Seventy PAE samples should provide enough information to describe the relationship between sampling effort and species richness and thus facilitate evaluation of a sampling effort.
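
    The first-order jackknife richness estimate referred to above is commonly written as S_obs + Q1 (m - 1) / m, where Q1 is the number of species found in exactly one of m samples. The sketch below assumes that incidence-based form; the abstract does not state which variant the authors used, and the function name and species labels are hypothetical.

```python
def jackknife1_richness(samples):
    """First-order jackknife species richness (incidence-based form).

    `samples` is a list of collections, one per sample, each holding the
    species seen in that sample. (Hypothetical helper for illustration.)
    """
    m = len(samples)
    if m == 0:
        raise ValueError("need at least one sample")
    # Count in how many samples each species occurs.
    counts = {}
    for species_in_sample in samples:
        for species in set(species_in_sample):
            counts[species] = counts.get(species, 0) + 1
    observed = len(counts)
    # "Uniques": species that occur in exactly one sample.
    uniques = sum(1 for c in counts.values() if c == 1)
    return observed + uniques * (m - 1) / m


# Made-up data: four samples, species labelled a-d.
toy_samples = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}, {"a"}]
estimate = jackknife1_richness(toy_samples)
```

    With four observed species, two of them unique to a single sample, the toy estimate is 4 + 2 × 3/4 = 5.5.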

  5. Estimation of the size of a closed population when capture probabilities vary among animals

    USGS Publications Warehouse

    Burnham, K.P.; Overton, W.S.

    1978-01-01

    A model which allows capture probabilities to vary by individuals is introduced for multiple recapture studies in closed populations. The set of individual capture probabilities is modelled as a random sample from an arbitrary probability distribution over the unit interval. We show that the capture frequencies are a sufficient statistic. A nonparametric estimator of population size is developed based on the generalized jackknife; this estimator is found to be a linear combination of the capture frequencies. Finally, tests of underlying assumptions are presented.

  6. Evaluating species richness: biased ecological inference results from spatial heterogeneity in species detection probabilities

    USGS Publications Warehouse

    McNew, Lance B.; Handel, Colleen M.

    2015-01-01

    Accurate estimates of species richness are necessary to test predictions of ecological theory and evaluate biodiversity for conservation purposes. However, species richness is difficult to measure in the field because some species will almost always be overlooked due to their cryptic nature or the observer's failure to perceive their cues. Common measures of species richness that assume consistent observability across species are inviting because they may require only single counts of species at survey sites. Single-visit estimation methods ignore spatial and temporal variation in species detection probabilities related to survey or site conditions that may confound estimates of species richness. We used simulated and empirical data to evaluate the bias and precision of raw species counts, the limiting forms of jackknife and Chao estimators, and multi-species occupancy models when estimating species richness to evaluate whether the choice of estimator can affect inferences about the relationships between environmental conditions and community size under variable detection processes. Four simulated scenarios with realistic and variable detection processes were considered. Results of simulations indicated that (1) raw species counts were always biased low, (2) single-visit jackknife and Chao estimators were significantly biased regardless of detection process, (3) multispecies occupancy models were more precise and generally less biased than the jackknife and Chao estimators, and (4) spatial heterogeneity resulting from the effects of a site covariate on species detection probabilities had significant impacts on the inferred relationships between species richness and a spatially explicit environmental condition. For a real dataset of bird observations in northwestern Alaska, the four estimation methods produced different estimates of local species richness, which severely affected inferences about the effects of shrubs on local avian richness. Overall, our results indicate that neglecting the effects of site covariates on species detection probabilities may lead to significant bias in estimation of species richness, as well as the inferred relationships between community size and environmental covariates.

  7. Evaluating species richness: Biased ecological inference results from spatial heterogeneity in detection probabilities.

    PubMed

    McNew, Lance B; Handel, Colleen M

    2015-09-01

    Accurate estimates of species richness are necessary to test predictions of ecological theory and evaluate biodiversity for conservation purposes. However, species richness is difficult to measure in the field because some species will almost always be overlooked due to their cryptic nature or the observer's failure to perceive their cues. Common measures of species richness that assume consistent observability across species are inviting because they may require only single counts of species at survey sites. Single-visit estimation methods ignore spatial and temporal variation in species detection probabilities related to survey or site conditions that may confound estimates of species richness. We used simulated and empirical data to evaluate the bias and precision of raw species counts, the limiting forms of jackknife and Chao estimators, and multispecies occupancy models when estimating species richness to evaluate whether the choice of estimator can affect inferences about the relationships between environmental conditions and community size under variable detection processes. Four simulated scenarios with realistic and variable detection processes were considered. Results of simulations indicated that (1) raw species counts were always biased low, (2) single-visit jackknife and Chao estimators were significantly biased regardless of detection process, (3) multispecies occupancy models were more precise and generally less biased than the jackknife and Chao estimators, and (4) spatial heterogeneity resulting from the effects of a site covariate on species detection probabilities had significant impacts on the inferred relationships between species richness and a spatially explicit environmental condition. For a real data set of bird observations in northwestern Alaska, USA, the four estimation methods produced different estimates of local species richness, which severely affected inferences about the effects of shrubs on local avian richness. Overall, our results indicate that neglecting the effects of site covariates on species detection probabilities may lead to significant bias in estimation of species richness, as well as the inferred relationships between community size and environmental covariates. PMID:26552273

  8. On population size estimators in the Poisson mixture model.

    PubMed

    Mao, Chang Xuan; Yang, Nan; Zhong, Jinhua

    2013-09-01

    Estimating population sizes via capture-recapture experiments has enormous applications. The Poisson mixture model can be adopted for those applications with a single list in which individuals appear one or more times. We compare several nonparametric estimators, including the Chao estimator, the Zelterman estimator, two jackknife estimators and the bootstrap estimator. The target parameter of the Chao estimator is a lower bound of the population size. Those of the other four estimators are not lower bounds, and they may produce lower confidence limits for the population size with poor coverage probabilities. A simulation study is reported and two examples are investigated. PMID:23865502
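
    For orientation, two of the estimators named in this abstract can be sketched from their usual textbook forms: the Chao lower bound n + f1² / (2 f2) and the Zelterman estimator n / (1 - exp(-2 f2 / f1)), where n is the number of observed individuals and f1, f2 are the numbers seen exactly once and twice. The abstract gives no formulas, so these forms, the function names and the toy counts are assumptions.

```python
import math
from collections import Counter


def frequency_counts(captures_per_individual):
    """Return (n, f1, f2): observed individuals, singletons, doubletons."""
    freq = Counter(captures_per_individual)
    return len(captures_per_individual), freq[1], freq[2]


def chao_lower_bound(captures_per_individual):
    """Chao estimator: n + f1^2 / (2 * f2), assuming its standard form."""
    n, f1, f2 = frequency_counts(captures_per_individual)
    if f2 == 0:
        raise ValueError("Chao estimator is undefined when f2 == 0")
    return n + f1 ** 2 / (2 * f2)


def zelterman(captures_per_individual):
    """Zelterman estimator: n / (1 - exp(-2 * f2 / f1)), standard form."""
    n, f1, f2 = frequency_counts(captures_per_individual)
    if f1 == 0 or f2 == 0:
        raise ValueError("Zelterman estimator needs f1 > 0 and f2 > 0")
    rate = 2 * f2 / f1  # estimated Poisson rate from the two lowest counts
    return n / (1 - math.exp(-rate))


# Made-up list: 10 individuals seen once, 5 seen twice, 5 seen three times.
toy_captures = [1] * 10 + [2] * 5 + [3] * 5
chao_estimate = chao_lower_bound(toy_captures)
zelterman_estimate = zelterman(toy_captures)
```

    Both estimators use only the lowest capture frequencies, which is why they are cheap to compute from a single list.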

  9. Repeated-measures bioassay with correlated errors and heterogeneous variances: a Monte Carlo study.

    PubMed

    Elashoff, J D

    1981-09-01

    The relative potency of two drugs may be estimated from a small experiment in which all K doses of each drug are given in order of increasing dose level to each of n animals at one testing session. Fiducial limits are often estimated from Fieller's theorem, with the (2K - 1)(n - 1)-degree-of-freedom error term obtained from a two-way analysis of variance. Monte Carlo results indicate that this method can yield limits which are far too narrow when there is serial correlation between successive doses. The jackknife confidence limits are found to behave well for the models investigated. PMID:7317557

  10. The Purley train crash mechanism: injuries and prevention.

    PubMed Central

    Fothergill, N J; Ebbs, S R; Reese, A; Partridge, R J; Mowbray, M; Southcott, R D; Hashemi, K

    1992-01-01

    On the afternoon of Saturday 4th March 1989 two trains, both bound for London Victoria Station, collided. Part of the rear train rolled down a steep railway embankment and jack-knifed against a tree. The mechanism of the crash and the injuries sustained by the 55 victims who were seen in the A&E Department of the Mayday University Hospital are described. Improvements in signalling technology and design of rolling stock which may reduce both the risk of collision and severity of injury in future accidents are discussed. PMID:1388485

  11. CLUSFAVOR 5.0: hierarchical cluster and principal-component analysis of microarray-based transcriptional profiles

    PubMed Central

    Peterson, Leif E

    2002-01-01

    CLUSFAVOR (CLUSter and Factor Analysis with Varimax Orthogonal Rotation) 5.0 is a Windows-based computer program for hierarchical cluster and principal-component analysis of microarray-based transcriptional profiles. CLUSFAVOR 5.0 standardizes input data; sorts data according to gene-specific coefficient of variation, standard deviation, average and total expression, and Shannon entropy; performs hierarchical cluster analysis using nearest-neighbor, unweighted pair-group method using arithmetic averages (UPGMA), or furthest-neighbor joining methods, and Euclidean, correlation, or jack-knife distances; and performs principal-component analysis. PMID:12184816

  12. HEXT, a software supporting tree-based screens for hybrid taxa in multilocus data sets, and an evaluation of the homoplasy excess test

    PubMed Central

    Schneider, Kevin; Koblmüller, Stephan; Sefc, Kristina M.

    2016-01-01

    The homoplasy excess test (HET) is a tree-based screen for hybrid taxa in multilocus nuclear phylogenies. Homoplasy between a hybrid taxon and the clades containing the parental taxa reduces bootstrap support in the tree. The HET is based on the expectation that excluding the hybrid taxon from the data set increases the bootstrap support for the parental clades, whereas excluding non-hybrid taxa has little effect on statistical node support. To carry out a HET, bootstrap trees are calculated with taxon-jackknife data sets, that is, excluding one taxon (species, population) at a time. Excess increase in bootstrap support for certain nodes upon exclusion of a particular taxon indicates the hybrid (the excluded taxon) and its parents (the clades with increased support). We introduce a new software program, hext, which generates the taxon-jackknife data sets, runs the bootstrap tree calculations, and identifies excess bootstrap increases as outlier values in boxplot graphs. hext is written in the R language and accepts binary data (0/1; e.g. AFLP) as well as co-dominant SNP and genotype data. We demonstrate the usefulness of hext in large SNP data sets containing putative hybrids and their parents. For instance, using published data of the genus Vitis (~6,000 SNP loci), hext output supports V. × champinii as a hybrid between V. rupestris and V. mustangensis. With simulated SNP and AFLP data sets, excess increases in bootstrap support were not always connected with the hybrid taxon (false positives), whereas the expected bootstrap signal failed to appear on several occasions (false negatives). Potential causes for both types of spurious results are discussed. With both empirical and simulated data sets, the taxon-jackknife output generated by hext provided additional signatures of hybrid taxa, including changes in tree topology across trees, consistent effects of exclusions of the hybrid and the parent taxa, and moderate (rather than excessive) increases in bootstrap support. hext significantly facilitates the taxon-jackknife approach to hybrid taxon detection, even though the simple test for excess bootstrap increase may not reliably identify hybrid taxa in all applications. PMID:27066216

  13. The full-length phylogenetic tree from 1551 ribosomal sequences of chitinous fungi, Fungi.

    PubMed

    Tehler, Anders; Little, Damon P; Farris, James S

    2003-08-01

    A data set with 1551 fungal sequences of the small subunit ribosomal RNA has been analysed phylogenetically. Four animal sequences were used to root the tree. The parsimony ratchet algorithm in combination with tree fusion was used to find most parsimonious trees and the parsimony jackknifing method was used to establish support frequencies. The full-length consensus tree, of the most parsimonious trees, is published and jackknife frequencies above 50% are plotted on the consensus tree at supported nodes. Until recently attempts to find the most parsimonious trees for large data sets were impractical, given current computational limitations. The parsimony ratchet in combination with tree fusion was found to be a very efficient method of rapid parsimony analysis of this large data set. Parsimony jackknifing is a very fast and efficient method for establishing group support. The results show that the Glomeromycota are the sister group to a monophyletic Dikaryomycota. The majority of the species in the Glomeromycota/Dikaryomycota group have a symbiotic lifestyle--a possible synapomorphy for a group 'Symbiomycota'. This would suggest that symbiosis between fungi and green plants evolved prior to the colonization of land by plants and not as a result of the colonization process. The Basidiomycotina and the Ascomycotina are both supported as monophyletic. The Urediniomycetes is the sister group to the rest of the Basidiomycotina successively followed in a grade by Ustilaginomycetes, Tremellomycetes, Dacrymycetales, Ceratobasidiales and Homobasidiomycetes each supported as monophyletic except the Homobasidiomycetes which are left unsupported. The ascomycete node begins with a polytomy consisting of the Pneumocystidomycetes, Schizosaccharomycetes, unsupported group with the Taphrinomycetes and Neolectales, and finally an unnamed, monophyletic and supported group including the Saccharomycetes and Euascomycetes. 
Within the Euascomycetes the inoperculate euascomycetes (Inoperculata) are supported as monophyletic excluding the Orbiliomycetes which are included in an unsupported operculate, pezizalean sister group together with Helvellaceae, Morchellaceae, Tuberaceae and others. Geoglossum is the sister group to the rest of the inoperculate euascomycetes. The Sordariomycetes, Dothideomycetes, Chaetothyriomycetes and Eurotiomycetes are each highly supported as monophyletic. The Leotiomycetes and the Lecanoromycetes both appear in the consensus of the most parsimonious trees but neither taxon receives any jackknife support. PMID:14531615

  14. Small-sample estimation of species richness applied to forest communities.

    PubMed

    Hwang, Wen-Han; Shen, Tsung-Jen

    2010-12-01

    Many well-known methods are available for estimating the number of species in a forest community. However, most existing methods result in considerable negative bias in applications, where field surveys typically represent only a small fraction of sampled communities. This article develops a new method based on sampling with replacement to estimate species richness via the generalized jackknife procedure. The proposed estimator yields small bias and reasonably accurate interval estimation even with small samples. The performance of the proposed estimator is compared with several typical estimators via simulation study using two complete census datasets from Panama and Malaysia. PMID:20002401

  15. Ongoing Estimation of the Epidemic Parameters of a Stochastic, Spatial, Discrete-Time Model for a 1983–84 Avian Influenza Epidemic

    PubMed Central

    Rorres, C.; Pelletier, S. T. K.; Bruhn, M. C.; Smith, G.

    2013-01-01

    We formulate a stochastic, spatial, discrete-time model of viral “Susceptible, Exposed, Infectious, Recovered” animal epidemics and apply it to an avian influenza epidemic in Pennsylvania in 1983–84. Using weekly data for the number of newly infectious cases collected during the epidemic, we find estimates for the latent period of the virus and the values of two parameters within the transmission kernel of the model. These data are then jackknifed on a progressive weekly basis to show how our estimates can be applied to an ongoing epidemic to generate continually improving values of certain epidemic parameters. PMID:21500633

  16. Ongoing estimation of the epidemic parameters of a stochastic, spatial, discrete-time model for a 1983-84 avian influenza epidemic.

    PubMed

    Rorres, C; Pelletier, S T K; Bruhn, M C; Smith, G

    2011-03-01

    We formulate a stochastic, spatial, discrete-time model of viral "Susceptible, Exposed, Infectious, Recovered" animal epidemics and apply it to an avian influenza epidemic in Pennsylvania in 1983-84. Using weekly data for the number of newly infectious cases collected during the epidemic, we find estimates for the latent period of the virus and the values of two parameters within the transmission kernel of the model. These data are then jackknifed on a progressive weekly basis to show how our estimates can be applied to an ongoing epidemic to generate continually improving values of certain epidemic parameters. PMID:21500633

  17. [Usefulness of residuals in clinical research].

    PubMed

    Cuevas-Urióstegui, M L; Garduño-Espinosa, J; Fajardo-Gutiérrez, A; Hernández-Hernández, D M; Martínez-García, M C

    1993-05-01

    Simple linear regression, multiple linear regression and logistic regression are powerful statistical tools widely used in clinical research. These analyses are based on mathematical models that in turn rest on certain basic assumptions. The assumptions of regression analysis are basically: a) that the model is really linear, b) that the data are normally distributed, c) that the variances of the data are homogeneous (homoscedastic), and d) that the observations are independent. Regression diagnostics have become popular as a way to evaluate whether these assumptions are met; one of the most important techniques is residual analysis. A residual can be defined as the value that measures the distance between the regression line and the corresponding observed value of the variable "y". The kinds of residuals used to evaluate the regression assumptions include the crude (raw) residual, the standardized residual, the Studentized residual and the jackknife residual. The most useful among them is the jackknife residual. The usefulness and limitations of residuals in evaluating the assumptions of regression analysis are described, with reference mainly to the identification and handling of extreme values (outliers). PMID:8504006
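
    As an illustration of the jackknife residual discussed in this abstract, the sketch below computes it for a simple linear regression using the standard closed form (the externally Studentized residual), in which each residual is scaled by an error variance estimated with that observation deleted. The function name and the data are hypothetical, and the abstract itself does not give a formula.

```python
import numpy as np

def jackknife_residuals(x, y):
    """Jackknife (externally Studentized) residuals for y = b0 + b1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])    # intercept and slope columns
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta                         # crude residuals
    # Leverages: diagonal of the hat matrix X (X'X)^-1 X'.
    h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
    sse = resid @ resid
    # Error variance re-estimated with observation i left out.
    s2_deleted = (sse - resid**2 / (1.0 - h)) / (n - p - 1)
    return resid / np.sqrt(s2_deleted * (1.0 - h))

# Hypothetical data with one suspicious point at x = 5.
x = [1, 2, 3, 4, 5, 6]
y = [1.1, 1.9, 3.2, 3.9, 9.0, 6.1]
print(np.round(jackknife_residuals(x, y), 2))
```

    Because the outlying point does not contribute to its own variance estimate, its jackknife residual stands out more sharply than its standardized residual would, which is why the abstract singles it out for outlier detection.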

  18. PsN-Toolkit--a collection of computer intensive statistical methods for non-linear mixed effect modeling using NONMEM.

    PubMed

    Lindbom, Lars; Pihlgren, Pontus; Jonsson, E Niclas; Jonsson, Niclas

    2005-09-01

    PsN-Toolkit is a collection of statistical tools for pharmacometric data analysis using the non-linear mixed effect modeling software NONMEM. The toolkit is object oriented and written in the programming language Perl using the programming library Perl-speaks-NONMEM (PsN). Five methods: the Bootstrap, the Jackknife, Log-likelihood Profiling, Case-deletion Diagnostics and Stepwise Covariate Model building are included as separate classes and may be used in user-written Perl scripts or through stand-alone command line applications. The tools are designed with the ability to cooperate and with an emphasis on common structures for workflow and result handling. Parallel execution of independent tool sections is supported on shared memory multiprocessor (SMP) computers, Mosix/openMosix clusters and distributed computing environments following the NorduGrid standard. In conclusion, PsN-Toolkit makes it easier to use the Bootstrap, the Jackknife, Log-likelihood Profiling, Case-deletion Diagnostics and Stepwise Covariate Model building in pharmacometric data analysis. PMID:16023764

  19. Performance of internal covariance estimators for cosmic shear correlation functions

    DOE PAGESBeta

    Friedrich, O.; Seitz, S.; Eifler, T. F.; Gruen, D.

    2015-12-31

    Data re-sampling methods such as the delete-one jackknife are a common tool for estimating the covariance of large scale structure probes. In this paper we investigate the concepts of internal covariance estimation in the context of cosmic shear two-point statistics. We demonstrate how to use log-normal simulations of the convergence field and the corresponding shear field to carry out realistic tests of internal covariance estimators and find that most estimators such as jackknife or sub-sample covariance can reach a satisfactory compromise between bias and variance of the estimated covariance. In a forecast for the complete, 5-year DES survey we show that internally estimated covariance matrices can provide a large fraction of the true uncertainties on cosmological parameters in a 2D cosmic shear analysis. The volume inside contours of constant likelihood in the $\Omega_m$-$\sigma_8$ plane as measured with internally estimated covariance matrices is on average $\gtrsim 85\%$ of the volume derived from the true covariance matrix. The uncertainty on the parameter combination $\Sigma_8 \sim \sigma_8 \Omega_m^{0.5}$ derived from internally estimated covariances is $\sim 90\%$ of the true uncertainty.

  20. Prediction of Antimicrobial Peptides Based on Sequence Alignment and Support Vector Machine-Pairwise Algorithm Utilizing LZ-Complexity

    PubMed Central

    Shahrudin, Shahriza

    2015-01-01

    This study concerns an attempt to establish a new method for predicting antimicrobial peptides (AMPs), which are important to the immune system. Recently, researchers have become interested in designing alternative drugs based on AMPs because a large number of bacterial strains have become resistant to available antibiotics. However, researchers have encountered obstacles in the AMP design process, as experiments to extract AMPs from protein sequences are costly and require a long set-up time. Therefore, a computational tool for AMP prediction is needed to resolve this problem. In this study, an integrated algorithm is newly introduced to predict AMPs by integrating sequence alignment and a support vector machine- (SVM-) LZ complexity pairwise algorithm. It was observed that, when all sequences in the training set are used, the sensitivity of the proposed algorithm is 95.28% in the jackknife test and 87.59% in the independent test, while the sensitivity obtained for the jackknife test and independent test is 88.74% and 78.70%, respectively, when only the sequences that have less than 70% similarity are used. Applying the proposed algorithm may allow researchers to effectively predict AMPs from unknown protein peptide sequences with higher sensitivity. PMID:25802839

  1. Estimation of the Time Interval between the Administration of Heroin and the Sampling of Blood in Chronic Inhalers.

    PubMed

    Dubois, Nathalie; Hallet, Claude; Seidel, Laurence; Demaret, Isabelle; Luppens, David; Ansseau, Marc; Rozet, Eric; Albert, Adelin; Hubert, Philippe; Charlier, Corinne

    2015-05-01

    To develop a model for estimating the time delay between last heroin consumption and blood sampling in chronic drug users. Eleven patients, all heroin inhalers undergoing detoxification, were included in the study. Several plasma samples were collected during the detoxification procedure and analyzed for the heroin metabolites 6-acetylmorphine (6AM), morphine (MOR), morphine-6-glucuronide (M6G) and morphine-3-glucuronide (M3G), according to a UHPLC/MSMS method. The general linear mixed model was applied to time-related concentrations and a pragmatic four-step delay estimation approach was proposed based on the simultaneous presence of metabolites in plasma. Validation of the model was carried out using the jackknife technique on the 11 patients, and on a group of 7 test patients. Quadratic equations were derived for all metabolites except 6AM. The interval delay estimation was 2-4 days when only M3G was present in plasma, 1-2 days when M6G and M3G were both present, 0-1 day when MOR, M6G and M3G were present and <2 h when all metabolites were present. The 'jackknife' correlation between declared and actual estimated delays was 0.90. The overall precision of the delay estimates was 8-9 h. The delay between last heroin consumption and blood sampling in chronic drug users can be satisfactorily predicted from plasma heroin metabolites. PMID:25648554

  2. Performance of internal covariance estimators for cosmic shear correlation functions

    NASA Astrophysics Data System (ADS)

    Friedrich, O.; Seitz, S.; Eifler, T. F.; Gruen, D.

    2016-03-01

    Data re-sampling methods such as delete-one jackknife, bootstrap or the sub-sample covariance are common tools for estimating the covariance of large-scale structure probes. We investigate different implementations of these methods in the context of cosmic shear two-point statistics. Using lognormal simulations of the convergence field and the corresponding shear field we generate mock catalogues of a known and realistic covariance. For a survey of ∼5000 deg² we find that jackknife, if implemented by deleting sub-volumes of galaxies, provides the most reliable covariance estimates. Bootstrap, in the common implementation of drawing sub-volumes of galaxies, strongly overestimates the statistical uncertainties. In a forecast for the complete 5-yr Dark Energy Survey, we show that internally estimated covariance matrices can provide a large fraction of the true uncertainties on cosmological parameters in a 2D cosmic shear analysis. The volume inside contours of constant likelihood in the Ωm-σ8 plane as measured with internally estimated covariance matrices is on average ≳85 per cent of the volume derived from the true covariance matrix. The uncertainty on the parameter combination Σ8 ∼ σ8 Ωm^0.5 derived from internally estimated covariances is ∼90 per cent of the true uncertainty.
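
    The delete-one jackknife covariance mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes the data vector is a plain average of per-sub-volume measurements, whereas real two-point statistics also involve pairs of galaxies that straddle sub-volumes. The function name and array layout are hypothetical.

```python
import numpy as np

def jackknife_covariance(patch_values):
    """Delete-one jackknife covariance of a mean data vector.

    patch_values: array of shape (n_patches, n_bins), one measurement of the
    data vector per sub-volume (an assumption made for this sketch).
    """
    patch_values = np.asarray(patch_values, dtype=float)
    n = patch_values.shape[0]
    total = patch_values.sum(axis=0)
    # Data vector re-estimated with each sub-volume deleted in turn.
    delete_one = (total - patch_values) / (n - 1)
    diff = delete_one - delete_one.mean(axis=0)
    # The (n - 1) / n prefactor accounts for the strong overlap between
    # delete-one samples.
    return (n - 1) / n * diff.T @ diff

rng = np.random.default_rng(0)
mock = rng.normal(size=(20, 3))       # hypothetical: 20 sub-volumes, 3 bins
print(jackknife_covariance(mock).shape)
```

    For a simple mean this reduces exactly to the sample covariance divided by the number of sub-volumes, which makes a convenient sanity check before applying the same recipe to a nonlinear statistic.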

  3. Reweighting estimators for Cox regression with missing covariate data: Analysis of insulin resistance and risk of stroke in the Northern Manhattan Study

    PubMed Central

    Xu, Qiang; Paik, Myunghee Cho; Rundek, Tatjana; Elkind, Mitchell S. V.; Sacco, Ralph L.

    2015-01-01

    Incomplete covariates often obscure analysis results from a Cox regression. In an analysis of the Northern Manhattan Study (NOMAS) to determine the influence of insulin resistance on the incidence of stroke in non-diabetic individuals, insulin level is unknown for 34.1% of the subjects. The available data suggest that the missingness mechanism depends on outcome variables, which may generate biases in estimating the parameters of interest if only using the complete observations. This article aimed to introduce practical strategies to analyze the NOMAS data and present sensitivity analyses by using the reweighting method in standard statistical packages. When the data set structure is in counting process style, the reweighting estimates can be obtained by built-in procedures with variance estimated by the jackknife method. Simulation results indicate that the jackknife variance estimate provides reasonable coverage probability in moderate sample sizes. We subsequently conducted sensitivity analyses for the NOMAS data, showing that the risk estimates are robust to a variety of missingness mechanisms. At the end of this article, we present the core SAS and R programs used in the analysis. PMID:21965165

  4. Using increment of diversity to predict mitochondrial proteins of malaria parasite: integrating pseudo-amino acid composition and structural alphabet.

    PubMed

    Chen, Ying-Li; Li, Qian-Zhong; Zhang, Li-Qing

    2012-04-01

    Due to the complexity of the Plasmodium falciparum (PF) genome, predicting mitochondrial proteins of PF is more difficult than for other species. In this study, using the n-peptide composition of reduced amino acid alphabet (RAAA) obtained from the structural alphabet named Protein Blocks as feature parameter, the increment of diversity (ID) is firstly developed to predict mitochondrial proteins. By choosing the 1-peptide compositions on the N-terminal regions with 20 residues as the only input vector, the prediction performance achieves 86.86% accuracy with 0.69 Matthews correlation coefficient (MCC) by the jackknife test. Moreover, by combining with the hydropathy distribution along protein sequence and several reduced amino acid alphabets, we achieved maximum MCC 0.82 with accuracy 92% in the jackknife test by using the developed ID model. When evaluated on an independent dataset, our method performs better than existing methods. The results indicate that the ID is a simple and efficient prediction method for mitochondrial proteins of malaria parasite. PMID:21191803

  5. Performance of internal covariance estimators for cosmic shear correlation functions

    SciTech Connect

    Friedrich, O.; Seitz, S.; Eifler, T. F.; Gruen, D.

    2015-12-31

    Data re-sampling methods such as the delete-one jackknife are a common tool for estimating the covariance of large scale structure probes. In this paper we investigate the concepts of internal covariance estimation in the context of cosmic shear two-point statistics. We demonstrate how to use log-normal simulations of the convergence field and the corresponding shear field to carry out realistic tests of internal covariance estimators and find that most estimators such as jackknife or sub-sample covariance can reach a satisfactory compromise between bias and variance of the estimated covariance. In a forecast for the complete, 5-year DES survey we show that internally estimated covariance matrices can provide a large fraction of the true uncertainties on cosmological parameters in a 2D cosmic shear analysis. The volume inside contours of constant likelihood in the $\Omega_m$-$\sigma_8$ plane as measured with internally estimated covariance matrices is on average $\gtrsim 85\%$ of the volume derived from the true covariance matrix. The uncertainty on the parameter combination $\Sigma_8 \sim \sigma_8 \Omega_m^{0.5}$ derived from internally estimated covariances is $\sim 90\%$ of the true uncertainty.

  6. Ontogeny of the barley plant as related to mutation expression and detection of pollen mutations

    SciTech Connect

    Hodgdon, A.L.; Marcus, A.H.; Arenaz, P.; Rosichan, J.L.; Bogyo, T.P.; Nilan, R.A.

    1980-05-29

    Clustering of mutant pollen grains in a population of normal pollen due to premeiotic mutational events complicates translating mutation frequencies into rates. Embryo ontogeny in barley will be described and used to illustrate the formation of such mutant clusters. The nature of the statistics for mutation frequency will be described from a study of the reversion frequencies of various waxy mutants in barley. Computer analysis by a jackknife method of the reversion frequencies of a waxy mutant treated with the mutagen sodium azide showed a significantly higher reversion frequency than untreated material. Problems of the computer analysis suggest a better experimental design for pollen mutation experiments. Preliminary work on computer modeling for pollen development and mutation will be described.

  7. Estimating contaminant loads in rivers: An application of adjusted maximum likelihood to type 1 censored data

    USGS Publications Warehouse

    Cohn, T.A.

    2005-01-01

    This paper presents an adjusted maximum likelihood estimator (AMLE) that can be used to estimate fluvial transport of contaminants, like phosphorus, that are subject to censoring because of analytical detection limits. The AMLE is a generalization of the widely accepted minimum variance unbiased estimator (MVUE), and Monte Carlo experiments confirm that it shares essentially all of the MVUE's desirable properties, including high efficiency and negligible bias. In particular, the AMLE exhibits substantially less bias than alternative censored-data estimators such as the MLE (Tobit) or the MLE followed by a jackknife. As with the MLE and the MVUE the AMLE comes close to achieving the theoretical Fréchet-Cramér-Rao bounds on its variance. This paper also presents a statistical framework, applicable to both censored and complete data, for understanding and estimating the components of uncertainty associated with load estimates. This can serve to lower the cost and improve the efficiency of both traditional and real-time water quality monitoring.

  8. Prediction of Pork Quality by Fuzzy Support Vector Machine Classifier

    NASA Astrophysics Data System (ADS)

    Zhang, Jianxi; Yu, Huaizhi; Wang, Jiamin

    Existing objective methods to evaluate pork quality in general do not yield satisfactory results and their applications in the meat industry are limited. In this study, a fuzzy support vector machine (FSVM) method was developed to evaluate and predict pork quality rapidly and nondestructively. Firstly, the discrete wavelet transform (DWT) was used to eliminate the noise component in the original spectrum and the new spectrum was reconstructed. Then, because the characteristic variables were still correlated and contained some redundant information, principal component analysis (PCA) was carried out. Lastly, FSVM was developed to differentiate and classify pork samples into different quality grades using the features from PCA. Jackknife tests on the working datasets indicated that the prediction accuracies were higher than those of other methods.

  9. Robust estimation of population size when capture probabilities vary among animals

    USGS Publications Warehouse

    Burnham, K.P.; Overton, W.S.

    1979-01-01

    A model is given for multiple recapture studies on closed populations which allows capture probabilities to vary among individuals. The capture probability of each individual is assumed to be constant over time. Based on this model we give a nonparametric estimation procedure for population size. The estimator involves selecting one of a sequence of estimators which are each linear combinations of the capture frequencies. The individual estimators are derived from the generalized jackknife method. We also give a goodness of fit test for the model's assumption that individual capture probabilities do not change during the study. The robustness of the estimation procedure is investigated with a simulation study. By virtue of this study, and the theoretical nature of the estimator, it is judged to be robust to moderate variations in individual capture probabilities which may occur in commonly used short-term livetrapping studies.
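
    The "linear combinations of the capture frequencies" described in this abstract can be illustrated with the first- and second-order jackknife estimators as they are commonly quoted: N1 = S + ((k - 1)/k) f1 and N2 = S + ((2k - 3)/k) f1 - ((k - 2)^2 / (k(k - 1))) f2, where S is the number of distinct animals caught, k the number of trapping occasions, and f1, f2 the numbers of animals caught exactly once and twice. The sketch below assumes those forms; the function name and capture histories are hypothetical, and the abstract's procedure for selecting among orders is not implemented.

```python
import numpy as np

def jackknife_population_size(capture_history, order=1):
    """First- or second-order jackknife estimate of population size.

    capture_history: 0/1 array of shape (n_animals_caught, k_occasions).
    """
    hist = np.asarray(capture_history)
    k = hist.shape[1]
    times_caught = hist.sum(axis=1)
    s = int((times_caught > 0).sum())      # distinct animals observed
    f1 = int((times_caught == 1).sum())    # caught exactly once
    f2 = int((times_caught == 2).sum())    # caught exactly twice
    if order == 1:
        return s + (k - 1) / k * f1
    if order == 2:
        return s + (2 * k - 3) / k * f1 - (k - 2) ** 2 / (k * (k - 1)) * f2
    raise ValueError("only orders 1 and 2 are sketched here")

# Hypothetical study: 6 animals over 5 trapping occasions.
history = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 1, 1, 1, 0],
]
print(jackknife_population_size(history, order=1))
print(jackknife_population_size(history, order=2))
```

    Both estimates exceed the six animals actually caught, because animals seen only once or twice are taken as evidence of others that were never caught.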

  10. Linear regression in astronomy. II

    NASA Astrophysics Data System (ADS)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-09-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.

  11. Predicting Subcellular Localization of Apoptosis Proteins Combining GO Features of Homologous Proteins and Distance Weighted KNN Classifier

    PubMed Central

    Wang, Xiao; Li, Hui; Zhang, Qiuwen; Wang, Rong

    2016-01-01

    Apoptosis proteins play a key role in maintaining the stability of organism; the functions of apoptosis proteins are related to their subcellular locations which are used to understand the mechanism of programmed cell death. In this paper, we utilize GO annotation information of apoptosis proteins and their homologous proteins retrieved from GOA database to formulate feature vectors and then combine the distance weighted KNN classification algorithm with them to solve the data imbalance problem existing in CL317 data set to predict subcellular locations of apoptosis proteins. It is found that the number of homologous proteins can affect the overall prediction accuracy. Under the optimal number of homologous proteins, the overall prediction accuracy of our method on CL317 data set reaches 96.8% by Jackknife test. Compared with other existing methods, it shows that our proposed method is very effective and better than others for predicting subcellular localization of apoptosis proteins. PMID:27213149
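
    The "Jackknife test" reported in this and several other abstracts is leave-one-out validation: each sample is held out in turn, predicted from all the others, and the hit rate is reported. The sketch below pairs it with a distance-weighted k-nearest-neighbour vote. It is only an illustration; the feature vectors, the inverse-distance weighting, and the function name are assumptions, and the GO-based features of the paper are not reproduced.

```python
import numpy as np

def jackknife_accuracy(features, labels, k=3):
    """Leave-one-out accuracy of a distance-weighted KNN classifier."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels)
    classes = np.unique(y)
    correct = 0
    for i in range(len(X)):
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf                       # hold out sample i
        nearest = np.argsort(dist)[:k]
        # Closer neighbours get larger votes (inverse-distance weights).
        weights = 1.0 / (dist[nearest] + 1e-12)
        votes = [weights[y[nearest] == c].sum() for c in classes]
        correct += int(classes[int(np.argmax(votes))] == y[i])
    return correct / len(X)

# Hypothetical two-class data with well-separated clusters.
X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
y = [0, 0, 0, 1, 1, 1]
print(jackknife_accuracy(X, y, k=3))
```

    Weighting votes by inverse distance lets a few close neighbours from a small class outvote more distant neighbours from a large one, which is one reason such weighting is used on imbalanced data sets.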

  12. Avian community response to small-scale habitat disturbance in Maine

    USGS Publications Warehouse

    Derleth, E.L.; McAuley, D.G.; Dwyer, T.J.

    1989-01-01

    The effects of small clearcuts (1 - 8 ha) on avian communities in the forest of eastern Maine were studied using point counts during spring 1978 - 1981. Surveys were conducted in uncut (control) and clear-cut (treatment) plots in three stand types: conifer, hardwood, and mixed growth. We used a mark-recapture model and its associated jackknife species richness estimator (N), as an indicator of avian community structure. Increases in estimated richness (N) and Shannon - Weaver diversity (H') were noted in the treated hardwood and mixed growth, but not in the conifer stands. Seventeen avian species increased in relative abundance, whereas two species declined. Stand treatment was associated with important changes in bird species composition. Increased habitat patchiness and the creation of forest edge are hypothesized as causes for the greater estimates of richness and diversity.

  13. ETAP: An Exoplanetary Transit Analyzer Program

    NASA Astrophysics Data System (ADS)

    Demircan, O.; Bakış, V.

    2015-07-01

    We introduce new software for the modeling of exoplanet transit light curves. The technique is based on the aperture cross correlation between the light changes and the diffraction pattern of the apertures representing the eclipsing binary components (Kopal 1977; Kopal & Demircan 1978), where in the present case the eclipsing component is considered to be a transiting exoplanet. The software finds the best parameters including the fractional radii of the exoplanet and the host star, the inclination and eccentricity of the orbit, the longitude of periastron, limb-darkening coefficients of any degree, and the transit time, yielding the lowest χ2. Parameter uncertainties are calculated with the jackknife method. We compare results for selected exoplanet transit light curves against those available in the literature.

  14. A method for WD40 repeat detection and secondary structure prediction.

    PubMed

    Wang, Yang; Jiang, Fan; Zhuo, Zhu; Wu, Xian-Hui; Wu, Yun-Dong

    2013-01-01

    WD40-repeat proteins (WD40s), as one of the largest protein families in eukaryotes, play vital roles in assembling protein-protein/DNA/RNA complexes. WD40s fold into similar β-propeller structures despite diversified sequences. A program WDSP (WD40 repeat protein Structure Predictor) has been developed to accurately identify WD40 repeats and predict their secondary structures. The method is designed specifically for WD40 proteins by incorporating both local residue information and non-local family-specific structural features. It overcomes the problem of highly diversified protein sequences and variable loops. In addition, WDSP achieves a better prediction in identifying multiple WD40-domain proteins by taking the global combination of repeats into consideration. In secondary structure prediction, the average Q3 accuracy of WDSP in the jack-knife test reaches 93.7%. A disease-related protein, LRRK2, was used as a representative example to demonstrate the structure prediction. PMID:23776530

  15. Bootstrapped MRMC confidence intervals

    NASA Astrophysics Data System (ADS)

    Samuelson, Frank W.; Wagner, Robert F.

    2005-04-01

    The multiple-reader, multiple-case (MRMC) paradigm of Swets and Pickett (1982) for ROC analysis was expressed as a components of variance model by Dorfman, Berbaum, and Metz (1992) and validated by Roe and Metz (1997) for Type I error rates. Our group proposed an analysis of the MRMC components of variance model using bootstrap (Beiden, Wagner, and Campbell, 2000) experiments instead of jackknife pseudo-values. These approaches have been challenged by some contemporary authors (e.g. Zhou, Obuchowski, and McClish, 2002). The purpose of the present paper is to formally compare the models and to carry out validation tests of their performance. We investigate different approaches to statistical inference, including several types of nonparametric bootstrap confidence intervals and report on validation and simulation experiments of Type I errors.
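
    The jackknife pseudo-values referred to above can be illustrated with a minimal Python sketch. It assumes a single reader, a toy set of scores, and the empirical (Mann-Whitney) ROC area; the function names `auc` and `auc_pseudovalues` are hypothetical and are not taken from the cited papers, whose MRMC analyses compute pseudo-values per reader and then feed them into a variance-components model.

```python
from statistics import mean

def auc(normal, diseased):
    """Empirical ROC area: P(diseased score > normal score), ties count 1/2."""
    wins = 0.0
    for d in diseased:
        for h in normal:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(normal) * len(diseased))

def auc_pseudovalues(normal, diseased):
    """Delete one case at a time and turn each leave-one-out AUC into a pseudo-value."""
    n_total = len(normal) + len(diseased)
    full = auc(normal, diseased)
    pseudo = []
    # Delete each normal case in turn.
    for i in range(len(normal)):
        loo = auc(normal[:i] + normal[i + 1:], diseased)
        pseudo.append(n_total * full - (n_total - 1) * loo)
    # Delete each diseased case in turn.
    for i in range(len(diseased)):
        loo = auc(normal, diseased[:i] + diseased[i + 1:])
        pseudo.append(n_total * full - (n_total - 1) * loo)
    return pseudo

# Toy scores (illustrative only, not data from any cited study).
normal_scores = [1, 2, 3, 4]
diseased_scores = [3, 5, 6, 7]
area = auc(normal_scores, diseased_scores)
pseudo = auc_pseudovalues(normal_scores, diseased_scores)
print(area, mean(pseudo))
```

    Each case contributes one pseudo-value, so the list can be treated as ordinary data in a later analysis of variance; for this statistic the mean of the pseudo-values reproduces the full-sample area.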

  16. A phylogenetic perspective on larval spine morphology in Leucorrhinia (Odonata: Libellulidae) based on ITS1, 5.8S, and ITS2 rDNA sequences.

    PubMed

    Hovmöller, Rasmus; Johansson, Frank

    2004-03-01

    Leucorrhinia (Odonata, Anisoptera, Libellulidae) consists of 14-15 species with a holarctic distribution. We have combined the morphological characters of a previous study with sequence data from the ITS1, 5.8S rDNA, and ITS2 regions of the nuclear ribosomal repeat. Cloning was used to investigate the intra-individual variation and such variation was found in all investigated species. Parsimony jackknifing was used to identify supported groups. The effect of sequence alignment and gap coding was explored by a modified sensitivity analysis. Loss of spines in Leucorrhinia larvae has occurred twice: once in Europe and once in North America. The role of spines as a defence against predation is discussed in a phylogenetic context. PMID:15012945

  17. Gulls identified as major source of fecal pollution in coastal waters: a microbial source tracking study.

    PubMed

    Araújo, Susana; Henriques, Isabel S; Leandro, Sérgio Miguel; Alves, Artur; Pereira, Anabela; Correia, António

    2014-02-01

    Gulls were reported as sources of fecal pollution in coastal environments and potential vectors of human infections. Microbial source tracking (MST) methods were rarely tested to identify this pollution origin. This study was conducted to ascertain the source of water fecal contamination in the Berlenga Island, Portugal. A total of 169 Escherichia coli isolates from human sewage, 423 isolates from gull feces and 334 water isolates were analyzed by BOX-PCR. An average correct classification of 79.3% was achieved. When an 85% similarity cutoff was applied 24% of water isolates were present in gull feces against 2.7% detected in sewage. Jackknifing resulted in 29.3% of water isolates classified as gull, and 10.8% classified as human. Results indicate that gulls constitute a major source of water contamination in the Berlenga Island. This study validated a methodology to differentiate human and gull fecal pollution sources in a real case of a contaminated beach. PMID:24140684

  18. Small-angle X-ray scattering- and nuclear magnetic resonance-derived conformational ensemble of the highly flexible antitoxin PaaA2.

    PubMed

    Sterckx, Yann G J; Volkov, Alexander N; Vranken, Wim F; Kragelj, Jaka; Jensen, Malene Ringkjøbing; Buts, Lieven; Garcia-Pino, Abel; Jové, Thomas; Van Melderen, Laurence; Blackledge, Martin; van Nuland, Nico A J; Loris, Remy

    2014-06-10

    Antitoxins from prokaryotic type II toxin-antitoxin modules are characterized by a high degree of intrinsic disorder. The description of such highly flexible proteins is challenging because they cannot be represented by a single structure. Here, we present a combination of SAXS and NMR data to describe the conformational ensemble of the PaaA2 antitoxin from the human pathogen E. coli O157. The method encompasses the use of SAXS data to filter ensembles out of a pool of conformers generated by a custom NMR structure calculation protocol and the subsequent refinement by a block jackknife procedure. The final ensemble obtained through the method is validated by an established residual dipolar coupling analysis. We show that the conformational ensemble of PaaA2 is highly compact and that the protein exists in solution as two preformed helices, connected by a flexible linker, that probably act as molecular recognition elements for toxin inhibition. PMID:24768114

  19. Nuclear DNA analysis of oral hyperplasia and dysplasia using image cytometry.

    PubMed

    Abdel-Salam, M; Mayall, B H; Hansen, L S; Chew, K L; Greenspan, J S

    1987-10-01

    We investigated the value of image analysis in discriminating among oral white lesions with hyperplasia without dysplasia and oral white or white-and-red lesions with moderate or severe dysplasia. Normal oral epithelial tissue was used as a control. Image analysis was applied to 5-micron formalin-fixed sections stained with the azure A-Feulgen reaction for nuclear DNA. For 150-200 cells from each section, 5 nuclear variables were assessed: area, form factor, total stain, average stain and ellipticity. For each variable, 2 measurements were obtained, the mean and the interquartile range, and were used for stepwise discriminant analysis. Using this test, a model of 3 measurements with the most discriminating power was developed. When the jackknife classification test was applied to this model, we could discriminate with 81% accuracy between the 4 groups of tissue studied. PMID:3123622

  20. Ontogeny of the barley plant as related to mutation expression and detection of pollen mutations

    SciTech Connect

    Hodgdon, A.L.; Marcus, A.H.; Arenaz, P.; Rosichan, J.L.; Bogyo, T.P.; Nilan, R.A.

    1981-01-01

    Clustering of mutant pollen grains in a population of normal pollen due to premeiotic mutational events complicates translating mutation frequencies into rates. Embryo ontogeny in barley will be described and used to illustrate the formation of such mutant clusters. The nature of the statistics for mutation frequency will be described from a study of the reversion frequencies of various waxy mutants in barley. Computer analysis by a "jackknife" method of the reversion of a waxy mutant treated with the mutagen sodium azide showed a significantly higher reversion frequency than untreated material. Problems of the computer analysis suggest a better experimental design for pollen mutation experiments. Preliminary work on computer modeling for pollen development and mutation will be described.

  1. The Development of a Statistical Decision Aid for Psychiatric Case Identification of Alcohol Use Disorder in the Eastern Baltimore Community

    PubMed Central

    Rizik, Peter D.

    1984-01-01

    The development of a case identification decision support tool in a Public Health setting is presented in this research. The analytic problem is described as the classification problem (Duda and Hart, 1974). Using logistic regression with a variable selection option enables iterative improvements in classification criteria through dimensionality reduction and subspace modification. The performance of the classifier on new samples is simulated by a subset jackknife technique which trains the model on one portion of the sample and tests on the other portion. Repeating this several times gives estimates of case detection and suggests underlying patterns of predictive variable sets through key variable selections and parameter estimate stability. From these estimates, positive and negative predictive values and referral rates can be estimated for proper evaluation of the tool.

  2. Otolith elemental signatures indicate population separation in deep-sea rockfish, Helicolenus dactylopterus and Pontinus kuhlii, from the Azores

    NASA Astrophysics Data System (ADS)

    Higgins, Ruth; Isidro, Eduardo; Menezes, Gui; Correia, Alberto

    2013-10-01

    Deep sea rockfish, Helicolenus dactylopterus and Pontinus kuhlii from the Azores archipelago were used to study population structuring using the trace element composition of the otolith. Through solution-based inductively coupled plasma mass spectrometry we identified elemental profiles that adequately identified fish from different island groups in the region (East, West and Central). Mg:Ca, Pb:Ca and Li:Ca ratios combined to distinguish H. dactylopterus with 67% overall success. Sr:Ca, Ba:Ca, Li:Ca and Cu:Ca provided adequate distinction in P. kuhlii with a mean jack-knifed classification success of 75%. This was a first attempt at determining the distinguishability of fish aggregations from this oceanic island setting, where suitable habitat for these species is limited and fragmented. Results of our study corroborate with previous research pointing to constrained home ranges for these species. Implications for fisheries management are important since these commercial resources should be managed locally rather than regionally.

  3. Discrepancies in the Estimation of Gene Flow in Partula

    PubMed Central

    Johnson, M. S.; Clarke, B.; Murray, J.

    1988-01-01

    Methods for estimating gene flow (Nm) from genetic data should provide important insights into the dynamics of natural populations. If they are to be used with confidence, however, the methods must be shown to produce valid results. Estimates of Nm have been obtained for the snails Partula taeniata and Partula suturalis, based on F(ST) and on the frequencies of private alleles, p(1). Jackknifing was used to reduce the bias of estimates and to obtain confidence limits. The estimates derived from F(ST) are consistent with the low vagility of snails, and with direct field studies of gene flow in P. taeniata. In contrast, the estimates derived from p(1) were up to seven times as large, less precise and less consistent. Although the underlying causes of these discrepancies are not clear, the results suggest that F(ST) is the more reliable indirect estimator of gene flow, at least for Partula. PMID:17246477
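
    As a rough illustration of how jackknifing can both reduce bias and yield confidence limits, the Python sketch below applies the delete-one jackknife to a deliberately biased estimator (the variance with divisor n). The data, the helper names `jackknife` and `plug_in_variance`, and the normal-approximation multiplier 1.96 are assumptions for illustration; they are not the estimators of Nm used in the study.

```python
import math
from statistics import mean, stdev, variance

def jackknife(data, estimator):
    """Return (bias-corrected estimate, standard error) from delete-one pseudo-values."""
    n = len(data)
    full = estimator(data)
    # Recompute the estimator with each observation left out in turn.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    # Pseudo-values: n * full estimate - (n - 1) * leave-one-out estimate.
    pseudo = [n * full - (n - 1) * t for t in leave_one_out]
    estimate = mean(pseudo)
    std_error = stdev(pseudo) / math.sqrt(n)
    return estimate, std_error

def plug_in_variance(xs):
    """Biased variance estimator (divides by n rather than n - 1)."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Toy sample (illustrative only).
sample = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
estimate, std_error = jackknife(sample, plug_in_variance)
# Approximate 95% limits, assuming the pseudo-values are roughly normal.
lower, upper = estimate - 1.96 * std_error, estimate + 1.96 * std_error
print(estimate, variance(sample), (lower, upper))
```

    For this particular estimator the jackknife correction recovers the usual unbiased sample variance, which makes the bias reduction easy to check by hand.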

  4. Prediction of protein structural classes using hybrid properties.

    PubMed

    Li, Wenjin; Lin, Kao; Feng, Kaiyan; Cai, Yudong

    2008-01-01

    In this paper, amino acid compositions are combined with some protein sequence properties (physiochemical properties) to predict protein structural classes. We are able to predict protein structural classes using a mathematical model that combines the nearest neighbor algorithm (NNA), mRMR (minimum redundancy, maximum relevance), and feature forward searching strategy. Jackknife cross-validation is used to evaluate the prediction accuracy. As a result, the prediction success rate improves to 68.8%, which is better than the 62.2% obtained when using only amino acid compositions. Therefore, we conclude that the physiochemical properties are factors that contribute to the protein folding phenomena and the most contributing features are found to be the amino acid composition. We expect that prediction accuracy will improve further as more sequence information comes to light. A web server for predicting the protein structural classes is available at http://app3.biosino.org:8080/liwenjin/index.jsp. PMID:18953662
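
    The jackknife cross-validation described above amounts to leave-one-out testing: each protein is predicted by a model built from all the others. A minimal Python sketch follows, assuming a plain 1-nearest-neighbor rule on two made-up numeric features; the feature vectors, class labels, and function names are hypothetical, and the mRMR feature selection step of the paper is not reproduced.

```python
def nearest_neighbor_label(query, train_X, train_y):
    """Label of the training point closest to `query` (squared Euclidean distance)."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(query, train_X[i])),
    )
    return train_y[best]

def jackknife_accuracy(X, y):
    """Leave each sample out once, predict it from the rest, and report the success rate."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_neighbor_label(X[i], train_X, train_y) == y[i]:
            correct += 1
    return correct / len(X)

# Toy feature vectors and class labels (illustrative only).
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 0.9), (0.9, 1.1), (1.2, 1.0)]
y = ["alpha", "alpha", "alpha", "beta", "beta", "beta"]
print(jackknife_accuracy(X, y))
```

    Because every sample is tested exactly once and never appears in its own training set, the resulting success rate does not depend on an arbitrary train/test split.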

  5. Prediction of Golgi-resident protein types using general form of Chou's pseudo-amino acid compositions: Approaches with minimal redundancy maximal relevance feature selection.

    PubMed

    Jiao, Ya-Sen; Du, Pu-Feng

    2016-08-01

    Recently, several efforts have been made in predicting Golgi-resident proteins. However, it is still a challenging task to identify the type of a Golgi-resident protein. Precise prediction of the type of a Golgi-resident protein plays a key role in understanding its molecular functions in various biological processes. In this paper, we proposed to use a mutual information based feature selection scheme with the general form Chou's pseudo-amino acid compositions to predict the Golgi-resident protein types. The positional specific physicochemical properties were applied in the Chou's pseudo-amino acid compositions. We achieved 91.24% prediction accuracy in a jackknife test with 49 selected features. It has the best performance among all the present predictors. This result indicates that our computational model can be useful in identifying Golgi-resident protein types. PMID:27155042

  6. Measures of clinical agreement for nominal and categorical data: the kappa coefficient.

    PubMed

    Cyr, L; Francis, K

    1992-07-01

    The desire to determine the extent to which inter-rater measurements obtained in a clinical setting are free from measurement error and reflect true scores has spurred a renewed interest in the assessment of reliability. The kappa coefficient is considered the statistic of choice to analyze the reliability of nominal and categorical types of data recorded on the same patient by more than one clinician. This paper presents a simple computer program written in PASCAL that can be used in a clinical environment to quickly determine the reliability of nominal or categorical data. This computer program calculates both weighted and non-weighted kappa coefficients with their corresponding standard errors as well as bias-correcting jackknife estimates of kappa for use with small sample sizes. PMID:1643847
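
    The calculation described above can be sketched in a few lines. The Python example below is not the PASCAL program from the paper; it is a minimal sketch assuming two raters, unweighted kappa, and made-up ratings, with hypothetical function names `cohen_kappa` and `jackknife_kappa`.

```python
from collections import Counter

def cohen_kappa(pairs):
    """Unweighted kappa for a list of (rater 1 category, rater 2 category) pairs."""
    n = len(pairs)
    observed = sum(1 for a, b in pairs if a == b) / n
    counts1 = Counter(a for a, _ in pairs)
    counts2 = Counter(b for _, b in pairs)
    # Agreement expected by chance from the two raters' marginal frequencies.
    expected = sum(counts1[k] * counts2[k] for k in counts1) / (n * n)
    return (observed - expected) / (1 - expected)

def jackknife_kappa(pairs):
    """Bias-corrected kappa: n * kappa - (n - 1) * mean of leave-one-patient-out kappas."""
    n = len(pairs)
    full = cohen_kappa(pairs)
    leave_one_out = [cohen_kappa(pairs[:i] + pairs[i + 1:]) for i in range(n)]
    return n * full - (n - 1) * sum(leave_one_out) / n

# Toy ratings for ten patients (illustrative only).
ratings = [("y", "y")] * 4 + [("n", "n")] * 3 + [("y", "n")] * 2 + [("n", "y")]
print(cohen_kappa(ratings), jackknife_kappa(ratings))
```

    The sketch would divide by zero if chance agreement were exactly 1 in the full sample or in any leave-one-out subsample, so a production version would need to guard that case.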

  7. Identification and analysis of the N(6)-methyladenosine in the Saccharomyces cerevisiae transcriptome.

    PubMed

    Chen, Wei; Tran, Hong; Liang, Zhiyong; Lin, Hao; Zhang, Liqing

    2015-01-01

    Knowledge of the distribution of N(6)-methyladenosine (m(6)A) is invaluable for understanding RNA biological functions. However, limitation in experimental methods impedes the progress towards the identification of m(6)A site. As a complement of experimental methods, a support vector machine based-method is proposed to identify m(6)A sites in Saccharomyces cerevisiae genome. In this model, RNA sequences are encoded by their nucleotide chemical property and accumulated nucleotide frequency information. It is observed in the jackknife test that the accuracy achieved by the proposed model in identifying the m(6)A site was 78.15%. For the convenience of experimental scientists, a web-server for the proposed model is provided at http://lin.uestc.edu.cn/server/m6Apred.php. PMID:26343792

  8. Identification and analysis of the N6-methyladenosine in the Saccharomyces cerevisiae transcriptome

    PubMed Central

    Chen, Wei; Tran, Hong; Liang, Zhiyong; Lin, Hao; Zhang, Liqing

    2015-01-01

    Knowledge of the distribution of N6-methyladenosine (m6A) is invaluable for understanding RNA biological functions. However, limitation in experimental methods impedes the progress towards the identification of m6A site. As a complement of experimental methods, a support vector machine based-method is proposed to identify m6A sites in Saccharomyces cerevisiae genome. In this model, RNA sequences are encoded by their nucleotide chemical property and accumulated nucleotide frequency information. It is observed in the jackknife test that the accuracy achieved by the proposed model in identifying the m6A site was 78.15%. For the convenience of experimental scientists, a web-server for the proposed model is provided at http://lin.uestc.edu.cn/server/m6Apred.php. PMID:26343792

  9. Quantitative analysis of mebendazole polymorphs in pharmaceutical raw materials using near-infrared spectroscopy.

    PubMed

    da Silva, Vitor H; Gonçalves, Jacqueline L; Vasconcelos, Fernanda V C; Pimentel, M Fernanda; Pereira, Claudete F

    2015-11-10

    This work evaluates the feasibility of using NIR spectroscopy for quantification of three polymorphs of mebendazole (MBZ) in pharmaceutical raw materials. Thirty ternary mixtures of polymorphic forms of MBZ were prepared, varying the content of forms A and C from 0 to 100% (w/w), and for form B from 0 to 30% (w/w). Reflectance NIR spectra were used to develop partial least square (PLS) regression models using all spectral variables and the variables with significant regression coefficients selected by the Jack-Knife algorithm (PLS/JK). MBZ polymorphs were quantified with RMSEP values of 2.37% w/w, 1.23% w/w and 1.48% w/w for polymorphs A, B and C, respectively. This is an easy, fast and feasible method for monitoring the quality of raw pharmaceutical materials of MBZ according to polymorph purity. PMID:26320077

  10. Using reduced amino acid composition to predict defensin family and subfamily: Integrating similarity measure and structural alphabet.

    PubMed

    Zuo, Yong-Chun; Li, Qian-Zhong

    2009-10-01

    Defensins are essentially ancient natural antibiotics with potent activity extending from lower organisms to humans. They can inhibit the growth or virulence of micro-organisms directly or indirectly enhance the host's immune system. The successful prediction of defensin peptides will provide very useful information and insights for the basic research of defensins. In this study, by selecting the N-peptide composition of reduced amino acid alphabet (RAAA) obtained from structural alphabet named Protein Blocks as the feature parameters, the increment of diversity (ID) is firstly developed to predict defensins family and subfamily. The jackknife test based on 2-peptide composition of reduced amino acid alphabet (RAAA) with 13 reduced amino acids shows that the overall accuracy of prediction are 91.36% for defensin family, and 94.21% for defensin subfamily. The results indicate that ID_RAAA is a simple and efficient prediction method for defensin peptides. PMID:19591890

  11. Latency as a region contrast: Measuring ERP latency differences with Dynamic Time Warping.

    PubMed

    Zoumpoulaki, A; Alsufyani, A; Filetti, M; Brammer, M; Bowman, H

    2015-12-01

    Methods for measuring onset latency contrasts are evaluated against a new method utilizing the dynamic time warping (DTW) algorithm. This new method allows latency to be measured across a region instead of single point. We use computer simulations to compare the methods' power and Type I error rates under different scenarios. We perform per-participant analysis for different signal-to-noise ratios and two sizes of window (broad vs. narrow). In addition, the methods are tested in combination with single-participant and jackknife average waveforms for different effect sizes, at the group level. DTW performs better than the other methods, being less sensitive to noise as well as to placement and width of the window selected. PMID:26372033

  12. Structural class tendency of polypeptide: A new conception in predicting protein structural class

    NASA Astrophysics Data System (ADS)

    Yu, Tao; Sun, Zhi-Bo; Sang, Jian-Ping; Huang, Sheng-You; Zou, Xian-Wu

    2007-12-01

    Prediction of protein domain structural classes is an important topic in protein science. In this paper, we propose a new concept: structural class tendency of polypeptides (SCTP), which is based on the fact that a given amino acid fragment tends to be present in certain types of proteins. The SCTP is obtained from an available training data set, PDB40-B. When using the SCTP to predict protein structural classes with the Intimate Sorting predictive method, we obtained predictive accuracies (jackknife test) of 93.7%, 96.5%, and 78.6% for the testing data sets PDB40-j, Chou&Maggiora, and CHOU, respectively. These results indicate that the SCTP approach is quite encouraging and promising. This new concept provides an effective tool to extract valuable information from protein sequences.

  13. Application of Spatial and Closed Capture-Recapture Models on Known Population of the Western Derby Eland (Taurotragus derbianus derbianus) in Senegal

    PubMed Central

    Jůnek, Tomáš; Jůnková Vymyslická, Pavla; Hozdecká, Kateřina; Hejcmanová, Pavla

    2015-01-01

    Camera trapping with capture-recapture analyses has provided estimates of the abundances of elusive species over the last two decades. Closed capture-recapture models (CR) based on the recognition of individuals and incorporating natural heterogeneity in capture probabilities are considered robust tools; however, closure assumption is often questionable and the use of an Mh jackknife estimator may fail in estimations of real abundance when the heterogeneity is high and data is sparse. A novel, spatially explicit capture-recapture (SECR) approach based on the location-specific capture histories of individuals overcomes the limitations of closed models. We applied both methods on a closed population of 16 critically endangered Western Derby elands in the fenced 1,060-ha Fathala reserve, Senegal. We analyzed the data from 30 cameras operating during a 66-day sampling period deployed in two densities in grid and line arrays. We captured and identified all 16 individuals in 962 trap-days. Abundances were estimated in the programs CAPTURE (models M0, Mh and Mh Chao) and R, package secr (basic Null and Finite mixture models), and compared with the true population size. We specified 66 days as a threshold in which SECR provides an accurate estimate in all trapping designs within the 7-times divergent density from 0.004 to 0.028 camera trap/ha. Both SECR models showed uniform tendency to overestimate abundance when sampling lasted shorter with no major differences between their outputs. Unlike the closed models, SECR performed well in the line patterns, which indicates promising potential for linear sampling of properly defined habitats of non-territorial and identifiable herbivores in dense wooded savanna conditions. The CR models provided reliable estimates in the grid and we confirmed the advantage of Mh Chao estimator over Mh jackknife when data appeared sparse. We also demonstrated the pooling of trapping occasions with an increase in the capture probabilities, avoiding violation of results. PMID:26334997

  14. Single-pass versus two-pass boat electrofishing for characterizing river fish assemblages: Species richness estimates and sampling distance

    USGS Publications Warehouse

    Meador, M.R.

    2005-01-01

    Determining adequate sampling effort for characterizing fish assemblage structure in nonwadeable rivers remains a critical issue in river biomonitoring. Two-pass boat electrofishing data collected from 500-1,000-m-long river reaches as part of the U.S. Geological Survey's National Water-Quality Assessment (NAWQA) Program were analyzed to assess the efficacy of single-pass boat electrofishing. True fish species richness was estimated by use of a two-pass removal model and nonparametric jackknife estimation for 157 sampled reaches across the United States. Compared with estimates made with a relatively unbiased nonparametric estimator, estimates of true species richness based on the removal model may be biased, particularly when true species richness is greater than 10. Based on jackknife estimation, the mean percent of estimated true species richness collected in the first electrofishing pass (p_{j,s1}) for all 157 reaches was 65.5%. The effectiveness of single-pass boat electrofishing may be greatest when the expected species richness is relatively low (<10 species). The second pass produced additional species (1-13) in 89.2% of sampled reaches. Of these additional species, centrarchids were collected in 50.3% of reaches and cyprinids were collected in 45.9% of reaches. Examination of relations between channel width ratio (reach length divided by wetted channel width) and p_{j,s1} values provided no clear recommendation for sampling distances based on channel width ratios. Increasing sampling effort through an extension of the sampled reach distance can increase the percent species richness obtained from single-pass boat electrofishing. When single-pass boat electrofishing is used to characterize fish assemblage structure, determination of the sampling distance should take into account such factors as species richness and patchiness, the presence of species with relatively low probabilities of detection, and human alterations to the channel.

  15. Pine Hollow Watershed Project : FY 2000 Projects.

    SciTech Connect

    Sherman County Soil and Water Conservation District

    2001-06-01

    The Pine Hollow Project (1999-010-00) is an on-going watershed restoration effort administered by Sherman County Soil and Water Conservation District and spearheaded by Pine Hollow/Jackknife Watershed Council. The headwaters are located near Shaniko in Wasco County, and the mouth is in Sherman County on the John Day River. Pine Hollow provides more than 20 miles of potential summer steelhead spawning and rearing habitat. The watershed is 92,000 acres. Land use is mostly range, with some dryland grain. There are no water rights on Pine Hollow. Due to shallow soils, the watershed is prone to rapid runoff events which scour out the streambed and the riparian vegetation. This project seeks to improve the quality of upland, riparian and in-stream habitat by restoring the natural hydrologic function of the entire watershed. Project implementation to date has consisted of construction of water/sediment control basins, gradient terraces on croplands, pasture cross-fences, upland water sources, and grass seeding on degraded sites, many of which were crop fields in the early part of the century. The project is expected to continue through about 2007. From March 2000 to June 2001, the Pine Hollow Project built 6 sediment basins, 1 cross-fence, 2 spring developments, 1 well development, 1 solar pump, 50 acres of native range seeding and 1 livestock waterline. FY2000 projects were funded by BPA, Oregon Watershed Enhancement Board, US Fish and Wildlife Service and landowners. In-kind services were provided by Sherman County Soil and Water Conservation District, USDA Natural Resources Conservation Service, USDI Bureau of Land Management, Oregon Department of Fish and Wildlife, Pine Hollow/Jackknife Watershed Council, landowners and Wasco County Soil and Water Conservation District.

  16. [Comparative analysis of three length based methods for estimating growth of the tilapia Oreochromis aureus (Perciformes: Cichlidae) in a tropical lake of Mexico].

    PubMed

    Arellano-Torres, Andrés; Hernández Montaño, Daniel; Meléndez Galicia, Carlos

    2013-09-01

    Several methods are now available to estimate fish individual growth based upon the distribution of body lengths in a population. Comparative analyses of length-based methods have been undertaken mainly for marine species; nevertheless, limited information is available for inland species. Tilapia is one of the most important freshwater fisheries and its growth parameters have been estimated by several authors, usually using one length-based method. Thus, the main objectives of this study were: a) to estimate growth parameters of O. aureus from Chapala lake, Mexico, using three length-based methods, ELEFAN, PROJMAT and SLCA; b) to quantify the effect of input data variations on growth parameter estimates by the jackknife technique; and c) to compare the new estimates with those previously reported, through the standard growth index phi. We collected and analyzed a total of 1,973 specimens from commercial landings from January to December 2010. The three length-based methods used in the present study resulted in parameter estimates within the range of those reported in other studies. Results derived from jackknife analysis revealed the lowest values in the error percentage and coefficient of variation for L infinity when applying ELEFAN, while PROJMAT showed the lowest values in the precision estimators for K, which was very similar to ELEFAN. Estimates of the comparative growth index phi were also very similar to those reported for the same species when studied in different reservoirs. Considering our results, we suggest the use of ELEFAN rather than SLCA because of its accuracy in estimating growth parameters for O. aureus. PMID:24044136

  17. Would species richness estimators change the observed species area relationship?

    NASA Astrophysics Data System (ADS)

    Borges, Paulo A. V.; Hortal, Joaquín; Gabriel, Rosalina; Homem, Nídia

    2009-01-01

    We evaluate whether the description of the species area relationship (SAR) can be improved by using richness estimates instead of observed richness values. To do this, we use three independent datasets gathered with standardized survey methods from the native laurisilva forest of the Azorean archipelago, encompassing different distributional extent and biological groups: soil epigean arthropods at eight forest fragments in Terceira Island, canopy arthropods inhabiting Juniperus brevifolia at 16 forest fragments of six different islands, and bryophytes of seven forest fragments from Terceira and Pico islands. Species richness values were estimated for each forest fragment using seven non-parametric estimators (ACE, ICE, Chao1, Chao2, Jackknife1, Jackknife2 and Bootstrap; five in the case of bryophytes). These estimates were fitted to classical log-log species-area curves and the intercept, slope and goodness of fit of these curves were compared with those obtained from the observed species richness values to determine if significant differences appear in these parameters. We hypothesized that the intercepts would be higher in the estimated data sets compared with the observed data, as estimated richness values are typically higher than observed values. We found partial support for the hypothesis - intercepts of the SAR obtained from estimated richness values were significantly higher in the case of epigean arthropods and bryophyte datasets. In contrast, the slope and goodness of fit obtained with estimated values were not significantly different from those obtained from observed species richness in all groups, although a few small differences appeared. We conclude that, although little is gained using these estimators if data come from standardized surveys, their estimations could be used to analyze macroecological relationships with non-standardized observed data, provided that survey incompleteness and/or unevenness are also taken into account.
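
    For readers unfamiliar with the Jackknife1 and Jackknife2 estimators named above, the Python sketch below shows their commonly cited incidence-based forms, where Q1 and Q2 are the numbers of species found in exactly one and exactly two samples and m is the number of samples. The sample data and function names are hypothetical, and the study itself may have used a dedicated software package rather than code of this kind.

```python
from collections import Counter

def incidence_counts(samples):
    """Return (observed richness, uniques Q1, duplicates Q2, number of samples m)."""
    occurrences = Counter(species for sample in samples for species in set(sample))
    q1 = sum(1 for c in occurrences.values() if c == 1)
    q2 = sum(1 for c in occurrences.values() if c == 2)
    return len(occurrences), q1, q2, len(samples)

def jackknife1(samples):
    """First-order jackknife richness: S_obs + Q1 * (m - 1) / m."""
    s_obs, q1, _, m = incidence_counts(samples)
    return s_obs + q1 * (m - 1) / m

def jackknife2(samples):
    """Second-order jackknife richness (needs at least two samples)."""
    s_obs, q1, q2, m = incidence_counts(samples)
    return s_obs + q1 * (2 * m - 3) / m - q2 * (m - 2) ** 2 / (m * (m - 1))

# Toy survey: four samples, species coded by letter (illustrative only).
survey = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "e"}, {"a", "b", "f"}]
print(incidence_counts(survey), jackknife1(survey), jackknife2(survey))
```

    Both estimates exceed the observed richness whenever some species occur in only one sample, which is consistent with the higher intercepts the authors expected for estimated richness values.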

  18. Absence of significant cross-correlation between WMAP and SDSS

    NASA Astrophysics Data System (ADS)

    López-Corredoira, M.; Sylos Labini, F.; Betancort-Rijo, J.

    2010-04-01

    Aims: Several authors have claimed to detect a significant cross-correlation between microwave WMAP anisotropies and the SDSS galaxy distribution. We repeat these analyses to determine the different cross-correlation uncertainties caused by re-sampling errors and field-to-field fluctuations. The first type of error concerns overlapping sky regions, while the second type concerns non-overlapping sky regions. Methods: To measure the re-sampling errors, we use bootstrap and jack-knife techniques. For the field-to-field fluctuations, we use three methods: 1) evaluation of the dispersion in the cross-correlation when correlating separated regions of WMAP with the original region of SDSS; 2) use of mock Monte Carlo WMAP maps; 3) a new method (developed in this article), which measures the error as a function of the integral of the product of the self-correlations for each map. Results: The average cross-correlation for b > 30 deg is significantly stronger than the re-sampling errors - both the jack-knife and bootstrap techniques provide similar results - but it is of the order of the field-to-field fluctuations. This is confirmed by the cross-correlation between anisotropies and galaxies in more than half of the sample being null within re-sampling errors. Conclusions: Re-sampling methods underestimate the errors. Field-to-field fluctuations dominate the detected signals. The ratio of signal to re-sampling errors is larger than unity in a way that strongly depends on the selected sky region. We therefore conclude that there is no evidence yet of a significant detection of the integrated Sachs-Wolfe (ISW) effect. Hence, the value of Ω_Λ ≈ 0.8 obtained by the authors who assumed they were observing the ISW effect would appear to have originated from noise analysis.

  19. Analysis of correlated ROC areas in diagnostic testing.

    PubMed

    Song, H H

    1997-03-01

    This paper focuses on methods of analysis of areas under receiver operating characteristic (ROC) curves. Analysis of ROC areas should incorporate the correlation structure of repeated measurements taken on the same set of cases and the paucity of measurements per treatment resulting from an effective summarization of cases into a few area measures of diagnostic accuracy. The repeated nature of ROC data has been taken into consideration in the analysis methods previously suggested by Swets and Pickett (1982, Evaluation of Diagnostic Systems: Methods from Signal Detection Theory), Hanley and McNeil (1983, Radiology 148, 839-843), and DeLong, DeLong, and Clarke-Pearson (1988, Biometrics 44, 837-845). DeLong et al.'s procedure is extended to a Wald test for general situations of diagnostic testing. The method of analyzing jackknife pseudovalues by treating them as data is extremely useful when the number of area measures to be tested is quite small. The Wald test based on covariances of multivariate multisample U-statistics is compared with two approaches of analyzing pseudovalues, the univariate mixed-model analysis of variance (ANOVA) for repeated measurements and the three-way factorial ANOVA. Monte Carlo simulations demonstrate that the three tests give good approximation to the nominal size at the 5% level for large sample sizes, but the paired t-test using ROC areas as data lacks the power of the other three tests and Hanley and McNeil's method is inappropriate for testing diagnostic accuracies. The Wald statistic performs better than the ANOVAs of pseudovalues. Jackknifing schemes of multiple deletion where different structures of normal and diseased distributions are accounted for appear to perform slightly better than simple multiple-deletion schemes but no appreciable power difference is apparent, and deletion of too many cases at a time may sacrifice power. These methods have important applications in diagnostic testing in ROC studies of radiology and of medicine in general. PMID:9147602
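
    The idea of treating jackknife pseudovalues as data can be sketched for a single ROC area. The example below uses the Mann-Whitney estimate of the area and single-case deletion; it is an illustrative assumption about the setup, not the paper's multiple-deletion schemes, and the function names are hypothetical.

```python
def auc(diseased, normal):
    """Mann-Whitney estimate of the area under the ROC curve (ties count one half)."""
    wins = sum(
        1.0 if d > h else 0.5 if d == h else 0.0
        for d in diseased
        for h in normal
    )
    return wins / (len(diseased) * len(normal))

def pseudovalues(cases):
    """Jackknife pseudovalues of the AUC.

    cases: list of (score, is_diseased) pairs; each group needs at least two cases.
    Pseudovalue i is n * theta - (n - 1) * theta_without_case_i.
    """
    def stat(subset):
        diseased = [s for s, flag in subset if flag]
        normal = [s for s, flag in subset if not flag]
        return auc(diseased, normal)

    n = len(cases)
    full = stat(cases)
    return [n * full - (n - 1) * stat(cases[:i] + cases[i + 1:]) for i in range(n)]
```

    The resulting list has one value per case, so it can be passed to an ordinary ANOVA or t-test routine as if it were raw data.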

  20. Tracing the pathways of neotropical migratory shorebirds using stable isotopes: a pilot study.

    PubMed

    Farmer, A; Rye, R; Landis, G; Bern, C; Kester, C; Ridley, I

    2003-09-01

    We evaluated the potential use of stable isotopes to establish linkages between the wintering grounds and the breeding grounds of the Pectoral Sandpiper (Calidris melanotos), the White-rumped Sandpiper (Calidris fuscicollis), the Baird's Sandpiper (Calidris bairdii), and other Neotropical migratory shorebird species (e.g., Tringa spp.). These species molt their flight feathers on the wintering grounds and hence their flight feathers carry chemical signatures that are characteristic of their winter habitat. The objective of our pilot study was to assess the feasibility of identifying the winter origin of individual birds by: (1) collecting shorebird flight feathers from several widely separated Argentine sites and analyzing these for a suite of stable isotopes; and (2) analyzing the deuterium and ¹⁸O isotope data that were available from precipitation measurement stations in Argentina. Isotopic ratios (δ¹³C, δ¹⁵N and δ³⁴S) of flight feathers were significantly different among three widely separated sites in Argentina during January 2001. In terms of relative importance in separating the sites, δ³⁴S was most important, followed by δ¹⁵N, and then δ¹³C. In the complete discriminant analysis, the classification function correctly predicted group membership in 85% of the cases (jackknifed classification matrix). In a stepwise analysis δ¹³C was dropped from the solution, and site membership was correctly predicted in 92% of cases (jackknifed matrix). Analysis of precipitation data showed that both δD and δ¹⁸O were significantly related to both latitude and longitude on a countrywide scale (p < 0.001). Other variables (month and altitude) explained little additional variation in these isotope ratios. Several issues were identified that will likely constrain the degree of accuracy one can expect in predicting the geographic origin of birds from Argentina. There was unexplained variation in isotope ratios within and among the different wing feathers from individual birds. Such variation may indicate that birds are not faithful to a local site during their winter stay in Argentina. There was significant interannual variation in the δD and δ¹⁸O of precipitation. Hence, specific locations may not have a constant signature for some isotopes. Moreover, the fractionation that occurs in wetlands due to evaporation significantly skews local δD and δ¹⁸O values, which may undermine the strong large-scale gradients seen in the precipitation data. We are continuing the research with universities in Argentina with a focus on expanding the breadth of feather collection and attempting to resolve the identified issues. PMID:14521278
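
    A jackknifed classification matrix is built by classifying each sample with a rule fitted to all the other samples. The sketch below shows the bookkeeping only. As a stated simplification it swaps the discriminant function for a nearest-centroid rule, and all names are hypothetical.

```python
def nearest_centroid_predict(train, point):
    """Assign point to the group whose feature centroid is closest (squared Euclidean)."""
    groups = {}
    for features, label in train:
        groups.setdefault(label, []).append(features)
    best_label, best_dist = None, float("inf")
    for label, rows in groups.items():
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(point, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def jackknifed_matrix(data):
    """Leave-one-out classification counts keyed by (true_label, predicted_label)."""
    matrix = {}
    for i, (features, label) in enumerate(data):
        held_out_training = data[:i] + data[i + 1:]   # fit without sample i
        predicted = nearest_centroid_predict(held_out_training, features)
        matrix[(label, predicted)] = matrix.get((label, predicted), 0) + 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal of the classification matrix."""
    correct = sum(v for (true, pred), v in matrix.items() if true == pred)
    return correct / sum(matrix.values())
```

    Percentages such as the 85% and 92% quoted above correspond to the diagonal fraction of a matrix of this kind, with isotope ratios as the features and collection sites as the labels.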

  1. Hierarchical Bayes estimation of species richness and occupancy in spatially replicated surveys

    USGS Publications Warehouse

    Kery, M.; Royle, J. Andrew

    2008-01-01

    1. Species richness is the most widely used biodiversity metric, but cannot be observed directly as, typically, some species are overlooked. Imperfect detectability must therefore be accounted for to obtain unbiased species-richness estimates. When richness is assessed at multiple sites, two approaches can be used to estimate species richness: either estimating for each site separately, or pooling all samples. The first approach produces imprecise estimates, while the second loses site-specific information. 2. In contrast, a hierarchical Bayes (HB) multispecies site-occupancy model benefits from the combination of information across sites without losing site-specific information and also yields occupancy estimates for each species. The heart of the model is an estimate of the incompletely observed presence-absence matrix, a centrepiece of biogeography and monitoring studies. We illustrate the model using Swiss breeding bird survey data, and compare its estimates with the widely used jackknife species-richness estimator and raw species counts. 3. Two independent observers each conducted three surveys in 26 1-km² quadrats, and detected 27-56 (total 103) species. The average estimated proportion of species detected after three surveys was 0.87 under the HB model. Jackknife estimates were less precise (less repeatable between observers) than raw counts, but HB estimates were as repeatable as raw counts. The combination of information in the HB model thus resulted in species-richness estimates presumably at least as unbiased as previous approaches that correct for detectability, but without costs in precision relative to uncorrected, biased species counts. 4. Total species richness in the entire region sampled was estimated at 113.1 (CI 106-123); species detectability ranged from 0.08 to 0.99, illustrating very heterogeneous species detectability; and species occupancy was 0.06-0.96. Even after six surveys, absolute bias in observed occupancy was estimated at up to 0.40. 5. Synthesis and applications. The HB model for species-richness estimation combines information across sites and enjoys more precise, and presumably less biased, estimates than previous approaches. It also yields estimates of several measures of community size and composition. Covariates for occupancy and detectability can be included. We believe it has considerable potential for monitoring programmes as well as in biogeography and community ecology.

  2. Differentiating prenatal exposure to methamphetamine and alcohol versus alcohol and not methamphetamine using tensor based brain morphometry and discriminant analysis

    PubMed Central

    Sowell, Elizabeth R.; Leow, Alex D.; Bookheimer, Susan Y.; Smith, Lynne M.; O'Connor, Mary J.; Kan, Eric; Rosso, Carly; Houston, Suzanne; Dinov, Ivo D.; Thompson, Paul M.

    2010-01-01

    Here we investigate the effects of prenatal exposure to methamphetamine (MA) on local brain volume using magnetic resonance imaging. Because many who use MA during pregnancy also use alcohol, a known teratogen, we examined whether local brain volumes differed among 61 children (ages 5 to 15), 21 with prenatal MA exposure, 18 with concomitant prenatal alcohol exposure (the MAA group), 13 with heavy prenatal alcohol but not MA exposure (ALC group), and 27 unexposed controls (CON group). Volume reductions were observed in both exposure groups relative to controls in striatal and thalamic regions bilaterally, and right prefrontal and left occipitoparietal cortices. Striatal volume reductions were more severe in the MAA group than in the ALC group, and within the MAA group, a negative correlation between full-scale IQ (FSIQ) scores and caudate volume was observed. Limbic structures including the anterior and posterior cingulate, the inferior frontal gyrus (IFG) and ventral and lateral temporal lobes bilaterally were increased in volume in both exposure groups. Further, cingulate and right IFG volume increases were more pronounced in the MAA than ALC group. Discriminant function analyses using local volume measurements and FSIQ were used to predict group membership, yielding factor scores that correctly classified 72% of participants in jackknife analyses. These findings suggest that striatal and limbic structures, known to be sites of neurotoxicity in adult MA abusers, may be more vulnerable to prenatal MA exposure than alcohol exposure, and that more severe striatal damage is associated with more severe cognitive deficit. PMID:20237258

  3. Comparison of mapping approaches of design annual maximum daily precipitation

    NASA Astrophysics Data System (ADS)

    Szolgay, J.; Parajka, J.; Kohnová, S.; Hlavčová, K.

    2009-05-01

    In this study 2-year and 100-year annual maximum daily precipitation for rainfall-runoff studies and estimating flood hazard were mapped. The daily precipitation measurements at 23 climate stations from 1961-2000 were used in the upper Hron basin in central Slovakia. The choice of data preprocessing and interpolation methods was guided by their practical applicability and acceptance in the engineering hydrologic community. The main objective was to discuss the quality and properties of maps of design precipitation with a given return period with respect to the expectations of the end user. Four approaches to the preprocessing of annual maximum 24-hour precipitation data were used, and three interpolation methods employed. The first approach is the direct mapping of at-site estimates of distribution function quantiles; the second is the direct mapping of local estimates of the three parameters of the GEV distribution. In the third, the daily precipitation totals were interpolated into a regular grid network, and then the time series of the maximum daily precipitation totals in each grid point of the selected region were statistically analysed. In the fourth, the spatial distribution of the design precipitation was modeled by quantiles predicted by regional precipitation frequency analysis using the Hosking and Wallis procedure. The three interpolation methods used were the inverse distance weighting, nearest neighbor and the kriging method. Visual inspection and jackknife cross-validation were used to compare the combination of approaches.
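
    Jackknife cross-validation of an interpolation method can be sketched as follows: each station is withheld in turn, its value is predicted from the remaining stations, and the errors are summarised. The example uses inverse distance weighting, one of the three methods named above. The power parameter, the RMSE summary and the function names are assumptions for illustration, not details from the study.

```python
import math

def idw(stations, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    stations: list of (x, y, value) tuples.
    """
    num = den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value            # exactly on a station: return its value
        w = d ** -power
        num += w * value
        den += w
    return num / den

def jackknife_rmse(stations, power=2.0):
    """Root-mean-square error when each station is predicted from all the others."""
    errors = []
    for i, (sx, sy, value) in enumerate(stations):
        rest = stations[:i] + stations[i + 1:]
        errors.append(idw(rest, sx, sy, power) - value)
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

    Running the same loop with a different interpolator in place of `idw` gives comparable error scores for each combination of preprocessing approach and interpolation method.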

  4. Predicting DNA binding proteins using support vector machine with hybrid fractal features.

    PubMed

    Niu, Xiao-Hui; Hu, Xue-Hai; Shi, Feng; Xia, Jing-Bo

    2014-02-21

    DNA-binding proteins play a vitally important role in many biological processes. Prediction of DNA-binding proteins from amino acid sequence is a significant but still unresolved scientific problem. Chaos game representation (CGR) investigates the patterns hidden in protein sequences, and visually reveals previously unknown structure. Fractal dimensions (FD) are good tools to measure sizes of complex, highly irregular geometric objects. In order to extract the intrinsic correlation with DNA-binding property from protein sequences, the CGR algorithm, fractal dimension and amino acid composition are applied to formulate the numerical features of protein samples in this paper. Seven groups of features are extracted, which can be computed directly from the primary sequence, and each group is evaluated by the 10-fold cross-validation test and the jackknife test. Comparing the results of numerical experiments, the group of amino acid composition and fractal dimension (21-dimension vector) gives the best result, with an average accuracy of 81.82% and an average Matthews correlation coefficient (MCC) of 0.6017. The resulting predictor is also compared with the existing method DNA-Prot and shows better performance. PMID:24189096
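
    The two evaluation schemes and the MCC score mentioned above can be sketched briefly. The jackknife test is the special case of k-fold cross-validation with one sample per fold. The interleaved fold assignment below is an arbitrary choice for the example, and the function names are hypothetical.

```python
import math

def folds(n, k):
    """Split indices 0..n-1 into k test folds; k == n gives the jackknife test."""
    return [list(range(i, n, k)) for i in range(k)]

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

    MCC ranges from -1 to 1 and stays informative when the two classes are unbalanced, which is why it is often reported alongside accuracy.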

  5. Prediction of Protein Structural Class Based on Gapped-Dipeptides and a Recursive Feature Selection Approach

    PubMed Central

    Liu, Taigang; Qin, Yufang; Wang, Yongjie; Wang, Chunhua

    2015-01-01

    The prior knowledge of protein structural class may offer useful clues on understanding its functionality as well as its tertiary structure. Though various significant efforts have been made to find a fast and effective computational approach to address this problem, it is still a challenging topic in the field of bioinformatics. The position-specific score matrix (PSSM) profile has been shown to provide a useful source of information for improving the prediction performance of protein structural class. However, this information has not been adequately explored. To this end, in this study, we present a feature extraction technique which is based on gapped-dipeptides composition computed directly from PSSM. Then, a careful feature selection technique is performed based on support vector machine-recursive feature elimination (SVM-RFE). These optimal features are selected to construct a final predictor. The results of jackknife tests on four working datasets show that our method obtains satisfactory prediction accuracies by extracting features solely based on PSSM and could serve as a very promising tool to predict protein structural class. PMID:26712737

  6. North American Tropical Cyclone Landfall and SST: A Statistical Model Study

    NASA Technical Reports Server (NTRS)

    Hall, Timothy; Yonekura, Emmi

    2013-01-01

    A statistical-stochastic model of the complete life cycle of North Atlantic (NA) tropical cyclones (TCs) is used to examine the relationship between climate and landfall rates along the North American Atlantic and Gulf Coasts. The model draws on archived data of TCs throughout the North Atlantic to estimate landfall rates at high geographic resolution as a function of the ENSO state and one of two different measures of sea surface temperature (SST): 1) SST averaged over the NA subtropics and the hurricane season and 2) this SST relative to the seasonal global subtropical mean SST (termed relSST). Here, the authors focus on SST by holding ENSO to a neutral state. Jackknife uncertainty tests are employed to test the significance of SST and relSST landfall relationships. There are more TC and major hurricane landfalls overall in warm years than cold, using either SST or relSST, primarily due to a basinwide increase in the number of storms. The signal along the coast, however, is complex. Some regions have large and significant sensitivity (e.g., an approximate doubling of annual major hurricane landfall probability on Texas from -2 to +2 standard deviations in relSST), while other regions have no significant sensitivity (e.g., the U.S. mid-Atlantic and Northeast coasts). This geographic structure is due to both shifts in the regions of primary TC genesis and shifts in TC propagation.

  7. BICEP2/Keck Array V: Measurements of B-mode Polarization at Degree Angular Scales and 150 GHz by the Keck Array

    NASA Astrophysics Data System (ADS)

    BICEP2 and Keck Array Collaborations; Ade, P. A. R.; Ahmed, Z.; Aikin, R. W.; Alexander, K. D.; Barkats, D.; Benton, S. J.; Bischoff, C. A.; Bock, J. J.; Brevik, J. A.; Buder, I.; Bullock, E.; Buza, V.; Connors, J.; Crill, B. P.; Dowell, C. D.; Dvorkin, C.; Duband, L.; Filippini, J. P.; Fliescher, S.; Golwala, S. R.; Halpern, M.; Harrison, S.; Hasselfield, M.; Hildebrandt, S. R.; Hilton, G. C.; Hristov, V. V.; Hui, H.; Irwin, K. D.; Karkare, K. S.; Kaufman, J. P.; Keating, B. G.; Kefeli, S.; Kernasovskiy, S. A.; Kovac, J. M.; Kuo, C. L.; Leitch, E. M.; Lueker, M.; Mason, P.; Megerian, K. G.; Netterfield, C. B.; Nguyen, H. T.; O'Brient, R.; Ogburn, R. W., IV; Orlando, A.; Pryke, C.; Reintsema, C. D.; Richter, S.; Schwarz, R.; Sheehy, C. D.; Staniszewski, Z. K.; Sudiwala, R. V.; Teply, G. P.; Thompson, K. L.; Tolan, J. E.; Turner, A. D.; Vieregg, A. G.; Weber, A. C.; Willmert, J.; Wong, C. L.; Yoon, K. W.

    2015-10-01

    The Keck Array is a system of cosmic microwave background polarimeters, each similar to the BICEP2 experiment. In this paper we report results from the 2012 to 2013 observing seasons, during which the Keck Array consisted of five receivers all operating in the same (150 GHz) frequency band and observing field as BICEP2. We again find an excess of B-mode power over the lensed-ΛCDM expectation of >5σ in the range 30 < ℓ < 150 and confirm that this is not due to systematics using jackknife tests and simulations based on detailed calibration measurements. In map difference and spectral difference tests these new data are shown to be consistent with BICEP2. Finally, we combine the maps from the two experiments to produce final Q and U maps which have a depth of 57 nK deg (3.4 μK arcmin) over an effective area of 400 deg² for an equivalent survey weight of 250,000 μK⁻². The final BB band powers have noise uncertainty a factor of 2.3 times better than the previous results, and a significance of detection of excess power of >6σ.

  8. Driver assistance system for passive multi-trailer vehicles with haptic steering limitations on the leading unit.

    PubMed

    Morales, Jesús; Mandow, Anthony; Martínez, Jorge L; Reina, Antonio J; García-Cerezo, Alfonso

    2013-01-01

    Driving vehicles with one or more passive trailers has difficulties in both forward and backward motion due to inter-unit collisions, jackknife, and lack of visibility. Consequently, advanced driver assistance systems (ADAS) for multi-trailer combinations can be beneficial to accident avoidance as well as to driver comfort. The ADAS proposed in this paper aims to prevent unsafe steering commands by means of a haptic handwheel. Furthermore, when driving in reverse, the steering-wheel and pedals can be used as if the vehicle was driven from the back of the last trailer with visual aid from a rear-view camera. This solution, which can be implemented in drive-by-wire vehicles with hitch angle sensors, profits from two methods previously developed by the authors: safe steering by applying a curvature limitation to the leading unit, and a virtual tractor concept for backward motion that includes the complex case of set-point propagation through on-axle hitches. The paper addresses system requirements and provides implementation details to tele-operate two different off- and on-axle combinations of a tracked mobile robot pulling and pushing two dissimilar trailers. PMID:23552102

  9. MEASUREMENT OF COSMIC MICROWAVE BACKGROUND POLARIZATION POWER SPECTRA FROM TWO YEARS OF BICEP DATA

    SciTech Connect

    Chiang, H. C.; Barkats, D.; Bock, J. J.; Hristov, V. V.; Jones, W. C.; Kovac, J. M.; Lange, A. E.; Mason, P. V.; Matsumura, T.; Ade, P. A. R.; Battle, J. O.; Dowell, C. D.; Nguyen, H. T.; Bierman, E. M.; Keating, B. G.; Duband, L.; Hivon, E. F.; Holzapfel, W. L.; Kuo, C. L.; Leitch, E. M.

    2010-03-10

    Background Imaging of Cosmic Extragalactic Polarization (BICEP) is a bolometric polarimeter designed to measure the inflationary B-mode polarization of the cosmic microwave background (CMB) at degree angular scales. During three seasons of observing at the South Pole (2006 through 2008), BICEP mapped ~2% of the sky chosen to be uniquely clean of polarized foreground emission. Here, we present initial results derived from a subset of the data acquired during the first two years. We present maps of temperature, Stokes Q and U, E and B modes, and associated angular power spectra. We demonstrate that the polarization data are self-consistent by performing a series of jackknife tests. We study potential systematic errors in detail and show that they are sub-dominant to the statistical errors. We measure the E-mode angular power spectrum with high precision at 21 ≤ ℓ ≤ 335, detecting for the first time the peak expected at ℓ ~ 140. The measured E-mode spectrum is consistent with expectations from a ΛCDM model, and the B-mode spectrum is consistent with zero. The tensor-to-scalar ratio derived from the B-mode spectrum is r = 0.02^{+0.31}_{-0.26}, or r < 0.72 at 95% confidence, the first meaningful constraint on the inflationary gravitational wave background to come directly from CMB B-mode polarization.

  10. Two multi-classification strategies used on SVM to predict protein structural classes by using auto covariance.

    PubMed

    Wu, Jiang; Li, Yi-Zhou; Li, Meng-Long; Yu, Le-Zheng

    2009-12-01

    Machine learning methods play a very important role in protein secondary structure prediction and related work. For a given approach, prediction quality depends mostly on how protein sequences are represented as numeric features. In this paper, two Support Vector Machine (SVM) multi-classification strategies, "one-against-one" (1-a-1) and "one-against-all" (1-a-a), were used in protein structural class identification. Auto covariance (AC), which transforms the physicochemical properties of the amino acids of the proteins into a data matrix, focuses on the neighboring effects and the interactions between residues in protein sequences. The "1-a-1" approach was used on SVM to predict protein structural classes and obtained a very promising overall accuracy of 90.69% in the jackknife test. This was more than 10% higher than the accuracy obtained by using "1-a-a". Experimental results led to the finding that the SVM predictor constructed by "1-a-1" can avoid the appearance of biased prediction accuracy. The current method, using the protein primary sequence information described by auto covariance (AC) and the "1-a-1" approach on SVM, should play an important complementary role in other related applications. PMID:20640811
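
    The auto covariance transform can be sketched for a single physicochemical property. The abstract does not give the formula, so the code below assumes a commonly used definition: the lag-separated product of mean-centred property values, averaged over the sequence. The paper's exact normalisation may differ, and the function name is hypothetical.

```python
def auto_covariance(values, max_lag):
    """Auto covariance of one property profile along a sequence.

    values: per-residue property values (one number per amino acid).
    Returns a list with one entry per lag from 1 to max_lag.
    Assumes max_lag < len(values).
    """
    n = len(values)
    mean = sum(values) / n
    return [
        sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)) / (n - lag)
        for lag in range(1, max_lag + 1)
    ]
```

    Applying this to several properties and concatenating the results yields a fixed-length feature vector regardless of sequence length, which is what makes it usable as SVM input.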

  11. Shrinkage estimation of the power spectrum covariance matrix

    NASA Astrophysics Data System (ADS)

    Pope, Adrian C.; Szapudi, István

    2008-09-01

    We seek to improve estimates of the power spectrum covariance matrix from a limited number of simulations by employing a novel statistical technique known as shrinkage estimation. The shrinkage technique optimally combines an empirical estimate of the covariance with a model (the target) to minimize the total mean squared error compared to the true underlying covariance. We test this technique on N-body simulations and evaluate its performance by estimating cosmological parameters. Using a simple diagonal target, we show that the shrinkage estimator significantly outperforms both the empirical covariance and the target individually when using a small number of simulations. We find that reducing noise in the covariance estimate is essential for properly estimating the values of cosmological parameters as well as their confidence intervals. We extend our method to the jackknife covariance estimator and again find significant improvement, though simulations give better results. Even for thousands of simulations we still find evidence that our method improves estimation of the covariance matrix. Because our method is simple, requires negligible additional numerical effort, and produces superior results, we always advocate shrinkage estimation for the covariance of the power spectrum and other large-scale structure measurements when purely theoretical modelling of the covariance is insufficient.
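
    The combination step of shrinkage estimation with a simple diagonal target can be sketched as a weighted average of the empirical covariance and its own diagonal. In the paper the shrinkage intensity is chosen to minimise mean squared error; here it is simply passed in as `lam`, which is an assumption made to keep the example short. The function name is hypothetical.

```python
def shrink_covariance(sample_cov, lam):
    """Return lam * T + (1 - lam) * S, where T is the diagonal of S.

    sample_cov: square covariance matrix as a list of lists.
    lam: shrinkage intensity in [0, 1]; 0 keeps S, 1 keeps only the diagonal.
    """
    n = len(sample_cov)
    shrunk = []
    for i in range(n):
        row = []
        for j in range(n):
            target = sample_cov[i][j] if i == j else 0.0   # diagonal target
            row.append(lam * target + (1.0 - lam) * sample_cov[i][j])
        shrunk.append(row)
    return shrunk
```

    With this target the variances are left unchanged and only the noisy off-diagonal terms are pulled toward zero.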

  12. Transanal Minimally Invasive Surgery (TAMIS) to Treat Vesicorectal Fistula: A New Approach

    PubMed Central

    Tobias-Machado, Marcos; Mattos, Pablo Aloisio Lima; Reis, Leonardo Oliveira; Juliano, César Augusto Braz; Pompeo, Antonio Carlos Lima

    2015-01-01

    ABSTRACT Purpose: Vesicorectal fistula is one of the most devastating postoperative complications after radical prostatectomy. Definitive treatment is difficult due to morbidity and recurrence. Despite many options, there is no unanimously accepted approach. This article aimed to report a new minimally invasive approach as an option to reconstructive surgery. Materials and Methods: We report on Transanal Minimally Invasive Surgery (TAMIS) with miniLap devices for instrumentation in a 65-year-old patient presenting with vesicorectal fistula after radical prostatectomy. We used the Alexis® device for transanal access and 3, 5 and 11 mm triangulated ports for the procedure. The surgical steps were as follows: cystoscopy and implant of guide wire through fistula; patient in jack-knife position; transanal access; identification of the fistula; dissection; vesical wall closure; injection of fibrin glue in defect; rectal wall closure. Results: The operative time was 240 minutes, with 120 minutes for reconstruction. No perioperative complications or conversion were observed. Hospital stay was two days and catheters were removed at four weeks. No recurrence was observed. Conclusions: This approach has low morbidity and is feasible. The main difficulties consisted of maintaining luminal dilation, instrumental manipulation and suturing. PMID:26689530

  13. DephosSite: a machine learning approach for discovering phosphotase-specific dephosphorylation sites

    PubMed Central

    Wang, Xiaofeng; Yan, Renxiang; Song, Jiangning

    2016-01-01

    Protein dephosphorylation, which is an inverse process of phosphorylation, plays a crucial role in a myriad of cellular processes, including mitotic cycle, proliferation, differentiation, and cell growth. Compared with tyrosine kinase substrate and phosphorylation site prediction, there is a paucity of studies focusing on computational methods of predicting protein tyrosine phosphatase substrates and dephosphorylation sites. In this work, we developed two elegant models for predicting the substrate dephosphorylation sites of three specific phosphatases, namely, PTP1B, SHP-1, and SHP-2. The first predictor is called MGPS-DEPHOS, which is modified from the GPS (Group-based Prediction System) algorithm with an interpretable capability. The second predictor is called CKSAAP-DEPHOS, which is built through the combination of support vector machine (SVM) and the composition of k-spaced amino acid pairs (CKSAAP) encoding scheme. Benchmarking experiments using jackknife cross validation and 30 repeats of 5-fold cross validation tests show that MGPS-DEPHOS and CKSAAP-DEPHOS achieved AUC values of 0.921, 0.914 and 0.912, for predicting dephosphorylation sites of the three phosphatases PTP1B, SHP-1, and SHP-2, respectively. Both methods outperformed the previously developed kNN-DEPHOS algorithm. In addition, a web server implementing our algorithms is publicly available at http://genomics.fzu.edu.cn/dephossite/ for the research community. PMID:27002216
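
    The CKSAAP encoding scheme named above can be sketched as follows: for each gap k, count every pair of residues separated by k positions and divide by the number of such pairs in the sequence. The maximum gap `k_max` and the handling of non-standard residues are assumptions for this example, since the abstract does not specify them.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def cksaap(sequence, k_max=3):
    """Composition of k-spaced amino acid pairs for k = 0..k_max.

    Returns 400 * (k_max + 1) frequencies, ordered by k and then by pair
    (AA, AC, ..., YY). Pairs containing non-standard residues are skipped.
    """
    features = []
    for k in range(k_max + 1):
        counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
        total = len(sequence) - k - 1          # number of k-spaced pairs
        for i in range(total):
            pair = sequence[i] + sequence[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        features.extend(c / total if total > 0 else 0.0 for c in counts.values())
    return features
```

    Each sequence window around a candidate site is thus mapped to a fixed-length numeric vector that an SVM can consume.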

  14. Human DNA Ligase III Recognizes DNA Ends by Dynamic Switching between Two DNA-Bound States

    SciTech Connect

    Cotner-Gohara, Elizabeth; Kim, In-Kwon; Hammel, Michal; Tainer, John A.; Tomkinson, Alan E.; Ellenberger, Tom

    2010-09-13

    Human DNA ligase III has essential functions in nuclear and mitochondrial DNA replication and repair and contains a PARP-like zinc finger (ZnF) that increases the extent of DNA nick joining and intermolecular DNA ligation, yet the bases for ligase III specificity and structural variation among human ligases are not understood. Here combined crystal structure and small-angle X-ray scattering results reveal dynamic switching between two nick-binding components of ligase III: the ZnF-DNA binding domain (DBD) forms a crescent-shaped surface used for DNA end recognition which switches to a ring formed by the nucleotidyl transferase (NTase) and OB-fold (OBD) domains for catalysis. Structural and mutational analyses indicate that high flexibility and distinct DNA binding domain features in ligase III assist both nick sensing and the transition from nick sensing by the ZnF to nick joining by the catalytic core. The collective results support a 'jackknife model' in which the ZnF loads ligase III onto nicked DNA and conformational changes deliver DNA into the active site. This work has implications for the biological specificity of DNA ligases and functions of PARP-like zinc fingers.

  15. Non-coding RNA identification based on topology secondary structure and reading frame in organelle genome level.

    PubMed

    Wu, Cheng-Yan; Li, Qian-Zhong; Feng, Zhen-Xing

    2016-01-01

    Non-coding RNA (ncRNA) genes produce transcripts just as protein-coding genes do, but ncRNAs function directly as RNAs rather than serving as blueprints for proteins. As the function of ncRNA is closely related to organelle genomes, it is desirable to explore ncRNA function by confirming its provenance. In this paper, the topology secondary structure, motif and the triplets under three reading frames are considered as parameters of ncRNAs. A method of SVM combining the increment of diversity (ID) algorithm is applied to construct the classifier. When the method is applied to the ncRNA dataset with less than 80% sequence identity, the overall accuracies reach 95.57% and 96.40% in the five-fold cross-validation and the jackknife test, respectively. Further, for the independent testing dataset, the average prediction success rate of our method reached 93.24%. The higher predictive success rates indicate that our method is very helpful for distinguishing ncRNAs from various organelle genomes. PMID:26697761

  16. DephosSite: a machine learning approach for discovering phosphotase-specific dephosphorylation sites.

    PubMed

    Wang, Xiaofeng; Yan, Renxiang; Song, Jiangning

    2016-01-01

    Protein dephosphorylation, which is an inverse process of phosphorylation, plays a crucial role in a myriad of cellular processes, including mitotic cycle, proliferation, differentiation, and cell growth. Compared with tyrosine kinase substrate and phosphorylation site prediction, there is a paucity of studies focusing on computational methods of predicting protein tyrosine phosphatase substrates and dephosphorylation sites. In this work, we developed two elegant models for predicting the substrate dephosphorylation sites of three specific phosphatases, namely, PTP1B, SHP-1, and SHP-2. The first predictor is called MGPS-DEPHOS, which is modified from the GPS (Group-based Prediction System) algorithm with an interpretable capability. The second predictor is called CKSAAP-DEPHOS, which is built through the combination of support vector machine (SVM) and the composition of k-spaced amino acid pairs (CKSAAP) encoding scheme. Benchmarking experiments using jackknife cross validation and 30 repeats of 5-fold cross validation tests show that MGPS-DEPHOS and CKSAAP-DEPHOS achieved AUC values of 0.921, 0.914 and 0.912, for predicting dephosphorylation sites of the three phosphatases PTP1B, SHP-1, and SHP-2, respectively. Both methods outperformed the previously developed kNN-DEPHOS algorithm. In addition, a web server implementing our algorithms is publicly available at http://genomics.fzu.edu.cn/dephossite/ for the research community. PMID:27002216

  17. Orthology inference in nonmodel organisms using transcriptomes and low-coverage genomes: improving accuracy and matrix occupancy for phylogenomics.

    PubMed

    Yang, Ya; Smith, Stephen A

    2014-11-01

    Orthology inference is central to phylogenomic analyses. Phylogenomic data sets commonly include transcriptomes and low-coverage genomes that are incomplete and contain errors and isoforms. These properties can severely violate the underlying assumptions of orthology inference with existing heuristics. We present a procedure that uses phylogenies for both homology and orthology assignment. The procedure first uses similarity scores to infer putative homologs that are then aligned, constructed into phylogenies, and pruned of spurious branches caused by deep paralogs, misassembly, frameshifts, or recombination. These final homologs are then used to identify orthologs. We explore four alternative tree-based orthology inference approaches, of which two are new. These accommodate gene and genome duplications as well as gene tree discordance. We demonstrate these methods in three published data sets including the grape family, Hymenoptera, and millipedes with divergence times ranging from approximately 100 to over 400 Ma. The procedure significantly increased the completeness and accuracy of the inferred homologs and orthologs. We also found that data sets that are more recently diverged and/or include more high-coverage genomes had more complete sets of orthologs. To explicitly evaluate sources of conflicting phylogenetic signals, we applied serial jackknife analyses of gene regions keeping each locus intact. The methods described here can scale to over 100 taxa. They have been implemented in python with independent scripts for each step, making it easy to modify or incorporate them into existing pipelines. All scripts are available from https://bitbucket.org/yangya/phylogenomic_dataset_construction. PMID:25158799

  18. Isotopic and Elemental Composition of Roasted Coffee as a Guide to Authenticity and Origin.

    PubMed

    Carter, James F; Yates, Hans S A; Tinggi, Ujang

    2015-06-24

    This study presents the stable isotopic and elemental compositions of single-origin, roasted coffees available to retail consumers. The δ(13)C, δ(15)N, and δ(18)O compositions were in agreement with those previously reported for green coffee beans. The δ(15)N composition was seen to be related to organic cultivation, reflected in both δ(2)H and δ(18)O compositions. The δ(13)C composition of extracted caffeine differed little from that of the bulk coffee. Stepwise discriminant analysis with jackknife tests, using isotopic and elemental data, provided up to 77% correct classification of regions of production. Samples from Africa and India were readily classified. The wide range in both isotopic and elemental compositions of samples from other regions, specifically Central/South America, resulted in poor discrimination between or within these regions. Simpler X-Y and geo-spatial plots of the isotopic data provided effective visual means to distinguish between coffees from different regions. PMID:26001050

  19. COMDYN: Software to study the dynamics of animal communities using a capture-recapture approach

    USGS Publications Warehouse

    Hines, J.E.; Boulinier, T.; Nichols, J.D.; Sauer, J.R.; Pollock, K.H.

    1999-01-01

    COMDYN is a set of programs developed for estimation of parameters associated with community dynamics using count data from two locations or time periods. It is Internet-based, allowing remote users either to input their own data, or to use data from the North American Breeding Bird Survey for analysis. COMDYN allows probability of detection to vary among species and among locations and time periods. The basic estimator for species richness underlying all estimators is the jackknife estimator proposed by Burnham and Overton. Estimators are presented for quantities associated with temporal change in species richness, including rate of change in species richness over time, local extinction probability, local species turnover and number of local colonizing species. Estimators are also presented for quantities associated with spatial variation in species richness, including relative richness at two locations and proportion of species present in one location that are also present at a second location. Application of the estimators to species richness estimation has been previously described and justified. The potential applications of these programs are discussed.
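
    As a rough illustration of the jackknife richness estimator mentioned above, the sketch below computes the first-order form, S = S_obs + f1 (k - 1) / k, where f1 is the number of species detected on exactly one of k sampling occasions. Only the first-order estimator is shown; how COMDYN selects among jackknife orders is not described in the abstract, and the function name and counts are hypothetical.

```python
def first_order_jackknife_richness(detection_counts, n_occasions):
    """First-order jackknife estimate of species richness.

    detection_counts: for each species, the number of sampling occasions
    on which it was detected (illustrative input format).
    """
    observed = [c for c in detection_counts if c > 0]
    s_obs = len(observed)                       # species seen at least once
    f1 = sum(1 for c in observed if c == 1)     # species seen exactly once
    k = n_occasions
    return s_obs + f1 * (k - 1) / k

# Made-up detection counts for seven species over five occasions.
counts = [5, 3, 1, 1, 2, 1, 4]
estimate = first_order_jackknife_richness(counts, n_occasions=5)
```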

  20. Investigating different similarity measures for a case-based reasoning classifier to predict breast cancer

    NASA Astrophysics Data System (ADS)

    Bilska-Wolak, Anna O.; Floyd, Carey E., Jr.

    2001-07-01

    This paper investigates the effects of using different similarity measures for a case-based reasoning (CBR) classifier to predict breast cancer. The CBR classifier used a mammographer's BI-RADS™ description of a lesion to predict breast biopsy outcome. The classifier compared the case to be examined to a reference collection of cases and identified those that were similar. The decision variable was formed as the ratio of similar cases that were malignant to all similar cases. A reference collection of 1027 biopsy-proven cases from Duke University Medical Center was used as input. Both Euclidean and Hamming distance measures were compared using all possible combinations of nine BI-RADS™ features and age. Performance was evaluated using jackknife sampling and ROC analysis. For all combinations of features, it was found that Euclidean distance measure produced greater ROC areas and partial ROC areas than Hamming. The differences were significant at an alpha level of 0.05. The greatest ROC area of 0.82 ± 0.01 was generated using six of the features and Euclidean distance measure. The results of both distance measures yielded greater ROC areas than previously reported values and were similar to results generated with an Artificial Neural Network using 10 features.
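
    The decision variable described in this abstract (malignant similar cases divided by all similar cases) might be sketched as follows. The distance threshold used to define "similar", the function names, and the tiny reference set are assumptions made for illustration; the abstract does not say how similarity was thresholded.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hamming(a, b):
    # Number of features whose values differ.
    return sum(x != y for x, y in zip(a, b))

def cbr_decision(query, references, distance, threshold):
    """Fraction of similar reference cases that were malignant.

    references: list of (features, is_malignant) pairs.
    Returns None when no reference case lies within the threshold.
    """
    similar = [malignant for features, malignant in references
               if distance(query, features) <= threshold]
    if not similar:
        return None
    return sum(similar) / len(similar)

# Made-up coded features; not real BI-RADS data.
references = [((1, 2, 3), True), ((1, 2, 6), False),
              ((1, 3, 3), True), ((4, 4, 4), False)]
query = (1, 2, 3)
by_hamming = cbr_decision(query, references, hamming, threshold=1)
by_euclidean = cbr_decision(query, references, euclidean, threshold=1.0)
```

    Under jackknife sampling, the case being examined would be removed from the reference collection before this ratio is computed.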

  1. Driver Assistance System for Passive Multi-Trailer Vehicles with Haptic Steering Limitations on the Leading Unit

    PubMed Central

    Morales, Jesús; Mandow, Anthony; Martínez, Jorge L.; Reina, Antonio J.; García-Cerezo, Alfonso

    2013-01-01

    Driving vehicles with one or more passive trailers has difficulties in both forward and backward motion due to inter-unit collisions, jackknife, and lack of visibility. Consequently, advanced driver assistance systems (ADAS) for multi-trailer combinations can be beneficial to accident avoidance as well as to driver comfort. The ADAS proposed in this paper aims to prevent unsafe steering commands by means of a haptic handwheel. Furthermore, when driving in reverse, the steering-wheel and pedals can be used as if the vehicle was driven from the back of the last trailer with visual aid from a rear-view camera. This solution, which can be implemented in drive-by-wire vehicles with hitch angle sensors, profits from two methods previously developed by the authors: safe steering by applying a curvature limitation to the leading unit, and a virtual tractor concept for backward motion that includes the complex case of set-point propagation through on-axle hitches. The paper addresses system requirements and provides implementation details to tele-operate two different off- and on-axle combinations of a tracked mobile robot pulling and pushing two dissimilar trailers. PMID:23552102

  2. iPro54-PseKNC: a sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition

    PubMed Central

    Lin, Hao; Deng, En-Ze; Ding, Hui; Chen, Wei; Chou, Kuo-Chen

    2014-01-01

    The σ54 promoters are unique in prokaryotic genomes and are responsible for transcribing carbon- and nitrogen-related genes. With the avalanche of genome sequences generated in the postgenomic age, it is highly desirable to develop automated methods for rapidly and effectively identifying the σ54 promoters. Here, a predictor called ‘iPro54-PseKNC’ was developed. In the predictor, the samples of DNA sequences were formulated by a novel feature vector called ‘pseudo k-tuple nucleotide composition’, which was further optimized by the incremental feature selection procedure. The performance of iPro54-PseKNC was examined by rigorous jackknife cross-validation tests on a stringent benchmark data set. As a user-friendly web-server, iPro54-PseKNC is freely accessible at http://lin.uestc.edu.cn/server/iPro54-PseKNC. For the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results without the need to follow the complicated mathematics, which are presented in this paper only for completeness. Meanwhile, we also discovered through an in-depth statistical analysis that the distribution of distances between the transcription start sites and the translation initiation sites is governed by the gamma distribution, which may provide a fundamental physical principle for studying the σ54 promoters. PMID:25361964
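
    The jackknife cross-validation test referred to in this and many neighbouring abstracts holds out each sample in turn, trains on the rest, and scores the held-out sample. The sketch below shows that loop with a 1-nearest-neighbour classifier standing in for the real predictor; the classifier, function names, and two-dimensional feature vectors are assumptions and are unrelated to PseKNC features.

```python
def nearest_neighbor_predict(training, query):
    """Label of the training sample closest to the query (squared Euclidean)."""
    closest = min(training,
                  key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)))
    return closest[1]

def jackknife_accuracy(samples, predict):
    """Leave each sample out once, predict it from the others, and average."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        if predict(training, features) == label:
            correct += 1
    return correct / len(samples)

# Made-up feature vectors and labels.
samples = [
    ((0.0, 0.1), "promoter"),
    ((0.1, 0.0), "promoter"),
    ((0.2, 0.2), "promoter"),
    ((1.0, 0.9), "non-promoter"),
    ((0.9, 1.0), "non-promoter"),
    ((0.4, 0.4), "non-promoter"),
]
accuracy = jackknife_accuracy(samples, nearest_neighbor_predict)
```

    Because every sample is tested exactly once and no random split is involved, the result is the same on every run for a given dataset.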

  3. iDPF-PseRAAAC: A Web-Server for Identifying the Defensin Peptide Family and Subfamily Using Pseudo Reduced Amino Acid Alphabet Composition.

    PubMed

    Zuo, Yongchun; Lv, Yang; Wei, Zhuying; Yang, Lei; Li, Guangpeng; Fan, Guoliang

    2015-01-01

    Defensins, one of the most abundant classes of antimicrobial peptides, are an essential part of the innate immunity that has evolved in most living organisms, from lower organisms to humans. To identify specific defensins as interesting antifungal leads, in this study we constructed a more rigorous benchmark dataset and developed the iDPF-PseRAAAC server to predict the defensin family and subfamily. When reduced dipeptide compositions were used, the overall accuracy of the proposed method increased to 95.10% for the defensin family and 98.39% for the vertebrate subfamily, which is higher than the accuracy of other methods. The jackknife test shows that an improvement of more than 4% was obtained compared with the previous method. A free online server was further established for the convenience of most experimental scientists at http://wlxy.imu.edu.cn/college/biostation/fuwu/iDPF-PseRAAAC/index.asp. A friendly guide is provided to describe how to use the web server. We anticipate that iDPF-PseRAAAC may become a useful high-throughput tool for both basic research and drug design. PMID:26713618

  4. Bioequivalence evaluation of two formulations of pidotimod using a limited sampling strategy.

    PubMed

    Huang, Ji-Han; Huang, Xiao-Hui; Wang, Kun; Li, Jian-Chun; Xie, Xue-Feng; Shen, Chen-Lin; Li, Lu-Jin; Zheng, Qing-Shan

    2013-07-01

    The aim of this study was to develop a limited sampling strategy (LSS) to assess the bioequivalence of two formulations of pidotimod. A randomized, two-way, cross-over study was conducted in healthy Chinese volunteers to compare two formulations of pidotimod. A limited sampling model was established using regression models to estimate the pharmacokinetic parameters and assess the bioequivalence of pidotimod. The model was internally validated by the jackknife method and graphical methods. The traditional non-compartmental method was also used to analyze the data and was compared with the LSS method. The results indicate that following oral administration of a single 800 mg dose, the plasma AUC(0-12 h) and C(max) of pidotimod can be predicted accurately using only two to four plasma samples. The bioequivalence assessment based on the LSS models provided results very similar to those obtained using all the observed concentration-time data points and indicated that the two pidotimod formulations were bioequivalent. An LSS method for assessing the bioequivalence of pidotimod formulations was established and proved to be applicable and accurate. This LSS method could be considered appropriate for a pidotimod bioequivalence study, reducing the cost of sample acquisition and analysis. The methodology presented here may also be applicable to bioequivalence evaluation of other medications. PMID:23639228

  5. HydPred: a novel method for the identification of protein hydroxylation sites that reveals new insights into human inherited disease.

    PubMed

    Li, Shuyan; Lu, Jun; Li, Jiazhong; Chen, Ximing; Yao, Xiaojun; Xi, Lili

    2016-01-26

    The disruption of protein hydroxylation is highly associated with several serious diseases, and consequently the identification of protein hydroxylation sites has attracted significant attention recently. Here, we report the development of an improved method, called HydPred, to identify protein hydroxylation sites (hydroxyproline and hydroxylysine) based on the synthetic minority over-sampling technique (SMOTE), the random forest (RF) algorithm and four blocks of newly composed features that are derived from the protein primary sequence. The HydPred method achieved the best prediction performance reported to date, with Matthews correlation coefficient values of 0.770 and 0.857 for hydroxyproline and hydroxylysine, respectively, according to jackknife cross-validation. This represents an improvement of 8% for hydroxyproline and 19% for hydroxylysine compared to the best results of available predictors. The prediction performance of HydPred for the external validation of hydroxyproline and hydroxylysine was also improved compared with other published methods. We subsequently applied HydPred to study the association of disruption of hydroxylation sites with human inherited disease. The analyses suggested that the loss of hydroxylation sites is more likely to cause disease than the gain of hydroxylation sites, and 52 different human inherited diseases were found to be highly associated with the loss of hydroxylation sites. Therefore, HydPred represents a new strategy to discover the molecular basis of pathogenesis associated with abnormal hydroxylation. HydPred is now available online as a user-friendly web server. PMID:26661679

  6. Stock structure of Lake Baikal omul as determined by whole-body morphology

    USGS Publications Warehouse

    Bronte, Charles R.; Fleischer, G.W.; Maistrenko, S.G.; Pronin, N.M.

    1999-01-01

    In Lake Baikal, three morphotypes of omul Coregonus autumnalis migratorius are recognized; the littoral, pelagic, and deep-water forms. Morphotype assignment is difficult, and similar to that encountered in pelagic and deep-water coregonines in the Laurentian Great Lakes. Principal component analysis revealed separation of all three morphotypes based on caudal peduncle length and depth, length and depth of the body between the dorsal and anal fin, and distance between the pectoral and the pelvic fins. Strong negative loadings were associated with head measurements. Omul of the same morphotype captured at different locations were classified to location of capture using step-wise discriminant function analysis. Jackknife correct classifications ranged from 43 to 78% for littoral omul from five locations, and 45–86% for pelagic omul from four locations. Patterns of location misclassification of littoral omul suggested that the sub-population structure, hence stock affinity, may be influenced by movements and intermixing of individuals among areas that are joined bathymetrically. Pelagic omul were more distinguishable by site and may support a previous hypothesis of a spawning-based rather than a foraging-based sub-population structure. Omul morphotypes may reflect adaptations to both ecological and local environmental conditions, and may have a genetic basis.

  7. Assessing long-term pH change in an Australian river catchment using monitoring and palaeolimnological data.

    PubMed

    Tibby, John; Reid, Michael A; Fluin, Jennie; Hart, Barry T; Kershaw, A Peter

    2003-08-01

    Reviews of stream monitoring data suggest that there has been significant acidification (>1.0 pH unit at some sites) of Victorian streamwaters over the past 3 decades. To assess whether these declines are within the range of natural variability, we developed a diatom model for inferring past pH and applied it to a ca. 3500-yr diatom record from a flood plain lake, Callemondah 1 Billabong, on the Goulburn River, which has among the most substantial observed pH declines. The model has a jackknifed r² between diatom-inferred and measured pH of 0.77 and a root mean square error of prediction of 0.35 pH units. In the pre-European period, pH was stable (range 6.5-6.7) for approximately 3000 yr. Since European settlement around 160 yr ago, diatom-inferred billabong pH has increased significantly by >0.5 units. We hypothesize that this increase in pH is related to processes associated with land clearance (e.g., increased base cation load and decreased organic acid load). There is no evidence of the recent monitored declines in the Callemondah record, which may indicate that flood plain lakes and the main stream are experiencing divergent pH trends or that the temporal resolution in the billabong sediment record is insufficient to register recent declines. PMID:12966966
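
    A jackknifed r² and root mean square error of prediction (RMSEP), as quoted in this abstract, are computed from leave-one-out predictions rather than from fitted values. The sketch below illustrates that idea with ordinary least squares on a single predictor standing in for the diatom model, which the abstract does not specify; all names and numbers are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def jackknife_predictions(xs, ys):
    """Predict each sample from a model fitted to all the other samples."""
    predictions = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(slope * xs[i] + intercept)
    return predictions

def r_squared(predicted, measured):
    """Squared Pearson correlation between predicted and measured values."""
    n = len(measured)
    mean_p, mean_m = sum(predicted) / n, sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov ** 2 / (var_p * var_m)

def rmsep(predicted, measured):
    """Root mean square error of prediction."""
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured))
                     / len(measured))

# Made-up values: a one-dimensional diatom score and measured pH.
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ph = [5.9, 6.3, 6.4, 6.9, 7.0, 7.5]
predicted = jackknife_predictions(scores, ph)
jackknifed_r2 = r_squared(predicted, ph)
error = rmsep(predicted, ph)
```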

  8. Instability-based mechanism for body undulations in centipede locomotion.

    PubMed

    Aoi, Shinya; Egi, Yoshimasa; Tsuchiya, Kazuo

    2013-01-01

    Centipedes have many body segments and legs and they generate body undulations during terrestrial locomotion. Centipede locomotion has the characteristic that body undulations are absent at low speeds but appear at faster speeds; furthermore, their amplitude and wavelength increase with increasing speed. There are conflicting reports regarding whether the muscles along the body axis resist or support these body undulations and the underlying mechanisms responsible for the body undulations remain largely unclear. In the present study, we investigated centipede locomotion dynamics using computer simulation with a body-mechanical model and experiment with a centipede-like robot and then conducted dynamic analysis with a simple model to clarify the mechanism. The results reveal that body undulations in these models occur due to an instability caused by a supercritical Hopf bifurcation. We subsequently compared these results with data obtained using actual centipedes. The model and actual centipedes exhibit similar dynamic properties, despite centipedes being complex, nonlinear dynamic systems. Based on our findings, we propose a possible passive mechanism for body undulations in centipedes, similar to a follower force or jackknife instability. We also discuss the roles of the muscles along the body axis in generating body undulations in terms of our physical model. PMID:23410369

  9. iCataly-PseAAC: Identification of Enzymes Catalytic Sites Using Sequence Evolution Information with Grey Model GM (2,1).

    PubMed

    Xiao, Xuan; Hui, Meng-Juan; Liu, Zi; Qiu, Wang-Ren

    2015-12-01

    Enzymes play pivotal roles in most biological reactions. The catalytic residues of an enzyme are defined as the amino acids that are directly involved in chemical catalysis; knowledge of these residues is important for understanding enzyme function. Given an enzyme, which residues are the catalytic sites, and which residues are not? This is the first important problem for in-depth understanding of the catalytic mechanism and for drug development. With the explosion of protein sequences generated during the post-genomic era, it is highly desirable for both basic research and drug design to develop fast and reliable methods for identifying the catalytic sites of enzymes according to their sequences. To address this problem, we proposed a new predictor, called iCataly-PseAAC. In the prediction system, the peptide sample was formulated with sequence evolution information via grey system model GM(2,1). It was observed by the rigorous jackknife test and independent dataset test that iCataly-PseAAC was superior to existing predictors even though it uses only sequence information. As a user-friendly web server, iCataly-PseAAC is freely accessible at http://www.jci-bioinfo.cn/iCataly-PseAAC. A step-by-step guide has been provided on how to use the web server to get the desired results for the convenience of most experimental scientists. PMID:26077845

  10. A multiple information fusion method for predicting subcellular locations of two different types of bacterial protein simultaneously.

    PubMed

    Chen, Jing; Xu, Huimin; He, Ping-An; Dai, Qi; Yao, Yuhua

    2016-01-01

    Subcellular localization prediction of bacterial proteins is an important component of bioinformatics, with great importance for drug design and other applications. Many computational tools for predicting protein subcellular localization have been developed in recent decades. In this study, we first introduce three kinds of protein sequence encoding schemes: physicochemical-based, evolutionary-based, and GO-based. The original and consensus sequences were combined with physicochemical properties. Information from elements in different rows and columns of the position-specific scoring matrix was also taken into consideration simultaneously to capture more essential information. Computational methods based on gene ontology (GO) have been demonstrated to be superior to methods based on other features. Principal component analysis (PCA) is then applied for feature selection, and the reduced vectors are input to a support vector machine (SVM) to predict protein subcellular localization. The proposed method achieves prediction accuracies of 98.28% and 97.87% on stringent Gram-positive (Gpos) and Gram-negative (Gneg) datasets, respectively, with the jackknife test. Finally, we calculate the "absolute true overall accuracy (ATOA)", which is stricter than overall accuracy. The ATOA obtained from the proposed method reaches 97.32% and 93.06% for Gpos and Gneg, respectively. In terms of both the rationality of the testing procedure and the success rates of the test results, the current method can improve the prediction quality of protein subcellular localization. PMID:26724384

  11. Resampling methods for model fitting and model selection.

    PubMed

    Babu, G Jogesh

    2011-11-01

    Resampling procedures for fitting models and model selection are considered in this article. Nonparametric goodness-of-fit statistics are generally based on the empirical distribution function. The distribution-free property of these statistics does not hold in the multivariate case or when some of the parameters are estimated. Bootstrap methods to estimate the underlying distributions are discussed in such cases. The results hold not only in the case of one-dimensional parameter space, but also for the vector parameters. Bootstrap methods for inference, when the data is from an unknown distribution that may or may not belong to a specified family of distributions, are also considered. Most of the information criteria-based model selection procedures such as the Akaike information criterion, Bayesian information criterion, and minimum description length use estimation of bias. The bias, which is inevitable in model selection problems, arises mainly from estimating the distance between the "true" model and an estimated model. A jackknife type procedure for model selection is discussed, which instead of bias estimation is based on bias reduction. PMID:22023685
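
    For orientation, the generic jackknife bias-reduction step that such procedures build on can be written as below. This is the textbook (Quenouille-type) form, not the specific model-selection procedure of the article, which the abstract does not spell out. Here \hat\theta is the estimate from all n observations and \hat\theta_{(i)} is the estimate with observation i removed.

```latex
\begin{align}
\bar\theta_{(\cdot)} &= \frac{1}{n}\sum_{i=1}^{n}\hat\theta_{(i)}, \\
\widehat{\mathrm{bias}}_{\mathrm{jack}} &= (n-1)\bigl(\bar\theta_{(\cdot)} - \hat\theta\bigr), \\
\hat\theta_{\mathrm{jack}} &= \hat\theta - \widehat{\mathrm{bias}}_{\mathrm{jack}}
  = n\,\hat\theta - (n-1)\,\bar\theta_{(\cdot)}.
\end{align}
```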

  12. Prediction of Protein Submitochondrial Locations by Incorporating Dipeptide Composition into Chou's General Pseudo Amino Acid Composition.

    PubMed

    Ahmad, Khurshid; Waris, Muhammad; Hayat, Maqsood

    2016-06-01

    The mitochondrion is a key organelle of the eukaryotic cell, providing energy for cellular activities. The submitochondrial locations of proteins play a crucial role in understanding different biological processes such as energy metabolism, programmed cell death, and ionic homeostasis. Prediction of submitochondrial locations through conventional methods is expensive and time consuming because of the large number of protein sequences generated in the last few decades. Therefore, it is highly desirable to establish an automated model for identification of submitochondrial locations of proteins. In this regard, the current study was initiated to develop a fast, reliable, and accurate computational model. Various feature extraction methods such as dipeptide composition (DPC), Split Amino Acid Composition, and Composition and Translation were utilized. To overcome the issue of bias, the oversampling technique SMOTE was applied to balance the datasets. Several classification learners including K-Nearest Neighbor, Probabilistic Neural Network, and support vector machine (SVM) were used. The jackknife test was applied to assess the performance of the classification algorithms using two benchmark datasets. Among the various classification algorithms, SVM achieved the highest success rates in conjunction with the condensed feature space of DPC: 95.20% accuracy on dataset SML3-317 and 95.11% on dataset SML3-983. The empirical results revealed that our proposed model obtained the highest results reported so far in the literature. It is anticipated that our proposed model might be useful for future studies. PMID:26746980
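
    The SMOTE oversampling step mentioned in this abstract creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The following is a simplified sketch of that idea only; the function name, the parameters `k` and `seed`, and the data points are assumptions, and the study's actual settings are not given in the abstract.

```python
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the chosen sample, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, target)))
    return synthetic

# Made-up two-dimensional minority-class feature vectors.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_synthetic=4)
```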

  13. Fusing face-verification algorithms and humans.

    PubMed

    O'Toole, Alice J; Abdi, Hervé; Jiang, Fang; Phillips, P Jonathon

    2007-10-01

    It has been demonstrated recently that state-of-the-art face-recognition algorithms can surpass human accuracy at matching faces over changes in illumination. The ranking of algorithms and humans by accuracy, however, does not provide information about whether algorithms and humans perform the task comparably or whether algorithms and humans can be fused to improve performance. In this paper, we fused humans and algorithms using partial least square regression (PLSR). In the first experiment, we applied PLSR to face-pair similarity scores generated by seven algorithms participating in the Face Recognition Grand Challenge. The PLSR produced an optimal weighting of the similarity scores, which we tested for generality with a jackknife procedure. Fusing the algorithms' similarity scores using the optimal weights produced a twofold reduction of error rate over the most accurate algorithm. Next, human-subject-generated similarity scores were added to the PLSR analysis. Fusing humans and algorithms increased the performance to near-perfect classification accuracy. These results are discussed in terms of maximizing face-verification accuracy with hybrid systems consisting of multiple algorithms and humans. PMID:17926698

  14. A causal proportional hazards estimator for the effect of treatment actually received in a randomized trial with all-or-nothing compliance.

    PubMed

    Loeys, T; Goetghebeur, E

    2003-03-01

    Survival data from randomized trials are most often analyzed in a proportional hazards (PH) framework that follows the intention-to-treat (ITT) principle. When not all the patients on the experimental arm actually receive the assigned treatment, the ITT-estimator mixes its effect on treatment compliers with its absence of effect on noncompliers. The structural accelerated failure time (SAFT) models of Robins and Tsiatis are designed to consistently estimate causal effects on the treated, without direct assumptions about the compliance selection mechanism. The traditional PH-model, however, has not yet led to such causal interpretation. In this article, we examine a PH-model of treatment effect on the treated subgroup. While potential treatment compliance is unobserved in the control arm, we derive an estimating equation for the Compliers PROPortional Hazards Effect of Treatment (C-PROPHET). The jackknife is used for bias correction and variance estimation. The method is applied to data from a recently finished clinical trial in cancer patients with liver metastases. PMID:12762446
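
    The use of the jackknife for both bias correction and variance estimation can be sketched generically as below. This is the standard delete-one jackknife applied to an arbitrary estimator, not the C-PROPHET estimating equation, which the abstract does not reproduce; the function name and data are hypothetical.

```python
import math
import statistics

def jackknife(data, estimator):
    """Return (bias-corrected estimate, jackknife standard error)."""
    n = len(data)
    full = estimator(data)
    # Leave-one-out replicates: the estimator recomputed without observation i.
    replicates = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    mean_rep = sum(replicates) / n
    bias = (n - 1) * (mean_rep - full)
    variance = (n - 1) / n * sum((r - mean_rep) ** 2 for r in replicates)
    return full - bias, math.sqrt(variance)

# Made-up observations.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
corrected_var, _ = jackknife(data, statistics.pvariance)
_, se_mean = jackknife(data, statistics.mean)
```

    Two known special cases make useful sanity checks: correcting the plug-in (divide-by-n) variance recovers the usual n - 1 sample variance, and the jackknife standard error of the mean equals s / sqrt(n).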

  15. Predicting bacteriophage proteins located in host cell with feature selection technique.

    PubMed

    Ding, Hui; Liang, Zhi-Yong; Guo, Feng-Biao; Huang, Jian; Chen, Wei; Lin, Hao

    2016-04-01

    A bacteriophage is a virus that can infect a bacterium. The fate of an infected bacterium is determined by the bacteriophage proteins located in the host cell. Thus, reliably identifying bacteriophage proteins located in the host cell is extremely important for understanding their functions and discovering potential anti-bacterial drugs. In this paper, a computational method was developed to recognize bacteriophage proteins located in host cells based only on their amino acid sequences. The analysis of variance (ANOVA) combined with incremental feature selection (IFS) was proposed to optimize the feature set. Using jackknife cross-validation, our method can discriminate between bacteriophage proteins located in a host cell and those not located in a host cell with a maximum overall accuracy of 84.2%, and can further classify bacteriophage proteins located in host cell cytoplasm and in host cell membranes with a maximum overall accuracy of 92.4%. To enhance the value of the practical applications of the method, we built a web server called PHPred (〈http://lin.uestc.edu.cn/server/PHPred〉). We believe that PHPred will become a powerful tool to study bacteriophage proteins located in host cells and to guide related drug discovery. PMID:26945463

  16. Crash involvement of large trucks by configuration: a case-control study.

    PubMed

    Stein, H S; Jones, I S

    1988-05-01

    For a two-year period, large truck crashes on the interstate system in Washington State were investigated using a case-control method. For each large truck involved in a crash, three trucks were randomly selected for inspection from the traffic stream at the same time and place as the crash but one week later. The effects of truck and driver characteristics on crashes were assessed by comparing their relative frequency among the crash-involved and comparison sample trucks. Double trailer trucks were consistently overinvolved in crashes by a factor of two to three in both single and multiple vehicle crashes. Single unit trucks pulling trailers also were overinvolved. Doubles also had a higher frequency of jackknifing compared to tractor-trailers. The substantial overinvolvement of doubles in crashes was found regardless of driver age, hours of driving, cargo weight, or type of fleet. Younger drivers, long hours of driving, and operating empty trucks were also associated with higher crash involvement. PMID:3354729

  17. Integrating Map Algebra and Statistical Modeling for Spatio-Temporal Analysis of Monthly Mean Daily Incident Photosynthetically Active Radiation (PAR) over a Complex Terrain

    PubMed Central

    Evrendilek, Fatih

    2007-01-01

    This study aims at quantifying spatio-temporal dynamics of monthly mean daily incident photosynthetically active radiation (PAR) over a vast and complex terrain such as Turkey. The spatial interpolation method of universal kriging, and the combination of multiple linear regression (MLR) models and map algebra techniques were implemented to generate surface maps of PAR with a grid resolution of 500 × 500 m as a function of five geographical and 14 climatic variables. Performance of the geostatistical and MLR models was compared using mean prediction error (MPE), root-mean-square prediction error (RMSPE), average standard prediction error (ASE), mean standardized prediction error (MSPE), root-mean-square standardized prediction error (RMSSPE), and adjusted coefficient of determination (R²adj.). The best-fit MLR- and universal kriging-generated models of monthly mean daily PAR were validated against an independent 37-year observed dataset of 35 climate stations derived from 160 stations across Turkey by the Jackknifing method. The spatial variability patterns of monthly mean daily incident PAR were more accurately reflected in the surface maps created by the MLR-based models than in those created by the universal kriging method, in particular, for spring (May) and autumn (November). The MLR-based spatial interpolation algorithms of PAR described in this study indicated the significance of the multifactor approach to understanding and mapping spatio-temporal dynamics of PAR for a complex terrain over meso-scales.

  18. Distinguishing centrarchid genera by use of lateral line scales

    USGS Publications Warehouse

    Roberts, N.M.; Rabeni, C.F.; Stanovick, J.S.

    2007-01-01

    Predator-prey relations involving fishes are often evaluated using scales remaining in gut contents or feces. While several reliable keys help identify North American freshwater fish scales to the family level, none attempt to separate the family Centrarchidae to the genus level. Centrarchidae is of particular concern in the midwestern United States because it contains several popular sport fishes, such as smallmouth bass Micropterus dolomieu, largemouth bass M. salmoides, and rock bass Ambloplites rupestris, as well as less-sought-after species of sunfishes Lepomis spp. and crappies Pomoxis spp. Differentiating sport fish from non-sport fish has important management implications. Morphological characteristics of lateral line scales (n = 1,581) from known centrarchid fishes were analyzed. The variability of measurements within and between genera was examined to select variables that were the most useful in further classifying unknown centrarchid scales. A linear discriminant analysis model was developed using 10 variables. Based on this model, 84.4% of Ambloplites scales, 81.2% of Lepomis scales, and 86.6% of Micropterus scales were classified correctly using a jackknife procedure. © Copyright by the American Fisheries Society 2007.

  19. Using optimized evidence-theoretic K-nearest neighbor classifier and pseudo-amino acid composition to predict membrane protein types.

    PubMed

    Shen, Hongbin; Chou, Kuo-Chen

    2005-08-19

    Knowledge of membrane protein type often provides crucial hints toward determining the function of an uncharacterized membrane protein. With the avalanche of new protein sequences emerging during the post-genomic era, it is highly desirable to develop an automated method that can serve as a high throughput tool in identifying the types of newly found membrane proteins according to their primary sequences, so as to timely make the relevant annotations on them for the reference usage in both basic research and drug discovery. Based on the concept of pseudo-amino acid composition [K.C. Chou, Proteins: Struct. Funct. Genet. 43 (2001) 246-255; Erratum: Proteins: Struct. Funct. Genet. 44 (2001) 60] that has made it possible to incorporate a considerable amount of sequence-order effects by representing a protein sample in terms of a set of discrete numbers, a novel predictor, the so-called "optimized evidence-theoretic K-nearest neighbor" or "OET-KNN" classifier, was proposed. It was demonstrated via the self-consistency test, jackknife test, and independent dataset test that the new predictor, compared with many previous ones, yielded higher success rates in most cases. The new predictor can also be used to improve the prediction quality for, among many other protein attributes, structural class, subcellular localization, enzyme family class, and G-protein coupled receptor type. The OET-KNN classifier will be available as a web-server at http://www.pami.sjtu.edu.cn/kcchou. PMID:16002049
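
    The jackknife test named in this abstract is usually run as leave-one-out cross-validation: each sample is held out in turn and predicted from all the others. A minimal sketch follows, assuming a plain k-nearest-neighbor classifier with Euclidean distance rather than the paper's OET-KNN; the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training points nearest to `query`.

    `train` is a list of (feature_vector, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def jackknife_success_rate(samples, k=3):
    """Hold out each sample once, predict it from the rest, return the hit rate."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        remaining = samples[:i] + samples[i + 1:]  # every sample except the i-th
        if knn_predict(remaining, features, k) == label:
            hits += 1
    return hits / len(samples)


# Made-up data for illustration: two well-separated clusters.
toy = [((0.0, 0.1), "A"), ((0.2, 0.0), "A"), ((0.1, 0.3), "A"),
       ((5.0, 5.1), "B"), ((5.2, 4.9), "B"), ((4.8, 5.0), "B")]
rate = jackknife_success_rate(toy, k=1)  # 1.0 for this toy set
```

    Because every sample is scored by a model that never saw it, the resulting rate is normally lower than a self-consistency (resubstitution) rate computed on the same data.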

  20. Limited sampling hampers "big data" estimation of species richness in a tropical biodiversity hotspot.

    PubMed

    Engemann, Kristine; Enquist, Brian J; Sandel, Brody; Boyle, Brad; Jørgensen, Peter M; Morueta-Holme, Naia; Peet, Robert K; Violle, Cyrille; Svenning, Jens-Christian

    2015-02-01

    Macro-scale species richness studies often use museum specimens as their main source of information. However, such datasets are often strongly biased due to variation in sampling effort in space and time. These biases may strongly affect diversity estimates and may, thereby, obstruct solid inference on the underlying diversity drivers, as well as mislead conservation prioritization. In recent years, this has resulted in an increased focus on developing methods to correct for sampling bias. In this study, we use sample-size-correcting methods to examine patterns of tropical plant diversity in Ecuador, one of the most species-rich and climatically heterogeneous biodiversity hotspots. Species richness estimates were calculated based on 205,735 georeferenced specimens of 15,788 species using the Margalef diversity index, the Chao estimator, the second-order Jackknife and Bootstrapping resampling methods, and Hill numbers and rarefaction. Species richness was heavily correlated with sampling effort, and only rarefaction was able to remove this effect, and we recommend this method for estimation of species richness with "big data" collections. PMID:25692000
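
    Two of the estimators listed here can be sketched from incidence data. The abstract does not say which variants were used, so this example assumes the incidence-based Chao2 form and the standard second-order jackknife, where Q1 and Q2 are the numbers of species found in exactly one and exactly two sampling units and m is the number of units. Function names and the toy samples are hypothetical.

```python
from collections import Counter


def incidence_counts(samples):
    """Return (S_obs, Q1, Q2, m) from a list of per-sample species collections."""
    occurrences = Counter(sp for sample in samples for sp in set(sample))
    q1 = sum(1 for n in occurrences.values() if n == 1)  # uniques
    q2 = sum(1 for n in occurrences.values() if n == 2)  # duplicates
    return len(occurrences), q1, q2, len(samples)


def chao2(samples):
    """Chao2 richness estimate; falls back to the bias-corrected form if Q2 = 0."""
    s_obs, q1, q2, m = incidence_counts(samples)
    if q2 == 0:
        return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / 2
    return s_obs + q1 ** 2 / (2 * q2)


def jackknife2(samples):
    """Second-order jackknife richness estimate (needs at least two samples)."""
    s_obs, q1, q2, m = incidence_counts(samples)
    if m < 2:
        raise ValueError("second-order jackknife needs at least two samples")
    return s_obs + q1 * (2 * m - 3) / m - q2 * (m - 2) ** 2 / (m * (m - 1))


# Made-up samples: S_obs = 6, Q1 = 3 (d, e, f), Q2 = 1 (c), m = 4.
toy_samples = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "e"}, {"a", "b", "f"}]
chao_estimate = chao2(toy_samples)       # 6 + 9/2 = 10.5
jack_estimate = jackknife2(toy_samples)  # 6 + 15/4 - 4/12, about 9.42
```

    Both estimators add a correction driven by rare species to the observed count, which is why they remain sensitive to sampling effort when many species are recorded only once or twice.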

  1. Predicting cancerlectins by the optimal g-gap dipeptides

    PubMed Central

    Lin, Hao; Liu, Wei-Xin; He, Jiao; Liu, Xin-Hui; Ding, Hui; Chen, Wei

    2015-01-01

    The cancerlectin plays a key role in the process of tumor cell differentiation. Thus, to fully understand the function of cancerlectin is significant because it sheds light on the future direction for the cancer therapy. However, the traditional wet-experimental methods were money- and time-consuming. It is highly desirable to develop an effective and efficient computational tool to identify cancerlectins. In this study, we developed a sequence-based method to discriminate between cancerlectins and non-cancerlectins. The analysis of variance (ANOVA) was used to choose the optimal feature set derived from the g-gap dipeptide composition. The jackknife cross-validated results showed that the proposed method achieved the accuracy of 75.19%, which is superior to other published methods. For the convenience of other researchers, an online web-server CaLecPred was established and can be freely accessed from the website http://lin.uestc.edu.cn/server/CalecPred. We believe that the CaLecPred is a powerful tool to study cancerlectins and to guide the related experimental validations. PMID:26648527
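
    The g-gap dipeptide composition used as the feature set can be illustrated with a short sketch. It assumes the common definition in which a g-gap dipeptide is a pair of residues separated by g intervening positions, and each of the 400 possible pairs is scored by its relative frequency; the function name and the example sequence are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def g_gap_dipeptide_composition(sequence, g):
    """Return a dict mapping each of the 400 residue pairs to its g-gap frequency."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    # A sequence of length L yields L - g - 1 pairs for a gap of g.
    for i in range(len(sequence) - g - 1):
        pair = sequence[i] + sequence[i + g + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return {pair: 0.0 for pair in pairs}
    return {pair: counts[pair] / total for pair in pairs}


# Made-up peptide: with g = 0 the pairs are AC, CD, DA, AC.
adjacent = g_gap_dipeptide_composition("ACDAC", 0)
# With g = 1 the pairs are AD, CA, DC.
one_gap = g_gap_dipeptide_composition("ACDAC", 1)
```

    Setting g = 0 recovers the ordinary adjacent dipeptide composition; larger g values capture correlations between residues that are further apart in the sequence.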

  2. AcalPred: a sequence-based tool for discriminating between acidic and alkaline enzymes.

    PubMed

    Lin, Hao; Chen, Wei; Ding, Hui

    2013-01-01

    The structure and activity of enzymes are influenced by pH value of their surroundings. Although many enzymes work well in the pH range from 6 to 8, some specific enzymes have good efficiencies only in acidic (pH<5) or alkaline (pH>9) solution. Studies have demonstrated that the activities of enzymes correlate with their primary sequences. It is crucial to judge enzyme adaptation to acidic or alkaline environment from its amino acid sequence in molecular mechanism clarification and the design of high efficient enzymes. In this study, we developed a sequence-based method to discriminate acidic enzymes from alkaline enzymes. The analysis of variance was used to choose the optimized discriminating features derived from g-gap dipeptide compositions. And support vector machine was utilized to establish the prediction model. In the rigorous jackknife cross-validation, the overall accuracy of 96.7% was achieved. The method can correctly predict 96.3% acidic and 97.1% alkaline enzymes. Through the comparison between the proposed method and previous methods, it is demonstrated that the proposed method is more accurate. On the basis of this proposed method, we have built an online web-server called AcalPred which can be freely accessed from the website (http://lin.uestc.edu.cn/server/AcalPred). We believe that the AcalPred will become a powerful tool to study enzyme adaptation to acidic or alkaline environment. PMID:24130738

  3. iMethyl-PseAAC: identification of protein methylation sites via a pseudo amino acid composition approach.

    PubMed

    Qiu, Wang-Ren; Xiao, Xuan; Lin, Wei-Zhong; Chou, Kuo-Chen

    2014-01-01

    Before becoming the native proteins during the biosynthesis, their polypeptide chains created by ribosome's translating mRNA will undergo a series of "product-forming" steps, such as cutting, folding, and posttranslational modification (PTM). Knowledge of PTMs in proteins is crucial for dynamic proteome analysis of various human diseases and epigenetic inheritance. One of the most important PTMs is the Arg- or Lys-methylation that occurs on arginine or lysine, respectively. Given a protein, which site of its Arg (or Lys) can be methylated, and which site cannot? This is the first important problem for understanding the methylation mechanism and drug development in depth. With the avalanche of protein sequences generated in the postgenomic age, its urgency has become self-evident. To address this problem, we proposed a new predictor, called iMethyl-PseAAC. In the prediction system, a peptide sample was formulated by a 346-dimensional vector, formed by incorporating its physicochemical, sequence evolution, biochemical, and structural disorder information into the general form of pseudo amino acid composition. It was observed by the rigorous jackknife test and independent dataset test that iMethyl-PseAAC was superior to any of the existing predictors in this area. PMID:24977164

  4. Tributaries under Mediterranean climate: their role in macrobenthos diversity maintenance.

    PubMed

    Maasri, Alain; Dumont, Bernard; Claret, Cécile; Archambaud-Suard, Gaït; Gandouin, Emmanuel; Franquet, Evelyne

    2008-07-01

    The taxonomic richness erosion and the role of tributaries in the maintenance of the taxonomic richness were considered in a Mediterranean catchment in southeastern France. Nine stations were chosen along the Arc stream (three stations downstream from an organic effluent and one station upstream from the pollution source) and on two groups of tributaries (three intermittent and two perennial). High biodiversity erosion was noticed in the main stem, revealing diffuse sources of pollution added to the expected effect of the localized organic pollution. Jackknife richness estimator and beta diversity indicated that the intermittent tributaries had the highest richness values and harboured 70% of the taxa recorded at the catchment scale. The intermittent flow tributaries seem to play a major role in maintaining the taxonomic richness in such catchments, highly impacted by anthropogenic activities. The detailed examination and the preservation of these ecosystems should be an important step in catchment management, and support the need for catchment-scale conservation of freshwater invertebrates. PMID:18558378

  5. Analysis of metabolic pathway using hybrid properties.

    PubMed

    Chen, Lei; Cai, Yu-Dong; Shi, Xiao-He; Huang, Tao

    2012-01-01

    Given a compounds-forming system, i.e., a system consisting of some compounds and their relationship, can it form a biologically meaningful pathway? It is a fundamental problem in systems biology. Nowadays, a lot of information on different organisms, at both genetic and metabolic levels, has been collected and stored in some specific databases. Based on these data, it is feasible to address such an essential problem. Metabolic pathway is one kind of compounds-forming systems and we analyzed them in yeast by extracting different (biological and graphic) features from each of the 13,736 compounds-forming systems, of which 136 are positive pathways, i.e., known metabolic pathway from KEGG; while 13,600 were negative. Each of these compounds-forming systems was represented by 144 features, of which 88 are graph features and 56 biological features. "Minimum Redundancy Maximum Relevance" and "Incremental Feature Selection" were utilized to analyze these features and 16 optimal features were selected as being able to predict a query compounds-forming system most successfully. It was found through Jackknife cross-validation that the overall success rate of identifying the positive pathways was 74.26%. It is anticipated that this novel approach and encouraging result may give meaningful illumination to investigate this important topic. PMID:21919854

  6. Nucleosome positioning based on the sequence word composition.

    PubMed

    Yi, Xian-Fu; He, Zhi-Song; Chou, Kuo-Chen; Kong, Xiang-Yin

    2012-01-01

    The DNA of all eukaryotic organisms is packaged into nucleosomes (a basic repeating unit of chromatin). A nucleosome consists of histone octamer wrapped by core DNA and linker histone H1 associated with linker DNA. It has profound effects on all DNA-dependent processes by affecting sequence accessibility. Understanding the factors that influence nucleosome positioning is of great help to the study of genomic control mechanisms. Among many determinants, the inherent DNA sequence has been suggested to have a dominant role in nucleosome positioning in vivo. Here, we used the method of minimum redundancy maximum relevance (mRMR) feature selection and the nearest neighbor algorithm (NNA) combined with the incremental feature selection (IFS) method to identify the most important sequence features that either favor or inhibit nucleosome positioning. We analyzed the words of 53,021 nucleosome DNA sequences and 50,299 linker DNA sequences of Saccharomyces cerevisiae. 32 important features were abstracted from 5,460 features, and the overall prediction accuracy through jackknife cross-validation test was 76.5%. Our results support that sequence-dependent DNA flexibility plays an important role in positioning nucleosome core particles and that genome sequence facilitates the rapid nucleosome reassembly instead of nucleosome depletion. Besides, our results suggest that there exist some additional features playing a considerable role in discriminating nucleosome forming and inhibiting sequences. These results confirmed that the underlying DNA sequence plays a major role in nucleosome positioning. PMID:21919856

  7. Computer aided morphometry of the neonatal fetal alcohol syndrome face

    NASA Astrophysics Data System (ADS)

    Chik, Lawrence; Sokol, Robert J.; Martier, Susan S.

    1993-09-01

    Facial dysmorphology related to Fetal Alcohol Syndrome (FAS) has been studied from neonatal snapshots with computer-aided imaging tools by looking at facial landmarks and silhouettes. Statistical methods were used to characterize FAS-related midfacial hypoplasia by using standardized landmark coordinates of frontal and profile snapshots. Additional analyses were performed by tracing a segment of the facial silhouettes from the profile snapshots. In spite of inherent distortions due to the coordinate standardization procedure, controlled for race, three significant facial landmark coordinates accounted for 30.6% of the explained variance of FAS. Residualized for race, eight points along the silhouettes were shown to be significant in explaining 45.8% of the outcome variance. Combining the landmark coordinates and silhouettes points, 57% of the outcome variance was explained. Finally, including birthweight with landmark coordinates and silhouettes, 63% of the outcome variance was explained, with a jackknifed sensitivity of 95% (19/20) and a specificity of 92.9% (52/56).
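
    The jackknifed sensitivity and specificity quoted above follow directly from the reported counts. As a small worked check, assuming 19 of 20 FAS cases and 52 of 56 non-FAS cases were classified correctly (the helper name is hypothetical):

```python
def percent_correct(correct, total):
    """Share of correctly classified cases, as a percentage rounded to 0.1."""
    return round(100 * correct / total, 1)


sensitivity = percent_correct(19, 20)  # 95.0: cases correctly flagged
specificity = percent_correct(52, 56)  # 92.9: non-cases correctly cleared
```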

  8. Prediction of Cancer Drugs by Chemical-Chemical Interactions

    PubMed Central

    Li, Hai-Peng; Feng, Kai-Yan; Chen, Lei; Zheng, Ming-Yue; Cai, Yu-Dong

    2014-01-01

    Cancer, which is a leading cause of death worldwide, places a heavy burden on health-care systems. In this study, an order-prediction model was built to predict a series of cancer drug indications based on chemical-chemical interactions. According to the confidence scores of their interactions, the order from the most likely cancer to the least one was obtained for each query drug. The 1st order prediction accuracy of the training dataset was 55.93%, evaluated by Jackknife test, while it was 55.56% and 59.09% on a validation test dataset and an independent test dataset, respectively. The proposed method outperformed a popular method based on molecular descriptors. Moreover, it was verified that some drugs were effective for the ‘wrong’ predicted indications, indicating that some ‘wrong’ drug indications were actually correct indications. Encouraged by the promising results, the method may become a useful tool for the prediction of drug indications. PMID:24498372

  9. Protein location prediction using atomic composition and global features of the amino acid sequence

    SciTech Connect

    Cherian, Betsy Sheena; Nair, Achuthsankar S.

    2010-01-22

    Subcellular location of protein is constructive information in determining its function, screening for drug candidates, vaccine design, annotation of gene products and in selecting relevant proteins for further studies. Computational prediction of subcellular localization deals with predicting the location of a protein from its amino acid sequence. For a computational localization prediction method to be more accurate, it should exploit all possible relevant biological features that contribute to the subcellular localization. In this work, we extracted the biological features from the full length protein sequence to incorporate more biological information. A new biological feature, the distribution of atomic composition, is used together with multiple physicochemical properties, amino acid composition, three-part amino acid composition, and sequence similarity for predicting the subcellular location of the protein. Support Vector Machines are designed for four modules and prediction is made by a weighted voting system. Our system makes predictions with accuracies of 100%, 82.47%, and 88.81% for the self-consistency test, jackknife test, and independent data test, respectively. Our results provide evidence that the prediction based on the biological features derived from the full length amino acid sequence gives better accuracy than those derived from N-terminal alone. Considering the features as a distribution within the entire sequence will bring out underlying property distribution to a greater detail to enhance the prediction accuracy.

  10. iMethyl-PseAAC: Identification of Protein Methylation Sites via a Pseudo Amino Acid Composition Approach

    PubMed Central

    Qiu, Wang-Ren; Lin, Wei-Zhong; Chou, Kuo-Chen

    2014-01-01

    Before becoming the native proteins during the biosynthesis, their polypeptide chains created by ribosome's translating mRNA will undergo a series of “product-forming” steps, such as cutting, folding, and posttranslational modification (PTM). Knowledge of PTMs in proteins is crucial for dynamic proteome analysis of various human diseases and epigenetic inheritance. One of the most important PTMs is the Arg- or Lys-methylation that occurs on arginine or lysine, respectively. Given a protein, which site of its Arg (or Lys) can be methylated, and which site cannot? This is the first important problem for understanding the methylation mechanism and drug development in depth. With the avalanche of protein sequences generated in the postgenomic age, its urgency has become self-evident. To address this problem, we proposed a new predictor, called iMethyl-PseAAC. In the prediction system, a peptide sample was formulated by a 346-dimensional vector, formed by incorporating its physicochemical, sequence evolution, biochemical, and structural disorder information into the general form of pseudo amino acid composition. It was observed by the rigorous jackknife test and independent dataset test that iMethyl-PseAAC was superior to any of the existing predictors in this area. PMID:24977164

  11. Automated Prediction of CMEs Using Machine Learning of CME - Flare Associations

    NASA Astrophysics Data System (ADS)

    Qahwaji, R.; Colak, T.; Al-Omari, M.; Ipson, S.

    2008-04-01

    Machine-learning algorithms are applied to explore the relation between significant flares and their associated CMEs. The NGDC flares catalogue and the SOHO/LASCO CME catalogue are processed to associate X and M-class flares with CMEs based on timing information. Automated systems are created to process and associate years of flare and CME data, which are later arranged in numerical-training vectors and fed to machine-learning algorithms to extract the embedded knowledge and provide learning rules that can be used for the automated prediction of CMEs. Properties representing the intensity, flare duration, and duration of decline and duration of growth are extracted from all the associated (A) and not-associated (NA) flares and converted to a numerical format that is suitable for machine-learning use. The machine-learning algorithms Cascade Correlation Neural Networks (CCNN) and Support Vector Machines (SVM) are used and compared in our work. The machine-learning systems predict, from the input of a flare’s properties, if the flare is likely to initiate a CME. Intensive experiments using Jack-knife techniques are carried out and the relationships between flare properties and CMEs are investigated using the results. The predictive performance of SVM and CCNN is analysed and recommendations for enhancing the performance are provided.

  12. EGS hydraulic stimulation monitoring by surface arrays - location accuracy and completeness magnitude: the Basel Deep Heat Mining Project case study

    NASA Astrophysics Data System (ADS)

    Häge, Martin; Blascheck, Patrick; Joswig, Manfred

    2013-01-01

    The potential and limits of monitoring induced seismicity by surface-based mini arrays were evaluated for the hydraulic stimulation of the Basel Deep Heat Mining Project. This project aimed at the exploitation of geothermal heat from a depth of about 4,630 m. As reference for our results, a network of borehole stations by Geothermal Explorers Ltd. provided ground truth information. We utilized array processing, sonogram event detection and outlier-resistant, graphical jackknife location procedures to compensate for the decrease in signal-to-noise ratio at the surface. We could correctly resolve the NNW-SSE striking fault plane by relative master event locations. Statistical analysis of our catalog data resulted in M_L 0.36 as completeness magnitude, but with significant day-to-night dependency. To compare to the performance of borehole data with M_W 0.9 as completeness magnitude, we applied two methods for converting M_L to M_W, which raised our M_C to M_W in the range of 0.99-1.13. Further, the b value for the duration of our measurement was calculated to 1.14 (related to M_L), respectively 1.66 (related to M_W), but changes over time could not be resolved from the error bars.

  13. Estimation of Purkait's triangle method and alternative models for sex assessment from the proximal femur in the Spanish population.

    PubMed

    Djorojevic, Mirjana; Roldán, Concepción; Botella, Miguel; Alemán, Inmaculada

    2016-01-01

    The current study was undertaken to test the validity and reproducibility of the Purkait triangle method and some alternative proposals for sex prediction from the proximal femur in the adult population of Spain. To that end, sexual dimorphism of the maximum femoral head diameter and the minimum femoral neck diameter were also evaluated. The study was conducted on 186 femora (109 males and 77 females) taken from the San José collection of identified individuals (Southern Spain). Discriminant function analyses (DFA) employing the jackknife procedure for cross-validations were considered. Overall, more than 94 % of individuals of both sexes were correctly classified. The most dimorphic single variable from the triangle method was the intertrochanteric apex distance (BC) that reached 85.5 % accuracy, falling below those obtained for the femoral head and femoral neck diameter, respectively, (89.8 and 91.9 %). Combining BC with the neck diameter, the predictive ability increased to 92.5 %; when femoral head diameter was added to the latter two, the classification success rate improved further up to 94.6 % (94.1 % after cross-validation). We conclude that the classification success rates of the Purkait's method remained considerably below any of those obtained with the models proposed in the present study which proved to be a much better and more reliable choice both as single predictors and in combination with other variables. PMID:25951948

  14. Retrorectal dermoid cyst manifested as an extrasphincteric perianal fistula - case report.

    PubMed

    Karagjozov, A; Milev, I; Antovic, S; Kadri, E

    2014-01-01

    Retrorectal tumors are very rare but well defined pathological entities in the literature. Also, an extrasphincteric fistula is a very rare form of perianal fistula which makes our case a very unusual and rare one, especially by the fact that it was successfully treated with the first operation and without protective stoma formation. The patient was first treated in hospital for a retrorectal abscess that had spontaneously ruptured in the postanal space. Because of the constant drainage of the suppurative content from the postanal opening in the following months, MRI and fistulography were performed, registering cystic formation in the retrorectal space with fistulous communication with the rectum above and completely separate from the sphincter mechanism. After that the patient was admitted for definitive treatment. The operation was performed with the patient in a prone jack-knife position. Complete excision of the cyst with the fistulous communication was performed and the rectum was sutured in two layers with separate slowly absorbable sutures. The wound was laid open and the patient was discharged on the 5th post operative day. After about ten months the defecation is normal, the wound is sealed and there are no signs of inflammation and secretion locally. PMID:25560513

  15. Dissociable executive functions in behavioral variant frontotemporal and Alzheimer dementias

    PubMed Central

    Feigenbaum, Dana; Rankin, Katherine P.; Smith, Glenn E.; Boxer, Adam L.; Wood, Kristie; Hanna, Sherrie M.; Miller, Bruce L.; Kramer, Joel H.

    2013-01-01

    Objective: The objective of this study was to determine which aspects of executive functions are most affected in behavioral variant frontotemporal dementia (bvFTD) and best differentiate this syndrome from Alzheimer disease (AD). Methods: We compared executive functions in 22 patients diagnosed with bvFTD, 26 with AD, and 31 neurologically healthy controls using a conceptually driven and comprehensive battery of executive function tests, the NIH EXAMINER battery (http://examiner.ucsf.edu). Results: The bvFTD and the AD patients were similarly impaired compared with controls on tests of working memory, category fluency, and attention, but the patients with bvFTD showed significantly more severe impairments than the patients with AD on tests of letter fluency, antisaccade accuracy, social decision-making, and social behavior. Discriminant function analysis with jackknifed cross-validation classified the bvFTD and AD patient groups with 73% accuracy. Conclusions: Executive function assessment can support bvFTD diagnosis when measures are carefully selected to emphasize frontally specific functions. PMID:23658382

  16. Prediction of malignant transformation in oral epithelial lesions by image cytometry.

    PubMed

    Abdel-Salam, M; Mayall, B H; Chew, K; Silverman, S; Greenspan, J S

    1988-11-01

    The value of image analysis in predicting the malignant potential of oral epithelial lesions showing either hyperplasia or dysplasia was investigated; 5-micron formalin-fixed sections of 16 oral epithelial lesions, of which eight had later transformed to carcinoma and eight had not transformed during a follow-up of 10-15 years were studied. The sections were stained with the azure A-Feulgen reaction for nuclear DNA. In each section 200 nuclei of epithelial cells and 20 nuclei of lymphocytes were assessed; all measurements were made blindly. For each nucleus six features related to shape and amount of stain and six features related to chromatin pattern were assessed. For each feature the mean, SD, and interquartile range were determined and used for linear stepwise discriminant analysis. A model of three variables with the most discriminating power was developed. When the jackknifed classification test was applied using this model, the malignant potential of the lesions that later transformed could be predicted with 87.5% accuracy. PMID:3167810

  17. Fast and Accurate Construction of Ultra-Dense Consensus Genetic Maps Using Evolution Strategy Optimization

    PubMed Central

    Mester, David; Ronin, Yefim; Schnable, Patrick; Aluru, Srinivas; Korol, Abraham

    2015-01-01

    Our aim was to develop a fast and accurate algorithm for constructing consensus genetic maps for chip-based SNP genotyping data with a high proportion of shared markers between mapping populations. Chip-based genotyping of SNP markers allows producing high-density genetic maps with a relatively standardized set of marker loci for different mapping populations. The availability of a standard high-throughput mapping platform simplifies consensus analysis by ignoring unique markers at the stage of consensus mapping thereby reducing mathematical complexity of the problem and in turn analyzing bigger size mapping data using global optimization criteria instead of local ones. Our three-phase analytical scheme includes automatic selection of ~100-300 of the most informative (resolvable by recombination) markers per linkage group, building a stable skeletal marker order for each data set and its verification using jackknife re-sampling, and consensus mapping analysis based on global optimization criterion. A novel Evolution Strategy optimization algorithm with a global optimization criterion presented in this paper is able to generate high quality, ultra-dense consensus maps, with many thousands of markers per genome. This algorithm utilizes "potentially good orders" in the initial solution and in the new mutation procedures that generate trial solutions, making it possible to obtain a consensus order in reasonable time. The developed algorithm, tested on a wide range of simulated data and real world data (Arabidopsis), outperformed two tested state-of-the-art algorithms by mapping accuracy and computation time. PMID:25867943

  18. Use of remote sensing for analysis and estimation of vector-borne disease

    NASA Astrophysics Data System (ADS)

    Rahman, Atiqur

    Epidemiological data on malaria cases were correlated with satellite-based vegetation health (VH) indices to investigate whether the indices can be used as a proxy for monitoring the number of malaria cases. Mosquitoes, which spread malaria in Bangladesh, are very sensitive to environmental conditions, especially to changes in weather. Therefore, VH indices, which characterize weather conditions, were tested as indicators of mosquitoes' activities in the spread of malaria. Satellite data were represented by the following VH indices: Vegetation Condition Index (VCI), Temperature Condition Index (TCI), and Vegetation Health Index (VHI). They were derived from radiances measured by the Advanced Very High Resolution Radiometer (AVHRR) flown on NOAA afternoon polar orbiting satellites. Assessment of the sensitivity of the VH indices was performed using correlation and regression analysis. Estimation models were validated using a jackknife cross-validation procedure. Results show that the VH indices can be used for detection and numerical estimation of the number of malaria cases. During the cooler months (January-April), when mosquitoes are less active, the correlation is low; it increases considerably during the warm and wet season (April-November), for TCI in early October and for VCI in mid September. All analyses and estimation models developed here are based on data obtained for Bangladesh.
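
    Jackknife cross-validation of a regression model can be sketched as follows: each observation is left out in turn, the model is refitted on the rest, and the held-out value is predicted. The abstract does not give the model form, so this example assumes a single-predictor least-squares line; the function names and test values are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def jackknife_cv_rmse(xs, ys):
    """Root-mean-square error of leave-one-out predictions.

    `xs` and `ys` are equal-length lists of floats with distinct x values.
    """
    squared_errors = []
    for i in range(len(xs)):
        # Refit without observation i, then predict it.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predicted = intercept + slope * xs[i]
        squared_errors.append((ys[i] - predicted) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up, perfectly linear data: every held-out point is predicted exactly.
perfect = jackknife_cv_rmse([1.0, 2.0, 3.0, 4.0, 5.0], [3.0, 5.0, 7.0, 9.0, 11.0])
```

    In a setting like the one described, xs would hold a VH index value per period and ys the corresponding case counts; a cross-validated error much larger than the in-sample error would suggest the model does not generalize.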

  19. Evaluation of mark-recapture for estimating striped skunk abundance

    USGS Publications Warehouse

    Greenwood, R.J.; Sargeant, A.B.; Johnson, D.H.

    1985-01-01

    The mark-recapture method for estimating striped skunk (Mephitis mephitis) abundance was evaluated by systematically livetrapping a radio-equipped population on a 31.4-km² study area in North Dakota during late April of 1977 and 1978. The study population was 10 females and 13 males in 1977 and 20 females and 8 males in 1978. Skunks were almost exclusively nocturnal. Males traveled greater nightly distances than females (3.3 vs. 2.6 km, P < 0.05) and had larger home ranges (308 vs. 242 ha) although not significantly so. Increased windchill reduced night-time activity. The population was demographically but not geographically closed. Frequency of capture was positively correlated with time skunks spent on the study area. Little variation in capture probabilities was found among trap-nights. Skunks exhibited neither trap-proneness nor shyness. Capture rates in 1977 were higher for males than for females; the reverse occurred in 1978. Variation in individual capture rates was indicated among males in 1977 and among females in 1978. Ten estimators produced generally similar results, but all underestimated true population size. Underestimation was a function of the number of untrapped skunks, primarily those that spent limited time on the study area. The jackknife method produced the best estimates of skunk abundance.
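
    The abstract does not state which jackknife estimator was used. As an illustration only, the first-order jackknife for closed-population capture data adds a correction based on animals caught exactly once: N = S + ((k - 1) / k) * f1, where S is the number of distinct animals caught, k the number of trapping occasions, and f1 the number caught on exactly one occasion. The function name and capture histories below are hypothetical.

```python
def jackknife_abundance(capture_histories):
    """First-order jackknife abundance estimate.

    `capture_histories` is a list of equal-length 0/1 lists, one per animal,
    with one entry per trapping occasion.
    """
    caught = [h for h in capture_histories if sum(h) > 0]
    if not caught:
        return 0.0
    k = len(caught[0])                            # number of trapping occasions
    f1 = sum(1 for h in caught if sum(h) == 1)    # animals caught exactly once
    return len(caught) + (k - 1) / k * f1


# Made-up histories: 4 animals over 5 occasions, 2 of them caught only once.
histories = [[1, 0, 0, 0, 0],
             [1, 1, 0, 0, 0],
             [0, 0, 1, 0, 0],
             [1, 0, 1, 1, 0]]
estimate = jackknife_abundance(histories)  # 4 + 0.8 * 2 = 5.6
```

    Animals that spend little time on the study area tend to be caught once or never, which is consistent with the abstract's remark that untrapped skunks drove the underestimation.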

  20. Phylogeny of sipunculan worms: A combined analysis of four gene regions and morphology.

    PubMed

    Schulze, Anja; Cutler, Edward B; Giribet, Gonzalo

    2007-01-01

    The intra-phyletic relationships of sipunculan worms were analyzed based on DNA sequence data from four gene regions and 58 morphological characters. Initially we analyzed the data under direct optimization using parsimony as optimality criterion. An implied alignment resulting from the direct optimization analysis was subsequently utilized to perform a Bayesian analysis with mixed models for the different data partitions. For this we applied a doublet model for the stem regions of the 18S rRNA. Both analyses support monophyly of Sipuncula and most of the same clades within the phylum. The analyses differ with respect to the relationships among the major groups but whereas the deep nodes in the direct optimization analysis generally show low jackknife support, they are supported by 100% posterior probability in the Bayesian analysis. Direct optimization has been useful for handling sequences of unequal length and generating conservative phylogenetic hypotheses whereas the Bayesian analysis under mixed models provided high resolution in the basal nodes of the tree. PMID:16919974

  1. Instability-based mechanism for body undulations in centipede locomotion

    NASA Astrophysics Data System (ADS)

    Aoi, Shinya; Egi, Yoshimasa; Tsuchiya, Kazuo

    2013-01-01

    Centipedes have many body segments and legs and they generate body undulations during terrestrial locomotion. Centipede locomotion has the characteristic that body undulations are absent at low speeds but appear at faster speeds; furthermore, their amplitude and wavelength increase with increasing speed. There are conflicting reports regarding whether the muscles along the body axis resist or support these body undulations and the underlying mechanisms responsible for the body undulations remain largely unclear. In the present study, we investigated centipede locomotion dynamics using computer simulation with a body-mechanical model and experiment with a centipede-like robot and then conducted dynamic analysis with a simple model to clarify the mechanism. The results reveal that body undulations in these models occur due to an instability caused by a supercritical Hopf bifurcation. We subsequently compared these results with data obtained using actual centipedes. The model and actual centipedes exhibit similar dynamic properties, despite centipedes being complex, nonlinear dynamic systems. Based on our findings, we propose a possible passive mechanism for body undulations in centipedes, similar to a follower force or jackknife instability. We also discuss the roles of the muscles along the body axis in generating body undulations in terms of our physical model.

  2. Identification of pathogenic fungi with an optoelectronic nose

    PubMed Central

    Zhang, Yinan; Askim, Jon R.; Zhong, Wenxuan; Orlean, Peter; Suslick, Kenneth S.

    2014-01-01

    Human fungal infections have gained recent notoriety following contamination of pharmaceuticals in the compounding process. Such invasive infections are a more serious global problem, especially for immunocompromised patients. While superficial fungal infections are common and generally curable, invasive fungal infections are often life-threatening and much harder to diagnose and treat. Despite the increasing awareness of the situation’s severity, currently available fungal diagnostic methods cannot always meet diagnostic needs, especially for invasive fungal infections. Volatile organic compounds produced by fungi provide an alternative diagnostic approach for identification of fungal strains. We report here an optoelectronic nose based on a disposable colorimetric sensor array capable of rapid differentiation and identification of pathogenic fungi based on their metabolic profiles of emitted volatiles. The sensor arrays were tested with 12 human pathogenic fungal strains grown on standard agar medium. Array responses were monitored with an ordinary flatbed scanner. All fungal strains gave unique composite responses within 3 hours and were correctly clustered using hierarchical cluster analysis. A standard jackknifed linear discriminant analysis gave a classification accuracy of 94% for 155 trials. Tensor discriminant analysis, which takes better advantage of the high dimensionality of the sensor array data, gave a classification accuracy of 98.1%. The sensor array is also able to observe metabolic changes in growth patterns upon the addition of fungicides, and this provides a facile screening tool for determining fungicide efficacy for various fungal strains in real time. PMID:24570999

  3. Cosmic Shear Measurements with DES Science Verification Data

    SciTech Connect

    Becker, M. R.

    2015-07-20

    We present measurements of weak gravitational lensing cosmic shear two-point statistics using Dark Energy Survey Science Verification data. We demonstrate that our results are robust to the choice of shear measurement pipeline, either ngmix or im3shape, and robust to the choice of two-point statistic, including both real and Fourier-space statistics. Our results pass a suite of null tests including tests for B-mode contamination and direct tests for any dependence of the two-point functions on a set of 16 observing conditions and galaxy properties, such as seeing, airmass, galaxy color, galaxy magnitude, etc. We use a large suite of simulations to compute the covariance matrix of the cosmic shear measurements and assign statistical significance to our null tests. We find that our covariance matrix is consistent with the halo model prediction, indicating that it has the appropriate level of halo sample variance. We also compare the same jackknife procedure applied to the data and the simulations in order to search for additional sources of noise not captured by the simulations. We find no statistically significant extra sources of noise in the data. The overall detection significance with tomography for our highest source density catalog is 9.7σ. Cosmological constraints from the measurements in this work are presented in a companion paper (DES et al. 2015).

  4. Hypothesis tests for hydrologic alteration

    NASA Astrophysics Data System (ADS)

    Kroll, Charles N.; Croteau, Kelly E.; Vogel, Richard M.

    2015-11-01

    Hydrologic systems can be altered by anthropogenic and climatic influences. While there are a number of statistical frameworks for describing and evaluating the extent of hydrologic alteration, here we present a new framework for assessing whether statistically significant hydrologic alteration has occurred, or whether the shift in the hydrologic regime is consistent with the natural variability of the system. Four hypothesis tests based on shifts of flow duration curves (FDCs) are developed and tested using three different experimental designs based on different strategies for resampling of annual FDCs. The four hypothesis tests examined are the Kolmogorov-Smirnov (KS), Kuiper (K), confidence interval (CI), and ecosurplus and ecodeficit (Eco). Here 117 streamflow sites that have potentially undergone hydrologic alteration due to reservoir construction are examined. 20 years of pre-reservoir record is used to develop the critical value of the test statistic for type I errors of 5% and 10%, while 10 years of post-alteration record is used to examine the power of each test. The best experimental design, based on calculating the mean annual FDC from an exhaustive jackknife resampling regime, provided a larger number of unique values of each test statistic and properly reproduced type I errors. Of the four tests, the CI test consistently had the highest power, while the K test had the second highest power; KS and Eco always had the lowest power. The power of the CI test appeared related to the storage ratio of the reservoir, a rough measure of the hydrologic alteration of the system.

  5. Structural class prediction of protein using novel feature extraction method from chaos game representation of predicted secondary structure.

    PubMed

    Zhang, Lichao; Kong, Liang; Han, Xiaodong; Lv, Jinfeng

    2016-07-01

    Protein structural class prediction plays an important role in protein structure and function analysis, drug design and many other biological applications. Extracting good representation from protein sequence is fundamental for this prediction task. In recent years, although several secondary structure based feature extraction strategies have been specially proposed for low-similarity protein sequences, the prediction accuracy still remains limited. To explore the potential of secondary structure information, this study proposed a novel feature extraction method from the chaos game representation of predicted secondary structure to mainly capture sequence order information and secondary structure segments distribution information in a given protein sequence. Several kinds of prediction accuracies obtained by the jackknife test are reported on three widely used low-similarity benchmark datasets (25PDB, 1189 and 640). Compared with the state-of-the-art prediction methods, the proposed method achieves the highest overall accuracies on all the three datasets. The experimental results confirm that the proposed feature extraction method is effective for accurate prediction of protein structural class. Moreover, it is anticipated that the proposed method could be extended to other graphical representations of protein sequence and be helpful in future research. PMID:27084358

  6. Full Moment Tensor Analysis Using First Motion Data at The Geysers Geothermal Field

    NASA Astrophysics Data System (ADS)

    Boyd, O.; Dreger, D. S.; Lai, V. H.; Gritto, R.

    2012-12-01

    Seismicity associated with geothermal energy production at The Geysers Geothermal Field in northern California has been increasing during the last forty years. We investigate source models of over fifty earthquakes with magnitudes ranging from Mw 3.5 up to Mw 4.5. We invert three-component, complete waveform data from broadband stations of the Berkeley Digital Seismic Network, the Northern California Seismic Network and the USA Array deployment (2005-2007) for the complete, six-element moment tensor. Some solutions are double-couple while others have substantial non-double-couple components. To assess the stability and significance of non-double-couple components, we use a suite of diagnostic tools including the F-test, Jackknife test, bootstrap and network sensitivity solution (NSS). The full moment tensor solutions of the studied events tend to plot in the upper half of the Hudson source type diagram where the fundamental source types include +CLVD, +LVD, tensile-crack, DC and explosion. Using the F-test to compare the goodness-of-fit values between the full and deviatoric moment tensor solutions, most of the full moment tensor solutions do not show a statistically significant improvement in fit over the deviatoric solutions. Because a small isotropic component may not significantly improve the fit, we include first motion polarity data to better constrain the full moment tensor solutions.

  7. Stability and Uncertainty of Full Moment Tensor Solutions for M < 3.5 Induced Earthquakes

    NASA Astrophysics Data System (ADS)

    Boyd, O. S.; Dreger, D. S.

    2014-12-01

    The increase in earthquakes associated with industrial activities has created a need to investigate and characterize the source physics of induced seismicity. Many techniques and approaches are available to determine representative source parameters of these events. For M > 3.5 events, high quality seismic data from regional networks can be used to provide reasonable estimates of moment tensor solutions. In this investigation we explore various techniques and datasets to constrain full moment tensor solutions of M < 3.5 induced events, expanding upon the approach developed by Guilhem et al. (2014). Small magnitude events recorded by local seismic networks can yield good quality data with distinct body wave and converted phases depending upon the velocity structure and frequency range. Generating synthetic seismograms or Green's functions to accurately model these high frequency phases can be challenging. To investigate the variability associated with the choice of Green's functions, we test available codes to see how well they capture body wave phases. Other stability and uncertainty measures include the F-test, Jackknife test, residual bootstrap, and Network Sensitivity Solution (Ford et al., 2009; Ford et al., 2010). Additional datasets to constrain the full moment tensor solution include P-wave first motions and amplitude ratios.

  8. Bacterial community structure and soil properties of a subarctic tundra soil in Council, Alaska

    PubMed Central

    Kim, Hye Min; Jung, Ji Young; Yergeau, Etienne; Hwang, Chung Yeon; Hinzman, Larry; Nam, Sungjin; Hong, Soon Gyu; Kim, Ok-Sun; Chun, Jongsik; Lee, Yoo Kyung

    2014-01-01

    The subarctic region is highly responsive and vulnerable to climate change. Understanding the structure of subarctic soil microbial communities is essential for predicting the response of the subarctic soil environment to climate change. To determine the composition of the bacterial community and its relationship with soil properties, we investigated the bacterial community structure and properties of surface soil from the moist acidic tussock tundra in Council, Alaska. We collected 70 soil samples with 25-m intervals between sampling points from 0–10 cm to 10–20 cm depths. The bacterial community was analyzed by pyrosequencing of 16S rRNA genes, and the following soil properties were analyzed: soil moisture content (MC), pH, total carbon (TC), total nitrogen (TN), and inorganic nitrogen (NH4+ and NO3−). The community compositions of the two different depths showed that Alphaproteobacteria decreased with soil depth. Among the soil properties measured, soil pH was the most significant factor correlating with bacterial community in both upper and lower-layer soils. Bacterial community similarity based on jackknifed unweighted UniFrac distance showed greater similarity across horizontal layers than through the vertical depth. This study showed that soil depth and pH were the most important soil properties determining bacterial community structure of the subarctic tundra soil in Council, Alaska. PMID:24893754

  9. Identification of immunoglobulins using Chou's pseudo amino acid composition with feature selection technique.

    PubMed

    Tang, Hua; Chen, Wei; Lin, Hao

    2016-04-22

    Immunoglobulins, also called antibodies, are a group of cell surface proteins which are produced by the immune system in response to the presence of a foreign substance (called antigen). They play key roles in many medical, diagnostic and biotechnological applications. Correct identification of immunoglobulins is crucial to the comprehension of humoral immune function. With the avalanche of protein sequences identified in postgenomic age, it is highly desirable to develop computational methods to timely identify immunoglobulins. In view of this, we designed a predictor called "IGPred" by formulating protein sequences with the pseudo amino acid composition into which nine physiochemical properties of amino acids were incorporated. Jackknife cross-validated results showed that 96.3% of immunoglobulins and 97.5% of non-immunoglobulins can be correctly predicted, indicating that IGPred holds very high potential to become a useful tool for antibody analysis. For the convenience of most experimental scientists, a web-server for IGPred was established at http://lin.uestc.edu.cn/server/IGPred. We believe that the web-server will become a powerful tool to study immunoglobulins and to guide related experimental validations. PMID:26883492

  10. Additional Value of Diffusion-weighted MRI to Gd-EOB-DTPA-enhanced Hepatic MRI for the Detection of Liver Metastasis: the Difference Depending on the Experience of the Radiologists.

    PubMed

    Fukumoto, Wataru; Nakamura, Yuko; Higaki, Toru; Tatsugami, Fuminari; Iida, Makoto; Awai, Kazuo

    2015-06-01

    This retrospective study investigated whether adding diffusion-weighted imaging (DWI) to Gd-EOB-DTPA-enhanced MRI (EOB-MRI) improved the detection of liver metastasis in radiology resident and board-certified radiologist groups. The study was approved by our institutional review board. We selected 18 patients with 35 liver metastases and 12 patients without liver tumors. Five board-certified radiologists and 5 radiology residents participated in the observer performance study. Each observer first interpreted T1- and T2-weighted, plain, arterial-phase, and hepatobiliary-phase images and specified the location of the liver metastases. The software subsequently displayed the DWI images simultaneously and all participants repeated the reading. We used jackknife alternative free-response receiver operating characteristic (JAFROC) analysis to compare the observer performance in detecting liver metastases. The mean values for the area under the curve (AUC) for EOB-MRI without and with DWI were 0.78 ± 0.13 (standard deviation, SD) and 0.87 ± 0.09, respectively, for the radiology residents, and the difference was statistically significant (p = 0.045). For the board-certified radiologists these values were 0.92 ± 0.02 and 0.96 ± 0.01, respectively, and the difference was not statistically significant (p = 0.092). EOB-MRI with DWI significantly improved the performance of radiology residents in the identification of liver metastases. PMID:26211220

  11. Predicting miRNA's target from primary structure by the nearest neighbor algorithm.

    PubMed

    Lin, Kao; Qian, Ziliang; Lu, Lin; Lu, Lingyi; Lai, Lihui; Gu, Jieyi; Zeng, Zhenbing; Li, Haipeng; Cai, Yudong

    2010-11-01

    We used a machine learning method, the nearest neighbor algorithm (NNA), to learn the relationship between miRNAs and their target proteins, generating a predictor which can then judge whether a new miRNA-target pair is true or not. We acquired 198 positive (true) miRNA-target pairs from Tarbase and the literature, and generated 4,888 negative (false) pairs through random combination. A 0/1 system and the frequencies of single nucleotides and di-nucleotides were used to encode miRNAs into vectors while various physicochemical parameters were used to encode the targets. The NNA was then applied, learning from these data to produce a predictor. We implemented minimum redundancy maximum relevance (mRMR) and properties forward selection (PFS) to reduce the redundancy of our encoding system, obtaining 91 most efficient properties. Finally, via the Jackknife cross-validation test, we got a positive accuracy of 69.2% and an overall accuracy of 96.0% with all the 253 properties. Besides, we got a positive accuracy of 83.8% and an overall accuracy of 97.2% with the 91 most efficient properties. A web-server for predictions is also made available at http://app3.biosino.org:8080/miRTP/index.jsp. PMID:20041294
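
    As a rough illustration of the jackknife cross-validation test described in this abstract, the sketch below scores a 1-nearest-neighbor classifier by leaving each sample out in turn and predicting it from its closest remaining sample. It reports an overall accuracy and a positive accuracy (accuracy on the true pairs only), mirroring the two figures quoted above. The Euclidean distance, the two-dimensional toy encoding and the function name are assumptions made for the example, not details of the published predictor.

```python
import numpy as np

def jackknife_nn_accuracy(features, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels)
    # Pairwise squared Euclidean distances between all samples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)   # a sample may not be its own neighbor
    nearest = d2.argmin(axis=1)    # index of the nearest *other* sample
    predicted = y[nearest]
    overall = float(np.mean(predicted == y))
    positive = float(np.mean(predicted[y == 1] == 1))
    return overall, positive

# Toy encoded pairs (hypothetical): label 1 = true pair, 0 = false pair.
X = np.array([[0, 0], [1, 0], [0, 1],
              [5, 5], [6, 5], [5, 6], [2, 2]])
y = np.array([0, 0, 0, 1, 1, 1, 1])

overall_acc, positive_acc = jackknife_nn_accuracy(X, y)
# The positive sample at (2, 2) sits nearest a negative sample, so it is
# the only one misclassified: overall 6/7, positive 3/4.
```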

  12. Flowering timing prediction in Australian native understorey species (Acrotriche R.Br Ericaceae) using meteorological data

    NASA Astrophysics Data System (ADS)

    Schneemilch, Melanie; Kokkinn, Michael; Williams, Craig R.

    2012-01-01

    The aim of this study was to determine the climatic influences on floral development for five members of the Australian native plant genus Acrotriche R. Br (Ericaceae). An observed period of summer floral dormancy suggests temperature is involved in flowering regulation in these species. Models were developed to determine temperature requirements associated with the likelihood of flowering occurring on any one day. To this end, the timing of flowering and meteorological data were collated for several sites, and multivariate logistic regressions performed to identify variables with a significant influence on flowering timing. The resultant models described a large amount of variation in flowering presence/absence, with r² values ranging from 0.72 to 0.79. Temperature was identified as influential on both floral development and flowering timing in each of the study species. The positive influence of short photoperiods on flowering in three of the winter flowering species was not surprising. However, the reporting here of a significant association between interdiurnal temperature and flowering in one species is novel. The predictive power of the models was validated through a jackknife sequential recalculation approach, revealing strong positive and negative predictive ability for flowering for four of the five species. Applications of the models include assisting in determination of the suitability of areas for vegetation restoration and identifying the possible effects of climate change on flowering in the study species.

  13. Body height estimation based on dimensions of sacral and coccygeal vertebrae.

    PubMed

    Pelin, Can; Duyar, Izzet; Kayahan, Esra M; Zağyapan, Ragiba; Ağildere, A Muhteşem; Erar, Aydin

    2005-03-01

    This study evaluates whether it is possible to predict living stature from sacral and coccygeal vertebral dimensions. Individual vertebral body heights, sacral height (SH), and sacrococcygeal height (SCH) were recorded from the magnetic resonance images of 42 adult males. The sum of the heights of the five sacral vertebrae (ΣS), the first four coccygeal vertebrae (ΣC), and the total height of the sacral and the first four coccygeal vertebrae together (ΣSC) were also recorded. Linear regression equations for stature estimation were produced using the above-mentioned variables. The regression equations were constructed and tested using a jackknife procedure. Statistical analyses indicated that the combined variables (SH, SCH, ΣS, ΣC, ΣSC) were more accurate predictors of stature than the heights of individual vertebrae. The results of the study indicated that the equations derived from sacrococcygeal dimensions perform somewhat better than ones based on foot and head variables, but worse than those based on long-bone length. In conclusion, the dimensions of sacral and coccygeal vertebrae could be used for stature estimation when long bones are not available. PMID:15813539

  14. Multiple Subject Barycentric Discriminant Analysis (MUSUBADA): How to Assign Scans to Categories without Using Spatial Normalization

    PubMed Central

    Abdi, Hervé; Williams, Lynne J.; Connolly, Andrew C.; Gobbini, M. Ida; Dunlop, Joseph P.; Haxby, James V.

    2012-01-01

    We present a new discriminant analysis (DA) method called Multiple Subject Barycentric Discriminant Analysis (MUSUBADA) suited for analyzing fMRI data because it handles datasets with multiple participants who each provide a different number of variables (i.e., voxels) that are themselves grouped into regions of interest (ROIs). Like DA, MUSUBADA (1) assigns observations to predefined categories, (2) gives factorial maps displaying observations and categories, and (3) optimally assigns observations to categories. MUSUBADA handles cases with more variables than observations and can project portions of the data table (e.g., subtables, which can represent participants or ROIs) on the factorial maps. Therefore MUSUBADA can analyze datasets with different voxel numbers per participant and so does not require spatial normalization. MUSUBADA statistical inferences are implemented with cross-validation techniques (e.g., jackknife and bootstrap); its performance is evaluated with confusion matrices (for fixed and random models) and represented with prediction, tolerance, and confidence intervals. We present an example where we predict the categories (houses, shoes, chairs, and human, monkey, and dog faces) of images watched by participants whose brains were scanned. This example corresponds to a DA question in which the data table is made of subtables (one per subject) and with more variables than observations. PMID:22548125

  15. Optimizing the feature set for a Bayesian network for breast cancer diagnosis using genetic algorithm techniques

    NASA Astrophysics Data System (ADS)

    Wang, Xiao Hui; Zheng, Bin; Chang, Yuan-Hsiang; Good, Walter F.

    1999-05-01

    This study investigates the degree to which the performance of Bayesian belief networks (BBNs), for computer-assisted diagnosis of breast cancer, can be improved by optimizing their input feature sets using a genetic algorithm (GA). 421 cases (all women) were used in this study, of which 92 were positive for breast cancer. Each case contained both non-image information and image information derived from mammograms by radiologists. A GA was used to select an optimal subset of features, from a total of 21, to use as the basis for a BBN classifier. The figure-of-merit used in the GA's evaluation of feature subsets was Az, the area under the ROC curve produced by the corresponding BBN classifier. For each feature subset evaluated by the GA, a BBN was developed to classify positive and negative cases. Overall performance of the BBNs was evaluated using a jackknife testing method to calculate Az for their respective ROC curves. The Az value of the BBN incorporating all 21 features was 0.851 ± 0.012. After a 93-generation search, the GA found an optimal feature set with four non-image and four mammographic features, which achieved an Az value of 0.927 ± 0.009. This study suggests that GAs are a viable means to optimize feature sets, and optimizing feature sets can result in significant performance improvements.

  16. Predicting subcellular location of proteins using integrated-algorithm method.

    PubMed

    Cai, Yu-Dong; Lu, Lin; Chen, Lei; He, Jian-Feng

    2010-08-01

    A protein's subcellular location, which indicates where the protein resides in a cell, is an important characteristic of the protein. Correctly assigning proteins to their subcellular locations would be of great help to the prediction of proteins' function, genome annotation, and drug design. Yet, in spite of great technical advances in the past decades, it is still time-consuming and laborious to experimentally determine protein subcellular locations on a high-throughput scale. Hence, four integrated-algorithm methods were developed in this article to perform such high-throughput prediction. Two data sets taken from the literature (Chou and Elrod, Protein Eng 12:107-118, 1999) were used as training set and test set, which consisted of 2,391 and 2,598 proteins, respectively. Amino acid composition was applied to represent the protein sequences. Jackknife cross-validation was used to test the training set. The final best integrated-algorithm predictor was constructed by integrating 10 algorithms in Weka (a software tool for tackling data mining tasks, http://www.cs.waikato.ac.nz/ml/weka/ ) based on an mRMR (Minimum Redundancy Maximum Relevance, http://research.janelia.org/peng/proj/mRMR/ ) method. It achieves correct rates of 77.83% and 80.56% for the training set and test set, respectively, which is better than all of the 60 algorithms collected in Weka. This predicting software is available upon request. PMID:19662505

  17. Gastrointestinal Parasites of Ecuadorian Mantled Howler Monkeys (Alouatta palliata aequatorialis) Based on Fecal Analysis.

    PubMed

    Helenbrook, William D; Wade, Susan E; Shields, William M; Stehman, Stephen V; Whipps, Christopher M

    2015-06-01

    An analysis of gastrointestinal parasites of Ecuadorian mantled howler monkeys, Alouatta palliata aequatorialis, was conducted based on examination of fecal smears, flotations, and sedimentations. At least 1 type of parasite was detected in 97% of the 96 fecal samples screened across 19 howler monkey groups using these techniques. Samples averaged 3.6 parasite species per individual (±1.4 SD). Parasites included species representing genera of 2 apicomplexans: Cyclospora sp. (18% of individual samples) and Isospora sp. (3%); 6 other protozoa: Balantidium sp. (9%), Blastocystis sp. (60%), Chilomastix sp. (4%), Dientamoeba sp. (3%), Entamoeba species (56%), Iodamoeba sp. (5%); 4 nematodes: Enterobius sp. (3%), Capillaria sp. (78%), Strongyloides spp. (88%) which included 2 morphotypes, Trypanoxyuris sp. (12%); and the platyhelminth Controrchis sp. (15%). A statistically significant positive correlation was found between group size and each of 3 different estimators of parasite species richness adjusted for sampling effort (ICE: r² = 0.24, P = 0.05; Chao2: r² = 0.25, P = 0.05; and Jackknife: r² = 0.31, P = 0.03). Two significant associations between co-infecting parasites were identified. Based on the prevalence data, individuals infected with Balantidium sp. were more likely to also be infected with Isospora sp. (χ² = 6.02, P = 0.01), while individuals harboring Chilomastix sp. were less likely to have Capillaria sp. present (χ² = 4.03, P = 0.04). PMID:25686475

  18. Using logistic regression to estimate delay-discounting functions.

    PubMed

    Wileyto, E Paul; Audrain-McGovern, Janet; Epstein, Leonard H; Lerman, Caryn

    2004-02-01

    The monetary choice questionnaire (MCQ) and similar computer tasks ask preference questions in order to ascertain indifference, the perceived equivalence of immediate versus larger delayed rewards. Indifference data are then fitted with a hyperbolic function, summarizing the decline in perceived value with delay time. We present a fitting method that estimates the hyperbolic parameter k directly from survey responses. Binary preferences are modeled as a function of time (X2) and a transformed reward ratio (X1), yielding logistic regression coefficients β2 and β1. The hyperbolic parameter emerges as k = β2/β1, where the logistic predicted p = .5 (the definition of indifference). The MCQ was administered to 1,073 adolescents and was scored using both standard and logistic methods. The means for ln(k) were similar (standard, -4.53; logistic, -4.51), and the results were highly correlated (rho = .973). Simulated MCQ data showed that k was unbiased, except where β1 ≥ -1, indicating a vague survey response. Jackknife standard errors provided excellent coverage. PMID:15190698
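
    The ratio estimate of k described in this abstract can be sketched on simulated choices. The abstract does not spell out the transformation, so the code assumes X1 = 1 − (delayed amount / immediate amount), X2 = delay, an outcome coded 1 when the delayed reward is chosen, and no intercept. Under hyperbolic discounting the indifference point then satisfies X1 = −k·X2, so a linear predictor of zero gives k = β2/β1. The Newton-Raphson fitter, all names and the simulation settings are hypothetical.

```python
import numpy as np

def fit_logistic_no_intercept(X, y, n_iter=25):
    """Newton-Raphson fit of logit P(y = 1) = X @ beta, without intercept."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30, 30)   # guard against overflow in exp
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)               # score vector
        hess = X.T @ (X * w[:, None])      # observed information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def simulate_choices(k_true=0.01, beta1=-3.0, n=4000, seed=0):
    """Simulate delayed-vs-immediate choices under hyperbolic discounting."""
    rng = np.random.default_rng(seed)
    immediate = rng.uniform(10, 80, n)
    delayed = immediate * rng.uniform(1.05, 3.0, n)
    delay = rng.uniform(7, 180, n)         # delay in days
    x1 = 1.0 - delayed / immediate         # assumed transformed reward ratio
    X = np.column_stack([x1, delay])
    eta = beta1 * x1 + (k_true * beta1) * delay   # beta2 = k * beta1
    y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)
    return X, y

X, y = simulate_choices()
beta1_hat, beta2_hat = fit_logistic_no_intercept(X, y)
k_hat = beta2_hat / beta1_hat              # estimated discounting parameter
```

    In this parameterization a β1 close to zero means the choices barely depend on the reward ratio, so the ratio β2/β1 becomes unstable; that matches the abstract's remark about vague survey responses.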

  19. A systematic assessment of automated ribosomal intergenic spacer analysis (ARISA) as a tool for estimating bacterial richness.

    PubMed

    Kovacs, Amir; Yacoby, Keren; Gophna, Uri

    2010-04-01

    ARISA (automated ribosomal intergenic spacer analysis) is a commonly used method for microbial community analysis that provides estimates of microbial richness and diversity. Here we investigated the potential biases of ARISA in richness estimation by performing computer simulations using 722 complete genomes. Our simulations based on in silico PCR demonstrated that over 8% of bacterial strains represented by complete genomes will never yield a PCR fragment using ARISA primers, usually because their ribosomal RNA genes are not organized in an operon. Despite the tendency of ARISA to overestimate species richness, a strong linear correlation exists between the observed number of fragments, even after binning, and the actual number of species in the sample. This linearity is fairly robust to the taxon sampling in the database as it is also observed on subsets of the 722 genome database using a jackknife approach. However, this linearity disappears when the species richness is high and binned fragment lengths gradually become saturated. We suggest that for ARISA-based richness estimates, where the number of binned lengths observed ranges between 10 and 116, a correction should be used in order to obtain more accurate "species richness" results comparable to 16S rRNA clone-library data. PMID:20138144

  20. Visual search for tropical web spiders: the influence of plot length, sampling effort, and phase of the day on species richness.

    PubMed

    Pinto-Leite, C M; Rocha, P L B

    2012-12-01

    Empirical studies using visual search methods to investigate spider communities were conducted with different sampling protocols, including a variety of plot sizes, sampling efforts, and diurnal periods for sampling. We sampled 11 plots ranging in size from 5 by 10 m to 5 by 60 m. In each plot, we computed the total number of species detected every 10 min during 1 hr during the daytime and during the nighttime (0630 hours to 1100 hours, both a.m. and p.m.). We measured the influence of time effort on the measurement of species richness by comparing the curves produced by sample-based rarefaction and species richness estimation (first-order jackknife). We used a general linear model with repeated measures to assess whether the phase of the day during which sampling occurred and the differences in the plot lengths influenced the number of species observed and the number of species estimated. To measure the differences in species composition between the phases of the day, we used a multiresponse permutation procedure and a graphical representation based on nonmetric multidimensional scaling. After 50 min of sampling, we noted a decreased rate of species accumulation and a tendency of the estimated richness curves to reach an asymptote. We did not detect an effect of plot size on the number of species sampled. However, differences in observed species richness and species composition were found between phases of the day. Based on these results, we propose guidelines for visual search for tropical web spiders. PMID:23321102
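
    The first-order jackknife richness estimate mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, assuming the standard incidence-based form S_jack1 = S_obs + Q1 (m - 1) / m, where m is the number of sampling units and Q1 is the number of species found in exactly one unit. The function name and the toy species lists are hypothetical and are not taken from the study.

```python
def jackknife1_richness(units):
    """First-order jackknife richness estimate.

    units: list of sampling units, each a collection of the species
           detected in that unit (hypothetical input format).
    """
    m = len(units)
    if m == 0:
        raise ValueError("need at least one sampling unit")

    # Count in how many sampling units each species was detected.
    occupancy = {}
    for unit in units:
        for species in set(unit):
            occupancy[species] = occupancy.get(species, 0) + 1

    s_obs = len(occupancy)                                   # observed richness
    uniques = sum(1 for c in occupancy.values() if c == 1)   # Q1
    return s_obs + uniques * (m - 1) / m


# Toy data: four sampling units (for example, four 10-min search intervals).
toy_units = [{"a", "b", "c"}, {"a", "b"}, {"a", "d"}, {"a", "b", "e"}]
estimate = jackknife1_richness(toy_units)
```

    With the toy data, five species are observed and three of them occur in a single unit, so the estimate is 5 + 3 * 3/4 = 7.25.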

  1. Ectomycorrhizal communities in a productive Tuber aestivum Vittad. orchard: composition, host influence and species replacement.

    PubMed

    Benucci, Gian Maria Niccolò; Raggi, Lorenzo; Albertini, Emidio; Grebenc, Tine; Bencivenga, Mattia; Falcinelli, Mario; Di Massimo, Gabriella

    2011-04-01

    Truffles (Tuber spp.) and other ectomycorrhizal species form species-rich assemblages in the wild as well as in cultivated ecosystems. We aimed to investigate the ectomycorrhizal communities of hazels and hornbeams that are growing in a 24-year-old Tuber aestivum orchard. We demonstrated that the ectomycorrhizal communities included numerous species and were phylogenetically diverse. Twenty-nine ectomycorrhizal taxa were identified. Tuber aestivum ectomycorrhizae were abundant (9.3%), exceeded only by those of Tricholoma scalpturatum (21.4%), and were detected in both plant symbionts, with variation in distribution and abundance between the two hosts. The Thelephoraceae family was the most diverse, being represented by 12 taxa. The overall observed diversity represented 85% of the potential diversity, as determined by a jackknife estimation of richness, and was significantly higher in hazel than in hornbeam. The ectomycorrhizal communities of hornbeam trees were closely related phylogenetically, whereas no clear distribution pattern was observed for the communities in hazel. Uniform site characteristics indicated that ectomycorrhizal relationships were host mediated, but not host specific. Despite the fact that different plant species hosted diverse ectomycorrhizal communities and that the abundance of T. aestivum differed among sites, no difference was detected in the production of fruiting bodies. PMID:21223332

  2. Predicting cancerlectins by the optimal g-gap dipeptides.

    PubMed

    Lin, Hao; Liu, Wei-Xin; He, Jiao; Liu, Xin-Hui; Ding, Hui; Chen, Wei

    2015-01-01

    Cancerlectins play a key role in the process of tumor cell differentiation. Thus, fully understanding the function of cancerlectins is important because it sheds light on future directions for cancer therapy. However, the traditional wet-experimental methods are costly and time-consuming. It is highly desirable to develop an effective and efficient computational tool to identify cancerlectins. In this study, we developed a sequence-based method to discriminate between cancerlectins and non-cancerlectins. The analysis of variance (ANOVA) was used to choose the optimal feature set derived from the g-gap dipeptide composition. The jackknife cross-validated results showed that the proposed method achieved an accuracy of 75.19%, which is superior to other published methods. For the convenience of other researchers, an online web-server CaLecPred was established and can be freely accessed from the website http://lin.uestc.edu.cn/server/CalecPred. We believe that the CaLecPred is a powerful tool to study cancerlectins and to guide the related experimental validations. PMID:26648527
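
    In sequence-classification abstracts such as this one, "jackknife cross-validation" generally refers to leave-one-out cross-validation: each sample is held out once, the classifier is trained on the rest, and the held-out sample is predicted. The sketch below illustrates that loop under stated assumptions. The nearest-centroid classifier is a simple stand-in (not the authors' classifier), and the function names and toy feature vectors are hypothetical.

```python
import numpy as np


def nearest_centroid_predict(X_train, y_train, x):
    """Predict the label whose class centroid is closest to x."""
    labels = np.unique(y_train)
    centroids = np.array([X_train[y_train == lab].mean(axis=0) for lab in labels])
    distances = np.linalg.norm(centroids - x, axis=1)
    return labels[np.argmin(distances)]


def jackknife_accuracy(X, y):
    """Leave-one-out accuracy: every sample is the test set exactly once."""
    n = len(y)
    correct = 0
    for i in range(n):
        keep = np.arange(n) != i            # boolean mask dropping sample i
        pred = nearest_centroid_predict(X[keep], y[keep], X[i])
        correct += int(pred == y[i])
    return correct / n


# Toy, well-separated two-class data (illustrative only).
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
              [1.0, 1.1], [0.9, 1.0], [1.2, 0.8]])
y = np.array([0, 0, 0, 1, 1, 1])
acc = jackknife_accuracy(X, y)
```

    Because no sample is ever used to train the model that predicts it, the resulting accuracy is deterministic for a given dataset, which is one reason this test is popular in the benchmark comparisons listed here.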

  3. Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM

    PubMed Central

    Liang, Yunyun; Liu, Sanyang; Zhang, Shengli

    2015-01-01

    Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant to prediction of protein structural class and it mainly uses protein primary sequence, predicted secondary structure sequence, and position-specific scoring matrix (PSSM). Currently, prediction solely based on the PSSM has played a key role in improving the prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP by fusing consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. Then a 700-dimensional (700D) feature vector is constructed and the dimension is decreased to 224D by using principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on 1189, 25PDB, and 640 datasets. Comparison of our results with the existing PSSM-based methods demonstrates that our method achieves favorable and competitive performance. This will offer an important complement to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences. PMID:26788119

  4. Orthology Inference in Nonmodel Organisms Using Transcriptomes and Low-Coverage Genomes: Improving Accuracy and Matrix Occupancy for Phylogenomics

    PubMed Central

    Yang, Ya; Smith, Stephen A.

    2014-01-01

    Orthology inference is central to phylogenomic analyses. Phylogenomic data sets commonly include transcriptomes and low-coverage genomes that are incomplete and contain errors and isoforms. These properties can severely violate the underlying assumptions of orthology inference with existing heuristics. We present a procedure that uses phylogenies for both homology and orthology assignment. The procedure first uses similarity scores to infer putative homologs that are then aligned, constructed into phylogenies, and pruned of spurious branches caused by deep paralogs, misassembly, frameshifts, or recombination. These final homologs are then used to identify orthologs. We explore four alternative tree-based orthology inference approaches, of which two are new. These accommodate gene and genome duplications as well as gene tree discordance. We demonstrate these methods in three published data sets including the grape family, Hymenoptera, and millipedes with divergence times ranging from approximately 100 to over 400 Ma. The procedure significantly increased the completeness and accuracy of the inferred homologs and orthologs. We also found that data sets that are more recently diverged and/or include more high-coverage genomes had more complete sets of orthologs. To explicitly evaluate sources of conflicting phylogenetic signals, we applied serial jackknife analyses of gene regions keeping each locus intact. The methods described here can scale to over 100 taxa. They have been implemented in python with independent scripts for each step, making it easy to modify or incorporate them into existing pipelines. All scripts are available from https://bitbucket.org/yangya/phylogenomic_dataset_construction. PMID:25158799

  5. Prediction of Lysine Ubiquitylation with Ensemble Classifier and Feature Selection

    PubMed Central

    Zhao, Xiaowei; Li, Xiangtao; Ma, Zhiqiang; Yin, Minghao

    2011-01-01

    Ubiquitylation is an important process of post-translational modification. Correct identification of protein lysine ubiquitylation sites is of fundamental importance to understand the molecular mechanism of lysine ubiquitylation in biological systems. This paper develops a novel computational method to effectively identify the lysine ubiquitylation sites based on the ensemble approach. In the proposed method, 468 ubiquitylation sites from 323 proteins retrieved from the Swiss-Prot database were encoded into feature vectors by using four kinds of protein sequences information. An effective feature selection method was then applied to extract informative feature subsets. After different feature subsets were obtained by setting different starting points in the search procedure, they were used to train multiple random forests classifiers and then aggregated into a consensus classifier by majority voting. Evaluated by jackknife tests and independent tests respectively, the accuracy of the proposed predictor reached 76.82% for the training dataset and 79.16% for the test dataset, indicating that this predictor is a useful tool to predict lysine ubiquitylation sites. Furthermore, site-specific feature analysis was performed and it was shown that ubiquitylation is intimately correlated with the features of its surrounding sites in addition to features derived from the lysine site itself. The feature selection method is available upon request. PMID:22272076

  6. iMem-Seq: A Multi-label Learning Classifier for Predicting Membrane Proteins Types.

    PubMed

    Xiao, Xuan; Zou, Hong-Liang; Lin, Wei-Zhong

    2015-08-01

    Predicting membrane protein type is a challenging problem, particularly when the query proteins may simultaneously have two or more different types. Most of the existing methods can only be used to deal with the single-label proteins. Actually, multiple-label proteins should not be ignored because they usually bear some special functions worthy of in-depth studies. By introducing the "multi-labeled learning" and hybridizing evolution information through Grey-PSSM, a novel predictor called iMem-Seq is developed that can be used to deal with the systems containing both single and multiple types of membrane proteins. As a demonstration, the jackknife cross-validation was performed with iMem-Seq on a benchmark dataset of membrane proteins classified into eight types, where some proteins belong to two or three types, but none has ≥25% pairwise sequence identity to any other in the same subset. It was demonstrated via the rigorous cross-validations that the new predictor remarkably outperformed all its counterparts. As a user-friendly web-server, iMem-Seq is freely accessible to the public at the website http://www.jci-bioinfo.cn/iMem-Seq . PMID:25796484

  7. Nonparametric methods for drought severity estimation at ungauged sites

    NASA Astrophysics Data System (ADS)

    Sadri, S.; Burn, D. H.

    2012-12-01

    The objective in frequency analysis is, given extreme events such as drought severity or duration, to estimate the relationship between that event and the associated return periods at a catchment. Neural networks and other artificial intelligence approaches in function estimation and regression analysis are relatively new techniques in engineering, providing an attractive alternative to traditional statistical models. There are, however, few applications of neural networks and support vector machines in the area of severity quantile estimation for drought frequency analysis. In this paper, we compare three methods for this task: multiple linear regression, radial basis function neural networks, and least squares support vector regression (LS-SVR). The area selected for this study includes 32 catchments in the Canadian Prairies. From each catchment drought severities are extracted and fitted to a Pearson type III distribution, which act as observed values. For each method-duration pair, we use a jackknife algorithm to produce estimated values at each site. The results from these three approaches are compared and analyzed, and it is found that LS-SVR provides the best quantile estimates and extrapolating capacity.
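
    The jackknife step described in this abstract (producing an estimate at each site as if it were ungauged) can be sketched as a leave-one-site-out loop. The example below uses multiple linear regression, one of the three methods the abstract names, fitted with ordinary least squares. It is a minimal sketch under assumptions: the predictor matrix, the response values, and the function name are hypothetical and do not come from the study.

```python
import numpy as np


def leave_one_site_out(X, y):
    """Estimate y at each site from a linear model fitted to all other sites.

    X: (n_sites, n_predictors) catchment descriptors (hypothetical).
    y: (n_sites,) at-site values treated as observations.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])   # add an intercept column
    estimates = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                # treat site i as ungauged
        coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        estimates[i] = design[i] @ coef
    return estimates


# Toy data: one predictor and an exactly linear response.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = 2.0 + 3.0 * X[:, 0]
estimates = leave_one_site_out(X, y)
```

    Comparing `estimates` with `y` site by site gives the kind of out-of-sample error that can be used to rank competing regression methods.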

  8. Stability of gene contributions and identification of outliers in multivariate analysis of microarray data

    PubMed Central

    Baty, Florent; Jaeger, Daniel; Preiswerk, Frank; Schumacher, Martin M; Brutsche, Martin H

    2008-01-01

    Background Multivariate ordination methods are powerful tools for the exploration of complex data structures present in microarray data. These methods have several advantages compared to common gene-by-gene approaches. However, due to their exploratory nature, multivariate ordination methods do not allow direct statistical testing of the stability of genes. Results In this study, we developed a computationally efficient algorithm for: i) the assessment of the significance of gene contributions and ii) the identification of sample outliers in multivariate analysis of microarray data. The approach is based on the use of resampling methods including bootstrapping and jackknifing. A statistical package of R functions was developed. This package includes tools for both inferring the statistical significance of gene contributions and identifying outliers among samples. Conclusion The methodology was successfully applied to three published data sets with varying levels of signal intensities. Its relevance was compared with alternative methods. Overall, it proved to be particularly effective for the evaluation of the stability of microarray data. PMID:18570644

  9. Predicting cancerlectins by the optimal g-gap dipeptides

    NASA Astrophysics Data System (ADS)

    Lin, Hao; Liu, Wei-Xin; He, Jiao; Liu, Xin-Hui; Ding, Hui; Chen, Wei

    2015-12-01

    Cancerlectins play a key role in the process of tumor cell differentiation. Thus, fully understanding the function of cancerlectins is important because it sheds light on future directions for cancer therapy. However, the traditional wet-experimental methods are costly and time-consuming. It is highly desirable to develop an effective and efficient computational tool to identify cancerlectins. In this study, we developed a sequence-based method to discriminate between cancerlectins and non-cancerlectins. The analysis of variance (ANOVA) was used to choose the optimal feature set derived from the g-gap dipeptide composition. The jackknife cross-validated results showed that the proposed method achieved an accuracy of 75.19%, which is superior to other published methods. For the convenience of other researchers, an online web-server CaLecPred was established and can be freely accessed from the website http://lin.uestc.edu.cn/server/CalecPred. We believe that the CaLecPred is a powerful tool to study cancerlectins and to guide the related experimental validations.

  10. Identification of New Candidate Genes and Chemicals Related to Esophageal Cancer Using a Hybrid Interaction Network of Chemicals and Proteins

    PubMed Central

    Liu, Junbao; Li, Li-Peng; He, Yi-Chun; Gao, Ru-Jian; Cai, Yu-Dong; Jiang, Yang

    2015-01-01

    Cancer is a serious disease responsible for many deaths every year in both developed and developing countries. One reason is that the mechanisms underlying most types of cancer are still mysterious, creating a great block for the design of effective treatments. In this study, we attempted to clarify the mechanism underlying esophageal cancer by searching for novel genes and chemicals. To this end, we constructed a hybrid network containing both proteins and chemicals, and generalized an existing computational method previously used to identify disease genes to identify new candidate genes and chemicals simultaneously. Based on a jackknife test, our generalized method outperforms, or at least performs at the same level as, a widely used method - the Random Walk with Restart (RWR). The analysis results of the final obtained genes and chemicals demonstrated that they highly shared gene ontology (GO) terms and KEGG pathways with direct and indirect associations with esophageal cancer. In addition, we also discussed the likelihood of selected candidate genes and chemicals being novel genes and chemicals related to esophageal cancer. PMID:26058041

  11. Interpolation of Global Monthly Rain-Gauge Observations for Climate Change Analysis

    NASA Astrophysics Data System (ADS)

    Grieser, Jürgen

    2014-05-01

    Monthly precipitation sums are observed at thousands of meteorological stations worldwide. Different institutes (e.g. the Global Precipitation Climatology Centre, GPCC, and the Climatic Research Unit, CRU, of the University of East Anglia) interpolate these observations to regular grids. These data are used widely in climate research, e.g. for the investigation of the hydrological cycle and climate change. Results of the interpolation depend on the station density, which varies considerably around the globe. It also depends on the interpolation method used (e.g. Ordinary Kriging and Shepard's Method). These methods are general interpolation methods that do not take into account the specifics of precipitation. The question discussed in this presentation is whether we can do better by using an interpolation strategy especially designed for monthly precipitation observations. Based on a dense local dataset (one station per 109 km2) and a less dense global dataset (one station per 27,000 km2) of 50 years of monthly precipitation observations, various interpolation strategies are compared. This includes the interpolation of transformed variables, the consideration of local spatial correlation of precipitation as well as data quality. The Jack-knife error is used to compare the different strategies. The major result is that some strategies used so far are far from optimal.
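
    One plausible reading of the "Jack-knife error" used here is a leave-one-station-out error: each station is withheld, its value is interpolated from the remaining stations, and the misfit is recorded. The sketch below applies that idea to inverse-distance weighting (the basic form of Shepard's Method). It is an illustration under assumptions: the abstract does not specify the error metric or the weighting details, and the function names, the power parameter, and the toy station data are hypothetical.

```python
import numpy as np


def idw(xy_known, values, xy_target, power=2.0):
    """Inverse-distance-weighted estimate at one target location."""
    d = np.linalg.norm(xy_known - xy_target, axis=1)
    if np.any(d == 0):                       # target coincides with a station
        return float(values[np.argmin(d)])
    w = 1.0 / d ** power
    return float(np.sum(w * values) / np.sum(w))


def jackknife_rmse(xy, values, power=2.0):
    """Root-mean-square leave-one-station-out interpolation error."""
    n = len(values)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i             # withhold station i
        errors[i] = idw(xy[keep], values[keep], xy[i], power) - values[i]
    return float(np.sqrt(np.mean(errors ** 2)))


# Toy stations on the corners of a unit square (illustrative values).
xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([0.0, 1.0, 1.0, 2.0])
rmse = jackknife_rmse(xy, values)
```

    Running the same function with different interpolation settings (for example, another `power`, or transformed values) gives a like-for-like score for comparing strategies on the same station network.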

  12. AAFreqCoil: a new classifier to distinguish parallel dimeric and trimeric coiled coils.

    PubMed

    Wang, Xiaofeng; Zhou, Yuan; Yan, Renxiang

    2015-07-01

    Coiled coils are characteristic rope-like protein structures, constituted by one or more heptad repeats. Native coiled-coil structures play important roles in various biological processes, while the designed ones are widely employed in medicine and industry. To date, two major oligomeric states (i.e. dimeric and trimeric states) of a coiled-coil structure have been observed, plausibly exerting different biological functions. Therefore, exploration of the relationship between heptad repeat sequences and coiled coil structures is highly important. In this paper, we develop a new method named AAFreqCoil to classify parallel dimeric and trimeric coiled coils. Our method demonstrated its competitive performance when benchmarked based on 10-fold cross validation and jackknife cross validation. Meanwhile, the rules that can explicitly explain the prediction results of the test coiled coil can be extracted from the AAFreqCoil model for a better explanation of user predictions. A web server and stand-alone program implementing the AAFreqCoil algorithm are freely available at . PMID:25918905

  13. The resampling cross-validation technique in exercise science: modelling rowing power.

    PubMed

    Jensen, R L; Kline, G M

    1994-07-01

    The past 10-15 yr have witnessed a rapid increase in the development of new (and not so new) statistical methods that capitalize on recent advances in high-speed computing. These computer-intensive methods are often broadly referred to as resampling techniques and take several forms depending on the specific details of the procedure and the information of interest. Resampling techniques can be used both for inferential hypothesis testing as well as exploratory data description. Regardless of which method is employed, the central unifying theme is based upon the computer's power to rapidly resample many pseudosamples from a known (in-hand) data set (e.g., randomization tests, jackknife, boot-strap, cross-validation) or to randomly generate many pseudosamples from a theoretical probability distribution (e.g., normal, binomial, Poisson) with some known parameters (Monte Carlo method). This paper is not intended as a detailed description of computer-intensive methods, but only as an introduction to the resampling approach in cross-validation. A brief discussion of the motivation and an example in an exercise science context will be presented. PMID:7934770

  14. Spider diversity (Arachnida: Araneae) in Atlantic Forest areas at Pedra Branca State Park, Rio de Janeiro, Brazil

    PubMed Central

    Pérez-González, Abel; Baptista, Renner L. C.

    2016-01-01

    Abstract Background There has never been any published work about the diversity of spiders in the city of Rio de Janeiro using analytical tools to measure diversity. The only available records for spider communities in nearby areas indicate 308 species in the National Park of Tijuca and 159 species in Marapendi Municipal Park. These numbers are based on a rapid survey and on a one-year survey, respectively. New information This study provides a more thorough understanding of how the spider species are distributed at Pedra Branca State Park. We report a total of 14,626 spider specimens recorded from this park, representing 49 families and 373 species or morphospecies, including at least 73 undescribed species. Also, the distribution range of 45 species was expanded, and species accumulation curves estimate that there is a minimum of 388 (Bootstrap) and a maximum of 468 species (Jackknife2) for the sampled areas. These estimates indicate that the spider diversity may be higher than observed. PMID:26929710
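
    "Jackknife2" presumably denotes the second-order jackknife richness estimator. A minimal sketch follows, assuming the usual incidence-based form S_jack2 = S_obs + Q1 (2m - 3) / m - Q2 (m - 2)^2 / (m (m - 1)), where m is the number of samples and Q1 and Q2 are the numbers of species found in exactly one and exactly two samples. The function name and the toy occupancy counts are hypothetical and unrelated to the survey data.

```python
def jackknife2_richness(occupancy, m):
    """Second-order jackknife richness estimate.

    occupancy: dict mapping species -> number of samples in which it was found.
    m:         total number of samples (must be at least 2).
    """
    if m < 2:
        raise ValueError("second-order jackknife needs at least two samples")

    present = [c for c in occupancy.values() if c > 0]
    s_obs = len(present)                          # observed richness
    q1 = sum(1 for c in present if c == 1)        # uniques
    q2 = sum(1 for c in present if c == 2)        # duplicates
    return s_obs + q1 * (2 * m - 3) / m - q2 * (m - 2) ** 2 / (m * (m - 1))


# Toy occupancy counts for five species across four samples.
toy_occupancy = {"sp_a": 4, "sp_b": 3, "sp_c": 1, "sp_d": 2, "sp_e": 1}
estimate = jackknife2_richness(toy_occupancy, m=4)
```

    Here the estimate is 5 + 2 * 5/4 - 1 * 4/12, roughly 7.17, which exceeds the five species observed because rare species signal undetected ones.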

  15. Limited sampling hampers “big data” estimation of species richness in a tropical biodiversity hotspot

    PubMed Central

    Engemann, Kristine; Enquist, Brian J; Sandel, Brody; Boyle, Brad; Jørgensen, Peter M; Morueta-Holme, Naia; Peet, Robert K; Violle, Cyrille; Svenning, Jens-Christian

    2015-01-01

    Macro-scale species richness studies often use museum specimens as their main source of information. However, such datasets are often strongly biased due to variation in sampling effort in space and time. These biases may strongly affect diversity estimates and may, thereby, obstruct solid inference on the underlying diversity drivers, as well as mislead conservation prioritization. In recent years, this has resulted in an increased focus on developing methods to correct for sampling bias. In this study, we use sample-size-correcting methods to examine patterns of tropical plant diversity in Ecuador, one of the most species-rich and climatically heterogeneous biodiversity hotspots. Species richness estimates were calculated based on 205,735 georeferenced specimens of 15,788 species using the Margalef diversity index, the Chao estimator, the second-order Jackknife and Bootstrapping resampling methods, and Hill numbers and rarefaction. Species richness was heavily correlated with sampling effort, and only rarefaction was able to remove this effect, and we recommend this method for estimation of species richness with “big data” collections. PMID:25692000

  16. Prediction of space sickness in astronauts from preflight fluid, electrolyte, and cardiovascular variables and Weightless Environmental Training Facility (WETF) training

    NASA Technical Reports Server (NTRS)

    Simanonok, K.; Mosely, E.; Charles, J.

    1992-01-01

    Nine preflight variables related to fluid, electrolyte, and cardiovascular status from 64 first-time Shuttle crewmembers were differentially weighted by discriminant analysis to predict the incidence and severity of each crewmember's space sickness as rated by NASA flight surgeons. The nine variables are serum uric acid, red cell count, environmental temperature at the launch site, serum phosphate, urine osmolality, serum thyroxine, sitting systolic blood pressure, calculated blood volume, and serum chloride. Using two methods of cross-validation on the original samples (jackknife and a stratified random subsample), these variables enable the prediction of space sickness incidence (NONE or SICK) with 80 percent success and space sickness severity (NONE, MILD, MODERATE, or SEVERE) with 59 percent success by one method of cross-validation and 67 percent by another method. Addition of a tenth variable, hours spent in the Weightlessness Environment Training Facility (WETF), did not improve the prediction of space sickness incidence but did improve the prediction of space sickness severity to 66 percent success by the first method of cross-validation of original samples and to 71 percent by the second method. Results to date suggest the presence of predisposing physiologic factors to space sickness that implicate fluid shift etiology. The data also suggest that prior exposure to fluid shift during WETF training may produce some circulatory pre-adaptation to fluid shifts in weightlessness that results in a reduction of space sickness severity.

  17. Control System Design of Multitrailer Using Neurocontrollers with Recessive Gene Structure by Step-up GA Training

    NASA Astrophysics Data System (ADS)

    Kiyuna, Ayaki; Kinjo, Hiroshi; Kurata, Koji; Yamamoto, Tetsuhiko

    In a previous study, we proposed a step-up training method for the multitrailer truck control system using neurocontrollers (NCs) evolved by a genetic algorithm (GA) and showed its efficiency. However, the method does not enable the training of NCs for a five-trailer connected truck system. In this paper, we present a new version of the step-up training method that enables the training of NCs for a five-trailer connected truck system. The proposed method is as follows: First, NCs are trained only to avoid the “jackknife phenomenon”. Second, NCs are trained for minimizing squared errors starting from easy initial configurations. Finally, NCs are trained for minimizing the squared errors starting from more difficult initial configurations. The difficulty of training steps increases gradually. To improve training performance, we applied a recessive gene model to network weight coding and genetic operations. In this study, we applied the recessive gene model to the classic exclusive-or (XOR) training problem and showed its convergence performance. The GA training of NCs with the recessive gene model maintains diversity in the population and avoids evolutionary stagnation. Simulation shows that NCs with the recessive gene model and the proposed step-up training method are useful in the controller design of the multitrailer system.

  18. Effect of temperature on the life-history traits of Neoseiulus californicus (Acari: Phytoseiidae) fed on Panonychus ulmi.

    PubMed

    El Taj, H F; Jung, Chuleui

    2012-03-01

    The developmental rate and reproductive biology of Neoseiulus californicus, a generalist predator on spider mites and small insects, were investigated in the laboratory at five constant temperatures: 15, 20, 25, 30, and 34°C. The European red mite, Panonychus ulmi, an important pest in Korean apple orchards, was used as prey. Mean developmental time and adult longevity were inversely related to temperature from 15 to 30°C. Lifetime fecundity was greatest at 25°C, whereas daily fecundity was highest at 30°C. The sex ratio (female to male) was highest (0.77) at 25°C and lowest (0.67) at 34°C. Survivorship during immature development varied from 74.3 to 92.9%, with the lowest rate at 34°C. Life table parameters were analyzed and pseudo-replicates for the generation time (t_G), the intrinsic rate of natural increase (r_m), finite rate of increase (λ), net reproductive rate (R_0), and doubling time (t_D) were generated using the Jackknife method. Generation time (t_G) was lowest (10.7 days) at 34°C, R_0 was highest (49.2) at 25°C, and both r_m (0.29) and λ (1.34) were highest at 30°C. In conclusion, the development and adult life-history traits obtained for N. californicus fed on P. ulmi indicated significant potential for biological control. PMID:22270114
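
    Jackknife pseudo-replicates of a cohort-level parameter are commonly built as pseudo_i = n * theta_all - (n - 1) * theta_without_i, where theta_all uses all n individuals and theta_without_i omits individual i. The sketch below shows that construction for a generic statistic. It is a minimal illustration under assumptions: the mean offspring count stands in for a real life table parameter, and the function name and the toy data are hypothetical rather than taken from the study.

```python
import numpy as np


def jackknife_pseudovalues(data, statistic):
    """Return one jackknife pseudo-value per observation."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    theta_all = statistic(data)
    # Recompute the statistic with each observation deleted in turn.
    theta_loo = np.array([statistic(np.delete(data, i)) for i in range(n)])
    return n * theta_all - (n - 1) * theta_loo


# Toy lifetime offspring counts for five females (illustrative only).
offspring = np.array([42.0, 55.0, 38.0, 61.0, 47.0])
pseudo = jackknife_pseudovalues(offspring, np.mean)

estimate = pseudo.mean()                                  # jackknife estimate
std_error = pseudo.std(ddof=1) / np.sqrt(len(pseudo))     # jackknife standard error
```

    For the mean, each pseudo-value equals the original observation, which makes a convenient sanity check. For nonlinear parameters the pseudo-values differ from the raw data and can be treated as approximately independent replicates when comparing treatments such as temperatures.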

  19. Stagewise pseudo-value regression for time-varying effects on the cumulative incidence.

    PubMed

    Zöller, Daniela; Schmidtmann, Irene; Weinmann, Arndt; Gerds, Thomas A; Binder, Harald

    2016-03-30

    In a competing risks setting, the cumulative incidence of an event of interest describes the absolute risk for this event as a function of time. For regression analysis, one can either choose to model all competing events by separate cause-specific hazard models or directly model the association between covariates and the cumulative incidence of one of the events. With a suitable link function, direct regression models allow for a straightforward interpretation of covariate effects on the cumulative incidence. In practice, where data can be right-censored, these regression models are implemented using a pseudo-value approach. For a grid of time points, the possibly unobserved binary event status is replaced by a jackknife pseudo-value based on the Aalen-Johansen method. We combine a stagewise regression technique with the pseudo-value approach to provide variable selection while allowing for time-varying effects. This is implemented by coupling variable selection between the grid times, but determining estimates separately. The effect estimates are regularized to also allow for model fitting with a low to moderate number of observations. This technique is illustrated in an application using clinical cancer registry data from hepatocellular carcinoma patients. The results are contrasted with traditional hazard-based modeling. In addition to a more straightforward interpretation, when using the proposed technique, the identification of time-varying effect patterns on the cumulative incidence is seen to be feasible with a moderate number of observations. Copyright © 2015 John Wiley & Sons, Ltd. PMID:26510388
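
    The pseudo-value construction described in this abstract can be written compactly. The notation below is ours, not the paper's: n is the number of subjects, t_1, ..., t_K is the grid of time points, and x_i is the covariate vector of subject i. The regression line is one plausible way to express a direct model with time-varying effects and should be read as a sketch, not as the authors' exact specification.

```latex
% \hat{F}_1(t): Aalen-Johansen estimate of the cumulative incidence of the
%               event of interest, computed from all n subjects.
% \hat{F}_1^{(-i)}(t): the same estimate with subject i left out.
\[
  \hat{\theta}_i(t_k) \;=\; n\,\hat{F}_1(t_k) \;-\; (n-1)\,\hat{F}_1^{(-i)}(t_k),
  \qquad i = 1,\dots,n, \quad k = 1,\dots,K .
\]
% Direct regression on the pseudo-values with link function g and
% coefficients that are allowed to differ between grid times (assumed form):
\[
  g\!\left( \mathrm{E}\!\left[ \hat{\theta}_i(t_k) \mid x_i \right] \right)
  \;=\; \beta_0(t_k) + x_i^{\top} \beta(t_k) .
\]
```

    The pseudo-value replaces the possibly censored event indicator of subject i at time t_k, so standard regression machinery can be applied even though the raw status is not always observed.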

  20. Predicting the Types of J-Proteins Using Clustered Amino Acids

    PubMed Central

    Feng, Pengmian; Zuo, Yongchun

    2014-01-01

    J-proteins are molecular chaperones and present in a wide variety of organisms from prokaryote to eukaryote. Based on their domain organizations, J-proteins can be classified into 4 types, that is, Type I, Type II, Type III, and Type IV. Different types of J-proteins play distinct roles in influencing cancer properties and cell death. Thus, reliably annotating the types of J-proteins is essential to better understand their molecular functions. In the present work, a support vector machine based method was developed to identify the types of J-proteins using the tripeptide composition of reduced amino acid alphabet. In the jackknife cross-validation, the maximum overall accuracy of 94% was achieved on a stringent benchmark dataset. We also analyzed the amino acid compositions by using analysis of variance and found the distinct distributions of amino acids in each family of the J-proteins. To enhance the value of the practical applications of the proposed model, an online web server was developed and can be freely accessed. PMID:24804260

  1. iDPF-PseRAAAC: A Web-Server for Identifying the Defensin Peptide Family and Subfamily Using Pseudo Reduced Amino Acid Alphabet Composition

    PubMed Central

    Zuo, Yongchun; Lv, Yang; Wei, Zhuying; Yang, Lei; Li, Guangpeng; Fan, Guoliang

    2015-01-01

    Defensins, one of the most abundant classes of antimicrobial peptides, are an essential part of the innate immunity that has evolved in most living organisms from lower organisms to humans. To identify specific defensins as interesting antifungal leads, in this study, we constructed a more rigorous benchmark dataset and the iDPF-PseRAAAC server was developed to predict the defensin family and subfamily. Using reduced dipeptide compositions, the overall accuracy of the proposed method increased to 95.10% for the defensin family, and 98.39% for the vertebrate subfamily, which is higher than the accuracy from other methods. The jackknife test shows that more than 4% improvement was obtained compared with the previous method. A free online server was further established for the convenience of most experimental scientists at http://wlxy.imu.edu.cn/college/biostation/fuwu/iDPF-PseRAAAC/index.asp. A friendly guide is provided to describe how to use the web server. We anticipate that iDPF-PseRAAAC may become a useful high-throughput tool for both basic research and drug design. PMID:26713618

  2. Microfluidic co-culture platform to quantify chemotaxis of primary stem cells.

    PubMed

    Tatárová, Z; Abbuehl, J P; Maerkl, S; Huelsken, J

    2016-05-21

    Functional analysis of primary tissue-specific stem cells is hampered by their rarity. Here we describe a greatly miniaturized microfluidic device for the multiplexed, quantitative analysis of the chemotactic properties of primary, bone marrow-derived mesenchymal stem cells (MSC). The device was integrated within a fully customized platform that both increased the viability of stem cells ex vivo and simplified manipulation during multidimensional acquisition. Since primary stem cells can be isolated only in limited number, we optimized the design for efficient cell trapping from low volume and low concentration cell suspensions. Using nanoliter volumes and automated microfluidic controls for pulsed medium supply, our platform is able to create stable gradients of chemoattractant secreted from mammalian producer cells within the device, as was visualized by a secreted NeonGreen fluorescent reporter. The design was functionally validated by a CXCL/CXCR ligand/receptor combination resulting in preferential migration of primary, non-passaged MSC. Stable gradient formation prolonged assay duration and resulted in enhanced response rates for slowly migrating stem cells. Time-lapse video microscopy facilitated determining a number of migratory properties based on single cell analysis. Jackknife-resampling revealed that our assay requires only 120 cells to obtain statistically significant results, enabling new approaches in the research on rare primary stem cells. Compartmentalization of the device not only facilitated such quantitative measurements but will also permit future, high-throughput functional screens. PMID:27137768

  3. Coupling SWAT and ANN models for enhanced daily streamflow prediction

    NASA Astrophysics Data System (ADS)

    Noori, Navideh; Kalin, Latif

    2016-02-01

    To improve daily flow prediction in unmonitored watersheds a hybrid model was developed by combining a quasi-distributed watershed model and artificial neural network (ANN). Daily streamflow data from 29 nearby watersheds in and around the city of Atlanta, Southeastern United States, with leave-one-site-out jackknifing technique were used to build the flow predictive models during warm and cool seasons. Daily streamflow was first simulated with the Soil and Water Assessment Tool (SWAT) and then the SWAT simulated baseflow and stormflow were used as inputs to ANN. Out of the total 29 test watersheds, 62% and 83% of them had Nash-Sutcliffe values above 0.50 during the cool and warm seasons, respectively (considered good or better). As the percent forest cover or the size of test watershed increased, the performances of the models gradually decreased during both warm and cool seasons. This indicates that the developed models work better in urbanized watersheds. In addition, SWAT and SWAT Calibration Uncertainty Procedure (SWAT-CUP) program were run separately for each station to compare the flow prediction accuracy of the hybrid approach to SWAT. Only 31% of the sites during the calibration and 34% of validation runs had ENASH values ⩾0.50. This study showed that coupling ANN with semi-distributed models can lead to improved daily streamflow predictions in ungauged watersheds.
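
The Nash-Sutcliffe threshold of 0.50 quoted above refers to the standard efficiency coefficient, NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))². A minimal sketch of that formula follows; the function name and the sample values are illustrative assumptions and are not taken from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the model versus squared deviation of observations from their mean.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("observed series has zero variance; NSE is undefined")
    return 1.0 - residual / spread


# Hypothetical daily flows, for illustration only.
obs = [1.0, 2.0, 3.0]
print(nash_sutcliffe(obs, [1.0, 2.0, 3.0]))  # perfect simulation
print(nash_sutcliffe(obs, [2.0, 2.0, 2.0]))  # simulation equal to the observed mean
```

Under a leave-one-site-out scheme, a score like this would be computed at each held-out watershed using a model fitted to the remaining sites.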

  4. Identifying Antioxidant Proteins by Using Optimal Dipeptide Compositions.

    PubMed

    Feng, Pengmian; Chen, Wei; Lin, Hao

    2016-06-01

    Antioxidant proteins are molecules that can terminate cellular and DNA damage caused by free radical intermediates. The use of antioxidant proteins for prevention of diseases has been intensively studied in recent years. Thus, accurate identification of antioxidant proteins is essential for understanding their roles in pharmacology. In this study, a support vector machine-based predictor called AodPred was developed for identifying antioxidant proteins. In this predictor, the sequence was formulated by using the optimal 3-gap dipeptides obtained by a feature selection method. The jackknife cross-validation test showed that AodPred can achieve an overall accuracy of 74.79% in identifying antioxidant proteins. As a user-friendly tool, AodPred is freely accessible at http://lin.uestc.edu.cn/server/AntioxiPred. To maximize the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web server to obtain the desired results. PMID:26345449

  5. Patterns of connectivity among populations of a coral reef fish

    NASA Astrophysics Data System (ADS)

    Chittaro, P. M.; Hogan, J. D.

    2013-06-01

    Knowledge of the patterns and scale of connectivity among populations is essential for the effective management of species, but our understanding is still poor for marine species. We used otolith microchemistry of newly settled bicolor damselfish (Stegastes partitus) in the Mesoamerican Reef System (MRS), Western Caribbean, to investigate patterns of connectivity among populations over 2 years. First, we assessed spatial and temporal variability in trace elemental concentrations from the otolith edge to make a 'chemical map' of potential source reef(s) in the region. Significant otolith chemical differences were detected at three spatial scales (within-atoll, between-atolls, and region-wide), such that individuals were classified to locations with moderate (52% jackknife classification) to high (99%) accuracy. Most sites at Turneffe Atoll, Belize showed significant temporal variability in otolith concentrations on the scale of 1-2 months. Using a maximum likelihood approach, we estimated the natal source of larvae recruiting to reefs across the MRS by comparing 'natal' chemical signatures from the otolith of recruits to the 'chemical map' of potential source reef(s). Our results indicated that populations at both Turneffe Atoll and Banco Chinchorro supply a substantial amount of individuals to their own reefs (i.e., self-recruitment) and thus emphasize that marine conservation and management in the MRS region would benefit from localized management efforts as well as international cooperation.

  6. Development of Pneumatic Aerodynamic Devices to Improve the Performance, Economics, and Safety of Heavy Vehicles

    SciTech Connect

    Robert J. Englar

    2000-06-19

    Under contract to the DOE Office of Heavy Vehicle Technologies, the Georgia Tech Research Institute (GTRI) is developing and evaluating pneumatic (blown) aerodynamic devices to improve the performance, economics, stability and safety of operation of Heavy Vehicles. The objective of this program is to apply the pneumatic aerodynamic aircraft technology previously developed and flight-tested by GTRI personnel to the design of an efficient blown tractor-trailer configuration. Recent experimental results obtained by GTRI using blowing have shown drag reductions of 35% on a streamlined automobile wind-tunnel model. Also measured were lift or down-load increases of 100-150% and the ability to control aerodynamic moments about all 3 axes without any moving control surfaces. Similar drag reductions yielded by blowing on bluff afterbody trailers in current US trucking fleet operations are anticipated to reduce yearly fuel consumption by more than 1.2 billion gallons, while even further reduction is possible using pneumatic lift to reduce tire rolling resistance. Conversely, increased drag and down force generated instantaneously by blowing can greatly improve braking characteristics and control in wet/icy weather due to effective "weight" increases on the tires. Safety is also enhanced by controlling side loads and moments caused on these Heavy Vehicles by winds, gusts and other vehicles passing. This may also help to eliminate the jack-knifing problem if caused by extreme wind side loads on the trailer. Lastly, reduction of the turbulent wake behind the trailer can reduce splash and spray patterns and rough air being experienced by following vehicles. This paper presents results developed by GTRI during the early portion of this effort, including a preliminary systems study, CFD prediction of the blown flowfields, and design of the baseline conventional tractor-trailer model and the pneumatic wind-tunnel model.

  7. Microseismicity distribution in the southern Dead Sea basin and its implications on the structure of the basin

    NASA Astrophysics Data System (ADS)

    Braeuer, B.; Asch, Guenter; Hofstetter, R.; Haberland, Ch.; Jaser, D.; El-Kelani, R.; Weber, M.

    2012-03-01

    While the Dead Sea basin has been studied for a long time, the available knowledge about the detailed seismicity distribution in the area, as well as the deeper structure of the basin, is limited. Therefore, within the framework of the international project DESIRE (DEad Sea Integrated REsearch project), a dense temporary local seismological network was operated in the southern Dead Sea area. We use 530 local earthquakes, having all together 26 730 P- and S-arrival times for a simultaneous inversion of 1-D velocity models, station corrections and precise earthquake locations. Jackknife tests suggest an accuracy of the derived hypocentre locations of about 1 km. Thus, the result is the first clear image of the absolute distribution of the microseismicity of the area, especially in depth. The seismicity is concentrated in the upper crust down to 20 km depth while the lower limit of the seismicity is reached at 31 km depth. The seismic events at the eastern boundary fault (EBF) in the southern part of the study area represent the northward transform motion of the Arabian Plate along the Dead Sea Transform. North of the Boqeq fault the seismic activity represents the transfer of the motion in the pull-apart basin from the eastern to the western boundary. We find that from the surface downward the seismic events are tracing the boundary faults of the basin. The western boundary is mapped down to 12 km depth while the EBF reaches about 17 km depth, forming an asymmetric basin. One fifth of the data set is related to a specific cluster in time and space, which occurred in 2007 February at the western border fault. This cluster is aligned vertically, that is, it is perpendicular to the direction of the dominating left-lateral strike-slip movement at the main transform fault.

  8. Sodium and potassium intakes among US adults: NHANES 2003–2008

    PubMed Central

    Zhang, Zefeng; Carriquiry, Alicia L; Gunn, Janelle P; Kuklina, Elena V; Saydah, Sharon H; Yang, Quanhe; Moshfegh, Alanna J

    2012-01-01

    Background: The American Heart Association (AHA), Institute of Medicine (IOM), and US Departments of Health and Human Services and Agriculture (USDA) Dietary Guidelines for Americans all recommend that Americans limit sodium intake and choose foods that contain potassium to decrease the risk of hypertension and other adverse health outcomes. Objective: We estimated the distributions of usual daily sodium and potassium intakes by sociodemographic and health characteristics relative to current recommendations. Design: We used 24-h dietary recalls and other data from 12,581 adults aged ≥20 y who participated in NHANES in 2003–2008. Estimates of sodium and potassium intakes were adjusted for within-individual day-to-day variation by using measurement error models. SEs and 95% CIs were assessed by using jackknife replicate weights. Results: Overall, 99.4% (95% CI: 99.3%, 99.5%) of US adults consumed more sodium daily than recommended by the AHA (<1500 mg), and 90.7% (89.6%, 91.8%) consumed more than the IOM Tolerable Upper Intake Level (2300 mg). In US adults who are recommended by the Dietary Guidelines to further reduce sodium intake to 1500 mg/d (ie, African Americans aged ≥51 y or persons with hypertension, diabetes, or chronic kidney disease), 98.8% (98.4%, 99.2%) overall consumed >1500 mg/d, and 60.4% consumed >3000 mg/d, more than double the recommendation. Overall, <2% of US adults and ~5% of US men consumed ≥4700 mg K/d (ie, met recommendations for potassium). Conclusion: Regardless of recommendations or sociodemographic or health characteristics, the vast majority of US adults consume too much sodium and too little potassium. PMID:22854410

  9. A hybrid orographic plus statistical model for downscaling daily precipitation in Northern California

    USGS Publications Warehouse

    Pandey, G.R.; Cayan, D.R.; Dettinger, M.D.; Georgakakos, K.P.

    2000-01-01

    A hybrid (physical-statistical) scheme is developed to resolve the finescale distribution of daily precipitation over complex terrain. The scheme generates precipitation by combining information from the upper-air conditions and from sparsely distributed station measurements; thus, it proceeds in two steps. First, an initial estimate of the precipitation is made using a simplified orographic precipitation model. It is a steady-state, multilayer, and two-dimensional model following the concepts of Rhea. The model is driven by the 2.5° × 2.5° gridded National Oceanic and Atmospheric Administration-National Centers for Environmental Prediction upper-air profiles, and its parameters are tuned using the observed precipitation structure of the region. Precipitation is generated assuming a forced lifting of the air parcels as they cross the mountain barrier following a straight trajectory. Second, the precipitation is adjusted using errors between derived precipitation and observations from nearby sites. The study area covers the northern half of California, including coastal mountains, central valley, and the Sierra Nevada. The model is run for a 5-km rendition of terrain for days of January-March over the period of 1988-95. A jackknife analysis demonstrates the validity of the approach. The spatial and temporal distributions of the simulated precipitation field agree well with the observed precipitation. Further, a mapping of model performance indices (correlation coefficients, model bias, root-mean-square error, and threat scores) from an array of stations from the region indicates that the model performs satisfactorily in resolving daily precipitation at 5-km resolution.

  10. Classification of Birds and Bats Using Flight Tracks

    SciTech Connect

    Cullinan, Valerie I.; Matzner, Shari; Duberstein, Corey A.

    2015-05-01

    Classification of birds and bats that use areas targeted for offshore wind farm development and the inference of their behavior is essential to evaluating the potential effects of development. The current approach to assessing the number and distribution of birds at sea involves transect surveys using trained individuals in boats or airplanes or using high-resolution imagery. These approaches are costly and have safety concerns. Based on a limited annotated library extracted from a single-camera thermal video, we provide a framework for building models that classify birds and bats and their associated behaviors. As an example, we developed a discriminant model for theoretical flight paths and applied it to data (N = 64 tracks) extracted from 5-min video clips. The agreement between model- and observer-classified path types was initially only 41%, but it increased to 73% when small-scale jitter was censored and path types were combined. Classification of 46 tracks of bats, swallows, gulls, and terns on average was 82% accurate, based on a jackknife cross-validation. Model classification of bats and terns (N = 4 and 2, respectively) was 94% and 91% correct, respectively; however, the variance associated with the tracks from these targets is poorly estimated. Model classification of gulls and swallows (N ≥ 18) was on average 73% and 85% correct, respectively. The models developed here should be considered preliminary because they are based on a small data set both in terms of the numbers of species and the identified flight tracks. Future classification models would be greatly improved by including a measure of distance between the camera and the target.

  11. High-Resolution Taxonomic Profiling of the Subgingival Microbiome for Biomarker Discovery and Periodontitis Diagnosis

    PubMed Central

    Szafranski, Szymon P.; Wos-Oxley, Melissa L.; Vilchez-Vargas, Ramiro; Jáuregui, Ruy; Plumeier, Iris; Klawonn, Frank; Tomasch, Jürgen; Meisinger, Christa; Kühnisch, Jan; Sztajer, Helena; Pieper, Dietmar H.

    2014-01-01

    The oral microbiome plays a key role for caries, periodontitis, and systemic diseases. A method for rapid, high-resolution, robust taxonomic profiling of subgingival bacterial communities for early detection of periodontitis biomarkers would therefore be a useful tool for individualized medicine. Here, we used Illumina sequencing of the V1-V2 and V5-V6 hypervariable regions of the 16S rRNA gene. A sample stratification pipeline was developed in a pilot study of 19 individuals, 9 of whom had been diagnosed with chronic periodontitis. Five hundred twenty-three operational taxonomic units (OTUs) were obtained from the V1-V2 region and 432 from the V5-V6 region. Key periodontal pathogens like Porphyromonas gingivalis, Treponema denticola, and Tannerella forsythia could be identified at the species level with both primer sets. Principal coordinate analysis identified two outliers that were consistently independent of the hypervariable region and method of DNA extraction used. The linear discriminant analysis (LDA) effect size algorithm (LEfSe) identified 80 OTU-level biomarkers of periodontitis and 17 of health. Health- and periodontitis-related clusters of OTUs were identified using a connectivity analysis, and the results confirmed previous studies with several thousands of samples. A machine learning algorithm was developed which was trained on all but one sample and then predicted the diagnosis of the left-out sample (jackknife method). Using a combination of the 10 best biomarkers, 15 of 17 samples were correctly diagnosed. Training the algorithm on time-resolved community profiles might provide a highly sensitive tool to detect the onset of periodontitis. PMID:25452281
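
The leave-one-out ("jackknife") validation described above, in which a model is trained on all but one sample and then predicts the held-out sample, can be sketched as follows. This is a minimal illustration under stated assumptions: the nearest-centroid classifier, the function names, and the toy data are hypothetical stand-ins and are not the algorithm or the biomarker data used in the study.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Assign x the label whose class centroid (feature-wise mean) is closest."""
    sums, counts = {}, {}
    for features, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [v / counts[label] for v in acc]
        dist = sum((c - v) ** 2 for c, v in zip(centroid, x))  # squared Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(X, y, predict=nearest_centroid_predict):
    """Hold out each sample in turn, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy one-feature data (hypothetical), two well-separated classes.
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = ["health", "health", "health", "perio", "perio", "perio"]
print(leave_one_out_accuracy(X, y))
```

Because every sample is predicted by a model that never saw it, the resulting accuracy is less optimistic than training accuracy, which matters for small cohorts such as the one described here.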

  12. Characterizing informative sequence descriptors and predicting binding affinities of heterodimeric protein complexes

    PubMed Central

    2015-01-01

    Background Protein-protein interactions (PPIs) are involved in various biological processes, and underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. Results This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded the training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (Pkd) of 200 heterodimeric protein complexes. Prediction performance of a Jackknife test was the correlation coefficient of 0.34 and mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn. 
    Conclusions The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces in high binding affinity complexes are more than those in low binding affinity complexes. PMID:26681483

  13. Microbial diversity of biofilms in dental unit water systems.

    PubMed

    Singh, Ruby; Stine, O Colin; Smith, David L; Spitznagel, John K; Labib, Mohamed E; Williams, Henry N

    2003-06-01

    We investigated the microbial diversity of biofilms found in dental unit water systems (DUWS) by three methods. The first was microscopic examination by scanning electron microscopy (SEM), acridine orange staining, and fluorescent in situ hybridization (FISH). Most bacteria present in the biofilm were viable. FISH detected the beta and gamma, but not the alpha, subclasses of Proteobacteria. In the second method, 55 cultivated biofilm isolates were identified with the Biolog system, fatty acid analysis, and 16S ribosomal DNA (rDNA) sequencing. Only 16S identified all 55 isolates, which represented 13 genera. The most common organisms, as shown by analyses of 16S rDNA, belonged to the genera Afipia (28%) and Sphingomonas (16%). The third method was a culture-independent direct amplification and sequencing of 165 subclones from community biofilm 16S rDNA. This method revealed 40 genera; the most common ones included Leptospira (20%), Sphingomonas (14%), Bacillus (7%), Escherichia (6%), Geobacter (5%), and Pseudomonas (5%). Some of these organisms may be opportunistic pathogens. Our results have demonstrated that a biofilm in a health care setting may harbor a vast diversity of organisms. The results also reflect the limitations of culture-based techniques to detect and identify bacteria. Although this is the greatest diversity reported in DUWS biofilms, other genera may have been missed. Using a technique based on jackknife subsampling, we projected that a 25-fold increase in the number of subclones sequenced would approximately double the number of genera observed, reflecting the richness and high diversity of microbial communities in these biofilms. PMID:12788744

  14. iDNA-Prot|dis: identifying DNA-binding proteins by incorporating amino acid distance-pairs and reduced alphabet profile into the general pseudo amino acid composition.

    PubMed

    Liu, Bin; Xu, Jinghao; Lan, Xun; Xu, Ruifeng; Zhou, Jiyun; Wang, Xiaolong; Chou, Kuo-Chen

    2014-01-01

    Playing crucial roles in various cellular processes, such as recognition of specific nucleotide sequences, regulation of transcription, and regulation of gene expression, DNA-binding proteins are essential ingredients for both eukaryotic and prokaryotic proteomes. With the avalanche of protein sequences generated in the postgenomic age, it is a critical challenge to develop automated methods for accurately and rapidly identifying DNA-binding proteins based on their sequence information alone. Here, a novel predictor, called "iDNA-Prot|dis", was established by incorporating the amino acid distance-pair coupling information and the amino acid reduced alphabet profile into the general pseudo amino acid composition (PseAAC) vector. The former can capture the characteristics of DNA-binding proteins so as to enhance its prediction quality, while the latter can reduce the dimension of the PseAAC vector so as to speed up its prediction process. The rigorous jackknife and independent dataset tests showed that the new predictor outperformed the existing predictors for the same purpose. As a user-friendly web-server, iDNA-Prot|dis is accessible to the public at http://bioinformatics.hitsz.edu.cn/iDNA-Prot_dis/. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step protocol guide is provided on how to use the web-server to get their desired results without the need to follow the complicated mathematical equations that are presented in this paper just for the integrity of its developing process. It is anticipated that the iDNA-Prot|dis predictor may become a useful high throughput tool for large-scale analysis of DNA-binding proteins, or at the very least, play a complementary role to the existing predictors in this regard. PMID:25184541

  15. iGPCR-drug: a web server for predicting interaction between GPCRs and drugs in cellular networking.

    PubMed

    Xiao, Xuan; Min, Jian-Liang; Wang, Pu; Chou, Kuo-Chen

    2013-01-01

    Involved in many diseases such as cancer, diabetes, neurodegenerative, inflammatory and respiratory disorders, G-protein-coupled receptors (GPCRs) are among the most frequent targets of therapeutic drugs. It is time-consuming and expensive to determine whether a drug and a GPCR are to interact with each other in a cellular network purely by means of experimental techniques. Although some computational methods were developed in this regard based on the knowledge of the 3D (dimensional) structure of protein, unfortunately their usage is quite limited because the 3D structures for most GPCRs are still unknown. To overcome the situation, a sequence-based classifier, called "iGPCR-drug", was developed to predict the interactions between GPCRs and drugs in cellular networking. In the predictor, the drug compound is formulated by a 2D (dimensional) fingerprint via a 256D vector, GPCR by the PseAAC (pseudo amino acid composition) generated with the grey model theory, and the prediction engine is operated by the fuzzy K-nearest neighbour algorithm. Moreover, a user-friendly web-server for iGPCR-drug was established at http://www.jci-bioinfo.cn/iGPCR-Drug/. For the convenience of most experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results without the need to follow the complicated math equations presented in this paper just for its integrity. The overall success rate achieved by iGPCR-drug via the jackknife test was 85.5%, which is remarkably higher than the rate by the existing peer method developed in 2010 although no web server was ever established for it. It is anticipated that iGPCR-Drug may become a useful high throughput tool for both basic research and drug development, and that the approach presented here can also be extended to study other drug - target interaction networks. PMID:24015221

  16. Comparing halo bias from abundance and clustering

    NASA Astrophysics Data System (ADS)

    Hoffmann, K.; Bel, J.; Gaztañaga, E.

    2015-06-01

    We model the abundance of haloes in the ~(3 Gpc h⁻¹)³ volume of the MICE Grand Challenge simulation by fitting the universal mass function with an improved Jackknife error covariance estimator that matches theory predictions. We present unifying relations between different fitting models and new predictions for linear (b1) and non-linear (c2 and c3) halo clustering bias. Different mass function fits show strong variations in their performance when including the low mass range (Mh ≲ 3 × 10¹² M⊙ h⁻¹) in the analysis. Together with fits from the literature, we find an overall variation in the amplitudes of around 10 per cent in the low mass and up to 50 per cent in the high mass (galaxy cluster) range (Mh > 10¹⁴ M⊙ h⁻¹). These variations propagate into a 10 per cent change in b1 predictions and a 50 per cent change in c2 or c3. Despite these strong variations, we find universal relations between b1 and c2 or c3 for which we provide simple fits. Excluding low-mass haloes, different models fitted with reasonable goodness in this analysis show per cent level agreement in their b1 predictions, but are systematically 5-10 per cent lower than the bias directly measured with two-point halo-mass clustering. This result confirms previous findings derived from smaller volumes (and smaller masses). Inaccuracies in the bias predictions lead to 5-10 per cent errors in growth measurements. They also affect any halo occupation distribution fitting or (cluster) mass calibration from clustering measurements.

  17. Benthic macrofauna habitat associations in Willapa Bay, Washington, USA

    NASA Astrophysics Data System (ADS)

    Ferraro, Steven P.; Cole, Faith A.

    2007-02-01

    Estuary-wide benthic macrofauna-habitat associations in Willapa Bay, Washington, United States, were determined for 4 habitats (eelgrass [ Zostera marina], Atlantic cordgrass [ Spartina alterniflora], mud shrimp [ Upogebia pugettensis], ghost shrimp [ Neotrypaea californiensis]) in 1996 and 7 habitats (eelgrass, Atlantic cordgrass, mud shrimp, ghost shrimp, oyster [ Crassostrea gigas], bare mud/sand, subtidal) in 1998. Most benthic macrofaunal species inhabited multiple habitats; however, 2 dominants, a fanworm, Manayunkia aestuarina, in Spartina, and a sand dollar, Dendraster excentricus, in subtidal, were rare or absent in all other habitats. Benthic macrofaunal Bray-Curtis similarity varied among all habitats except eelgrass and oyster. There were significant differences among habitats within- and between-years on several of the following ecological indicators: mean number of species ( S), abundance ( A), biomass ( B), abundance of deposit (AD), suspension (AS), and facultative (AF) feeders, Swartz's index (SI), Brillouin's index ( H), and jackknife estimates of habitat species richness (HSR). In the 4 habitats sampled in both years, A was about 2.5× greater in 1996 (a La Niña year) than 1998 (a strong El Niño year) yet relative values of S, A, B, AD, AS, SI, and H among the habitats were not significantly different, indicating strong benthic macrofauna-habitat associations despite considerable climatic and environmental variability. In general, the rank order of habitats on indicators associated with high diversity and productivity (high S, A, B, SI, H, HSR) was eelgrass = oyster ≥ Atlantic cordgrass ≥ mud shrimp ≥ bare mud/sand ≥ ghost shrimp = subtidal. Vegetation, burrowing shrimp, and oyster density and sediment %silt + clay and %total organic carbon were generally poor, temporally inconsistent predictors of ecological indicator variability within habitats. 
    The benthic macrofauna-habitat associations in this study can be used to help identify critical habitats, prioritize habitats for environmental protection, index habitat suitability, assess habitat equivalency, and as habitat value criteria in ecological risk assessments in Willapa Bay.
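
The abstract above reports jackknife estimates of habitat species richness (HSR) without giving the formula. One common choice, assumed here purely for illustration, is the first-order jackknife estimator S_jack1 = S_obs + Q1 (n - 1) / n, where S_obs is the number of species observed, Q1 is the number of species found in exactly one sample, and n is the number of samples. The function name and species labels below are hypothetical.

```python
def jackknife1_richness(samples):
    """First-order jackknife richness from a list of per-sample species collections."""
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    # Count in how many samples each species occurs (presence/absence only).
    occurrences = {}
    for sample in samples:
        for species in set(sample):
            occurrences[species] = occurrences.get(species, 0) + 1
    observed = len(occurrences)
    uniques = sum(1 for count in occurrences.values() if count == 1)
    # Species seen in only one sample hint at species not yet seen at all.
    return observed + uniques * (n - 1) / n


# Hypothetical samples from one habitat: species c and d each occur in a single sample.
samples = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}, {"a"}]
print(jackknife1_richness(samples))  # 4 observed + 2 * 3/4 = 5.5
```

The estimate always equals or exceeds the observed count, which is why it is used to correct for species missed by limited sampling.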

  18. Ngram time series model to predict activity type and energy cost from wrist, hip and ankle accelerometers: implications of age

    PubMed Central

    Strath, Scott J; Kate, Rohit J; Keenan, Kevin G; Welch, Whitney A; Swartz, Ann M

    2016-01-01

    To develop and test time series single site and multi-site placement models, we used wrist, hip and ankle processed accelerometer data to estimate energy cost and type of physical activity in adults. Ninety-nine subjects in three age groups (18–39, 40–64, 65 + years) performed 11 activities while wearing three triaxial accelerometers: one each on the non-dominant wrist, hip, and ankle. During each activity, net oxygen cost (METs) was assessed. The time series of accelerometer signals were represented in terms of uniformly discretized values called bins. Support Vector Machine was used for activity classification with bins and every pair of bins used as features. Bagged decision tree regression was used for net metabolic cost prediction. To evaluate model performance we employed the jackknife leave-one-out cross validation method. Single accelerometer and multi-accelerometer site model estimates across and within age group revealed similar accuracy, with a bias range of −0.03 to 0.01 METs, bias percent of −0.8 to 0.3%, and a rMSE range of 0.81–1.04 METs. Multi-site accelerometer location models improved activity type classification over single site location models from a low of 69.3% to a maximum of 92.8% accuracy. For each accelerometer site location model, or combined site location model, percent accuracy classification decreased as a function of age group, or when young age group models were generalized to older age groups. Specific age group models on average performed better than when all age groups were combined. A time series computation shows promising results for predicting energy cost and activity type. Differences in prediction across age groups, a lack of generalizability across age groups, and the finding that age-group-specific models perform better than models combining all ages need to be considered as analytic calibration procedures to detect energy cost and type are further developed. PMID:26449155

  19. An Ancient Origin for the Enigmatic Flat-Headed Frogs (Bombinatoridae: Barbourula) from the Islands of Southeast Asia

    PubMed Central

    Blackburn, David C.; Bickford, David P.; Diesmos, Arvin C.; Iskandar, Djoko T.; Brown, Rafe M.

    2010-01-01

    Background The complex history of Southeast Asian islands has long been of interest to biogeographers. Dispersal and vicariance events in the Pleistocene have received the most attention, though recent studies suggest a potentially more ancient history to components of the terrestrial fauna. Among this fauna is the enigmatic archaeobatrachian frog genus Barbourula, which only occurs on the islands of Borneo and Palawan. We utilize this lineage to gain unique insight into the temporal history of lineage diversification in Southeast Asian islands. Methodology/Principal Findings Using mitochondrial and nuclear genetic data, multiple fossil calibration points, and likelihood and Bayesian methods, we estimate phylogenetic relationships and divergence times for Barbourula. We determine the sensitivity of focal divergence times to specific calibration points using a jackknife approach in which each calibration point is excluded from analysis. We find that relevant divergence time estimates are robust to the exclusion of specific calibration points. Barbourula is recovered as a monophyletic lineage nested within a monophyletic Costata. Barbourula diverged from its sister taxon Bombina in the Paleogene and the two species of Barbourula diverged in the Late Miocene. Conclusions/Significance The divergences within Barbourula and between it and Bombina are surprisingly old and represent the oldest estimates for a cladogenetic event resulting in living taxa endemic to Southeast Asian islands. Moreover, these divergence time estimates are consistent with a new biogeographic scenario: the Palawan Ark Hypothesis. We suggest that components of Palawan's terrestrial fauna might have “rafted” on emergent portions of the North Palawan Block during its migration from the Asian mainland to its present-day position near Borneo. Further, dispersal from Palawan to Borneo (rather than Borneo to Palawan) may explain the current day disjunct distribution of this ancient lineage. PMID:20711504

  20. Estimation of sex from the anthropometric ear measurements of a Sudanese population.

    PubMed

    Ahmed, Altayeb Abdalla; Omer, Nosyba

    2015-09-01

    The external ear and its prints have multifaceted roles in medico-legal practice, e.g., identification and facial reconstruction. Furthermore, its norms are essential in the diagnosis of congenital anomalies and the design of hearing aids. Body part dimensions vary in different ethnic groups, so the most accurate statistical estimations of biological attributes are developed using population-specific standards. Sudan lacks comprehensive data about ear norms; moreover, there is a universal rarity in assessing the possibility of sex estimation from ear dimensions using robust statistical techniques. Therefore, this study attempts to establish data for normal adult Sudanese Arabs, assessing the existence of asymmetry and developing a population-specific equation for sex estimation. The study sample comprised 200 healthy Sudanese Arab volunteers (100 males and 100 females) in the age range of 18-30 years. The physiognomic ear length and width, lobule length and width, and conchal length and width measurements were obtained by direct anthropometry, using a digital sliding caliper. Moreover, indices and asymmetry were assessed. Data were analyzed using basic descriptive statistics and discriminant function analyses employing jackknife validations of classification results. All linear dimensions used were sexually dimorphic except lobular lengths. Some of the variables and indices show asymmetry. Ear dimensions showed cross-validated sex classification accuracy ranging between 60.5% and 72%. Hence, the ear measurements cannot be used as an effective tool in the estimation of sex. However, in the absence of other more reliable means, it still can be considered a supportive trait in sex estimation. Further, asymmetry should be considered in identification from the ear measurements. PMID:25813757

  1. Lower-extremity strength profiles and gender-based classification of basketball players ages 9-22 years.

    PubMed

    Buchanan, Patricia A; Vardaxis, Vassilios G

    2009-03-01

    Despite an increase in women sports participants and recognition of gender differences in injury patterns (e.g., knee), few normative strength data exist beyond hamstrings and quadriceps measures. This study had 2 purposes: to assess the lower-extremity strength of women (W) and men (M) basketball players who were 9-22 years old, and to determine which strength measures most correctly classify the gender of 12- to 22-year-old athletes. Fifty basketball players (26 W, 24 M) without ligamentous or meniscal injury performed concentric isokinetic testing of bilateral hip, knee, and ankle musculature. We identified maximal peak torques for the hip (flexors, extensors, abductors, adductors), knee (flexors and extensors), and ankle (plantar flexors and dorsiflexors), and we formed periarticular (hip, knee, and ankle), antigravity, and total leg strength composite measures. We calculated mean and 95% confidence intervals. With body mass-height normalization, most age and gender differences were small. Mean values were typically higher for older vs. younger players and for men vs. women players. Mean values were often lower for girls 12-13 years vs. those 9-10 years. In the age group of 16-22 years, men had stronger knee flexors, hip flexors, plantar flexors, and total leg strength than women. Men who were 16-22 years old had stronger knee flexors and hip flexors than did younger men and women players. Based on discriminant function, knee strength measures did not adequately classify gender. Instead, total leg strength measures had correct gender classifications of 74 and 69% (jackknifed) with significant multivariate tests (p = 0.025). For researchers and practitioners, these results support strength assessment and training of the whole lower extremity, not just knee musculature. Limited strength differences between girls 9-10 years old and those 12-13 years old suggest that the peripubertal period is an important time to target strength development. PMID:19209081
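
    The two classification rates quoted above (74% and 69% jackknifed) reflect the usual gap between resubstitution accuracy, where each case is classified by a rule fitted on all cases including itself, and jackknifed accuracy, where each case is classified by a rule fitted without it. The sketch below illustrates that difference under loud assumptions: a nearest-centroid rule stands in for the discriminant function, and the data and names are invented for illustration.

```python
def centroid(rows):
    """Column-wise mean of a list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def predict(train_x, train_y, point):
    """Nearest-centroid rule (a stand-in for a discriminant function)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        center = centroid(members)
        dist = sum((a - b) ** 2 for a, b in zip(point, center))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def resubstitution_accuracy(xs, ys):
    """Classify every case with a rule fitted on all cases."""
    hits = sum(int(predict(xs, ys, x) == y) for x, y in zip(xs, ys))
    return hits / len(xs)

def jackknifed_accuracy(xs, ys):
    """Classify every case with a rule fitted on all other cases."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += int(predict(train_x, train_y, xs[i]) == ys[i])
    return hits / len(xs)

# Hypothetical one-feature strength scores for two groups.
scores = [[1.0], [2.0], [4.4], [4.6], [7.0], [8.0]]
groups = ["W", "W", "W", "M", "M", "M"]
print(resubstitution_accuracy(scores, groups))
print(jackknifed_accuracy(scores, groups))
```

    In this toy data the two borderline cases are classified correctly only when they help define their own group centroid, so the jackknifed rate is the lower and more honest of the two.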

  2. Experience in reading digital images may decrease observer accuracy in mammography

    NASA Astrophysics Data System (ADS)

    Rawashdeh, Mohammad A.; Lewis, Sarah J.; Lee, Warwick; Mello-Thoms, Claudia; Reed, Warren M.; McEntee, Mark; Tapia, Kriscia; Brennan, Patrick C.

    2015-03-01

    Rationale and Objectives: To identify parameters linked to higher levels of performance in screening mammography. In particular we explored whether experience in reading digital cases enhances radiologists' performance. Methods: A total of 60 cases were presented to the readers, of which 20 contained cancers and 40 showed no abnormality. Each case comprised four images, and 129 breast readers participated in the study. Each reader was asked to identify and locate any malignancies using a 1-5 confidence scale. All images were displayed using 5MP monitors, supported by radiology workstations with full image manipulation capabilities. A jack-knife free-response receiver operating characteristic, figure of merit (JAFROC, FOM) methodology was employed to assess reader performance. Details were obtained from each reader regarding their experience, qualifications and breast reading activities. Spearman and Mann Whitney U techniques were used for statistical analysis. Results: Higher performance was positively related to numbers of years professionally qualified (r = 0.18; P<0.05), number of years reading breast images (r = 0.24; P<0.01), number of mammography images read per year (r = 0.28; P<0.001) and number of hours reading mammographic images per week (r = 0.19; P<0.04). Unexpectedly, higher performance was inversely linked to previous experience with digital images (r = -0.17; p<0.05) and further analysis demonstrated that this finding was due to changes in specificity. Conclusion: The suggestion that readers with experience in digital image reporting may exhibit a reduced ability to correctly identify normal appearances requires further investigation. Higher performance is linked to number of cases read per year.

  3. Analysis and prediction of affinity of TAP binding peptides using cascade SVM.

    PubMed

    Bhasin, Manoj; Raghava, G P S

    2004-03-01

    The generation of cytotoxic T lymphocyte (CTL) epitopes from an antigenic sequence involves a number of intracellular processes, including production of peptide fragments by proteasome and transport of peptides to endoplasmic reticulum through transporter associated with antigen processing (TAP). In this study, 409 peptides that bind to human TAP transporter with varying affinity were analyzed to explore the selectivity and specificity of TAP transporter. The abundance of each amino acid from P1 to P9 positions in high-, intermediate-, and low-affinity TAP binders was examined. The rules for predicting TAP binding regions in an antigenic sequence were derived from the above analysis. The quantitative matrix was generated on the basis of contribution of each position and residue in binding affinity. The correlation of r = 0.65 was obtained between experimentally determined and predicted binding affinity by using a quantitative matrix. Further, a support vector machine (SVM)-based method has been developed to model the TAP binding affinity of peptides. The correlation (r = 0.80) was obtained between the predicted and experimentally measured values by using sequence-based SVM. The reliability of prediction was further improved by cascade SVM that uses features of amino acids along with sequence. An extremely good correlation (r = 0.88) was obtained between measured and predicted values, when the cascade SVM-based method was evaluated through jackknife testing. A Web service, TAPPred (http://www.imtech.res.in/raghava/tappred/ or http://bioinformatics.uams.edu/mirror/tappred/), has been developed based on this approach. PMID:14978300

  4. Nestedness in centipede (Chilopoda) assemblages on continental islands (Aegean, Greece)

    NASA Astrophysics Data System (ADS)

    Simaiakis, Stylianos Michail; Martínez-Morales, Miguel Angel

    2010-05-01

    In natural ecosystems, species assemblages among isolated ecological communities such as continental islands often show a nested pattern in which biotas of sites with low species richness are non-random subsets of biotas of richer sites. The distribution of centipede (Chilopoda) species in the central and south Aegean archipelago was tested for nestedness. To achieve this aim we used distribution data for 53 species collected on 24 continental Aegean islands (Kyklades and Dodekanisa). Based on the first-order jackknife estimator, most of the islands were comprehensively surveyed. In order to quantify nestedness, we used the nestedness temperature calculator (NTC) as well as the nestedness metric based on overlap and decreasing Fill (NODF). NTC indicated that data exhibited a high degree of nestedness in the central and south Aegean island complexes. As far as the Kyklades and Dodekanisa are concerned, NTC showed less nested centipede structures than the 24 islands. Likewise, NODF revealed a significant degree of nestedness in central and south Aegean islands. It also showed that biotas matrices without singletons were more nested than the complete ones (Aegean, Kyklades and Dodekanisa). The two commonest centipede taxa (lithobiomorphs and geophilomorphs) contributed differently to centipede assemblages. In the Kyklades and Dodekanisa, geophilomorphs did not show a reliable nested arrangement unlike lithobiomorphs. In relation to the entire data set, nestedness was positively associated with the degree of isolation. In the Kyklades altitudinal range best explained nestedness patterns, while in Dodekanisa habitat heterogeneity proved to be more important for the centipede communities. Island area does not seem to be a significant explanatory variable. Some of our results from the Kyklades were critically compared with those for terrestrial isopod and land snail nested assemblages from the same geographical area. The complex geological and palaeogeographical history of the Aegean archipelago partly accounted for the pattern of centipede assemblages.
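
    The first-order jackknife richness estimator mentioned in this abstract has a simple closed form for incidence data: S_jack1 = S_obs + Q1 (m - 1) / m, where S_obs is the number of species observed, Q1 is the number of species found in exactly one sampling unit, and m is the number of sampling units. The sketch below assumes this standard incidence-based form; the abstract does not state which variant or sampling unit the authors used, and the species labels are invented.

```python
def jackknife1_richness(samples):
    """First-order jackknife estimate of species richness.

    `samples` is a list of sets, one per sampling unit, each holding
    the species recorded in that unit (incidence data).
    """
    m = len(samples)
    occurrences = {}
    for unit in samples:
        for species in set(unit):
            occurrences[species] = occurrences.get(species, 0) + 1
    observed = len(occurrences)
    # "Uniques": species recorded in exactly one sampling unit.
    uniques = sum(1 for count in occurrences.values() if count == 1)
    return observed + uniques * (m - 1) / m

# Hypothetical survey: four sampling units on one island.
units = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}, {"a"}]
estimate = jackknife1_richness(units)
print(estimate)  # 4 observed + 2 uniques * 3/4 = 5.5
print(len(set().union(*units)) / estimate)  # survey completeness
```

    The ratio of observed to estimated richness is one common way to judge whether a site was "comprehensively surveyed"; the threshold for that judgment is a choice the abstract does not report.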

  5. Analysis of the successional patterns of insects on carrion in southwest Virginia.

    PubMed

    Tabor, Kimberly L; Brewster, Carlyle C; Fell, Richard D

    2004-07-01

    Studies of carrion-insect succession on domestic pig, Sus scrofa L., were conducted in the spring and summer of 2001 and 2002 in Blacksburg, VA, to identify and analyze the successional patterns of the taxa of forensic importance in southwest Virginia. Forty-seven insect taxa were collected in the spring. These were represented by 11 families (Diptera: Calliphoridae, Sarcophagidae, Muscidae, Sepsidae, Piophilidae; Coleoptera: Staphylinidae, Silphidae, Cleridae, Trogidae, Dermestidae, Histeridae). In the summer, 33 taxa were collected that were represented by all of the families collected in the spring, except Trogidae. The most common flies collected were the calliphorids: Phormia regina (Meigen) and Phaenicia coeruleiviridis (Macquart). The most common beetles were Creophilus maxillosus L. (Staphylinidae), Oiceoptoma noveboracense Forster, Necrophila americana L., Necrodes surinamensis (F.) (Silphidae), Euspilotus assimilis (Paykull), and Hister abbreviatus F. (Histeridae). Occurrence matrices were constructed for the successional patterns of insect taxa during 21 sampling intervals in the spring and 8 intervals in the summer studies. Jackknife estimates (mean+/-95% confidence limits) of overall Jaccard similarity in insect taxa among sampling intervals in the occurrence matrices were 0.213+/-0.081 (spring 2001), 0.194+/-0.043 (summer 2001), 0.257+/-0.068 (spring 2002), and 0.274+/-0.172 (summer 2002). Permutation analyses of the occurrence matrices showed that the patterns of succession of insect taxa were similar between spring 2001 and 2002 (P = 0.001) and between summer 2001 and 2002 (P = 0.007). The successional patterns seem to be typical for the seasonal periods and provide data on baseline fauna for estimating postmortem interval in cases of human death. This study is the first of its kind for southwest Virginia. PMID:15311476
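
    Jackknife estimates with confidence limits, such as the Jaccard similarity values above, are commonly built from pseudovalues: for n units and a statistic T, the i-th pseudovalue is n T(all) - (n - 1) T(without unit i), and the mean and standard error of the pseudovalues give the estimate and its limits. The sketch below makes several assumptions that the abstract does not confirm: taxa (matrix rows) are the units deleted, the statistic is the mean pairwise Jaccard similarity between sampling intervals (columns), and a normal multiplier of 1.96 replaces a t quantile.

```python
import math
from itertools import combinations

def mean_jaccard(matrix):
    """Mean pairwise Jaccard similarity between columns of a 0/1 matrix.

    Rows are taxa, columns are sampling intervals.
    """
    n_cols = len(matrix[0])
    sims = []
    for a, b in combinations(range(n_cols), 2):
        both = sum(1 for row in matrix if row[a] and row[b])
        either = sum(1 for row in matrix if row[a] or row[b])
        sims.append(both / either if either else 0.0)
    return sum(sims) / len(sims)

def jackknife_estimate(matrix, stat, z=1.96):
    """Return (estimate, half-width of ~95% limits) from pseudovalues.

    Assumption: rows are deleted one at a time, and z = 1.96 is a
    normal approximation rather than a t quantile.
    """
    n = len(matrix)
    full = stat(matrix)
    pseudo = []
    for i in range(n):
        reduced = matrix[:i] + matrix[i + 1:]
        pseudo.append(n * full - (n - 1) * stat(reduced))
    estimate = sum(pseudo) / n
    variance = sum((p - estimate) ** 2 for p in pseudo) / (n - 1)
    return estimate, z * math.sqrt(variance / n)

# Hypothetical occurrence matrix: 5 taxa over 4 sampling intervals.
occurrence = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
]
est, half = jackknife_estimate(occurrence, mean_jaccard)
print(f"{est:.3f} +/- {half:.3f}")
```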

  6. Modeling the Distribution of Cutaneous Leishmaniasis Vectors (Psychodidae: Phlebotominae) in Iran: A Potential Transmission in Disease Prone Areas.

    PubMed

    Hanafi-Bojd, Ahmad Ali; Yaghoobi-Ershadi, Mohammad Reza; Haghdoost, Ali Akbar; Akhavan, Amir Ahmad; Rassi, Yavar; Karimi, Ameneh; Charrahy, Zabihollah

    2015-07-01

    Cutaneous leishmaniasis (CL) is now the main vector-borne disease in Iran. Two forms of the disease exist in the country, transmitted by Phlebotomus papatasi and Phlebotomus sergenti s.l. Modeling distribution of the vector species is beneficial for preparedness and planning to interrupt the transmission cycle. Data on sand fly distribution during 1990-2013 were used to predict the niche suitability. A MaxEnt model was used for prediction using bioclimatic and environmental variables (precipitation, temperature, altitude, slope, and aspect). Regularized training, area under the curve, and unregularized training gains were 0.916, 0.915, and 1.503, respectively, for Ph. papatasi. These values were calculated as 0.987, 0.923, and 1.588 for Ph. sergenti s.l. The jackknife test showed that the environmental variable with the highest gain when used in isolation was the mean temperature of the wettest quarter for both species, while slope decreased the gain the most when it was omitted from the model. Classification of probability of presence for the two studied species was performed on five classes using equal intervals in ArcGIS. Areas with more than 60% probability of presence were considered to have high potential for CL transmission. These areas include arid and semiarid climates, mainly located in the central part of the country. Mean altitude, annual precipitation, and temperature in these areas were calculated as 990 and 1,235 m, 273 and 226 mm, and 17.5 and 16.4°C for Ph. papatasi and Ph. sergenti s.l., respectively. These findings can be used in the prediction of CL transmission potential, as well as for planning the disease control interventions. PMID:26335462
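
    The jackknife test of variable importance reported here follows a simple procedure: refit the model with each variable alone, refit it with each variable omitted, and compare the resulting gains with the gain of the full model. The sketch below shows only that bookkeeping. It does not call MaxEnt; `fit_gain` is a hypothetical callback standing in for "fit a model on these variables and return its training gain", and the toy gain function and its numbers are invented so that two climate variables carry overlapping information.

```python
def jackknife_variable_gains(variables, fit_gain):
    """Gain with each variable alone and with each variable omitted.

    `fit_gain(names)` is a hypothetical callback that fits a model on
    the named variables and returns its gain.
    """
    full = fit_gain(list(variables))
    report = {}
    for name in variables:
        alone = fit_gain([name])
        without = fit_gain([v for v in variables if v != name])
        report[name] = {
            "only": alone,           # gain using this variable alone
            "without": without,      # gain with this variable omitted
            "drop": full - without,  # loss caused by omitting it
        }
    return full, report

def toy_gain(names):
    """Invented gain: temperature and precipitation are redundant."""
    climate = max(
        0.9 if "temperature" in names else 0.0,
        0.8 if "precipitation" in names else 0.0,
    )
    terrain = 0.5 if "slope" in names else 0.0
    return climate + terrain

full_gain, table = jackknife_variable_gains(
    ["temperature", "precipitation", "slope"], toy_gain
)
best_alone = max(table, key=lambda v: table[v]["only"])
most_missed = max(table, key=lambda v: table[v]["drop"])
print(best_alone, most_missed)
```

    In the toy example the variable with the highest gain alone is not the one whose omission hurts most, because a correlated variable can substitute for it. The same reasoning would explain how temperature and slope can play those two different roles in the abstract.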

  7. Housefly Population Density Correlates with Shigellosis among Children in Mirzapur, Bangladesh: A Time Series Analysis

    PubMed Central

    Farag, Tamer H.; Faruque, Abu S.; Wu, Yukun; Das, Sumon K.; Hossain, Anowar; Ahmed, Shahnawaz; Ahmed, Dilruba; Nasrin, Dilruba; Kotloff, Karen L.; Panchilangam, Sandra; Nataro, James P.; Cohen, Dani; Blackwelder, William C.; Levine, Myron M.

    2013-01-01

    Background Shigella infections are a public health problem in developing and transitional countries because of high transmissibility, severity of clinical disease, widespread antibiotic resistance and lack of a licensed vaccine. Whereas Shigellae are known to be transmitted primarily by direct fecal-oral contact and less commonly by contaminated food and water, the role of the housefly Musca domestica as a mechanical vector of transmission is less appreciated. We sought to assess the contribution of houseflies to Shigella-associated moderate-to-severe diarrhea (MSD) among children less than five years old in Mirzapur, Bangladesh, a site where shigellosis is hyperendemic, and to model the potential impact of a housefly control intervention. Methods Stool samples from 843 children presenting to Kumudini Hospital during 2009–2010 with new episodes of MSD (diarrhea accompanied by dehydration, dysentery or hospitalization) were analyzed. Housefly density was measured twice weekly in six randomly selected sentinel households. Poisson time series regression was performed and autoregression-adjusted attributable fractions (AFs) were calculated using the Bruzzi method, with standard errors via jackknife procedure. Findings Dramatic springtime peaks in housefly density in 2009 and 2010 were followed one to two months later by peaks of Shigella-associated MSD among toddlers and pre-school children. Poisson time series regression showed that housefly density was associated with Shigella cases at three lags (six weeks) (Incidence Rate Ratio = 1.39 [95% CI: 1.23 to 1.58] for each log increase in fly count), an association that was not confounded by ambient air temperature. Autocorrelation-adjusted AF calculations showed that a housefly control intervention could have prevented approximately 37% of the Shigella cases over the study period. Interpretation Houseflies may play an important role in the seasonal transmission of Shigella in some developing country ecologies. Interventions to control houseflies should be evaluated as possible additions to the public health arsenal to diminish Shigella (and perhaps other causes of) diarrheal infection. PMID:23818998

  8. APSLAP: an adaptive boosting technique for predicting subcellular localization of apoptosis protein.

    PubMed

    Saravanan, Vijayakumar; Lakshmi, P T V

    2013-12-01

    Apoptotic proteins play key roles in understanding the mechanism of programmed cell death. Knowledge about the subcellular localization of apoptotic protein is constructive in understanding the mechanism of programmed cell death, determining the functional characterization of the protein, screening candidates in drug design, and selecting protein for relevant studies. It is also proclaimed that the information required for determining the subcellular localization of protein resides in their corresponding amino acid sequence. In this work, a new biological feature, class pattern frequency of physiochemical descriptor, was effectively used in accordance with the amino acid composition, protein similarity measure, CTD (composition, translation, and distribution) of physiochemical descriptors, and sequence similarity to predict the subcellular localization of apoptosis protein. AdaBoost with the weak learner as Random-Forest was designed for the five modules and prediction is made based on the weighted voting system. A benchmark dataset of 317 apoptosis proteins was subjected to prediction by our system and the accuracy was found to be 100.0, 92.4, and 90.1 % for the self-consistency test, jack-knife test, and tenfold cross validation test, respectively, which is 0.9 % higher than that of other existing methods. Besides this, prediction on the independent datasets (N151 and ZW98) resulted in accuracies of 90.7 and 87.7 %, respectively. These results show that the protein feature represented by a combined feature vector along with AdaBoost algorithm holds well in effective prediction of subcellular localization of apoptosis proteins. The user friendly web interface "APSLAP" has been constructed, which is freely available at http://apslap.bicpu.edu.in and it is anticipated that this tool will play a significant role in determining the specific role of apoptosis proteins with reliability. PMID:23982307

  9. [Forest lightning fire forecasting for Daxing'anling Mountains based on MAXENT model].

    PubMed

    Sun, Yu; Shi, Ming-Chang; Peng, Huan; Zhu, Pei-Lin; Liu, Si-Lin; Wu, Shi-Lei; He, Cheng; Chen, Feng

    2014-04-01

    Daxing'anling Mountains is one of the areas with the highest occurrence of forest lightning fire in Heilongjiang Province, and developing a lightning fire forecast model to accurately predict the forest fires in this area is of importance. Based on the data of forest lightning fires and environment variables, the MAXENT model was used to predict the lightning fire in Daxing'anling region. Firstly, we studied the collinear diagnostic of each environment variable, evaluated the importance of the environmental variables using training gain and the Jackknife method, and then evaluated the prediction accuracy of the MAXENT model using the max Kappa value and the AUC value. The results showed that the variance inflation factor (VIF) values of lightning energy and neutralized charge were 5.012 and 6.230, respectively. They were collinear with the other variables, so they could not be used for model training. Daily rainfall, the number of cloud-to-ground lightning, and current intensity of cloud-to-ground lightning were the three most important factors affecting the lightning fires in the forest, while the daily average wind speed and the slope were of less importance. With the increase of the proportion of test data, the max Kappa and AUC values were increased. The max Kappa values were above 0.75 and the average value was 0.772, while all of the AUC values were above 0.5 and the average value was 0.859. With a moderate level of prediction accuracy being achieved, the MAXENT model could be used to predict forest lightning fire in Daxing'anling Mountains. PMID:25011305

  10. Life history dependent morphometric variation in stream-dwelling Atlantic salmon

    USGS Publications Warehouse

    Letcher, B.H.

    2003-01-01

    The time course of morphometric variation among life histories for stream-dwelling Atlantic salmon (Salmo salar L.) parr (age-0+ to age-2+) was analyzed. Possible life histories were combinations of parr maturity status in the autumn (mature or immature) and age at outmigration (smolt at age-2+ or later age). Actual life histories expressed with enough fish for analysis in the 1997 cohort were immature/age-2+ smolt, mature/age-2+ smolt, and mature/age-2+ non-smolt. Tagged fish were assigned to one of the three life histories and digital pictures from the field were analyzed using landmark-based geometric morphometrics. Results indicated that successful grouping of fish according to life history varied with fish age, but that fish could be grouped before the actual expression of the life histories. By March (age-1+), fish were successfully grouped using a descriptive discriminant function and successful assignment ranged from 84 to 97% for the remainder of stream residence. A jackknife of the discriminant function revealed an average life history prediction success of 67% from age-1+ summer to smolting. Low sample numbers for one of the life histories may have limited prediction success. A MANOVA on the shape descriptors (relative warps) also indicated significant differences in shape among life histories from age-1+ summer through to smolting. Across all samples, shape varied significantly with size. Within samples, shape did not vary significantly with size for samples from December (age-0+) to May (age-1+). During the age-1+ summer however, shape varied significantly with size, but the relationship between shape and size was not different among life histories. In the autumn (age-1+) and winter (age-2+), life history differences explained a significant portion of the change in shape with size. Life history dependent morphometric variation may be useful to indicate the timing of early expressions of life history variation and as a tool to explore temporal and spatial variation in life history expression.

  11. Predicting protein subcellular location by fusing multiple classifiers.

    PubMed

    Chou, Kuo-Chen; Shen, Hong-Bin

    2006-10-01

    One of the fundamental goals in cell biology and proteomics is to identify the functions of proteins in the context of compartments that organize them in the cellular environment. Knowledge of subcellular locations of proteins can provide key hints for revealing their functions and understanding how they interact with each other in cellular networking. Unfortunately, it is both time-consuming and expensive to determine the localization of an uncharacterized protein in a living cell purely based on experiments. With the avalanche of newly found protein sequences emerging in the post genomic era, we are facing a critical challenge, that is, how to develop an automated method to fast and reliably identify their subcellular locations so as to be able to timely use them for basic research and drug discovery. In view of this, an ensemble classifier was developed by the approach of fusing many basic individual classifiers through a voting system. Each of these basic classifiers was trained in a different dimension of the amphiphilic pseudo amino acid composition (Chou [2005] Bioinformatics 21: 10-19). As a demonstration, predictions were performed with the fusion classifier for proteins among the following 14 localizations: (1) cell wall, (2) centriole, (3) chloroplast, (4) cytoplasm, (5) cytoskeleton, (6) endoplasmic reticulum, (7) extracellular, (8) Golgi apparatus, (9) lysosome, (10) mitochondria, (11) nucleus, (12) peroxisome, (13) plasma membrane, and (14) vacuole. The overall success rates thus obtained via the resubstitution test, jackknife test, and independent dataset test were all significantly higher than those by the existing classifiers. It is anticipated that the novel ensemble classifier may also become a very useful vehicle in classifying other attributes of proteins according to their sequences, such as membrane protein type, enzyme family/sub-family, G-protein coupled receptor (GPCR) type, and structural class, among many others. The fusion ensemble classifier will be available at www.pami.sjtu.edu.cn/people/hbshen. PMID:16639720

  12. Historical extension of operational NDVI products for livestock insurance in Kenya

    NASA Astrophysics Data System (ADS)

    Vrieling, Anton; Meroni, Michele; Shee, Apurba; Mude, Andrew G.; Woodard, Joshua; de Bie, C. A. J. M. (Kees); Rembold, Felix

    2014-05-01

    Droughts induce livestock losses that severely affect Kenyan pastoralists. Recent index insurance schemes have the potential of being a viable tool for insuring pastoralists against drought-related risk. Such schemes require as input a forage scarcity (or drought) index that can be reliably updated in near real-time, and that strongly relates to livestock mortality. Generally, a long record (>25 years) of the index is needed to correctly estimate mortality risk and calculate the related insurance premium. Data from current operational satellites used for large-scale vegetation monitoring span over a maximum of 15 years, a time period that is considered insufficient for accurate premium computation. This study examines how operational NDVI datasets compare to, and could be combined with the non-operational recently constructed 30-year GIMMS AVHRR record (1981-2011) to provide a near-real time drought index with a long term archive for the arid lands of Kenya. We compared six freely available, near-real time NDVI products: five from MODIS and one from SPOT-VEGETATION. Prior to comparison, all datasets were averaged in time for the two vegetative seasons in Kenya, and aggregated spatially at the administrative division level at which the insurance is offered. The feasibility of extending the resulting aggregated drought indices back in time was assessed using jackknifed R2 statistics (leave-one-year-out) for the overlapping period 2002-2011. We found that division-specific models were more effective than a global model for linking the division-level temporal variability of the index between NDVI products. Based on our results, good scope exists for historically extending the aggregated drought index, thus providing a longer operational record for insurance purposes. We showed that this extension may have large effects on the calculated insurance premium. Finally, we discuss several possible improvements to the drought index.

  13. The effect of compression on confidence during the detection of skull fractures in CT

    NASA Astrophysics Data System (ADS)

    Nikolovski, Ines; McEntee, Mark F.; Bourne, Roger; Pietrzyk, Mariusz W.; Evanoff, Michael G.; Brennan, Patrick C.; Tay, Kevin

    2012-02-01

    As part of a study to establish whether detection of cranial vault fractures is affected by JPEG 2000 30:1 and 60:1 lossy compression when compared to JPEG 2000 lossless compression, we looked at the effects on confidence ratings. 55 CT images, with three levels of JPEG 2000 compression (lossless, 30:1 & 60:1), were presented to 14 senior radiologists, 12 from the American Board of Radiology and 2 from Australia, 7 of whom were MSK specialists and 7 were neuroradiologists. 32 images contained a single skull fracture while 23 were normal. Images were displayed on one calibrated, secondary LCD, in an ambient lighting of 32.2 lux. Observers were asked to identify the presence or absence of a fracture and, where a fracture was present, to locate it and rate their confidence in its presence. A jack-knifed alternate free-response receiver operating characteristic (JAFROC) and a ROC methodology were employed and the DBM MRMC and ANOVA were used to explore differences between the lossless and lossy compressed images. A significant trend of increased confidence in true and false positive scores was seen with JPEG2000 Lossy 60:1 compression. An ANOVA on the mean confidence rating obtained for correct (TP) and incorrect (FP) localization of skull fractures demonstrated that there was a significant difference between lossless and 60:1 [FP, p<0.001; TP, p<0.014] and 30:1 and 60:1 [FP, p<0.014; TP, p<0.037].

  14. External Quality Assessment (EQA) program for the preanalytical and analytical immunohistochemical determination of HER2 in breast cancer: an experience on a regional scale

    PubMed Central

    2013-01-01

    Background An External Quality Assessment (EQA) program was developed to investigate the state of the art of HER2 immunohistochemical determination in breast cancer (BC) in 16 Pathology Departments in the Lazio Region (Italy). This program was implemented through two specific steps to evaluate HER2 staining (step 1) and interpretation (step 2) reproducibility among participants. Methods The management activities of this EQA program were assigned to the Coordinating Center (CC), the Revising Centers (RCs) and the Participating Centers (PCs). In step 1, 4 BC sections, selected by RCs, were stained by each PC using their own procedures. In step 2, each PC interpreted HER2 score in 10 BC sections stained by the CC. The concordance pattern was evaluated by using the kappa category-specific statistic and/or the weighted kappa statistic with the corresponding 95% Jackknife confidence interval. Results In step 1, a substantial/almost perfect agreement was reached between the PCs for scores 0 and 3+ whereas a moderate and fair agreement was observed for scores 1+ and 2+, respectively. In step 2, a fully satisfactory agreement was observed for 6 out of the 16 PCs and a quite satisfactory agreement was obtained for the remaining 10 PCs. Conclusions Our findings highlight that in the whole HER2 evaluation process the two intermediate categories, scores 1+ and 2+, are less reproducible than scores 0 and 3+. These findings are relevant in clinical practice where the choice of treatment is based on HER2 positivity, suggesting the need to share evaluation procedures within laboratories and implement educational programs. PMID:23965490

  15. A phantom-based JAFROC observer study of two CT reconstruction methods: the search for optimisation of lesion detection and effective dose

    NASA Astrophysics Data System (ADS)

    Thompson, John D.; Chakraborty, Dev P.; Szczepura, Katy; Vamvakas, Ioannis; Tootell, Andrew; Manning, David J.; Hogg, Peter

    2015-03-01

    Purpose: To investigate the dose saving potential of iterative reconstruction (IR) in a computed tomography (CT) examination of the thorax. Materials and Methods: An anthropomorphic chest phantom containing various configurations of simulated lesions (5, 8, 10 and 12 mm; +100, -630 and -800 Hounsfield Units, HU) was imaged on a modern CT system over a tube current range (20, 40, 60 and 80 mA). Images were reconstructed with IR and filtered back projection (FBP). An ATOM 701D (CIRS, Norfolk, VA) dosimetry phantom was used to measure organ dose. Effective dose was calculated. Eleven observers (15.11 ± 8.75 years of experience) completed a free-response study, localizing lesions in 544 single CT image slices. A modified jackknife alternative free-response receiver operating characteristic (JAFROC) analysis was completed to look for a significant effect of two factors: reconstruction method and tube current. Alpha was set at 0.05 to control the Type I error in this study. Results: For modified JAFROC analysis of reconstruction method there was no statistically significant difference in lesion detection performance between FBP and IR when figures-of-merit were averaged over tube current (F(1,10) = 0.08, p = 0.789). For tube current analysis, significant differences were revealed between multiple pairs of tube current settings (F(3,10) = 16.96, p < 0.001) when averaged over image reconstruction method. Conclusion: The free-response study suggests that lesion detection can be optimized at 40 mA in this phantom model, a measured effective dose of 0.97 mSv. In high-contrast regions the diagnostic value of IR, compared to FBP, is less clear.

  16. pRNAm-PC: Predicting N(6)-methyladenosine sites in RNA sequences via physical-chemical properties.

    PubMed

    Liu, Zi; Xiao, Xuan; Yu, Dong-Jun; Jia, Jianhua; Qiu, Wang-Ren; Chou, Kuo-Chen

    2016-03-15

    Just like PTM or PTLM (post-translational modification) in proteins, PTCM (post-transcriptional modification) in RNA plays very important roles in biological processes. Occurring at adenine (A) with the genetic code motif (GAC), N(6)-methyladenosine (m(6)A) is one of the most common and abundant PTCMs in RNA found in viruses and most eukaryotes. Given an uncharacterized RNA sequence containing many GAC motifs, which of them can be methylated, and which cannot? It is important for both basic research and drug development to address this problem. Particularly with the avalanche of RNA sequences generated in the postgenomic age, computational methods for the timely identification of N(6)-methyladenosine sites in RNA are in high demand. Here we propose a new predictor called pRNAm-PC, in which RNA sequence samples are expressed by a novel mode of pseudo dinucleotide composition (PseDNC) whose components were derived from a physical-chemical matrix via a series of auto-covariance and cross-covariance transformations. It was observed via a rigorous jackknife test that, in comparison with the existing predictor for the same purpose, pRNAm-PC achieved remarkably higher success rates in both overall accuracy and stability, indicating that the new predictor will become a useful high-throughput tool for identifying methylation sites in RNA, and that the novel approach can also be used to study many other RNA-related problems and conduct genome analysis. A user-friendly Web server for pRNAm-PC has been established at http://www.jci-bioinfo.cn/pRNAm-PC, by which users can easily get their desired results without needing to go through the mathematical details. PMID:26748145

  17. Modelling temperature, photoperiod and vernalization responses of Brunonia australis (Goodeniaceae) and Calandrinia sp. (Portulacaceae) to predict flowering time

    PubMed Central

    Cave, Robyn L.; Hammer, Graeme L.; McLean, Greg; Birch, Colin J.; Erwin, John E.; Johnston, Margaret E.

    2013-01-01

    Background and Aims Crop models for herbaceous ornamental species typically include functions for temperature and photoperiod responses, but very few incorporate vernalization, which is a requirement of many traditional crops. This study investigated the development of floriculture crop models, which describe temperature responses, plus photoperiod or vernalization requirements, using Australian native ephemerals Brunonia australis and Calandrinia sp. Methods A novel approach involved the use of a field crop modelling tool, DEVEL2. This optimization program estimates the parameters of selected functions within the development rate models using an iterative process that minimizes sum of squares residual between estimated and observed days for the phenological event. Parameter profiling and jack-knifing are included in DEVEL2 to remove bias from parameter estimates and introduce rigour into the parameter selection process. Key Results Development rate of B. australis from planting to first visible floral bud (VFB) was predicted using a multiplicative approach with a curvilinear function to describe temperature responses and a broken linear function to explain photoperiod responses. A similar model was used to describe the development rate of Calandrinia sp., except the photoperiod function was replaced with an exponential vernalization function, which explained a facultative cold requirement and included a coefficient for determining the vernalization ceiling temperature. Temperature was the main environmental factor influencing development rate for VFB to anthesis of both species and was predicted using a linear model. Conclusions The phenology models for B. australis and Calandrinia sp. described development rate from planting to VFB and from VFB to anthesis in response to temperature and photoperiod or vernalization and may assist modelling efforts of other herbaceous ornamental plants. In addition to crop management, the vernalization function could be used to identify plant communities most at risk from predicted increases in temperature due to global warming. PMID:23404991

  18. Quantifying variability in earthquake rupture models using multidimensional scaling: application to the 2011 Tohoku earthquake

    NASA Astrophysics Data System (ADS)

    Razafindrakoto, Hoby N. T.; Mai, P. Martin; Genton, Marc G.; Zhang, Ling; Thingbaijam, Kiran K. S.

    2015-07-01

    Finite-fault earthquake source inversion is an ill-posed inverse problem leading to non-unique solutions. In addition, various fault parametrizations and input data may have been used by different researchers for the same earthquake. Such variability leads to large intra-event variability in the inferred rupture models. One way to understand this problem is to develop robust metrics to quantify model variability. We propose a Multi Dimensional Scaling (MDS) approach to compare rupture models quantitatively. We consider normalized squared and grey-scale metrics that reflect the variability in the location, intensity and geometry of the source parameters. We test the approach on two-dimensional random fields generated using a von Kármán autocorrelation function and varying its spectral parameters. The spread of points in the MDS solution indicates different levels of model variability. We observe that the normalized squared metric is insensitive to variability of spectral parameters, whereas the grey-scale metric is sensitive to small-scale changes in geometry. From this benchmark, we formulate a similarity scale to rank the rupture models. As case studies, we examine inverted models from the Source Inversion Validation (SIV) exercise and published models of the 2011 Mw 9.0 Tohoku earthquake, allowing us to test our approach for a case with a known reference model and one with an unknown true solution. The normalized squared and grey-scale metrics are respectively sensitive to the overall intensity and the extension of the three classes of slip (very large, large, and low). Additionally, we observe that a three-dimensional MDS configuration is preferable for models with large variability. We also find that the models for the Tohoku earthquake derived from tsunami data and their corresponding predictions cluster with a systematic deviation from other models. We demonstrate the stability of the MDS point-cloud using a number of realizations and jackknife tests, for both the random field and the case studies.

  19. Aboveground biomass and leaf area index (LAI) mapping for Niassa Reserve, northern Mozambique

    NASA Astrophysics Data System (ADS)

    Ribeiro, Natasha S.; Saatchi, Sassan S.; Shugart, Herman H.; Washington-Allen, Robert A.

    2008-09-01

    Estimations of biomass are critical in miombo woodlands because they represent the primary source of goods and services for over 80% of the population in southern Africa. This study was carried out in Niassa Reserve, northern Mozambique. The main objectives were first to estimate woody biomass and Leaf Area Index (LAI) using remotely sensed data [RADARSAT (C-band, λ = 5.7 cm)] and Landsat ETM+ derived Normalized Difference Vegetation Index (NDVI) and Simple Ratio (SR) calibrated by field measurements and, second, to determine, at both landscape and plot scales, the environmental controls (precipitation, woody cover density, fire and elephants) of biomass and LAI. A land-cover map (72% overall accuracy) was derived from the June 2004 ETM+ mosaic. Field biomass and LAI were correlated with RADARSAT backscatter (r_biomass = 0.65, r_LAI = 0.57, p < 0.0001) from July 2004, NDVI (r_biomass = 0.30, r_LAI = 0.35; p < 0.0001) and SR (r_biomass = 0.36, r_LAI = 0.40, p < 0.0001). A jackknife stepwise regression technique was used to develop the best predictive models for biomass (biomass = -5.19 + 0.074 * radarsat + 1.56 * SR, r^2 = 0.55) and LAI (LAI = -0.66 + 0.01 * radarsat + 0.22 * SR, r^2 = 0.45). Biomass and LAI maps were produced with an estimated peak of 18 kg m^-2 and 2.80 m^2 m^-2, respectively. On the landscape scale, both biomass and LAI were strongly determined by mean annual precipitation (F = 13.91, p = 0.0002). On the plot spatial scale, woody biomass was significantly determined by fire frequency, and LAI by vegetation type.

  20. A Novel Feature Extraction Method with Feature Selection to Identify Golgi-Resident Protein Types from Imbalanced Data

    PubMed Central

    Yang, Runtao; Zhang, Chengjin; Gao, Rui; Zhang, Lina

    2016-01-01

    The Golgi Apparatus (GA) is a major collection and dispatch station for numerous proteins destined for secretion, plasma membranes and lysosomes. The dysfunction of GA proteins can result in neurodegenerative diseases. Therefore, accurate identification of protein subGolgi localizations may assist in drug development and understanding the mechanisms of the GA involved in various cellular processes. In this paper, a new computational method is proposed for identifying cis-Golgi proteins from trans-Golgi proteins. Based on the concept of Common Spatial Patterns (CSP), a novel feature extraction technique is developed to extract evolutionary information from protein sequences. To deal with the imbalanced benchmark dataset, the Synthetic Minority Over-sampling Technique (SMOTE) is adopted. A feature selection method called Random Forest-Recursive Feature Elimination (RF-RFE) is employed to search the optimal features from the CSP based features and g-gap dipeptide composition. Based on the optimal features, a Random Forest (RF) module is used to distinguish cis-Golgi proteins from trans-Golgi proteins. Through the jackknife cross-validation, the proposed method achieves a promising performance with a sensitivity of 0.889, a specificity of 0.880, an accuracy of 0.885, and a Matthew’s Correlation Coefficient (MCC) of 0.765, which remarkably outperforms previous methods. Moreover, when tested on a common independent dataset, our method also achieves a significantly improved performance. These results highlight the promising performance of the proposed method to identify Golgi-resident protein types. Furthermore, the CSP based feature extraction method may provide guidelines for protein function predictions. PMID:26861308
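
    The jackknife cross-validation and the four reported measures (sensitivity, specificity, accuracy, MCC) can be sketched as follows. This is only an illustration under stated assumptions: the nearest-mean classifier, the one-dimensional toy features and all function names are hypothetical stand-ins, not the CSP features, SMOTE step or Random Forest model of the paper.

```python
import math

def confusion_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy and Matthews correlation coefficient.

    Assumes both classes occur at least once (tp + fn > 0 and tn + fp > 0).
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    acc = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sens, spec, acc, mcc

def jackknife_test(samples, labels, fit, predict):
    """Leave each sample out once, train on the rest, and score the held-out one."""
    tp = tn = fp = fn = 0
    for i in range(len(samples)):
        model = fit(samples[:i] + samples[i + 1:], labels[:i] + labels[i + 1:])
        guess = predict(model, samples[i])
        if labels[i] == 1:
            if guess == 1:
                tp += 1
            else:
                fn += 1
        else:
            if guess == 0:
                tn += 1
            else:
                fp += 1
    return confusion_metrics(tp, tn, fp, fn)

# Hypothetical stand-in classifier: assign the class with the nearest mean.
def fit_nearest_mean(xs, ys):
    means = {}
    for label in set(ys):
        vals = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(vals) / len(vals)
    return means

def predict_nearest_mean(means, x):
    return min(means, key=lambda label: abs(x - means[label]))

# Made-up one-dimensional features; 1 = cis-Golgi, 0 = trans-Golgi.
features = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
classes = [0, 0, 0, 1, 1, 1]
sens, spec, acc, mcc = jackknife_test(features, classes,
                                      fit_nearest_mean, predict_nearest_mean)
```

    Because every sample is held out exactly once, the result does not depend on a random split, which is presumably why this literature describes the test as rigorous.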

  1. A Novel Feature Extraction Method with Feature Selection to Identify Golgi-Resident Protein Types from Imbalanced Data.

    PubMed

    Yang, Runtao; Zhang, Chengjin; Gao, Rui; Zhang, Lina

    2016-01-01

    The Golgi Apparatus (GA) is a major collection and dispatch station for numerous proteins destined for secretion, plasma membranes and lysosomes. The dysfunction of GA proteins can result in neurodegenerative diseases. Therefore, accurate identification of protein subGolgi localizations may assist in drug development and understanding the mechanisms of the GA involved in various cellular processes. In this paper, a new computational method is proposed for identifying cis-Golgi proteins from trans-Golgi proteins. Based on the concept of Common Spatial Patterns (CSP), a novel feature extraction technique is developed to extract evolutionary information from protein sequences. To deal with the imbalanced benchmark dataset, the Synthetic Minority Over-sampling Technique (SMOTE) is adopted. A feature selection method called Random Forest-Recursive Feature Elimination (RF-RFE) is employed to search the optimal features from the CSP based features and g-gap dipeptide composition. Based on the optimal features, a Random Forest (RF) module is used to distinguish cis-Golgi proteins from trans-Golgi proteins. Through the jackknife cross-validation, the proposed method achieves a promising performance with a sensitivity of 0.889, a specificity of 0.880, an accuracy of 0.885, and a Matthew's Correlation Coefficient (MCC) of 0.765, which remarkably outperforms previous methods. Moreover, when tested on a common independent dataset, our method also achieves a significantly improved performance. These results highlight the promising performance of the proposed method to identify Golgi-resident protein types. Furthermore, the CSP based feature extraction method may provide guidelines for protein function predictions. PMID:26861308

  2. Identification of microRNA precursor with the degenerate K-tuple or Kmer strategy.

    PubMed

    Liu, Bin; Fang, Longyun; Wang, Shanyi; Wang, Xiaolong; Li, Hongtao; Chou, Kuo-Chen

    2015-11-21

    The microRNA (miRNA), a small non-coding RNA molecule, plays an important role in transcriptional and post-transcriptional regulation of gene expression. Its abnormal expression, however, has been observed in many cancers and other disease states, implying that the miRNA molecules are also deeply involved in these diseases, particularly in carcinogenesis. Therefore, it is important for both basic research and miRNA-based therapy to discriminate the real pre-miRNAs from the false ones (such as hairpin sequences with similar stem-loops). Most existing methods in this regard were based on the strategy in which RNA samples were formulated by a vector formed by their Kmer components. But the length of Kmers must be very short; otherwise, the vector's dimension would be extremely large, leading to the "high-dimension disaster" or overfitting problem. Inspired by the concept of "degenerate energy levels" in quantum mechanics, we introduced the "degenerate Kmer" (deKmer) to represent RNA samples. By doing so, not only we can accommodate long-range coupling effects but also we can avoid the high-dimension problem. Rigorous jackknife tests and cross-species experiments indicated that our approach is very promising. It has not escaped our notice that the deKmer approach can also be applied to many other areas of computational biology. A user-friendly web-server for the new predictor has been established at http://bioinformatics.hitsz.edu.cn/miRNA-deKmer/, by which users can easily get their desired results. PMID:26362104
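
    The dimensionality problem described in this abstract can be made concrete with a short sketch of the plain Kmer representation. It shows only the conventional Kmer vector whose length grows as 4^K; the degenerate Kmer (deKmer) construction itself is not specified in the abstract and is not reproduced here. The function name `kmer_vector` and the example sequence are illustrative assumptions.

```python
from itertools import product

def kmer_vector(seq, k, alphabet="ACGU"):
    """Normalized Kmer frequency vector of an RNA sequence (length 4**k)."""
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = [0] * len(index)
    total = 0
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        if kmer in index:               # skip windows with non-ACGU symbols
            counts[index[kmer]] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * len(counts)

# The vector length grows exponentially with K: 4, 16, 64, 256, 1024, 4096.
dimensions = [len(kmer_vector("GACUGAC", k)) for k in range(1, 7)]
```

    With K = 6 the vector already has 4096 components, which is typically far more than the number of available training sequences.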

  3. Flood flow estimation in ungauged basins, application to a nordic environment

    NASA Astrophysics Data System (ADS)

    Ouarda, T. B. M. J.; Crsteanu, A. A.; Bobe, B.

    2003-04-01

    Design flood estimates at ungauged sites or at gauged sites with short records can be obtained through regional estimation techniques. It is important to accurately estimate the values of these design floods to avoid over- and under-estimation of hydraulic structures. Various methods have been employed for the regional analysis of extreme hydrological events. These regionalization approaches make different assumptions and hypotheses concerning the hydrological phenomena being modeled, rely on various types of continuous and non-continuous data, and often fall under completely different theories. The dam safety law in the Province of Quebec, Canada, makes it compulsory to have accurate estimates of design and operational floods at all locations in the Province. A general tool, based on regional estimation, was developed to allow the estimation of flood characteristics on any river in any location in the Province of Quebec. First flood frequency analysis was carried out in all gauged sites in the territory of study and flood quantiles were estimated in each site based on the best fitting distribution. Regional estimation techniques are then used for the automatic estimation of flood characteristics in any ungauged location by transferring information from gauged sites to ungauged ones. Regional estimation is carried out based on previously estimated flood characteristics in gauged sites and based on physiographic and meteorological characteristics in the site of interest. Geostatistical methods are used within the developed tool, in a GIS framework, to estimate automatically the physiographic and meteorological characteristics for any basin in the Province based on digitized maps of the relevant information. A Jack-knife approach is used to demonstrate the usefulness of this tool, which is shown to be precise, unbiased and accurate.

  4. Prediction and Analysis of Antibody Amyloidogenesis from Sequences

    PubMed Central

    Liaw, Chyn; Tung, Chun-Wei; Ho, Shinn-Ying

    2013-01-01

    Antibody amyloidogenesis is the aggregation of soluble proteins into amyloid fibrils and is one of the major causes of the failure of humanized antibodies. The prediction and prevention of antibody amyloidogenesis are helpful for restoring and enhancing therapeutic effects. Because of the large number of possible germlines, the existing method, which establishes individual models for each known germline, is not practical for predicting sequences of novel germlines. This study proposes the first automatic and across-germline prediction method (named AbAmyloid) capable of predicting antibody amyloidogenesis from sequences. Since amyloidogenesis is determined by the whole sequence of an antibody rather than by germline-dependent properties such as mutated residues, this study assesses three types of germline-independent sequence features (amino acid composition, dipeptide composition and physicochemical properties). AbAmyloid using a Random Forests classifier with dipeptide composition performs well on a data set of 12 germlines. The within- and across-germline prediction accuracies are 83.10% and 83.33% using Jackknife tests, respectively, and the novel-germline prediction accuracy using a leave-one-germline-out test is 72.22%. A thorough analysis of sequence features is conducted to identify informative properties that provide further insights into antibody amyloidogenesis. Some identified informative physicochemical properties are amphiphilicity, hydrophobicity, reverse turn, helical structure, isoelectric point, net charge, mutability, coil, turn, linker, nuclear protein, etc. Additionally, the numbers of ubiquitylation sites in amyloidogenic and non-amyloidogenic antibodies are found to be significantly different, which suggests that antibodies less likely to be ubiquitylated tend to be amyloidogenic. The method AbAmyloid, capable of automatically predicting antibody amyloidogenesis of novel germlines, is implemented as a publicly available web server at http://iclab.life.nctu.edu.tw/abamyloid. PMID:23308169

  5. Does more sequence data improve estimates of galliform phylogeny? Analyses of a rapid radiation using a complete data matrix

    PubMed Central

    Braun, Edward L.

    2014-01-01

    The resolution of rapid evolutionary radiations or “bushes” in the tree of life has been one of the most difficult and interesting problems in phylogenetics. The avian order Galliformes appears to have undergone several rapid radiations that have limited the resolution of prior studies and obscured the position of taxa important both agriculturally and as model systems (chicken, turkey, Japanese quail). Here we present analyses of a multi-locus data matrix comprising over 15,000 sites, primarily from nuclear introns but also including three mitochondrial regions, from 46 galliform taxa with all gene regions sampled for all taxa. The increased sampling of unlinked nuclear genes provided strong bootstrap support for all but a small number of relationships. Coalescent-based methods to combine individual gene trees and analyses of datasets that are independent of published data indicated that this well-supported topology is likely to reflect the galliform species tree. The inclusion or exclusion of mitochondrial data had a limited impact upon analyses using either concatenated data or multispecies coalescent methods. Some of the key phylogenetic findings include support for a second major clade within the core phasianids that includes the chicken and Japanese quail and clarification of the phylogenetic relationships of turkey. Jackknifed datasets suggested that there is an advantage to sampling many independent regions across the genome rather than obtaining long sequences for a small number of loci, possibly reflecting the differences among gene trees that differ due to incomplete lineage sorting. Despite the novel insights we obtained using this increased sampling of gene regions, some nodes remain unresolved, likely due to periods of rapid diversification. Resolving these remaining groups will likely require sequencing a very large number of gene regions, but our analyses now appear to support a robust backbone for this order. PMID:24795852

  6. Multiscale finite-frequency Rayleigh wave tomography of the Kaapvaal craton

    NASA Astrophysics Data System (ADS)

    Chevrot, S.; Zhao, L.

    2007-04-01

    We have measured phase delays of fundamental-mode Rayleigh waves for 12 events recorded by the Southern Africa Seismic Experiment at frequencies between 0.005 and 0.035 Hz. A novel multiscale finite-frequency tomographic method based on wavelet decomposition of 3-D sensitivity kernels for the phase of Rayleigh waves is used to map the shear velocities in the upper mantle beneath southern Africa. The kernels are computed by summing coupled normal modes over a very fine grid surrounding the seismic array. To estimate and minimize the biases in the model resulting from structures outside the tomographic grid, a jackknife inversion method is implemented. The contribution of heterogeneities outside the target volume is significant, but produces artefacts in the tomographic model that are easily identified and discarded before interpretation. With structures on length scales as short as 100 km retrieved beneath the array, the deep structure of the Kaapvaal craton is revealed with unprecedented detail. Outside the array, the corresponding resolution is 200 km. High velocity cratonic roots are confined to the Archean craton, and extend to depths of at least 250 km. Confirming earlier surface structural studies, we recognize two distinct units in the Kaapvaal craton. The eastern Witwatersrand block and the western Kimberley block are separated by a major near-vertical translithospheric boundary which coincides with the Colesberg Lineament. Lower than average velocities south and east of the Kaapvaal craton reveal extensive metasomatism and heating of the lithosphere, probably related to the Karoo magmatic event and to the opening of the South Atlantic Ocean.

  7. iLoc-Euk: A Multi-Label Classifier for Predicting the Subcellular Localization of Singleplex and Multiplex Eukaryotic Proteins

    PubMed Central

    Chou, Kuo-Chen; Wu, Zhi-Cheng; Xiao, Xuan

    2011-01-01

    Predicting protein subcellular localization is an important and difficult problem, particularly when query proteins may have the multiplex character, i.e., simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing protein subcellular location predictors can only be used to deal with the single-location or “singleplex” proteins. Actually, multiple-location or “multiplex” proteins should not be ignored because they usually possess some unique biological functions worthy of our special notice. By introducing the “multi-labeled learning” and “accumulation-layer scale”, a new predictor, called iLoc-Euk, has been developed that can be used to deal with the systems containing both singleplex and multiplex proteins. As a demonstration, the jackknife cross-validation was performed with iLoc-Euk on a benchmark dataset of eukaryotic proteins classified into the following 22 location sites: (1) acrosome, (2) cell membrane, (3) cell wall, (4) centriole, (5) chloroplast, (6) cyanelle, (7) cytoplasm, (8) cytoskeleton, (9) endoplasmic reticulum, (10) endosome, (11) extracellular, (12) Golgi apparatus, (13) hydrogenosome, (14) lysosome, (15) melanosome, (16) microsome, (17) mitochondrion, (18) nucleus, (19) peroxisome, (20) spindle pole body, (21) synapse, and (22) vacuole, where none of proteins included has pairwise sequence identity to any other in a same subset. The overall success rate thus obtained by iLoc-Euk was 79%, which is significantly higher than that by any of the existing predictors that also have the capacity to deal with such a complicated and stringent system. As a user-friendly web-server, iLoc-Euk is freely accessible to the public at the web-site http://icpr.jci.edu.cn/bioinfo/iLoc-Euk. It is anticipated that iLoc-Euk may become a useful bioinformatics tool for Molecular Cell Biology, Proteomics, System Biology, and Drug Development. Also, its novel approach will further stimulate the development of predicting other protein attributes. PMID:21483473

  8. Predicting protein subnuclear location with optimized evidence-theoretic K-nearest classifier and pseudo amino acid composition.

    PubMed

    Shen, Hong-Bin; Chou, Kuo-Chen

    2005-11-25

    The nucleus is the brain of eukaryotic cells that guides the life processes of the cell by issuing key instructions. For in-depth understanding of the biochemical process of the nucleus, the knowledge of localization of nuclear proteins is very important. With the avalanche of protein sequences generated in the post-genomic era, it is highly desired to develop an automated method for fast annotating the subnuclear locations for numerous newly found nuclear protein sequences so as to be able to timely utilize them for basic research and drug discovery. In view of this, a novel approach is developed for predicting the protein subnuclear location. It is featured by introducing a powerful classifier, the optimized evidence-theoretic K-nearest classifier, and using the pseudo amino acid composition [K.C. Chou, PROTEINS: Structure, Function, and Genetics, 43 (2001) 246], which can incorporate a considerable amount of sequence-order effects, to represent protein samples. As a demonstration, identifications were performed for 370 nuclear proteins among the following 9 subnuclear locations: (1) Cajal body, (2) chromatin, (3) heterochromatin, (4) nuclear diffuse, (5) nuclear pore, (6) nuclear speckle, (7) nucleolus, (8) PcG body, and (9) PML body. The overall success rates thus obtained by both the re-substitution test and jackknife cross-validation test are significantly higher than those by existing classifiers on the same working dataset. It is anticipated that the powerful approach may also become a useful high throughput vehicle to bridge the huge gap occurring in the post-genomic era between the number of gene sequences in databases and the number of gene products that have been functionally characterized. The OET-KNN classifier will be available at www.pami.sjtu.edu.cn/people/hbshen. PMID:16213466

  9. iDNA-Prot|dis: Identifying DNA-Binding Proteins by Incorporating Amino Acid Distance-Pairs and Reduced Alphabet Profile into the General Pseudo Amino Acid Composition

    PubMed Central

    Liu, Bin; Xu, Jinghao; Lan, Xun; Xu, Ruifeng; Zhou, Jiyun; Wang, Xiaolong; Chou, Kuo-Chen

    2014-01-01

    Playing crucial roles in various cellular processes, such as recognition of specific nucleotide sequences, regulation of transcription, and regulation of gene expression, DNA-binding proteins are essential ingredients for both eukaryotic and prokaryotic proteomes. With the avalanche of protein sequences generated in the postgenomic age, it is a critical challenge to develop automated methods for accurately and rapidly identifying DNA-binding proteins based on their sequence information alone. Here, a novel predictor, called “iDNA-Prot|dis”, was established by incorporating the amino acid distance-pair coupling information and the amino acid reduced alphabet profile into the general pseudo amino acid composition (PseAAC) vector. The former can capture the characteristics of DNA-binding proteins so as to enhance its prediction quality, while the latter can reduce the dimension of PseAAC vector so as to speed up its prediction process. It was observed by the rigorous jackknife and independent dataset tests that the new predictor outperformed the existing predictors for the same purpose. As a user-friendly web-server, iDNA-Prot|dis is accessible to the public at http://bioinformatics.hitsz.edu.cn/iDNA-Prot_dis/. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step protocol guide is provided on how to use the web-server to get their desired results without the need to follow the complicated mathematical equations that are presented in this paper just for the integrity of its developing process. It is anticipated that the iDNA-Prot|dis predictor may become a useful high throughput tool for large-scale analysis of DNA-binding proteins, or at the very least, play a complementary role to the existing predictors in this regard. PMID:25184541

  10. Assessment of lifetime cumulative sun exposure using a self-administered questionnaire: Reliability of two approaches

    PubMed Central

    Yu, Chu-Ling; Li, Yan; Freedman, D. Michal; Fears, Thomas R.; Kwok, Richard; Chodick, Gabriel; Alexander, Bruce; Kimlin, Michael G.; Kricker, Anne; Armstrong, Bruce K.; Linet, Martha S.

    2009-01-01

    Few studies have evaluated the reliability of lifetime sun exposure estimated from inquiring about the number of hours people spent outdoors in a given period on a typical weekday or weekend day (the time-based approach). Some investigations have suggested that women have a particularly difficult task in estimating time outdoors in adulthood due to their family and occupational roles. We hypothesized that people might gain additional memory cues and estimate lifetime hours spent outdoors more reliably if asked about time spent outdoors according to specific activities (an activity-based approach). Using self-administered, mailed questionnaires, test-retest responses to time-based and to activity-based approaches were evaluated in 124 volunteer U.S. radiologic technologist participants: 64 females and 60 males 48 to 80 years of age. Intraclass correlation coefficients (ICCs) were used to evaluate the test-retest reliability of average numbers of hours spent outdoors in summer estimated for each approach. We tested the differences between the two ICCs, corresponding to each approach, using a t-test with the variance of the difference estimated by the Jackknife method. During childhood and adolescence, the two approaches gave similar ICCs for average numbers of hours spent outdoors in summer. By contrast, compared with the time-based approach, the activity-based approach showed significantly higher ICCs during adult ages (0.69 vs. 0.43, P=0.003) and over the lifetime (0.69 vs. 0.52, P=0.05); the higher ICCs for the activity-based questionnaire were primarily derived from the results for females. Research is needed to further improve the activity-based questionnaire approach for long-term sun exposure assessment. PMID:19190171
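
    The comparison of two ICCs with a jackknife variance can be sketched as below. The abstract does not say which ICC form or degrees of freedom were used, so the one-way random-effects ICC, the delete-one-subject scheme applied to both questionnaires at once, and the names `icc_oneway` and `jackknife_icc_difference` are assumptions for illustration; the data are made up.

```python
import math

def icc_oneway(pairs):
    """One-way random-effects ICC for test-retest pairs (two ratings per subject)."""
    n = len(pairs)
    grand = sum(x + y for x, y in pairs) / (2 * n)
    # Between-subject and within-subject mean squares for k = 2 ratings.
    msb = 2 * sum(((x + y) / 2 - grand) ** 2 for x, y in pairs) / (n - 1)
    msw = sum((x - (x + y) / 2) ** 2 + (y - (x + y) / 2) ** 2
              for x, y in pairs) / n
    return (msb - msw) / (msb + msw)

def jackknife_icc_difference(time_pairs, activity_pairs):
    """Difference of two ICCs, its jackknife variance, and the t statistic.

    Both lists must hold the same subjects in the same order, so that deleting
    index i removes one person from both questionnaires.
    """
    n = len(time_pairs)
    full = icc_oneway(activity_pairs) - icc_oneway(time_pairs)
    loo = []
    for i in range(n):
        loo.append(icc_oneway(activity_pairs[:i] + activity_pairs[i + 1:])
                   - icc_oneway(time_pairs[:i] + time_pairs[i + 1:]))
    mean_loo = sum(loo) / n
    var = (n - 1) / n * sum((d - mean_loo) ** 2 for d in loo)
    return full, var, full / math.sqrt(var)

# Made-up (test, retest) hours outdoors for six subjects under each approach.
time_based = [(2, 4), (3, 3), (5, 2), (6, 7), (1, 3), (4, 6)]
activity_based = [(2, 2), (3, 4), (5, 5), (6, 7), (1, 1), (4, 5)]
diff, variance, t_stat = jackknife_icc_difference(time_based, activity_based)
```

    Deleting a subject from both questionnaires together keeps the pairing intact, so the variance of the difference accounts for the correlation between the two ICCs.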

  11. CE-PLoc: an ensemble classifier for predicting protein subcellular locations by fusing different modes of pseudo amino acid composition.

    PubMed

    Khan, Asifullah; Majid, Abdul; Hayat, Maqsood

    2011-08-10

    Precise information about protein locations in a cell facilitates understanding of the function of a protein and its interaction in the cellular environment. This information further helps in the study of the specific metabolic pathways and other biological processes. We propose an ensemble approach called "CE-PLoc" for predicting subcellular locations based on fusion of individual classifiers. The proposed approach utilizes features obtained from both dipeptide composition (DC) and amphiphilic pseudo amino acid composition (PseAAC) based feature extraction strategies. Different feature spaces are obtained by varying the dimensionality using PseAAC for a selected base learner. The performance of the individual learning mechanisms such as support vector machine, nearest neighbor, probabilistic neural network, covariant discriminant, which are trained using PseAAC based features is first analyzed. Classifiers are developed using the same learning mechanism but trained on PseAAC based feature spaces of varying dimensions. These classifiers are combined through a voting strategy and an improvement in prediction performance is achieved. Prediction performance is further enhanced by developing CE-PLoc through the combination of different learning mechanisms trained on both DC based feature space and PseAAC based feature spaces of varying dimensions. The predictive performance of the proposed CE-PLoc is evaluated for two benchmark datasets of protein subcellular locations using accuracy, MCC, and Q-statistics. Using the jackknife test, prediction accuracies of 81.47 and 83.99% are obtained for 12 and 14 subcellular locations datasets, respectively. In the independent dataset test, prediction accuracies are 87.04 and 87.33% for 12 and 14 class datasets, respectively. PMID:21864791
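    In this literature the "jackknife test" is a leave-one-out evaluation: each sample is held out in turn and predicted from all the others. The sketch below shows the idea with a plain 1-nearest-neighbour rule standing in for the classifier. It is not CE-PLoc, and the function name `jackknife_accuracy_1nn` is made up for this illustration.

```python
import numpy as np

def jackknife_accuracy_1nn(features, labels):
    """Leave-one-out ("jackknife test") accuracy of a 1-nearest-neighbour rule."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels)
    # Pairwise squared Euclidean distances between all samples
    d = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    # A held-out sample must not be its own neighbour
    np.fill_diagonal(d, np.inf)
    nearest = np.argmin(d, axis=1)
    return float(np.mean(y[nearest] == y))
```

    Because every sample is tested exactly once on a model that never saw it, the result does not depend on a random split, which is the usual argument for reporting it alongside an independent dataset test.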

  12. Region-specific transcriptional response to chronic nicotine in rat brain

    PubMed Central

    Konu, Özlen; Kane, Justin K.; Barrett, Tanya; Vawter, Marquis P.; Chang, Ruying; Ma, Jennie Z.; Donovan, David M.; Sharp, Burt; Becker, Kevin G.; Li, Ming D.

    2010-01-01

    Even though nicotine has been shown to modulate mRNA expression of a variety of genes, a comprehensive high-throughput study of the effects of nicotine on the tissue-specific gene expression profiles has been lacking in the literature. In this study, cDNA microarrays containing 1117 genes and ESTs were used to assess the transcriptional response to chronic nicotine treatment in rat, based on four brain regions, i.e. prefrontal cortex (PFC), nucleus accumbens (NAs), ventral tegmental area (VTA), and amygdala (AMYG). On the basis of a non-parametric resampling method, an index (called jackknifed reliability index, JRI) was proposed, and employed to determine the inherent measurement error across multiple arrays used in this study. Upon removal of the outliers, the mean correlation coefficient between duplicate measurements increased to 0.978 ± 0.0035 from 0.941 ± 0.045. Results from principal component analysis and pairwise correlations suggested that brain regions studied were highly similar in terms of their absolute expression levels, but exhibited divergent transcriptional responses to chronic nicotine administration. For example, PFC and NAs were significantly more similar to each other (r=0.7; P<10^-14) than to either VTA or AMYG. Furthermore, we confirmed our microarray results for two representative genes, i.e. the weak inward rectifier K+ channel (TWIK-1), and phosphatase and tensin homolog (PTEN) by using real-time quantitative RT-PCR technique. Finally, a number of genes, involved in MAPK, phosphatidylinositol, and EGFR signaling pathways, were identified and proposed as possible targets in response to nicotine administration. © 2001 Elsevier Science B.V. All rights reserved. PMID:11478936

  13. Monitoring hydrofrac-induced seismicity by surface arrays - the DHM-Project Basel case study

    NASA Astrophysics Data System (ADS)

    Blascheck, P.; Häge, M.; Joswig, M.

    2012-04-01

    The method "nanoseismic monitoring" was applied during the hydraulic stimulation at the Deep-Heat-Mining-Project (DHM-Project) Basel. Two small arrays at distances of 2.1 km and 4.8 km from the borehole recorded continuously for two days. During this time more than 2500 seismic events were detected. The method of the surface monitoring of induced seismicity was compared to the reference which the hydrofrac monitoring presented. The latter was conducted by a network of borehole seismometers by Geothermal Explorers Limited. Array processing provides an outlier-resistant, graphical jack-knifing localization method which resulted in an average deviation from the reference of 850 m. Additionally, by applying the relative localization master-event method, the NNW-SSE strike direction of the reference was confirmed. It was shown that, in order to successfully estimate the magnitude of completeness as well as the b-value at the event rate and detection sensitivity present, 3 h segments of data are sufficient. This is supported by two segments out of over 13 h of evaluated data. These segments were chosen so that they represent a time during the high seismic noise during normal working hours in daytime as well as the minimum anthropogenic noise at night. The low signal-to-noise ratio was compensated by the application of a sonogram event detection as well as a coincidence analysis within each array. Sonograms allow, by autoadaptive, non-linear filtering, the enhancement of signals whose amplitudes are just above noise level. For these events the magnitude was determined by the master-event method, allowing the magnitude of completeness to be computed by the entire-magnitude-range method provided by the ZMAP toolbox. Additionally, the b-values were determined and compared to the reference values. An introduction to the method of "nanoseismic monitoring" will be given as well as the comparison to reference data in the Basel case study.

  14. The fate of an immigrant: Ensis directus in the eastern German Bight

    NASA Astrophysics Data System (ADS)

    Dannheim, Jennifer; Rumohr, Heye

    2012-09-01

    We studied Ensis directus in the subtidal (7-16 m depth) of the eastern German Bight. The jack-knife clam that invaded the German Bight in 1978 has all the characteristics of a successful immigrant: Ensis directus has a high reproductive capacity (juveniles, July 2001: Amrumbank 1,914 m⁻², Eiderstedt/Vogelsand: 11,638 m⁻²), short generation times and grows rapidly: maximum growth rates were higher than in former studies (mean: 3 mm month⁻¹, 2nd year: up to 14 mm month⁻¹). Ensis directus uses natural mechanisms for rapid dispersal, occurs gregariously and exhibits a wide environmental tolerance. However, optimal growth and population-structure annual gaps might be influenced by reduced salinity: at Vogelsand (transition area of Elbe river), maximum growth was lower (164 mm) than at the Eiderstedt site (outer range of Elbe river, L∞ = 174 mm). Mass mortalities of the clams are probably caused by washout (video inspections), low winter temperature and strong storms. Ensis directus immigrated into the community finding its own habitat on mobile sands with strong tidal currents. Recent studies on E. directus found that the species neither suppresses native species nor takes over the position of an established one which backs up our study findings over rather short time scales. On the contrary, E. directus seems to favour the settlement of some deposit feeders. Dense clam mats might stabilise the sediment and function as a sediment-trap for organic matter. Ensis directus has neither become a nuisance to other species nor developed according to the 'boom-and-bust' theory. The fate of the immigrant E. directus rather is a story of a successful trans-ocean invasion which still holds on 23 years after the first findings in the outer Elbe estuary off Vogelsand.

  15. Evaluation of Limiting Climatic Factors and Simulation of a Climatically Suitable Habitat for Chinese Sea Buckthorn.

    PubMed

    Li, Guoqing; Du, Sheng; Guo, Ke

    2015-01-01

    Chinese sea buckthorn (Hippophae rhamnoides subsp. sinensis) has considerable economic potential and plays an important role in reclamation and soil and water conservation. For scientific cultivation of this species across China, we identified the key climatic factors and explored climatically suitable habitat in order to maximize survival of Chinese sea buckthorn using MaxEnt and GIS tools, based on 98 occurrence records from herbarium and publications and 13 climatic factors from Bioclim, Holdridge life zone and Kira's index variables. Our simulation showed that the MaxEnt model performance was significantly better than random, with an average test AUC value of 0.93 with 10-fold cross validation. A jackknife test and the regularized gain change, which were applied to the training algorithm, showed that precipitation of the driest month (PDM), annual precipitation (AP), coldness index (CI) and annual range of temperature (ART) were the most influential climatic factors in limiting the distribution of Chinese sea buckthorn, which explained 70.1% of the variation. The predicted map showed that the core of climatically suitable habitat was distributed from the southwest to northwest of Gansu, Ningxia, Shaanxi and Shanxi provinces, where the most influential climate variables were PDM of 1.0-7.0 mm, AP of 344.0-1089.0 mm, CI of -47.7-0.0°C, and ART of 26.1-45.0°C. We conclude that the distribution patterns of Chinese sea buckthorn are related to the northwest winter monsoon, the southwest summer monsoon and the southeast summer monsoon systems in China. PMID:26177033

  16. iGPCR-Drug: A Web Server for Predicting Interaction between GPCRs and Drugs in Cellular Networking

    PubMed Central

    Xiao, Xuan; Min, Jian-Liang; Wang, Pu; Chou, Kuo-Chen

    2013-01-01

    Involved in many diseases such as cancer, diabetes, neurodegenerative, inflammatory and respiratory disorders, G-protein-coupled receptors (GPCRs) are among the most frequent targets of therapeutic drugs. It is time-consuming and expensive to determine whether a drug and a GPCR are to interact with each other in a cellular network purely by means of experimental techniques. Although some computational methods were developed in this regard based on the knowledge of the 3D (dimensional) structure of protein, unfortunately their usage is quite limited because the 3D structures for most GPCRs are still unknown. To overcome the situation, a sequence-based classifier, called “iGPCR-drug”, was developed to predict the interactions between GPCRs and drugs in cellular networking. In the predictor, the drug compound is formulated by a 2D (dimensional) fingerprint via a 256D vector, GPCR by the PseAAC (pseudo amino acid composition) generated with the grey model theory, and the prediction engine is operated by the fuzzy K-nearest neighbour algorithm. Moreover, a user-friendly web-server for iGPCR-drug was established at http://www.jci-bioinfo.cn/iGPCR-Drug/. For the convenience of most experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results without the need to follow the complicated math equations presented in this paper just for its integrity. The overall success rate achieved by iGPCR-drug via the jackknife test was 85.5%, which is remarkably higher than the rate by the existing peer method developed in 2010 although no web server was ever established for it. It is anticipated that iGPCR-Drug may become a useful high throughput tool for both basic research and drug development, and that the approach presented here can also be extended to study other drug – target interaction networks. PMID:24015221

  17. Spatial variation in otolith chemistry of Lutjanus apodus at Turneffe Atoll, Belize

    NASA Astrophysics Data System (ADS)

    Chittaro, P. M.; Usseglio, P.; Fryer, B. J.; Sale, P. F.

    2006-05-01

    Lutjanus apodus (Schoolmaster) were collected from several mangroves and coral reefs at Turneffe Atoll, Belize, in order to investigate whether elemental concentrations from the otolith edge could be used as a means to identify the habitat (mangrove or coral reef) and site (9 mangrove sites and 6 reef sites) from which they were collected. Results of a two factor nested MANOVA (sites nested within habitat) indicated significant differences in elemental concentrations between habitats (i.e., mangrove versus reef) as well as among sites. When separate Linear Discriminant Function Analyses (LDFA) were used to assess whether the spatial variability in otolith chemistry was sufficient to differentiate individuals to their respective habitats or sites, the results indicated that fish were classified (jackknife procedure) with a moderate to poor degree of accuracy (i.e., on average, 67% and 40% of the individuals were correctly classified to the habitat and site from which they were collected, respectively). Using a partial Mantel test we did not find a significant correlation between the differences in otolith elemental concentrations between sites and the distance between sites, while controlling the effect of habitat type (mangrove or reef). This suggests that for mangrove and reef sites at Turneffe Atoll, Belize, the overlap in terms of L. apodus otolith elemental concentrations is too high for investigations of fish movement. Finally, by comparing previously published Haemulon flavolineatum otolith chemistry to that of L. apodus we assessed whether these species showed similar habitat and/or site specific patterns in their otolith chemistry. Although both species were collected from the same sites our results indicated little similarity in their elemental concentrations, thus suggesting that habitat and site elemental signatures are species specific.

  18. Dose dependence of mass and microcalcification detection in digital mammography: Free response human observer studies

    SciTech Connect

    Ruschin, Mark; Timberg, Pontus; Båth, Magnus; Hemdal, Bengt; Svahn, Tony; Saunders, Rob S.; Samei, Ehsan; Andersson, Ingvar; Mattsson, Soeren; Chakraborty, Dev P.; Tingberg, Anders

    2007-02-15

    The purpose of this study was to evaluate the effect of dose reduction in digital mammography on the detection of two lesion types--malignant masses and clusters of microcalcifications. Two free-response observer studies were performed--one for each lesion type. Ninety screening images were retrospectively selected; each image was originally acquired under automatic exposure conditions, corresponding to an average glandular dose of 1.3 mGy for a standard breast (50 mm compressed breast thickness with 50% glandularity). For each study, one to three simulated lesions were added to each of 40 images (abnormals) while 50 were kept without lesions (normals). Two levels of simulated system noise were added to the images yielding two new image sets, corresponding to simulated dose levels of 50% and 30% of the original images (100%). The manufacturer's standard display processing was subsequently applied to all images. Four radiologists experienced in mammography evaluated the images by searching for lesions and marking and assigning confidence levels to suspicious regions. The search data were analyzed using jackknife free-response (JAFROC) methodology. For the detection of masses, the mean figure-of-merit (FOM) averaged over all readers was 0.74, 0.71, and 0.68 corresponding to dose levels of 100%, 50%, and 30%, respectively. These values were not statistically different from each other (F=1.67, p=0.19) but showed a decreasing trend. In contrast, in the microcalcification study the mean FOM was 0.93, 0.67, and 0.38 for the same dose levels and these values were all significantly different from each other (F=109.84, p<0.0001). The results indicate that lowering the present dose level by a factor of two compromised the detection of microcalcifications but had a weaker effect on mass detection.

  19. Combined triaxial accelerometry and heart rate telemetry for the physiological characterization of Latin dance in non-professional adults.

    PubMed

    Domene, Pablo A; Easton, Chris

    2014-03-01

    The purpose of this study was to value calibrate, cross-validate, and determine the reliability of a combined triaxial accelerometry and heart rate telemetry technique for characterizing the physiological and physical activity parameters of Latin dance. Twenty-two non-professional adult Latin dancers attended two laboratory-based dance trials each. After familiarization and a standardized warm-up, a multi-stage (3 x 5-minute) incremental (based on song tempo) Afro-Cuban salsa choreography was performed while following a video displayed on a projection screen. Data were collected with a portable indirect calorimeter, a heart rate telemeter, and wrist-, hip-, and ankle-mounted ActiGraph GT3X+ accelerometers. Prediction equations for energy expenditure and step count were value calibrated using forced entry multiple regression and cross-validated using a delete-one jackknife approach with additional Bland-Altman analysis. The average dance intensity reached 6.09 ± 0.96 kcal/kg/h and demanded 45.9 ± 11.3% of the heart rate reserve. Predictive ability of the derived models was satisfactory, where R² = 0.80; SEE = 0.44 kcal/kg/h and R² = 0.74; SEE = 3 step/min for energy expenditure and step count, respectively. Dependent t-tests indicated no differences between predicted and measured values for both energy expenditure (t(65) = -0.25, p = 0.80) and step count (t(65) = -0.89, p = 0.38). The 95% limits of agreement for energy expenditure and step count were -0.98 to 0.95 kcal/kg/h and -7 to 7 step/min, respectively. Latin dance to salsa music elicits physiological responses representative of moderate to vigorous physical activity, and a wrist-worn accelerometer with simultaneous heart rate measurement constitutes a valid and reliable technique for the prediction of energy expenditure and step count during Latin dance. PMID:24568801
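    A delete-one jackknife cross-validation of a regression equation, as mentioned above, might look like the sketch below. It assumes an ordinary least-squares model with an intercept, and the function name `jackknife_cv_predictions` is hypothetical. The abstract does not describe the predictors or the fitting details, so this only illustrates the resampling scheme.

```python
import numpy as np

def jackknife_cv_predictions(X, y):
    """Predict each observation from a least-squares fit to all the others."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with an intercept column (assumed model form)
    design = np.column_stack([np.ones(len(y)), X])
    preds = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        preds[i] = design[i] @ coef
    return preds
```

    The held-out predictions can then be compared with the measured values, for example with a paired t-test or Bland-Altman limits of agreement.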

  20. Improved classification of lung cancer tumors based on structural and physicochemical properties of proteins using data mining models.

    PubMed

    Ramani, R Geetha; Jacob, Shomona Gracia

    2013-01-01

    Detecting divergence between oncogenic tumors plays a pivotal role in cancer diagnosis and therapy. This research work was focused on designing a computational strategy to predict the class of lung cancer tumors from the structural and physicochemical properties (1497 attributes) of protein sequences obtained from genes defined by microarray analysis. The proposed methodology involved the use of hybrid feature selection techniques (gain ratio and correlation based subset evaluators with Incremental Feature Selection) followed by Bayesian Network prediction to discriminate lung cancer tumors as Small Cell Lung Cancer (SCLC), Non-Small Cell Lung Cancer (NSCLC) and the COMMON classes. Moreover, this methodology eliminated the need for extensive data cleansing strategies on the protein properties and revealed the optimal and minimal set of features that contributed to lung cancer tumor classification with an improved accuracy compared to previous work. We also attempted to predict via supervised clustering the possible clusters in the lung tumor data. Our results revealed that supervised clustering algorithms exhibited poor performance in differentiating the lung tumor classes. Hybrid feature selection identified the distribution of solvent accessibility, polarizability and hydrophobicity as the highest ranked features with Incremental feature selection and Bayesian Network prediction generating the optimal Jack-knife cross validation accuracy of 87.6%. Precise categorization of oncogenic genes causing SCLC and NSCLC based on the structural and physicochemical properties of their protein sequences is expected to unravel the functionality of proteins that are essential in maintaining the genomic integrity of a cell and also act as an informative source for drug design, targeting essential protein properties and their composition that are found to exist in lung cancer tumors. PMID:23505559

  1. Lethal and Demographic Impact of Chlorpyrifos and Spinosad on the Ectoparasitoid Habrobracon hebetor (Say) (Hymenoptera: Braconidae).

    PubMed

    Mahdavi, V; Saber, M; Rafiee-Dastjerdi, H; Kamita, S G

    2015-12-01

    The appropriate use of biological agents and chemical compounds is necessary to establish successful integrated pest management (IPM) programs. Thus, the off-target effects of pesticides on biological control agents are essential considerations of IPM. In this study, the effects of lethal and sublethal concentrations of chlorpyrifos and spinosad on the demographic parameters of Habrobracon hebetor (Say) (Hymenoptera: Braconidae) were assessed. Bioassays were carried out on immature and adult stages by using dipping and contact exposure of dry pesticide residue on an inert material, respectively. The lethal concentration (LC)50 values of chlorpyrifos and spinosad were 3.69 and 151.37 ppm, respectively, on the larval stage and 1.75 and 117.37 ppm, respectively, on adults. Hazard quotient (HQ) values for chlorpyrifos and spinosad were 400 and 2.2, respectively, on the larval stage and 857.14 and 2.84, respectively, on adults. A low lethal concentration (LC30) was used to assess the sublethal effects of both pesticides on the surviving females. In each treatment, 25 survivors were randomly selected and transferred into 6-cm Petri dishes. Adults were provided daily with last instars of Anagasta kuehniella (Zeller) as a host until all of the females died. The number of eggs laid, percent of larvae hatched, longevity, and sex ratio were recorded. Stable population growth parameters were estimated by the Jackknife method. In control, chlorpyrifos, and spinosad treatments, the intrinsic rate of increase (r_m) values were 0.23, 0.10, and 0.21, respectively. The results of this study suggest a relative compatibility between spinosad use and H. hebetor. Finally, further studies should be conducted under natural conditions to verify the compatibility of spinosad with H. hebetor in IPM programs. PMID:26280986
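    Jackknife estimation of a population parameter is often done through pseudo-values, and the sketch below shows that generic construction. It is an assumption that the study used this variant. The function name `jackknife_pseudovalues` is hypothetical, and `statistic` stands for whatever computes the parameter from the cohort (for r_m that would be a life-table calculation, which is not reproduced here).

```python
import numpy as np

def jackknife_pseudovalues(data, statistic):
    """Delete-one jackknife pseudo-values, their mean, and its standard error."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    full = statistic(data)
    # Statistic recomputed with individual i left out
    loo = np.array([statistic(np.delete(data, i, axis=0)) for i in range(n)])
    pseudo = n * full - (n - 1) * loo
    estimate = pseudo.mean()
    std_error = pseudo.std(ddof=1) / np.sqrt(n)
    return pseudo, estimate, std_error
```

    With the sample mean as the statistic the pseudo-values reduce to the observations themselves, which is a convenient sanity check before plugging in a more involved parameter.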

  2. Detection of calcification clusters in digital breast tomosynthesis slices at different dose levels utilizing a SRSAR reconstruction and JAFROC

    NASA Astrophysics Data System (ADS)

    Timberg, P.; Dustler, M.; Petersson, H.; Tingberg, A.; Zackrisson, S.

    2015-03-01

    Purpose: To investigate detection performance for calcification clusters in reconstructed digital breast tomosynthesis (DBT) slices at different dose levels using a Super Resolution and Statistical Artifact Reduction (SRSAR) reconstruction method. Method: Simulated calcifications with irregular profile (0.2 mm diameter) were combined to form clusters that were added to projection images (1-3 per abnormal image) acquired on a DBT system (Mammomat Inspiration, Siemens). The projection images were dose reduced by software to form 35 abnormal cases and 25 normal cases as if acquired at 100%, 75% and 50% dose level (AGD of approximately 1.6 mGy for a 53 mm standard breast, measured according to EUREF v0.15). A standard FBP and an SRSAR reconstruction method (utilizing IRIS (iterative reconstruction filters), and outlier detection using Maximum-Intensity Projections and Average-Intensity Projections) were used to reconstruct single central slices to be used in a free-response task (60 images per observer and dose level). Six observers participated and their task was to detect the clusters and assign confidence ratings in randomly presented images from the whole image set (balanced by dose level). Each trial was separated by one week to reduce possible memory bias. The outcome was analyzed for statistical differences using Jackknifed Alternative Free-response Receiver Operating Characteristics. Results: The results indicate that it is possible to reduce the dose by 50% with SRSAR without jeopardizing cluster detection. Conclusions: The detection performance for clusters can be maintained at a lower dose level by using SRSAR reconstruction.

  3. Phylogenetic studies favour the unification of Pennisetum, Cenchrus and Odontelytrum (Poaceae): a combined nuclear, plastid and morphological analysis, and nomenclatural combinations in Cenchrus

    PubMed Central

    Chemisquy, M. Amelia; Giussani, Liliana M.; Scataglini, María A.; Kellogg, Elizabeth A.; Morrone, Osvaldo

    2010-01-01

    Background and Aims Twenty-five genera having sterile inflorescence branches were recognized as the bristle clade within the x = 9 Paniceae (Panicoideae). Within the bristle clade, taxonomic circumscription of Cenchrus (20–25 species), Pennisetum (80–140) and the monotypic Odontelytrum is still unclear. Several criteria have been applied to characterize Cenchrus and Pennisetum, but none of these has proved satisfactory as the diagnostic characters, such as fusion of bristles in the inflorescences, show continuous variation. Methods A phylogenetic analysis based on morphological, plastid (trnL-F, ndhF) and nuclear (knotted) data is presented for a representative species sampling of the genera. All analyses were conducted under parsimony, using heuristic searches with TBR branch swapping. Branch support was assessed with parsimony jackknifing. Key Results Based on plastid and morphological data, Pennisetum, Cenchrus and Odontelytrum were supported as a monophyletic group: the PCO clade. Only one section of Pennisetum (Brevivalvula) was supported as monophyletic. The position of P. lanatum differed among data partitions, although the combined plastid and morphology and nuclear analyses showed this species to be a member of the PCO clade. The basic chromosome number x = 9 was found to be plesiomorphic, and x = 5, 7, 8, 10 and 17 were derived states. The nuclear phylogenetic analysis revealed a reticulate pattern of relationships among Pennisetum and Cenchrus, suggesting that there are at least three different genomes. Because apomixis can be transferred among species through hybridization, its history most likely reflects crossing relationships, rather than multiple independent appearances. Conclusions Due to the consistency between the present results and different phylogenetic hypotheses (including morphological, developmental and multilocus approaches), and the high support found for the PCO clade, also including the type species of the three genera, we propose unification of Pennisetum, Cenchrus and Odontelytrum. Species of Pennisetum and Odontelytrum are here transferred into Cenchrus, which has priority. Sixty-six new combinations are made here. PMID:20570830

  4. Detecting taxonomic signal in an under-utilised character system: geometric morphometrics of the forcipular coxae of Scutigeromorpha (Chilopoda)

    PubMed Central

    Gutierrez, Beatriz Lopez; MacLeod, Norman; Edgecombe, Gregory D.

    2011-01-01

    To date, the forcipules have played almost no role in determining the systematics of scutigeromorph centipedes though in his 1974 review of taxonomic characters Markus Würmli suggested some potentially informative variation might be found in these structures. Geometric morphometric analyses were used to evaluate Würmli’s suggestion, specifically to determine whether the shape of the forcipular coxa contains information useful for diagnosing species. The geometry of the coxae of eight species from the genera Sphendononema, Scutigera, Dendrothereua, Thereuonema, Thereuopoda, Thereuopodina, Allothereua and Parascutigera was characterised using a combination of landmark- and semi-landmark-based sampling methods to summarize group-specific morphological variation. Canonical variates analysis of shape data characterizing the forcipular coxae indicates that these structures differ significantly between taxa at various systematic levels. Models calculated for the canonical variates space facilitate identification of the main shape differences between genera, including overall length/width, curvature of the external coxal margin, and the extent to which the coxofemoral condyle projects laterally. Jackknifed discriminant function analysis demonstrates that forcipular coxal training-set specimens were assigned to correct species in 61% of cases on average, the most accurate assignments being those of Parascutigera (Parascutigera guttata) and Thereuonema (Thereuonema microstoma). The geographically widespread species Thereuopoda longicornis, Sphendononema guildingii, Scutigera coleoptrata, and Dendrothereua linceci exhibit the least diagnostic coxae in our dataset. Thereuopoda longicornis populations sampled from different parts of East and Southeast Asia were significantly discriminated from each other, suggesting that, in this case, extensive synonymy may be obscuring diagnosable inter-species coxal shape differences. PMID:22303095

  5. Demographic history and rare allele sharing among human populations.

    PubMed

    Gravel, Simon; Henn, Brenna M; Gutenkunst, Ryan N; Indap, Amit R; Marth, Gabor T; Clark, Andrew G; Yu, Fuli; Gibbs, Richard A; Bustamante, Carlos D

    2011-07-19

    High-throughput sequencing technology enables population-level surveys of human genomic variation. Here, we examine the joint allele frequency distributions across continental human populations and present an approach for combining complementary aspects of whole-genome, low-coverage data and targeted high-coverage data. We apply this approach to data generated by the pilot phase of the Thousand Genomes Project, including whole-genome 2-4× coverage data for 179 samples from HapMap European, Asian, and African panels as well as high-coverage target sequencing of the exons of 800 genes from 697 individuals in seven populations. We use the site frequency spectra obtained from these data to infer demographic parameters for an Out-of-Africa model for populations of African, European, and Asian descent and to predict, by a jackknife-based approach, the amount of genetic diversity that will be discovered as sample sizes are increased. We predict that the number of discovered nonsynonymous coding variants will reach 100,000 in each population after ∼1,000 sequenced chromosomes per population, whereas ∼2,500 chromosomes will be needed for the same number of synonymous variants. Beyond this point, the number of segregating sites in the European and Asian panel populations is expected to overcome that of the African panel because of faster recent population growth. Overall, we find that the majority of human genomic variable sites are rare and exhibit little sharing among diverged populations. Our results emphasize that replication of disease association for specific rare genetic variants across diverged populations must overcome both reduced statistical power because of rarity and higher population divergence. PMID:21730125

  6. PRIMUS: Galaxy Clustering as a Function of Luminosity and Color at 0.2 < z < 1

    NASA Astrophysics Data System (ADS)

    Skibba, Ramin A.; Smith, M. Stephen M.; Coil, Alison L.; Moustakas, John; Aird, James; Blanton, Michael R.; Bray, Aaron D.; Cool, Richard J.; Eisenstein, Daniel J.; Mendez, Alexander J.; Wong, Kenneth C.; Zhu, Guangtun

    2014-04-01

    We present measurements of the luminosity and color-dependence of galaxy clustering at 0.2 < z < 1.0 in the Prism Multi-object Survey. We quantify the clustering with the redshift-space and projected two-point correlation functions, ξ(r_p, π) and w_p(r_p), using volume-limited samples constructed from a parent sample of over ~130,000 galaxies with robust redshifts in seven independent fields covering 9 deg² of sky. We quantify how the scale-dependent clustering amplitude increases with increasing luminosity and redder color, with relatively small errors over large volumes. We find that red galaxies have stronger small-scale (0.1 Mpc h⁻¹ < r_p < 1 Mpc h⁻¹) clustering and steeper correlation functions compared to blue galaxies, as well as a strong color-dependent clustering within the red sequence alone. We interpret our measured clustering trends in terms of galaxy bias and obtain values of b_gal ≈ 0.9-2.5, quantifying how galaxies are biased tracers of dark matter depending on their luminosity and color. We also interpret the color dependence with mock catalogs, and find that the clustering of blue galaxies is nearly constant with color, while redder galaxies have stronger clustering in the one-halo term due to a higher satellite galaxy fraction. In addition, we measure the evolution of the clustering strength and bias, and we do not detect statistically significant departures from passive evolution. We argue that the luminosity- and color-environment (or halo mass) relations of galaxies have not significantly evolved since z ~ 1. Finally, using jackknife subsampling methods, we find that sampling fluctuations are important and that the COSMOS field is generally an outlier, due to having more overdense structures than other fields; we find that "cosmic variance" can be a significant source of uncertainty for high-redshift clustering measurements.
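    Jackknife subsampling of a clustering measurement is commonly done by recomputing the statistic with one field (or sky region) left out at a time and scaling the scatter of those replicates. The sketch below shows that covariance estimate under the assumption of delete-one-field resampling; the abstract does not give the exact scheme, and the function name `jackknife_covariance` is hypothetical.

```python
import numpy as np

def jackknife_covariance(loo_measurements):
    """Covariance from delete-one replicates, shape (n_subsamples, n_bins)."""
    loo = np.asarray(loo_measurements, dtype=float)
    n = loo.shape[0]
    dev = loo - loo.mean(axis=0)
    # The (n - 1) / n factor accounts for the strong overlap between replicates
    return (n - 1) / n * dev.T @ dev
```

    The square roots of the diagonal give per-bin error bars. A single field with unusual structure shifts its own leave-one-out replicate away from the rest, which is one way an outlier field becomes visible.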

  7. Effect of herbicide combinations on Bt-maize rhizobacterial diversity.

    PubMed

    Valverde, José R; Marín, Silvia; Mellado, Rafael P

    2014-11-28

    Reports of herbicide resistance events are proliferating worldwide, leading to new cultivation strategies using combinations of pre-emergence and post-emergence herbicides. We analyzed the impact during a one-year cultivation cycle of several herbicide combinations on the rhizobacterial community of glyphosate-tolerant Bt-maize and compared them to those of the untreated or glyphosate-treated soils. Samples were analyzed using pyrosequencing of the V6 hypervariable region of the 16S rRNA gene. The sequences obtained were subjected to taxonomic, taxonomy-independent, and phylogeny-based diversity studies, followed by a statistical analysis using principal components analysis and hierarchical clustering with jackknife statistical validation. The resilience of the microbial communities was analyzed by comparing their relative composition at the end of the cultivation cycle. The bacterial communities from soil subjected to a combined treatment with mesotrione plus s-metolachlor followed by glyphosate were not statistically different from those treated with glyphosate or the untreated ones. The use of acetochlor plus terbuthylazine followed by glyphosate, and the use of aclonifen plus isoxaflutole followed by mesotrione clearly affected the resilience of their corresponding bacterial communities. The treatment with pethoxamid followed by glyphosate resulted in an intermediate effect. The use of glyphosate alone seems to be the least aggressive treatment for bacterial communities. Should a combined treatment be needed, the combination of mesotrione and s-metolachlor shows the next best final resilience. Our results show the relevance of comparative rhizobacterial community studies when novel combined herbicide treatments are deemed necessary to control weed growth. PMID:25394507

  8. [Seasonal evaluation of mammal species richness and abundance in the "Mário Viana" municipal reserve, Mato Grosso, Brasil].

    PubMed

    Rocha, Ednaldo Cândido; Silva, Elias; Martins, Sebastião Venâncio; Barreto, Francisco Cândido Cardoso

    2006-09-01

    We evaluated seasonal species presence and richness, and abundance of medium and large sized mammalian terrestrial fauna in the "Mário Viana" Municipal Biological Reserve, Nova Xavantina, Mato Grosso, Brazil. During 2001, two monthly visits were made to an established transect, 2,820 m in length. Records of 22 mammal species were obtained and individual footprint sequences quantified for seasonal calculation of species richness and relative abundance index (x footprints/km traveled). All 22 species occurred during the rainy season, but only 18 during the dry season. Pseudalopex vetulus (Lund, 1842) (hoary fox), Eira barbara (Linnaeus, 1758) (tayra), Puma concolor (Linnaeus, 1771) (cougar) and Hydrochaeris hydrochaeris (Linnaeus, 1766) (capybara) were only registered during the rainy season. The species diversity estimated using the Jackknife procedure in the dry season (19.83, CI = 2.73) was smaller than in the rainy season (25.67, CI = 3.43). Among the 18 species common to the two seasons, only four presented significantly different abundance indexes: Dasypus novemcinctus Linnaeus, 1758 (nine-banded armadillo), Euphractus sexcinctus (Linnaeus, 1758) (six-banded armadillo), Dasyprocta azarae Lichtenstein, 1823 (Azara's Agouti) and Tapirus terrestris (Linnaeus, 1758) (tapir). On the other hand, Priodontes maximus (Kerr, 1792) (giant armadillo) and Leopardus pardalis (Linnaeus, 1758) (ocelot) had an identical abundance index over the two seasons. Distribution of species abundance in the sampled area followed the expected pattern for communities in equilibrium, especially in the rainy season, suggesting that the environment still maintains good characteristics for mammal conservation. The present study shows that the reserve, although only 470 ha in size, plays an important role for conservation of mastofauna of the area as a refuge in an environment full of anthropic influence (mainly cattle breeding in exotic pasture). PMID:18491629
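
    The abstract does not say which jackknife estimator was used. A common choice for richness is the first-order jackknife, S_obs + Q1 (m - 1) / m, where Q1 is the number of species recorded on exactly one sampling occasion and m is the number of occasions. The sketch below assumes that estimator; the incidence table, the value m = 12 and the two "unique" species are hypothetical and were picked only because they happen to give a value close to the dry-season figure quoted above.

```python
def jackknife1_richness(incidence):
    """First-order jackknife richness from a species-by-occasion incidence table.

    `incidence` maps a species name to a list of 0/1 flags, one per occasion.
    """
    m = len(next(iter(incidence.values())))  # number of sampling occasions
    observed = sum(1 for flags in incidence.values() if any(flags))
    uniques = sum(1 for flags in incidence.values() if sum(flags) == 1)
    return observed + uniques * (m - 1) / m


# Hypothetical dry-season table: 18 species over 12 occasions,
# two of them recorded on a single occasion only.
m = 12
table = {f"sp{i}": [1] * m for i in range(16)}
table["sp16"] = [1] + [0] * (m - 1)
table["sp17"] = [0] * (m - 1) + [1]

estimate = jackknife1_richness(table)
```

    Under these assumed inputs the estimate exceeds the 18 observed species because each single-occasion species is treated as evidence of further undetected ones.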

  9. Historical shoreline mapping (II): application of the Digital Shoreline Mapping and Analysis Systems (DSMS/DSAS) to shoreline change mapping in Puerto Rico

    USGS Publications Warehouse

    Thieler, E. Robert; Danforth, William W.

    1994-01-01

    A new, state-of-the-art method for mapping historical shorelines from maps and aerial photographs, the Digital Shoreline Mapping System (DSMS), has been developed. The DSMS is a freely available, public domain software package that meets the cartographic and photogrammetric requirements of precise coastal mapping, and provides a means to quantify and analyze different sources of error in the mapping process. The DSMS is also capable of resolving imperfections in aerial photography that commonly are assumed to be nonexistent. The DSMS utilizes commonly available computer hardware and software, and permits the entire shoreline mapping process to be executed rapidly by a single person in a small lab. The DSMS generates output shoreline position data that are compatible with a variety of Geographic Information Systems (GIS). A second suite of programs, the Digital Shoreline Analysis System (DSAS), has been developed to calculate shoreline rates-of-change from a series of shoreline data residing in a GIS. Four rate-of-change statistics are calculated simultaneously (end-point rate, average of rates, linear regression and jackknife) at a user-specified interval along the shoreline using a measurement baseline approach. An example of DSMS and DSAS application using historical maps and air photos of Punta Uvero, Puerto Rico provides a basis for assessing the errors associated with the source materials as well as the accuracy of computed shoreline positions and erosion rates. The maps and photos used here represent a common situation in shoreline mapping: marginal-quality source materials. The maps and photos are near the usable upper limit of scale and accuracy, yet the shoreline positions are still accurate to ±9.25 m when all sources of error are considered. This level of accuracy yields a resolution of ±0.51 m/yr for shoreline rates-of-change in this example, and is sufficient to identify the short-term trend (36 years) of shoreline change in the study area.
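
    The four rate-of-change statistics named above can be sketched for a single transect as follows. This is an illustration under simplifying assumptions, not the DSAS code: the average of rates is taken over all shoreline pairs with no minimum time-span filter, and the jackknife rate is taken as the mean of the regression slopes obtained by leaving out one shoreline at a time. The function names, survey years and positions are hypothetical.

```python
from itertools import combinations


def slope(points):
    """Least-squares slope of position (m) against time (yr)."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    return sxy / sxx


def shoreline_rates(points):
    """Return the four rate statistics (m/yr) for one transect."""
    pts = sorted(points)
    end_point = (pts[-1][1] - pts[0][1]) / (pts[-1][0] - pts[0][0])
    pair_rates = [(b[1] - a[1]) / (b[0] - a[0]) for a, b in combinations(pts, 2)]
    average_of_rates = sum(pair_rates) / len(pair_rates)
    regression = slope(pts)
    # Jackknife: refit the regression with each shoreline omitted in turn.
    loo_slopes = [slope(pts[:i] + pts[i + 1:]) for i in range(len(pts))]
    jackknife = sum(loo_slopes) / len(loo_slopes)
    return {
        "end_point": end_point,
        "average_of_rates": average_of_rates,
        "regression": regression,
        "jackknife": jackknife,
    }


# Hypothetical transect: (year, distance from baseline in metres).
transect = [(1950, 100.0), (1964, 93.0), (1977, 86.5), (1986, 82.0)]
rates = shoreline_rates(transect)
```

    The hypothetical positions lie on a straight line, so all four statistics agree here; with noisy shorelines they diverge, and the jackknife value is less sensitive to a single poorly mapped shoreline than the end-point rate.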

  10. Remote Sensing of Miombo Woodland's Aboveground Biomass and LAI using RADARSAT and Landsat ETM+ Data

    NASA Astrophysics Data System (ADS)

    Ribeiro, N. S.; Saatchi, S. S.; Shugart, H. H.; Washington-Allen, R. A.

    2007-05-01

    Estimations of biomass are critical in Miombo Woodlands because they represent a primary source of food, fiber, and fuel for 340 million rural people and another 15 million urban dwellers in southern Africa. The purpose of this study is to estimate woody aboveground biomass and Leaf Area Index (LAI) in Niassa Reserve, northern Mozambique. The objective of this study is to use optical and microwave satellite data with contemporaneous field data to estimate biomass and LAI. Fifty field plots were surveyed across the Niassa Reserve for biomass and LAI in July and December 2004, respectively. Remote sensing data consisting of RADARSAT backscatter (C-band, λ = 5.6 cm) and a June 2004 Landsat ETM+ scene were acquired. Normalized Difference Vegetation Index (NDVI), Simple Ratio (SR), and a land-cover map (72% total accuracy) were derived from the Landsat scene. Field measurements of biomass and LAI correlated with RADARSAT backscatter (Rsq_biomass = 0.45, Rsq_LAI = 0.35, p < 0.0001), NDVI (Rsq_biomass = 0.15, Rsq_LAI = 0.14, p < 0.0001) and SR (Rsq_biomass = -0.14, Rsq_LAI = 0.17, p < 0.0001). A jackknife stepwise regression technique was used to develop the best predictive models for biomass (biomass = -5.19 + 0.074*radarsat + 1.56*SR, Rsq = 0.53) and LAI (LAI = -0.66 + 0.01*radarsat + 0.22*SR, Rsq = 0.45). The addition of NDVI did not improve the model. Forest biomass and LAI maps were then produced for Niassa Reserve with an estimated peak total biomass of 18 kg/hm² and a mean LAI of 2.8 m²/m². In the east both biomass and LAI are lower than in the western Niassa Reserve.

  11. Efficacy of digital breast tomosynthesis for breast cancer diagnosis

    NASA Astrophysics Data System (ADS)

    Alakhras, M.; Mello-Thoms, C.; Rickard, M.; Bourne, R.; Brennan, P. C.

    2014-03-01

    Purpose: To compare the diagnostic performance of digital breast tomosynthesis (DBT) in combination with digital mammography (DM) with that of digital mammography alone. Materials and Methods: Twenty-six experienced radiologists who specialized in breast imaging read 50 cases (27 cancers and 23 non-cancer cases) of patients who underwent DM and DBT. Both exams included the craniocaudal (CC) and mediolateral oblique (MLO) views. Histopathologic examination established truth in all lesions. Each case was interpreted in two modes, first with DM alone and then with DM+DBT, and the observers were asked to mark the location of any lesions, if present, and give each a score based on a five-category assessment by the Royal Australian and New Zealand College of Radiologists (RANZCR). The diagnostic performance of DM compared with that of DM+DBT was evaluated in terms of the difference between areas under receiver-operating characteristic curves (AUCs), Jackknife free-response receiver operator characteristics (JAFROC) figure-of-merit, sensitivity, location sensitivity and specificity. Results: Average AUC and JAFROC differed significantly between DM and DM+DBT (AUC 0.690 vs. 0.781, p < 0.0001; JAFROC 0.618 vs. 0.732, p < 0.0001). In addition, the use of DM+DBT resulted in an improvement in sensitivity (0.629 vs. 0.701, p = 0.0011), location sensitivity (0.548 vs. 0.690, p < 0.0001) and specificity (0.656 vs. 0.758, p = 0.0015) when compared to DM alone. Conclusion: Adding DBT to the standard DM significantly improved radiologists' performance in terms of AUCs, JAFROC figure of merit, sensitivity, location sensitivity and specificity values.

  12. Asymmetric Constriction of Dividing Escherichia coli Cells Induced by Expression of a Fusion between Two Min Proteins

    PubMed Central

    Rowlett, Veronica Wells

    2014-01-01

    The Min system, consisting of MinC, MinD, and MinE, plays an important role in localizing the Escherichia coli cell division machinery to midcell by preventing FtsZ ring (Z ring) formation at cell poles. MinC has two domains, MinCn and MinCc, which both bind to FtsZ and act synergistically to inhibit FtsZ polymerization. Binary fission of E. coli usually proceeds symmetrically, with daughter cells at roughly 180° to each other. In contrast, we discovered that overproduction of an artificial MinCc-MinD fusion protein in the absence of other Min proteins induced frequent and dramatic jackknife-like bending of cells at division septa, with cell constriction predominantly on the outside of the bend. Mutations in the fusion known to disrupt MinCc-FtsZ, MinCc-MinD, or MinD-membrane interactions largely suppressed bending division. Imaging of FtsZ-green fluorescent protein (GFP) showed no obvious asymmetric localization of FtsZ during MinCc-MinD overproduction, suggesting that a downstream activity of the Z ring was inhibited asymmetrically. Consistent with this, MinCc-MinD fusions localized predominantly to segments of the Z ring at the inside of developing cell bends, while FtsA (but not ZipA) tended to localize to the outside. As FtsA is required for ring constriction, we propose that this asymmetric localization pattern blocks constriction of the inside of the septal ring while permitting continued constriction of the outside portion. PMID:24682325

  13. Caries-preventive Effect of Supervised Toothbrushing and Sealants.

    PubMed

    Hilgert, L A; Leal, S C; Mulder, J; Creugers, N H J; Frencken, J E

    2015-09-01

    To investigate the effectiveness of 3 caries-preventive measures on high- and low-caries risk occlusal surfaces of first permanent molars over 3 y. This cluster-randomized controlled clinical trial covered 242 schoolchildren, 6 to 7 y old, from low socioeconomic areas. At baseline, caries risk was assessed at the tooth surface level, through a combination of ICDAS II (International Caries Detection and Assessment System) and fissure depth codes. High-caries risk occlusal surfaces were treated according to daily supervised toothbrushing (STB) at school and 2 sealants: composite resin (CR) and atraumatic restorative treatment-high-viscosity glass-ionomer cement (ART-GIC). Low-caries risk occlusal surfaces received STB or no intervention. Evaluations were performed after 0.5, 1, 2, and 3 y. A cavitated dentine carious lesion was considered a failure. Data were analyzed according to the proportional hazard rate regression model with frailty correction, Wald test, analysis of variance, and t test, according to the jackknife procedure for calculating standard errors. The cumulative survival rates of cavitated dentine carious lesion-free, high-caries risk occlusal surfaces were 95.6%, 91.4%, and 90.2% for STB, CR, and ART-GIC, respectively, over 3 y, which were not statistically significantly different. For low-caries risk occlusal surfaces, no statistically significant difference was observed between the cumulative survival rate of the STB group (94.8%) and the no-intervention group (92.1%) over 3 y. There was neither a difference among STB, CR, and ART-GIC on school premises in preventing cavitated dentine carious lesions in high-caries risk occlusal surfaces of first permanent molars nor a difference between STB and no intervention for low-caries risk occlusal surfaces of first permanent molars over 3 y. PMID:26116491

  14. Neuroanatomic localization of priming effects for famous faces with latency-corrected event-related potentials.

    PubMed

    Kashyap, Rajan; Ouyang, Guang; Sommer, Werner; Zhou, Changsong

    2016-02-01

    The late components of event-related brain potentials (ERPs) pose a difficult problem in source localization. One of the reasons is the smearing of these components in conventional averaging because of trial-to-trial latency-variability. The smearing problem may be addressed by reconstructing the ERPs after latency synchronization with the Residue Iteration Decomposition (RIDE) method. Here we assessed whether the benefits of RIDE at the surface level also improve source localization of RIDE-reconstructed ERPs (RERPs) measured in a face priming paradigm. Separate source models for conventionally averaged ERPs and RERPs were derived and sources were localized for both early and late components. Jackknife averaging on the data was used to reduce the residual variance during source localization compared to conventional source model fitting on individual subject data. Distances between corresponding sources of both ERP and RERP models were measured to check consistency in both source models. Sources for activity around P100, N170, early repetition effect (ERE/N250r) and late repetition effect (LRE/N400) were reported and priming effects in these sources were evaluated for six time windows. Significant improvement in priming effect of the late sources was found from the RERP source model, especially in the Medio-Temporal Lobe, Prefrontal Cortex, and Anterior Temporal Lobe. Consistent with previous studies, we found early priming effects in the right hemisphere and late priming effects in the left hemisphere. Also, the priming effects in right hemisphere outnumbered the left hemisphere, signifying dominance of right hemisphere in face recognition. In conclusion, RIDE reconstructed ERPs promise a comprehensive understanding of the time-resolved dynamics the late sources play during face recognition. PMID:26683085

  15. Personal and Network Dynamics in Performance of Knowledge Workers: A Study of Australian Breast Radiologists

    PubMed Central

    Tavakoli Taba, Seyedamir; Hossain, Liaquat; Heard, Robert; Brennan, Patrick; Lee, Warwick; Lewis, Sarah

    2016-01-01

    Materials and Methods: In this paper, we propose a theoretical model based upon previous studies about personal and social network dynamics of job performance. We provide empirical support for this model using real-world data within the context of the Australian radiology profession. An examination of radiologists’ professional network topology through structural-positional and relational dimensions and radiologists’ personal characteristics in terms of knowledge, experience and self-esteem is provided. Thirty-one breast imaging radiologists completed a purpose-designed questionnaire regarding their network characteristics and personal attributes. These radiologists also independently read a test set of 60 mammographic cases: 20 cases with cancer and 40 normal cases. A Jackknife free response operating characteristic (JAFROC) method was used to measure the performance of the radiologists in detecting breast cancers. Results: Correlational analyses showed that reader performance was positively correlated with the social network variables of degree centrality and effective size, but negatively correlated with constraint and hierarchy. For personal characteristics, the number of mammograms read per year and self-esteem (self-evaluation) positively correlated with reader performance. Hierarchical multiple regression analysis indicated that the combination of number of mammograms read per year and network’s effective size, hierarchy and tie strength was the best fitting model, explaining 63.4% of the variance in reader performance. The results from this study indicate the positive relationship between reading high volumes of cases by radiologists and expertise development, but also strongly emphasise the association between effective social/professional interactions and informal knowledge sharing with high performance. PMID:26918644

  16. Early detection of production deficit hot spots in semi-arid environment using FAPAR time series and a probabilistic approach

    NASA Astrophysics Data System (ADS)

    Meroni, M.; Fasbender, D.; Kayitakire, F.; Pini, G.; Rembold, F.; Urbano, F.; Verstraete, M. M.

    2013-12-01

    Timely information on vegetation development at regional scale is needed in arid and semiarid African regions where rainfall variability leads to high inter-annual fluctuations in crop and pasture productivity, as well as to high risk of food crisis in the presence of severe drought events. The present study aims at developing and testing an automatic procedure to estimate the probability of experiencing a seasonal biomass production deficit solely on the basis of historical and near real-time remote sensing observations. The method is based on the extraction of vegetation phenology from SPOT-VEGETATION time series of the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) and the subsequent computation of seasonally cumulated FAPAR as a proxy for vegetation gross primary production. Within-season forecasts of the overall seasonal performance, expressed in terms of probability of experiencing a critical deficit, are based on a statistical approach taking into account two factors: i) the similarity between the current FAPAR profile and past profiles observable in the 15-year FAPAR time series; ii) the uncertainty of past predictions of season outcome as derived using a jack-knifing technique. The method is applicable at the regional to continental scale and can be updated regularly during the season (whenever a new satellite observation is made available) to provide a synoptic view of the hot spots of likely production deficit. The specific objective of the procedure described here is to deliver to the food security analyst, as early as possible within the season, only the relevant information (e.g., masking out areas without active vegetation at the time of analysis), expressed through a reliable and easily interpretable measure of impending risk. Evaluation of method performance and examples of application in the Sahel region are discussed.

  17. Metric optimisation for analogue forecasting by simulated annealing

    NASA Astrophysics Data System (ADS)

    Bliefernicht, J.; Bárdossy, A.

    2009-04-01

    It is well known that weather patterns tend to recur from time to time. This property of the atmosphere is used by analogue forecasting techniques. They have a long history in weather forecasting and there are many applications predicting hydrological variables at the local scale for different lead times. The basic idea of the technique is to identify past weather situations which are similar (analogue) to the predicted one and to take the local conditions of the analogues as forecast. But the forecast performance of the analogue method depends on user-defined criteria like the choice of the distance function and the size of the predictor domain. In this study we propose a new methodology of optimising both criteria by minimising the forecast error with simulated annealing. The performance of the methodology is demonstrated for the probability forecast of daily areal precipitation. It is compared with a traditional analogue forecasting algorithm, which is used operationally as an element of a hydrological forecasting system. The study is performed for several meso-scale catchments located in the Rhine basin in Germany. The methodology is validated by a jack-knife method in a perfect prognosis framework for a period of 48 years (1958-2005). The predictor variables are derived from the NCEP/NCAR reanalysis data set. The Brier skill score and the economic value are determined to evaluate the forecast skill and value of the technique. In this presentation we will present the concept of the optimisation algorithm and the outcome of the comparison. It will also be demonstrated how a decision maker should apply a probability forecast to maximise the economic benefit from it.
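
    The Brier skill score used here to evaluate the probability forecasts can be sketched as below. The sketch assumes binary outcomes (event occurred or not) and a climatological reference forecast equal to the observed event frequency; the abstract does not state which reference the authors used, and the forecast values are hypothetical.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    pairs = zip(probabilities, outcomes)
    return sum((p - o) ** 2 for p, o in pairs) / len(outcomes)


def brier_skill_score(probabilities, outcomes):
    """Skill relative to a constant climatological forecast (assumed reference)."""
    climatology = sum(outcomes) / len(outcomes)
    reference = brier_score([climatology] * len(outcomes), outcomes)
    return 1.0 - brier_score(probabilities, outcomes) / reference


# Hypothetical forecasts of "precipitation above threshold" for five days.
observed = [1, 0, 0, 1, 0]
forecast = [0.8, 0.1, 0.3, 0.6, 0.2]
skill = brier_skill_score(forecast, observed)
```

    A score of 1 corresponds to a perfect forecast, 0 to no improvement over climatology, and negative values to a forecast that is worse than climatology.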

  18. Predicting physical activity energy expenditure using accelerometry in adults from sub-Sahara Africa.

    PubMed

    Assah, Felix K; Ekelund, Ulf; Brage, Soren; Corder, Kirsten; Wright, Antony; Mbanya, Jean C; Wareham, Nicholas J

    2009-08-01

    Lack of physical activity may be an important etiological factor in the current epidemiological transition characterized by increasing prevalence of obesity and chronic diseases in sub-Sahara Africa. However, there is a dearth of data on objectively measured physical activity energy expenditure (PAEE) in this region. We sought to develop regression equations using body composition and accelerometer counts to predict PAEE. We conducted a cross-sectional study of 33 adult volunteers from an urban (n = 16) and a rural (n = 17) residential site in Cameroon. Energy expenditure was measured by doubly labeled water (DLW) over a period of seven consecutive days. Simultaneously, a hip-mounted Actigraph accelerometer recorded body movement. PAEE prediction equations were derived using accelerometer counts, age, sex, and body composition variables, and cross-validated by the jack-knife method. The Bland and Altman limits of agreement (LOAs) approach was used to assess agreement. Our results show that PAEE (kJ/kg/day) was significantly and positively correlated with activity counts from the accelerometer (r = 0.37, P = 0.03). The derived equations explained 14-40% of the variance in PAEE. Age, sex, and accelerometer counts together explained 34% of the variance in PAEE, with accelerometer counts alone explaining 14%. The LOAs between DLW and the derived equations were wide, with predicted PAEE being up to 60 kJ/kg/day below or above the measured value. In summary, the derived equations performed better than existing published equations in predicting PAEE from accelerometer counts in this population. Accelerometry could be used to predict PAEE in this population and, therefore, has important applications for monitoring population levels of total physical activity patterns. PMID:19247268
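
    The two validation steps named in this abstract, jack-knife (leave-one-out) cross-validation of a prediction equation and Bland and Altman limits of agreement, can be sketched together. The sketch assumes a single predictor (accelerometer counts) and a simple linear fit, whereas the study also used age, sex and body composition; all numbers and function names are hypothetical.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def leave_one_out_predictions(xs, ys):
    """Predict each subject from an equation fitted without that subject."""
    predictions = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(a + b * xs[i])
    return predictions


def limits_of_agreement(measured, predicted):
    """Bland-Altman 95% limits: mean difference +/- 1.96 SD of differences."""
    diffs = [p - m for p, m in zip(predicted, measured)]
    bias = statistics.fmean(diffs)
    spread = 1.96 * statistics.stdev(diffs)
    return bias - spread, bias + spread


# Hypothetical data: accelerometer counts (arbitrary units) and PAEE (kJ/kg/day).
counts = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
paee = [30.0, 42.0, 47.0, 62.0, 66.0, 80.0]
predicted = leave_one_out_predictions(counts, paee)
lower, upper = limits_of_agreement(paee, predicted)
```

    Because each prediction comes from an equation that never saw that subject, the resulting limits of agreement describe out-of-sample error rather than goodness of fit.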

  19. Predicting physical activity energy expenditure using accelerometry in adults from sub-Sahara Africa

    PubMed Central

    Assah, Felix K.; Ekelund, Ulf; Brage, Soren; Corder, Kirsten; Wright, Antony; Mbanya, Jean Claude; Wareham, Nicholas J.

    2009-01-01

    Lack of physical activity may be an important etiological factor in the current epidemiological transition characterised by increasing prevalence of obesity and chronic diseases in sub-Sahara Africa. However, there is a dearth of data on objectively measured physical activity energy expenditure (PAEE) in this region. We sought to develop regression equations using body composition and accelerometer counts to predict PAEE. We conducted a cross-sectional study of 33 adult volunteers from an urban (n=16) and a rural (n=17) residential site in Cameroon. Energy expenditure was measured by doubly labelled water (DLW) over a period of 7 consecutive days. Simultaneously, a hip-mounted Actigraph accelerometer recorded body movement. PAEE prediction equations were derived using accelerometer counts, age, sex and body composition variables, and cross-validated by the jack-knife method. The Bland and Altman limits of agreement (LOA) approach was used to assess agreement. Our results show that PAEE (kJ kg⁻¹ day⁻¹) was significantly and positively correlated with activity counts from the accelerometer (r=0.37, p=0.03). The derived equations explained 14 to 40% of the variance in PAEE. Age, sex and accelerometer counts together explained 34% of the variance in PAEE, with accelerometer counts alone explaining 14%. The LOA between DLW and the derived equations were wide, with predicted PAEE being up to 60 kJ kg⁻¹ day⁻¹ below or above the measured value. In summary, the derived equations performed better than existing published equations in predicting PAEE from accelerometer counts in this population. Accelerometry could be used to predict PAEE in this population and therefore has important applications for monitoring population levels of total physical activity patterns. PMID:19247268

  20. The projack: a resampling approach to correct for ranking bias in high-throughput studies

    PubMed Central

    Zhou, Yi-Hui; Wright, Fred A.

    2016-01-01

    The problem of ranked inference arises in a number of settings, for which the investigator wishes to perform parameter inference after ordering a set of m statistics. In contrast to inference for a single hypothesis, the ranking procedure introduces considerable bias, a problem known as the “winner's curse” in genetic association. We introduce the projack (for Prediction by Re-Ordered Jackknife and Cross-Validation, K-fold). The projack is a resampling-based procedure that provides low-bias estimates of the expected ranked effect size parameter for a set of possibly correlated z statistics. The approach is flexible, and has wide applicability to high-dimensional datasets, including those arising from genomics platforms. Initially motivated for the setting where original data are available for resampling, the projack can be extended to the situation where only the vector of z values is available. We illustrate the projack for correction of the winner's curse in genetic association, although it can be used much more generally. PMID:26040912

  1. A hybrid method for prediction and repositioning of drug Anatomical Therapeutic Chemical classes.

    PubMed

    Chen, Lei; Lu, Jing; Zhang, Ning; Huang, Tao; Cai, Yu-Dong

    2014-04-01

    In the Anatomical Therapeutic Chemical (ATC) classification system, therapeutic drugs are divided into 14 main classes according to the organ or system on which they act and their chemical, pharmacological and therapeutic properties. This system, recommended by the World Health Organization (WHO), provides a global standard for classifying medical substances and serves as a tool for international drug utilization research to improve quality of drug use. In view of this, it is necessary to develop effective computational prediction methods to identify the ATC-class of a given drug, which thereby could facilitate further analysis of this system. In this study, we initiated an attempt to develop a prediction method and to gain insights from it by utilizing ontology information of drug compounds. Since only about one-fourth of drugs in the ATC classification system have ontology information, a hybrid prediction method combining the ontology information, chemical interaction information and chemical structure information of drug compounds was proposed for the prediction of drug ATC-classes. As a result, by using the Jackknife test, the first-order prediction accuracies for identifying the 14 main ATC-classes in the training dataset, the internal validation dataset and the external validation dataset were 75.90%, 75.70% and 66.36%, respectively. Analysis of some samples with false-positive predictions in the internal and external validation datasets indicated that some of them may even have a relationship with the false-positive predicted ATC-class, suggesting novel uses of these drugs. It is conceivable that the proposed method could be used as an efficient tool to identify ATC-classes of novel drugs or to discover novel uses of known drugs. PMID:24492783
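
    The "Jackknife test" in this abstract is a leave-one-out evaluation: each sample is predicted by a model built from all the others, and the fraction predicted correctly is reported. The sketch below shows only that evaluation loop. A 1-nearest-neighbour rule is swapped in as a stand-in classifier, since the paper's hybrid predictor is not described in enough detail here; the feature vectors and labels are hypothetical.

```python
def nearest_neighbour(train_x, train_y, query):
    """Stand-in classifier: label of the closest training sample."""
    def squared_distance(j):
        return sum((a - b) ** 2 for a, b in zip(train_x[j], query))
    return train_y[min(range(len(train_x)), key=squared_distance)]


def jackknife_accuracy(samples, labels, predict):
    """Leave-one-out accuracy of `predict` over the whole dataset."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Hypothetical two-class data with one sample sitting in the "wrong" cluster.
samples = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (4.5, 4.5)]
labels = ["A", "A", "A", "B", "B", "B", "A"]
accuracy = jackknife_accuracy(samples, labels, nearest_neighbour)
```

    Every sample serves once as the test case, so the result does not depend on a random train/test split, which is why this test is popular for small benchmark datasets.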

  2. A comparison of ROC inferred from FROC and conventional ROC

    NASA Astrophysics Data System (ADS)

    McEntee, Mark F.; Littlefair, Stephen; Pietrzyk, Mariusz W.

    2014-03-01

    This study aims to determine whether receiver operating characteristic (ROC) scores inferred from free-response receiver operating characteristic (FROC) data are equivalent to conventional ROC scores for the same readers and cases. Forty-five examining radiologists of the American Board of Radiology independently reviewed 47 PA chest radiographs under at least two conditions. Thirty-seven cases had abnormal findings and 10 cases had normal findings. Half the readers were asked first to locate any visualized lung nodules, mark them and assign a level of confidence [the FROC mark-rating pair], and second to give an overall rating to the entire image on the same scale [the ROC score]. The second half of the readers gave the ROC rating first, followed by the FROC mark-rating pairs. A normal image was represented with number 1 and malignant lesions with numbers 2-5. A jackknife free-response receiver operating characteristic (JAFROC) and an inferred ROC (infROC) were calculated from the mark-rating pairs using JAFROC V4.1 software. ROC based on the overall rating of the image was calculated using DBM MRMC software, which was also used to compare infROC and ROC AUCs, treating the methods as modalities. Pearson's correlation coefficient and linear regression were used to examine their relationship using SPSS, version 21.0 (SPSS, Chicago, IL). The results of this study showed no significant difference between the ROC and inferred ROC AUCs (p≤0.25), while Pearson's correlation coefficient was 0.7 (p≤0.01). Inter-reader correlation calculated from Obuchowski-Rockette covariances ranged from 0.43 to 0.86, while intra-reader agreement was greater than previously reported, ranging from 0.68 to 0.82.

  3. A comparison of Australian and USA radiologists' performance in detection of breast cancer

    NASA Astrophysics Data System (ADS)

    Suleiman, Wasfi I.; Georgian-Smith, Dianne; Evanoff, Michael G.; Lewis, Sarah; McEntee, Mark F.

    2014-03-01

    The aim of the current work was to compare the performance of radiologists who read a higher number of cases to those who read a lower number, as well as to examine the effect of number of years of experience on performance. This study compares Australian and USA radiologists with differing levels of experience when reading mammograms. Thirty mammographic cases were presented to 41 radiologists, 21 from Australia and 20 from the USA. Readers were asked to locate and visualize cancer and assign a mark-rating pair with confidence levels from 1 to 5. A jackknife free-response receiver operating characteristic (JAFROC), inferred receiver operating characteristic (ROC), sensitivity, specificity and location sensitivity were calculated. A Mann-Whitney test was used to compare the performance of Australian and USA radiologists using SPSS software. The results showed that the USA radiologists sampled had more years of experience (p≤0.01) but read fewer mammograms per year (p≤0.03). Significantly higher sensitivity and location sensitivity (p≤0.001) were found for the Australian radiologists when experience and the number of mammograms read per year were taken into account. There were no differences between the two countries in overall performance measured by JAFROC and inferred ROC. For the most experienced radiologists within the Australian sample, ROC and location sensitivity were higher when compared to the least experienced. The greater number of years of experience of the USA radiologists did not result in an increase in any performance metric. The number of cases per year is a better predictor of improved diagnostic performance.

  4. High-resolution taxonomic profiling of the subgingival microbiome for biomarker discovery and periodontitis diagnosis.

    PubMed

    Szafranski, Szymon P; Wos-Oxley, Melissa L; Vilchez-Vargas, Ramiro; Jáuregui, Ruy; Plumeier, Iris; Klawonn, Frank; Tomasch, Jürgen; Meisinger, Christa; Kühnisch, Jan; Sztajer, Helena; Pieper, Dietmar H; Wagner-Döbler, Irene

    2015-02-01

    The oral microbiome plays a key role for caries, periodontitis, and systemic diseases. A method for rapid, high-resolution, robust taxonomic profiling of subgingival bacterial communities for early detection of periodontitis biomarkers would therefore be a useful tool for individualized medicine. Here, we used Illumina sequencing of the V1-V2 and V5-V6 hypervariable regions of the 16S rRNA gene. A sample stratification pipeline was developed in a pilot study of 19 individuals, 9 of whom had been diagnosed with chronic periodontitis. Five hundred twenty-three operational taxonomic units (OTUs) were obtained from the V1-V2 region and 432 from the V5-V6 region. Key periodontal pathogens like Porphyromonas gingivalis, Treponema denticola, and Tannerella forsythia could be identified at the species level with both primer sets. Principal coordinate analysis identified two outliers that were consistently independent of the hypervariable region and method of DNA extraction used. The linear discriminant analysis (LDA) effect size algorithm (LEfSe) identified 80 OTU-level biomarkers of periodontitis and 17 of health. Health- and periodontitis-related clusters of OTUs were identified using a connectivity analysis, and the results confirmed previous studies with several thousands of samples. A machine learning algorithm was developed which was trained on all but one sample and then predicted the diagnosis of the left-out sample (jackknife method). Using a combination of the 10 best biomarkers, 15 of 17 samples were correctly diagnosed. Training the algorithm on time-resolved community profiles might provide a highly sensitive tool to detect the onset of periodontitis. PMID:25452281

  5. Absolute and relative locations of earthquakes at Mount St. Helens, Washington, using continuous data: implications for magmatic processes: Chapter 4 in A volcano rekindled: the renewed eruption of Mount St. Helens, 2004-2006

    USGS Publications Warehouse

    Thelen, Weston A.; Crosson, Robert S.; Creager, Kenneth C.

    2008-01-01

    This study uses a combination of absolute and relative locations from earthquake multiplets to investigate the seismicity associated with the eruptive sequence at Mount St. Helens between September 23, 2004, and November 20, 2004. Multiplets, a prominent feature of seismicity during this time period, occurred as volcano-tectonic, hybrid, and low-frequency earthquakes spanning a large range of magnitudes and lifespans. Absolute locations were improved through the use of a new one-dimensional velocity model with excellent shallow constraints on P-wave velocities. We used jackknife tests to minimize possible biases in absolute and relative locations resulting from station outages and changing station configurations. In this paper, we show that earthquake hypocenters shallowed before the October 1 explosion along a north-dipping structure under the 1980-86 dome. Relative relocations of multiplets during the initial seismic unrest and ensuing eruption showed rather small source volumes before the October 1 explosion and larger tabular source volumes after October 5. All multiplets possess absolute locations very close to each other. However, the highly dissimilar waveforms displayed by each of the multiplets analyzed suggest that different sources and mechanisms were present within a very small source volume. We suggest that multiplets were related to pressurization of the conduit system that produced a stationary source that was highly stable over long time periods. On the basis of their response to explosions occurring in October 2004, earthquakes not associated with multiplets also appeared to be pressure dependent. The pressure source for these earthquakes appeared, however, to be different from the pressure source of the multiplets.

  6. Comparative analysis of different retrieval methods for mapping grassland leaf area index using airborne imaging spectroscopy

    NASA Astrophysics Data System (ADS)

    Atzberger, Clement; Darvishzadeh, Roshanak; Immitzer, Markus; Schlerf, Martin; Skidmore, Andrew; le Maire, Guerric

    2015-12-01

    Fine scale maps of vegetation biophysical variables are useful status indicators for monitoring and managing national parks and endangered habitats. Here, we assess in a comparative way four different retrieval methods for estimating leaf area index (LAI) in grassland: two radiative transfer model (RTM) inversion methods (one based on look-up-tables (LUT) and one based on predictive equations) and two statistical modelling methods (one partly, the other entirely based on in situ data). For prediction, spectral data were used that had been acquired over Majella National Park in Italy by the airborne hyperspectral HyMap instrument. To assess the performance of the four investigated models, the normalized root mean squared error (nRMSE) and coefficient of determination (R2) between estimates and in situ LAI measurements are reported (n = 41). Using a jackknife approach, we also quantified the accuracy and robustness of empirical models as a function of the size of the available calibration data set. The results of the study demonstrate that the LUT-based RTM inversion yields higher accuracies for LAI estimation (R2 = 0.91, nRMSE = 0.18) as compared to RTM inversions based on predictive equations (R2 = 0.79, nRMSE = 0.38). The two statistical methods yield accuracies similar to the LUT method. However, as expected, the accuracy and robustness of the statistical models decrease when the size of the calibration database is reduced to fewer samples. The results of this study are of interest for the remote sensing community developing improved inversion schemes for spaceborne hyperspectral sensors applicable to different vegetation types. The examples provided in this paper may also serve as illustrations for the drawbacks and advantages of physical and empirical models.
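
    The two accuracy measures reported in the record above can be sketched as follows. This is only an illustration: the abstract does not state how the RMSE is normalized or how R2 is defined, so the sketch assumes normalization by the mean of the in situ measurements and R2 as the squared Pearson correlation. The function names are hypothetical.

```python
import math


def nrmse(observed, predicted):
    """RMSE divided by the mean observation (assumed normalization)."""
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return math.sqrt(mse) / (sum(observed) / n)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observations and estimates."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)
```

    Note that, under these assumptions, estimates that are perfectly correlated with the measurements but biased still give R2 = 1 while nRMSE is greater than zero, which is one reason for reporting both.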

  7. Evaluation of Limiting Climatic Factors and Simulation of a Climatically Suitable Habitat for Chinese Sea Buckthorn

    PubMed Central

    Li, Guoqing; Du, Sheng; Guo, Ke

    2015-01-01

    Chinese sea buckthorn (Hippophae rhamnoides subsp. sinensis) has considerable economic potential and plays an important role in reclamation and soil and water conservation. For scientific cultivation of this species across China, we identified the key climatic factors and explored climatically suitable habitat in order to maximize survival of Chinese sea buckthorn using MaxEnt and GIS tools, based on 98 occurrence records from herbaria and publications and 13 climatic factors from Bioclim, Holdridge life zone and Kira's index variables. Our simulation showed that the MaxEnt model performance was significantly better than random, with an average test AUC value of 0.93 with 10-fold cross validation. A jackknife test and the regularized gain change, which were applied to the training algorithm, showed that precipitation of the driest month (PDM), annual precipitation (AP), coldness index (CI) and annual range of temperature (ART) were the most influential climatic factors in limiting the distribution of Chinese sea buckthorn, which explained 70.1% of the variation. The predicted map showed that the core of climatically suitable habitat was distributed from the southwest to northwest of Gansu, Ningxia, Shaanxi and Shanxi provinces, where the most influential climate variables were PDM of 1.0–7.0 mm, AP of 344.0–1089.0 mm, CI of -47.7–0.0°C, and ART of 26.1–45.0°C. We conclude that the distribution patterns of Chinese sea buckthorn are related to the northwest winter monsoon, the southwest summer monsoon and the southeast summer monsoon systems in China. PMID:26177033

  8. Validation of the ALARO-0 model within the EURO-CORDEX framework

    NASA Astrophysics Data System (ADS)

    Giot, O.; Termonia, P.; Degrauwe, D.; De Troch, R.; Caluwaerts, S.; Smet, G.; Berckmans, J.; Deckmyn, A.; De Cruz, L.; De Meutter, P.; Duerinckx, A.; Gerard, L.; Hamdi, R.; Van den Bergh, J.; Van Ginderachter, M.; Van Schaeybroeck, B.

    2015-10-01

    Using the regional climate model ALARO-0, the Royal Meteorological Institute of Belgium has performed two simulations of the past observed climate within the framework of the Coordinated Regional Climate Downscaling Experiment (CORDEX). The ERA-Interim reanalysis was used to drive the model for the period 1979-2010 on the EURO-CORDEX domain with two horizontal resolutions, 0.11° and 0.44°. ALARO-0 is characterised by the new microphysics scheme 3MT, which allows for a better representation of convective precipitation. In Kotlarski et al. (2014) several metrics assessing the performance in representing seasonal mean near-surface air temperature and precipitation are defined and the corresponding scores are calculated for an ensemble of models for different regions and seasons for the period 1989-2008. Of special interest within this ensemble is the ARPEGE model by the Centre National de Recherches Météorologiques (CNRM), which shares a large amount of core code with ALARO-0. Results show that ALARO-0 is capable of representing the European climate in an acceptable way as most of the ALARO-0 scores lie within the existing ensemble. However, for near-surface air temperature, some large biases, which are often also found in the ARPEGE results, persist. For precipitation, on the other hand, the ALARO-0 model produces some of the best scores within the ensemble and no clear resemblance to ARPEGE is found, which is attributed to the inclusion of 3MT. Additionally, a jackknife procedure is applied to the ALARO-0 results in order to test whether the scores are robust, by which we mean independent of the period used to calculate them. Periods of 20 years are sampled from the 32-year simulation and used to construct the 95 % confidence interval for each score. For most scores, these intervals are very small compared to the total ensemble spread, implying that model differences in the scores are significant.

  9. Validation of the ALARO-0 model within the EURO-CORDEX framework

    NASA Astrophysics Data System (ADS)

    Giot, Olivier; Termonia, Piet; Degrauwe, Daan; De Troch, Rozemien; Caluwaerts, Steven; Smet, Geert; Berckmans, Julie; Deckmyn, Alex; De Cruz, Lesley; De Meutter, Pieter; Duerinckx, Annelies; Gerard, Luc; Hamdi, Rafiq; Van den Bergh, Joris; Van Ginderachter, Michiel; Van Schaeybroeck, Bert

    2016-03-01

    Using the regional climate model ALARO-0, the Royal Meteorological Institute of Belgium and Ghent University have performed two simulations of the past observed climate within the framework of the Coordinated Regional Climate Downscaling Experiment (CORDEX). The ERA-Interim reanalysis was used to drive the model for the period 1979-2010 on the EURO-CORDEX domain with two horizontal resolutions, 0.11 and 0.44°. ALARO-0 is characterised by the new microphysics scheme 3MT, which allows for a better representation of convective precipitation. In Kotlarski et al. (2014) several metrics assessing the performance in representing seasonal mean near-surface air temperature and precipitation are defined and the corresponding scores are calculated for an ensemble of models for different regions and seasons for the period 1989-2008. Of special interest within this ensemble is the ARPEGE model by the Centre National de Recherches Météorologiques (CNRM), which shares a large amount of core code with ALARO-0. Results show that ALARO-0 is capable of representing the European climate in an acceptable way as most of the ALARO-0 scores lie within the existing ensemble. However, for near-surface air temperature, some large biases, which are often also found in the ARPEGE results, persist. For precipitation, on the other hand, the ALARO-0 model produces some of the best scores within the ensemble and no clear resemblance to ARPEGE is found, which is attributed to the inclusion of 3MT. Additionally, a jackknife procedure is applied to the ALARO-0 results in order to test whether the scores are robust, meaning independent of the period used to calculate them. Periods of 20 years are sampled from the 32-year simulation and used to construct the 95 % confidence interval for each score. For most scores, these intervals are very small compared to the total ensemble spread, implying that model differences in the scores are significant.
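
    The period-resampling step described in the record above might look roughly like the sketch below. The abstract does not say how the 20-year periods are drawn or how the interval is formed, so this assumes random subsets of 20 distinct years and a simple percentile interval. The function name, its parameters and the use of a generic `score` callable are all hypothetical.

```python
import random


def period_score_interval(yearly_values, score, window=20, n_draws=1000, seed=0):
    """Approximate 95 % interval of `score` over random subsets of years.

    Assumption: each draw picks `window` distinct years at random from the
    full simulation; the study itself may have sampled periods differently.
    """
    rng = random.Random(seed)
    scores = sorted(
        score(rng.sample(yearly_values, window)) for _ in range(n_draws)
    )
    lower = scores[int(0.025 * (n_draws - 1))]
    upper = scores[int(0.975 * (n_draws - 1))]
    return lower, upper
```

    A call such as `period_score_interval(annual_bias, statistics.mean)`, with `annual_bias` a hypothetical list of 32 yearly values, returns the lower and upper bounds. A narrow interval relative to the spread between models would suggest the score depends little on the period chosen.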

  10. Improved Classification of Lung Cancer Tumors Based on Structural and Physicochemical Properties of Proteins Using Data Mining Models

    PubMed Central

    Ramani, R. Geetha; Jacob, Shomona Gracia

    2013-01-01

    Detecting divergence between oncogenic tumors plays a pivotal role in cancer diagnosis and therapy. This research work was focused on designing a computational strategy to predict the class of lung cancer tumors from the structural and physicochemical properties (1497 attributes) of protein sequences obtained from genes defined by microarray analysis. The proposed methodology involved the use of hybrid feature selection techniques (gain ratio and correlation based subset evaluators with Incremental Feature Selection) followed by Bayesian Network prediction to discriminate lung cancer tumors as Small Cell Lung Cancer (SCLC), Non-Small Cell Lung Cancer (NSCLC) and the COMMON classes. Moreover, this methodology eliminated the need for extensive data cleansing strategies on the protein properties and revealed the optimal and minimal set of features that contributed to lung cancer tumor classification with an improved accuracy compared to previous work. We also attempted to predict via supervised clustering the possible clusters in the lung tumor data. Our results revealed that supervised clustering algorithms exhibited poor performance in differentiating the lung tumor classes. Hybrid feature selection identified the distribution of solvent accessibility, polarizability and hydrophobicity as the highest ranked features with Incremental feature selection and Bayesian Network prediction generating the optimal Jack-knife cross validation accuracy of 87.6%. Precise categorization of oncogenic genes causing SCLC and NSCLC based on the structural and physicochemical properties of their protein sequences is expected to unravel the functionality of proteins that are essential in maintaining the genomic integrity of a cell and also act as an informative source for drug design, targeting essential protein properties and their composition that are found to exist in lung cancer tumors. PMID:23505559

  11. Phylogenetic diversity and ecological pattern of ammonia-oxidizing archaea in the surface sediments of the western Pacific.

    PubMed

    Cao, Huiluo; Hong, Yiguo; Li, Meng; Gu, Ji-Dong

    2011-11-01

    The phylogenetic diversity of ammonia-oxidizing archaea (AOA) was surveyed in the surface sediments from the northern part of the South China Sea (SCS). The distribution pattern of AOA in the western Pacific was discussed by comparing the SCS with other areas in the western Pacific, including the Changjiang Estuary and the adjacent East China Sea, where high input of anthropogenic nitrogen was evident, the tropical West Pacific Continental Margins close to the Philippines, the deep-sea methane seep sediments in the Okhotsk Sea, the cold deep sea of the Northeastern Japan Sea, and the hydrothermal field in the Southern Okinawa Trough. These various environments provide a wide spectrum of physical and chemical conditions for a better understanding of the distribution pattern and diversity of AOA in the western Pacific. Under these different conditions, the distinct community composition between shallow and deep-sea sediments was clearly delineated based on the UniFrac PCoA and Jackknife Environmental Cluster analyses. Phylogenetic analyses showed that a few ammonia-oxidizing archaeal subclades in the marine water column/sediment clade and endemic lineages were indicative phylotypes for some environments. Higher phylogenetic diversity was observed in the Philippines, while lower diversity was observed in the hydrothermal vent habitat. Water depth, possibly together with other environmental factors, could be the main driving force shaping the phylogenetic diversity of AOA observed, not only in the SCS but also in the whole western Pacific. The multivariate regression tree analysis also consistently supported this observation. Moreover, the roles of current and other climate factors were also discussed in relation to phylogenetic diversity. The information collectively provides important insights into the ecophysiological requirements of uncultured ammonia-oxidizing archaeal lineages in the western Pacific Ocean. PMID:21748268

  12. A multi-scale study of Orthoptera species richness and human population size controlling for sampling effort

    NASA Astrophysics Data System (ADS)

    Cantarello, Elena; Steck, Claude E.; Fontana, Paolo; Fontaneto, Diego; Marini, Lorenzo; Pautasso, Marco

    2010-03-01

    Recent large-scale studies have shown that biodiversity-rich regions also tend to be densely populated areas. The most obvious explanation is that biodiversity and human beings tend to match the distribution of energy availability, environmental stability and/or habitat heterogeneity. However, the species-people correlation can also be an artefact, as more populated regions could show more species because of a more thorough sampling. Few studies have tested this sampling bias hypothesis. Using a newly collated dataset, we studied whether Orthoptera species richness is related to human population size in Italy’s regions (average area 15,000 km2) and provinces (2,900 km2). As expected, the observed number of species increases significantly with increasing human population size for both grain sizes, although the proportion of variance explained is minimal at the provincial level. However, variations in observed Orthoptera species richness are primarily associated with the available number of records, which is in turn well correlated with human population size (at least at the regional level). Estimated Orthoptera species richness (Chao2 and Jackknife) also increases with human population size both for regions and provinces. Both for regions and provinces, this increase is not significant when controlling for variation in area and number of records. Our study confirms the hypothesis that broad-scale human population-biodiversity correlations can in some cases be artefactual. More systematic sampling of less studied taxa such as invertebrates is necessary to ascertain whether biogeographical patterns persist when sampling effort is kept constant or included in models.
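
    For readers unfamiliar with the two richness estimators named in the record above, a minimal sketch follows. It assumes incidence data (one species list per sample), the first-order jackknife and the bias-corrected form of Chao2; the abstract does not say which jackknife order or Chao2 variant the authors used. The function names are hypothetical.

```python
def incidence_counts(samples):
    """Map each species to the number of samples in which it occurs."""
    counts = {}
    for sample in samples:
        for species in set(sample):
            counts[species] = counts.get(species, 0) + 1
    return counts


def jackknife1(samples):
    """First-order jackknife: S_obs + Q1 * (m - 1) / m."""
    m = len(samples)
    counts = incidence_counts(samples)
    q1 = sum(1 for c in counts.values() if c == 1)  # species in exactly one sample
    return len(counts) + q1 * (m - 1) / m


def chao2(samples):
    """Bias-corrected Chao2: S_obs + ((m-1)/m) * Q1*(Q1-1) / (2*(Q2+1))."""
    m = len(samples)
    counts = incidence_counts(samples)
    q1 = sum(1 for c in counts.values() if c == 1)
    q2 = sum(1 for c in counts.values() if c == 2)  # species in exactly two samples
    return len(counts) + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))
```

    Both estimators add a correction to the observed richness that grows with the number of species recorded in only one sample, so they respond to sampling effort in a way the raw species count does not.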

  13. Food selection among Atlantic Coast seaducks in relation to historic food habits

    USGS Publications Warehouse

    Perry, M.C.; Osenton, P.C.; Wells-Berlin, A. M.; Kidwell, D.M.

    2005-01-01

    Food selection among Atlantic Coast seaducks during 1999-2005 was determined from hunter-killed ducks and compared to data from the historic food habits file (1885-1985) for major migrational and wintering areas in the Atlantic Flyway. Food selection was determined by analyses of the gullet (esophagus and proventriculus) and gizzard of 860 ducks and summarized by aggregate percent for each species. When sample size was adequate, comparisons were made among age and sex groupings and also among local sites in major habitat areas. Common eiders in Maine and the Canadian Maritimes fed predominantly (53%) on the blue mussel (Mytilus edulis). Scoters in Massachusetts, Maine, and the Canadian Maritimes fed predominantly on the blue mussel (46%), Atlantic jackknife clam (Ensis directus; 19%), and Atlantic surf clam (Spisula solidissima; 15%), whereas scoters in the Chesapeake Bay fed predominantly on hooked mussel (Ischadium recurvum; 42%), the stout razor clam (Tagelus plebeius; 22%), and dwarf surf clam (Mulinia lateralis; 15%). The amethyst gem clam (Gemma gemma) was the predominant food (45%) of long-tailed ducks in Chesapeake Bay. Buffleheads and common goldeneyes fed on a mixed diet of mollusks and soft-bodied invertebrates (amphipods, isopods and polychaetes). No major differences were noticed between the sexes in regard to food selection in any of the wintering areas. Comparisons to historic food habits in all areas failed to detect major differences. However, several invertebrate species recorded in historic samples were not found in current samples, and two invasive species (Atlantic rangia, Rangia cuneata, and green crab, Carcinus maenas) were recorded in modern samples but not in historic samples. Benthic sampling in areas where seaducks were collected showed a close correlation between consumption and availability. Each seaduck species appears to fill a unique niche in regard to feeding ecology, although there is much overlap of prey species selected. Understanding the trophic relationships of seaducks in coastal wintering areas will give managers a better understanding of habitat changes in regard to future environmental perturbations.

  14. Predicting Transcriptional Activity of Multiple Site p53 Mutants Based on Hybrid Properties

    PubMed Central

    Xu, Zhongping; Huang, Yun; Kong, Xiangyin; Cai, Yu-Dong; Chou, Kuo-Chen

    2011-01-01

    Mutated p53, an important tumor suppressor protein, has been found in many kinds of human cancers, and restoring active p53 would lead to tumor regression. In this work, we developed a new computational method to predict the transcriptional activity for one-, two-, three- and four-site p53 mutants, respectively. With the approach from the general form of pseudo amino acid composition, we used eight types of features to represent the mutation and then selected the optimal prediction features based on the maximum relevance, minimum redundancy, and incremental feature selection methods. The Matthews correlation coefficients (MCC) obtained by using the nearest neighbor algorithm and jackknife cross validation for one-, two-, three- and four-site p53 mutants were 0.678, 0.314, 0.705, and 0.907, respectively. Further analysis of the optimal feature set revealed that the 2D (two-dimensional) structure features composed the largest part of the optimal feature set and may have played the most important roles in all four types of p53 mutant active status prediction. The optimal feature sets, especially those at the top level, also demonstrated that the 3D structure features, conservation, and physicochemical and biochemical properties of amino acids near the mutation site played quite important roles in p53 mutant active status prediction. Our study has provided a new and promising approach for finding functionally important sites and the relevant features for in-depth study of the p53 protein and its action mechanism. PMID:21857971
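
    The evaluation scheme named in the record above (nearest neighbor prediction, jackknife cross validation, MCC) can be illustrated with the minimal sketch below. It is not the authors' implementation: it assumes plain numeric feature vectors, squared Euclidean distance and binary labels, and all function names are hypothetical.

```python
import math


def nearest_neighbor_label(train, query):
    """Label of the training point closest to `query` (squared Euclidean)."""
    best = min(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    return best[1]


def jackknife_predictions(data):
    """Predict each sample from all the others (leave-one-out)."""
    predictions = []
    for i, (features, _) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        predictions.append(nearest_neighbor_label(rest, features))
    return predictions


def mcc(true_labels, predicted_labels, positive=1):
    """Matthews correlation coefficient for binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(true_labels, predicted_labels):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

    Given `data` as a list of `(feature_tuple, label)` pairs, `mcc([label for _, label in data], jackknife_predictions(data))` yields one score per dataset, analogous to the per-mutant-class values quoted in the abstract.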

  15. Do adolescent Ecstasy users have different attitudes towards drugs when compared to Marijuana users?

    PubMed Central

    Martins, Silvia S.; Storr, Carla L.; Alexandre, Pierre K.; Chilcoat, Howard D.

    2008-01-01

    Background Perceived risk and attitudes about the consequences of drug use, perceptions of others' expectations and self-efficacy influence the intent to try drugs and to continue drug use once use has started. We examine associations between adolescents’ attitudes and beliefs towards ecstasy use; because most ecstasy users have a history of marijuana use, we estimate the association for three groups of adolescents: non-marijuana/ecstasy users, marijuana users (used marijuana at least once but never used ecstasy) and ecstasy users (used ecstasy at least once). Methods Data came from 5,049 adolescents aged 12–18 years who had complete weighted data information in Round 2 of the Restricted Use Files (RUF) of the National Survey of Parents and Youth (NSPY). Data were analyzed using jackknife weighted multinomial logistic regression models. Results Adolescent marijuana and ecstasy users were more likely to approve of marijuana and ecstasy use as compared to non-drug using youth. Adolescent marijuana and ecstasy users were more likely to have close friends who approved of ecstasy as compared to non-drug using youth. The magnitudes of these two associations were stronger for ecstasy use than for marijuana use in the final adjusted model. Our final adjusted model shows that approval of marijuana and ecstasy use was more strongly associated with marijuana and ecstasy use in adolescence than perceived risk in using both drugs. Conclusion Information about the risks and consequences of ecstasy use needs to be presented to adolescents in order to attempt to reduce adolescents’ approval of ecstasy use as well as ecstasy experimentation. PMID:18068314

  16. Demographic history and rare allele sharing among human populations

    PubMed Central

    Gravel, Simon; Henn, Brenna M.; Gutenkunst, Ryan N.; Indap, Amit R.; Marth, Gabor T.; Clark, Andrew G.; Yu, Fuli; Gibbs, Richard A.; Bustamante, Carlos D.; Altshuler, David L.; Durbin, Richard M.; Abecasis, Gonçalo R.; Bentley, David R.; Chakravarti, Aravinda; Clark, Andrew G.; Collins, Francis S.; De La Vega, Francisco M.; Donnelly, Peter; Egholm, Michael; Flicek, Paul; Gabriel, Stacey B.; Gibbs, Richard A.; Knoppers, Bartha M.; Lander, Eric S.; Lehrach, Hans; Mardis, Elaine R.; McVean, Gil A.; Nickerson, Debbie A.; Peltonen, Leena; Schafer, Alan J.; Sherry, Stephen T.; Wang, Jun; Wilson, Richard K.; Gibbs, Richard A.; Deiros, David; Metzker, Mike; Muzny, Donna; Reid, Jeff; Wheeler, David; Wang, Jun; Li, Jingxiang; Jian, Min; Li, Guoqing; Li, Ruiqiang; Liang, Huiqing; Tian, Geng; Wang, Bo; Wang, Jian; Wang, Wei; Yang, Huanming; Zhang, Xiuqing; Zheng, Huisong; Lander, Eric S.; Altshuler, David L.; Ambrogio, Lauren; Bloom, Toby; Cibulskis, Kristian; Fennell, Tim J.; Gabriel, Stacey B.; Jaffe, David B.; Shefler, Erica; Sougnez, Carrie L.; Bentley, David R.; Gormley, Niall; Humphray, Sean; Kingsbury, Zoya; Koko-Gonzales, Paula; Stone, Jennifer; McKernan, Kevin J.; Costa, Gina L.; Ichikawa, Jeffry K.; Lee, Clarence C.; Sudbrak, Ralf; Lehrach, Hans; Borodina, Tatiana A.; Dahl, Andreas; Davydov, Alexey N.; Marquardt, Peter; Mertes, Florian; Nietfeld, Wilfiried; Rosenstiel, Philip; Schreiber, Stefan; Soldatov, Aleksey V.; Timmermann, Bernd; Tolzmann, Marius; Egholm, Michael; Affourtit, Jason; Ashworth, Dana; Attiya, Said; Bachorski, Melissa; Buglione, Eli; Burke, Adam; Caprio, Amanda; Celone, Christopher; Clark, Shauna; Conners, David; Desany, Brian; Gu, Lisa; Guccione, Lorri; Kao, Kalvin; Kebbel, Andrew; Knowlton, Jennifer; Labrecque, Matthew; McDade, Louise; Mealmaker, Craig; Minderman, Melissa; Nawrocki, Anne; Niazi, Faheem; Pareja, Kristen; Ramenani, Ravi; Riches, David; Song, Wanmin; Turcotte, Cynthia; Wang, Shally; Mardis, Elaine R.; Wilson, Richard K.; Dooling, 
David; Fulton, Lucinda; Fulton, Robert; Weinstock, George; Durbin, Richard M.; Burton, John; Carter, David M.; Churcher, Carol; Coffey, Alison; Cox, Anthony; Palotie, Aarno; Quail, Michael; Skelly, Tom; Stalker, James; Swerdlow, Harold P.; Turner, Daniel; De Witte, Anniek; Giles, Shane; Gibbs, Richard A.; Wheeler, David; Bainbridge, Matthew; Challis, Danny; Sabo, Aniko; Yu, Fuli; Yu, Jin; Wang, Jun; Fang, Xiaodong; Guo, Xiaosen; Li, Ruiqiang; Li, Yingrui; Luo, Ruibang; Tai, Shuaishuai; Wu, Honglong; Zheng, Hancheng; Zheng, Xiaole; Zhou, Yan; Li, Guoqing; Wang, Jian; Yang, Huanming; Marth, Gabor T.; Garrison, Erik P.; Huang, Weichun; Indap, Amit; Kural, Deniz; Lee, Wan-Ping; Leong, Wen Fung; Quinlan, Aaron R.; Stewart, Chip; Stromberg, Michael P.; Ward, Alistair N.; Wu, Jiantao; Lee, Charles; Mills, Ryan E.; Shi, Xinghua; Daly, Mark J.; DePristo, Mark A.; Altshuler, David L.; Ball, Aaron D.; Banks, Eric; Bloom, Toby; Browning, Brian L.; Cibulskis, Kristian; Fennell, Tim J.; Garimella, Kiran V.; Grossman, Sharon R.; Handsaker, Robert E.; Hanna, Matt; Hartl, Chris; Jaffe, David B.; Kernytsky, Andrew M.; Korn, Joshua M.; Li, Heng; Maguire, Jared R.; McCarroll, Steven A.; McKenna, Aaron; Nemesh, James C.; Philippakis, Anthony A.; Poplin, Ryan E.; Price, Alkes; Rivas, Manuel A.; Sabeti, Pardis C.; Schaffner, Stephen F.; Shefler, Erica; Shlyakhter, Ilya A.; Cooper, David N.; Ball, Edward V.; Mort, Matthew; Phillips, Andrew D.; Stenson, Peter D.; Sebat, Jonathan; Makarov, Vladimir; Ye, Kenny; Yoon, Seungtai C.; Bustamante, Carlos D.; Clark, Andrew G.; Boyko, Adam; Degenhardt, Jeremiah; Gravel, Simon; Gutenkunst, Ryan N.; Kaganovich, Mark; Keinan, Alon; Lacroute, Phil; Ma, Xin; Reynolds, Andy; Clarke, Laura; Flicek, Paul; Cunningham, Fiona; Herrero, Javier; Keenen, Stephen; Kulesha, Eugene; Leinonen, Rasko; McLaren, William M.; Radhakrishnan, Rajesh; Smith, Richard E.; Zalunin, Vadim; Zheng-Bradley, Xiangqun; Korbel, Jan O.; Stütz, Adrian M.; Humphray, Sean; Bauer, Markus; 
Cheetham, R. Keira; Cox, Tony; Eberle, Michael; James, Terena; Kahn, Scott; Murray, Lisa; Chakravarti, Aravinda; Ye, Kai; De La Vega, Francisco M.; Fu, Yutao; Hyland, Fiona C. L.; Manning, Jonathan M.; McLaughlin, Stephen F.; Peckham, Heather E.; Sakarya, Onur; Sun, Yongming A.; Tsung, Eric F.; Batzer, Mark A.; Konkel, Miriam K.; Walker, Jerilyn A.; Sudbrak, Ralf; Albrecht, Marcus W.; Amstislavskiy, Vyacheslav S.; Herwig, Ralf; Parkhomchuk, Dimitri V.; Sherry, Stephen T.; Agarwala, Richa; Khouri, Hoda M.; Morgulis, Aleksandr O.; Paschall, Justin E.; Phan, Lon D.; Rotmistrovsky, Kirill E.; Sanders, Robert D.; Shumway, Martin F.; Xiao, Chunlin; McVean, Gil A.; Auton, Adam; Iqbal, Zamin; Lunter, Gerton; Marchini, Jonathan L.; Moutsianas, Loukas; Myers, Simon; Tumian, Afidalina; Desany, Brian; Knight, James; Winer, Roger; Craig, David W.; Beckstrom-Sternberg, Steve M.; Christoforides, Alexis; Kurdoglu, Ahmet A.; Pearson, John V.; Sinari, Shripad A.; Tembe, Waibhav D.; Haussler, David; Hinrichs, Angie S.; Katzman, Sol J.; Kern, Andrew; Kuhn, Robert M.; Przeworski, Molly; Hernandez, Ryan D.; Howie, Bryan; Kelley, Joanna L.; Melton, S. Cord; Abecasis, Gonçalo R.; Li, Yun; Anderson, Paul; Blackwell, Tom; Chen, Wei; Cookson, William O.; Ding, Jun; Kang, Hyun Min; Lathrop, Mark; Liang, Liming; Moffatt, Miriam F.; Scheet, Paul; Sidore, Carlo; Snyder, Matthew; Zhan, Xiaowei; Zöllner, Sebastian; Awadalla, Philip; Casals, Ferran; Idaghdour, Youssef; Keebler, John; Stone, Eric A.; Zilversmit, Martine; Jorde, Lynn; Xing, Jinchuan; Eichler, Evan E.; Aksay, Gozde; Alkan, Can; Hajirasouliha, Iman; Hormozdiari, Fereydoun; Kidd, Jeffrey M.; Sahinalp, S. 
Cenk; Sudmant, Peter H.; Mardis, Elaine R.; Chen, Ken; Chinwalla, Asif; Ding, Li; Koboldt, Daniel C.; McLellan, Mike D.; Dooling, David; Weinstock, George; Wallis, John W.; Wendl, Michael C.; Zhang, Qunyuan; Durbin, Richard M.; Albers, Cornelis A.; Ayub, Qasim; Balasubramaniam, Senduran; Barrett, Jeffrey C.; Carter, David M.; Chen, Yuan; Conrad, Donald F.; Danecek, Petr; Dermitzakis, Emmanouil T.; Hu, Min; Huang, Ni; Hurles, Matt E.; Jin, Hanjun; Jostins, Luke; Keane, Thomas M.; Le, Si Quang; Lindsay, Sarah; Long, Quan; MacArthur, Daniel G.; Montgomery, Stephen B.; Parts, Leopold; Stalker, James; Tyler-Smith, Chris; Walter, Klaudia; Zhang, Yujun; Gerstein, Mark B.; Snyder, Michael; Abyzov, Alexej; Balasubramanian, Suganthi; Bjornson, Robert; Du, Jiang; Grubert, Fabian; Habegger, Lukas; Haraksingh, Rajini; Jee, Justin; Khurana, Ekta; Lam, Hugo Y. K.; Leng, Jing; Mu, Xinmeng Jasmine; Urban, Alexander E.; Zhang, Zhengdong; Li, Yingrui; Luo, Ruibang; Marth, Gabor T.; Garrison, Erik P.; Kural, Deniz; Quinlan, Aaron R.; Stewart, Chip; Stromberg, Michael P.; Ward, Alistair N.; Wu, Jiantao; Lee, Charles; Mills, Ryan E.; Shi, Xinghua; McCarroll, Steven A.; Banks, Eric; DePristo, Mark A.; Handsaker, Robert E.; Hartl, Chris; Korn, Joshua M.; Li, Heng; Nemesh, James C.; Sebat, Jonathan; Makarov, Vladimir; Ye, Kenny; Yoon, Seungtai C.; Degenhardt, Jeremiah; Kaganovich, Mark; Clarke, Laura; Smith, Richard E.; Zheng-Bradley, Xiangqun; Korbel, Jan O.; Humphray, Sean; Cheetham, R. 
Keira; Eberle, Michael; Kahn, Scott; Murray, Lisa; Ye, Kai; De La Vega, Francisco M.; Fu, Yutao; Peckham, Heather E.; Sun, Yongming A.; Batzer, Mark A.; Konkel, Miriam K.; Walker, Jerilyn A.; Xiao, Chunlin; Iqbal, Zamin; Desany, Brian; Blackwell, Tom; Snyder, Matthew; Xing, Jinchuan; Eichler, Evan E.; Aksay, Gozde; Alkan, Can; Hajirasouliha, Iman; Hormozdiari, Fereydoun; Kidd, Jeffrey M.; Chen, Ken; Chinwalla, Asif; Ding, Li; McLellan, Mike D.; Wallis, John W.; Hurles, Matt E.; Conrad, Donald F.; Walter, Klaudia; Zhang, Yujun; Gerstein, Mark B.; Snyder, Michael; Abyzov, Alexej; Du, Jiang; Grubert, Fabian; Haraksingh, Rajini; Jee, Justin; Khurana, Ekta; Lam, Hugo Y. K.; Leng, Jing; Mu, Xinmeng Jasmine; Urban, Alexander E.; Zhang, Zhengdong; Gibbs, Richard A.; Bainbridge, Matthew; Challis, Danny; Coafra, Cristian; Dinh, Huyen; Kovar, Christie; Lee, Sandy; Muzny, Donna; Nazareth, Lynne; Reid, Jeff; Sabo, Aniko; Yu, Fuli; Yu, Jin; Marth, Gabor T.; Garrison, Erik P.; Indap, Amit; Leong, Wen Fung; Quinlan, Aaron R.; Stewart, Chip; Ward, Alistair N.; Wu, Jiantao; Cibulskis, Kristian; Fennell, Tim J.; Gabriel, Stacey B.; Garimella, Kiran V.; Hartl, Chris; Shefler, Erica; Sougnez, Carrie L.; Wilkinson, Jane; Clark, Andrew G.; Gravel, Simon; Grubert, Fabian; Clarke, Laura; Flicek, Paul; Smith, Richard E.; Zheng-Bradley, Xiangqun; Sherry, Stephen T.; Khouri, Hoda M.; Paschall, Justin E.; Shumway, Martin F.; Xiao, Chunlin; McVean, Gil A.; Katzman, Sol J.; Abecasis, Gonçalo R.; Blackwell, Tom; Mardis, Elaine R.; Dooling, David; Fulton, Lucinda; Fulton, Robert; Koboldt, Daniel C.; Durbin, Richard M.; Balasubramaniam, Senduran; Coffey, Allison; Keane, Thomas M.; MacArthur, Daniel G.; Palotie, Aarno; Scott, Carol; Stalker, James; Tyler-Smith, Chris; Gerstein, Mark B.; Balasubramanian, Suganthi; Chakravarti, Aravinda; Knoppers, Bartha M.; Abecasis, Gonçalo R.; Bustamante, Carlos D.; Gharani, Neda; Gibbs, Richard A.; Jorde, Lynn; Kaye, Jane S.; Kent, Alastair; Li, Taosha; McGuire, 
Amy L.; McVean, Gil A.; Ossorio, Pilar N.; Rotimi, Charles N.; Su, Yeyang; Toji, Lorraine H.; Tyler-Smith, Chris; Brooks, Lisa D.; Felsenfeld, Adam L.; McEwen, Jean E.; Abdallah, Assya; Juenger, Christopher R.; Clemm, Nicholas C.; Collins, Francis S.; Duncanson, Audrey; Green, Eric D.; Guyer, Mark S.; Peterson, Jane L.; Schafer, Alan J.; Abecasis, Gonçalo R.; Altshuler, David L.; Auton, Adam; Brooks, Lisa D.; Durbin, Richard M.; Gibbs, Richard A.; Hurles, Matt E.; McVean, Gil A.

    2011-01-01

    High-throughput sequencing technology enables population-level surveys of human genomic variation. Here, we examine the joint allele frequency distributions across continental human populations and present an approach for combining complementary aspects of whole-genome, low-coverage data and targeted high-coverage data. We apply this approach to data generated by the pilot phase of the Thousand Genomes Project, including whole-genome 2–4× coverage data for 179 samples from HapMap European, Asian, and African panels as well as high-coverage target sequencing of the exons of 800 genes from 697 individuals in seven populations. We use the site frequency spectra obtained from these data to infer demographic parameters for an Out-of-Africa model for populations of African, European, and Asian descent and to predict, by a jackknife-based approach, the amount of genetic diversity that will be discovered as sample sizes are increased. We predict that the number of discovered nonsynonymous coding variants will reach 100,000 in each population after ∼1,000 sequenced chromosomes per population, whereas ∼2,500 chromosomes will be needed for the same number of synonymous variants. Beyond this point, the number of segregating sites in the European and Asian panel populations is expected to overcome that of the African panel because of faster recent population growth. Overall, we find that the majority of human genomic variable sites are rare and exhibit little sharing among diverged populations. Our results emphasize that replication of disease association for specific rare genetic variants across diverged populations must overcome both reduced statistical power because of rarity and higher population divergence. PMID:21730125

  17. Flexible Meta-Regression to Assess the Shape of the Benzene–Leukemia Exposure–Response Curve

    PubMed Central

    Vlaanderen, Jelle; Portengen, Lützen; Rothman, Nathaniel; Lan, Qing; Kromhout, Hans; Vermeulen, Roel

    2010-01-01

    Background: Previous evaluations of the shape of the benzene–leukemia exposure–response curve (ERC) were based on a single set or on small sets of human occupational studies. Integrating evidence from all available studies that are of sufficient quality combined with flexible meta-regression models is likely to provide better insight into the functional relation between benzene exposure and risk of leukemia. Objectives: We used natural splines in a flexible meta-regression method to assess the shape of the benzene–leukemia ERC. Methods: We fitted meta-regression models to 30 aggregated risk estimates extracted from nine human observational studies and performed sensitivity analyses to assess the impact of a priori assessed study characteristics on the predicted ERC. Results: The natural spline showed a supralinear shape at cumulative exposures less than 100 ppm-years, although this model fitted the data only marginally better than a linear model (p = 0.06). Stratification based on study design and jackknifing indicated that the cohort studies had a considerable impact on the shape of the ERC at high exposure levels (> 100 ppm-years) but that predicted risks for the low exposure range (< 50 ppm-years) were robust. Conclusions: Although limited by the small number of studies and the large heterogeneity between studies, the inclusion of all studies of sufficient quality combined with a flexible meta-regression method provides the most comprehensive evaluation of the benzene–leukemia ERC to date. The natural spline based on all data indicates a significantly increased risk of leukemia [relative risk (RR) = 1.14; 95% confidence interval (CI), 1.04–1.26] at an exposure level as low as 10 ppm-years. PMID:20064779
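
The jackknifing mentioned in this abstract can be read as a leave-one-study-out sensitivity check: refit the exposure–response model with each study removed and see how much the fit moves. The sketch below is only an illustration under stated assumptions. It swaps the paper's natural-spline meta-regression for a much simpler inverse-variance weighted straight line through the origin, and the function names and all numbers are hypothetical.

```python
def weighted_slope(points):
    """Inverse-variance weighted slope of log(RR) on exposure, through the origin.

    points: iterable of (exposure, log_rr, variance) tuples.
    """
    num = sum(x * y / var for x, y, var in points)
    den = sum(x * x / var for x, y, var in points)
    return num / den


def leave_one_study_out(records):
    """Refit the slope with each study removed in turn.

    records: list of (study_id, exposure, log_rr, variance) tuples.
    Returns {study_id: slope estimated without that study}.
    """
    slopes = {}
    for study in sorted({r[0] for r in records}):
        kept = [(x, y, v) for sid, x, y, v in records if sid != study]
        slopes[study] = weighted_slope(kept)
    return slopes


# Hypothetical risk estimates, not taken from the paper.
records = [
    ("A", 10, 0.10, 0.01), ("A", 50, 0.40, 0.02),
    ("B", 20, 0.30, 0.02), ("B", 100, 0.90, 0.05),
    ("C", 40, 0.35, 0.01), ("C", 200, 1.00, 0.04),
]
full = weighted_slope([(x, y, v) for _, x, y, v in records])
for study, slope in leave_one_study_out(records).items():
    print(f"without {study}: slope = {slope:.5f} (all studies: {full:.5f})")
```

A study whose removal shifts the slope markedly is influential in the sense the abstract describes for the cohort studies at high exposure.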

  18. Ngram time series model to predict activity type and energy cost from wrist, hip and ankle accelerometers: implications of age.

    PubMed

    Strath, Scott J; Kate, Rohit J; Keenan, Kevin G; Welch, Whitney A; Swartz, Ann M

    2015-11-01

    To develop and test time series single site and multi-site placement models, we used wrist, hip and ankle processed accelerometer data to estimate energy cost and type of physical activity in adults. Ninety-nine subjects in three age groups (18-39, 40-64, 65+ years) performed 11 activities while wearing three triaxial accelerometers: one each on the non-dominant wrist, hip, and ankle. During each activity net oxygen cost (METs) was assessed. The time series of accelerometer signals were represented in terms of uniformly discretized values called bins. Support Vector Machine was used for activity classification with bins and every pair of bins used as features. Bagged decision tree regression was used for net metabolic cost prediction. To evaluate model performance we employed the jackknife leave-one-out cross validation method. Single accelerometer and multi-accelerometer site model estimates across and within age group revealed similar accuracy, with a bias range of -0.03 to 0.01 METs, bias percent of -0.8 to 0.3%, and an rMSE range of 0.81-1.04 METs. Multi-site accelerometer location models improved activity type classification over single site location models from a low of 69.3% to a maximum of 92.8% accuracy. For each accelerometer site location model, or combined site location model, percent accuracy classification decreased as a function of age group, or when young age groups models were generalized to older age groups. Specific age group models on average performed better than when all age groups were combined. A time series computation shows promising results for predicting energy cost and activity type. Differences in prediction across age group, a lack of generalizability across age groups, and the finding that age group specific models perform better than when all ages are combined need to be considered as analytic calibration procedures to detect energy cost and type are further developed. PMID:26449155
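
The bias and rMSE figures reported here come from leave-one-out cross-validation: each observation is predicted by a model fitted to all the others. A minimal sketch of that loop follows. It assumes an ordinary least-squares line in place of the study's bagged decision trees, and the feature values and function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_bias_rmse(xs, ys):
    """Leave-one-out cross-validation: mean error (bias) and root mean squared error."""
    errors = []
    for i in range(len(xs)):
        # Fit on every observation except the i-th, then predict the i-th.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(slope * xs[i] + intercept - ys[i])
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse


# Hypothetical accelerometer feature and measured METs.
feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
mets = [1.2, 2.1, 2.9, 4.2, 4.8, 6.1]
bias, rmse = loocv_bias_rmse(feature, mets)
print(f"bias = {bias:.3f} METs, rMSE = {rmse:.3f} METs")
```

With repeated measurements per subject, the held-out unit would more plausibly be a whole subject rather than a single row; the abstract does not say which was used.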

  19. Ammonia- and methane-oxidizing microorganisms in high-altitude wetland sediments and adjacent agricultural soils.

    PubMed

    Yang, Yuyin; Shan, Jingwen; Zhang, Jingxu; Zhang, Xiaoling; Xie, Shuguang; Liu, Yong

    2014-12-01

    Ammonia oxidation is known to be carried out by ammonia-oxidizing bacteria (AOB) and archaea (AOA), while methanotrophs (methane-oxidizing bacteria (MOB)) play an important role in mitigating methane emissions from the environment. However, the difference in AOA, AOB, and MOB distribution between wetland sediment and adjacent upland soil remains unclear. The present study investigated the abundances and community structures of AOA, AOB, and MOB in sediments of a high-altitude freshwater wetland in Yunnan Province (China) and adjacent agricultural soils. Variations of AOA, AOB, and MOB community sizes and structures were found in water lily-vegetated and Acorus calamus-vegetated sediments and agricultural soils (unflooded rice soil, cabbage soil, and garlic soil and flooded rice soil). AOB community size was higher than AOA in agricultural soils and lily-vegetated sediment, but lower in A. calamus-vegetated sediment. MOB showed a much higher abundance than AOA and AOB. Flooded rice soil had the largest AOA, AOB, and MOB community sizes. Principal coordinate analyses and Jackknife Environment Clusters analyses suggested that unflooded and flooded rice soils had relatively similar AOA, AOB, and MOB structures. Cabbage soil and A. calamus-vegetated sediment had relatively similar AOA and AOB structures, but their MOB structures showed a large difference. Nitrososphaera-like microorganisms were the predominant AOA species in garlic soil but were present with a low abundance in unflooded rice soil and cabbage soil. Nitrosospira-like AOB were dominant in wetland sediments and agricultural soils. Type I MOB Methylocaldum and type II MOB Methylocystis were dominant in wetland sediments and agricultural soils. Moreover, Pearson's correlation analysis indicated that AOA Shannon diversity was positively correlated with the ratio of organic carbon to nitrogen (p < 0.05). This work could provide some new insights into ammonia and methane oxidation in soil and wetland sediment ecosystems. PMID:25030456

  20. Testate Amoebae as Paleohydrological Proxies in the Florida Everglades

    NASA Astrophysics Data System (ADS)

    Andrews, T.; Booth, R.; Bernhardt, C. E.; Willard, D. A.

    2011-12-01

    The largest wetland restoration effort ever attempted, the Comprehensive Everglades Restoration Plan (CERP), is currently underway in the Florida Everglades, and a critical goal of CERP is reestablishment of the pre-drainage (pre-AD 1880) hydrology. Paleoecological research in the greater Everglades ecosystem is underway to reconstruct past water levels and variability throughout the system, providing a basis for restoration targets. Testate amoebae, a group of unicellular organisms that form decay-resistant tests, have been successfully used in northern-latitude bogs to reconstruct past wetland hydrology; however, their application in other peatland types, particularly at lower latitudes, has not been well studied. We assessed the potential use of testate amoebae as tools to reconstruct the past hydrology of the Everglades. Modern surface samples were collected from the Everglades National Park and Water Conservation Areas, across a water table gradient that included four vegetation types (tree island interior, tree island edge, sawgrass transition, slough). Community composition was quantified and compared to environmental conditions (water table, pH, vegetation) using ordination and gradient-analysis approaches. Results of nonmetric multidimensional scaling revealed that the most important pattern of community change, representing about 30% of the variance in the dataset, was related to water-table depth (r² = 0.32). Jackknifed cross-validation of a transfer function for water table depth, based on a simple weighted average model, indicated the potential for testate amoebae in studies of past Everglades hydrology (RMSEP = 9 cm, r² = 0.47). Although the performance of the transfer function was not as good as those from northern-latitude bogs, our results suggest that testate amoebae could be a valuable tool in paleohydrological studies of the Everglades, particularly when used with other hydrological proxies (e.g., pollen, plant macrofossils, diatoms).
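
A weighted-average transfer function and its jackknifed RMSEP can be sketched in a few lines. Each taxon's optimum is the abundance-weighted mean of the water-table depths where it occurs, and a sample's predicted depth is the abundance-weighted mean of the optima of the taxa it contains. The code below is an assumption-laden illustration: it omits the deshrinking step that weighted-averaging software usually applies, and the taxa counts and depths are hypothetical.

```python
import math


def wa_optima(abundances, env):
    """Abundance-weighted mean of env for each taxon (None if the taxon is absent)."""
    optima = []
    for k in range(len(abundances[0])):
        total = sum(row[k] for row in abundances)
        if total > 0:
            optima.append(sum(row[k] * x for row, x in zip(abundances, env)) / total)
        else:
            optima.append(None)
    return optima


def wa_predict(row, optima):
    """Predict env for one sample from the optima of the taxa present in it."""
    pairs = [(y, u) for y, u in zip(row, optima) if y > 0 and u is not None]
    return sum(y * u for y, u in pairs) / sum(y for y, _ in pairs)


def loo_rmsep(abundances, env):
    """Leave-one-out (jackknifed) root mean squared error of prediction."""
    squared = []
    for i in range(len(env)):
        optima = wa_optima(abundances[:i] + abundances[i + 1:], env[:i] + env[i + 1:])
        squared.append((wa_predict(abundances[i], optima) - env[i]) ** 2)
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical counts of three taxa in five samples, with water-table depth in cm.
counts = [[8, 2, 0], [6, 4, 0], [3, 5, 2], [1, 4, 5], [0, 2, 8]]
depth_cm = [5.0, 10.0, 20.0, 30.0, 40.0]
print(f"RMSEP = {loo_rmsep(counts, depth_cm):.1f} cm")
```

Because the held-out sample never contributes to the optima used to predict it, the RMSEP is a fairer error estimate than the apparent fit on the training set.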

  1. Comparative assessment of GIS-based methods and metrics for estimating long-term exposures to air pollution

    NASA Astrophysics Data System (ADS)

    Gulliver, John; de Hoogh, Kees; Fecht, Daniela; Vienneau, Danielle; Briggs, David

    2011-12-01

    The development of geographical information system techniques has opened up a wide array of methods for air pollution exposure assessment. The extent to which these provide reliable estimates of air pollution concentrations is nevertheless not clearly established. Nor is it clear which methods or metrics should be preferred in epidemiological studies. This paper compares the performance of ten different methods and metrics in terms of their ability to predict mean annual PM10 concentrations across 52 monitoring sites in London, UK. Metrics analysed include indicators (distance to nearest road, traffic volume on nearest road, heavy duty vehicle (HDV) volume on nearest road, road density within 150 m, traffic volume within 150 m and HDV volume within 150 m) and four modelling approaches: based on the nearest monitoring site, kriging, dispersion modelling and land use regression (LUR). Measures were computed in a GIS, and resulting metrics calibrated and validated against monitoring data using a form of grouped jack-knife analysis. The results show that PM10 concentrations across London show little spatial variation. As a consequence, most methods can predict the average without serious bias. Few of the approaches, however, show good correlations with monitored PM10 concentrations, and most predict no better than a simple classification based on site type. Only land use regression reaches acceptable levels of correlation (R² = 0.47), though this can be improved by also including information on site type. This might therefore be taken as a recommended approach in many studies, though care is needed in developing meaningful land use regression models, and like any method they need to be validated against local data before their application as part of epidemiological studies.

  2. Making Mosquito Taxonomy Useful: A Stable Classification of Tribe Aedini that Balances Utility with Current Knowledge of Evolutionary Relationships

    PubMed Central

    Wilkerson, Richard C.; Linton, Yvonne-Marie; Fonseca, Dina M.; Schultz, Ted R.; Price, Dana C.; Strickman, Daniel A.

    2015-01-01

    The tribe Aedini (Family Culicidae) contains approximately one-quarter of the known species of mosquitoes, including vectors of deadly or debilitating disease agents. This tribe contains the genus Aedes, which is one of the three most familiar genera of mosquitoes. During the past decade, Aedini has been the focus of a series of extensive morphology-based phylogenetic studies published by Reinert, Harbach, and Kitching (RH&K). Those authors created 74 new, elevated or resurrected genera from what had been the single genus Aedes, almost tripling the number of genera in the entire family Culicidae. The proposed classification is based on subjective assessments of the “number and nature of the characters that support the branches” subtending particular monophyletic groups in the results of cladistic analyses of a large set of morphological characters of representative species. To gauge the stability of RH&K’s generic groupings we reanalyzed their data with unweighted parsimony jackknife and maximum-parsimony analyses, with and without ordering 14 of the characters as in RH&K. We found that their phylogeny was largely weakly supported and their taxonomic rankings failed priority and other useful taxon-naming criteria. Consequently, we propose simplified aedine generic designations that 1) restore a classification system that is useful for the operational community; 2) enhance the ability of taxonomists to accurately place new species into genera; 3) maintain the progress toward a natural classification based on monophyletic groups of species; and 4) correct the current classification system that is subject to instability as new species are described and existing species more thoroughly defined. We do not challenge the phylogenetic hypotheses generated by the above-mentioned series of morphological studies. However, we reduce the ranks of the genera and subgenera of RH&K to subgenera or informal species groups, respectively, to preserve stability as new data become available. PMID:26226613

  3. MSLoc-DT: a new method for predicting the protein subcellular location of multispecies based on decision templates.

    PubMed

    Zhang, Shao-Wu; Liu, Yan-Fang; Yu, Yong; Zhang, Ting-He; Fan, Xiao-Nan

    2014-03-15

    Revealing the subcellular location of newly discovered protein sequences can bring insight to their function and guide research at the cellular level. The rapidly increasing number of sequences entering the genome databanks has called for the development of automated analysis methods. Currently, most existing methods used to predict protein subcellular locations cover only one, or a very limited number of species. Therefore, it is necessary to develop reliable and effective computational approaches to further improve the performance of protein subcellular prediction and, at the same time, cover more species. The current study reports the development of a novel predictor called MSLoc-DT to predict the protein subcellular locations of human, animal, plant, bacteria, virus, fungi, and archaea by introducing a novel feature extraction approach termed Amino Acid Index Distribution (AAID) and then fusing gene ontology information, sequential evolutionary information, and sequence statistical information through four different modes of pseudo amino acid composition (PseAAC) with a decision template rule. Using the jackknife test, MSLoc-DT can achieve 86.5, 98.3, 90.3, 98.5, 95.9, 98.1, and 99.3% overall accuracy for human, animal, plant, bacteria, virus, fungi, and archaea, respectively, on seven stringent benchmark datasets. Compared with other predictors (e.g., Gpos-PLoc, Gneg-PLoc, Virus-PLoc, Plant-PLoc, Plant-mPLoc, ProLoc-Go, Hum-PLoc, GOASVM) on the gram-positive, gram-negative, virus, plant, eukaryotic, and human datasets, the new MSLoc-DT predictor is much more effective and robust. Although the MSLoc-DT predictor is designed to predict the single location of proteins, our method can be extended to multiple locations of proteins by introducing multilabel machine learning approaches, such as the support vector machine and deep learning, as substitutes for the K-nearest neighbor (KNN) method. As a user-friendly web server, MSLoc-DT is freely accessible at http://bioinfo.ibp.ac.cn/MSLOC_DT/index.html. PMID:24361712

  4. Digital Mapping of Soil Salinity and Crop Yield across a Coastal Agricultural Landscape Using Repeated Electromagnetic Induction (EMI) Surveys.

    PubMed

    Yao, Rongjiang; Yang, Jingsong; Wu, Danhua; Xie, Wenping; Gao, Peng; Jin, Wenhui

    2016-01-01

    Reliable and real-time information on soil and crop properties is important for the development of management practices in accordance with the requirements of a specific soil and crop within individual field units. This is particularly the case in salt-affected agricultural landscapes where managing the spatial variability of soil salinity is essential to minimize salinization and maximize crop output. The primary objectives were to use a linear mixed-effects model for soil salinity and crop yield calibration with horizontal and vertical electromagnetic induction (EMI) measurements as ancillary data, to characterize the spatial distribution of soil salinity and crop yield and to verify the accuracy of spatial estimation. Horizontal and vertical EMI (type EM38) measurements at 252 locations were made during each survey, and root zone soil samples and crop samples at 64 sampling sites were collected. This work was periodically conducted on eight dates from June 2012 to May 2013 in a coastal salt-affected mud farmland. Multiple linear regression (MLR) and restricted maximum likelihood (REML) were applied to calibrate root zone soil salinity (ECe) and crop annual output (CAO) using ancillary data, and spatial distribution of soil ECe and CAO was generated using digital soil mapping (DSM) and the precision of spatial estimation was examined using the collected meteorological and groundwater data. Results indicated that a reduced model with EMh as a predictor was satisfactory for root zone ECe calibration, whereas a full model with both EMh and EMv as predictors met the requirement of CAO calibration. The obtained distribution maps of ECe showed consistency with those of EMI measurements at the corresponding time, and the spatial distribution of CAO generated from ancillary data showed agreement with that derived from raw crop data. Statistics from the jackknifing procedure confirmed that the spatial estimation of ECe and CAO exhibited reliability and high accuracy. A general increasing trend of ECe was observed and moderately saline and very saline soils were predominant during the survey period. The temporal dynamics of root zone ECe coincided with those of daily rainfall, water table and groundwater data. Long-range EMI surveys and data collection are needed to capture the spatial and temporal variability of soil and crop parameters. Such results allowed us to conclude that cost-effective and efficient EMI surveys, as one part of multi-source data for DSM, could be successfully used to characterize the spatial variability of soil salinity, to monitor the spatial and temporal dynamics of soil salinity, and to spatially estimate potential crop yield. PMID:27203697

  5. A DEEP SEARCH FOR EXTENDED RADIO CONTINUUM EMISSION FROM DWARF SPHEROIDAL GALAXIES: IMPLICATIONS FOR PARTICLE DARK MATTER

    SciTech Connect

    Spekkens, Kristine; Mason, Brian S.; Aguirre, James E.; Nhan, Bang

    2013-08-10

    We present deep radio observations of four nearby dwarf spheroidal (dSph) galaxies, designed to detect extended synchrotron emission resulting from weakly interacting massive particle (WIMP) dark matter annihilations in their halos. Models by Colafrancesco et al. (CPU07) predict the existence of angularly large, smoothly distributed radio halos in such systems, which stem from electron and positron annihilation products spiraling in a turbulent magnetic field. We map a total of 40.5 deg² around the Draco, Ursa Major II, Coma Berenices, and Willman 1 dSphs with the Green Bank Telescope (GBT) at 1.4 GHz to detect this annihilation signature, greatly reducing discrete-source confusion using the NVSS catalog. We achieve a sensitivity of σ_sub ≲ 7 mJy beam⁻¹ in our discrete source-subtracted maps, implying that the NVSS is highly effective at removing background sources from GBT maps. For Draco we obtained approximately concurrent Very Large Array observations to quantify the variability of the discrete source background, and find it to have a negligible effect on our results. We construct radial surface brightness profiles from each of the subtracted maps, and jackknife the data to quantify the significance of the features therein. At the ~10′ resolution of our observations, foregrounds contribute a standard deviation of 1.8 mJy beam⁻¹ ≤ σ_ast ≤ 5.7 mJy beam⁻¹ to our high-latitude maps, with the emission in Draco and Coma dominated by foregrounds. On the other hand, we find no significant emission in the Ursa Major II and Willman 1 fields, and explore the implications of non-detections in these fields for particle dark matter using the fiducial models of CPU07. For a WIMP mass M_χ = 100 GeV annihilating into b b-bar final states and B = 1 μG, upper limits on the annihilation cross-section for Ursa Major II and Willman I are log ((σv)_χ, cm³ s⁻¹) ≲ −25 for the preferred set of charged particle propagation parameters adopted by CPU07; this is comparable to that inferred at γ-ray energies from the two-year Fermi Large Area Telescope data. We discuss three avenues for improving the constraints on (σv)_χ presented here, and conclude that deep radio observations of dSphs are highly complementary to indirect WIMP searches at higher energies.
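
To "jackknife the data" in order to judge the significance of a profile feature usually means recomputing a statistic with one piece of the data left out each time and using the scatter of those recomputed values as an error bar. The abstract does not say which subsets were dropped, so the following is a generic delete-one sketch with hypothetical brightness values and an assumed function name.

```python
import math
import statistics


def jackknife_se(values, statistic):
    """Delete-one jackknife standard error of statistic(values)."""
    n = len(values)
    # Recompute the statistic n times, each time leaving out one value.
    replicates = [statistic(values[:i] + values[i + 1:]) for i in range(n)]
    mean_rep = sum(replicates) / n
    return math.sqrt((n - 1) / n * sum((r - mean_rep) ** 2 for r in replicates))


# Hypothetical surface-brightness samples (mJy per beam) in one radial bin.
bin_values = [1.9, 2.4, 0.7, 3.1, 1.5, 2.2]
estimate = statistics.mean(bin_values)
error = jackknife_se(bin_values, statistics.mean)
print(f"bin mean = {estimate:.2f} +/- {error:.2f} mJy/beam ({estimate / error:.1f} sigma)")
```

For the mean, this error equals the familiar sample standard deviation divided by the square root of n; the value of the jackknife is that the same function works for statistics with no simple error formula.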

  6. Subspace Dimensionality: A Tool for Automated QC in Seismic Array Processing

    NASA Astrophysics Data System (ADS)

    Rowe, C. A.; Stead, R. J.; Begnaud, M. L.

    2013-12-01

    Because of the great resolving power of seismic arrays, the application of automated processing to array data is critically important in treaty verification work. A significant problem in array analysis is the inclusion of bad sensor channels in the beamforming process. We are testing an approach to automated, on-the-fly quality control (QC) to aid in the identification of poorly performing sensor channels prior to beam-forming in routine event detection or location processing. The idea stems from methods used for large computer servers, when monitoring traffic at enormous numbers of nodes is impractical on a node-by-node basis, so the dimensionality of the node traffic is instead monitored for anomalies that could represent malware, cyber-attacks or other problems. The technique relies upon the use of subspace dimensionality or principal components of the overall system traffic. The subspace technique is not new to seismology, but its most common application has been limited to comparing waveforms to an a priori collection of templates for detecting highly similar events in a swarm or seismic cluster. In the established template application, a detector functions in a manner analogous to waveform cross-correlation, applying a statistical test to assess the similarity of the incoming data stream to known templates for events of interest. In our approach, we seek not to detect matching signals, but instead, we examine the signal subspace dimensionality in much the same way that the method addresses node traffic anomalies in large computer systems. Signal anomalies recorded on seismic arrays affect the dimensional structure of the array-wide time-series. We have shown previously that this observation is useful in identifying real seismic events, either by looking at the raw signal or derivatives thereof (entropy, kurtosis), but here we explore the effects of malfunctioning channels on the dimension of the data and its derivatives, and how to leverage this effect for identifying bad array elements through a jackknifing process to isolate the anomalous channels, so that an automated analysis system might discard them prior to FK analysis and beamforming on events of interest.
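
The channel-isolation idea can be illustrated as a leave-one-channel-out loop: drop each channel in turn, recompute an array-wide summary, and flag the channel whose removal changes that summary most. The sketch below is not the authors' method. It replaces subspace dimensionality with a simpler proxy, the mean pairwise correlation between channels, and the synthetic traces and function names are assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation between two equal-length traces."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_coherence(channels):
    """Mean pairwise correlation across channels (proxy for low dimensionality)."""
    pairs = list(combinations(channels, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def flag_worst_channel(channels):
    """Return (index, scores): the channel whose removal raises coherence most."""
    scores = [mean_coherence(channels[:i] + channels[i + 1:])
              for i in range(len(channels))]
    return max(range(len(channels)), key=scores.__getitem__), scores


# Synthetic array: three channels share one signal, the fourth is unrelated.
signal = [math.sin(0.3 * t) for t in range(50)]
channels = [
    signal,
    [0.9 * s + 0.01 * math.cos(1.7 * t) for t, s in enumerate(signal)],
    [1.1 * s + 0.01 * math.cos(2.3 * t) for t, s in enumerate(signal)],
    [float((4 * t) % 11 - 5) for t in range(50)],  # hypothetical bad channel
]
worst, scores = flag_worst_channel(channels)
print("channel to discard:", worst)
```

A production version would presumably track the eigenvalue spectrum of the channel covariance matrix rather than a single correlation average, as the abstract's reference to principal components suggests.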

  7. Effect of reconstruction methods and x-ray tube current–time product on nodule detection in an anthropomorphic thorax phantom: A crossed-modality JAFROC observer study

    PubMed Central

    Thompson, J. D.; Chakraborty, D. P.; Szczepura, K.; Tootell, A. K.; Vamvakas, I.; Manning, D. J.; Hogg, P.

    2016-01-01

    Purpose: To evaluate nodule detection in an anthropomorphic chest phantom in computed tomography (CT) images reconstructed with adaptive iterative dose reduction 3D (AIDR3D) and filtered back projection (FBP) over a range of tube current–time product (mAs). Methods: Two phantoms were used in this study: (i) an anthropomorphic chest phantom was loaded with spherical simulated nodules of 5, 8, 10, and 12 mm in diameter and +100, −630, and −800 Hounsfield units electron density; this would generate CT images for the observer study; (ii) a whole-body dosimetry verification phantom was used to ultimately estimate effective dose and risk according to the model of the BEIR VII committee. Both phantoms were scanned over a mAs range (10, 20, 30, and 40), while all other acquisition parameters remained constant. Images were reconstructed with both AIDR3D and FBP. For the observer study, 34 normal cases (no nodules) and 34 abnormal cases (containing 1–3 nodules, mean 1.35 ± 0.54) were chosen. Eleven observers evaluated images from all mAs and reconstruction methods under the free-response paradigm. A crossed-modality jackknife alternative free-response operating characteristic (JAFROC) analysis method was developed for data analysis, averaging data over the two factors influencing nodule detection in this study: mAs and image reconstruction (AIDR3D or FBP). A Bonferroni correction was applied and the threshold for declaring significance was set at 0.025 to maintain the overall probability of Type I error at α = 0.05. Contrast-to-noise ratio (CNR) was also measured for all nodules and evaluated by a linear least squares analysis. Results: For random-reader fixed-case crossed-modality JAFROC analysis, there was no significant difference in nodule detection between AIDR3D and FBP when data were averaged over mAs [F(1, 10) = 0.08, p = 0.789]. However, when data were averaged over reconstruction methods, a significant difference was seen between multiple pairs of mAs settings [F(3, 30) = 15.96, p < 0.001]. Measurements of effective dose and effective risk showed the expected linear dependence on mAs. Nodule CNR was statistically higher for simulated nodules on images reconstructed with AIDR3D (p < 0.001). Conclusions: No significant difference in nodule detection performance was demonstrated between images reconstructed with FBP and AIDR3D. mAs was found to influence nodule detection, though further work is required for dose optimization. PMID:26936711
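
The 0.025 threshold follows from the Bonferroni rule of dividing the overall error rate by the number of tests. The worked line below assumes m = 2 tests, one per factor (mAs and reconstruction method); that count is inferred from the stated thresholds rather than given explicitly in the abstract.

```latex
\alpha_{\text{per test}} = \frac{\alpha}{m} = \frac{0.05}{2} = 0.025,
\qquad
P(\text{at least one Type I error}) \le m \,\alpha_{\text{per test}} = 0.05 .
```

Both reported p-values are read against 0.025: p = 0.789 for reconstruction method is not significant, while p < 0.001 for mAs is.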

  8. A Deep Search for Extended Radio Continuum Emission from Dwarf Spheroidal Galaxies: Implications for Particle Dark Matter

    NASA Astrophysics Data System (ADS)

    Spekkens, Kristine; Mason, Brian S.; Aguirre, James E.; Nhan, Bang

    2013-08-01

    We present deep radio observations of four nearby dwarf spheroidal (dSph) galaxies, designed to detect extended synchrotron emission resulting from weakly interacting massive particle (WIMP) dark matter annihilations in their halos. Models by Colafrancesco et al. (CPU07) predict the existence of angularly large, smoothly distributed radio halos in such systems, which stem from electron and positron annihilation products spiraling in a turbulent magnetic field. We map a total of 40.5 deg² around the Draco, Ursa Major II, Coma Berenices, and Willman 1 dSphs with the Green Bank Telescope (GBT) at 1.4 GHz to detect this annihilation signature, greatly reducing discrete-source confusion using the NVSS catalog. We achieve a sensitivity of σ_sub ≲ 7 mJy beam⁻¹ in our discrete source-subtracted maps, implying that the NVSS is highly effective at removing background sources from GBT maps. For Draco we obtained approximately concurrent Very Large Array observations to quantify the variability of the discrete source background, and find it to have a negligible effect on our results. We construct radial surface brightness profiles from each of the subtracted maps, and jackknife the data to quantify the significance of the features therein. At the ~10′ resolution of our observations, foregrounds contribute a standard deviation of 1.8 mJy beam⁻¹ ≤ σ_ast ≤ 5.7 mJy beam⁻¹ to our high-latitude maps, with the emission in Draco and Coma dominated by foregrounds. On the other hand, we find no significant emission in the Ursa Major II and Willman 1 fields, and explore the implications of non-detections in these fields for particle dark matter using the fiducial models of CPU07. For a WIMP mass M_χ = 100 GeV annihilating into b\bar{b} final states and B = 1 μG, upper limits on the annihilation cross-section for Ursa Major II and Willman 1 are log (⟨σv⟩_χ, cm³ s⁻¹) ≲ −25 for the preferred set of charged particle propagation parameters adopted by CPU07; this is comparable to that inferred at γ-ray energies from the two-year Fermi Large Area Telescope data. We discuss three avenues for improving the constraints on ⟨σv⟩_χ presented here, and conclude that deep radio observations of dSphs are highly complementary to indirect WIMP searches at higher energies.

  9. Are the orbital poles of binary stars in the solar neighbourhood anisotropically distributed?

    NASA Astrophysics Data System (ADS)

    Agati, J.-L.; Bonneau, D.; Jorissen, A.; Soulié, E.; Udry, S.; Verhas, P.; Dommanget, J.

    2015-02-01

    We test whether or not the orbital poles of the systems in the solar neighbourhood are isotropically distributed on the celestial sphere. The problem is plagued by the ambiguity on the position of the ascending node. Of the 95 systems closer than 18 pc from the Sun with an orbit in the 6th Catalogue of Orbits of Visual Binaries, the pole ambiguity could be resolved for 51 systems using radial velocity collected in the literature and CORAVEL database or acquired with the HERMES/Mercator spectrograph. For several systems, we can correct the erroneous nodes in the 6th Catalogue of Orbits and obtain new combined spectroscopic/astrometric orbits for seven systems [WDS 01083+5455Aa,Ab; 01418+4237AB; 02278+0426AB (SB2); 09006+4147AB (SB2); 16413+3136AB; 17121+4540AB; 18070+3034AB]. We used spherical statistics to test for possible anisotropy. After ordering the binary systems by increasing distance from the Sun, we computed the false-alarm probability for subsamples of increasing sizes, from N = 1 up to the full sample of 51 systems. Rayleigh-Watson and Beran tests deliver a false-alarm probability of 0.5% for the 20 systems closer than 8.1 pc. To evaluate the robustness of this conclusion, we used a jackknife approach, for which we repeated this procedure after removing one system at a time from the full sample. The false-alarm probability was then found to vary between 1.5% and 0.1%, depending on which system is removed. The reality of the deviation from isotropy thus cannot be assessed with certainty at this stage, because so few systems are available, despite our efforts to increase the sample. However, when considering the full sample of 51 systems, the concentration of poles toward the Galactic position l = 46.0°, b = 37°, as observed in the 8.1 pc sphere, totally vanishes (the Rayleigh-Watson false-alarm probability then rises to 18%). Tables 1-3 and Appendices are available in electronic form at http://www.aanda.org. † Deceased October 1, 2014.
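
    The leave-one-out procedure described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses the asymptotic Rayleigh statistic 3R²/n (chi-square with 3 degrees of freedom) as a stand-in for the Rayleigh-Watson and Beran tests named above, and the function names `rayleigh_false_alarm` and `jackknife_range` are hypothetical.

```python
import math


def rayleigh_false_alarm(vectors):
    """Asymptotic false-alarm probability for isotropy of unit vectors.

    Assumption: uses the statistic W = 3 R^2 / n, which is approximately
    chi-square distributed with 3 degrees of freedom under isotropy.
    """
    n = len(vectors)
    sx = sum(v[0] for v in vectors)
    sy = sum(v[1] for v in vectors)
    sz = sum(v[2] for v in vectors)
    w = 3.0 * (sx * sx + sy * sy + sz * sz) / n
    # Closed-form survival function of chi-square with 3 degrees of freedom.
    return math.erfc(math.sqrt(w / 2.0)) + math.sqrt(2.0 * w / math.pi) * math.exp(-w / 2.0)


def jackknife_range(vectors):
    """Recompute the probability with each vector removed in turn.

    Returns the smallest and largest leave-one-out probabilities.
    """
    probs = [
        rayleigh_false_alarm(vectors[:i] + vectors[i + 1:])
        for i in range(len(vectors))
    ]
    return min(probs), max(probs)
```

    A wide spread between the two returned values suggests that the result depends strongly on individual systems, which is the kind of sensitivity the abstract reports.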

  10. Computer-aided detection of clustered microcalcifications in multiscale bilateral filtering regularized reconstructed digital breast tomosynthesis volume

    SciTech Connect

    Samala, Ravi K.; Chan, Heang-Ping; Lu, Yao; Hadjiiski, Lubomir; Wei, Jun; Helvie, Mark A.; Sahiner, Berkman

    2014-02-15

    Purpose: Develop a computer-aided detection (CADe) system for clustered microcalcifications in digital breast tomosynthesis (DBT) volume enhanced with multiscale bilateral filtering (MSBF) regularization. Methods: With Institutional Review Board approval and written informed consent, two-view DBT of 154 breasts, of which 116 had biopsy-proven microcalcification (MC) clusters and 38 were free of MCs, was imaged with a General Electric GEN2 prototype DBT system. The DBT volumes were reconstructed with MSBF-regularized simultaneous algebraic reconstruction technique (SART) that was designed to enhance MCs and reduce background noise while preserving the quality of other tissue structures. The contrast-to-noise ratio (CNR) of MCs was further improved with enhancement-modulated calcification response (EMCR) preprocessing, which combined multiscale Hessian response to enhance MCs by shape and bandpass filtering to remove the low-frequency structured background. MC candidates were then located in the EMCR volume using iterative thresholding and segmented by adaptive region growing. Two sets of potential MC objects, cluster centroid objects and MC seed objects, were generated and the CNR of each object was calculated. The number of candidates in each set was controlled based on the breast volume. Dynamic clustering around the centroid objects grouped the MC candidates to form clusters. Adaptive criteria were designed to reduce false positive (FP) clusters based on the size, CNR values and the number of MCs in the cluster, cluster shape, and cluster based maximum intensity projection. Free-response receiver operating characteristic (FROC) and jackknife alternative FROC (JAFROC) analyses were used to assess the performance and compare with that of a previous study. Results: An unpaired two-tailed t-test showed a significant increase (p < 0.0001) in the ratio of CNRs for MCs with and without MSBF regularization compared to similar ratios for FPs. 
For view-based detection, a sensitivity of 85% was achieved at an FP rate of 2.16 per DBT volume. For case-based detection, a sensitivity of 85% was achieved at an FP rate of 0.85 per DBT volume. JAFROC analysis showed a significant improvement in the performance of the current CADe system compared to that of our previous system (p = 0.003). Conclusions: MSBF-regularized SART reconstruction enhances MCs. The enhancement in the signals, in combination with properly designed adaptive threshold criteria, effective MC feature analysis, and false positive reduction techniques, leads to a significant improvement in the detection of clustered MCs in DBT.

  11. Detection of B-mode polarization at degree angular scales by BICEP2.

    PubMed

    Ade, P A R; Aikin, R W; Barkats, D; Benton, S J; Bischoff, C A; Bock, J J; Brevik, J A; Buder, I; Bullock, E; Dowell, C D; Duband, L; Filippini, J P; Fliescher, S; Golwala, S R; Halpern, M; Hasselfield, M; Hildebrandt, S R; Hilton, G C; Hristov, V V; Irwin, K D; Karkare, K S; Kaufman, J P; Keating, B G; Kernasovskiy, S A; Kovac, J M; Kuo, C L; Leitch, E M; Lueker, M; Mason, P; Netterfield, C B; Nguyen, H T; O'Brient, R; Ogburn, R W; Orlando, A; Pryke, C; Reintsema, C D; Richter, S; Schwarz, R; Sheehy, C D; Staniszewski, Z K; Sudiwala, R V; Teply, G P; Tolan, J E; Turner, A D; Vieregg, A G; Wong, C L; Yoon, K W

    2014-06-20

    We report results from the BICEP2 experiment, a cosmic microwave background (CMB) polarimeter specifically designed to search for the signal of inflationary gravitational waves in the B-mode power spectrum around ℓ∼80. The telescope comprised a 26 cm aperture all-cold refracting optical system equipped with a focal plane of 512 antenna coupled transition edge sensor 150 GHz bolometers each with temperature sensitivity of ≈300  μK(CMB)√s. BICEP2 observed from the South Pole for three seasons from 2010 to 2012. A low-foreground region of sky with an effective area of 380 square deg was observed to a depth of 87 nK deg in Stokes Q and U. In this paper we describe the observations, data reduction, maps, simulations, and results. We find an excess of B-mode power over the base lensed-ΛCDM expectation in the range 30 < ℓ < 150, inconsistent with the null hypothesis at a significance of >5σ. Through jackknife tests and simulations based on detailed calibration measurements we show that systematic contamination is much smaller than the observed excess. Cross correlating against WMAP 23 GHz maps we find that Galactic synchrotron makes a negligible contribution to the observed signal. We also examine a number of available models of polarized dust emission and find that at their default parameter values they predict power ∼(5-10)× smaller than the observed excess signal (with no significant cross-correlation with our maps). However, these models are not sufficiently constrained by external public data to exclude the possibility of dust emission bright enough to explain the entire excess signal. Cross correlating BICEP2 against 100 GHz maps from the BICEP1 experiment, the excess signal is confirmed with 3σ significance and its spectral index is found to be consistent with that of the CMB, disfavoring dust at 1.7σ. 
The observed B-mode power spectrum is well fit by a lensed-ΛCDM+tensor theoretical model with tensor-to-scalar ratio r = 0.20^{+0.07}_{-0.05}, with r = 0 disfavored at 7.0σ. Accounting for the contribution of foreground dust will shift this value downward by an amount which will be better constrained with upcoming data sets. PMID:24996078

  12. Presence-only approach to assess landslide triggering-thickness susceptibility. A test for the Mili catchment (North-Eastern Sicily, Italy)

    NASA Astrophysics Data System (ADS)

    Lombardo, Luigi; Fubelli, Giandomenico; Amato, Gabriele; Bonasera, Mauro; Hochschild, Volker; Rotigliano, Edoardo

    2015-04-01

    This study aims at comparing the performances of a presence-only approach, namely Maximum Entropy, in assessing landslide triggering-thickness susceptibility within the Mili catchment, located in north-eastern Sicily, Italy. This catchment has recently been exposed to three main extreme meteorological events, resulting in the activation of multiple fast landslides, which occurred on the 1st October 2009, 10th March 2010 and 1st March 2011. Unlike the 2009 event, which only marginally hit the catchment, the 2010 and 2011 storms fully involved the area of the Mili catchment. Detailed field data were collected to associate the thickness of mobilised materials at the triggering zone to each mass movement within the catchment. This information has been used to model the landslide susceptibility for two classes of processes clustered into shallow failures for maximum depths of 0.5 m and deep ones in case of values equal to or greater than 0.5 m. As the authors believed that the peculiar geomorphometry of this narrow and steep catchment played a fundamental role in generating two distinct patterns of landslide thicknesses during the initiation phase, a HRDEM was used to extract topographic attributes to express near-triggering geomorphological conditions. On the other hand, medium resolution vegetation indexes derived from ASTER scenes were used as explanatory variables pertaining to a wider spatial neighbourhood, whilst a revised geological map, the land use from CORINE and a tectonic map were used to convey an even wider area connected to the slope instability. The choice of a presence-only approach made it possible to discriminate effectively between the two types of landslide thicknesses at the triggering zone, producing outstanding prediction skills associated with relatively low variances across a set of 20 randomly generated replicates. 
The validation phase produced indeed average AUC values of 0.91 with a standard deviation of 0.03 for both the modelled landslide thicknesses. In addition, the role of each predictor within the whole modelling procedure was assessed by applying Jackknife tests. These analyses focussed on evaluating the variation of AUC values across replicates comparing single variable models with models based on the full set of predictors iteratively deprived of one covariate. As a result, relevant differences among main contributors between the two considered classes were also quantitatively derived and geomorphologically interpreted. This work can be considered as an example for creating specific landslide susceptibility maps to be used in master planning in order to establish proportional countermeasures to different activation mechanisms. Keywords: statistical analysis, shallow landslide, landslide susceptibility, triggering factors, presence-only approach

  13. PRIMUS: Galaxy clustering as a function of luminosity and color at 0.2 < z < 1

    SciTech Connect

    Skibba, Ramin A.; Smith, M. Stephen M.; Coil, Alison L.; Mendez, Alexander J.; Moustakas, John; Aird, James; Blanton, Michael R.; Bray, Aaron D.; Eisenstein, Daniel J.; Cool, Richard J.; Wong, Kenneth C.; Zhu, Guangtun

    2014-04-01

    We present measurements of the luminosity and color-dependence of galaxy clustering at 0.2 < z < 1.0 in the Prism Multi-object Survey. We quantify the clustering with the redshift-space and projected two-point correlation functions, ξ(r_p, π) and w_p(r_p), using volume-limited samples constructed from a parent sample of over ∼130,000 galaxies with robust redshifts in seven independent fields covering 9 deg² of sky. We quantify how the scale-dependent clustering amplitude increases with increasing luminosity and redder color, with relatively small errors over large volumes. We find that red galaxies have stronger small-scale (0.1 Mpc h⁻¹ < r_p < 1 Mpc h⁻¹) clustering and steeper correlation functions compared to blue galaxies, as well as a strong color dependent clustering within the red sequence alone. We interpret our measured clustering trends in terms of galaxy bias and obtain values of b_gal ≈ 0.9-2.5, quantifying how galaxies are biased tracers of dark matter depending on their luminosity and color. We also interpret the color dependence with mock catalogs, and find that the clustering of blue galaxies is nearly constant with color, while redder galaxies have stronger clustering in the one-halo term due to a higher satellite galaxy fraction. In addition, we measure the evolution of the clustering strength and bias, and we do not detect statistically significant departures from passive evolution. We argue that the luminosity- and color-environment (or halo mass) relations of galaxies have not significantly evolved since z ∼ 1. Finally, using jackknife subsampling methods, we find that sampling fluctuations are important and that the COSMOS field is generally an outlier, due to having more overdense structures than other fields; we find that 'cosmic variance' can be a significant source of uncertainty for high-redshift clustering measurements.

  14. Detection of prostate cancer by integration of line-scan diffusion, T2-mapping and T2-weighted magnetic resonance imaging; a multichannel statistical classifier.

    PubMed

    Chan, Ian; Wells, William; Mulkern, Robert V; Haker, Steven; Zhang, Jianqing; Zou, Kelly H; Maier, Stephan E; Tempany, Clare M C

    2003-09-01

    A multichannel statistical classifier for detecting prostate cancer was developed and validated by combining information from three different magnetic resonance (MR) methodologies: T2-weighted, T2-mapping, and line scan diffusion imaging (LSDI). From these MR sequences, four different sets of image intensities were obtained: T2-weighted (T2W) from T2-weighted imaging, Apparent Diffusion Coefficient (ADC) from LSDI, and proton density (PD) and T2 (T2 Map) from T2-mapping imaging. Manually segmented tumor labels from a radiologist, which were validated by biopsy results, served as tumor "ground truth." Textural features were extracted from the images using co-occurrence matrix (CM) and discrete cosine transform (DCT). Anatomical location of voxels was described by a cylindrical coordinate system. A statistical jack-knife approach was used to evaluate our classifiers. Single-channel maximum likelihood (ML) classifiers were based on 1 of the 4 basic image intensities. Our multichannel classifiers: support vector machine (SVM) and Fisher linear discriminant (FLD), utilized five different sets of derived features. Each classifier generated a summary statistical map that indicated tumor likelihood in the peripheral zone (PZ) of the prostate gland. To assess classifier accuracy, the average areas under the receiver operator characteristic (ROC) curves over all subjects were compared. Our best FLD classifier achieved an average ROC area of 0.839(+/-0.064), and our best SVM classifier achieved an average ROC area of 0.761(+/-0.043). The T2W ML classifier, our best single-channel classifier, only achieved an average ROC area of 0.599(+/-0.146). Compared to the best single-channel ML classifier, our best multichannel FLD and SVM classifiers have statistically superior ROC performance (P=0.0003 and 0.0017, respectively) from pairwise two-sided t-test. 
By integrating the information from multiple images and capturing the textural and anatomical features in tumor areas, summary statistical maps can potentially aid in image-guided prostate biopsy and assist in guiding and controlling delivery of localized therapy under image guidance. PMID:14528961

  15. Revised and annotated checklist of aquatic and semi-aquatic Heteroptera of Hungary with comments on biodiversity patterns

    PubMed Central

    Boda, Pál; Bozóki, Tamás; Vásárhelyi, Tamás; Bakonyi, Gábor; Várbíró, Gábor

    2015-01-01

    A basic knowledge of regional faunas is necessary to follow the changes in macroinvertebrate communities caused by environmental influences and climatic trends in the future. We collected all the available data on water bugs in Hungary using an inventory method, a UTM grid based database was built, and Jackknife richness estimates and species accumulation curves were calculated. Fauna compositions were compared among Central-European states. As a result, an updated and annotated checklist for Hungary is provided, containing 58 species in 21 genera and 12 families. A total of 66.8% of the UTM 10 × 10 km squares in Hungary possess faunistic data for water bugs. The species number in grid cells ranged from 0 to 42, and their diversity patterns showed heterogeneity. The estimated species number of 58 is equal to the actual number of species known from the country. The asymptotic shape of the accumulative species curve predicts that additional sampling efforts will not increase the number of species currently known from Hungary. These results suggest that the number of species in the country was estimated correctly and that the species accumulation curve levels off at an asymptotic value. Thus a considerable increase in species richness is not expected in the future. Nevertheless, as the species composition changes, species turnover remains possible. Overall, 36.7% of the European water bug species were found in Hungary. The differences in faunal composition between Hungary and its surrounding countries were caused by the rare or unique species, whereas 33 species are common in the faunas of the eight countries. Species richness does show a correlation with latitude, and similar species compositions were observed in the countries along the same latitude. 
The species list and the UTM-based database are now up-to-date for Hungary, and it will provide a basis for future studies of distributional and biodiversity patterns, biogeography, relative abundance and frequency of occurrences important in community ecology, or the determination of conservation status. PMID:25987880
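
    As an illustration of the jackknife richness estimate mentioned in this abstract, the sketch below computes the first-order incidence-based jackknife, S_obs + Q1 (m − 1)/m, where Q1 is the number of species recorded in exactly one sampling unit and m is the number of units. The abstract does not state which jackknife order the authors used, so the first-order form is an assumption, and the function name `jackknife1_richness` is hypothetical.

```python
def jackknife1_richness(incidence):
    """First-order jackknife richness estimate from incidence data.

    incidence: list of sets, one set of species names per sampling unit
    (for example, one set per UTM grid cell).
    """
    m = len(incidence)
    if m == 0:
        raise ValueError("at least one sampling unit is required")
    # Count in how many sampling units each species was recorded.
    counts = {}
    for unit in incidence:
        for species in unit:
            counts[species] = counts.get(species, 0) + 1
    observed = len(counts)
    # "Uniques": species found in exactly one sampling unit.
    uniques = sum(1 for c in counts.values() if c == 1)
    return observed + uniques * (m - 1) / m
```

    When no species is confined to a single unit, the estimate equals the observed richness, which matches the situation reported above where the estimated and known species numbers coincide.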

  16. NR-2L: A Two-Level Predictor for Identifying Nuclear Receptor Subfamilies Based on Sequence-Derived Features

    PubMed Central

    Wang, Pu; Xiao, Xuan; Chou, Kuo-Chen

    2011-01-01

    Nuclear receptors (NRs) are one of the most abundant classes of transcriptional regulators in animals. They regulate diverse functions, such as homeostasis, reproduction, development and metabolism. Therefore, NRs are a very important target for drug development. Nuclear receptors form a superfamily of phylogenetically related proteins and have been subdivided into different subfamilies due to their domain diversity. In this study, a two-level predictor, called NR-2L, was developed that can be used to identify a query protein as a nuclear receptor or not based on its sequence information alone; if it is, the prediction will be automatically continued to further identify it among the following seven subfamilies: (1) thyroid hormone like (NR1), (2) HNF4-like (NR2), (3) estrogen like, (4) nerve growth factor IB-like (NR4), (5) fushi tarazu-F1 like (NR5), (6) germ cell nuclear factor like (NR6), and (7) knirps like (NR0). The identification was made by the Fuzzy K nearest neighbor (FK-NN) classifier based on the pseudo amino acid composition formed by incorporating various physicochemical and statistical features derived from the protein sequences, such as amino acid composition, dipeptide composition, complexity factor, and low-frequency Fourier spectrum components. As a demonstration, it was shown through some benchmark datasets derived from the NucleaRDB and UniProt with low redundancy that the overall success rates achieved by the jackknife test were about 93% and 89% in the first and second level, respectively. The high success rates indicate that the novel two-level predictor can be a useful vehicle for identifying NRs and their subfamilies. As a user-friendly web server, NR-2L is freely accessible at either http://icpr.jci.edu.cn/bioinfo/NR2L or http://www.jci-bioinfo.cn/NR2L. Each job submitted to NR-2L can contain up to 500 query protein sequences and be finished in less than 2 minutes. 
The fewer the query proteins, the shorter the time will usually be. All the program codes for NR-2L are available for non-commercial purposes upon request. PMID:21858146

  17. A Multi-Label Predictor for Identifying the Subcellular Locations of Singleplex and Multiplex Eukaryotic Proteins

    PubMed Central

    Wang, Xiao; Li, Guo-Zheng

    2012-01-01

    Subcellular locations of proteins are important functional attributes. An effective and efficient subcellular localization predictor is necessary for rapidly and reliably annotating subcellular locations of proteins. Most existing subcellular localization methods are only used to deal with single-location proteins. Actually, proteins may simultaneously exist at, or move between, two or more different subcellular locations. To better reflect characteristics of multiplex proteins, it is highly desirable to develop new methods for dealing with them. In this paper, a new predictor called Euk-ECC-mPLoc has been developed by introducing a powerful multi-label learning approach that exploits correlations between subcellular locations and by hybridizing gene ontology with dipeptide composition information; it can be used to deal with systems containing both singleplex and multiplex eukaryotic proteins. It can be utilized to identify eukaryotic proteins among the following 22 locations: (1) acrosome, (2) cell membrane, (3) cell wall, (4) centrosome, (5) chloroplast, (6) cyanelle, (7) cytoplasm, (8) cytoskeleton, (9) endoplasmic reticulum, (10) endosome, (11) extracellular, (12) Golgi apparatus, (13) hydrogenosome, (14) lysosome, (15) melanosome, (16) microsome, (17) mitochondrion, (18) nucleus, (19) peroxisome, (20) spindle pole body, (21) synapse, and (22) vacuole. Experimental results on a stringent benchmark dataset of eukaryotic proteins using the jackknife cross-validation test show that the average success rate and overall success rate obtained by Euk-ECC-mPLoc were 69.70% and 81.54%, respectively, indicating that our approach is quite promising. Particularly, the success rates achieved by Euk-ECC-mPLoc for small subsets were remarkably improved, indicating that it holds a high potential for simulating the development of the area. 
As a user-friendly web-server, Euk-ECC-mPLoc is freely accessible to the public at the website http://levis.tongji.edu.cn:8080/bioinfo/Euk-ECC-mPLoc/. We believe that Euk-ECC-mPLoc may become a useful high-throughput tool, or at least play a complementary role to the existing predictors in identifying subcellular locations of eukaryotic proteins. PMID:22629314

  18. Prices, infrastructure, household characteristics and child height.

    PubMed

    Thomas, D; Strauss, J

    1992-10-01

    A Brazilian household survey, ENDEF, in 1974-75 and the 1974 Informacoes Basicas Municipais (IBM) provided data for the analysis of the impact of community services and infrastructure and household characteristics on the logarithm of child height, standardized for age and gender. The sample comprised 36,974 children stratified by residential location, the child's age, and the educational level of the mother. Variance and covariance matrices were estimated with the jackknife developed by Efron (1982). Household characteristics included the logarithm of per capita expenditure as a measure of household resource availability, income, and parental education. Community characteristics were local market price indices for 6 food groups (dairy products, beans, cereals, meat, fish, and sugar), level of urbanization, buildings with sewage, water, and electricity connections per capita, per capita number of buildings, and population density. Health services were measured as per capita number of hospitals and clinics and doctors and nurses, and the number of beds per hospital. Educational services included a measure of student teacher ratios, elementary school class size, and per capita number of teachers living in the community. The results show that expenditure had a positive, significant effect on the height of children 2 years and older. Expenditure was a significant determinant for literate and illiterate mothers, and not well educated mothers. The impact of maternal education was largest on the length of babies and declined with the age of the child. Father's education had no impact on the length of babies. The effect of parents' education was complementary. The effect of father's education was largest when mothers had some education. Better educated parents had healthier children. Maternal rather than paternal height had an impact on the length of a baby. 
In the community models, prices had a significant effect on child height, in both urban and rural areas, in all age groups, and for all levels of maternal education. Higher prices were associated with shorter children. Joint price and expenditure interactions were significant. Children at the top of the expenditure distribution were more affected by some prices than by others. Capital building improvements alone and with expenditures were all positively associated with child height. Only nurses per capita impacted on child height. PMID:12318394

  19. Predicting infection risk of hop by Pseudoperonospora humuli.

    PubMed

    Gent, David H; Ocamb, Cynthia M

    2009-10-01

    Downy mildew, caused by Pseudoperonospora humuli, is one of the most destructive diseases of hop. Weather factors associated with infection risk by P. humuli in the maritime region of western Oregon were examined for 24- and 48-h periods and quadratic discriminant function models were developed to classify periods as favorable for disease development on leaves. For the 24-h data sets, the model with superior predictive ability included variables for hours of relative humidity >80%, degree-hours of wetness, and mean night temperature. The same variables were selected for the 48-h data sets, with the addition of a product variable for mean night temperature and hours of relative humidity >80%. Cut-points (pT) on receiver operating characteristic curves that minimized the overall error rate were identified by selecting the cut-point with the highest value of Youden's index. For the 24- and 48-h models these were pT = 0.49 and 0.39, respectively. With these thresholds, the sensitivity and specificity of the models in cross validation by jackknife exclusion were 83.3 and 88.8% for the 24-h model and 87.5 and 84.4% for the 48-h model, respectively. Cut-points that minimized the average costs associated with disease control and crop loss due to classification errors were determined using estimates of economic damage during vegetative development and on cones near harvest. Use of the 24- and 48-h models was estimated to reduce average management costs during vegetative development when disease prevalence was <0.31 and 0.16, respectively. Using economic assumptions near harvest, management decisions informed by the models reduced average costs when disease prevalence was <0.21 and 0.1 for the 24- and 48-h models, respectively. 
The value of the models in management decisions was greatest when disease prevalence was relatively low during vegetative development, which generally corresponds to the normally drier period from late spring to midsummer in the Pacific Northwest of the United States. PMID:19740033
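
    The cut-point selection described in this abstract relies on Youden's index, J = sensitivity + specificity − 1. A minimal sketch of that selection is given below; it is an illustration only, with hypothetical names (`youden_cut_point`, `scores`, `labels`), and it assumes a period is classified as favorable when its predicted probability is at or above the cut-point and that both classes are present in the data.

```python
def youden_cut_point(scores, labels):
    """Return (cut_point, J) maximizing Youden's index.

    scores: predicted probabilities, one per observation.
    labels: 1 for an observed positive (e.g. infection), 0 otherwise.
    Assumes at least one positive and one negative observation.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cut, best_j = None, -1.0
    # Every distinct score is a candidate cut-point.
    for cut in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j
```

    In a jackknife (leave-one-out) validation, the sensitivity and specificity at the chosen cut-point would be computed from predictions made for each observation by a model fitted without it.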

  20. Warfarin Anticoagulant Therapy: A Southern Italy Pharmacogenetics-Based Dosing Model

    PubMed Central

    Mazzaccara, Cristina; Conti, Valeria; Liguori, Rosario; Simeon, Vittorio; Toriello, Mario; Severini, Angelo; Perricone, Corrado; Meccariello, Alfonso; Meccariello, Pasquale; Vitale, Dino Franco; Filippelli, Amelia; Sacchetti, Lucia

    2013-01-01

    Background and Aim: Warfarin is the most frequently prescribed anticoagulant worldwide. However, warfarin therapy is associated with a high risk of bleeding and thromboembolic events because of a large interindividual dose-response variability. We investigated the effect of genetic and nongenetic factors on warfarin dosage in a South Italian population in an attempt to set up an algorithm easily applicable in clinical practice. Materials and Methods: A total of 266 patients from Southern Italy affected by cardiovascular diseases were enrolled and their clinical and anamnestic data recorded. All patients were genotyped for CYP2C9*2,*3, CYP4F2*3, VKORC1 -1639 G>A by the TaqMan assay and for variants VKORC1 1173 C>T and VKORC1 3730 G>A by denaturing high performance liquid chromatography and direct sequencing. The effect of genetic and nongenetic factors on warfarin dose variability was tested by multiple linear regression analysis, and an algorithm based on our data was established and then validated by the Jackknife procedure. Results: Warfarin dose variability was influenced, in decreasing order, by VKORC1-1639 G>A (29.7%), CYP2C9*3 (11.8%), age (8.5%), CYP2C9*2 (3.5%), gender (2.0%) and lastly CYP4F2*3 (1.7%); VKORC1 1173 C>T and VKORC1 3730 G>A exerted a slight effect (<1% each). Taken together, these factors accounted for 58.4% of the warfarin dose variability in our population. Data obtained with our algorithm significantly correlated with those predicted by the two online algorithms: Warfarin dosing and Pharmgkb (p<0.001; R2 = 0.805 and p<0.001; R2 = 0.773, respectively). Conclusions: Our algorithm, which is based on six polymorphisms, age and gender, is user-friendly and its application in clinical practice could improve the personalized management of patients undergoing warfarin therapy. PMID:23990957
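
    Jackknife validation of a regression-based dosing algorithm, as mentioned in this abstract, can be sketched as leave-one-out prediction: each patient's dose is predicted by a model fitted to all other patients. The sketch below is a simplified illustration that assumes a single predictor and ordinary least squares, whereas the study used multiple linear regression; the names `fit_line` and `jackknife_predictions` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def jackknife_predictions(xs, ys):
    """Predict each observation from a fit that excludes it.

    xs and ys are lists of equal length with at least three entries
    and at least two distinct x values after any single removal.
    """
    predictions = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        predictions.append(slope * xs[i] + intercept)
    return predictions
```

    Comparing these held-out predictions with the observed values gives an estimate of predictive error that is less optimistic than the in-sample fit.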

  1. Estimating the spatial distribution of evapotranspiration using the water balance model WAVE and fine spatial resolution airborne remote sensing images from the DAIS-sensor: Experimental set-up

    NASA Astrophysics Data System (ADS)

    Verstraeten, W. W.; Veroustraete, F.; Feyen, J.

    2003-04-01

    Actual evapotranspiration (ET) of agricultural land and forestland surfaces plays an important role in the redistribution of water on the Earth's surface. Any change in evapotranspiration, either through change in vegetation or climate change, directly affects the available water resources. For quantifying these effects physical models need to be constructed. Most hydrological models have to deal with a lack of good spatial resolution, despite their good temporal information. Remote sensing techniques on the contrary determine the spatial pattern of landscape features and hence are very useful on large scales. The main objective of this research is the combination of the spatial pattern of remote sensing (using visible and thermal infrared spectrum) with the temporal pattern of the water balance model WAVE (Vanclooster et al., 1994 and 1996). To realise this, the following objectives are formulated: (i) relate soil and vegetation surface temperatures to actual evapotranspiration of forest and crops simulated with the water balance model WAVE using remote sensing derived parameters. Three methods will be used and mutually compared. Both airborne and satellite imagery will be implemented; (ii) compare the spatial pattern of evapotranspiration, as a result of the three methods, with the energy balance model SEBAL (Bastiaanssen et al., 1998) and finally; (iii) subject the up-scaled WAVE and SEBAL models to an uncertainty analysis using the GLUE-approach (Generalised Likelihood Uncertainty Estimate) (Beven and Binley, 1992). To study the behaviour of the model beyond the field-scale (micro-scale), a meso-scale study was conducted at the test-site of DURAS (50°50'38"N, 5°08'50"W, Sint-Truiden). Airborne imagery from the DAIS/ROSIS sensor is available. 
For the determination of the spatial pattern of actual evapotranspiration the next two methods are considered: (1) relations between surface temperature, surface albedo and vegetation indices are linked with field-simulations of the water balance model WAVE on some specific locations (derived from Price, 1990 and Bastiaanssen et al., 1998) and; (2) determining the spatial input parameters of WAVE in order to simulate the spatial patterns of evapotranspiration (based on D'Urso); the Split-Data, Split-Window, Jack-knife, and cross-correlation techniques will take care of the temporal and spatial validation of the results.

  2. Computer-aided mass detection in mammography: False positive reduction via gray-scale invariant ranklet texture features

    SciTech Connect

    Masotti, Matteo; Lanconelli, Nico; Campanini, Renato

    2009-02-15

    In this work, gray-scale invariant ranklet texture features are proposed for false positive reduction (FPR) in computer-aided detection (CAD) of breast masses. Two main considerations are at the basis of this proposal. First, false positive (FP) marks surviving our previous CAD system seem to be characterized by specific texture properties that can be used to discriminate them from masses. Second, our previous CAD system achieves invariance to linear/nonlinear monotonic gray-scale transformations by encoding regions of interest into ranklet images through the ranklet transform, an image transformation similar to the wavelet transform, yet dealing with pixels' ranks rather than with their gray-scale values. Therefore, the new FPR approach proposed herein defines a set of texture features which are calculated directly from the ranklet images corresponding to the regions of interest surviving our previous CAD system, hence, ranklet texture features; then, a support vector machine (SVM) classifier is used for discrimination. As a result of this approach, texture-based information is used to discriminate FP marks surviving our previous CAD system; at the same time, invariance to linear/nonlinear monotonic gray-scale transformations of the new CAD system is guaranteed, as ranklet texture features are calculated from ranklet images that have this property themselves by construction. To emphasize the gray-scale invariance of both the previous and new CAD systems, training and testing are carried out without any in-between parameters' adjustment on mammograms having different gray-scale dynamics; in particular, training is carried out on analog digitized mammograms taken from a publicly available digital database, whereas testing is performed on full-field digital mammograms taken from an in-house database. 
Free-response receiver operating characteristic (FROC) curve analysis of the two CAD systems demonstrates that the new approach achieves a higher reduction of FP marks when compared to the previous one. Specifically, at 60%, 65%, and 70% per-mammogram sensitivity, the new CAD system achieves 0.50, 0.68, and 0.92 FP marks per mammogram, whereas at 70%, 75%, and 80% per-case sensitivity it achieves 0.37, 0.48, and 0.71 FP marks per mammogram, respectively. Conversely, at the same sensitivities, the previous CAD system reached 0.71, 0.87, and 1.15 FP marks per mammogram, and 0.57, 0.73, and 0.92 FPs per mammogram. Also, statistical significance of the difference between the two per-mammogram and per-case FROC curves is demonstrated by the p-value<0.001 returned by jackknife FROC analysis performed on the two CAD systems.

  3. Computer-aided detection of breast masses: Four-view strategy for screening mammography

    SciTech Connect

    Wei Jun; Chan Heangping; Zhou Chuan; Wu Yita; Sahiner, Berkman; Hadjiiski, Lubomir M.; Roubidoux, Marilyn A.; Helvie, Mark A.

    2011-04-15

    Purpose: To improve the performance of a computer-aided detection (CAD) system for mass detection by using four-view information in screening mammography. Methods: The authors developed a four-view CAD system that emulates radiologists' reading by using the craniocaudal and mediolateral oblique views of the ipsilateral breast to reduce false positives (FPs) and the corresponding views of the contralateral breast to detect asymmetry. The CAD system consists of four major components: (1) Initial detection of breast masses on individual views, (2) information fusion of the ipsilateral views of the breast (referred to as two-view analysis), (3) information fusion of the corresponding views of the contralateral breast (referred to as bilateral analysis), and (4) fusion of the four-view information with a decision tree. The authors collected two data sets for training and testing of the CAD system: A mass set containing 389 patients with 389 biopsy-proven masses and a normal set containing 200 normal subjects. All cases had four-view mammograms. The true locations of the masses on the mammograms were identified by an experienced MQSA radiologist. The authors randomly divided the mass set into two independent sets for cross validation training and testing. The overall test performance was assessed by averaging the free response receiver operating characteristic (FROC) curves of the two test subsets. The FP rates during the FROC analysis were estimated by using the normal set only. The jackknife free-response ROC (JAFROC) method was used to estimate the statistical significance of the difference between the test FROC curves obtained with the single-view and the four-view CAD systems. Results: Using the single-view CAD system, the breast-based test sensitivities were 58% and 77% at the FP rates of 0.5 and 1.0 per image, respectively. 
With the four-view CAD system, the breast-based test sensitivities were improved to 76% and 87% at the corresponding FP rates, respectively. The improvement was found to be statistically significant (p<0.0001) by JAFROC analysis. Conclusions: The four-view information fusion approach that emulates radiologists' reading strategy significantly improves the performance of breast mass detection of the CAD system in comparison with the single-view approach.

  4. Predicting Neuroinflammation in Morphine Tolerance for Tolerance Therapy from Immunostaining Images of Rat Spinal Cord

    PubMed Central

    Lin, Shinn-Long; Chang, Fang-Lin; Ho, Shinn-Ying; Charoenkwan, Phasit; Wang, Kuan-Wei; Huang, Hui-Ling

    2015-01-01

    Long-term morphine treatment leads to tolerance which attenuates analgesic effect and hampers clinical utilization. Recent studies have sought to reveal the mechanism of opioid receptors and neuroinflammation by observing morphological changes of cells in the rat spinal cord. This work proposes a high-content screening (HCS) based computational method, HCS-Morph, for predicting neuroinflammation in morphine tolerance to facilitate the development of tolerance therapy using immunostaining images for astrocytes, microglia, and neurons in the spinal cord. HCS-Morph first extracts numerous HCS-based features of cellular phenotypes. Next, an inheritable bi-objective genetic algorithm is used to identify a minimal set of features by maximizing the prediction accuracy of neuroinflammation. Finally, a mathematical model using a support vector machine with the identified features is established to predict drug-treated images to assess the effects of tolerance therapy. The dataset consists of 15 saline controls (1 μl/h), 15 morphine-tolerant rats (15 μg/h), and 10 rats receiving a co-infusion of morphine (15 μg/h) and gabapentin (15 μg/h, Sigma). The three individual models of astrocytes, microglia, and neurons for predicting neuroinflammation yielded respective Jackknife test accuracies of 96.67%, 90.00%, and 86.67% on the 30 rats, and respective independent test accuracies of 100%, 90%, and 60% on the 10 co-infused rats. The experimental results suggest that neuroinflammation activity expresses more predominantly in astrocytes and microglia than in neuron cells. The set of features for predicting neuroinflammation from images of astrocytes comprises mean cell intensity, total cell area, and second-order geometric moment (relating to cell distribution), relevant to cell communication, cell extension, and cell migration, respectively. 
The present investigation provides the first evidence for the role of gabapentin in the attenuation of morphine tolerance from phenotypic changes of astrocytes and microglia. Based on neuroinflammation prediction, the proposed computer-aided image diagnosis system can greatly facilitate the development of tolerance therapy with anti-inflammatory drugs. PMID:26437460
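
The jackknife test accuracies quoted above are leave-one-out figures: with 30 rats, 96.67%, 90.00%, and 86.67% are consistent with 29, 27, and 26 correct held-out predictions, respectively. The sketch below shows how such an accuracy can be computed. It deliberately swaps the study's support vector machine for a simple nearest-centroid classifier so that it needs only NumPy; the function name `jackknife_accuracy` and the synthetic features are assumptions made for illustration.

```python
import numpy as np

def jackknife_accuracy(X, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    For each sample, class centroids are computed from the remaining
    samples, and the held-out sample is assigned to the closest centroid.
    """
    n = len(labels)
    classes = np.unique(labels)
    correct = 0
    for i in range(n):
        keep = np.arange(n) != i                      # drop sample i
        centroids = np.array(
            [X[keep & (labels == c)].mean(axis=0) for c in classes]
        )
        dists = np.linalg.norm(centroids - X[i], axis=1)
        correct += int(classes[np.argmin(dists)] == labels[i])
    return correct / n

# Hypothetical data: two groups of 15 samples with 3 features each.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(15, 3)),   # stand-in for control features
    rng.normal(5.0, 1.0, size=(15, 3)),   # stand-in for treated features
])
labels = np.array([0] * 15 + [1] * 15)
acc = jackknife_accuracy(X, labels)
```

With only 30 samples, each misclassified sample changes the accuracy by about 3.33 percentage points, which is why the reported values fall on that grid.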

  5. EEG spectral coherence data distinguish chronic fatigue syndrome patients from healthy controls and depressed patients-A case control study

    PubMed Central

    2011-01-01

    Background Previous studies suggest central nervous system involvement in chronic fatigue syndrome (CFS), yet there are no established diagnostic criteria. CFS may be difficult to differentiate from clinical depression. The study's objective was to determine if spectral coherence, a computational derivative of spectral analysis of the electroencephalogram (EEG), could distinguish patients with CFS from healthy control subjects and not erroneously classify depressed patients as having CFS. Methods This is a study, conducted in an academic medical center electroencephalography laboratory, of 632 subjects: 390 healthy normal controls, 70 patients with carefully defined CFS, 24 with major depression, and 148 with general fatigue. Aside from fatigue, all patients were medically healthy by history and examination. EEGs were obtained and spectral coherences calculated after extensive artifact removal. Principal Components Analysis identified coherence factors and corresponding factor loading patterns. Discriminant analysis determined whether spectral coherence factors could reliably discriminate CFS patients from healthy control subjects without misclassifying depression as CFS. Results Analysis of EEG coherence data from a large sample (n = 632) of patients and healthy controls identified 40 factors explaining 55.6% total variance. Factors showed highly significant group differentiation (p < .0004) identifying 89.5% of unmedicated female CFS patients and 92.4% of healthy female controls. Recursive jackknifing showed predictions were stable. A conservative 10-factor discriminant function model was subsequently applied, and also showed highly significant group discrimination (p < .001), accurately classifying 88.9% unmedicated males with CFS, and 82.4% unmedicated male healthy controls. No patient with depression was classified as having CFS. The model was less accurate (73.9%) in identifying CFS patients taking psychoactive medications. 
Factors involving the temporal lobes were of primary importance. Conclusions EEG spectral coherence analysis identified unmedicated patients with CFS and healthy control subjects without misclassifying depressed patients as CFS, providing evidence that CFS patients demonstrate brain physiology that is not observed in healthy normals or patients with major depression. Studies of new CFS patients and comparison groups are required to determine the possible clinical utility of this test. The results concur with other studies finding neurological abnormalities in CFS, and implicate temporal lobe involvement in CFS pathophysiology. PMID:21722376

  6. Development of waveform inversion techniques for using body-wave waveforms to infer localized three-dimensional seismic structure and an application to D"

    NASA Astrophysics Data System (ADS)

    Kawai, K.; Konishi, K.; Geller, R. J.; Fuji, N.

    2013-12-01

    In order to further extract information on localized three-dimensional seismic structure from observed seismic data, we have developed and applied methods for seismic waveform inversion. Deriving algorithms for the calculation of synthetic seismograms and their partial derivatives, development of efficient software for their computation and for data handling, correction for near-source and near-receiver structure, and choosing appropriate parameterization of the model space are the key steps in such an inversion. We formulate the inverse problem of waveform inversion for localized structure, computing partial derivatives of waveforms with respect to the 3-D elastic moduli at arbitrary points in space for anisotropic and anelastic media. Our method does not use any great circle approximations in computing the synthetics and their partial derivatives. In order to efficiently solve the inverse problem we use the conjugate gradient (CG) method. We apply our methods to inversion for the three-dimensional shear wave structure in the lowermost mantle beneath Central America and the Western Pacific using waveforms in the period band from 12.5 to 200 s. Checkerboard tests show that waveform inversion of S, ScS, and the other phases which arrive between them can resolve laterally heterogeneous shear-wave structure in the lowermost mantle using waves propagating only in a relatively limited range of azimuths. Checkerboard tests show that white noise has little impact on the results of waveform inversion. Various tests such as a jackknife test show that our model is robust. 
We verify the near-orthogonality of partial derivatives with respect to structure inside and outside the target region; we find that although datasets with only a small number of waveforms (e.g., waveforms recorded by stations for only a single event) cannot resolve structure inside and outside the target region, a dataset with a large number of waveforms can almost completely remove the effects of near-source and near-receiver structure. Waveform inversion with a large dataset is thus confirmed to be a promising approach to infer 3-D seismic fine structure in the Earth's deep interior.

  7. iLM-2L: A two-level predictor for identifying protein lysine methylation sites and their methylation degrees by incorporating K-gap amino acid pairs into Chou's general PseAAC.

    PubMed

    Ju, Zhe; Cao, Jun-Zhe; Gu, Hong

    2015-11-21

    As one of the most critical post-translational modifications, lysine methylation plays a key role in regulating various protein functions. In order to understand the molecular mechanism of lysine methylation, it is important to identify lysine methylation sites and their methylation degrees accurately. As the traditional experimental methods are time-consuming and labor-intensive, several computational methods have been developed for the identification of methylation sites. However, the prediction accuracy of existing computational methods is still unsatisfactory. Moreover, they are only focused on predicting whether a query lysine residue is a methylation site, without considering its methylation degrees. In this paper, a novel two-level predictor named iLM-2L is proposed to predict lysine methylation sites and their methylation degrees using composition of k-spaced amino acid pairs feature coding scheme and support vector machine algorithm. The 1st level is to identify whether a query lysine residue is a methylation site, and the 2nd level is to identify which methylation degree(s) the query lysine residue belongs to if it has been predicted as a methyllysine site in the 1st level identification. The iLM-2L achieves a promising performance with a Sensitivity of 76.46%, a Specificity of 91.90%, an Accuracy of 85.31% and a Matthews correlation coefficient of 69.94% for the 1st level as well as a Precision of 84.81%, an accuracy of 79.35%, a recall of 80.83%, an Absolute_True of 73.89% and a Hamming_loss of 15.63% for the 2nd level in jackknife test. As illustrated by independent test, the performance of iLM-2L outperforms other existing lysine methylation site predictors significantly. A matlab software package for iLM-2L can be freely downloaded from https://github.com/juzhe1120/Matlab_Software/blob/master/iLM-2L_Matlab_Software.rar. PMID:26254214
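
The first-level metrics named above (sensitivity, specificity, accuracy, and the Matthews correlation coefficient) all derive from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the example counts are hypothetical and do not reproduce the iLM-2L results.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy, MCC) from confusion counts."""
    sens = tp / (tp + fn)                      # true positive rate
    spec = tn / (tn + fp)                      # true negative rate
    acc = (tp + tn) / (tp + fn + tn + fp)      # overall fraction correct
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sens, spec, acc, mcc

# Hypothetical counts: 50 positive sites and 50 negative sites.
sens, spec, acc, mcc = binary_metrics(tp=40, fn=10, tn=45, fp=5)
```

Unlike accuracy, the Matthews correlation coefficient stays informative when the positive and negative classes are of very different sizes, which is one reason it is commonly reported alongside sensitivity and specificity.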

  8. Gpos-PLoc: an ensemble classifier for predicting subcellular localization of Gram-positive bacterial proteins.

    PubMed

    Shen, Hong-Bin; Chou, Kuo-Chen

    2007-01-01

    A statistical analysis indicated that, of the 35,016 Gram-positive bacterial proteins from the recent Swiss-Prot database, approximately 57% of these entries are without subcellular location annotations. In the gene ontology database, the corresponding percentage is approximately 67%, meaning the percentage of proteins without subcellular component annotations is even higher. With the avalanche of gene products generated in the post-genomic era, the number of such location-unknown entries will continuously increase. It is highly desired to develop an automated method for timely and accurately identifying their subcellular localization because the information thus obtained is very useful for both basic research and drug discovery practice. In view of this, an ensemble classifier called 'Gpos-PLoc' was developed for predicting Gram-positive protein subcellular localization. The new predictor is featured by fusing many basic classifiers, each of which was engineered according to the optimized evidence-theoretic K-nearest neighbors rule. As a demonstration, tests were performed on Gram-positive proteins among the following five subcellular location sites: (1) cell wall, (2) cytoplasm, (3) extracell, (4) periplasm and (5) plasma membrane. To eliminate redundancy and homology bias, only those proteins which have < 25% sequence identity to any other in the same subcellular location were allowed to be included in the benchmark datasets. The overall success rates thus achieved by Gpos-PLoc were > 80% for both jackknife cross-validation test and independent dataset test, implying that Gpos-PLoc might become a very useful vehicle for expediting the analysis of Gram-positive bacterial proteins. Gpos-PLoc is freely accessible to the public as a web-server at http://202.120.37.186/bioinf/Gpos/. 
To support the need of many investigators in the relevant areas, a downloadable file is provided at the same website to list the results identified by Gpos-PLoc for 31,898 Gram-positive bacterial protein entries in Swiss-Prot database that either have no subcellular location annotation or are annotated with uncertain terms such as 'probable', 'potential', 'perhaps' and 'by similarity'. Such large-scale results will be updated once a year to include the new entries of Gram-positive bacterial proteins and reflect the continuous development of Gpos-PLoc. PMID:17244638

  9. Reproducibility and optimization of in vivo human diffusion-weighted MRS of the corpus callosum at 3 T and 7 T.

    PubMed

    Wood, Emily T; Ercan, Ayse Ece; Branzoli, Francesca; Webb, Andrew; Sati, Pascal; Reich, Daniel S; Ronen, Itamar

    2015-08-01

    Diffusion-weighted MRS (DWS) of brain metabolites enables the study of cell-specific alterations in tissue microstructure by probing the diffusion of intracellular metabolites. In particular, the diffusion properties of neuronal N-acetylaspartate (NAA), typically co-measured with N-acetylaspartyl glutamate (NAAG) (NAA + NAAG = tNAA), have been shown to be sensitive to intraneuronal/axonal damage in pathologies such as stroke and multiple sclerosis. Lacking, so far, are empirical assessments of the reproducibility of DWS measures across time and subjects, as well as a systematic investigation of the optimal acquisition parameters for DWS experiments, both of which are sorely needed for clinical applications of the method. In this study, we acquired comprehensive single-volume DWS datasets of the human corpus callosum at 3 T and 7 T. We investigated the inter- and intra-subject variability of empirical and modeled diffusion properties of tNAA [D(avg) (tNAA) and D(model) (tNAA), respectively]. Subsequently, we used a jackknife-like resampling approach to explore the variance of these properties in partial data subsets reflecting different total scan durations. The coefficients of variation (C(V)) and repeatability coefficients (C(R)) for D(avg) (tNAA) and D(model) (tNAA) were calculated for both 3 T and 7 T, with overall lower variability in the 7 T results. Although this work is limited to the estimation of the diffusion properties in the corpus callosum, we show that a careful choice of diffusion-weighting conditions at both field strengths allows the accurate measurement of tNAA diffusion properties in clinically relevant experimental time. Based on the resampling results, we suggest optimized acquisition schemes of 13-min duration at 3T and 10-min duration at 7 T, whilst retaining low variability (C(V) ≈ 8%) for the tNAA diffusion measures. 
Power calculations for the estimation of D(model) (tNAA) and D(avg) (tNAA) based on the suggested schemes show that fewer than 21 subjects per group are sufficient for the detection of a 10% effect between two groups in case-control studies. PMID:26084563
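
The abstract above describes a jackknife-like resampling of partial data subsets to see how variability depends on total scan duration. One way to picture that idea, under the assumption that each acquisition yields one estimate and that shorter scans correspond to smaller subsets, is sketched below; the function name `subset_cv`, the subset sizes, and the simulated values are illustrative and are not the authors' procedure.

```python
import numpy as np

def subset_cv(values, k, n_draws=2000, seed=0):
    """Coefficient of variation of the mean over random subsets of size k.

    Subsets are drawn without replacement, so each one mimics a shorter
    experiment that kept only k of the available measurements.
    """
    rng = np.random.default_rng(seed)
    means = np.array([
        rng.choice(values, size=k, replace=False).mean()
        for _ in range(n_draws)
    ])
    return means.std(ddof=1) / means.mean()

# Hypothetical per-acquisition estimates (arbitrary units), 40 acquisitions.
rng = np.random.default_rng(42)
values = rng.normal(loc=1.0, scale=0.2, size=40)

cv_short = subset_cv(values, k=5)    # stand-in for a short scan
cv_long = subset_cv(values, k=20)    # stand-in for a longer scan
```

Plotting the coefficient of variation against subset size would show where additional scan time stops buying a meaningful reduction in variability, which is the kind of trade-off the suggested acquisition schemes reflect.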

  10. Evaluation of clinical image processing algorithms used in digital mammography.

    PubMed

    Zanca, Federica; Jacobs, Jurgen; Van Ongeval, Chantal; Claus, Filip; Celis, Valerie; Geniets, Catherine; Provost, Veerle; Pauwels, Herman; Marchal, Guy; Bosmans, Hilde

    2009-03-01

    Screening is the only proven approach to reduce the mortality of breast cancer, but significant numbers of breast cancers remain undetected even when all quality assurance guidelines are implemented. With the increasing adoption of digital mammography systems, image processing may be a key factor in the imaging chain. Although to our knowledge statistically significant effects of manufacturer-recommended image processings have not been previously demonstrated, the subjective experience of our radiologists, that the apparent image quality can vary considerably between different algorithms, motivated this study. This article addresses the impact of five such algorithms on the detection of clusters of microcalcifications. A database of unprocessed (raw) images of 200 normal digital mammograms, acquired with the Siemens Novation DR, was collected retrospectively. Realistic simulated microcalcification clusters were inserted in half of the unprocessed images. All unprocessed images were subsequently processed with five manufacturer-recommended image processing algorithms (Agfa Musica 1, IMS Raffaello Mammo 1.2, Sectra Mamea AB Sigmoid, Siemens OPVIEW v2, and Siemens OPVIEW v1). Four breast imaging radiologists were asked to locate and score the clusters in each image on a five point rating scale. The free-response data were analyzed by the jackknife free-response receiver operating characteristic (JAFROC) method and, for comparison, also with the receiver operating characteristic (ROC) method. JAFROC analysis revealed highly significant differences between the image processings (F = 8.51, p < 0.0001), suggesting that image processing strongly impacts the detectability of clusters. Siemens OPVIEW2 and Siemens OPVIEW1 yielded the highest and lowest performances, respectively. ROC analysis of the data also revealed significant differences between the processing but at lower significance (F = 3.47, p = 0.0305) than JAFROC. 
Both statistical analysis methods revealed that the same six pairs of modalities were significantly different, but the JAFROC confidence intervals were about 32% smaller than ROC confidence intervals. This study shows that image processing has a significant impact on the detection of microcalcifications in digital mammograms. Objective measurements, such as described here, should be used by the manufacturers to select the optimal image processing algorithm. PMID:19378737
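
Jackknife-based observer-performance methods such as JAFROC build on jackknife pseudovalues of a figure of merit. The sketch below shows only the generic pseudovalue construction for an arbitrary statistic, not the JAFROC software or its free-response figure of merit; the function names and the example ratings are hypothetical.

```python
import numpy as np

def jackknife_pseudovalues(data, statistic):
    """Pseudovalue i is n * theta_full - (n - 1) * theta_without_i."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    full = statistic(data)
    # Recompute the statistic with each observation deleted in turn.
    loo = np.array([statistic(np.delete(data, i)) for i in range(n)])
    return n * full - (n - 1) * loo

def jackknife_se(data, statistic):
    """Jackknife standard error: standard error of the pseudovalues."""
    pv = jackknife_pseudovalues(data, statistic)
    return pv.std(ddof=1) / np.sqrt(len(pv))

# Hypothetical five-point confidence ratings for eight cases.
ratings = np.array([1, 3, 4, 4, 5, 2, 5, 3], dtype=float)
pv_mean = jackknife_pseudovalues(ratings, np.mean)
se_mean = jackknife_se(ratings, np.mean)
```

For the sample mean the pseudovalues reduce to the original observations, so the jackknife standard error matches the usual standard error of the mean; for less tractable statistics the same construction still yields per-case values that can be fed into standard tests.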

  11. Adaptive Statistical Iterative Reconstruction-Applied Ultra-Low-Dose CT with Radiography-Comparable Radiation Dose: Usefulness for Lung Nodule Detection

    PubMed Central

    Yoon, Hyun Jung; Hwang, Hye Sun; Moon, Jung Won; Lee, Kyung Soo

    2015-01-01

    Objective To assess the performance of adaptive statistical iterative reconstruction (ASIR)-applied ultra-low-dose CT (ULDCT) in detecting small lung nodules. Materials and Methods Thirty patients underwent both ULDCT and standard dose CT (SCT). After determining the reference standard nodules, five observers, blinded to the reference standard reading results, independently evaluated SCT and both subsets of ASIR- and filtered back projection (FBP)-driven ULDCT images. Data assessed by observers were compared statistically. Results Converted effective doses in SCT and ULDCT were 2.81 ± 0.92 and 0.17 ± 0.02 mSv, respectively. A total of 114 lung nodules were detected on SCT as a standard reference. There was no statistically significant difference in sensitivity between ASIR-driven ULDCT and SCT for three out of the five observers (p = 0.678, 0.735, < 0.01, 0.038, and < 0.868 for observers 1, 2, 3, 4, and 5, respectively). The sensitivity of FBP-driven ULDCT was significantly lower than that of ASIR-driven ULDCT in three out of the five observers (p < 0.01 for three observers, and p = 0.064 and 0.146 for two observers). In jackknife alternative free-response receiver operating characteristic analysis, the mean values of figure-of-merit (FOM) for FBP, ASIR-driven ULDCT, and SCT were 0.682, 0.772, and 0.821, respectively, and there were no significant differences in FOM values between ASIR-driven ULDCT and SCT (p = 0.11), but the FOM value of FBP-driven ULDCT was significantly lower than that of ASIR-driven ULDCT and SCT (p = 0.01 and 0.00). Conclusion Adaptive statistical iterative reconstruction-driven ULDCT delivering a radiation dose of only 0.17 mSv offers acceptable sensitivity in nodule detection compared with SCT and has better performance than FBP-driven ULDCT. PMID:26357505

  12. Observer performance for adaptive, image-based denoising and filtered back projection compared to scanner-based iterative reconstruction for lower dose CT enterography

    PubMed Central

    Fletcher, Joel G.; Hara, Amy K.; Fidler, Jeff L.; Silva, Alvin C.; Barlow, John M.; Carter, Rickey E.; Bartley, Adam; Shiung, Maria; Holmes, David R.; Weber, Nicolas K.; Bruining, David H.; Yu, Lifeng; McCollough, Cynthia H.

    2015-01-01

    Purpose The purpose of this study was to compare observer performance for detection of intestinal inflammation for low-dose CT enterography (LD-CTE) using scanner-based iterative reconstruction (IR) vs. vendor-independent, adaptive image-based noise reduction (ANLM) or filtered back projection (FBP). Methods Sixty-two LD-CTE exams were performed. LD-CTE images were reconstructed using IR, ANLM, and FBP. Three readers, blinded to image type, marked intestinal inflammation directly on patient images using a specialized workstation over three sessions, interpreting one image type/patient/session. Reference standard was created by a gastroenterologist and radiologist, who reviewed all available data including dismissal Gastroenterology records, and who marked all inflamed bowel segments on the same workstation. Reader and reference localizations were then compared. Non-inferiority was tested using Jackknife free-response ROC (JAFROC) figures of merit (FOM) for ANLM and FBP compared to IR. Patient-level analyses for the presence or absence of inflammation were also conducted. Results There were 46 inflamed bowel segments in 24/62 patients (CTDIvol interquartile range 6.9–10.1 mGy). JAFROC FOM for ANLM and FBP were 0.84 (95% CI 0.75–0.92) and 0.84 (95% CI 0.75–0.92), and were statistically non-inferior to IR (FOM 0.84; 95% CI 0.76–0.93). Patient-level pooled confidence intervals for sensitivity widely overlapped, as did specificities. Image quality was rated as better with IR and ANLM compared to FBP (p < 0.0001), with no difference in reading times (p = 0.89). Conclusions Vendor-independent adaptive image-based noise reduction and FBP provided observer performance that was non-inferior to scanner-based IR methods. Adaptive image-based noise reduction maintained or improved upon image quality ratings compared to FBP when performing CTE at lower dose levels. PMID:25725794

  13. Growing Season Temperatures in Europe and Climate Forcings Over the Past 1400 Years

    PubMed Central

    Guiot, Joel; Corona, Christophe

    2010-01-01

    Background The lack of instrumental data before the mid-19th-century limits our understanding of present warming trends. In the absence of direct measurements, we used proxies that are natural or historical archives recording past climatic changes. A gridded reconstruction of spring-summer temperature was produced for Europe based on tree-rings, documentaries, pollen assemblages and ice cores. The majority of proxy series have an annual resolution. For a better inference of long-term climate variation, they were completed by low-resolution data (decadal or more), mostly on pollen and ice-core data. Methodology/Principal Findings An original spectral analog method was devised to deal with this heterogeneous dataset, and to preserve long-term variations and the variability of temperature series. This allows us to place recent climate changes in the broader context of the past 1400 years. This preservation is possible because the method is not based on a calibration (regression) but on similarities between assemblages of proxies. The reconstruction of the April-September temperatures was validated with a Jack-knife technique. It was also compared to other spatially gridded temperature reconstructions, literature data, and glacier advance and retreat curves. We also attempted to relate the spatial distribution of European temperature anomalies to known solar and volcanic forcings. Conclusions We found that our results were accurate back to 750. Cold periods prior to the 20th century can be explained partly by low solar activity and/or high volcanic activity. The Medieval Warm Period (MWP) could be correlated to higher solar activity. During the 20th century, however, only anthropogenic forcing can explain the exceptionally high temperature rise. Warm periods of the Middle Ages were spatially more heterogeneous than the last decades, so locally it could have been warmer then. 
However, at the continental scale, the last decades were clearly warmer than any period of the last 1400 years. The heterogeneity of the MWP versus the homogeneity of the last decades is likely an argument that different forcings could have operated. These results support the conclusion that Europe is experiencing a climate change not seen in the past 1400 years. PMID:20376366

  14. The relative roles of types of extracurricular activity on smoking and drinking initiation among tweens

    PubMed Central

    Adachi-Mejia, Anna M.; Gibson Chambers, Jennifer J.; Li, Zhigang; Sargent, James D.

    2014-01-01

    Objective Youth involvement in extracurricular activities may help prevent smoking and drinking initiation. However, the relative roles of types of extracurricular activity on these risks are unclear. Therefore, we examined the association between substance use and participation in team sports with a coach, other sports without a coach, music, school clubs, and other clubs in a nationally representative sample of US tweens. Methods We conducted telephone surveys with 6,522 U.S. students (ages 10-14) in 2003. We asked participants if they had ever tried smoking or drinking and about their participation in extracurricular activities. We used sample weighting to produce response estimates that were representative of the population of adolescents aged 10-14 years at the time of data collection. Logistic regression models that adjusted for appropriate sampling weights using Jackknife variance estimation tested associations with trying smoking and drinking, controlling for sociodemographics, child and parent characteristics, friend/sibling/parent substance use, and media use. Results A little over half of the students reported participating in team sports with a coach (55.5%) and without a coach (55.4%) a few times per week or more. Most had minimal to no participation in school clubs (74.2%), however most reported being involved in other clubs (85.8%). A little less than half participated in music, choir, dance, and/or band lessons. Over half of participants involved in religious activity did those activities a few times per week or more. In the multiple regression analysis, team sport participation with a coach was the only extracurricular activity associated with lower risk of trying smoking (adjusted OR = 0.68, 95% C.I. 0.49, 0.96) compared to none or minimal participation. Participating in other clubs was the only extracurricular activity associated with lower risk of trying drinking (adjusted OR = 0.56, 95% C.I. 
0.32, 0.99) compared to none or minimal participation. Conclusions Type of extracurricular involvement may be associated with risk of youth smoking and drinking initiation. Future research should seek to better understand the underlying reasons behind these differences. PMID:24767780

  15. Modelling and mapping the local distribution of representative species on the Le Danois Bank, El Cachucho Marine Protected Area (Cantabrian Sea)

    NASA Astrophysics Data System (ADS)

    García-Alegre, Ana; Sánchez, Francisco; Gómez-Ballesteros, María; Hinz, Hilmar; Serrano, Alberto; Parra, Santiago

    2014-08-01

    The management and protection of potentially vulnerable species and habitats require the availability of detailed spatial data. However, such data are often not readily available in particular areas that are challenging for sampling by traditional sampling techniques, for example seamounts. Within this study habitat modelling techniques were used to create predictive maps of six species of conservation concern for the Le Danois Bank (El Cachucho Marine Protected Area in the South of the Bay of Biscay). The study used data from ECOMARG multidisciplinary surveys that aimed to create a representative picture of the physical and biological composition of the area. Classical fishing gear (otter trawl and beam trawl) was used to sample benthic communities that inhabit sedimentary areas, and non-destructive visual sampling techniques (ROV and photogrammetric sled) were used to determine the presence of epibenthic macrofauna in complex and vulnerable habitats. Multibeam echosounder data, high-resolution seismic profiles (TOPAS system) and geological data from box-corer were used to characterize the benthic terrain. ArcGIS software was used to produce high-resolution maps (75×75 m2) of such variables in the entire area. The Maximum Entropy (MAXENT) technique was used to process these data and create Habitat Suitability maps for six species of special conservation interest. The model used seven environmental variables (depth, rugosity, aspect, slope, Bathymetric Position Index (BPI) in fine and broad scale and morphosedimentary characteristics) to identify the most suitable habitats for such species and indicates which environmental factors determine their distribution. The six species models performed highly significantly better than random (p<0.0001; Mann-Whitney test) when Area Under the Curve (AUC) values were tested. This indicates that the environmental variables chosen are relevant to distinguish the distribution of these species. 
The Jackknife test estimated depth to be the key factor structuring their distribution, followed by the seabed morpho-sedimentary characteristics and rugosity variables. Three of the species studied (Asconema setubalense, Callogorgia verticillata and Helicolenus dactylopterus) were found to have small suitable areas as a result of being restrictive species related to the environmental characteristics of the top of the bank. The other species (Pheronema carpenteri, Phycis blennoides and Trachyscorpia cristulata), which were species less restrictive to the environmental variables used, had highly suitable areas of distribution. The study provides high-resolution maps of species that characterize the habitat of two communities included in OSPAR and NATURA networks, whose distributions corroborate the adequate protection of this area by the management measures applied at present.

  16. Geographic assignment of seabirds to their origin: combining morphologic, genetic, and biogeochemical analyses.

    PubMed

    Gómez-Díaz, Elena; González-Solis, Jacob

    2007-07-01

    Longline fisheries, oil spills, and offshore wind farms are some of the major threats increasing seabird mortality at sea, but the impact of these threats on specific populations has been difficult to determine so far. We tested the use of molecular markers, morphometric measures, and stable isotope (delta15N and delta13C) and trace element concentrations in the first primary feather (grown at the end of the breeding period) to assign the geographic origin of Calonectris shearwaters. Overall, we sampled birds from three taxa: 13 Mediterranean Cory's Shearwater (Calonectris diomedea diomedea) breeding sites, 10 Atlantic Cory's Shearwater (Calonectris diomedea borealis) breeding sites, and one Cape Verde Shearwater (C. edwardsii) breeding site. Assignment rates were investigated at three spatial scales: breeding colony, breeding archipelago, and taxa levels. Genetic analyses based on the mitochondrial control region (198 birds from 21 breeding colonies) correctly assigned 100% of birds to the three main taxa but failed in detecting geographic structuring at lower scales. Discriminant analyses based on trace elements composition achieved the best rate of correct assignment to colony (77.5%). Body measurements or stable isotopes mainly succeeded in assigning individuals among taxa (87.9% and 89.9%, respectively) but failed at the colony level (27.1% and 38.0%, respectively). Combining all three approaches (morphometrics, isotopes, and trace elements on 186 birds from 15 breeding colonies) substantially improved correct classifications (86.0%, 90.7%, and 100% among colonies, archipelagos, and taxa, respectively). Validations using two independent data sets and jackknife cross-validation confirmed the robustness of the combined approach in the colony assignment (62.5%, 58.8%, and 69.8% for each validation test, respectively). 
A preliminary application of the discriminant model based on stable isotope delta15N and delta13C values and trace elements (219 birds from 17 breeding sites) showed that 41 Cory's Shearwaters caught by western Mediterranean long-liners came mainly from breeding colonies in Menorca (48.8%), Ibiza (14.6%), and Crete (31.7%). Our findings show that combining analyses of trace elements and stable isotopes on feathers can achieve high rates of correct geographic assignment of birds in the marine environment, opening new prospects for the study of seabird mortality at sea. PMID:17708223

  17. Euk-PLoc: an ensemble classifier for large-scale eukaryotic protein subcellular location prediction.

    PubMed

    Shen, H-B; Yang, J; Chou, K-C

    2007-07-01

    With the avalanche of newly-found protein sequences emerging in the post genomic era, it is highly desirable to develop an automated method for quickly and reliably identifying their subcellular locations because knowledge thus obtained can provide key clues for revealing their functions and understanding how they interact with each other in cellular networking. However, predicting subcellular location of eukaryotic proteins is a challenging problem, particularly when unknown query proteins do not have significant homology to proteins of known subcellular locations and when more locations need to be covered. To cope with the challenge, protein samples are formulated by hybridizing the information derived from the gene ontology database and amphiphilic pseudo amino acid composition. Based on such a representation, a novel ensemble hybridization classifier was developed by fusing many basic individual classifiers through a voting system. Each of these basic classifiers was engineered by the KNN (K-Nearest Neighbor) principle. As a demonstration, a new benchmark dataset was constructed that covers the following 18 localizations: (1) cell wall, (2) centriole, (3) chloroplast, (4) cyanelle, (5) cytoplasm, (6) cytoskeleton, (7) endoplasmic reticulum, (8) extracell, (9) Golgi apparatus, (10) hydrogenosome, (11) lysosome, (12) mitochondria, (13) nucleus, (14) peroxisome, (15) plasma membrane, (16) plastid, (17) spindle pole body, and (18) vacuole. To avoid the homology bias, none of the proteins included has ≥25% sequence identity to any other in the same subcellular location. The overall success rates thus obtained via the 5-fold and jackknife cross-validation tests were 81.6 and 80.3%, respectively, which were 40-50% higher than those achieved by the other existing methods on the same strict dataset. The powerful predictor, named "Euk-PLoc", is available as a web-server at http://202.120.37.186/bioinf/euk . 
Furthermore, to support the need of people working in the relevant areas, a downloadable file will be provided at the same website to list the results predicted by Euk-PLoc for all eukaryotic protein entries (excluding fragments) in Swiss-Prot database that do not have subcellular location annotations or are annotated as being uncertain. The large-scale results will be updated twice a year to include the new entries of eukaryotic proteins and reflect the continuous development of Euk-PLoc. PMID:17235453
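
The jackknife cross-validation test mentioned in this abstract can be illustrated with a minimal sketch: each sample is left out once, predicted from all remaining samples by a K-nearest-neighbor vote, and the success rate is the fraction predicted correctly. This is not the Euk-PLoc ensemble itself. The function names (`knn_predict`, `jackknife_success_rate`), the Euclidean distance, and the toy two-dimensional data are assumptions made purely for illustration.

```python
import numpy as np

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    distances = np.linalg.norm(train_X - query, axis=1)  # Euclidean distance (assumed)
    nearest = np.argsort(distances)[:k]
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

def jackknife_success_rate(X, y, k=3):
    """Leave each sample out once, predict it from the rest, return the fraction correct."""
    n = len(y)
    correct = 0
    for i in range(n):
        keep = np.arange(n) != i  # mask that drops only sample i
        if knn_predict(X[keep], y[keep], X[i], k) == y[i]:
            correct += 1
    return correct / n

# Hypothetical toy data: two well-separated classes of four points each
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
rate = jackknife_success_rate(X, y, k=3)
```

Because every sample is tested exactly once against a model that never saw it, the result does not depend on a random split, which is one reason this test is often reported alongside k-fold results.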

  18. Atlantic Tropical Cyclone Monitoring with AMSU-A: Estimation of Maximum Sustained Wind Speeds

    NASA Technical Reports Server (NTRS)

    Spencer, Roy; Braswell, William D.; Goodman, H. Michael (Technical Monitor)

    2001-01-01

    The first Advanced Microwave Sounding Unit temperature sounder (AMSU-A) was launched on the NOAA-15 satellite on 13 May 1998. The AMSU-A's higher spatial and radiometric resolutions provide more useful information on the strength of the middle and upper tropospheric warm cores associated with tropical cyclones than have previous microwave temperature sounders. The gradient wind relationship suggests that the temperature gradient near the core of tropical cyclones increases nonlinearly with wind speed. We recast the gradient wind equation to include AMSU-A derived variables. Stepwise regression is used to determine which of these variables is most closely related to maximum sustained winds (V(sub max)). The satellite variables investigated include the radially averaged gradients at two spatial resolutions of AMSU-A channels 1 through 10 T(sub b) data (delta(sub r)T(sub b)), the squares of these gradients, a channel 15 based scattering index (SI-89), and area averaged T(sub b). Calculations of Tb and delta(sub r)T(sub b) from mesoscale model simulations of Andrew reveal the effects of the AMSU spatial sampling on the cyclone warm core presentation. Stepwise regression of 66 AMSU-A terms against National Hurricane Center (NHC) V(sub max) estimates from the 1998 and 1999 Atlantic hurricane season confirms the existence of a nonlinear relationship between wind speed and radially averaged temperature gradients near the cyclone warm core. Of six regression terms, four are dominated by temperature information, and two are interpreted as correcting for hydrometeor contamination. Jackknifed regressions were performed to estimate the algorithm performance on independent data. For the 82 cases that had in situ measurements of V(sub max), the average error standard deviation was 4.7 m/s. For 108 cases without in situ wind data, the average error standard deviation was 7.5 m/s. 
Operational considerations, including the detection of weak cyclones and false alarm reduction are also discussed.
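
As a rough illustration of how jackknifed regressions estimate performance on independent data, the sketch below refits an ordinary least-squares model once per case, each time withholding that case and recording the prediction error for it. The abstract does not give the exact procedure, so the delete-one scheme, the function name `jackknifed_regression_errors`, and the synthetic data are assumptions, and the six-term AMSU-A regression is not reproduced here.

```python
import numpy as np

def jackknifed_regression_errors(X, y):
    """Refit least squares n times, each time predicting the one held-out case."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # prepend an intercept column
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # training set excludes case i
        coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        errors[i] = design[i] @ coef - y[i]  # error on the withheld case
    return errors

# Synthetic, noise-free data (illustrative only): y = 2 + 3*x
x = np.arange(10, dtype=float)
y = 2.0 + 3.0 * x
errors = jackknifed_regression_errors(x, y)
error_sd = errors.std(ddof=1)  # spread of the held-out prediction errors
```

With real, noisy data the standard deviation of these held-out errors is typically larger than the in-sample residual spread, since each prediction is made by a model that never saw that case.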

  19. Atlantic Tropical Cyclone Monitoring with AMSU-A: Estimation of Maximum Sustained Wind Speeds

    NASA Technical Reports Server (NTRS)

    Spencer, Roy W.; Braswell, William D.

    2001-01-01

    The first Advanced Microwave Sounding Unit temperature sounder (AMSU-A) was launched on the NOAA-15 satellite on 13 May 1998. The AMSU-A's higher spatial and radiometric resolutions provide more useful information on the strength of the middle- and upper-tropospheric warm cores associated with tropical cyclones than have previous microwave temperature sounders. The gradient wind relationship suggests that the temperature gradient near the core of tropical cyclones increases nonlinearly with wind speed. The gradient wind equation is recast to include AMSU-A-derived variables. Stepwise regression is used to determine which of these variables is most closely related to maximum sustained winds (V(sub max)). The satellite variables investigated include the radially averaged gradients at two spatial resolutions of AMSU-A channels 1-10 T(sub b) data (delta(sub r)T(sub B)), the squares of these gradients, a channel-15-based scattering index (SI(sub 89)), and area-averaged T(sub B). Calculations of T(sub B) and delta(sub r)T(sub B) from mesoscale model simulations of Andrew reveal the effects of the AMSU spatial sampling on the cyclone warm core presentation. Stepwise regression of 66 AMSU-A terms against National Hurricane Center V(sub max) estimates from the 1998 and 1999 Atlantic hurricane season confirms the existence of a nonlinear relationship between wind speed and radially averaged temperature gradients near the cyclone warm core. Of six regression terms, four are dominated by temperature information, and two are interpreted as correcting for hydrometeor contamination. Jackknifed regressions were performed to estimate the algorithm performance on independent data. For the 82 cases that had in situ measurements of V(sub max), the average error standard deviation was 4.7 m/s. 
For 108 cases without in situ wind data, the average error standard deviation was 7.5 m/s. Operational considerations, including the detection of weak cyclones and false alarm reduction, are also discussed.

  20. Spatial Distribution of Sand Fly Vectors and Eco-Epidemiology of Cutaneous Leishmaniasis Transmission in Colombia

    PubMed Central

    Ferro, Cristina; López, Marla; Fuya, Patricia; Lugo, Ligia; Cordovez, Juan Manuel; González, Camila

    2015-01-01

    Background Leishmania is transmitted by Phlebotominae insects that maintain the enzootic cycle by circulating between sylvatic and domestic mammals; humans enter the cycles as accidental hosts due to the vector’s search for a blood source. In Colombia, leishmaniasis is an endemic disease and 95% of all cases are cutaneous (CL); these cases have been reported in several regions of the country where intervention in sylvatic areas through the introduction of agriculture seems to have an impact on the rearrangement of transmission cycles. Our study aimed to update vector species distribution in the country and to analyze the relationship between vectors’ distribution, climate, land use and CL prevalence. Methods A database with geographic information was assembled, and ecological niche modeling was performed to explore the potential distribution of each of the 21 species of medical importance in Colombia, using thirteen bioclimatic variables, three topographic and three principal components derived from NDVI. Binary models for each species were obtained and related to both land use coverage, and a CL prevalence map with available epidemiological data. Finally, maps of species potential distribution were summed to define potential species richness in the country. Results In total, 673 single records were obtained, with Lutzomyia gomezi, Lutzomyia longipalpis, Psychodopygus panamensis, Psathyromyia shannoni and Pintomyia evansi being the species with the highest number of records. Eighteen species had significant models, considering the area under the curve and the jackknife results: L. gomezi and P. panamensis had the widest potential distribution. All sand fly species except for Nyssomyia antunesi are mainly distributed in regions with rates of prevalence between 0.33 and 101.35 cases per 100,000 inhabitants and 76% of collection data points fall into transformed ecosystems. 
Discussion Distribution ranges of sand flies with medical importance in Colombia correspond predominantly to disturbed areas, where the original land coverage is missing, therefore increasing the domiciliation potential. We highlight the importance of the use of distribution maps as a tool for the development of strategies for prevention and control of diseases. PMID:26431546

  1. Revised and annotated checklist of aquatic and semi-aquatic Heteroptera of Hungary with comments on biodiversity patterns.

    PubMed

    Boda, Pál; Bozóki, Tamás; Vásárhelyi, Tamás; Bakonyi, Gábor; Várbíró, Gábor

    2015-01-01

    A basic knowledge of regional faunas is necessary to follow the changes in macroinvertebrate communities caused by environmental influences and climatic trends in the future. We collected all available data on water bugs in Hungary using an inventory method, built a UTM grid-based database, and calculated Jackknife richness estimates and species accumulation curves. Fauna compositions were compared among Central-European states. As a result, an updated and annotated checklist for Hungary is provided, containing 58 species in 21 genera and 12 families. A total of 66.8% of the UTM 10 × 10 km squares in Hungary possess faunistic data for water bugs. The species number in grid cells ranged from 0 to 42, and their diversity patterns showed heterogeneity. The estimated species number of 58 is equal to the actual number of species known from the country. The asymptotic shape of the accumulative species curve predicts that additional sampling efforts will not increase the number of species currently known from Hungary. These results suggest that the number of species in the country was estimated correctly and that the species accumulation curve levels off at an asymptotic value. Thus a considerable increase in species richness is not expected in the future. Nevertheless, species turnover remains possible as the species composition changes. Overall, 36.7% of the European water bug species were found in Hungary. The differences in faunal composition between Hungary and its surrounding countries were caused by the rare or unique species, whereas 33 species are common to the faunas of the eight countries. Species richness does show a correlation with latitude, and similar species compositions were observed in the countries along the same latitude. 
The species list and the UTM-based database are now up-to-date for Hungary, and they will provide a basis for future studies of distributional and biodiversity patterns, biogeography, relative abundance and frequency of occurrences important in community ecology, or the determination of conservation status. PMID:25987880
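
The abstract does not state which jackknife richness estimator was used. As a sketch under the assumption that it was the common first-order form for incidence data, the estimate is the observed species count plus the number of species found in exactly one sampling unit, scaled by (m - 1)/m for m units. The function name `jackknife1_richness` and the small presence/absence matrix are hypothetical.

```python
import numpy as np

def jackknife1_richness(incidence):
    """First-order jackknife richness from a units-by-species presence/absence matrix."""
    incidence = np.asarray(incidence, dtype=bool)
    m = incidence.shape[0]                    # number of sampling units (e.g. grid cells)
    occurrences = incidence.sum(axis=0)       # number of units in which each species occurs
    s_obs = int((occurrences > 0).sum())      # species observed at least once
    uniques = int((occurrences == 1).sum())   # species found in exactly one unit
    return s_obs + uniques * (m - 1) / m

# Hypothetical data: 4 grid cells (rows) by 5 species (columns)
cells = [
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
]
estimate = jackknife1_richness(cells)  # 4 observed + 2 uniques * 3/4
```

Under this formula the estimate equals the observed count only when no species is confined to a single sampling unit, which is one way an estimated richness can coincide with the known species number.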

  2. Power laws for heavy-tailed distributions: modeling allele and haplotype diversity for the national marrow donor program.

    PubMed

    Slater, Noa; Louzoun, Yoram; Gragert, Loren; Maiers, Martin; Chatterjee, Ansu; Albrecht, Mark

    2015-04-01

    Measures of allele and haplotype diversity, which are fundamental properties in population genetics, often follow heavy tailed distributions. These measures are of particular interest in the field of hematopoietic stem cell transplant (HSCT). Donor/Recipient suitability for HSCT is determined by Human Leukocyte Antigen (HLA) similarity. Match predictions rely upon a precise description of HLA diversity, yet classical estimates are inaccurate given the heavy-tailed nature of the distribution. This directly affects HSCT matching and diversity measures in broader fields such as species richness. We, therefore, have developed a power-law based estimator to measure allele and haplotype diversity that accommodates heavy tails using the concepts of regular variation and occupancy distributions. Application of our estimator to 6.59 million donors in the Be The Match Registry revealed that haplotypes follow a heavy tail distribution across all ethnicities: for example, 44.65% of the European American haplotypes are represented by only 1 individual. Indeed, our discovery rate of all U.S. European American haplotypes is estimated at 23.45% based upon sampling 3.97% of the population, leaving a large number of unobserved haplotypes. Population coverage, however, is much higher at 99.4% given that 90% of European Americans carry one of the 4.5% most frequent haplotypes. Alleles were found to be less diverse suggesting the current registry represents most alleles in the population. Thus, for HSCT registries, haplotype discovery will remain high with continued recruitment to a very deep level of sampling, but population coverage will not. Finally, we compared the convergence of our power-law versus classical diversity estimators such as Capture recapture, Chao, ACE and Jackknife methods. When fit to the haplotype data, our estimator displayed favorable properties in terms of convergence (with respect to sampling depth) and accuracy (with respect to diversity estimates). 
This suggests that power-law based estimators offer a valid alternative to classical diversity estimators and may have broad applicability in the field of population genetics. PMID:25901749

  3. iCDI-PseFpt: identify the channel-drug interaction in cellular networking with PseAAC and molecular fingerprints.

    PubMed

    Xiao, Xuan; Min, Jian-Liang; Wang, Pu; Chou, Kuo-Chen

    2013-11-21

    Many crucial functions in life, such as heartbeat, sensory transduction and central nervous system response, are controlled by cell signalings via various ion channels. Therefore, ion channels have become an excellent drug target, and study of ion channel-drug interaction networks is an important topic for drug development. However, it is both time-consuming and costly to determine whether a drug and a protein ion channel are interacting with each other in a cellular network by means of experimental techniques. Although some computational methods were developed in this regard based on the knowledge of the 3D (three-dimensional) structure of protein, unfortunately their usage is quite limited because the 3D structures for most protein ion channels are still unknown. With the avalanche of protein sequences generated in the post-genomic age, it is highly desirable to develop the sequence-based computational method to address this problem. To take up the challenge, we developed a new predictor called iCDI-PseFpt, in which the protein ion-channel sample is formulated by the PseAAC (pseudo amino acid composition) generated with the gray model theory, the drug compound by the 2D molecular fingerprint, and the operation engine is the fuzzy K-nearest neighbor algorithm. The overall success rate achieved by iCDI-PseFpt via the jackknife cross-validation was 87.27%, which is remarkably higher than that by any of the existing predictors in this area. As a user-friendly web-server, iCDI-PseFpt is freely accessible to the public at the website http://www.jci-bioinfo.cn/iCDI-PseFpt/. Furthermore, for the convenience of most experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results without the need to follow the complicated math equations presented in the paper just for its integrity. It has not escaped our notice that the current approach can also be used to study other drug-target interaction networks. PMID:23988798

  4. One year survival of ART and conventional restorations in patients with disability

    PubMed Central

    2014-01-01

    Background Providing restorative treatment for persons with disability may be challenging and has been related to the patient’s ability to cope with the anxiety engendered by treatment and to cooperate fully with the demands of the clinical situation. The aim of the present study was to assess the survival rate of ART restorations compared to conventional restorations in people with disability referred for special care dentistry. Methods Three treatment protocols were distinguished: ART (hand instruments/high-viscosity glass-ionomer); conventional restorative treatment (rotary instrumentation/resin composite) in the clinic (CRT/clinic) and under general anaesthesia (CRT/GA). Patients were referred for restorative care to a special care centre and treated by one of two specialists. Patients and/or their caregivers were provided with written and verbal information regarding the proposed techniques, and selected the type of treatment they were to receive. Treatment was provided as selected but if this option proved clinically unfeasible one of the alternative techniques was subsequently proposed. Evaluation of restoration survival was performed by two independent trained and calibrated examiners using established ART restoration assessment codes at 6 months and 12 months. The Proportional Hazard model with frailty corrections was applied to calculate survival estimates over a one year period. Results 66 patients (13.6 ± 7.8 years) with 16 different medical disorders participated. CRT/clinic proved feasible for 5 patients (7.5%), the ART approach for 47 patients (71.2%), and 14 patients received CRT/GA (21.2%). In all, 298 dentine carious lesions were restored in primary and permanent teeth, 182 (ART), 21 (CRT/clinic) and 95 (CRT/GA). The 1-year survival rates and jackknife standard error of ART and CRT restorations were 97.8 ± 1.0% and 90.5 ± 3.2%, respectively (p = 0.01). 
Conclusions These short-term results indicate that ART appears to be an effective treatment protocol for treating patients with disability restoratively, many of whom have difficulty coping with the conventional restorative treatment. Trial registration number Netherlands Trial Registration: NTR 4400 PMID:24885938
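
For readers unfamiliar with a jackknife standard error, the following minimal sketch shows the delete-one version for an arbitrary statistic. It does not reproduce the study's proportional hazard model with frailty corrections: a plain survival proportion stands in for the model-based estimate, clustering of restorations within patients is ignored, and the outcome data and the name `jackknife_se` are hypothetical.

```python
import numpy as np

def jackknife_se(values, statistic):
    """Delete-one jackknife standard error of statistic(values)."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Recompute the statistic n times, each time with one observation removed
    replicates = np.array([statistic(np.delete(values, i)) for i in range(n)])
    return np.sqrt((n - 1) / n * np.sum((replicates - replicates.mean()) ** 2))

# Hypothetical outcomes: 1 = restoration survived, 0 = restoration failed
survived = np.array([1, 1, 1, 0, 1, 1, 1, 1, 0, 1])
se = jackknife_se(survived, np.mean)
```

For the sample mean this reduces to the familiar standard error (sample standard deviation divided by the square root of n), which makes a convenient sanity check before applying the function to a more complex statistic.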

  5. Effect of CAD on Radiologists’ Detection of Lung Nodules on Thoracic CT Scans: Analysis of an Observer Performance Study by Nodule Size

    PubMed Central

    Sahiner, Berkman; Chan, Heang-Ping; Hadjiiski, Lubomir M.; Cascade, Philip N.; Kazerooni, Ella A.; Chughtai, Aamer R.; Poopat, Chad; Song, Thomas; Frank, Luba; Stojanovska, Jadranka; Attili, Anil

    2009-01-01

    Rationale and Objectives To retrospectively investigate the effect of a computer aided detection (CAD) system on radiologists’ performance for detecting small pulmonary nodules in CT examinations, with a panel of expert radiologists serving as the reference standard. Materials and Methods Institutional review board approval was obtained. Our data set contained 52 CT examinations collected by the Lung Image Database Consortium, and 33 from our institution. All CTs were read by multiple expert thoracic radiologists to identify the reference standard for detection. Six other thoracic radiologists read the CT examinations first without, and then with CAD. Performance was evaluated using free-response receiver operating characteristics (FROC) and the jackknife FROC analysis methods (JAFROC) for nodules above different diameter thresholds. Results 241 nodules, ranging in size from 3.0 to 18.6 mm (mean 5.3 mm) were identified as the reference standard. At diameter thresholds of 3, 4, 5, and 6 mm, the CAD system had a sensitivity of 54%, 64%, 68%, and 76%, respectively, with an average of 5.6 false-positives (FPs) per scan. Without CAD, the average figures-of-merit (FOMs) for the six radiologists, obtained from JAFROC analysis, were 0.661, 0.729, 0.793 and 0.838 for the same nodule diameter thresholds, respectively. With CAD, the corresponding average FOMs improved to 0.705, 0.763, 0.810 and 0.862, respectively. The improvement achieved statistical significance for nodules at the 3 and 4 mm thresholds (p=0.002 and 0.020, respectively), and did not achieve significance at 5 and 6 mm (p=0.18 and 0.13, respectively). At a nodule diameter threshold of 3 mm, the radiologists’ average sensitivity and FP rate were 0.56 and 0.67, respectively, without CAD, and 0.67 and 0.78 with CAD. Conclusion CAD improves thoracic radiologists’ performance for detecting pulmonary nodules under 5 mm on CT examinations, which are often overlooked by visual inspection alone. PMID:19896069

  6. School-age effects of the newborn individualized developmental care and assessment program for preterm infants with intrauterine growth restriction: preliminary findings

    PubMed Central

    2013-01-01

    Background The experience in the newborn intensive care nursery results in premature infants’ neurobehavioral and neurophysiological dysfunction and poorer brain structure. Preterms with severe intrauterine growth restriction are doubly jeopardized given their compromised brains. The Newborn Individualized Developmental Care and Assessment Program improved outcome at early school-age for preterms with appropriate intrauterine growth. It also showed effectiveness to nine months for preterms with intrauterine growth restriction. The current study tested effectiveness into school-age for preterms with intrauterine growth restriction regarding executive function (EF), electrophysiology (EEG) and neurostructure (MRI). Methods Twenty-three 9-year-old former growth-restricted preterms, randomized at birth to standard care (14 controls) or to the Newborn Individualized Developmental Care and Assessment Program (9 experimentals) were assessed with standardized measures of cognition, achievement, executive function, electroencephalography, and magnetic resonance imaging. The participating children were comparable to those lost to follow-up, and the controls to the experimentals, in terms of newborn background health and demographics. All outcome measures were corrected for mother’s intelligence. Analysis techniques included two-group analysis of variance and stepwise discriminant analysis for the outcome measures, Wilks’ lambda and jackknifed classification to ascertain two-group classification success per and across domains; canonical correlation analysis to explore relationships among neuropsychological, electrophysiological and neurostructural domains at school-age, and from the newborn period to school-age. Results Controls and experimentals were comparable in age at testing, anthropometric and health parameters, and in cognitive and achievement scores. Experimentals scored better in executive function, spectral coherence, and cerebellar volumes. 
Furthermore, executive function, spectral coherence and brain structural measures discriminated controls from experimentals. Executive function correlated with coherence and brain structure measures, and with newborn-period neurobehavioral assessment. Conclusion The intervention in the intensive care nursery improved executive function as well as spectral coherence between occipital and frontal as well as parietal regions. The experimentals’ cerebella were significantly larger than the controls’. These results, while preliminary, point to the possibility of long-term brain improvement even of intrauterine growth compromised preterms if individualized intervention begins with admission to the NICU and extends throughout transition home. Larger sample replications are required in order to confirm these results. Clinical trial registration The study is registered as a clinical trial. The trial registration number is NCT00914108. PMID:23421857

  7. Evaluation of optical remote sensing to estimate evapotranspiration and canopy conductance

    NASA Astrophysics Data System (ADS)

    Yebra, M.; van Dijk, A. I.; Leuning, R.; Huete, A. R.; Guerschman, J. P.

    2012-12-01

    We compared evapotranspiration (ET) estimates produced with six different vegetation measures derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) and three contrasting estimation approaches using measurements from eddy covariance flux towers at 16 FLUXNET sites located over six different land cover types. The aim was to assess optimal approaches in using optical remote sensing to estimate ET. The first two approaches directly regressed various MODIS vegetation indices (VIs) and leaf area index (LAI) and fraction of photosynthetically active radiation (fPAR) products with ET and evaporative fraction. In the third approach, the Penman-Monteith (PM) equation was inverted to obtain surface conductance (Gs), represented by days dominated by dry canopy conductance (Gc). The Gc values were then regressed against the MODIS vegetation products and used to parameterize the PM equation for retrievals of ET. Jackknife cross-validation was used to evaluate the various regression models and assess their performance across all land cover types and sites. Our analysis shows that the PM-Gc approach leads to the lowest root mean square errors and highest determination coefficients globally across all sites. The MODIS LAI and fPAR products produced the poorest estimates of ET, while the VIs each performed best for some of the land cover types. The enhanced vegetation index (EVI) produced considerably better ET estimates for evergreen needleleaf forest, the normalized difference vegetation index (NDVI) best estimated ET in grassland, cropland and woody savannas, and the VI-based crop coefficient (Kc) yielded the best estimates for evergreen and deciduous broadleaf forests. Using the mean of the Gc estimates derived from NDVI, EVI and Kc we computed global grids of Gc from which annual statistics were extracted to characterise different functional types. 
    The resulting values can be used to parameterize land surface models. Mean global Gc for 2001-2011 estimated as the average of values predicted based on NDVI, EVI and Kc calculated from MCD43C4 data (downloaded from ftp://e4ftl01.cr.usgs.gov/MOTA/MCD43C4.005/ in February 2012).
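
    The jackknife cross-validation used to score the regression models can be sketched as follows. This is a minimal illustration, assuming a leave-one-out scheme and a single-predictor linear regression; the abstract does not state how observations were grouped for deletion, and the function names and sample values are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def jackknife_cv(xs, ys):
    """Leave-one-out predictions; returns (RMSE, r^2) of predicted vs observed."""
    preds = []
    for i in range(len(xs)):
        # Refit without observation i, then predict the held-out value.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a + b * xs[i])
    n = len(ys)
    rmse = math.sqrt(sum((p - y) ** 2 for p, y in zip(preds, ys)) / n)
    mean_p = sum(preds) / n
    mean_y = sum(ys) / n
    cov = sum((p - mean_p) * (y - mean_y) for p, y in zip(preds, ys))
    var_p = sum((p - mean_p) ** 2 for p in preds)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return rmse, cov * cov / (var_p * var_y)

# Hypothetical vegetation-index values and ET observations (mm/day).
vi = [0.15, 0.22, 0.31, 0.40, 0.48, 0.57, 0.63]
et = [0.9, 1.3, 1.9, 2.2, 2.9, 3.3, 3.6]
rmse, r2 = jackknife_cv(vi, et)
print(f"jackknifed RMSE = {rmse:.3f} mm/day, r2 = {r2:.3f}")
```

    Because each prediction comes from a model that never saw the held-out observation, the resulting error statistics are less optimistic than in-sample fit statistics.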

  8. Complex Faulting within the New Madrid Seismic Zone

    NASA Astrophysics Data System (ADS)

    Deshon, H. R.; Powell, C. A.; Magnani, M.; Bisrat, S. T.

    2010-12-01

    Relative relocations derived using double-difference tomography techniques reveal a complex sequence of faulting within the New Madrid Seismic Zone (NMSZ) and upper Mississippi Embayment. The majority of NMSZ seismicity recorded over the last 30 years occurs along four limbs: 1) a NE-SW trending dextral strike-slip fault, termed the Axial fault, coincident with the central valley of the Cambrian Reelfoot Rift system; 2) the SE-NW trending Reelfoot thrust fault; 3) an E-W trending left lateral strike-slip fault extending off of the northern terminus of the Reelfoot fault, here termed New Madrid west; and 4) a NE-SW dextral strike-slip fault also extending off of the northern terminus of the Reelfoot fault, here termed New Madrid north. Each of these segments is thought to have ruptured during the 1811-1812 large earthquake sequence. A fifth segment, the Bootheel lineament, is marked by 1811-1812 related liquefaction features but appears largely aseismic, though we suggest there are at least five events in the catalog associated with this feature. Geological and geophysical evidence across the embayment suggests that the region is crossed by additional faults at shallow depths (<1-2 km), while seismicity is generally confined to the 3-20 km depth range. Here we present relative relocations derived using catalog and waveform cross-correlation differential times of the 1989-1992 local PANDA network and the 1995-2010 Cooperative New Madrid Seismic Network. We show that the four known seismic lineations exhibit internal complexity. For example, New Madrid north is composed of two parallel faults rather than a single fault, and seismicity associated with the Axial lineation exhibits temporal changes along strike and becomes spatially more diffuse south of the Axial fault/Bootheel lineament intersection. 
Seismicity along the southern Reelfoot fault does not define a dipping plane consistent with thrust faulting, unlike the northern Reelfoot fault, and is associated with anomalously low P wave velocities. Swarm activity along the southern portion of the Reelfoot fault and near the northern portion of the Reelfoot fault terminus may be related to fault intersections within this complicated transpressional system. Recent reflection data of the upper 1 km imaged along the Mississippi River indicate that both the north termini of the Reelfoot and Axial faults are characterized by splay faulting, while at depth microseismicity is planar. Absolute and relative error will be assessed by computing locations within two 3D P and S wave velocity models of the study area, using finite difference and pseudo-bending ray tracing approaches, and jack-knife approaches to test dependence on network geometry.

  9. Using Chironomid-Based Transfer Function and Stable Isotopes for Reconstructing Past Climate in South Eastern Australia

    NASA Astrophysics Data System (ADS)

    Chang, J.; Shulmeister, J.; Woodward, C.

    2014-12-01

    A transfer function based on chironomids was created to reconstruct past summer temperatures from a training set comprised of 33 south eastern Australian lakes. Statistical analyses show that mean February temperature (MFT) is the most robust and independent variable explaining chironomid species variability. The best MFT transfer function was a partial least squares (PLS) model with a jackknifed coefficient of determination (r²) of 0.69, a root mean squared error of prediction (RMSEP) of 2.33°C, and maximum bias of 2.15°C. The transfer function was tested by applying it to a Late Glacial to Holocene record from Blue Lake, New South Wales using published data. The reconstruction displays an overall pattern very similar to the Milankovitch driven summer insolation curve for 30°S and to the chironomid based summer temperature reconstruction from Eagle Tarn, Tasmania (Rees and Cwynar 2010) suggesting that the model is robust. The transfer function was also applied to reconstruct the Last Glacial Maximum (LGM) summer temperature from Welsby Lagoon, North Stradbroke Island (Queensland). Preliminary results show cooling of c. 4.2-8.6°C in summer temperatures during the LGM in south east Australia. Stable oxygen and deuterium isotope composition (δ18O and δD) of the chitinous subfossil head capsules from Australian chironomids were also measured to explore the opportunity of developing them as an independent temperature proxy. This is the first application of this technique in the Southern Hemisphere. The modern range of chironomid δ18O values was measured based on the same 33 lakes sampled for the transfer function. For these lakes, head capsules of single genera were picked to avoid complications from 'vital effects'. The relationship of chironomid δ18O to modern lake temperatures has been investigated. 
    Deuterium (δD) on the head capsules has been measured concurrently and the relationship to climate and environment will be explored based on the latest available results. References Rees, A.B.H. and Cwynar, L.C. (2010) Evidence for early postglacial warming in Mount Field National Park, Tasmania. Quaternary Science Reviews 29, 443-454.

  10. Identifying DNA-binding proteins by combining support vector machine and PSSM distance transformation

    PubMed Central

    2015-01-01

    Background DNA-binding proteins play a pivotal role in various intra- and extra-cellular activities ranging from DNA replication to gene expression control. Identification of DNA-binding proteins is one of the major challenges in the field of genome annotation. There have been several computational methods proposed in the literature to deal with DNA-binding protein identification. However, most of them cannot provide a valuable knowledge base for our understanding of DNA-protein interactions. Results We first presented a new protein sequence encoding method called PSSM Distance Transformation, and then constructed a DNA-binding protein identification method (SVM-PSSM-DT) by combining PSSM Distance Transformation with support vector machine (SVM). First, the PSSM profiles are generated by using the PSI-BLAST program to search the non-redundant (NR) database. Next, the PSSM profiles are transformed into uniform numeric representations by a distance transformation scheme. Lastly, the resulting uniform numeric representations are input into an SVM classifier for prediction. Thus whether a sequence can bind to DNA or not can be determined. In a benchmark test on 525 DNA-binding and 550 non DNA-binding proteins using jackknife validation, the present model achieved an ACC of 79.96%, MCC of 0.622 and AUC of 86.50%. This performance is considerably better than most of the existing state-of-the-art predictive methods. When tested on a recently constructed independent dataset PDB186, SVM-PSSM-DT also achieved the best performance with ACC of 80.00%, MCC of 0.647 and AUC of 87.40%, and outperformed some existing state-of-the-art methods. Conclusions The experiment results demonstrate that PSSM Distance Transformation is a viable protein sequence encoding method and SVM-PSSM-DT is a useful tool for identifying the DNA-binding proteins. 
    A user-friendly web-server of SVM-PSSM-DT was constructed, which is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/PSSM-DT/. PMID:25708928
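
    Jackknife validation of a classifier, and the ACC and MCC scores computed from it, can be sketched as below. The nearest-neighbour rule is only a stand-in chosen to keep the example self-contained; it is not the SVM-PSSM-DT model, and all function names and data are hypothetical.

```python
import math

def jackknife_predictions(samples, labels, train_and_predict):
    """Hold each sample out once, train on the rest, predict the held-out one."""
    preds = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        preds.append(train_and_predict(train_x, train_y, samples[i]))
    return preds

def acc_mcc(labels, preds):
    """Accuracy and Matthews correlation coefficient for 0/1 labels."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    acc = (tp + tn) / len(labels)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc

def nearest_neighbour(train_x, train_y, x):
    """Stand-in classifier (not an SVM): label of the closest training value."""
    best = min(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    return train_y[best]

# Hypothetical one-dimensional feature values; 1 = DNA-binding, 0 = non-binding.
features = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
classes = [0, 0, 0, 1, 1, 1]
predicted = jackknife_predictions(features, classes, nearest_neighbour)
print(acc_mcc(classes, predicted))
```

    MCC is reported alongside ACC because it accounts for all four cells of the confusion matrix, which matters when the two classes differ in size.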

  11. Use of 16S-23S rRNA intergenic spacer region PCR and repetitive extragenic palindromic PCR analyses of Escherichia coli isolates to identify nonpoint fecal sources.

    PubMed

    Seurinck, Sylvie; Verstraete, Willy; Siciliano, Steven D

    2003-08-01

    Despite efforts to minimize fecal input into waterways, this kind of pollution continues to be a problem due to an inability to reliably identify nonpoint sources. Our objective was to find candidate source-specific Escherichia coli fingerprints as potential genotypic markers for raw sewage, horses, dogs, gulls, and cows. We evaluated 16S-23S rRNA intergenic spacer region (ISR)-PCR and repetitive extragenic palindromic (rep)-PCR analyses of E. coli isolates as tools to identify nonpoint fecal sources. The BOXA1R primer was used for rep-PCR analysis. A total of 267 E. coli isolates from different fecal sources were typed with both techniques. E. coli was found to be highly diverse. Only two candidate source-specific E. coli fingerprints, one for cow and one for raw sewage, were identified out of 87 ISR fingerprints. Similarly, there was only one candidate source-specific E. coli fingerprint for horse out of 59 BOX fingerprints. Jackknife analysis resulted in an average rate of correct classification (ARCC) of 83% for BOX-PCR analysis and 67% for ISR-PCR analysis for the five source categories of this study. When nonhuman sources were pooled so that each isolate was classified as animal or human derived (raw sewage), ARCCs of 82% for BOX-PCR analysis and 72% for ISR-PCR analysis were obtained. Critical factors affecting the utility of these methods, namely sample size and fingerprint stability, were also assessed. Chao1 estimation showed that generally 32 isolates per fecal source individual were sufficient to characterize the richness of the E. coli population of that source. The results of a fingerprint stability experiment indicated that BOX and ISR fingerprints were stable in natural waters at 4, 12, and 28 degrees C for 150 days. In conclusion, 16S-23S rRNA ISR-PCR and rep-PCR analyses of E. coli isolates have the potential to identify nonpoint fecal sources. A fairly small number of isolates was needed to find candidate source-specific E. 
coli fingerprints that were stable under the simulated environmental conditions. PMID:12902290
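
    The Chao1 richness estimate mentioned above depends only on how many fingerprint types were seen once or twice. The sketch below assumes the bias-corrected form of Chao1; the abstract does not say which variant was used, and the function name and isolate labels are hypothetical.

```python
from collections import Counter

def chao1(type_labels):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    counts = Counter(type_labels)  # number of isolates per fingerprint type
    s_obs = len(counts)            # observed number of distinct types
    f1 = sum(1 for c in counts.values() if c == 1)  # types seen exactly once
    f2 = sum(1 for c in counts.values() if c == 2)  # types seen exactly twice
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Hypothetical fingerprint types assigned to seven isolates from one source.
isolates = ["A", "A", "A", "B", "B", "C", "D"]
print(chao1(isolates))
```

    Plotting the estimate against the number of isolates typed is one way to judge when additional isolates stop adding new types.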

  12. BLAST: CORRELATIONS IN THE COSMIC FAR-INFRARED BACKGROUND AT 250, 350, AND 500 μm REVEAL CLUSTERING OF STAR-FORMING GALAXIES

    SciTech Connect

    Viero, Marco P.; Martin, Peter G.; Netterfield, Calvin B.; Ade, Peter A. R.; Griffin, Matthew; Hargrave, Peter C.; Mauskopf, Philip; Moncelsi, Lorenzo; Pascale, Enzo; Bock, James J.; Chapin, Edward L.; Halpern, Mark; Marsden, Gaelen; Devlin, Mark J.; Klein, Jeff; Gundersen, Joshua O.; Hughes, David H.; MacTavish, Carrie J.; Negrello, Mattia; Olmi, Luca

    2009-12-20

    We detect correlations in the cosmic far-infrared background due to the clustering of star-forming galaxies in observations made with the Balloon-borne Large Aperture Submillimeter Telescope, at 250, 350, and 500 μm. We perform jackknife and other tests to confirm the reality of the signal. The measured correlations are well fitted by a power law over scales of 5'-25', with $\Delta I/I = 15.1\% \pm 1.7\%$. We adopt a specific model for submillimeter sources in which the contribution to clustering comes from sources in the redshift ranges $1.3 \le z \le 2.2$, $1.5 \le z \le 2.7$, and $1.7 \le z \le 3.2$, at 250, 350, and 500 μm, respectively. With these distributions, our measurement of the power spectrum, $P(k_\theta)$, corresponds to linear bias parameters, $b = 3.8 \pm 0.6$, $3.9 \pm 0.6$, and $4.4 \pm 0.7$, respectively. We further interpret the results in terms of the halo model, and find that at the smaller scales, the simplest halo model fails to fit our results. One way to improve the fit is to increase the radius at which dark matter halos are artificially truncated in the model, which is equivalent to having some star-forming galaxies at $z \ge 1$ located in the outskirts of groups and clusters. In the context of this model, we find a minimum halo mass required to host a galaxy is $\log(M_{\rm min}/M_\odot) = 11.5^{+0.4}_{-0.1}$, and we derive effective biases $b_{\rm eff} = 2.2 \pm 0.2$, $2.4 \pm 0.2$, and $2.6 \pm 0.2$, and effective masses $\log(M_{\rm eff}/M_\odot) = 12.9 \pm 0.3$, $12.8 \pm 0.2$, and $12.7 \pm 0.2$, at 250, 350 and 500 μm, corresponding to spatial correlation lengths of $r_0 = 4.9$, $5.0$, and $5.2 \pm 0.7\ h^{-1}$ Mpc, respectively. Finally, we discuss implications for clustering measurement strategies with Herschel and Planck.

  13. Observer Performance in the Detection and Classification of Malignant Hepatic Nodules and Masses with CT Image-Space Denoising and Iterative Reconstruction

    PubMed Central

    Fletcher, Joel G.; Yu, Lifeng; Li, Zhoubo; Manduca, Armando; Blezek, Daniel J.; Hough, David M.; Venkatesh, Sudhakar K.; Brickner, Gregory C.; Cernigliaro, Joseph C.; Hara, Amy K.; Fidler, Jeff L.; Lake, David S.; Shiung, Maria; Lewis, David; Leng, Shuai; Augustine, Kurt E.; Carter, Rickey E.; Holmes, David R.; McCollough, Cynthia H.

    2015-01-01

    Purpose To determine if lower-dose computed tomographic (CT) scans obtained with adaptive image-based noise reduction (adaptive nonlocal means [ANLM]) or iterative reconstruction (sinogram-affirmed iterative reconstruction [SAFIRE]) result in reduced observer performance in the detection of malignant hepatic nodules and masses compared with routine-dose scans obtained with filtered back projection (FBP). Materials and Methods This study was approved by the institutional review board and was compliant with HIPAA. Informed consent was obtained from patients for the retrospective use of medical records for research purposes. CT projection data from 33 abdominal and 27 liver or pancreas CT examinations were collected (median volume CT dose index, 13.8 and 24.0 mGy, respectively). Hepatic malignancy was defined by progression or regression or with histopathologic findings. Lower-dose data were created by using a validated noise insertion method (10.4 mGy for abdominal CT and 14.6 mGy for liver or pancreas CT) and images reconstructed with FBP, ANLM, and SAFIRE. Four readers evaluated routine-dose FBP images and all lower-dose images, circumscribing liver lesions and selecting diagnosis. The jack-knife free-response receiver operating characteristic figure of merit (FOM) was calculated on a per-malignant nodule or per-mass basis. Noninferiority was defined by the lower limit of the 95% confidence interval (CI) of the difference between lower-dose and routine-dose FOMs being less than −0.10. Results Twenty-nine patients had 62 malignant hepatic nodules and masses. Estimated FOM differences between lower-dose FBP and lower-dose ANLM versus routine-dose FBP were noninferior (difference: −0.041 [95% CI: −0.090, 0.009] and −0.003 [95% CI: −0.052, 0.047], respectively). 
In patients with dedicated liver scans, lower-dose ANLM images were noninferior (difference: +0.015 [95% CI: −0.077, 0.106]), whereas lower-dose FBP images were not (difference −0.049 [95% CI: −0.140, 0.043]). In 37 patients with SAFIRE reconstructions, the three lower-dose alternatives were found to be noninferior to the routine-dose FBP. Conclusion At moderate levels of dose reduction, lower-dose FBP images without ANLM or SAFIRE were noninferior to routine-dose images for abdominal CT but not for liver or pancreas CT. PMID:26020436
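
    The noninferiority rule can be checked against the intervals quoted above. This sketch assumes the criterion is read as "the lower 95% CI limit of the FOM difference does not fall below −0.10", which is the reading consistent with the reported conclusions; the function name and dictionary labels are hypothetical.

```python
MARGIN = -0.10  # noninferiority margin stated in the abstract

def is_noninferior(ci_lower, margin=MARGIN):
    """True when the lower 95% CI limit of the FOM difference stays above the margin."""
    return ci_lower > margin

# (lower, upper) 95% CI limits of the FOM differences quoted in the abstract.
reported = {
    "lower-dose FBP, all scans": (-0.090, 0.009),
    "lower-dose ANLM, all scans": (-0.052, 0.047),
    "lower-dose ANLM, liver scans": (-0.077, 0.106),
    "lower-dose FBP, liver scans": (-0.140, 0.043),
}
for name, (lower, upper) in reported.items():
    verdict = "noninferior" if is_noninferior(lower) else "not shown noninferior"
    print(f"{name}: {verdict}")
```

    Only the lower limit enters the decision, so an interval that spans zero can still fail the test if it is wide enough.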

  14. Stock assessment of fishery target species in Lake Koka, Ethiopia.

    PubMed

    Tesfaye, Gashaw; Wolff, Matthias

    2015-09-01

    Effective management is essential for small-scale fisheries to continue providing food and livelihoods for households, particularly in developing countries where other options are often limited. Studies on the population dynamics and stock assessment of fishery target species are thus imperative to sustain their fisheries and the benefits for the society. In Lake Koka (Ethiopia), very little is known about the vital population parameters and exploitation status of the fishery target species: tilapia Oreochromis niloticus, common carp Cyprinus carpio and catfish Clarias gariepinus. Our study, therefore, aimed at determining the vital population parameters and assessing the status of these target species in Lake Koka using length frequency data collected quarterly from commercial catches from 2007-2012. A total of 20,097 fish specimens (distributed as 7,933 tilapia, 6,025 catfish and 6,139 common carp) were measured for the analysis. Von Bertalanffy growth parameters and their confidence intervals were determined from modal progression analysis using ELEFAN I and applying the jackknife technique. Mortality parameters were determined from length-converted catch curves and empirical models. The exploitation status of these target species was then assessed by computing exploitation rates (E) from mortality parameters as well as from size indicators, i.e., assessing the size distribution of fish catches relative to the size at maturity (Lm), the size that provides maximum cohort biomass (Lopt) and the abundance of mega-spawners. The mean values of the growth parameters L∞, K and the growth performance index ø' were 44.5 cm, 0.41/year and 2.90 for O. niloticus, 74.1 cm, 0.28/year and 3.19 for C. carpio and 121.9 cm, 0.16/year and 3.36 for C. gariepinus, respectively. The 95% confidence intervals of the estimates were also computed. Total mortality (Z) estimates were 1.47, 0.83 and 0.72/year for O. niloticus, C. carpio and C. gariepinus, respectively. 
    Our study suggests that O. niloticus is in a healthy state, while C. gariepinus shows signs of growth overfishing (when both exploitation rate (E) and size indicators were considered). In the case of C. carpio, the low exploitation rate encountered would point to underfishing, while the size indicators of the catches would suggest that too small fish are harvested, leading to growth overfishing. We concluded that fisheries production in Lake Koka could be enhanced by increasing E toward the optimum level of exploitation (Eopt) for the underexploited C. carpio and by increasing the size at first capture (Lc) toward the Lopt range for all target species. PMID:26666131
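
    The growth performance index can be recomputed from the quoted mean growth parameters. The sketch assumes the commonly used definition ø' = log10(K) + 2·log10(L∞), which the abstract does not spell out; the function name is hypothetical.

```python
import math

def growth_performance_index(l_inf_cm, k_per_year):
    """phi' = log10(K) + 2 * log10(L_inf), with L_inf in cm and K per year."""
    return math.log10(k_per_year) + 2.0 * math.log10(l_inf_cm)

# Mean L_inf (cm) and K (per year) quoted in the abstract.
species = {
    "O. niloticus": (44.5, 0.41),
    "C. carpio": (74.1, 0.28),
    "C. gariepinus": (121.9, 0.16),
}
for name, (l_inf, k) in species.items():
    print(f"{name}: phi' = {growth_performance_index(l_inf, k):.2f}")
```

    The recomputed values land within about 0.02 of the reported 2.90, 3.19 and 3.36. Small differences are plausible if the reported indices are averages over jackknife replicates rather than the index of the averaged parameters, though the abstract does not say which.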

  15. A stable pattern of EEG spectral coherence distinguishes children with autism from neuro-typical controls - a large case control study

    PubMed Central

    2012-01-01

    Background The autism rate has recently increased to 1 in 100 children. Genetic studies demonstrate poorly understood complexity. Environmental factors apparently also play a role. Magnetic resonance imaging (MRI) studies demonstrate increased brain sizes and altered connectivity. Electroencephalogram (EEG) coherence studies confirm connectivity changes. However, genetic-, MRI- and/or EEG-based diagnostic tests are not yet available. The varied study results likely reflect methodological and population differences, small samples and, for EEG, lack of attention to group-specific artifact. Methods Of the 1,304 subjects who participated in this study, with ages ranging from 1 to 18 years old and assessed with comparable EEG studies, 463 children were diagnosed with autism spectrum disorder (ASD); 571 children were neuro-typical controls (C). After artifact management, principal components analysis (PCA) identified EEG spectral coherence factors with corresponding loading patterns. The 2- to 12-year-old subsample consisted of 430 ASD- and 554 C-group subjects (n = 984). Discriminant function analysis (DFA) determined the spectral coherence factors' discrimination success for the two groups. Loading patterns on the DFA-selected coherence factors described ASD-specific coherence differences when compared to controls. Results Total sample PCA of coherence data identified 40 factors which explained 50.8% of the total population variance. For the 2- to 12-year-olds, the 40 factors showed highly significant group differences (P < 0.0001). Ten randomly generated split half replications demonstrated high-average classification success (C, 88.5%; ASD, 86.0%). Still higher success was obtained in the more restricted age sub-samples using the jackknifing technique: 2- to 4-year-olds (C, 90.6%; ASD, 98.1%); 4- to 6-year-olds (C, 90.9%; ASD 99.1%); and 6- to 12-year-olds (C, 98.7%; ASD, 93.9%). 
Coherence loadings demonstrated reduced short-distance and reduced, as well as increased, long-distance coherences for the ASD-groups, when compared to the controls. Average spectral loading per factor was wide (10.1 Hz). Conclusions Classification success suggests a stable coherence loading pattern that differentiates ASD- from C-group subjects. This might constitute an EEG coherence-based phenotype of childhood autism. The predominantly reduced short-distance coherences may indicate poor local network function. The increased long-distance coherences may represent compensatory processes or reduced neural pruning. The wide average spectral range of factor loadings may suggest over-damped neural networks. PMID:22730909

  16. Power Laws for Heavy-Tailed Distributions: Modeling Allele and Haplotype Diversity for the National Marrow Donor Program

    PubMed Central

    Gragert, Loren; Maiers, Martin; Chatterjee, Ansu; Albrecht, Mark

    2015-01-01

    Measures of allele and haplotype diversity, which are fundamental properties in population genetics, often follow heavy tailed distributions. These measures are of particular interest in the field of hematopoietic stem cell transplant (HSCT). Donor/Recipient suitability for HSCT is determined by Human Leukocyte Antigen (HLA) similarity. Match predictions rely upon a precise description of HLA diversity, yet classical estimates are inaccurate given the heavy-tailed nature of the distribution. This directly affects HSCT matching and diversity measures in broader fields such as species richness. We, therefore, have developed a power-law based estimator to measure allele and haplotype diversity that accommodates heavy tails using the concepts of regular variation and occupancy distributions. Application of our estimator to 6.59 million donors in the Be The Match Registry revealed that haplotypes follow a heavy tail distribution across all ethnicities: for example, 44.65% of the European American haplotypes are represented by only 1 individual. Indeed, our discovery rate of all U.S. European American haplotypes is estimated at 23.45% based upon sampling 3.97% of the population, leaving a large number of unobserved haplotypes. Population coverage, however, is much higher at 99.4% given that 90% of European Americans carry one of the 4.5% most frequent haplotypes. Alleles were found to be less diverse suggesting the current registry represents most alleles in the population. Thus, for HSCT registries, haplotype discovery will remain high with continued recruitment to a very deep level of sampling, but population coverage will not. Finally, we compared the convergence of our power-law versus classical diversity estimators such as Capture recapture, Chao, ACE and Jackknife methods. When fit to the haplotype data, our estimator displayed favorable properties in terms of convergence (with respect to sampling depth) and accuracy (with respect to diversity estimates). 
This suggests that power-law based estimators offer a valid alternative to classical diversity estimators and may have broad applicability in the field of population genetics. PMID:25901749
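
    For comparison with the power-law approach, one of the classical estimators named above can be sketched briefly. The example assumes the first-order jackknife in its abundance-based form, S_obs + f1·(n − 1)/n, where f1 is the number of types seen in exactly one individual and n is the number of individuals sampled; the abstract does not specify which jackknife variant was compared, and the function name and counts are hypothetical.

```python
def jackknife1_richness(counts):
    """First-order jackknife richness from per-type abundance counts."""
    observed = [c for c in counts if c > 0]
    n = sum(observed)                           # individuals sampled
    s_obs = len(observed)                       # distinct types observed
    f1 = sum(1 for c in observed if c == 1)     # types seen exactly once
    return s_obs + f1 * (n - 1) / n

# Hypothetical haplotype counts: two common types and three singletons.
haplotype_counts = [5, 3, 1, 1, 1]
print(jackknife1_richness(haplotype_counts))
```

    Because the correction is bounded by the number of singletons, an estimator of this form can add at most f1 unseen types, which is one reason it may lag behind when the distribution is heavy-tailed and singletons keep accumulating with sampling depth.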

  17. Establishing macroecological trait datasets: digitalization, extrapolation, and validation of diet preferences in terrestrial mammals worldwide

    PubMed Central

    Kissling, Wilm Daniel; Dalby, Lars; Fløjgaard, Camilla; Lenoir, Jonathan; Sandel, Brody; Sandom, Christopher; Trøjelsgaard, Kristian; Svenning, Jens-Christian

    2014-01-01

    Ecological trait data are essential for understanding the broad-scale distribution of biodiversity and its response to global change. For animals, diet represents a fundamental aspect of species’ evolutionary adaptations, ecological and functional roles, and trophic interactions. However, the importance of diet for macroevolutionary and macroecological dynamics remains little explored, partly because of the lack of comprehensive trait datasets. We compiled and evaluated a comprehensive global dataset of diet preferences of mammals (“MammalDIET”). Diet information was digitized from two global and cladewide data sources and errors of data entry by multiple data recorders were assessed. We then developed a hierarchical extrapolation procedure to fill-in diet information for species with missing information. Missing data were extrapolated with information from other taxonomic levels (genus, other species within the same genus, or family) and this extrapolation was subsequently validated both internally (with a jack-knife approach applied to the compiled species-level diet data) and externally (using independent species-level diet information from a comprehensive continentwide data source). Finally, we grouped mammal species into trophic levels and dietary guilds, and their species richness as well as their proportion of total richness were mapped at a global scale for those diet categories with good validation results. The success rate of correctly digitizing data was 94%, indicating that the consistency in data entry among multiple recorders was high. Data sources provided species-level diet information for a total of 2033 species (38% of all 5364 terrestrial mammal species, based on the IUCN taxonomy). For the remaining 3331 species, diet information was mostly extrapolated from genus-level diet information (48% of all terrestrial mammal species), and only rarely from other species within the same genus (6%) or from family level (8%). 
Internal and external validation showed that: (1) extrapolations were most reliable for primary food items; (2) several diet categories (“Animal”, “Mammal”, “Invertebrate”, “Plant”, “Seed”, “Fruit”, and “Leaf”) had high proportions of correctly predicted diet ranks; and (3) the potential of correctly extrapolating specific diet categories varied both within and among clades. Global maps of species richness and proportion showed congruence among trophic levels, but also substantial discrepancies between dietary guilds. MammalDIET provides a comprehensive, unique and freely available dataset on diet preferences for all terrestrial mammals worldwide. It enables broad-scale analyses for specific trophic levels and dietary guilds, and a first assessment of trait conservatism in mammalian diet preferences at a global scale. The digitalization, extrapolation and validation procedures could be transferable to other trait data and taxa. PMID:25165528

  18. Frequency-dependent Lg Q within the continental United States

    USGS Publications Warehouse

    Erickson, D.; McNamara, D.E.; Benz, H.M.

    2004-01-01

    Frequency-dependent crustal attenuation (1/Q) is determined for seven distinct physiographic/tectonic regions of the continental United States using high-quality Lg waveforms recorded on broadband stations in the frequency band 0.5 to 16 Hz. Lg attenuation is determined from time-domain amplitude measurements in one-octave frequency bands centered on the frequencies 0.75, 1.0, 3.0, 6.0, and 12.0 Hz. Modeling errors are determined using a delete-j jackknife resampling technique. The frequency-dependent quality factor is modeled in the form $Q = Q_0 f^{\eta}$. Regions were initially selected based on tectonic provinces but were eventually limited and adjusted to maximize ray path coverage in each area. Earthquake data were recorded on several different networks and constrained to events occurring within the crust (<40 km depth) and at least mb 3.5 in size. A singular value decomposition inversion technique was applied to the data to simultaneously solve for source and receiver terms along with Q for each region at specific frequencies. The lowest crustal Q was observed in northern and southern California, where Q is described by the functions $Q = 152(\pm 37) f^{0.72(\pm 0.16)}$ and $Q = 105(\pm 26) f^{0.67(\pm 0.16)}$, respectively. The Basin and Range Province, Pacific Northwest, and Rocky Mountain states also display lower Q and a strong frequency dependence characterized by the functions $Q = 200(\pm 40) f^{0.68(\pm 0.12)}$, $Q = 152(\pm 49) f^{0.76(\pm 0.18)}$, and $Q = 166(\pm 37) f^{0.61(\pm 0.14)}$, respectively. In contrast, the central and northeast United States, with Q functions $Q = 640(\pm 225) f^{0.344(\pm 0.22)}$ and $Q = 650(\pm 143) f^{0.36(\pm 0.14)}$, respectively, show a high crustal Q and a weaker frequency dependence. These results improve upon previous Lg modeling by subdividing the United States into smaller, distinct tectonic regions and using significantly more data that provide improved constraints on frequency-dependent attenuation and errors. 
A detailed attenuation map of the continental United States can provide significant input into hazard map mitigation. Both scattering and intrinsic attenuation mechanisms are likely to play a comparable role in the frequency range considered in the study.
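
    How a delete-j jackknife can attach an error to the frequency exponent of a power-law Q model is sketched below. The example fits a power law in frequency (exponent called eta here) by least squares in log-log space and assumes that all subsets of size j are deleted in turn, with the variance scaled by (n − j)/(j·N) for N subsets; the abstract does not describe how its subsets were formed, and the function names and Q values are hypothetical.

```python
import math
from itertools import combinations

def fit_power_law(freqs, qs):
    """Least-squares fit of log10 Q = log10 Q0 + eta * log10 f; returns (Q0, eta)."""
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(q) for q in qs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    eta = sxy / sxx
    return 10 ** (mean_y - eta * mean_x), eta

def delete_j_jackknife_se(freqs, qs, j=1):
    """Standard error of eta from refits with every subset of j points removed."""
    n = len(freqs)
    etas = []
    for dropped in combinations(range(n), j):
        keep = [i for i in range(n) if i not in dropped]
        _, eta = fit_power_law([freqs[i] for i in keep], [qs[i] for i in keep])
        etas.append(eta)
    mean_eta = sum(etas) / len(etas)
    var = (n - j) / (j * len(etas)) * sum((e - mean_eta) ** 2 for e in etas)
    return math.sqrt(var)

# Band-centre frequencies from the abstract; the Q values are hypothetical.
centre_freqs = [0.75, 1.0, 3.0, 6.0, 12.0]
q_values = [165.0, 205.0, 420.0, 680.0, 1080.0]
q0, eta = fit_power_law(centre_freqs, q_values)
print(f"Q0 = {q0:.0f}, eta = {eta:.2f} +/- {delete_j_jackknife_se(centre_freqs, q_values, j=2):.2f}")
```

    With j = 1 the scaling reduces to the familiar (n − 1)/n factor of the ordinary jackknife; at least two points with distinct frequencies must remain after deletion for the refit to be defined.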

  19. Fossil Chironomidae (Insecta: Diptera) as quantitative indicators of past salinity in African lakes

    NASA Astrophysics Data System (ADS)

    Eggermont, Hilde; Heiri, Oliver; Verschuren, Dirk

    2006-08-01

    We surveyed sub-fossil chironomid assemblages in surface sediments of 73 low- to mid-elevation lakes in tropical East Africa (Uganda, Kenya, Tanzania, Ethiopia) to develop inference models for quantitative paleosalinity reconstruction. Using a calibration data set of 67 lakes with surface-water conductivity between 34 and 68,800 μS/cm, trial models based on partial least squares (PLS), weighted-averaging (WA), weighted-averaging partial least squares (WA-PLS), maximum likelihood (ML), and the weighted modern analogue technique (WMAT) produced jack-knifed coefficients of determination (r²) between 0.83 and 0.87, and root-mean-squared errors of prediction (RMSEP) between 0.27 and 0.31 log10 conductivity units, values indicating that fossil assemblages of African Chironomidae can be valuable indicators of past salinity change. The new inference models improve on previous models, which were calibrated with presence-absence data from live collections, by the much greater information content of the calibration data set, and greater probability of finding good modern analogues for fossil assemblages. However, inferences still suffered to a greater (WA, WMAT) or lesser (WA-PLS, PLS and ML) extent from weak correlation between chironomid species distribution and salinity in a broad range of fresh waters, and apparent threshold response of African chironomid communities to salinity change near 3000 μS/cm. To improve model sensitivity in freshwater lakes we expanded the calibration data set with 11 dilute (6-61 μS/cm) high-elevation lakes on Mt. Kenya (Kenya) and the Ruwenzori Mts. (Uganda). This did not appreciably improve models' error statistics, in part because it introduced a secondary environmental gradient to the faunal data, probably temperature. 
To evaluate whether a chironomid-based salinity inference model calibrated in East African lakes could be meaningfully used for environmental reconstruction elsewhere on the continent, we expanded the calibration data set with 8 fresh (15-168 μS/cm) lakes in Cameroon, West Africa, and one hypersaline desert lake in Chad. This experiment yielded poorer error statistics, primarily because the need to amalgamate East and West African sister taxa reduced overall taxonomic resolution and increased the mean tolerance range of retained taxa. However, the merged data set constrained better the salinity optimum of several freshwater taxa, and further increased the probability of finding good modern analogues. We then used chironomid stratigraphic data and independent proxy reconstructions from two fluctuating lakes in Kenya to compare the performance of new and previous African salinity-inference models. This analysis revealed significant differences between the various numerical techniques in reconstructed salinity trends through time, due to their different sensitivity to the presence or relative abundance of certain key taxa, combined with the above-mentioned threshold faunal response to salinity change. Simple WA and WMAT produced ecologically sensible reconstructions because their step-like change in inferred conductivity near 3000 μS/cm mirrors the relatively rapid transitions between fresh and saline lake phases associated with climate-driven lake-level change in shallow tropical closed-basin lakes. Statistical camouflaging of this threshold faunal response in WA-PLS and ML models resulted in less trustworthy reconstructions of past salinity in lakes crossing the freshwater-saline boundary. 
We conclude that selection of a particular inference model should not be based on statistical performance measures alone, but should also consider chironomid community ecology in the study region and the amplitude of reconstructed environmental change relative to the modern environmental gradient represented in the calibration data set.

  20. Implementing the national AIGA flash flood warning system in France

    NASA Astrophysics Data System (ADS)

    Organde, Didier; Javelle, Pierre; Demargne, Julie; Arnaud, Patrick; Caseri, Angelica; Fine, Jean-Alain; de Saint Aubin, Céline

    2015-04-01

    The French national hydro-meteorological and flood forecasting centre (SCHAPI) aims to implement a national flash flood warning system to improve flood alerts for small-to-medium (up to 1000 km²) ungauged basins. This system is based on the AIGA method, co-developed by IRSTEA over the last 10 years. The method, initially set up for the Mediterranean area, is based on a simple event-based hourly hydrologic distributed model run every 15 minutes (Javelle et al. 2014). The hydrologic model ingests operational radar-gauge rainfall grids from Météo-France at a 1-km² resolution to produce discharges for successive outlets along the river network. Discharges are then compared to regionalized flood quantiles of given return periods, and warnings (expressed as the range of the return period estimated in real time) are provided on a river network map. The main interest of the method is to provide forecasters and emergency services with a synthetic view in real time of the ongoing flood situation, information that is especially critical in ungauged flood-prone areas. In its enhanced national version, the hourly event-based distributed model is coupled to a continuous daily rainfall-runoff model which provides baseflow and a soil moisture index (for each 1-km² pixel) at the beginning of the hourly simulation. The rainfall-runoff models were calibrated on a selection of 700 French hydrometric stations using the Météo-France radar-gauge reanalysis dataset for the 2002-2006 period. To estimate model parameters for ungauged basins, the two hydrologic models were regionalised by testing both regressions (using different catchment attributes, such as catchment area, soil type, and climate characteristics) and spatial proximity techniques (transposing parameters from neighbouring donor catchments), as well as different homogeneous hydrological areas. The most valuable regionalisation method was determined for each model through jack-knife cross-validation. 
The system performance was then evaluated with contingency criteria (e.g., Critical Success Index, Probability Of Detection, Success Ratio) using operational rainfall radar-gauge products from Météo-France for the 2009-2012 period. The regionalised parameters of the distributed model were finally adjusted for each homogeneous hydrological area to optimize the Heidke skill score (HSS) calculated with three levels of warnings (2-, 10- and 50-year flood quantiles). This work is currently being implemented by the SCHAPI to set up an automated national flash flood warning system by 2016. Planned improvements include developing a unique continuous model to be run at a sub-hourly timestep, discharge assimilation, as well as integrating precipitation forecasts while accounting for the main sources of forecast uncertainty. Javelle, P., Demargne, J., Defrance, D., and Arnaud, P. 2014. Evaluating flash flood warnings at ungauged locations using post-event surveys: a case study with the AIGA warning system. Hydrological Sciences Journal, DOI: 10.1080/02626667.2014.923970
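The contingency criteria named above have standard definitions, which can be sketched as follows. The counts are invented for illustration and the function names are hypothetical. The Heidke skill score is written in its general form for a K×K table, so it also covers a case with several warning levels, although how the study built its tables is not stated in the abstract.

```python
import numpy as np

def contingency_scores(hits, false_alarms, misses):
    """Probability Of Detection, Success Ratio and Critical Success Index."""
    pod = hits / (hits + misses)
    sr = hits / (hits + false_alarms)
    csi = hits / (hits + false_alarms + misses)
    return pod, sr, csi

def heidke_skill_score(table):
    """HSS for a K x K table (rows: forecast category, columns: observed)."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    proportion_correct = np.trace(table) / n
    # Agreement expected by chance, from the row and column totals.
    expected_correct = np.sum(table.sum(axis=1) * table.sum(axis=0)) / n ** 2
    return (proportion_correct - expected_correct) / (1.0 - expected_correct)

# Hypothetical counts: 30 hits, 10 false alarms, 20 misses, 940 correct negatives.
pod, sr, csi = contingency_scores(hits=30, false_alarms=10, misses=20)
hss = heidke_skill_score([[30, 10], [20, 940]])
print(pod, sr, csi, round(hss, 3))  # 0.6 0.75 0.5 0.651
```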

  1. Source mechanisms of the 2000 earthquake swarm in the West Bohemia/Vogtland region (Central Europe)

    NASA Astrophysics Data System (ADS)

    Horálek, Josef; Šílený, Jan

    2013-08-01

    An earthquake swarm of magnitudes up to ML = 3.2 occurred in the region of West Bohemia/Vogtland (border area between Czech Republic and Germany) in autumn 2000. This swarm consisted of nine episodic phases and lasted 4 months. We retrieved source mechanisms of 102 earthquakes with magnitudes between ML = 1.6 and 3.2 by inverting the peak amplitudes of direct P and SH waves, which were determined from ground motion seismograms. The investigated events cover the whole swarm activity in both time and space. We use data from permanent stations of the seismic network WEBNET and from temporary stations, which were deployed in the epicentral area during the swarm; the number of stations varied from 7 to 18. The unconstrained moment tensor (MT) expression of the mechanism, which describes a general system of dipoles, that is both double-couple (DC) and non-DC sources, was applied. MTs of each earthquake were estimated by inversion of three different sets of data: P-wave amplitudes only, P- and SH-wave amplitudes, and P-wave amplitudes along with the SH-wave amplitudes from four a priori selected `base' WEBNET stations. The respective MT solutions are nearly identical for each event investigated. The resultant mechanisms of all events are dominantly DCs with only insignificant non-DC components mostly not exceeding 10 per cent. We checked the reliability of the MTs in jackknife trials eliminating some data; we simulated the mislocation of the hypocentre or contaminated the P- and SH-wave amplitudes by accidental errors. These tests proved stable and well constrained MT solutions. The massive dominance of the DC in all investigated events implies that the 2000 swarm consisted of a large number of pure shears along a fault plane. The focal mechanisms indicate both oblique-normal and oblique-thrust faulting; however, the oblique-normal faulting prevails. 
The predominant strikes and dips of the oblique-normal events fit well the geometry of the main fault plane Nový Kostel (NK) and also match the strike, dip and rake of the largest ML = 4.6 earthquake of a strong swarm in 1985/86. On the contrary, the 2000 source mechanisms differ substantially from those of the 1997-swarm (which took place in two fault segments at the edge of the main NK fault plane) in both the faulting and the content of non-DC components. Further, we found that the scalar seismic moment M0 is related to the local magnitude ML used by WEBNET as M0 ∝ 10^(1.12 ML), which differs from the scaling law using moment magnitude Mw, that is M0 ∝ 10^(1.5 Mw).

  2. Optimized multiple quantum MAS lineshape simulations in solid state NMR

    NASA Astrophysics Data System (ADS)

    Brouwer, William J.; Davis, Michael C.; Mueller, Karl T.

    2009-10-01

    The majority of nuclei available for study in solid state Nuclear Magnetic Resonance have half-integer spin I>1/2, with corresponding electric quadrupole moment. As such, they may couple with a surrounding electric field gradient. This effect introduces anisotropic line broadening to spectra, arising from distinct chemical species within polycrystalline solids. In Multiple Quantum Magic Angle Spinning (MQMAS) experiments, a second frequency dimension is created, devoid of quadrupolar anisotropy. As a result, the center of gravity of peaks in the high resolution dimension is a function of isotropic second order quadrupole and chemical shift alone. However, for complex materials, these parameters take on a stochastic nature due in turn to structural and chemical disorder. Lineshapes may still overlap in the isotropic dimension, complicating the task of assignment and interpretation. A distributed computational approach is presented here which permits simulation of the two-dimensional MQMAS spectrum, generated by random variates from model distributions of isotropic chemical and quadrupole shifts. Owing to the non-convex nature of the residual sum of squares (RSS) function between experimental and simulated spectra, simulated annealing is used to optimize the simulation parameters. In this manner, local chemical environments for disordered materials may be characterized, and via a re-sampling approach, error estimates for parameters produced.
    Program summary
    Program title: mqmasOPT
    Catalogue identifier: AEEC_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEEC_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
    No. of lines in distributed program, including test data, etc.: 3650
    No. of bytes in distributed program, including test data, etc.: 73 853
    Distribution format: tar.gz
    Programming language: C, OCTAVE
    Computer: UNIX/Linux
    Operating system: UNIX/Linux
    Has the code been vectorised or parallelized?: Yes
    RAM: Example: (1597 powder angles) × (200 Samples) × (81 F2 frequency pts) × (31 F1 frequency points) = 3.5M, SMP AMD opteron
    Classification: 2.3
    External routines: OCTAVE (http://www.gnu.org/software/octave/), GNU Scientific Library (http://www.gnu.org/software/gsl/), OPENMP (http://openmp.org/wp/)
    Nature of problem: The optimal simulation and modeling of multiple quantum magic angle spinning NMR spectra, for general systems, especially those with mild to significant disorder. The approach outlined and implemented in C and OCTAVE also produces model parameter error estimates.
    Solution method: A model for each distinct chemical site is first proposed, for the individual contribution of crystallite orientations to the spectrum. This model is averaged over all powder angles [1], as well as the (stochastic) parameters: isotropic chemical shift and quadrupole coupling constant. The latter is accomplished via sampling from a bi-variate Gaussian distribution, using the Box-Muller algorithm to transform Sobol (quasi) random numbers [2]. A simulated annealing optimization is performed, and finally the non-linear jackknife [3] is applied in developing model parameter error estimates.
    Additional comments: The distribution contains a script, mqmasOpt.m, which runs in the OCTAVE language workspace.
    Running time: Example: (1597 powder angles) × (200 Samples) × (81 F2 frequency pts) × (31 F1 frequency points) = 58.35 seconds, SMP AMD opteron.
    References: [1] S.K. Zaremba, Annali di Matematica Pura ed Applicata 73 (1966) 293. [2] H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods, SIAM, 1992. [3] T. Fox, D. Hinkley, K. Larntz, Technometrics 22 (1980) 29.
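The sampling step in the solution method, drawing from a bivariate Gaussian via the Box-Muller transform, can be sketched as below. This is not the mqmasOPT code: it substitutes NumPy pseudo-random uniforms for the Sobol quasi-random numbers mentioned in the summary, and the mean vector, covariance matrix, and function names are hypothetical.

```python
import numpy as np

def box_muller(u1, u2):
    """Map two independent uniform samples to two independent N(0, 1) samples."""
    radius = np.sqrt(-2.0 * np.log(u1))
    angle = 2.0 * np.pi * u2
    return radius * np.cos(angle), radius * np.sin(angle)

def bivariate_gaussian(n, mean, cov, seed=0):
    """Draw n correlated pairs from N(mean, cov) via Box-Muller plus Cholesky."""
    rng = np.random.default_rng(seed)
    u1 = 1.0 - rng.random(n)  # lies in (0, 1], so log(u1) stays finite
    u2 = rng.random(n)
    z = np.column_stack(box_muller(u1, u2))  # independent standard normals
    chol = np.linalg.cholesky(np.asarray(cov, dtype=float))
    return np.asarray(mean, dtype=float) + z @ chol.T

# Hypothetical site parameters: (isotropic shift, quadrupole coupling constant).
mean = [10.0, 3.0]
cov = [[4.0, 1.2], [1.2, 1.0]]
samples = bivariate_gaussian(200_000, mean, cov)
print(samples.mean(axis=0), np.cov(samples, rowvar=False))
```

Multiplying the independent normals by the Cholesky factor of the covariance matrix is one common way to introduce the correlation between the two parameters.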

  3. SIFlore, a dataset of geographical distribution of vascular plants covering five centuries of knowledge in France: Results of a collaborative project coordinated by the Federation of the National Botanical Conservatories

    PubMed Central

    Just, Anaïs; Gourvil, Johan; Millet, Jérôme; Boullet, Vincent; Milon, Thomas; Mandon, Isabelle; Dutrève, Bruno

    2015-01-01

    Abstract More than 20 years ago, the French Muséum National d’Histoire Naturelle1 (MNHN, Secretariat of the Fauna and Flora) published the first part of an atlas of the flora of France at a 20km spatial resolution, accounting for 645 taxa (Dupont 1990). Since then, at the national level, there has not been any work on this scale relating to flora distribution, despite the obvious need for a better understanding. In 2011, in response to this need, the Federation des Conservatoires Botaniques Nationaux2 (FCBN, http://www.fcbn.fr) launched an ambitious collaborative project involving eleven national botanical conservatories of France. The project aims to establish a formal procedure and standardized system for data hosting, aggregation and publication for four areas: flora, fungi, vegetation and habitats. In 2014, the first phase of the project led to the development of the national flora dataset: SIFlore. As it includes about 21 million records of flora occurrences, this is currently the most comprehensive dataset on the distribution of vascular plants (Tracheophyta) in the French territory. SIFlore contains information for about 15'454 plant taxa occurrences (indigenous and alien taxa) in metropolitan France and Reunion Island, from 1545 until 2014. The data records were originally collated from inventories, checklists, literature and herbarium records. SIFlore was developed by assembling flora datasets from the regional to the national level. At the regional level, source records are managed by the national botanical conservatories that are responsible for flora data collection and validation. In order to present our results, a geoportal was developed by the Fédération des conservatoires botaniques nationaux that allows the SIFlore dataset to be publically viewed. This portal is available at: http://siflore.fcbn.fr. 
As the FCBN belongs to the Information System for Nature and Landscapes’ (SINP), a governmental program, the dataset is also accessible through the websites of the National Inventory of Natural Heritage (http://www.inpn.fr) and the Global Biodiversity Information Facility (http://www.gbif.fr). SIFlore is regularly updated with additional data records. It is also planned to expand the scope of the dataset to include information about taxon biology, phenology, ecology, chorology, frequency, conservation status and seed banks. A map showing an estimation of the dataset completeness (based on Jackknife 1 estimator) is presented and included as a numerical appendix. Purpose: SIFlore aims to make the data of the flora of France available at the national level for conservation, policy management and scientific research. Such a dataset will provide enough information to allow for macro-ecological reviews of species distribution patterns and, coupled with climatic or topographic datasets, the identification of determinants of these patterns. This dataset can be considered as the primary indicator of the current state of knowledge of flora distribution across France. At a policy level, and in the context of global warming, this should promote the adoption of new measures aiming to improve and intensify flora conservation and surveys. PMID:26491386
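For readers unfamiliar with the first-order jackknife (Jackknife 1) richness estimator, a minimal sketch follows. It uses the standard incidence-based form S_jack1 = S_obs + Q1 (m - 1)/m, where Q1 is the number of taxa recorded in exactly one of m sampling units. Taking completeness as observed over estimated richness, as well as the toy matrix and function name, are assumptions; the abstract does not say how the SIFlore completeness map was computed.

```python
import numpy as np

def jackknife1_richness(incidence):
    """First-order jackknife richness from a (sampling units x taxa) 0/1 matrix."""
    incidence = np.asarray(incidence, dtype=bool)
    m = incidence.shape[0]                           # number of sampling units
    occurrences = incidence.sum(axis=0)              # units in which each taxon occurs
    s_obs = int(np.count_nonzero(occurrences))       # taxa seen at least once
    uniques = int(np.count_nonzero(occurrences == 1))  # taxa seen in exactly one unit
    s_est = s_obs + uniques * (m - 1) / m
    return s_obs, s_est

# Toy grid cell: 4 surveys (rows) and 6 candidate taxa (columns).
surveys = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
]
s_obs, s_est = jackknife1_richness(surveys)
completeness = s_obs / s_est
print(s_obs, s_est, round(completeness, 3))  # 5 7.25 0.69
```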

  4. Program package for multicanonical simulations of U(1) lattice gauge theory-Second version

    NASA Astrophysics Data System (ADS)

    Bazavov, Alexei; Berg, Bernd A.

    2013-03-01

    A new version STMCMUCA_V1_1 of our program package is available. It eliminates compatibility problems of our Fortran 77 code, originally developed for the g77 compiler, with Fortran 90 and 95 compilers.
    New version program summary
    Program title: STMC_U1MUCA_v1_1
    Catalogue identifier: AEET_v1_1
    Licensing provisions: Standard CPC license, http://cpc.cs.qub.ac.uk/licence/licence.html
    Programming language: Fortran 77 compatible with Fortran 90 and 95
    Computers: Any capable of compiling and executing Fortran code
    Operating systems: Any capable of compiling and executing Fortran code
    RAM: 10 MB and up depending on lattice size used
    No. of lines in distributed program, including test data, etc.: 15059
    No. of bytes in distributed program, including test data, etc.: 215733
    Keywords: Markov chain Monte Carlo, multicanonical, Wang-Landau recursion, Fortran, lattice gauge theory, U(1) gauge group, phase transitions of continuous systems
    Classification: 11.5
    Catalogue identifier of previous version: AEET_v1_0
    Journal Reference of previous version: Computer Physics Communications 180 (2009) 2339-2347
    Does the new version supersede the previous version?: Yes
    Nature of problem: Efficient Markov chain Monte Carlo simulation of U(1) lattice gauge theory (or other continuous systems) close to its phase transition. Measurements and analysis of the action per plaquette, the specific heat, Polyakov loops and their structure factors.
    Solution method: Multicanonical simulations with an initial Wang-Landau recursion to determine suitable weight factors. Reweighting to physical values using logarithmic coding and calculating jackknife error bars.
    Reasons for the new version: The previous version was developed for the g77 compiler Fortran 77 version. Compiler errors were encountered with Fortran 90 and Fortran 95 compilers (specified below).
    Summary of revisions: epsilon=one/10**10 is replaced by epsilon/10.0D10 in the parameter statements of the subroutines u1_bmha.f, u1_mucabmha.f, u1wl_backup.f, u1wlread_backup.f of the folder Libs/U1_par. For the tested compilers, script files are added in the folder ExampleRuns, and readme.txt files are now provided in all subfolders of ExampleRuns. The gnuplot driver files produced by the routine hist_gnu.f of Libs/Fortran are adapted to the syntax required by gnuplot version 4.0 and higher.
    Restrictions: Due to the use of explicit real*8 initialization, the conversion into real*4 will require extra changes besides replacing the implicit.sta file by its real*4 version.
    Unusual features: The programs have to be compiled with script files like those contained in the folder ExampleRuns, as explained in the original paper.
    Running time: The prepared test runs took up to 74 minutes to execute on a 2 GHz PC.
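Jackknife error bars of the kind mentioned in the solution method are typically computed by binning a Monte Carlo time series and re-evaluating the estimator with one bin left out at a time. The sketch below is written in Python rather than the package's Fortran and is not taken from it. The binning scheme, the variance estimator standing in for a specific-heat-like quantity, and the synthetic uncorrelated data are all assumptions.

```python
import numpy as np

def jackknife_error(samples, estimator, n_bins=20):
    """Estimate and jackknife error bar from leave-one-bin-out re-evaluations."""
    samples = np.asarray(samples, dtype=float)
    usable = (len(samples) // n_bins) * n_bins   # drop the remainder, if any
    bins = samples[:usable].reshape(n_bins, -1)
    full = estimator(samples[:usable])
    # Re-evaluate the estimator with bin k removed, for every k.
    jack = np.array([estimator(np.delete(bins, k, axis=0).ravel())
                     for k in range(n_bins)])
    err = np.sqrt((n_bins - 1) / n_bins * np.sum((jack - jack.mean()) ** 2))
    return full, err

def fluctuation(x):
    """<x^2> - <x>^2, a nonlinear function of averages (specific-heat-like)."""
    return np.mean(x ** 2) - np.mean(x) ** 2

# Synthetic stand-in for a measured action-per-plaquette time series.
rng = np.random.default_rng(0)
series = rng.normal(0.6, 0.02, size=10_000)
value, error = jackknife_error(series, fluctuation)
mean_value, mean_error = jackknife_error(series, np.mean)
print(value, error, mean_value, mean_error)
```

For a plain average, the jackknife error reduces to the usual standard error of the bin means; the method earns its keep for nonlinear quantities such as the fluctuation above, where simple error propagation is awkward.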

  5. Comparison of soft computing systems for the post-calibration of weather radar

    NASA Astrophysics Data System (ADS)

    Hessami Kermani, Masoud Reza

    The most common tools for monitoring rainfall events are raingauges and weather radar. Networks of raingauges provide accurate point estimates of rainfall when appropriately set, but their usually low density considerably restricts the spatial resolution of the gathered information. Such networks, with raingauges at distinct points, do not reflect the spatial distribution of rainfall. The quality of raingauge observations is also susceptible to some error sources, for example wind effects around the raingauges and poor raingauge reports due to hardware problems. Radar systems offer observations with high spatial and temporal resolution, which capture the space-time evolution of a rainfall event much more efficiently than raingauge networks. However, radar measurements are not free of errors, owing to a variety of factors including ground clutter, bright bands, anomalous propagation, beam blockages, and attenuation. The effectiveness of weather radar operation is strongly linked to rigorous calibration. Various methods have been proposed to calibrate radar data. They can be classified into two main categories: deterministic and statistical. The deterministic approach involves the calibration of radar rainfall estimations against raingauge observations. The statistical approach includes multivariate analysis and cokriging. Geostatistical approaches are known as the best methods for radar-raingauge data integration, but they are usually inefficient in real time, especially when dealing with the sampling rates of one hour or less necessary for urban and small watershed applications. Such methods also rely heavily on human expertise, which can lead to user-dependent results. The objectives of this research are to introduce and to investigate the feasibility of soft computing systems for the post-calibration of weather radar in comparison with the best existing method based on geostatistics. 
In this work, the soft computing systems include artificial neural networks and the Adaptive Neuro-Fuzzy Inference System (ANFIS), and the geostatistical approach includes residual kriging. The residual kriging calibration results are satisfactory; however, this method relies on stationarity hypotheses and requires variogram modeling, making it difficult to use in an operational context. This method has the advantage of providing a map of mean squared errors for the estimations, based on variogram modeling. For the artificial neural network, thirteen variants of multilayer feedforward networks and two variants of radial basis functions are tested in this work. The neural calibration results showed that the Levenberg-Marquardt algorithm using Bayesian regularization is robust and reliable for radar-raingauge data integration. The ANFIS offers the precision and learning capability of artificial neural networks combined with the advantages of fuzzy logic. This method, based on the Jackknife approach, allows the use of all the available data for training and checking the neuro-fuzzy inference system, and provides a degree of reliability of the post-calibration. The training and interpolation results of the proposed methods can be obtained within just a few seconds using an ordinary personal computer, which is far faster than geostatistical approaches. The proposed algorithms would be very efficient for real-time post-calibration.

  6. Validation Analysis of the Shoal Groundwater Flow and Transport Model

    SciTech Connect

    A. Hassan; J. Chapman

    2008-11-01

    Environmental restoration at the Shoal underground nuclear test is following a process prescribed by a Federal Facility Agreement and Consent Order (FFACO) between the U.S. Department of Energy, the U.S. Department of Defense, and the State of Nevada. Characterization of the site included two stages of well drilling and testing in 1996 and 1999, and development and revision of numerical models of groundwater flow and radionuclide transport. Agreement on a contaminant boundary for the site and a corrective action plan was reached in 2006. Later that same year, three wells were installed for the purposes of model validation and site monitoring. The FFACO prescribes a five-year proof-of-concept period for demonstrating that the site groundwater model is capable of producing meaningful results with an acceptable level of uncertainty. The corrective action plan specifies a rigorous seven step validation process. The accepted groundwater model is evaluated using that process in light of the newly acquired data. The conceptual model of ground water flow for the Project Shoal Area considers groundwater flow through the fractured granite aquifer comprising the Sand Springs Range. Water enters the system by the infiltration of precipitation directly on the surface of the mountain range. Groundwater leaves the granite aquifer by flowing into alluvial deposits in the adjacent basins of Fourmile Flat and Fairview Valley. A groundwater divide is interpreted as coinciding with the western portion of the Sand Springs Range, west of the underground nuclear test, preventing flow from the test into Fourmile Flat. A very low conductivity shear zone east of the nuclear test roughly parallels the divide. The presence of these lateral boundaries, coupled with a regional discharge area to the northeast, is interpreted in the model as causing groundwater from the site to flow in a northeastward direction into Fairview Valley. 
Steady-state flow conditions are assumed given the absence of groundwater withdrawal activities in the area. The conceptual and numerical models were developed based upon regional hydrogeologic investigations conducted in the 1960s, site characterization investigations (including ten wells and various geophysical and geologic studies) at Shoal itself prior to and immediately after the test, and two site characterization campaigns in the 1990s for environmental restoration purposes (including eight wells and a year-long tracer test). The new wells are denoted MV-1, MV-2, and MV-3, and are located to the north-northeast of the nuclear test. The groundwater model was generally lacking data in the north-northeastern area; only HC-1 and the abandoned PM-2 wells existed in this area. The wells provide data on fracture orientation and frequency, water levels, hydraulic conductivity, and water chemistry for comparison with the groundwater model. A total of 12 real-number validation targets were available for the validation analysis, including five values of hydraulic head, three hydraulic conductivity measurements, three hydraulic gradient values, and one angle value for the lateral gradient in radians. In addition, the fracture dip and orientation data provide comparisons to the distributions used in the model and radiochemistry is available for comparison to model output. Goodness-of-fit analysis indicates that some of the model realizations correspond well with the newly acquired conductivity, head, and gradient data, while others do not. Other tests indicated that additional model realizations may be needed to test if the model input distributions need refinement to improve model performance. 
This approach (generating additional realizations) was not followed because it was realized that there was a temporal component to the data disconnect: the new head measurements are on the high side of the model distributions, but the heads at the original calibration locations themselves have also increased over time. This indicates that the steady-state assumption of the groundwater model is in error. To test the robustness of the model despite the transient nature of the heads, the newly acquired MV hydraulic head values were trended back to their likely values in 1999, the date of the calibration measurements. Additional statistical tests are performed using both the backward-projected MV heads and the observed heads to identify acceptable model realizations. A jackknife approach identified two possible threshold values to consider. For the analysis using the backward-trended heads, either 458 or 818 realizations (out of 1,000) are found acceptable, depending on the threshold chosen. The analysis using the observed heads found either 284 or 709 realizations acceptable. The impact of the refined set of realizations on the contaminant boundary was explored using an assumed starting mass of a single radionuclide and the acceptable realizations from the backward-trended analysis.

  7. The Galactic Evolution of Beryllium and Boron Revisited

    NASA Astrophysics Data System (ADS)

    King, Jeremy R.

    2001-12-01

    The largest, highest-quality, and most near-homogeneously treated extant available samples of Be, B, Fe, and O abundances are analyzed on four different stellar parameter scales, considering different O abundance indicators and deriving uncertainties in their relation with the required aid of jackknife and bootstrap simulations/resampling. Despite large slope and zero-point differences, the various Fe-poor ([Fe/H] ≲ -1) BeB-FeO relations are found to be independent of parameter scale within the present, sometimes substantial, uncertainties. Variations in the BeB-O relations (as large as 1.12 dex/dex and 1.24 dex in slope and zero point) from differing O indicators do significantly differ; surprisingly, the largest differences are within the same parameter scale and not across different ones. The well-defined mean Be-Fe relation is Be ~ Fe^(1.16±0.04); the B-Fe relation is virtually identical, B ~ Fe^(1.17±0.08). The BeB-mean O relations show smaller dispersion than BeB-OH or BeB-O I relations alone, because of the significant reduction in parameter uncertainties, and are in remarkable agreement, indicating Be ~ (mean O)^(1.51±0.05) and B ~ (mean O)^(1.61±0.12). The latter is in good agreement with the slope (B ~ O^(1.39±0.08)) derived for metal-rich dwarfs by Smith et al. utilizing enhanced Mg I b-f opacity and presumed reliable λ6300 [O I] and λ6158 O I features. The BeB-FeO slopes are also all in excellent agreement with the reanalysis of Garcia Lopez, who utilizes a Hipparcos-based gravity scale. The equivalence of the Be- and B-FeO slopes limits prodigious ν-process ¹¹B production at low metallicity and suggests little Galactic evolution of the B/Be ratio. The BeB-mean O slopes differ significantly from pure ``primary'' and ``secondary'' values, requiring a combination of production mechanisms. 
The differing behavior of [O/Fe] and [Be/Fe] with [Fe/H] seems to rule out production by accelerated CO-rich grain debris in ejecta of Type II supernovae having progenitor masses M ≳ 8 M☉. Instead, the data are in fine accord with near-primary/intermediate BeB-FeO slopes produced by various two-component models, including standard GCR and superbubble production. Such models with a low-energy cosmic-ray source from supernovae restricted to very large progenitor mass may be consistent with the large Be abundance in the ultra-metal-poor dwarf G64-12 found by Primas et al.; however, they predict unobserved maxima in B/Be evolution near [Fe/H] ~ -2, produce too much total Li at intermediate metallicity, and have been suggested to be energetically untenable. Superbubble models considering a range of supernova progenitor mass and a constant cosmic-ray source composition predict the inferred modest or flat slopes in B/Be evolution. These models face possible difficulties in reproducing any nonprimordial Be plateaus at very low [Fe/H], and not underproducing ⁶Li for [Fe/H] ≲ -2; additional data are required to provide firmer observational constraints. The BeB/FeO ratios do not show consistent evidence for two metal-poor populations expected from bimodal (isolated supernovae and collective supernovae in superbubbles) production mechanisms, though these signatures may be lost in the scatter or have drastically different contributing fractions. Finally, comparison of the metal-poor BeB-Fe and BeB-mean O slopes suggests that [O/Fe] ~ -0.25 [Fe/H]: not constant, but not as steep as suggested in some recent analyses, and in agreement with the shallow [O/Fe] increase with declining [Fe/H] suggested by King.

  8. The Muenster Red Sky Survey: Large-scale structures in the universe

    NASA Astrophysics Data System (ADS)

    Ungruhe, R.; Seitter, W. C.; Duerbeck, H. W.

    2003-01-01

    We present a large-scale galaxy catalogue for the red spectral region which covers an area of 5,000 square degrees. It contains positions, red magnitudes, radii, ellipticities and position angles of about 5.5 million galaxies. Together with the APM catalogue (4,300 square degrees) in the blue spectral region, this catalogue forms at present the largest coherent database for cosmological investigations in the southern hemisphere. 217 ESO Southern Sky Atlas R Schmidt plates with galactic latitudes b < -45 degrees were digitized with the two PDS microdensitometers of the Astronomisches Institut Münster, with a step width of 15 microns, corresponding to 1.01 arcseconds per pixel. All data were stored on different storage media and are available for further investigations. Suitable search parameters must be chosen in such a way that all objects are found on the plates, and that the percentage of artificial objects remains as low as possible. Based on two reference areas on different plates, a search threshold of 140 PDS density units and a minimum number of four pixels per object were chosen. The detected objects were stored, according to size, in frames of different side length. Each object was investigated in its frame, and 18 object parameters were determined. The classification of objects into stars, galaxies and perturbed objects was done with an automatic procedure which makes use of combinations of computed object parameters. In the first step, the perturbed objects are removed from the catalogue. Double objects and noise objects can be excluded on the basis of symmetry properties, while for satellite trails, a new classification criterion, based on apparent magnitude, effective radius and apparent ellipticity, was developed. For the remaining objects, a star/galaxy separation was carried out.
For bright objects, the relation between apparent magnitude and effective radius serves as the discriminating property; for fainter objects, the relation between effective radius and central intensity was used. In addition, a few regions of enhanced object density like dwarf galaxies and star clusters were removed from the catalogue. Because error estimates of the automatic classification procedure are very uncertain, an extensive visual check of the automatic classification was carried out. A large number of objects previously classified automatically (1.3 million galaxies, 815,000 stars and 647,000 perturbed objects) was re-classified by eye. We found that galaxies suffer most from misclassification. Down to magnitude 13, the error is, independent of galactic latitude, at least 60%. Between 13th and 17th magnitude, the percentage of misclassified galaxies for b < -45 degrees drops continuously to between 15% and 30%, and is clearly dependent on galactic latitude. The classification of galaxies at low galactic latitudes is most strongly affected; in these regions only half of the galaxies are correctly classified. Errors found in this work are thus a factor of 2-3 above values quoted in the literature. Stars show classification errors of at most 10%, whose level increases towards fainter magnitudes. The classification accuracy is less dependent on galactic latitude than in the case of galaxies. As concerns artifacts, noticeable classification errors occur only for objects brighter than magnitude 15, which is mainly caused by saturation effects of the photographic emulsion. At magnitudes fainter than 15th, the error is below 5%. No dependence on galactic latitude is seen. These investigations show that the automatic classification yields satisfactory results only in certain magnitude intervals, which depend on galactic latitude. The object magnitudes are influenced by the desensitization of the emulsion during exposure and by the vignetting of the telescope.
Objects at the plate margins appear systematically too faint. The magnitudes were corrected by means of measured number densities of galaxies and stars, which were determined in 63 fields around the galactic south pole. The difference of the magnitude zero-point between the center and the margin of a plate amounts to approximately 0.05 mag after correction of the margin desensitization. Because of their high central intensity, stars reach the saturation limit of the emulsion already at magnitude 17. Thus bright stars appear systematically too faint. The saturation effect can be corrected by means of a point-spread function, which is calculated from the unsaturated parts of the stellar intensity profiles. The magnitude corrections for the saturation are carried out for each plate separately. In order to establish a unique magnitude zero-point for the 217 single plates, a mutual adjustment of neighbouring fields by means of their overlap regions was done. The procedure was carried out separately for stars and galaxies. In total, 1,005 overlap regions for galaxies and 1,103 regions for stars were available. The zero-point error after adjustment amounts to 0.06 mag for galaxies and 0.07 mag for stars. The external calibration of the photographic rF magnitudes was carried out by means of CCD sequences obtained with three telescopes in Chile and South Africa. In total, photometric V, R data for 1,037 galaxies and 1,058 stars in 92 fields are available. The transformation between photographic and CCD magnitudes requires a relation between F and VR. It was carried out separately for stars and galaxies, because they show different colour transformations. After the transformation of the photographic rF magnitudes to the standard Johnson R system, the errors of the local magnitude zero-point amount to 0.11 mag for galaxies and 0.15 mag for stars. Because of the large areal extent of the catalogue, the galaxy magnitudes must be corrected for interstellar extinction.
Magnitude corrections are based on hydrogen column densities of interstellar dust. Extinction corrections amount to up to 0.1 mag for 55% of galaxies, and 0.2 mag for another 35%. For the remaining 10%, the corrections are above 0.2 mag. The iteration procedure for the indirect adjustment of single plates may cause a magnitude gradient in north-south or east-west direction. Investigations of the magnitude differences between photographic and CCD magnitudes versus right ascension and declination show no significant gradients. In order to generate a complete catalogue of galaxies and stars, all double or multiple objects that occur in overlap regions have to be excluded. After the merging of all single plates (including half of the overlap regions), the catalogues contain 5.5 million galaxies and 20.2 million stars. The completeness of the catalogues was investigated from the comparison of counts of stars and galaxies with simulations. The limit of completeness is at magnitude 18.3 for galaxies, and at 18.8 for stars. For galaxies, a clear deficit relative to the simulation is seen down to magnitude 16. The simulation could not be adjusted to the catalogue counts either by taking galaxy evolution into account or by changing the cosmological parameters. These results and those of others support the assumption that we are dealing with a real galaxy deficit. The determined slope of 0.66 of the galaxy counts is, within the limits of accuracy, in agreement with the measured values of other authors. No comparable star counts are available. The N-point angular correlation functions were determined from various sub-catalogues consisting of 9, 25, 63, 121 and 152 fields as well as for limiting magnitudes from 15.0 to 19.0. The computation of chance distributions was carried out for galaxy counts in cells with side lengths from 25'' to 28.4''.
Averaged correlation functions and their coefficients were determined by means of factorial moments. The delete-d jackknife procedure was applied for the error estimate, with 200 replications per sub-catalogue. The 2-point angular correlation function shows a linear trend in logarithmic plots for all sub-catalogues on scales from 0.02 to 2 degrees. Within this range, it can be parametrized by a power law omega_2 = A theta^(1-gamma). Depending on sub-catalogue, the gamma values scatter between 1.63 and 1.73. They show a good agreement with the EDSGC, APM and MRSP catalogues. The parametrization of the amplitude of the 2-point angular correlation versus apparent magnitude yields beta values between 0.267 and 0.322, which are in accordance with beta values from model calculations. The curve of the 2-point angular correlation function shows a significantly flatter decline on scales exceeding 2 degrees, which cannot be reproduced by the standard CDM model. The correlation functions of higher order intersect at a point theta_S, whose position depends on the limiting magnitude. For scales theta > theta_S, they decay very quickly. Since the higher orders only occur in galaxy clusters, the intersection can be taken as a measure for the cluster size. Correlations for the third to fifth order that still exist on large scales indicate that structures larger than galaxy clusters exist, the superclusters. The correlation function coefficients with N greater than or equal to 3 show characteristic plateaus for all limiting magnitudes, whose length depends on the order considered. Simulations have shown that the plateaus on small scales point towards a strong non-linear cluster formation. The meaning of plateaus on large scales is not yet known. Comparisons with the APM and MRSP catalogues show a good agreement in both the shape and the amplitudes of the coefficients.
Only the EDSGC shows significantly higher amplitudes on small scales, which are likely caused by wrongly classified and/or doubly counted galaxies. This paper is an edited and translated version of a Ph.D. thesis, submitted by R. Ungruhe to Muenster University in 1998. It is released to make the results from this work available to a larger scientific community. Since the data of the galaxy catalogue will also be made available through the NED/IPAC database, its users can make themselves familiar with the methods of analysis and of the construction of the catalogue. Another paper, "Angular and three-dimensional correlation functions determined from the Muenster Red Sky Survey" by P. Boschan (the second referee of the Ph.D. thesis), published in Monthly Notices of the Royal Astronomical Society, Vol. 334, pp. 297-309, is to a very large part based on the contents of the thesis. It should be noted that it was written without the knowledge and without the permission of the author of the Ph.D. thesis.
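
The abstract states that a delete-d jackknife with 200 replications per sub-catalogue was used for the error estimate, and that the 2-point function follows a power law omega_2 = A theta^(1-gamma). How the resampling units were defined is not given there, so the sketch below simply assumes that each field supplies its own binned estimate of omega_2 and that d fields are dropped per replication. The function names (fit_gamma, delete_d_jackknife_error), the synthetic amplitudes, and the 10% scatter are hypothetical and serve only to illustrate the technique.

```python
import numpy as np


def fit_gamma(theta, w):
    """Fit w = A * theta**(1 - gamma) by least squares in log-log space."""
    slope, log_a = np.polyfit(np.log10(theta), np.log10(w), 1)
    return 1.0 - slope, 10.0 ** log_a


def delete_d_jackknife_error(w_fields, theta, d, n_rep=200, seed=0):
    """Delete-d jackknife standard error of gamma.

    w_fields has shape (n_fields, n_bins). Each replication drops d randomly
    chosen fields, averages the remaining ones, and refits gamma. Random
    subsets stand in for the full set of C(n, d) possible deletions.
    """
    rng = np.random.default_rng(seed)
    n = w_fields.shape[0]
    reps = np.empty(n_rep)
    for r in range(n_rep):
        keep = rng.permutation(n)[: n - d]
        reps[r], _amp = fit_gamma(theta, w_fields[keep].mean(axis=0))
    # Delete-d variance: scale factor (n - d) / d, averaged over replications.
    var = (n - d) / (d * n_rep) * np.sum((reps - reps.mean()) ** 2)
    return np.sqrt(var)


# Synthetic illustration only: 25 fields, 8 angular bins from 0.02 to 2 degrees,
# an assumed true gamma of 1.7 and 10% multiplicative scatter per field.
rng = np.random.default_rng(1)
theta = np.logspace(np.log10(0.02), np.log10(2.0), 8)
true_w = 0.02 * theta ** (1.0 - 1.7)
w_fields = true_w * (1.0 + rng.normal(0.0, 0.1, size=(25, theta.size)))

gamma_hat, amp_hat = fit_gamma(theta, w_fields.mean(axis=0))
gamma_err = delete_d_jackknife_error(w_fields, theta, d=5)
print(f"gamma = {gamma_hat:.3f} +/- {gamma_err:.3f}")
```

Dropping several fields at once, rather than one, is generally chosen when the statistic is not smooth enough for the delete-one jackknife or when the number of units is large; the factor (n - d) / d compensates for the smaller spread of the replicates.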