Sample records for filtering snps imputed

  1. Genotype Imputation of Metabochip SNPs Using a Study-Specific Reference Panel of ~4,000 Haplotypes in African Americans From the Women’s Health Initiative

    PubMed Central

    Liu, Eric Yi; Buyske, Steven; Aragaki, Aaron K.; Peters, Ulrike; Boerwinkle, Eric; Carlson, Chris; Carty, Cara; Crawford, Dana C.; Haessler, Jeff; Hindorff, Lucia A.; Marchand, Loic Le; Manolio, Teri A.; Matise, Tara; Wang, Wei; Kooperberg, Charles; North, Kari E.; Li, Yun

    2012-01-01

    Genetic imputation has become standard practice in modern genetic studies. However, several important issues have not been adequately addressed including the utility of study-specific reference, performance in admixed populations, and quality for less common (minor allele frequency [MAF] 0.005–0.05) and rare (MAF < 0.005) variants. These issues only recently became addressable with genome-wide association studies (GWAS) follow-up studies using dense genotyping or sequencing in large samples of non-European individuals. In this work, we constructed a study-specific reference panel of 3,924 haplotypes using African Americans in the Women’s Health Initiative (WHI) genotyped on both the Metabochip and the Affymetrix 6.0 GWAS platform. We used this reference panel to impute into 6,459 WHI SNP Health Association Resource (SHARe) study subjects with only GWAS genotypes. Our analysis confirmed the imputation quality metric Rsq (estimated r2, specific to each SNP) as an effective post-imputation filter. We recommend different Rsq thresholds for different MAF categories such that the average (across SNPs) Rsq is above the desired dosage r2 (squared Pearson correlation between imputed and experimental genotypes). With a desired dosage r2 of 80%, 99.9% (97.5%, 83.6%, 52.0%, 20.5%) of SNPs with MAF > 0.05 (0.03–0.05, 0.01–0.03, 0.005–0.01, and 0.001–0.005) passed the post-imputation filter. The average dosage r2 for these SNPs is 94.7%, 92.1%, 89.0%, 83.1%, and 79.7%, respectively. These results suggest that, for African Americans, imputation of Metabochip SNPs from GWAS data, including low frequency SNPs with MAF 0.005–0.05, is feasible and worthwhile for power increase in downstream association analysis provided a sizable reference panel is available. PMID:22851474
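
    The MAF-specific post-imputation filter described in this abstract could be sketched as follows. This is a minimal illustration, not the authors' code: the `ImputedSnp` record, the `passes_filter` helper and every numeric Rsq cutoff below are hypothetical placeholders, since the abstract does not list the thresholds it recommends. Only the MAF bin boundaries are taken from the abstract.

```python
from typing import NamedTuple


class ImputedSnp(NamedTuple):
    snp_id: str
    maf: float  # minor allele frequency
    rsq: float  # estimated imputation r2 (Rsq) for this SNP


# (MAF lower bound, minimum Rsq), ordered from the most common bin down.
# The bin edges follow the abstract; the Rsq cutoffs are made-up placeholders.
RSQ_THRESHOLDS = [
    (0.05, 0.3),
    (0.03, 0.4),
    (0.01, 0.5),
    (0.005, 0.6),
    (0.001, 0.7),
]


def passes_filter(snp, thresholds=RSQ_THRESHOLDS):
    """Apply the Rsq cutoff of the first MAF bin the SNP falls into."""
    for maf_floor, min_rsq in thresholds:
        if snp.maf > maf_floor:
            return snp.rsq >= min_rsq
    # Rarer than the lowest bin: dropped in this sketch.
    return False


snps = [
    ImputedSnp("rs1", 0.20, 0.35),
    ImputedSnp("rs2", 0.02, 0.45),
    ImputedSnp("rs3", 0.004, 0.90),
    ImputedSnp("rs4", 0.0005, 0.99),
]
kept = [s.snp_id for s in snps if passes_filter(s)]
```

    In practice the cutoffs would be tuned per bin until the average Rsq of the retained SNPs exceeds the desired dosage r2, which is the criterion the abstract states.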

  2. Quick, “Imputation-free” meta-analysis with proxy-SNPs

    PubMed Central

    2012-01-01

    Background Meta-analysis (MA) is widely used to pool genome-wide association studies (GWASes) in order to a) increase the power to detect strong or weak genotype effects or b) as a result verification method. As a consequence of differing SNP panels among genotyping chips, imputation is the method of choice within GWAS consortia to avoid losing too many SNPs in a MA. YAMAS (Yet Another Meta Analysis Software), however, enables cross-GWAS conclusions prior to finished and polished imputation runs, which eventually are time-consuming. Results Here we present a fast method to avoid forfeiting SNPs present in only a subset of studies, without relying on imputation. This is accomplished by using reference linkage disequilibrium data from 1,000 Genomes/HapMap projects to find proxy-SNPs together with in-phase alleles for SNPs missing in at least one study. MA is conducted by combining association effect estimates of a SNP and those of its proxy-SNPs. Our algorithm is implemented in the MA software YAMAS. Association results from GWAS analysis applications can be used as input files for MA, tremendously speeding up MA compared to the conventional imputation approach. We show that our proxy algorithm is well-powered and yields valuable ad hoc results, possibly providing an incentive for follow-up studies. We propose our method as a quick screening step prior to imputation-based MA, as well as an additional main approach for studies without available reference data matching the ethnicities of study participants. As a proof of principle, we analyzed six dbGaP Type II Diabetes GWAS and found that the proxy algorithm clearly outperforms naïve MA on the p-value level: for 17 out of 23 we observe an improvement on the p-value level by a factor of more than two, and a maximum improvement by a factor of 2127. 
Conclusions YAMAS is an efficient and fast meta-analysis program which offers various methods, including conventional MA as well as inserting proxy-SNPs for missing markers to avoid unnecessary power loss. MA with YAMAS can be readily conducted as YAMAS provides a generic parser for heterogeneous tabulated file formats within the GWAS field and avoids cumbersome setups. In this way, it supplements the meta-analysis process. PMID:22971100
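
    The idea of substituting a proxy-SNP for a marker missing from one study can be sketched with a standard fixed-effect, inverse-variance meta-analysis. This is an assumption-laden illustration and not the YAMAS algorithm: the function names, the input layout and the simple sign flip for oppositely phased alleles are all hypothetical, and the sketch ignores the attenuation expected when the proxy is in imperfect LD with the target.

```python
import math


def select_estimate(study, target, proxies):
    """Return (beta, se) for the target SNP, falling back to a proxy SNP.

    study:   dict mapping SNP id -> (beta, se) from one GWAS
    proxies: list of (proxy_id, same_direction); same_direction is False when
             the proxy's effect allele is in phase with the target's other allele
    """
    if target in study:
        return study[target]
    for proxy_id, same_direction in proxies:
        if proxy_id in study:
            beta, se = study[proxy_id]
            return (beta if same_direction else -beta, se)
    return None  # neither the target nor any proxy was genotyped


def inverse_variance_meta(estimates):
    """Fixed-effect pooling: weight each beta by 1 / se^2."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    return beta, se, beta / se


def meta_analyse(studies, target, proxies):
    picked = (select_estimate(s, target, proxies) for s in studies)
    estimates = [e for e in picked if e is not None]
    if not estimates:
        raise ValueError("no study carries the target SNP or a proxy")
    return inverse_variance_meta(estimates)


# Study 2 lacks rs1 but has the (hypothetical) proxy rs9, coded on the other allele.
studies = [{"rs1": (0.2, 0.1)}, {"rs9": (-0.4, 0.1)}]
beta, se, z = meta_analyse(studies, "rs1", [("rs9", False)])
```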

  3. Analysis of Case-Control Association Studies: SNPs, Imputation and Haplotypes.

    PubMed

    Chatterjee, Nilanjan; Chen, Yi-Hau; Luo, Sheng; Carroll, Raymond J

    2009-11-01

    Although prospective logistic regression is the standard method of analysis for case-control data, it has been recently noted that in genetic epidemiologic studies one can use the "retrospective" likelihood to gain major power by incorporating various population genetics model assumptions such as Hardy-Weinberg-Equilibrium (HWE), gene-gene and gene-environment independence. In this article, we review these modern methods and contrast them with the more classical approaches through two types of applications (i) association tests for typed and untyped single nucleotide polymorphisms (SNPs) and (ii) estimation of haplotype effects and haplotype-environment interactions in the presence of haplotype-phase ambiguity. We provide novel insights to existing methods by construction of various score-tests and pseudo-likelihoods. In addition, we describe a novel two-stage method for analysis of untyped SNPs that can use any flexible external algorithm for genotype imputation followed by a powerful association test based on the retrospective likelihood. We illustrate applications of the methods using simulated and real data. PMID:20543902

  4. Imputation of class I and II HLA loci using high-density SNPs from ImmunoChip and their associations with Kawasaki disease in family-based study.

    PubMed

    Shrestha, S; Wiener, H W; Aissani, B; Shendre, A; Tang, J; Portman, M A

    2015-06-01

    Kawasaki disease (KD) is the leading cause of acquired heart disease in children in most developed countries including the United States. The etiology of KD is not known; however, epidemiological and immunological data suggest infectious or immune-related factors in the manifestation of the disease. Further, KD has several hereditary features that strongly suggest a genetic component to disease pathogenesis. Human leucocyte antigen (HLA) loci have also been reported to be associated with KD, but results have been inconsistent, in part, because of small study samples and varying linkage disequilibrium (LD) patterns observed across different ethnic groups. To maximize the informativeness of single nucleotide polymorphism (SNP) genotypes in the major histocompatibility (MHC) region, we imputed classical HLA I (A, B, C) and HLA II (DRB1, DQA1, DQB1) alleles using SNP2HLA method from genotypes of 6700 SNPs within the extended MHC region contained in the ImmunoChip among 112 White patients with KD and their biological parents from North America and tested their association with KD susceptibility using the transmission disequilibrium test. Mendelian consistency in the trios suggested high accuracy and reliability of the imputed alleles (class I = 97.5%, class II = 96.6%). While several SNPs in the MHC region were individually associated with KD susceptibility, we report over-transmission of HLA-C*15 (z = +2.19, P = 0.03) and under-transmission of HLA-B*44 (z = -2.49, P = 0.01) alleles from parents to patients with KD. HLA-B*44 has been associated with KD in other smaller studies, and both HLA-C*15 and HLA-B*44 have biological mechanisms that could potentially be involved in KD pathogenesis. Overall, inferring HLA loci within the same ethnic group, using family-based information is a powerful approach. 
However, studies with larger sample sizes are warranted to evaluate the correlations of the strength and directions between the SNPs in MHC region and the imputed HLA alleles with KD. PMID:25809546

  5. Imputation accuracy is robust to cattle reference genome updates.

    PubMed

    Milanesi, M; Vicario, D; Stella, A; Valentini, A; Ajmone-Marsan, P; Biffani, S; Biscarini, F; Jansen, G; Nicolazzi, E L

    2015-02-01

    Genotype imputation is routinely applied in a large number of cattle breeds. Imputation has become a need due to the large number of SNP arrays with variable density (currently, from 2900 to 777,962 SNPs). Although many authors have studied the effect of different statistical methods on imputation accuracy, the impact of a (likely) change in the reference genome assembly on imputation from lower to higher density has not been determined so far. In this work, 1021 Italian Simmental SNP genotypes were remapped on the three most recent reference genome assemblies. Four imputation methods were used to assess the impact of an update in the reference genome. As expected, the four methods behaved differently, with large differences in terms of accuracy. Updating SNP coordinates on the three tested cattle reference genome assemblies determined only a slight variation on imputation results within method. PMID:25515631

  6. A comprehensive SNP and indel imputability database

    PubMed Central

    Duan, Qing; Liu, Eric Yi; Croteau-Chonka, Damien C.; Mohlke, Karen L.; Li, Yun

    2013-01-01

    Motivation: Genotype imputation has become an indispensable step in genome-wide association studies (GWAS). Imputation accuracy, which directly influences downstream analysis, has been shown to improve with re-sequencing-based reference panels; however, this comes at the cost of high computational burden due to the huge number of potentially imputable markers (tens of millions) discovered through sequencing a large number of individuals. Therefore, there is an increasing need for access to imputation quality information without actually conducting imputation. To facilitate this process, we have established a publicly available SNP and indel imputability database, aiming to provide direct access to imputation accuracy information for markers identified by the 1000 Genomes Project across four major populations and covering multiple GWAS genotyping platforms. Results: SNP and indel imputability information can be retrieved through a user-friendly interface by providing the ID(s) of the desired variant(s) or by specifying the desired genomic region. The query results can be refined by selecting relevant GWAS genotyping platform(s). This is the first database providing variant imputability information specific to each continental group and to each genotyping platform. In Filipino individuals from the Cebu Longitudinal Health and Nutrition Survey, our database can achieve an area under the receiver-operating characteristic curve of 0.97, 0.91, 0.88 and 0.79 for markers with minor allele frequency >5%, 3–5%, 1–3% and 0.5–1%, respectively. Specifically, by filtering out 48.6% of markers (corresponding to a reduction of up to 48.6% in computational costs for actual imputation) based on the imputability information in our database, we can remove 77%, 58%, 51% and 42% of the poorly imputed markers at the cost of only 0.3%, 0.8%, 1.5% and 4.6% of the well-imputed markers with minor allele frequency >5%, 3–5%, 1–3% and 0.5–1%, respectively.
Availability: http://www.unc.edu/~yunmli/imputability.html Supplementary information: Supplementary data are available at Bioinformatics online. Contact: yunli@med.unc.edu PMID:23292738

  7. Accuracy of imputation to whole-genome sequence data in Holstein Friesian cattle

    PubMed Central

    2014-01-01

    Background The use of whole-genome sequence data can lead to higher accuracy in genome-wide association studies and genomic predictions. However, to benefit from whole-genome sequence data, a large dataset of sequenced individuals is needed. Imputation from SNP panels, such as the Illumina BovineSNP50 BeadChip and Illumina BovineHD BeadChip, to whole-genome sequence data is an attractive and less expensive approach to obtain whole-genome sequence genotypes for a large number of individuals than sequencing all individuals. Our objective was to investigate accuracy of imputation from lower density SNP panels to whole-genome sequence data in a typical dataset for cattle. Methods Whole-genome sequence data of chromosome 1 (1737 471 SNPs) for 114 Holstein Friesian bulls were used. Beagle software was used for imputation from the BovineSNP50 (3132 SNPs) and BovineHD (40 492 SNPs) beadchips. Accuracy was calculated as the correlation between observed and imputed genotypes and assessed by five-fold cross-validation. Three scenarios S40, S60 and S80 with respectively 40%, 60%, and 80% of the individuals as reference individuals were investigated. Results Mean accuracies of imputation per SNP from the BovineHD panel to sequence data and from the BovineSNP50 panel to sequence data for scenarios S40 and S80 ranged from 0.77 to 0.83 and from 0.37 to 0.46, respectively. Stepwise imputation from the BovineSNP50 to BovineHD panel and then to sequence data for scenario S40 improved accuracy per SNP to 0.65 but it varied considerably between SNPs. Conclusions Accuracy of imputation to whole-genome sequence data was generally high for imputation from the BovineHD beadchip, but was low from the BovineSNP50 beadchip. Stepwise imputation from the BovineSNP50 to the BovineHD beadchip and then to sequence data substantially improved accuracy of imputation. SNPs with a low minor allele frequency were more difficult to impute correctly and the reliability of imputation varied more. 
Linkage disequilibrium between an imputed SNP and the SNP on the lower density panel, minor allele frequency of the imputed SNP and size of the reference group affected imputation reliability. PMID:25022768
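
    The accuracy measure used in this abstract, the correlation between observed and imputed genotypes computed per SNP, can be illustrated with a short sketch. The helper names and the toy genotype vectors are hypothetical; genotypes are assumed to be coded as allele counts (0, 1, 2) or as imputed dosages on the same scale.

```python
import math


def pearson_r(observed, imputed):
    """Pearson correlation between observed and imputed genotypes of one SNP."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_i = sum(imputed) / n
    cov = sum((o - mean_o) * (i - mean_i) for o, i in zip(observed, imputed))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    if var_o == 0 or var_i == 0:
        # Monomorphic in this sample: the correlation is undefined.
        return float("nan")
    return cov / math.sqrt(var_o * var_i)


def mean_accuracy(per_snp_pairs):
    """Average per-SNP correlation, skipping SNPs where it is undefined."""
    values = [pearson_r(obs, imp) for obs, imp in per_snp_pairs]
    defined = [v for v in values if not math.isnan(v)]
    return sum(defined) / len(defined)


# Toy validation set: two SNPs, four animals each (made-up values).
pairs = [
    ([0, 1, 2, 1], [0.0, 1.0, 2.0, 1.0]),  # imputed perfectly
    ([0, 1, 2, 1], [1.0, 1.0, 1.0, 2.0]),  # imputed poorly
]
accuracy = mean_accuracy(pairs)
```

    In a cross-validation design like the one described, each fold would hold out a set of animals, impute them from the remaining reference animals, and feed the held-out genotypes into a calculation of this kind.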

  8. Genotype-Imputation Accuracy across Worldwide Human Populations

    E-print Network

    Rosenberg, Noah

    involves leveraging the information in a reference database of dense genotype data. By modeling accuracy. From a separate survey of additional SNPs typed in the same samples, we evaluated imputation but not for the GWA study (Figure 1D). By modeling the pattern of LD in the reference panel and then applying

  9. Genotype imputation via matrix completion

    PubMed Central

    Chi, Eric C.; Zhou, Hua; Chen, Gary K.; Del Vecchyo, Diego Ortega; Lange, Kenneth

    2013-01-01

    Most current genotype imputation methods are model-based and computationally intensive, taking days to impute one chromosome pair on 1000 people. We describe an efficient genotype imputation method based on matrix completion. Our matrix completion method is implemented in MATLAB and tested on real data from HapMap 3, simulated pedigree data, and simulated low-coverage sequencing data derived from the 1000 Genomes Project. Compared with leading imputation programs, the matrix completion algorithm embodied in our program MENDEL-IMPUTE achieves comparable imputation accuracy while reducing run times significantly. Implementation in a lower-level language such as Fortran or C is apt to further improve computational efficiency. PMID:23233546

  10. Performance of Genotype Imputation for Low Frequency and Rare Variants from the 1000 Genomes

    PubMed Central

    Zheng, Hou-Feng; Rong, Jing-Jing; Liu, Ming; Han, Fang; Zhang, Xing-Wei; Richards, J. Brent; Wang, Li

    2015-01-01

    Genotype imputation is now routinely applied in genome-wide association studies (GWAS) and meta-analyses. However, most imputations have been run using HapMap samples as the reference, and imputation of low frequency and rare variants (minor allele frequency (MAF) < 5%) has not been systematically assessed. With the emergence of next-generation sequencing, large reference panels (such as the 1000 Genomes panel) are available to facilitate imputation of these variants. Therefore, in order to estimate the performance of low frequency and rare variants imputation, we imputed 153 individuals, each of whom had 3 different genotype array data including 317k, 610k and 1 million SNPs, to three different reference panels: the 1000 Genomes pilot March 2010 release (1KGpilot), the 1000 Genomes interim August 2010 release (1KGinterim), and the 1000 Genomes phase1 November 2010 and May 2011 release (1KGphase1) by using IMPUTE version 2. The differences between these three releases of the 1000 Genomes data are the sample size, ancestry diversity, number of variants and their frequency spectrum. We found that both reference panel and GWAS chip density affect the imputation of low frequency and rare variants. 1KGphase1 outperformed the other 2 panels, with a higher concordance rate, a higher proportion of well-imputed variants (info>0.4) and a higher mean info score in each MAF bin. Similarly, the 1M chip array outperformed 610K and 317K. However, for very rare variants (MAF ≤ 0.3%), only 0–1% of the variants were well imputed. We conclude that the imputation of low frequency and rare variants improves with larger reference panels and higher density of genome-wide genotyping arrays. Yet, despite a large reference panel size and dense genotyping density, very rare variants remain difficult to impute. PMID:25621886

  11. Accuracy of high-density genotype imputation in Japanese Black cattle.

    PubMed

    Uemoto, Y; Sasaki, S; Sugimoto, Y; Watanabe, T

    2015-08-01

    Genotype imputation facilitates the identification of missing genotypes on a high-density array using low-density arrays and has great potential for reducing genotyping costs for cattle populations. However, the imputation quality varies across breeds, which have different effective population sizes. Therefore, the accuracy of genotype imputation must be evaluated in each breed. The Japanese Black cattle population has a unique genetic background, and this study aimed to investigate different factors affecting imputation quality in this population. A total of 1368 animals were genotyped using the Illumina BovineHD BeadChip, and the accuracy of imputation was evaluated using information from four lower density arrays. The extent of linkage disequilibrium for this population was relatively higher than that in other beef breeds but lower than that in dairy breeds. The accuracy of arrays with more than 20 000 single nucleotide polymorphisms (SNPs) was similar to or higher than that of lower density arrays. In addition, the minor allele frequency of SNPs in the reference population affected the accuracy. The accuracy increased as the size of the reference population increased, up to 400 animals, beyond which there was little increase. A higher genetic relationship between the reference and test populations increased imputation accuracy. These results indicate that high imputation accuracy can be achieved using high-density arrays, having enough reference animals and including relatives in the reference population. PMID:26156250

  12. Recursively Imputed Survival Trees

    PubMed Central

    Zhu, Ruoqing; Kosorok, Michael R.

    2011-01-01

    We propose recursively imputed survival tree (RIST) regression for right-censored data. This new nonparametric regression procedure uses a novel recursive imputation approach combined with extremely randomized trees that allows significantly better use of censored data than previous tree based methods, yielding improved model fit and reduced prediction error. The proposed method can also be viewed as a type of Monte Carlo EM algorithm which generates extra diversity in the tree-based fitting process. Simulation studies and data analyses demonstrate the superior performance of RIST compared to previous methods. PMID:23125470

  13. Assessing Accuracy of Genotype Imputation in American Indians

    PubMed Central

    Malhotra, Alka; Kobes, Sayuko; Bogardus, Clifton; Knowler, William C.; Baier, Leslie J.; Hanson, Robert L.

    2014-01-01

    Background Genotype imputation is commonly used in genetic association studies to test untyped variants using information on linkage disequilibrium (LD) with typed markers. Imputing genotypes requires a suitable reference population in which the LD pattern is known, most often one selected from HapMap. However, some populations, such as American Indians, are not represented in HapMap. In the present study, we assessed accuracy of imputation using HapMap reference populations in a genome-wide association study in Pima Indians. Results Data from six randomly selected chromosomes were used. Genotypes in the study population were masked (either 1% or 20% of SNPs available for a given chromosome). The masked genotypes were then imputed using the software Markov Chain Haplotyping Algorithm. Using four HapMap reference populations, average genotype error rates ranged from 7.86% for Mexican Americans to 22.30% for Yoruba. In contrast, use of the original Pima Indian data as a reference resulted in an average error rate of 1.73%. Conclusions Our results suggest that the use of HapMap reference populations results in substantial inaccuracy in the imputation of genotypes in American Indians. A possible solution would be to densely genotype or sequence a reference American Indian population. PMID:25014012
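
    The masking design in this abstract (hide a fraction of SNPs, impute them, then count disagreements with the true genotypes) can be sketched as below. This is a generic illustration with hypothetical helper names and toy data; the imputation step itself is left out, and nothing here reproduces the Markov Chain Haplotyping Algorithm software.

```python
import random


def choose_masked_snps(n_snps, fraction, seed=0):
    """Pick the SNP indices to hide, e.g. fraction=0.01 or 0.20."""
    rng = random.Random(seed)
    n_mask = round(n_snps * fraction)
    return sorted(rng.sample(range(n_snps), n_mask))


def genotype_error_rate(true_genotypes, imputed_genotypes, masked):
    """Share of masked genotypes whose imputed call differs from the truth.

    Both genotype arguments are lists of per-individual lists, indexed
    [individual][snp], with genotypes coded as allele counts 0, 1 or 2.
    """
    compared = 0
    wrong = 0
    for true_row, imputed_row in zip(true_genotypes, imputed_genotypes):
        for snp in masked:
            compared += 1
            if true_row[snp] != imputed_row[snp]:
                wrong += 1
    if compared == 0:
        raise ValueError("no masked genotypes to compare")
    return wrong / compared


# Toy data: 2 individuals x 10 SNPs, with 20% of SNPs masked.
truth = [
    [0, 1, 2, 1, 0, 2, 1, 1, 0, 2],
    [1, 1, 0, 2, 0, 1, 2, 0, 0, 1],
]
masked = choose_masked_snps(n_snps=10, fraction=0.20, seed=1)

# Stand-in for an imputation result: one wrong call at the first masked SNP.
imputed = [row[:] for row in truth]
imputed[0][masked[0]] = (truth[0][masked[0]] + 1) % 3
rate = genotype_error_rate(truth, imputed, masked)
```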

  14. SNP panels/Imputation

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Participants from thirteen countries discussed services that Interbull can perform or recommendations that Interbull can make to promote harmonization and assist member countries in improving their genomic evaluations in regard to SNP panels and imputation. The panel recommended: A mechanism to shar...

  15. Multiple imputation analysis of case-cohort studies

    E-print Network

    Paris-Sud XI, Université de

    Multiple imputation analysis of case-cohort studies Helena MARTI Biostatistics, CESP Centre de and multiple imputation 5 4 Validation of the method 8 4.1 Simulations case-cohort studies rely on sometimes not fully efficient weighted estimators. Multiple imputation might

  16. Estimating the proportion of variation in susceptibility to schizophrenia captured by common SNPs

    Microsoft Academic Search

    S Hong Lee; Teresa R DeCandia; Stephan Ripke; Jian Yang; Patrick F Sullivan; Michael E Goddard; Matthew C Keller; Peter M Visscher; Naomi R Wray

    2012-01-01

    Schizophrenia is a complex disorder caused by both genetic and environmental factors. Using 9,087 affected individuals, 12,171 controls and 915,354 imputed SNPs from the Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consortium (PGC-SCZ), we estimate that 23% (s.e. = 1%) of variation in liability to schizophrenia is captured by SNPs. We show that a substantial proportion of this variation must be

  17. Impact of Genotype Imputation on the Performance of GBLUP and Bayesian Methods for Genomic Prediction

    PubMed Central

    Chen, Liuhong; Li, Changxi; Sargolzaei, Mehdi; Schenkel, Flavio

    2014-01-01

    The aim of this study was to evaluate the impact of genotype imputation on the performance of the GBLUP and Bayesian methods for genomic prediction. A total of 10,309 Holstein bulls were genotyped on the BovineSNP50 BeadChip (50 k). Five low density single nucleotide polymorphism (SNP) panels, containing 6,177, 2,480, 1,536, 768 and 384 SNPs, were simulated from the 50 k panel. A fraction of 0%, 33% and 66% of the animals were randomly selected from the training sets to have low density genotypes which were then imputed into 50 k genotypes. A GBLUP and a Bayesian method were used to predict direct genomic values (DGV) for validation animals using imputed or their actual 50 k genotypes. Traits studied included milk yield, fat percentage, protein percentage and somatic cell score (SCS). Results showed that performance of both GBLUP and Bayesian methods was influenced by imputation errors. For traits affected by a few large QTL, the Bayesian method resulted in greater reductions of accuracy due to imputation errors than GBLUP. Including SNPs with largest effects in the low density panel substantially improved the accuracy of genomic prediction for the Bayesian method. Including genotypes imputed from the 6 k panel achieved almost the same accuracy of genomic prediction as that of using the 50 k panel even when 66% of the training population was genotyped on the 6 k panel. These results justified the application of the 6 k panel for genomic prediction. Imputations from lower density panels were more prone to errors and resulted in lower accuracy of genomic prediction. But for animals that have close relationship to the reference set, genotype imputation may still achieve a relatively high accuracy. PMID:25025158

  18. The utility of low-density genotyping for imputation in the Thoroughbred horse

    PubMed Central

    2014-01-01

    Background Despite the dramatic reduction in the cost of high-density genotyping that has occurred over the last decade, it remains one of the limiting factors for obtaining the large datasets required for genomic studies of disease in the horse. In this study, we investigated the potential for low-density genotyping and subsequent imputation to address this problem. Results Using the haplotype phasing and imputation program, BEAGLE, it is possible to impute genotypes from low- to high-density (50K) in the Thoroughbred horse with reasonable to high accuracy. Analysis of the sources of variation in imputation accuracy revealed dependence both on the minor allele frequency of the single nucleotide polymorphisms (SNPs) being imputed and on the underlying linkage disequilibrium structure. Whereas equidistant spacing of the SNPs on the low-density panel worked well, optimising SNP selection to increase their minor allele frequency was advantageous, even when the panel was subsequently used in a population of different geographical origin. Replacing base pair position with linkage disequilibrium map distance reduced the variation in imputation accuracy across SNPs. Whereas a 1K SNP panel was generally sufficient to ensure that more than 80% of genotypes were correctly imputed, other studies suggest that a 2K to 3K panel is more efficient to minimize the subsequent loss of accuracy in genomic prediction analyses. The relationship between accuracy and genotyping costs for the different low-density panels, suggests that a 2K SNP panel would represent good value for money. Conclusions Low-density genotyping with a 2K SNP panel followed by imputation provides a compromise between cost and accuracy that could promote more widespread genotyping, and hence the use of genomic information in horses. 
In addition to offering a low cost alternative to high-density genotyping, imputation provides a means to combine datasets from different genotyping platforms, which is becoming necessary since researchers are starting to use the recently developed equine 70K SNP chip. However, more work is needed to evaluate the impact of between-breed differences on imputation accuracy. PMID:24495673

  19. Imputation of TPMT defective alleles for the identification of patients with high-risk phenotypes

    PubMed Central

    Almoguera, Berta; Vazquez, Lyam; Connolly, John J.; Bradfield, Jonathan; Sleiman, Patrick; Keating, Brendan; Hakonarson, Hakon

    2014-01-01

    Background: The activity of thiopurine methyltransferase (TPMT) is subject to genetic variation. Loss-of-function alleles are associated with various degrees of myelosuppression after treatment with thiopurine drugs, thus genotype-based dosing recommendations currently exist. The aim of this study was to evaluate the potential utility of leveraging genomic data from large biorepositories in the identification of individuals with TPMT defective alleles. Material and methods: TPMT variants were imputed using the 1000 Genomes Project reference panel in 87,979 samples from the biobank at The Children's Hospital of Philadelphia. Population ancestry was determined by principal component analysis using HapMap3 samples as reference. Frequencies of the TPMT imputed alleles, genotypes and the associated phenotype were determined across the different populations. A sample of 630 subjects with genotype data from Sanger sequencing (N = 59) and direct genotyping (N = 583) (12 samples overlapping in the two groups) was used to check the concordance between the imputed and observed genotypes, as well as the sensitivity, specificity and positive and negative predictive values of the imputation. Results: Two SNPs (rs1800460 and rs1142345) that represent three TPMT alleles (*3A, *3B, and *3C) were imputed with adequate quality. Frequency for the associated enzyme activity varied across populations and 89.36–94.58% were predicted to have normal TPMT activity, 5.3–10.31% intermediate and 0.12–0.34% poor activities. Overall, 98.88% of individuals (623/630) were correctly imputed into carrying no risk alleles (553/553), heterozygous (45/46) and homozygous (25/31). Sensitivity, specificity and predictive values of imputation were over 90% in all cases except for the sensitivity of imputing homozygous subjects that was 80.64%. 
Conclusion: Imputation of TPMT alleles from existing genomic data can be used as a first step in the screening of individuals at risk of developing serious adverse events secondary to thiopurine drugs. PMID:24860591
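
    The concordance figures in this abstract can be checked from the counts it reports. The sketch below reads each "x/y" pair as correctly imputed over observed within a genotype class, which is an interpretation rather than something the abstract states outright; the function names are hypothetical.

```python
# (correctly imputed, observed) per genotype class, as given in the abstract.
COUNTS = {
    "no risk alleles": (553, 553),
    "heterozygous": (45, 46),
    "homozygous": (25, 31),
}


def per_class_rate(counts):
    """Fraction of each observed class that imputation called correctly."""
    return {label: correct / observed
            for label, (correct, observed) in counts.items()}


def overall_concordance(counts):
    """Correct calls over all compared subjects."""
    correct = sum(c for c, _ in counts.values())
    observed = sum(n for _, n in counts.values())
    return correct / observed


rates = per_class_rate(COUNTS)
overall = overall_concordance(COUNTS)
```

    Under that reading, 25/31 is about 0.8065 and 623/630 is about 0.9889, which agrees with the reported 80.64% and 98.88% up to rounding.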

  20. Imputing amino acid polymorphisms in human leukocyte antigens.

    PubMed

    Jia, Xiaoming; Han, Buhm; Onengut-Gumuscu, Suna; Chen, Wei-Min; Concannon, Patrick J; Rich, Stephen S; Raychaudhuri, Soumya; de Bakker, Paul I W

    2013-01-01

    DNA sequence variation within human leukocyte antigen (HLA) genes mediates susceptibility to a wide range of human diseases. The complex genetic structure of the major histocompatibility complex (MHC) makes it difficult, however, to collect genotyping data in large cohorts. Long-range linkage disequilibrium between HLA loci and SNP markers across the MHC region offers an alternative approach through imputation to interrogate HLA variation in existing GWAS data sets. Here we describe a computational strategy, SNP2HLA, to impute classical alleles and amino acid polymorphisms at class I (HLA-A, -B, -C) and class II (-DPA1, -DPB1, -DQA1, -DQB1, and -DRB1) loci. To characterize performance of SNP2HLA, we constructed two European ancestry reference panels, one based on data collected in HapMap-CEPH pedigrees (90 individuals) and another based on data collected by the Type 1 Diabetes Genetics Consortium (T1DGC, 5,225 individuals). We imputed HLA alleles in an independent data set from the British 1958 Birth Cohort (N = 918) with gold standard four-digit HLA types and SNPs genotyped using the Affymetrix GeneChip 500 K and Illumina Immunochip microarrays. We demonstrate that the sample size of the reference panel, rather than SNP density of the genotyping platform, is critical to achieve high imputation accuracy. Using the larger T1DGC reference panel, the average accuracy at four-digit resolution is 94.7% using the low-density Affymetrix GeneChip 500 K, and 96.7% using the high-density Illumina Immunochip. For amino acid polymorphisms within HLA genes, we achieve 98.6% and 99.3% accuracy using the Affymetrix GeneChip 500 K and Illumina Immunochip, respectively. Finally, we demonstrate how imputation and association testing at amino acid resolution can facilitate fine-mapping of primary MHC association signals, giving a specific example from type 1 diabetes. PMID:23762245

  1. Imputation-Based Population Genetics Analysis of Plasmodium falciparum Malaria Parasites

    PubMed Central

    Samad, Hanif; Coll, Francesc; Preston, Mark D.; Ocholla, Harold; Fairhurst, Rick M.; Clark, Taane G.

    2015-01-01

    Whole-genome sequencing technologies are being increasingly applied to Plasmodium falciparum clinical isolates to identify genetic determinants of malaria pathogenesis. However, genome-wide discovery methods, such as haplotype scans for signatures of natural selection, are hindered by missing genotypes in sequence data. Poor correlation between single nucleotide polymorphisms (SNPs) in the P. falciparum genome complicates efforts to apply established missing-genotype imputation methods that leverage off patterns of linkage disequilibrium (LD). The accuracy of state-of-the-art, LD-based imputation methods (IMPUTE, Beagle) was assessed by measuring allelic r2 for 459 P. falciparum samples from malaria patients in 4 countries: Thailand, Cambodia, Gambia, and Malawi. In restricting our analysis to 86k high-quality SNPs across the populations, we found that the complete-case analysis was restricted to 21k SNPs (24.5%), despite no single SNP having more than 10% missing genotypes. The accuracy of Beagle in filling in missing genotypes was consistently high across all populations (allelic r2, 0.87-0.96), but the performance of IMPUTE was mixed (allelic r2, 0.34-0.99) depending on reference haplotypes and population. Positive selection analysis using Beagle-imputed haplotypes identified loci involved in resistance to chloroquine (crt) in Thailand, Cambodia, and Gambia, sulfadoxine-pyrimethamine (dhfr, dhps) in Cambodia, and artemisinin (kelch13) in Cambodia. Tajima’s D-based analysis identified genes under balancing selection that encode well-characterized vaccine candidates: apical merozoite antigen 1 (ama1) and merozoite surface protein 1 (msp1). In contrast, the complete-case analysis failed to identify any well-validated drug resistance or candidate vaccine loci, except kelch13. In a setting of low LD and modest levels of missing genotypes, using Beagle to impute P. falciparum genotypes is a viable strategy for conducting accurate large-scale population genetics and association analyses, and supporting global surveillance for drug resistance markers and candidate vaccine antigens. PMID:25928499
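
    The allelic r2 reported in the abstract above is commonly computed as the squared Pearson correlation between true (masked) alleles and imputed alleles at a SNP. The sketch below illustrates that calculation under assumptions of our own: alleles are coded 0/1 (haploid calls), the function name `allelic_r2` is hypothetical, and the toy data are invented. It is not the authors' pipeline.

```python
from math import fsum


def allelic_r2(true_alleles, imputed_alleles):
    """Squared Pearson correlation between true and imputed allele codes.

    Assumption: alleles are numeric (0 = reference, 1 = alternate); imputed
    values may also be fractional dosages between 0 and 1.
    """
    n = len(true_alleles)
    if n != len(imputed_alleles) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_t = fsum(true_alleles) / n
    mean_i = fsum(imputed_alleles) / n
    cov = fsum((t - mean_t) * (i - mean_i)
               for t, i in zip(true_alleles, imputed_alleles))
    var_t = fsum((t - mean_t) ** 2 for t in true_alleles)
    var_i = fsum((i - mean_i) ** 2 for i in imputed_alleles)
    if var_t == 0 or var_i == 0:
        return float("nan")  # r2 is undefined for a monomorphic site
    return (cov * cov) / (var_t * var_i)


# Invented toy data: 8 masked genotypes, one of which is imputed incorrectly.
masked_truth = [0, 1, 1, 0, 1, 0, 0, 1]
imputed = [0, 1, 1, 0, 0, 0, 0, 1]
print(allelic_r2(masked_truth, imputed))  # approximately 0.6
```

    Returning NaN for monomorphic sites is a design choice of this sketch; a real evaluation would typically exclude such sites before averaging r2 across SNPs.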

  2. Within- and across-breed imputation of high-density genotypes in dairy and beef cattle from medium- and low-density genotypes.

    PubMed

    Berry, D P; McClure, M C; Mullen, M P

    2014-06-01

    The objective of this study was to evaluate, using three different genotype density panels, the accuracy of imputation from lower- to higher-density genotypes in dairy and beef cattle. High-density genotypes consisting of 777,962 single-nucleotide polymorphisms (SNP) were available on 3122 animals comprised of 269, 196, 710, 234, 719, 730 and 264 Angus, Belgian Blue, Charolais, Hereford, Holstein-Friesian, Limousin and Simmental bulls, respectively. Three different genotype densities were generated: low density (LD; 6501 autosomal SNPs), medium density (50K; 47,770 autosomal SNPs) and high density (HD; 735,151 autosomal SNPs). Imputation from lower- to higher-density genotype platforms was undertaken within and across breeds exploiting population-wide linkage disequilibrium. The mean allele concordance rate per breed from LD to HD when undertaken using a single breed or multiple breed reference population varied from 0.956 to 0.974 and from 0.947 to 0.967, respectively. The mean allele concordance rate per breed from 50K to HD when undertaken using a single breed or multiple breed reference population varied from 0.987 to 0.994 and from 0.987 to 0.993, respectively. The accuracy of imputation was generally greater when the reference population was solely comprised of the breed to be imputed compared to when the reference population comprised of multiple breeds, although the impact was less when imputing from 50K to HD compared to imputing from LD. PMID:24906026
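
    The abstract above reports accuracy as a mean allele concordance rate but does not spell out the formula. One common definition, assumed here, codes each genotype as a count of one allele (0, 1 or 2) and scores the fraction of the two alleles per genotype that were imputed correctly. The function name and the toy data are hypothetical.

```python
def allele_concordance(true_genotypes, imputed_genotypes):
    """Fraction of alleles imputed correctly.

    Assumption: genotypes are allele counts (0, 1 or 2); None marks a
    genotype that is missing in either set and is skipped.
    """
    matched = 0
    total = 0
    for true_g, imputed_g in zip(true_genotypes, imputed_genotypes):
        if true_g is None or imputed_g is None:
            continue
        total += 2                               # two alleles per genotype
        matched += 2 - abs(true_g - imputed_g)   # 2, 1 or 0 alleles agree
    if total == 0:
        raise ValueError("no comparable genotypes")
    return matched / total


# Invented toy data: two of five genotypes are off by one allele.
true_hd = [0, 1, 2, 2, 1]
imputed_hd = [0, 1, 1, 2, 0]
print(allele_concordance(true_hd, imputed_hd))  # 8 of 10 alleles agree
```

    Under this definition a heterozygote imputed as a homozygote still earns half credit, which is why allele concordance is usually higher than genotype concordance on the same data.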

  3. Selecting the Number of Imputed Datasets When Using Multiple Imputation for Missing Data and Disclosure Limitation

    E-print Network

    Reiter, Jerome P.

    Selecting the Number of Imputed Datasets When Using Multiple Imputation for Missing Data and disclosure limitation simultaneously. First, fill in the missing data to generate m completed datasets, then replace confidential values in each completed dataset with r imputations. I investigate how to select m

  4. Combinations of SNPs Related to Signal Transduction in Bipolar Disorder

    PubMed Central

    Koefoed, Pernille; Andreassen, Ole A.; Bennike, Bente; Dam, Henrik; Djurovic, Srdjan; Hansen, Thomas; Jorgensen, Martin Balslev; Kessing, Lars Vedel; Melle, Ingrid; Møller, Gert Lykke; Mors, Ole; Werge, Thomas; Mellerup, Erling

    2011-01-01

    Any given single nucleotide polymorphism (SNP) in a genome may have little or no functional impact. A biologically significant effect may possibly emerge only when a number of key SNP-related genotypes occur together in a single organism. Thus, in analysis of many SNPs in association studies of complex diseases, it may be useful to look at combinations of genotypes. Genes related to signal transmission, e.g., ion channel genes, may be of interest in this respect in the context of bipolar disorder. In the present study, we analysed 803 SNPs in 55 genes related to aspects of signal transmission and calculated all combinations of three genotypes from the 3×803 SNP genotypes for 1355 controls and 607 patients with bipolar disorder. Four clusters of patient-specific combinations were identified. Permutation tests indicated that some of these combinations might be related to bipolar disorder. The WTCCC bipolar dataset was used for replication; 469 of the 803 SNPs were present in the WTCCC dataset either directly (n = 132) or by imputation (n = 337), covering 51 of our selected genes. We found three clusters of patient-specific 3×SNP combinations in the WTCCC dataset. Different SNPs were involved in the clusters in the two datasets. The present analyses of the combinations of SNP genotypes support a role for both genetic heterogeneity and interactions in the genetic architecture of bipolar disorder. PMID:21897858
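
    To give a sense of the search space described above, the sketch below counts the possible three-genotype combinations for 803 SNPs with three genotypes each, and scans a toy data set for combinations carried by patients but by no control. Reading "patient-specific" as "present in at least one patient and absent from all controls" is our assumption, as are the function names and data; the study's actual software is not described in the abstract.

```python
from itertools import combinations, product
from math import comb


def n_genotype_triples(n_snps, genotypes_per_snp=3):
    """Number of ways to pick 3 SNPs and one genotype at each of them."""
    return comb(n_snps, 3) * genotypes_per_snp ** 3


def patient_specific_triples(patients, controls):
    """List (snp_indices, genotypes, patient_count) seen only in patients.

    Each person is a tuple of genotype codes (0, 1 or 2), one per SNP.
    Brute force: only practical for small toy inputs.
    """
    n_snps = len(patients[0])
    found = []
    for snps in combinations(range(n_snps), 3):
        for genos in product((0, 1, 2), repeat=3):
            def carries(person):
                return all(person[s] == g for s, g in zip(snps, genos))
            in_patients = sum(carries(p) for p in patients)
            in_controls = sum(carries(c) for c in controls)
            if in_patients > 0 and in_controls == 0:
                found.append((snps, genos, in_patients))
    return found


print(n_genotype_triples(803))  # size of the search space for 803 SNPs

# Invented toy data: 4 SNPs, one patient and one control differing at SNP 2.
toy_patients = [(0, 1, 2, 0)]
toy_controls = [(0, 1, 1, 0)]
for hit in patient_specific_triples(toy_patients, toy_controls):
    print(hit)
```

    The count for 803 SNPs runs into the billions, which is why an exhaustive scan of this kind needs a far more efficient implementation than the nested loops shown here.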

  5. Missing data imputation based on compressive sensing for robust speaker identification

    Microsoft Academic Search

    Xianyi Rui

    2010-01-01

    In this paper, the method of missing data imputation based on the emergent field of compressive sensing for the front end of a speaker identification system in noisy conditions is investigated. Firstly, noisy speech signals are transformed into Gammatone spectrum by using cochlear filtering; then, unreliable spectral components are reconstructed given an incomplete set of reliable ones; finally, speaker features

  6. 16 CFR 1115.11 - Imputed knowledge.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ...2010-01-01 2010-01-01 false Imputed knowledge. 1115.11 Section 1115.11 ...Interpretation § 1115.11 Imputed knowledge. (a) In evaluating whether...other representations. This includes the knowledge a firm would have if it conducted a...

  7. What Improves with Increased Missing Data Imputations?

    ERIC Educational Resources Information Center

    Bodner, Todd E.

    2008-01-01

    When using multiple imputation in the analysis of incomplete data, a prominent guideline suggests that more than 10 imputed data values are seldom needed. This article calls into question the optimism of this guideline and illustrates that important quantities (e.g., p values, confidence interval half-widths, and estimated fractions of missing…

  8. 16 CFR 1115.11 - Imputed knowledge.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ...2013-01-01 2013-01-01 false Imputed knowledge. 1115.11 Section 1115.11 ...Interpretation § 1115.11 Imputed knowledge. (a) In evaluating whether...other representations. This includes the knowledge a firm would have if it conducted a...

  9. 16 CFR 1115.11 - Imputed knowledge.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ...2014-01-01 2014-01-01 false Imputed knowledge. 1115.11 Section 1115.11 ...Interpretation § 1115.11 Imputed knowledge. (a) In evaluating whether...other representations. This includes the knowledge a firm would have if it conducted a...

  10. 16 CFR 1115.11 - Imputed knowledge.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ...2012-01-01 2012-01-01 false Imputed knowledge. 1115.11 Section 1115.11 ...Interpretation § 1115.11 Imputed knowledge. (a) In evaluating whether...other representations. This includes the knowledge a firm would have if it conducted a...

  11. 16 CFR 1115.11 - Imputed knowledge.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ...2011-01-01 2011-01-01 false Imputed knowledge. 1115.11 Section 1115.11 ...Interpretation § 1115.11 Imputed knowledge. (a) In evaluating whether...other representations. This includes the knowledge a firm would have if it conducted a...

  12. How imputation errors bias genomic predictions.

    PubMed

    Pimentel, E C G; Edel, C; Emmerling, R; Götz, K-U

    2015-06-01

    The objective of this study was to investigate in detail the biasing effects of imputation errors on genomic predictions. Direct genomic values (DGV) of 3,494 Brown Swiss selection candidates for 37 production and conformation traits were predicted using either their observed 50K genotypes or their 50K genotypes imputed from a mimicked 6K chip. Changes in DGV caused by imputation errors were shown to be systematic. The DGV of top animals were, on average, underestimated and that of bottom animals were, on average, overestimated when imputed genotypes were used instead of observed genotypes. This pattern might be explained by the fact that imputation algorithms will usually suggest the most frequent haplotype from the sample whenever a haplotype cannot be determined unambiguously. That was empirically shown to cause an advantage for the bottom animals and a disadvantage for the top animals. PMID:25841966

  13. Fast imputation using medium- or low-coverage sequence data

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Direct imputation from raw sequence reads can be more accurate than calling genotypes first and then imputing, especially if read depth is low or error rates high, but different imputation strategies are required than those used for data from genotyping chips. A fast algorithm to impute from lower t...

  14. CUTOFF: A spatio-temporal imputation method

    NASA Astrophysics Data System (ADS)

    Feng, Lingbing; Nowak, Gen; O'Neill, T. J.; Welsh, A. H.

    2014-11-01

    Missing values occur frequently in many different statistical applications and need to be dealt with carefully, especially when the data are collected spatio-temporally. We propose a method called CUTOFF imputation that utilizes the spatio-temporal nature of the data to accurately and efficiently impute missing values. The main feature of this method is that the estimate of a missing value is produced by incorporating similar observed temporal information from the value's nearest spatial neighbors. Extensions to this method are also developed to expand the method's ability to accommodate other data generating processes. We develop a cross-validation procedure that optimally chooses parameters for CUTOFF, which can be used by other imputation methods as well. We analyze some rainfall data from 78 gauging stations in the Murray-Darling Basin in Australia using the CUTOFF imputation method and compare its performance to four well-studied competing imputation methods, namely, k-nearest neighbors, singular value decomposition, multiple imputation and random forest. Empirical results show that our method captures the temporal patterns well and is effective at imputing large gaps in the data. Compared to the competing methods, CUTOFF is more accurate and much faster. We analyze further examples to demonstrate CUTOFF's applications to two different data sets and provide extra evidence of its validity and usefulness. We implement a simulation study based on the Murray-Darling Basin data to evaluate the method; the results show that our method performs well in both accuracy and computational efficiency.

  15. PCA-Correlated SNPs for Structure Identification

    E-print Network

    Paschou, Peristera

    - correlated SNPs) to reproduce the structure found by PCA on the complete dataset, without use of ancestry information. Evaluating our method on a previously described dataset (10,805 SNPs, 11 populations), we of individuals. Analyzing a Puerto Rican dataset (192 individuals, 7,257 SNPs), we show that PCA-correlated SNPs

  16. Posterior predictive checking of multiple imputation models.

    PubMed

    Nguyen, Cattram D; Lee, Katherine J; Carlin, John B

    2015-07-01

    Multiple imputation is gaining popularity as a strategy for handling missing data, but there is a scarcity of tools for checking imputation models, a critical step in model fitting. Posterior predictive checking (PPC) has been recommended as an imputation diagnostic. PPC involves simulating "replicated" data from the posterior predictive distribution of the model under scrutiny. Model fit is assessed by examining whether the analysis from the observed data appears typical of results obtained from the replicates produced by the model. A proposed diagnostic measure is the posterior predictive "p-value", an extreme value of which (i.e., a value close to 0 or 1) suggests a misfit between the model and the data. The aim of this study was to evaluate the performance of the posterior predictive p-value as an imputation diagnostic. Using simulation methods, we deliberately misspecified imputation models to determine whether posterior predictive p-values were effective in identifying these problems. When estimating the regression parameter of interest, we found that more extreme p-values were associated with poorer imputation model performance, although the results highlighted that traditional thresholds for classical p-values do not apply in this context. A shortcoming of the PPC method was its reduced ability to detect misspecified models with increasing amounts of missing data. Despite the limitations of posterior predictive p-values, they appear to have a valuable place in the imputer's toolkit. In addition to automated checking using p-values, we recommend imputers perform graphical checks and examine other summaries of the test quantity distribution. PMID:25939490
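
    The posterior predictive p-value discussed above can be estimated as the share of posterior draws in which a test quantity computed on replicated data is at least as large as the same quantity computed on the completed data. The sketch below shows only that final counting step, assuming the paired test quantities have already been simulated elsewhere; the function name and the numbers are hypothetical.

```python
def posterior_predictive_p(completed_stats, replicated_stats):
    """Estimate a posterior predictive p-value from paired test quantities.

    completed_stats[k]  : test quantity on the k-th completed data set
    replicated_stats[k] : same quantity on the k-th replicated data set
    Values near 0 or 1 suggest a misfit between the model and the data.
    """
    if len(completed_stats) != len(replicated_stats) or not completed_stats:
        raise ValueError("need two non-empty sequences of equal length")
    exceed = sum(1 for c, r in zip(completed_stats, replicated_stats) if r >= c)
    return exceed / len(completed_stats)


# Invented values for four posterior draws.
t_completed = [1.0, 1.2, 0.9, 1.1]
t_replicated = [1.3, 1.0, 1.4, 1.5]
print(posterior_predictive_p(t_completed, t_replicated))  # 3 of 4 draws exceed
```

    As the abstract notes, conventional cut-offs for classical p-values do not carry over to this quantity, so the result is best read alongside graphical checks.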

  17. Multi-population classical HLA type imputation.

    PubMed

    Dilthey, Alexander; Leslie, Stephen; Moutsianas, Loukas; Shen, Judong; Cox, Charles; Nelson, Matthew R; McVean, Gil

    2013-01-01

    Statistical imputation of classical HLA alleles in case-control studies has become established as a valuable tool for identifying and fine-mapping signals of disease association in the MHC. Imputation into diverse populations has, however, remained challenging, mainly because of the additional haplotypic heterogeneity introduced by combining reference panels of different sources. We present an HLA type imputation model, HLA*IMP:02, designed to operate on a multi-population reference panel. HLA*IMP:02 is based on a graphical representation of haplotype structure. We present a probabilistic algorithm to build such models for the HLA region, accommodating genotyping error, haplotypic heterogeneity and the need for maximum accuracy at the HLA loci, generalizing the work of Browning and Browning (2007) and Ron et al. (1998). HLA*IMP:02 achieves an average 4-digit imputation accuracy on diverse European panels of 97% (call rate 97%). On non-European samples, 2-digit performance is over 90% for most loci and ethnicities where data are available. HLA*IMP:02 supports imputation of HLA-DPB1 and HLA-DRB3-5, is highly tolerant of missing data in the imputation panel and works on standard genotype data from popular genotyping chips. It is publicly available in source code and as a user-friendly web service framework. PMID:23459081

  18. Analysis of Case-Control Association Studies: SNPs, Imputation and Haplotypes

    Microsoft Academic Search

    Nilanjan Chatterjee; Yi-Hau Chen; Sheng Luo; Raymond J. Carroll

    2010-01-01

    Although prospective logistic regression is the standard method of analysis for case-control data, it has been recently noted that in genetic epidemiologic studies one can use the ``retrospective'' likelihood to gain major power by incorporating various population genetics model assumptions such as Hardy-Weinberg-Equilibrium (HWE), gene-gene and gene-environment independence. In this article we review these modern methods and contrast them with

  19. Imputation of microsatellite alleles from dense SNP genotypes for parentage verification across multiple Bos taurus and Bos indicus breeds

    PubMed Central

    McClure, Matthew C.; Sonstegard, Tad S.; Wiggans, George R.; Van Eenennaam, Alison L.; Weber, Kristina L.; Penedo, Cecilia T.; Berry, Donagh P.; Flynn, John; Garcia, Jose F.; Carmo, Adriana S.; Regitano, Luciana C. A.; Albuquerque, Milla; Silva, Marcos V. G. B.; Machado, Marco A.; Coffey, Mike; Moore, Kirsty; Boscher, Marie-Yvonne; Genestout, Lucie; Mazza, Raffaele; Taylor, Jeremy F.; Schnabel, Robert D.; Simpson, Barry; Marques, Elisa; McEwan, John C.; Cromie, Andrew; Coutinho, Luiz L.; Kuehn, Larry A.; Keele, John W.; Piper, Emily K.; Cook, Jim; Williams, Robert; Van Tassell, Curtis P.

    2013-01-01

    To assist cattle producers transition from microsatellite (MS) to single nucleotide polymorphism (SNP) genotyping for parental verification we previously devised an effective and inexpensive method to impute MS alleles from SNP haplotypes. While the reported method was verified with only a limited data set (N = 479) from Brown Swiss, Guernsey, Holstein, and Jersey cattle, some of the MS-SNP haplotype associations were concordant across these phylogenetically diverse breeds. This implied that some haplotypes predate modern breed formation and remain in strong linkage disequilibrium. To expand the utility of MS allele imputation across breeds, MS and SNP data from more than 8000 animals representing 39 breeds (Bos taurus and B. indicus) were used to predict 9410 SNP haplotypes, incorporating an average of 73 SNPs per haplotype, for which alleles from 12 MS markers could be accurately imputed. Approximately 25% of the MS-SNP haplotypes were present in multiple breeds (N = 2 to 36 breeds). These shared haplotypes allowed for MS imputation in breeds that were not represented in the reference population with only a small increase in Mendelian inheritance inconsistencies. Our reported reference haplotypes can be used for any cattle breed and the reported methods can be applied to any species to aid the transition from MS to SNP genetic markers. While ~91% of the animals with imputed alleles for 12 MS markers had ≤1 Mendelian inheritance conflicts with their parents' reported MS genotypes, this figure was 96% for our reference animals, indicating potential errors in the reported MS genotypes. The workflow we suggest autocorrects for genotyping errors and rare haplotypes, by MS genotyping animals whose imputed MS alleles fail parentage verification, and then incorporating those animals into the reference dataset. PMID:24065982

  20. Multiple Imputation of Multilevel Missing Data-Rigor versus Simplicity

    ERIC Educational Resources Information Center

    Drechsler, Jörg

    2015-01-01

    Multiple imputation is widely accepted as the method of choice to address item-nonresponse in surveys. However, research on imputation strategies for the hierarchical structures that are typically found in the data in educational contexts is still limited. While a multilevel imputation model should be preferred from a theoretical point of view if…

  1. Alternative Multiple Imputation Inference for Mean and Covariance Structure Modeling

    ERIC Educational Resources Information Center

    Lee, Taehun; Cai, Li

    2012-01-01

    Model-based multiple imputation has become an indispensable method in the educational and behavioral sciences. Mean and covariance structure models are often fitted to multiply imputed data sets. However, the presence of multiple random imputations complicates model fit testing, which is an important aspect of mean and covariance structure…

  2. Multiple Imputation Strategies for Multiple Group Structural Equation Models

    Microsoft Academic Search

    Craig K. Enders; Amanda C. Gottschall

    2011-01-01

    Although structural equation modeling software packages use maximum likelihood estimation by default, there are situations where one might prefer to use multiple imputation to handle missing data rather than maximum likelihood estimation (e.g., when incorporating auxiliary variables). The selection of variables is one of the nuances associated with implementing multiple imputation, because the imputer must take special care to preserve

  3. Multiple imputation using chained equations: Issues and guidance for practice.

    PubMed

    White, Ian R; Royston, Patrick; Wood, Angela M

    2011-02-20

    Multiple imputation by chained equations is a flexible and practical approach to handling missing data. We describe the principles of the method and show how to impute categorical and quantitative variables, including skewed variables. We give guidance on how to specify the imputation model and how many imputations are needed. We describe the practical analysis of multiply imputed data, including model building and model checking. We stress the limitations of the method and discuss the possible pitfalls. We illustrate the ideas using a data set in mental health, giving Stata code fragments. PMID:21225900
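
    Analysing multiply imputed data, as mentioned above, usually ends with pooling the per-imputation estimates. The article itself gives Stata fragments; the Python sketch below instead illustrates the standard pooling step known as Rubin's rules, with a hypothetical function name and invented inputs.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors.

    Returns (pooled_estimate, total_variance) where
    total_variance = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least 2 imputations and matching lengths")
    pooled = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations (m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Invented results from m = 3 imputed data sets.
coef, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(coef, total_var)
```

    The between-imputation term is what carries the extra uncertainty due to missing data; dropping it and averaging only the within-imputation variances would understate the standard error.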

  4. Marker imputation in barley association studies

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Association mapping requires higher marker density than linkage mapping, potentially leading to more missing marker data and to higher genotyping costs. In human genetics, methods exist to impute missing marker data and whole markers that were typed in a reference panel but not in the experimental d...

  5. Equipercentile Equating via Data-Imputation Techniques.

    ERIC Educational Resources Information Center

    Liou, Michelle; Cheng, Philip E.

    1995-01-01

    Different data imputation techniques that are useful for equipercentile equating are discussed, and empirical data are used to evaluate the accuracy of these techniques as compared with chained equipercentile equating. The kernel estimator, the EM algorithm, the EB model, and the iterative moment estimator are considered. (SLD)

  6. Association Analysis of BMD-associated SNPs with Knee Osteoarthritis

    PubMed Central

    Yerges-Armstrong, LM; Yau, MS; Liu, Y; Krishnan, S; Renner, JB; Eaton, CB; Kwoh, CK; Nevitt, MC; Duggan, DJ; Mitchell, BD; Jordan, JM; Hochberg, MC; Jackson, RD

    2014-01-01

    Osteoarthritis (OA) risk is widely recognized to be heritable but few loci have been identified. Observational studies have identified higher systemic bone mineral density (BMD) to be associated with an increased risk of radiographic knee osteoarthritis. With this in mind, we sought to evaluate whether well-established genetic loci for variance in BMD are associated with risk for radiographic OA in the Osteoarthritis Initiative (OAI) and the Johnston County Osteoarthritis (JoCo) Project. Cases had at least one knee with definite radiographic OA defined as the presence of definite osteophytes with or without joint space narrowing (KL grade ≥ 2) and controls were absent for definite radiographic OA in both knees (KL grade ≤ 1 bilaterally). There were 2014 and 658 Caucasian cases, respectively, in the OAI and JoCo Studies, and 953 and 823 controls. Single nucleotide polymorphisms (SNPs) were identified for association analysis from the literature. Genotyping was carried out on the Illumina 2.5M and 1M arrays in GeCKO and JoCo, respectively, and imputation was done. Association analyses were carried out separately in each cohort with adjustments for age, BMI, and sex and then parameter estimates were combined across the two cohorts by meta-analysis. We identified 4 SNPs significantly associated with prevalent radiographic knee OA. The strongest signal (p=0.0009, OR=1.22, 95% CI[1.08–1.37]) maps to 12q3 which contains a gene coding for SP7. Additional loci map to 7p14.1 (TXNDC3), 11q13.2 (LRP5) and 11p14.1 (LIN7C). For all four loci the allele associated with higher BMD was associated with higher odds of OA. A BMD risk allele score was not significantly associated with OA risk. This meta-analysis demonstrates that several GWAS-identified BMD SNPs are nominally associated with prevalent radiographic knee OA and further supports the hypothesis that BMD, or its determinants, may be a risk factor contributing to OA development. PMID:24339167

  7. Association analysis of BMD-associated SNPs with knee osteoarthritis.

    PubMed

    Yerges-Armstrong, Laura M; Yau, Michelle S; Liu, Youfang; Krishnan, Subha; Renner, Jordan B; Eaton, Charles B; Kwoh, C Kent; Nevitt, Michael C; Duggan, David J; Mitchell, Braxton D; Jordan, Joanne M; Hochberg, Marc C; Jackson, Rebecca D

    2014-06-01

    Osteoarthritis (OA) risk is widely recognized to be heritable but few loci have been identified. Observational studies have identified higher systemic bone mineral density (BMD) to be associated with an increased risk of radiographic knee osteoarthritis. With this in mind, we sought to evaluate whether well-established genetic loci for variance in BMD are associated with risk for radiographic OA in the Osteoarthritis Initiative (OAI) and the Johnston County Osteoarthritis (JoCo) Project. Cases had at least one knee with definite radiographic OA, defined as the presence of definite osteophytes with or without joint space narrowing (Kellgren-Lawrence [KL] grade ≥ 2) and controls were absent for definite radiographic OA in both knees (KL grade ≤ 1 bilaterally). There were 2014 and 658 Caucasian cases, respectively, in the OAI and JoCo Studies, and 953 and 823 controls. Single nucleotide polymorphisms (SNPs) were identified for association analysis from the literature. Genotyping was carried out on Illumina 2.5M and 1M arrays in Genetic Components of Knee OA (GeCKO) and JoCo, respectively, and imputation was done. Association analyses were carried out separately in each cohort with adjustments for age, body mass index (BMI), and sex, and then parameter estimates were combined across the two cohorts by meta-analysis. We identified four SNPs significantly associated with prevalent radiographic knee OA. The strongest signal (p = 0.0009; OR = 1.22; 95% CI, 1.08-1.37) maps to 12q3, which contains a gene coding for SP7. Additional loci map to 7p14.1 (TXNDC3), 11q13.2 (LRP5), and 11p14.1 (LIN7C). For all four loci the allele associated with higher BMD was associated with higher odds of OA. A BMD risk allele score was not significantly associated with OA risk. This meta-analysis demonstrates that several genomewide association studies (GWAS)-identified BMD SNPs are nominally associated with prevalent radiographic knee OA and further supports the hypothesis that BMD, or its determinants, may be a risk factor contributing to OA development. © 2014 American Society for Bone and Mineral Research. PMID:24339167
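
    The abstract above says that per-cohort parameter estimates were combined by meta-analysis but does not name the method. A fixed-effect, inverse-variance weighted combination is one common choice and is what the sketch below assumes; the function name and the two-cohort inputs are invented and are not values from the study.

```python
from math import exp, sqrt


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effect meta-analysis.

    betas           : per-cohort log odds ratios
    standard_errors : their standard errors
    Returns (pooled_beta, pooled_se).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need matching, non-empty inputs")
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    return pooled_beta, pooled_se


# Invented log odds ratios and standard errors for two cohorts.
beta, se = fixed_effect_meta([0.25, 0.15], [0.08, 0.12])
low, high = exp(beta - 1.96 * se), exp(beta + 1.96 * se)
print(f"pooled OR {exp(beta):.2f}, 95% CI {low:.2f} to {high:.2f}")
```

    The more precise cohort receives the larger weight, so the pooled estimate sits closer to its value than a simple average would.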

  8. A second generation human haplotype map of over 3.1 million SNPs

    PubMed Central

    2009-01-01

    We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25–35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r2 of between 0.9 and 0.96 depending on population. We demonstrate that the current generation of commercial genome-wide genotyping products captures common Phase II SNPs with an average maximum r2 of up to 0.8 in African and up to 0.95 in non-African populations, and that potential gains in power in association studies can be obtained through imputation. These data also reveal novel aspects of the structure of linkage disequilibrium. We show that 10–30% of pairs of individuals within a population share at least one region of extended genetic identity arising from recent ancestry and that up to 1% of all common variants are untaggable, primarily because they lie within recombination hotspots. We show that recombination rates vary systematically around genes and between genes of different function. Finally, we demonstrate increased differentiation at non-synonymous, compared to synonymous, SNPs, resulting from systematic differences in the strength or efficacy of natural selection between populations. PMID:17943122

  9. On combining reference data to improve imputation accuracy.

    PubMed

    Chen, Jun; Zhang, Ji-Gang; Li, Jian; Pei, Yu-Fang; Deng, Hong-Wen

    2013-01-01

    Genotype imputation is an important tool in human genetics studies, which uses reference sets with known genotypes and prior knowledge on linkage disequilibrium and recombination rates to infer un-typed alleles for human genetic variations at a low cost. The reference sets used by current imputation approaches are based on HapMap data, and/or based on recently available next-generation sequencing (NGS) data such as data generated by the 1000 Genomes Project. However, with different coverage and call rates for different NGS data sets, how to integrate NGS data sets of different accuracy as well as previously available reference data as references in imputation is not an easy task and has not been systematically investigated. In this study, we performed a comprehensive assessment of three strategies on using NGS data and previously available reference data in genotype imputation for both simulated data and empirical data, in order to obtain guidelines for optimal reference set construction. Briefly, we considered three strategies: strategy 1 uses one NGS data set as a reference; strategy 2 imputes samples by using multiple individual data sets of different accuracy as independent references and then combines the imputed samples with samples based on the high accuracy reference selected when overlapping occurs; and strategy 3 combines multiple available data sets as a single reference after imputing each other. We used three software packages (MACH, IMPUTE2 and BEAGLE) for assessing the performances of these three strategies. Our results show that strategy 2 and strategy 3 have higher imputation accuracy than strategy 1. Particularly, strategy 2 is the best strategy across all the conditions that we have investigated, producing the best accuracy of imputation for rare variants. Our study is helpful in guiding application of imputation methods in next generation association analyses. PMID:23383238

  10. A Comparison of Imputation Methods for Bayesian Factor Analysis Models

    ERIC Educational Resources Information Center

    Merkle, Edgar C.

    2011-01-01

    Imputation methods are popular for the handling of missing data in psychology. The methods generally consist of predicting missing data based on observed data, yielding a complete data set that is amenable to standard statistical analyses. In the context of Bayesian factor analysis, this article compares imputation under an unrestricted…

  11. HICCUP: Hierarchical Clustering Based Value Imputation using Heterogeneous Gene Expression

    E-print Network

    Lee, Dongwon

    samples), HICCUP improves the accuracy of value imputation. Experiments with a real prostate cancer for better value imputation; and (3) by exploiting relationship among the sample space (e.g., cancer vs. non-cancer function and gene network, drug discovery, and patient diagnosis [1], [4], [6]. In real experiments, often

  12. Microdata Imputations and Macrodata Implications: Evidence from the

    E-print Network

    Gerkmann, Ralf

    in the case of business surveys. So, literature leaves a gap on this issue. For this reason, we analyseMicrodata Imputations and Macrodata Implications: Evidence from the Ifo Business Survey Christian and impute the missing observations in the Ifo Business Survey, a large business survey in Germany

  13. How to Improve Postgenomic Knowledge Discovery Using Imputation

    PubMed Central

    2009-01-01

    While microarrays make it feasible to rapidly investigate many complex biological problems, their multistep fabrication has the proclivity for error at every stage. The standard tactic has been to either ignore or regard erroneous gene readings as missing values, though this assumption can exert a major influence upon postgenomic knowledge discovery methods like gene selection and gene regulatory network (GRN) reconstruction. This has been the catalyst for a raft of new flexible imputation algorithms including local least square impute and the recent heuristic collateral missing value imputation, which exploit the biological transactional behaviour of functionally correlated genes to afford accurate missing value estimation. This paper examines the influence of missing value imputation techniques upon postgenomic knowledge inference methods with results for various algorithms consistently corroborating that instead of ignoring missing values, recycling microarray data by flexible and robust imputation can provide substantial performance benefits for subsequent downstream procedures. PMID:19223972

  14. Meta-analysis and Imputation Identifies a 109 kb Risk Haplotype Spanning TNFAIP3 Associated with Lupus Nephritis and Hematologic Manifestations

    PubMed Central

    Bates, Jared S.; Lessard, Christopher J.; Leon, Joanlise M.; Nguyen, Thuan; Battiest, Laura J.; Rodgers, Justin; Kaufman, Kenneth M.; James, Judith A.; Gilkeson, Gary S.; Kelly, Jennifer A.; Humphrey, Mary Beth; Harley, John B.; Gray-McGuire, Courtney; Moser, Kathy L.; Gaffney, Patrick M.

    2009-01-01

    TNFAIP3 encodes the ubiquitin modifying enzyme, A20, a key regulator of inflammatory signaling pathways. We previously reported association between TNFAIP3 variants and systemic lupus erythematosus (SLE). In order to further localize the risk variant(s), we performed a meta-analysis using genetic data available from two Caucasian case/control datasets (1453 total cases, 3381 total controls) and 713 SLE trio families. The best result was found at rs5029939 (P = 1.67 × 10⁻¹⁴, OR = 2.09, 95% CI 1.68–2.60). We then imputed SNPs from the CEU Phase II HapMap using genotypes from 431 SLE cases and 2155 controls. Imputation identified eleven SNPs in addition to three observed SNPs, which together, defined a 109 kb SLE risk segment surrounding TNFAIP3. When evaluating whether the rs5029939 risk allele was associated with SLE clinical manifestations, we observed that heterozygous carriers of the TNFAIP3 risk allele at rs5029939 have a two-fold increased risk of developing renal or hematologic manifestations compared to homozygous non-risk subjects. In summary, our study strengthens the genetic evidence that variants in the region of TNFAIP3 influence risk for SLE, particularly in patients with renal and hematologic manifestations, and narrows the risk effect to a 109 kb DNA segment that spans the TNFAIP3 gene. PMID:19387456

  15. Rare variant genotype imputation with thousands of study-specific whole-genome sequences: implications for cost-effective study designs.

    PubMed

    Pistis, Giorgio; Porcu, Eleonora; Vrieze, Scott I; Sidore, Carlo; Steri, Maristella; Danjou, Fabrice; Busonero, Fabio; Mulas, Antonella; Zoledziewska, Magdalena; Maschio, Andrea; Brennan, Christine; Lai, Sandra; Miller, Michael B; Marcelli, Marco; Urru, Maria Francesca; Pitzalis, Maristella; Lyons, Robert H; Kang, Hyun M; Jones, Chris M; Angius, Andrea; Iacono, William G; Schlessinger, David; McGue, Matt; Cucca, Francesco; Abecasis, Gonçalo R; Sanna, Serena

    2015-07-01

    The utility of genotype imputation in genome-wide association studies is increasing as progressively larger reference panels are improved and expanded through whole-genome sequencing. Developing general guidelines for optimally cost-effective imputation, however, requires evaluation of performance issues that include the relative utility of study-specific compared with general/multipopulation reference panels; genotyping with various array scaffolds; effects of different ethnic backgrounds; and assessment of ranges of allele frequencies. Here we compared the effectiveness of study-specific reference panels to the commonly used 1000 Genomes Project (1000G) reference panels in the isolated Sardinian population and in cohorts of European ancestry including samples from Minnesota (USA). We also examined different combinations of genome-wide and custom arrays for baseline genotypes. In Sardinians, the study-specific reference panel provided better coverage and genotype imputation accuracy than the 1000G panels and other large European panels. In fact, even gene-centered custom arrays (interrogating ~200,000 variants) provided highly informative content across the entire genome. Gain in accuracy was also observed for Minnesotans using the study-specific reference panel, although the increase was smaller than in Sardinians, especially for rare variants. Notably, a combined panel including both study-specific and 1000G reference panels improved imputation accuracy only in the Minnesota sample, and only at rare sites. Finally, we found that when imputation is performed with a study-specific reference panel, cutoffs different from the standard thresholds of MACH-Rsq and IMPUTE-INFO metrics should be used to efficiently filter badly imputed rare variants. This study thus provides general guidelines for researchers planning large-scale genetic studies. PMID:25293720
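
    The idea of applying frequency-dependent quality cutoffs after imputation can be illustrated with a minimal sketch. The record layout (`Variant`), the function names, and above all the cutoff values are hypothetical placeholders chosen for illustration; the abstract does not report specific thresholds, and this is not the authors' pipeline.

```python
from typing import NamedTuple, Sequence


class Variant(NamedTuple):
    """Hypothetical per-variant summary taken from imputation output."""
    snp_id: str
    maf: float   # minor allele frequency, between 0 and 0.5
    rsq: float   # imputation quality metric (e.g. MACH Rsq)


# ASSUMPTION: illustrative (upper MAF bound, minimum Rsq) pairs, sorted by
# MAF bound. These numbers are placeholders, not values from the study.
HYPOTHETICAL_CUTOFFS = [(0.005, 0.8), (0.05, 0.5), (0.5, 0.3)]


def rsq_cutoff(maf: float, cutoffs=HYPOTHETICAL_CUTOFFS) -> float:
    """Return the minimum Rsq required for a variant with the given MAF."""
    for upper_maf, min_rsq in cutoffs:
        if maf <= upper_maf:
            return min_rsq
    raise ValueError(f"MAF {maf} is outside the configured bins")


def filter_variants(variants: Sequence[Variant], cutoffs=HYPOTHETICAL_CUTOFFS):
    """Keep only variants whose Rsq meets the cutoff for their MAF bin."""
    return [v for v in variants if v.rsq >= rsq_cutoff(v.maf, cutoffs)]
```

    With these placeholder bins, a rare variant must clear a stricter Rsq bar than a common one; in practice the bins and cutoffs would be calibrated against experimentally genotyped variants.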

  16. Multiple Imputation for Missing Values Through Conditional Semiparametric Odds Ratio Models

    PubMed Central

    Chen, Hua Yun; Xie, Hui; Qian, Yi

    2010-01-01

    Summary Multiple imputation is a practically useful approach to handling incompletely observed data in statistical analysis. Parameter estimation and inference based on imputed full data have been made easy by Rubin's rule for result combination. However, creating proper imputation that accommodates flexible models for statistical analysis in practice can be very challenging. We propose an imputation framework that uses conditional semiparametric odds ratio models to impute the missing values. The proposed imputation framework is more flexible and robust than the imputation approach based on the normal model. It is a compatible framework in comparison to the approach based on fully conditionally specified models. The proposed algorithms for multiple imputation through the Monte Carlo Markov Chain sampling approach can be straightforwardly carried out. Simulation studies demonstrate that the proposed approach performs better than existing, commonly used imputation approaches. The proposed approach is applied to imputing missing values in bone fracture data. PMID:21210771
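
    For readers unfamiliar with the combination step mentioned above, the following is a minimal sketch of the standard form of Rubin's rules for a scalar parameter: the pooled estimate is the mean of the per-imputation estimates, and the total variance is the within-imputation variance plus an inflated between-imputation variance. The function name `pool_rubin` and its argument layout are assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data analyses of one scalar parameter.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching inputs from at least two imputations")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total
```

    The factor (1 + 1/m) accounts for using a finite number of imputations; the square root of the total variance gives the pooled standard error.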

  17. Missing value imputation in longitudinal measures of alcohol consumption

    PubMed Central

    Grittner, Ulrike; Gmel, Gerhard; Ripatti, Samuli; Bloomfield, Kim; Wicki, Matthias

    2011-01-01

    Attrition in longitudinal studies can lead to biased results. The study is motivated by the unexpected observation that alcohol consumption decreased despite increased availability, which may be due to sample attrition of heavy drinkers. Several imputation methods have been proposed, but rarely compared in longitudinal studies of alcohol consumption. The imputation of consumption level measurements is computationally particularly challenging due to alcohol consumption being a semi-continuous variable (dichotomous drinking status and continuous volume among drinkers), and the non-normality of data in the continuous part. Data come from a longitudinal study in Denmark with four waves (2003–2006) and 1771 individuals at baseline. Five techniques for missing data are compared: Last value carried forward (LVCF) was used as a single, and Hotdeck, Heckman modelling, multivariate imputation by chained equations (MICE), and a Bayesian approach as multiple imputation methods. Predictive mean matching was used to account for non-normality, where instead of imputing regression estimates, “real” observed values from similar cases are imputed. Methods were also compared by means of a simulated dataset. The simulation showed that the Bayesian approach yielded the most unbiased estimates for imputation. The finding of no increase in consumption levels despite a higher availability remained unaltered. PMID:21556290
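
    Of the five techniques listed, last value carried forward is the simplest to state in code. The sketch below assumes one respondent's measurements are held in wave order with `None` marking a missing wave; the function name `lvcf` and this data layout are illustrative assumptions, not the study's implementation.

```python
def lvcf(series):
    """Fill missing waves with the most recent observed value.

    series: measurements for one respondent in wave order, None = missing.
    Leading missing values stay None because nothing precedes them.
    """
    filled = []
    last_seen = None
    for value in series:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled
```

    Because every gap is filled with a single carried-forward value, LVCF is a single-imputation method and does not reflect uncertainty about the missing measurements, which is one reason the abstract contrasts it with multiple imputation approaches.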

  18. Novel and efficient tag SNPs selection algorithms.

    PubMed

    Chen, Wen-Pei; Hung, Che-Lun; Tsai, Suh-Jen Jane; Lin, Yaw-Ling

    2014-01-01

    SNPs are the most abundant forms of genetic variations amongst species; the association studies between complex diseases and SNPs or haplotypes have received great attention. However, these studies are restricted by the cost of genotyping all SNPs; thus, it is necessary to find smaller subsets, or tag SNPs, representing the rest of the SNPs. In fact, the existing tag SNP selection algorithms are notoriously time-consuming. An efficient algorithm for tag SNP selection was presented, which was applied to analyze the HapMap YRI data. The experimental results show that the proposed algorithm can achieve better performance than the existing tag SNP selection algorithms; in most cases, this proposed algorithm is at least ten times faster than the existing methods. In many cases, when the redundant ratio of the block is high, the proposed algorithm can even be thousands of times faster than the previously known methods. Tools and web services for haplotype block analysis integrated by the Hadoop MapReduce framework are also developed using the proposed algorithm as computation kernels. PMID:24212035

  19. Reference-free detection of isolated SNPs

    PubMed Central

    Uricaru, Raluca; Rizk, Guillaume; Lacroix, Vincent; Quillery, Elsa; Plantard, Olivier; Chikhi, Rayan; Lemaitre, Claire; Peterlongo, Pierre

    2015-01-01

    Detecting single nucleotide polymorphisms (SNPs) between genomes is becoming a routine task with next-generation sequencing. Generally, SNP detection methods use a reference genome. As non-model organisms are increasingly investigated, the need for reference-free methods has been amplified. Most of the existing reference-free methods have fundamental limitations: they can only call SNPs between exactly two datasets, and/or they require a prohibitive amount of computational resources. The method we propose, discoSnp, detects both heterozygous and homozygous isolated SNPs from any number of read datasets, without a reference genome, and with very low memory and time footprints (billions of reads can be analyzed with a standard desktop computer). To facilitate downstream genotyping analyses, discoSnp ranks predictions and outputs quality and coverage per allele. Compared to finding isolated SNPs using a state-of-the-art assembly and mapping approach, discoSnp requires significantly less computational resources, shows similar precision/recall values, and highly ranked predictions are less likely to be false positives. An experimental validation was conducted on an arthropod species (the tick Ixodes ricinus) on which de novo sequencing was performed. Among the predicted SNPs that were tested, 96% were successfully genotyped and truly exhibited polymorphism. PMID:25404127

  20. Selecting SNPs for pharmacogenomic association study.

    PubMed

    Ahn, Tae Jin; Park, Kyunghee; Son, Dae-Soon; Huh, Nam; Oh, Sohee; Bae, Taejung; Park, Jung-Sun; Lee, Ji-Hyun; Rho, Kyoohyoung; Kim, Sunghoon; Park, Taesung; Lee, Kyusang

    2012-01-01

    The SNP genotyping device is an essential tool in the upcoming era of personal genomes and personalised medicine. The human genome has more than 10 million SNPs, whereas a conventional SNP genotyping device can hold only 1 million SNPs. Thus, intelligent selection of SNP content is required to maximise the value of a SNP genotyping device. Here, we developed a new selection algorithm and applied this method to design genotyping content for cancer and pharmacogenomic association studies. This approach significantly increased the product value when compared with the content of a competing SNP genotyping product. PMID:23155780

  1. Use of Multiple Imputation Models in Medical Device Trials

    Microsoft Academic Search

    Donald B. Rubin; Samantha R. Cook

    Missing data are a common problem with data sets in most clinical trials, including those dealing with devices. Imputation, or filling in the missing values, is an intuitive and flexible way to handle the incomplete data sets that arise because of such missing data. Here we present several imputation strategies and their theoretical background, as well as some current examples

  2. Diagnosing imputation models by applying target analyses to posterior replicates of completed data

    PubMed Central

    He, Yulei; Zaslavsky, Alan M.

    2014-01-01

    Multiple imputation fills in missing data with posterior predictive draws from imputation models. To assess the adequacy of imputation models, we can compare completed data with their replicates simulated under the imputation model. We apply analyses of substantive interest to both datasets and use posterior predictive checks of the differences of these estimates to quantify the evidence of model inadequacy. We can further integrate out the imputed missing data and their replicates over the completed-data analyses to reduce variance in the comparison. In many cases, the checking procedure can be easily implemented using standard imputation software by treating re-imputations under the model as posterior predictive replicates. Thus, it can be applied for non-Bayesian imputation methods. We also sketch several strategies for applying the method in the context of practical imputation analyses. We illustrate the method using two real data applications and study its property using a simulation. PMID:22139814

  3. Simultaneous Analysis of All SNPs in Genome-Wide and Re-Sequencing Association Studies

    PubMed Central

    Hoggart, Clive J.; Whittaker, John C.; De Iorio, Maria; Balding, David J.

    2008-01-01

    Testing one SNP at a time does not fully realise the potential of genome-wide association studies to identify multiple causal variants, which is a plausible scenario for many complex diseases. We show that simultaneous analysis of the entire set of SNPs from a genome-wide study to identify the subset that best predicts disease outcome is now feasible, thanks to developments in stochastic search methods. We used a Bayesian-inspired penalised maximum likelihood approach in which every SNP can be considered for additive, dominant, and recessive contributions to disease risk. Posterior mode estimates were obtained for regression coefficients that were each assigned a prior with a sharp mode at zero. A non-zero coefficient estimate was interpreted as corresponding to a significant SNP. We investigated two prior distributions and show that the normal-exponential-gamma prior leads to improved SNP selection in comparison with single-SNP tests. We also derived an explicit approximation for type-I error that avoids the need to use permutation procedures. As well as genome-wide analyses, our method is well-suited to fine mapping with very dense SNP sets obtained from re-sequencing and/or imputation. It can accommodate quantitative as well as case-control phenotypes, covariate adjustment, and can be extended to search for interactions. Here, we demonstrate the power and empirical type-I error of our approach using simulated case-control data sets of up to 500 K SNPs, a real genome-wide data set of 300 K SNPs, and a sequence-based dataset, each of which can be analysed in a few hours on a desktop workstation. PMID:18654633

  4. Analysis of Human Triallelic SNPs by Next-Generation Sequencing.

    PubMed

    Cao, Min; Shi, Juan; Wang, Jiqiu; Hong, Jie; Cui, Bin; Ning, Guang

    2015-07-01

    Although single-nucleotide polymorphisms (SNPs) have become extremely useful in the study of genetic variation, triallelic SNPs are still not fully understood. Next-generation sequencing (NGS) is a promising approach to identify triallelic sites in large populations. In this study, we explored exome sequencing data from 221 Chinese individuals, with an average depth of 70-fold. We identified 382,901 SNPs in the study samples, including 2,002 (0.52%) triallelic sites. Among the triallelic SNPs, 17.3% were coding SNPs (cSNPs) and 78.3% were novel. Comparison and analysis revealed that the variant alleles were more likely to result in nonsynonymous variation at triallelic sites. In addition, natural selection seemed to influence triallelic SNPs. However, with the limited sample size assessed, more studies will be required in order to fully characterize the features of triallelic SNPs. PMID:25907303

  5. Parameter estimation in the stochastic Morris-Lecar neuronal model with particle filter methods

    E-print Network

    Samson, Adeline

    with particle filter methods 2 Therefore, there is a growing demand for robust methods to estimate biophysicalParameter estimation in the stochastic Morris-Lecar neuronal model with particle filter methods model. In this paper, we propose a sequential Monte Carlo particle filter algorithm to impute

  6. Identification and analysis of deleterious human SNPs.

    PubMed

    Yue, Peng; Moult, John

    2006-03-10

    We have developed two methods of identifying which non-synonymous single base changes have a deleterious effect on protein function in vivo. One method, described elsewhere, analyzes the effect of the resulting amino acid change on protein stability, utilizing structural information. The other method, introduced here, makes use of the conservation and type of residues observed at a base change position within a protein family. A machine learning technique, the support vector machine, is trained on single amino acid changes that cause monogenic disease, with a control set of amino acid changes fixed between species. Both methods are used to identify deleterious single nucleotide polymorphisms (SNPs) in the human population. After carefully controlling for errors, we find that approximately one quarter of known non-synonymous SNPs are deleterious by these criteria, providing a set of possible contributors to human complex disease traits. PMID:16412461

  7. A Comparison of Item-Level and Scale-Level Multiple Imputation for Questionnaire Batteries

    ERIC Educational Resources Information Center

    Gottschall, Amanda C.; West, Stephen G.; Enders, Craig K.

    2012-01-01

    Behavioral science researchers routinely use scale scores that sum or average a set of questionnaire items to address their substantive questions. A researcher applying multiple imputation to incomplete questionnaire data can either impute the incomplete items prior to computing scale scores or impute the scale scores directly from other scale…

  8. HICCUP: Hierarchical Clustering Based Value Imputation using Heterogeneous Gene Expression Microarray Datasets

    Microsoft Academic Search

    Qiankun Zhao; Prasenjit Mitra; Dongwon Lee; Jaewoo Kang

    2007-01-01

    Abstract: A novel microarray value imputation method, HICCUP, is presented. HICCUP improves upon existing value imputation methods in several ways. (1) By judiciously integrating heterogeneous microarray datasets using hierarchical clustering, HICCUP overcomes the limitation of using only a single dataset with a limited number of samples; (2) Unlike local or global value imputation methods, by mining association rules, HICCUP selects appropriate subsets

  9. Multiple Imputation of Item Scores in Test and Questionnaire Data, and Influence on Psychometric Results

    ERIC Educational Resources Information Center

    van Ginkel, Joost R.; van der Ark, L. Andries; Sijtsma, Klaas

    2007-01-01

    The performance of five simple multiple imputation methods for dealing with missing data was compared. In addition, random imputation and multivariate normal imputation were used as lower and upper benchmark, respectively. Test data were simulated and item scores were deleted such that they were either missing completely at random, missing at…

  10. Handling Missing Values in Longitudinal Panel Data With Multiple Imputation

    PubMed Central

    Young, Rebekah; Johnson, David R.

    2015-01-01

    This article offers an applied review of key issues and methods for the analysis of longitudinal panel data in the presence of missing values. The authors consider the unique challenges associated with attrition (survey dropout), incomplete repeated measures, and unknown observations of time. Using simulated data based on 4 waves of the Marital Instability Over the Life Course Study (n = 2,034), they applied a fixed effect regression model and an event-history analysis with time-varying covariates. They then compared results for analyses with nonimputed missing data and with imputed data both in long and in wide structures. Imputation produced improved estimates in the event-history analysis but only modest improvements in the estimates and standard errors of the fixed effects analysis. Factors responsible for differences in the value of imputation are examined, and recommendations for handling missing values in panel data are presented.

  11. A functional multiple imputation approach to incomplete longitudinal data

    PubMed Central

    He, Yulei; Yucel, Recai; Raghunathan, Trivellore E.

    2013-01-01

    In designed longitudinal studies, information from the same set of subjects are collected repeatedly over time. The longitudinal measurements are often subject to missing data which impose an analytic challenge. We propose a functional multiple imputation approach modeling longitudinal response profiles as smooth curves of time under a functional mixed effects model. We develop a Gibbs sampling algorithm to draw model parameters and imputations for missing values, using a blocking technique for an increased computational efficiency. In an illustrative example, we apply a multiple imputation analysis to data from the Panel Study of Income Dynamics and the Child Development Supplement to investigate the gradient effect of family income on children's health status. Our simulation study demonstrates that this approach performs well under varying modeling assumptions on the time trajectory functions and missingness patterns. PMID:21341300

  12. Meta-analysis and imputation refines the association of 15q25 with smoking quantity

    PubMed Central

    Liu, Jason Z.; Tozzi, Federica; Waterworth, Dawn M.; Pillai, Sreekumar G.; Muglia, Pierandrea; Middleton, Lefkos; Berrettini, Wade; Knouff, Christopher W.; Yuan, Xin; Waeber, Gérard; Vollenweider, Peter; Preisig, Martin; Wareham, Nicholas J; Zhao, Jing Hua; Loos, Ruth J.F.; Barroso, Inês; Khaw, Kay-Tee; Grundy, Scott; Barter, Philip; Mahley, Robert; Kesaniemi, Antero; McPherson, Ruth; Vincent, John B.; Strauss, John; Kennedy, James L.; Farmer, Anne; McGuffin, Peter; Day, Richard; Matthews, Keith; Bakke, Per; Gulsvik, Amund; Lucae, Susanne; Ising, Marcus; Brueckl, Tanja; Horstmann, Sonja; Wichmann, H.-Erich; Rawal, Rajesh; Dahmen, Norbert; Lamina, Claudia; Polasek, Ozren; Zgaga, Lina; Huffman, Jennifer; Campbell, Susan; Kooner, Jaspal; Chambers, John C; Burnett, Mary Susan; Devaney, Joseph M.; Pichard, Augusto D.; Kent, Kenneth M.; Satler, Lowell; Lindsay, Joseph M.; Waksman, Ron; Epstein, Stephen; Wilson, James F.; Wild, Sarah H.; Campbell, Harry; Vitart, Veronique; Reilly, Muredach P.; Li, Mingyao; Qu, Liming; Wilensky, Robert; Matthai, William; Hakonarson, Hakon H.; Rader, Daniel J.; Franke, Andre; Wittig, Michael; Schäfer, Arne; Uda, Manuela; Terracciano, Antonio; Xiao, Xiangjun; Busonero, Fabio; Scheet, Paul; Schlessinger, David; St Clair, David; Rujescu, Dan; Abecasis, Gonçalo R.; Grabe, Hans Jörgen; Teumer, Alexander; Völzke, Henry; Petersmann, Astrid; John, Ulrich; Rudan, Igor; Hayward, Caroline; Wright, Alan F.; Kolcic, Ivana; Wright, Benjamin J; Thompson, John R; Balmforth, Anthony J.; Hall, Alistair S.; Samani, Nilesh J.; Anderson, Carl A.; Ahmad, Tariq; Mathew, Christopher G.; Parkes, Miles; Satsangi, Jack; Caulfield, Mark; Munroe, Patricia B.; Farrall, Martin; Dominiczak, Anna; Worthington, Jane; Thomson, Wendy; Eyre, Steve; Barton, Anne; Mooser, Vincent; Francks, Clyde; Marchini, Jonathan

    2013-01-01

    Smoking is a leading global cause of disease and mortality. We performed a genomewide meta-analytic association study of smoking-related behavioral traits in a total sample of 41,150 individuals drawn from 20 disease, population, and control cohorts. Our analysis confirmed an effect on smoking quantity (SQ) at a locus on 15q25 (P=9.45e-19) that includes three genes encoding neuronal nicotinic acetylcholine receptor subunits (CHRNA5, CHRNA3, CHRNB4). We used data from the 1000 Genomes project to investigate the region using imputation, which allowed analysis of virtually all common variants in the region and offered a five-fold increase in coverage over the HapMap. This increased the spectrum of potentially causal single nucleotide polymorphisms (SNPs), which included a novel SNP that showed the highest significance, rs55853698, located within the promoter region of CHRNA5. Conditional analysis also identified a secondary locus (rs6495308) in CHRNA3. PMID:20418889

  13. TSPYL5 SNPs: Association with Plasma Estradiol Concentrations and Aromatase Expression

    PubMed Central

    Liu, Mohan; Ingle, James N.; Fridley, Brooke L.; Buzdar, Aman U.; Robson, Mark E.; Kubo, Michiaki; Wang, Liewei; Batzler, Anthony; Jenkins, Gregory D.; Pietrzak, Tracy L.; Carlson, Erin E.; Goetz, Matthew P.; Northfelt, Donald W.; Perez, Edith A.; Williard, Clark V.; Schaid, Daniel J.; Nakamura, Yusuke

    2013-01-01

    We performed a discovery genome-wide association study to identify genetic factors associated with variation in plasma estradiol (E2) concentrations using DNA from 772 postmenopausal women with estrogen receptor (ER)-positive breast cancer prior to the initiation of aromatase inhibitor therapy. Association analyses showed that the single nucleotide polymorphism (SNP) (rs1864729) with the lowest P value (P = 3.49E-08) mapped to chromosome 8 near TSPYL5. We also identified 17 imputed SNPs in or near TSPYL5 with P values < 5E-08, one of which, rs2583506, created a functional estrogen response element. We then used a panel of lymphoblastoid cell lines (LCLs) stably transfected with ERα with known genome-wide SNP genotypes to demonstrate that TSPYL5 expression increased after E2 exposure of cells heterozygous for variant TSPYL5 SNP genotypes, but not in those homozygous for wild-type alleles. TSPYL5 knockdown decreased, and overexpression increased, aromatase (CYP19A1) expression in MCF-7 cells, LCLs, and adipocytes through the skin/adipose (I.4) promoter. Chromatin immunoprecipitation assay showed that TSPYL5 bound to the CYP19A1 I.4 promoter. A putative TSPYL5 binding motif was identified in 43 genes, and TSPYL5 appeared to function as a transcription factor for most of those genes. In summary, genome-wide significant SNPs in TSPYL5 were associated with elevated plasma E2 in postmenopausal breast cancer patients. SNP rs2583506 created a functional estrogen response element, and LCLs with variant SNP genotypes displayed increased E2-dependent TSPYL5 expression. TSPYL5 induced CYP19A1 expression and that of many other genes. These studies have revealed a novel mechanism for regulating aromatase expression and plasma E2 concentrations in postmenopausal women with ER(+) breast cancer. PMID:23518928

  14. Guidebook for Imputation of Missing Data. Technical Report No. 17.

    ERIC Educational Resources Information Center

    Wise, Lauress L.; McLaughlin, Donald H.

    This guidebook is designed for data analysts who are working with computer data files that contain records with incomplete data. It indicates choices the analyst must make and the criteria for making those choices in regard to the following questions: (1) What resources are available for performing the imputation? (2) How big is the data file? (3)…

  15. SKIF: a data imputation framework for concept drifting data streams

    Microsoft Academic Search

    Peng Zhang; Xingquan Zhu; Jianlong Tan; Li Guo

    2010-01-01

    Missing data commonly occurs in many applications. While many data imputation methods exist to handle the missing data problem for large scale databases, when applied to concept drifting data streams, these methods face some common difficulties. First, due to large and continuous data volumes, we are unable to maintain all stream records to form a candidate pool and estimate missing

  16. Imputation of Missing Data Using Machine Learning Techniques

    Microsoft Academic Search

    Kamakshi Lakshminarayan; Steven A. Harp; Robert P. Goldman; Tariq Samad

    1996-01-01

    A serious problem in mining industrial data bases is that they are often incomplete, and a significant amount of data is missing, or erroneously entered. This paper explores the use of machine-learning based alternatives to standard statistical data completion (data imputation) methods, for dealing with missing data. We have approached the data completion problem using two well-known machine learning

  17. Multiple Imputation Strategies for Multiple Group Structural Equation Models

    ERIC Educational Resources Information Center

    Enders, Craig K.; Gottschall, Amanda C.

    2011-01-01

    Although structural equation modeling software packages use maximum likelihood estimation by default, there are situations where one might prefer to use multiple imputation to handle missing data rather than maximum likelihood estimation (e.g., when incorporating auxiliary variables). The selection of variables is one of the nuances associated…

  18. Approximation Algorithms for the Selection of Robust Tag SNPs

    Microsoft Academic Search

    Yao-ting Huang; Kui Zhang; Ting Chen; Kun-mao Chao

    2004-01-01

    Recent studies have shown that chromosomal recombination only takes place at some narrow hotspots. Within the chromosomal region between these hotspots (called a haplotype block), little or even no recombination occurs, and a small subset of SNPs (called tag SNPs) is sufficient to capture the haplotype pattern of the block. In reality, the tag SNPs may be genotyped as missing

  19. Localization of Allotetraploid Gossypium SNPs Using Physical Mapping Resources

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Recent efforts in Gossypium SNP development have produced thousands of putative SNPs for G. barbadense, G. mustelinum, and G. tomentosum relative to G. hirsutum. Here we report on current efforts to localize putative SNPs using physical mapping resources. Recent advances in physical mapping resour...

  20. Human non-synonymous SNPs: server and survey.

    PubMed

    Ramensky, Vasily; Bork, Peer; Sunyaev, Shamil

    2002-09-01

    Human single nucleotide polymorphisms (SNPs) represent the most frequent type of human population DNA variation. One of the main goals of SNP research is to understand the genetics of the human phenotype variation and especially the genetic basis of human complex diseases. Non-synonymous coding SNPs (nsSNPs) comprise a group of SNPs that, together with SNPs in regulatory regions, are believed to have the highest impact on phenotype. Here we present a World Wide Web server to predict the effect of an nsSNP on protein structure and function. The prediction method enabled analysis of the publicly available SNP database HGVbase, which gave rise to a dataset of nsSNPs with predicted functionality. The dataset was further used to compare the effect of various structural and functional characteristics of amino acid substitutions responsible for phenotypic display of nsSNPs. We also studied the dependence of selective pressure on the structural and functional properties of proteins. We found that in our dataset the selection pressure against deleterious SNPs depends on the molecular function of the protein, although it is insensitive to several other protein features considered. The strongest selective pressure was detected for proteins involved in transcription regulation. PMID:12202775

  1. Pathway-based identification of SNPs predictive of survival

    PubMed Central

    Pang, Herbert; Hauser, Michael; Minvielle, Stéphane

    2011-01-01

    In recent years, several association analysis methods for case-control studies have been developed. However, as we turn towards the identification of single nucleotide polymorphisms (SNPs) for prognosis, there is a need to develop methods for the identification of SNPs in high dimensional data with survival outcomes. Traditional methods for the identification of SNPs have some drawbacks. First, the majority of the approaches for case-control studies are based on single SNPs. Second, SNPs that are identified without incorporating biological knowledge are more difficult to interpret. Random forests has been found to perform well in gene expression analysis with survival outcomes. In this paper we present the first pathway-based method to correlate SNP with survival outcomes using a machine learning algorithm. We illustrate the application of pathway-based analysis of SNPs predictive of survival with a data set of 192 multiple myeloma patients genotyped for 500,000 SNPs. We also present simulation studies that show that the random forests technique with log-rank score split criterion outperforms several other machine learning algorithms. Thus, pathway-based survival analysis using machine learning tools represents a promising approach for the identification of biologically meaningful SNPs associated with disease. PMID:21368918

  2. Doubly Robust Nonparametric Multiple Imputation for Ignorable Missing Data.

    PubMed

    Long, Qi; Hsu, Chiu-Hsieh; Li, Yisheng

    2012-01-01

    Missing data are common in medical and social science studies and often pose a serious challenge in data analysis. Multiple imputation methods are popular and natural tools for handling missing data, replacing each missing value with a set of plausible values that represent the uncertainty about the underlying values. We consider a case of missing at random (MAR) and investigate the estimation of the marginal mean of an outcome variable in the presence of missing values when a set of fully observed covariates is available. We propose a new nonparametric multiple imputation (MI) approach that uses two working models to achieve dimension reduction and define the imputing sets for the missing observations. Compared with existing nonparametric imputation procedures, our approach can better handle covariates of high dimension, and is doubly robust in the sense that the resulting estimator remains consistent if either of the working models is correctly specified. Compared with existing doubly robust methods, our nonparametric MI approach is more robust to the misspecification of both working models; it also avoids the use of inverse-weighting and hence is less sensitive to missing probabilities that are close to 1. We propose a sensitivity analysis for evaluating the validity of the working models, allowing investigators to choose the optimal weights so that the resulting estimator relies either completely or more heavily on the working model that is likely to be correctly specified and achieves improved efficiency. We investigate the asymptotic properties of the proposed estimator, and perform simulation studies to show that the proposed method compares favorably with some existing methods in finite samples. The proposed method is further illustrated using data from a colorectal adenoma study. PMID:22347786

  3. Missing Data and Multiple Imputation: An Unbiased Approach

    NASA Technical Reports Server (NTRS)

    Foy, M.; VanBaalen, M.; Wear, M.; Mendez, C.; Mason, S.; Meyers, V.; Alexander, D.; Law, J.

    2014-01-01

    The default method of dealing with missing data in statistical analyses is to only use the complete observations (complete case analysis), which can lead to unexpected bias when data do not meet the assumption of missing completely at random (MCAR). For the assumption of MCAR to be met, missingness cannot be related to either the observed or unobserved variables. A less stringent assumption, missing at random (MAR), requires that missingness not be associated with the value of the missing variable itself, but can be associated with the other observed variables. When data are truly MAR as opposed to MCAR, the default complete case analysis method can lead to biased results. There are statistical options available to adjust for data that are MAR, including multiple imputation (MI) which is consistent and efficient at estimating effects. Multiple imputation uses informing variables to determine statistical distributions for each piece of missing data. Then multiple datasets are created by randomly drawing on the distributions for each piece of missing data. Since MI is efficient, only a limited number, usually less than 20, of imputed datasets are required to get stable estimates. Each imputed dataset is analyzed using standard statistical techniques, and then results are combined to get overall estimates of effect. A simulation study will be demonstrated to show the results of using the default complete case analysis, and MI in a linear regression of MCAR and MAR simulated data. Further, MI was successfully applied to the association study of CO2 levels and headaches when initial analysis showed there may be an underlying association between missing CO2 levels and reported headaches. Through MI, we were able to show that there is a strong association between average CO2 levels and the risk of headaches. Each unit increase in CO2 (mmHg) resulted in a doubling in the odds of reported headaches.
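
    The "analyze each imputed dataset, then combine" step described above can be sketched as follows. The abstract does not say which combining rule was used; Rubin's rules are the standard choice and are assumed here. The function name `pool_rubin` and its inputs are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-dataset results with Rubin's rules (assumed combining rule).

    estimates: one point estimate per imputed dataset (e.g. a regression slope)
    variances: the squared standard error of each of those estimates
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputed datasets of equal length")
    pooled = mean(estimates)          # average of the m point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total
```

    The between-imputation term is what distinguishes this from simply averaging: it carries the extra uncertainty that comes from the missing values themselves.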

  4. Missing Data Imputation in Time Series by Evolutionary Algorithms

    Microsoft Academic Search

    Juan C. Figueroa García; Dusko Kalenatic; Cesar Amilcar Lopez Bello

    2008-01-01

    This paper presents a proposal based on an Evolutionary algorithm for imputing missing observations in Time Series. A genetic algorithm based on the minimization of an error function derived from their autocorrelation function, mean and variance, is presented. All methodological aspects of the genetic structure are presented. An extended explanation of the design of the Fitness Function is provided. Four

  5. Imputation and quality control steps for combining multiple genome-wide datasets

    PubMed Central

    Verma, Shefali S.; de Andrade, Mariza; Tromp, Gerard; Kuivaniemi, Helena; Pugh, Elizabeth; Namjou-Khales, Bahram; Mukherjee, Shubhabrata; Jarvik, Gail P.; Kottyan, Leah C.; Burt, Amber; Bradford, Yuki; Armstrong, Gretta D.; Derr, Kimberly; Crawford, Dana C.; Haines, Jonathan L.; Li, Rongling; Crosslin, David; Ritchie, Marylyn D.

    2014-01-01

    The electronic MEdical Records and GEnomics (eMERGE) network brings together DNA biobanks linked to electronic health records (EHRs) from multiple institutions. Approximately 51,000 DNA samples from distinct individuals have been genotyped using genome-wide SNP arrays across the nine sites of the network. The eMERGE Coordinating Center and the Genomics Workgroup developed a pipeline to impute and merge genomic data across the different SNP arrays to maximize sample size and power to detect associations with a variety of clinical endpoints. The 1000 Genomes cosmopolitan reference panel was used for imputation. Imputation results were evaluated using the following metrics: accuracy of imputation, allelic R2 (estimated correlation between the imputed and true genotypes), and the relationship between allelic R2 and minor allele frequency. Computation time and memory resources required by two different software packages (BEAGLE and IMPUTE2) were also evaluated. A number of challenges were encountered due to the complexity of using two different imputation software packages, multiple ancestral populations, and many different genotyping platforms. We present lessons learned and describe the pipeline implemented here to impute and merge genomic data sets. The eMERGE imputed dataset will serve as a valuable resource for discovery, leveraging the clinical data that can be mined from the EHR. PMID:25566314
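
    As a rough illustration of the accuracy metric mentioned above, the squared Pearson correlation between imputed dosages and known genotypes at one SNP can be computed as in the sketch below. This is the empirical version, which requires true genotypes (for example, masked calls); the allelic R2 reported by imputation software is an estimate made without them. The function name `dosage_r2` and the 0/1/2 allele-count coding are assumptions made for this example.

```python
from math import nan


def dosage_r2(imputed, truth):
    """Squared Pearson correlation between imputed dosages and true genotypes.

    imputed: expected allele counts per sample (floats in [0, 2])
    truth:   genotyped allele counts per sample (0, 1 or 2)
    Returns nan when either vector has no variation (a monomorphic SNP).
    """
    if len(imputed) != len(truth) or not imputed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(imputed)
    mean_i = sum(imputed) / n
    mean_t = sum(truth) / n
    cov = sum((a - mean_i) * (b - mean_t) for a, b in zip(imputed, truth))
    var_i = sum((a - mean_i) ** 2 for a in imputed)
    var_t = sum((b - mean_t) ** 2 for b in truth)
    if var_i == 0 or var_t == 0:
        return nan
    return cov * cov / (var_i * var_t)
```

    A post-imputation filter would then keep only SNPs whose value exceeds a chosen threshold; the threshold itself is a study-specific decision.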

  6. SNPs and Haplotypes in Native American Populations

    PubMed Central

    Kidd, Judith R.; Friedlaender, Françoise; Pakstis, Andrew J.; Furtado, Manohar; Fang, Rixun; Wang, Xudong; Nievergelt, Caroline M.; Kidd, Kenneth K.

    2013-01-01

    Autosomal DNA polymorphisms can provide new information and understanding of both the origins of and relationships among modern Native American populations. At the same time that autosomal markers can be highly informative, they are also susceptible to ascertainment biases in the selection of the markers to use. Identifying markers that can be used for ancestry inference among Native American populations can be considered separate from identifying markers to further the quest for history. In the current study we are using data on nine Native American populations to compare the results based on a large haplotype-based dataset with relatively small independent sets of SNPs. We are interested in what types of limited datasets an individual laboratory might be able to collect are best for addressing two different questions of interest. First, how well can we differentiate the Native American populations and/or infer ancestry by assigning an individual to her population(s) of origin? Second, how well can we infer the historical/evolutionary relationships among Native American populations and their Eurasian origins? We conclude that only a large comprehensive dataset involving multiple autosomal markers on multiple populations will be able to answer both questions; different small sets of markers are able to answer only one or the other of these questions. Using our largest dataset we see a general increasing distance from Old World populations from North to South in the New World except for an unexplained close relationship between our Maya and Quechua samples. PMID:21913176

  7. Genetic Diversity Analysis of Highly Incomplete SNP Genotype Data with Imputations: An Empirical Assessment

    PubMed Central

    Fu, Yong-Bi

    2014-01-01

    Genotyping by sequencing (GBS) recently has emerged as a promising genomic approach for assessing genetic diversity on a genome-wide scale. However, concerns are not lacking about the uniquely large unbalance in GBS genotype data. Although some genotype imputation has been proposed to infer missing observations, little is known about the reliability of a genetic diversity analysis of GBS data, with up to 90% of observations missing. Here we performed an empirical assessment of accuracy in genetic diversity analysis of highly incomplete single nucleotide polymorphism genotypes with imputations. Three large single-nucleotide polymorphism genotype data sets for corn, wheat, and rice were acquired, and missing data with up to 90% of missing observations were randomly generated and then imputed for missing genotypes with three map-independent imputation methods. Estimating heterozygosity and inbreeding coefficient from original, missing, and imputed data revealed variable patterns of bias from assessed levels of missingness and genotype imputation, but the estimation biases were smaller for missing data without genotype imputation. The estimates of genetic differentiation were rather robust up to 90% of missing observations but became substantially biased when missing genotypes were imputed. The estimates of topology accuracy for four representative samples of interested groups generally were reduced with increased levels of missing genotypes. Probabilistic principal component analysis based imputation performed better in terms of topology accuracy than those analyses of missing data without genotype imputation. These findings are not only significant for understanding the reliability of the genetic diversity analysis with respect to large missing data and genotype imputation but also are instructive for performing a proper genetic diversity analysis of highly incomplete GBS or other genotype data. PMID:24626289

  8. Tuning multiple imputation by predictive mean matching and local residual draws

    PubMed Central

    2014-01-01

    Background Multiple imputation is a commonly used method for handling incomplete covariates as it can provide valid inference when data are missing at random. This depends on being able to correctly specify the parametric model used to impute missing values, which may be difficult in many realistic settings. Imputation by predictive mean matching (PMM) borrows an observed value from a donor with a similar predictive mean; imputation by local residual draws (LRD) instead borrows the donor’s residual. Both methods relax some assumptions of parametric imputation, promising greater robustness when the imputation model is misspecified. Methods We review development of PMM and LRD and outline the various forms available, and aim to clarify some choices about how and when they should be used. We compare performance to fully parametric imputation in simulation studies, first when the imputation model is correctly specified and then when it is misspecified. Results In using PMM or LRD we strongly caution against using a single donor, the default value in some implementations, and instead advocate sampling from a pool of around 10 donors. We also clarify which matching metric is best. Among the current MI software there are several poor implementations. Conclusions PMM and LRD may have a role for imputing covariates (i) which are not strongly associated with outcome, and (ii) when the imputation model is thought to be slightly but not grossly misspecified. Researchers should spend efforts on specifying the imputation model correctly, rather than expecting predictive mean matching or local residual draws to do the work. PMID:24903709
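
    The donor-pool idea in predictive mean matching can be sketched as below. This is a deliberately simplified, single-covariate version: it fits one least-squares line and does not draw the regression parameters from their posterior, which a full multiple-imputation implementation would do. The names `fit_line` and `pmm_impute` are hypothetical; the default pool size of 10 follows the recommendation in the abstract.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pmm_impute(x, y, k=10, rng=None):
    """Fill None entries of y by predictive mean matching on covariate x.

    For each missing y, the k observed cases with the closest predicted
    mean form the donor pool, and one donor's observed y is borrowed at random.
    """
    rng = rng or random.Random(0)  # fixed seed only to make the sketch repeatable
    observed = [i for i, value in enumerate(y) if value is not None]
    intercept, slope = fit_line([x[i] for i in observed], [y[i] for i in observed])
    predicted = [intercept + slope * xi for xi in x]
    filled = list(y)
    for i, value in enumerate(y):
        if value is None:
            pool = sorted(observed, key=lambda j: abs(predicted[j] - predicted[i]))[:k]
            filled[i] = y[rng.choice(pool)]
    return filled
```

    Because every imputed value is copied from a real observation, the result always stays within the observed range of the variable. Setting `k=1` reproduces the single-donor behaviour that the authors caution against.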

  9. Genetic diversity analysis of highly incomplete SNP genotype data with imputations: an empirical assessment.

    PubMed

    Fu, Yong-Bi

    2014-05-01

    Genotyping by sequencing (GBS) recently has emerged as a promising genomic approach for assessing genetic diversity on a genome-wide scale. However, concerns are not lacking about the uniquely large unbalance in GBS genotype data. Although some genotype imputation has been proposed to infer missing observations, little is known about the reliability of a genetic diversity analysis of GBS data, with up to 90% of observations missing. Here we performed an empirical assessment of accuracy in genetic diversity analysis of highly incomplete single nucleotide polymorphism genotypes with imputations. Three large single-nucleotide polymorphism genotype data sets for corn, wheat, and rice were acquired, and missing data with up to 90% of missing observations were randomly generated and then imputed for missing genotypes with three map-independent imputation methods. Estimating heterozygosity and inbreeding coefficient from original, missing, and imputed data revealed variable patterns of bias from assessed levels of missingness and genotype imputation, but the estimation biases were smaller for missing data without genotype imputation. The estimates of genetic differentiation were rather robust up to 90% of missing observations but became substantially biased when missing genotypes were imputed. The estimates of topology accuracy for four representative samples of interested groups generally were reduced with increased levels of missing genotypes. Probabilistic principal component analysis based imputation performed better in terms of topology accuracy than those analyses of missing data without genotype imputation. These findings are not only significant for understanding the reliability of the genetic diversity analysis with respect to large missing data and genotype imputation but also are instructive for performing a proper genetic diversity analysis of highly incomplete GBS or other genotype data. PMID:24626289

  10. 22 CFR 1508.630 - May the African Development Foundation impute conduct of one person to another?

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... May the African Development Foundation impute conduct of one person to another ... Foreign Relations AFRICAN DEVELOPMENT FOUNDATION GOVERNMENTWIDE DEBARMENT AND SUSPENSION ... 630 May the African Development Foundation impute conduct of one person to ...

  11. Introduction to Copulas Parameterization of Copulas Parameter estimation Example: Imputation of Pima diabetes data Discussion Multivariate density estimation via copulas

    E-print Network

    Hoff, Peter

    of Pima diabetes data Discussion Multivariate density estimation via copulas Peter Hoff Statistics estimation Example: Imputation of Pima diabetes data Discussion Outline Introduction to Copulas Parameterization of Copulas Parameter estimation Example: Imputation of Pima diabetes data Discussion

  12. 22 CFR 1508.630 - May the African Development Foundation impute conduct of one person to another?

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... May the African Development Foundation impute conduct of one person to another ... Foreign Relations AFRICAN DEVELOPMENT FOUNDATION GOVERNMENTWIDE DEBARMENT AND SUSPENSION ... 630 May the African Development Foundation impute conduct of one person to ...

  13. 22 CFR 1508.630 - May the African Development Foundation impute conduct of one person to another?

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... May the African Development Foundation impute conduct of one person to another ... Foreign Relations AFRICAN DEVELOPMENT FOUNDATION GOVERNMENTWIDE DEBARMENT AND SUSPENSION ... 630 May the African Development Foundation impute conduct of one person to ...

  14. 22 CFR 1508.630 - May the African Development Foundation impute conduct of one person to another?

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... May the African Development Foundation impute conduct of one person to another ... Foreign Relations AFRICAN DEVELOPMENT FOUNDATION GOVERNMENTWIDE DEBARMENT AND SUSPENSION ... 630 May the African Development Foundation impute conduct of one person to ...

  15. 22 CFR 1508.630 - May the African Development Foundation impute conduct of one person to another?

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... May the African Development Foundation impute conduct of one person to another ... Foreign Relations AFRICAN DEVELOPMENT FOUNDATION GOVERNMENTWIDE DEBARMENT AND SUSPENSION ... 630 May the African Development Foundation impute conduct of one person to ...

  16. Identification, validation and high-throughput genotyping of transcribed gene SNPs in cassava.

    PubMed

    Ferguson, Morag E; Hearne, Sarah J; Close, Timothy J; Wanamaker, Steve; Moskal, William A; Town, Christopher D; de Young, Joe; Marri, Pradeep Reddy; Rabbi, Ismail Yusuf; de Villiers, Etienne P

    2012-03-01

    The availability of genomic resources can facilitate progress in plant breeding through the application of advanced molecular technologies for crop improvement. This is particularly important in the case of less researched crops such as cassava, a staple and food security crop for more than 800 million people. Here, expressed sequence tags (ESTs) were generated from five drought stressed and well-watered cassava varieties. Two cDNA libraries were developed: one from root tissue (CASR), the other from leaf, stem and stem meristem tissue (CASL). Sequencing generated 706 contigs and 3,430 singletons. These sequences were combined with those from two other EST sequencing initiatives and filtered based on the sequence quality. Quality sequences were aligned using CAP3 and embedded in a Windows browser called HarvEST:Cassava which is made available. HarvEST:Cassava consists of a Unigene set of 22,903 quality sequences. A total of 2,954 putative SNPs were identified. Of these 1,536 SNPs from 1,170 contigs and 53 cassava genotypes were selected for SNP validation using Illumina's GoldenGate assay. As a result 1,190 SNPs were validated technically and biologically. The location of validated SNPs on scaffolds of the cassava genome sequence (v.4.1) is provided. A diversity assessment of 53 cassava varieties reveals some sub-structure based on the geographical origin, greater diversity in the Americas as opposed to Africa, and similar levels of diversity in West Africa and southern, eastern and central Africa. The resources presented allow for improved genetic dissection of economically important traits and the application of modern genomics-based approaches to cassava breeding and conservation. PMID:22069119

  17. SNPs selection using support vector regression and genetic algorithms in GWAS

    PubMed Central

    2014-01-01

    Introduction This paper proposes a new methodology to simultaneously select the most relevant SNPs markers for the characterization of any measurable phenotype described by a continuous variable using Support Vector Regression with Pearson Universal kernel as fitness function of a binary genetic algorithm. The proposed methodology is multi-attribute towards considering several markers simultaneously to explain the phenotype and is based jointly on statistical tools, machine learning and computational intelligence. Results The suggested method has shown potential in the simulated database 1, with additive effects only, and real database. In this simulated database, with a total of 1,000 markers, and 7 with major effect on the phenotype and the other 993 SNPs representing the noise, the method identified 21 markers. Of this total, 5 are relevant SNPs between the 7 but 16 are false positives. In real database, initially with 50,752 SNPs, we have reduced to 3,073 markers, increasing the accuracy of the model. In the simulated database 2, with additive effects and interactions (epistasis), the proposed method matched to the methodology most commonly used in GWAS. Conclusions The method suggested in this paper demonstrates the effectiveness in explaining the real phenotype (PTA for milk), because with the application of the wrapper based on genetic algorithm and Support Vector Regression with Pearson Universal, many redundant markers were eliminated, increasing the prediction and accuracy of the model on the real database without quality control filters. The PUK demonstrated that it can replicate the performance of linear and RBF kernels. PMID:25573332

  18. Gaussianization-based quasi-imputation and expansion strategies for incomplete correlated binary responses

    Microsoft Academic Search

    Hakan Demirtas; Donald Hedeker

    2007-01-01

    SUMMARY New quasi-imputation and expansion strategies for correlated binary responses are proposed by borrowing ideas from random number generation. The core idea is to convert correlated binary outcomes to multivariate normal outcomes in a sensible way so that re-conversion to the binary scale, after performing multiple imputation, yields the original specified marginal expectations and correlations. This conversion process

  19. DETECTING ERRORS AND IMPUTING MISSING DATA FOR SINGLE LOOP SURVEILLANCE SYSTEMS

    E-print Network

    Varaiya, Pravin

    and imputes missing or bad samples to form a complete grid of `clean data', in real time. The diagnostics others produce suspect data all the time. By examining a time series of measurements one can readily ... DETECTING ERRORS AND IMPUTING MISSING DATA FOR SINGLE LOOP SURVEILLANCE SYSTEMS Chao Chen

  20. Methods of Imputation used in the USDA National Nutrient Database for Standard Reference

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Objective: To present the predominant methods of imputing used to estimate nutrient values for foods in the USDA National Nutrient Database for Standard Reference (SR20). Materials and Methods: The USDA Nutrient Data Laboratory developed standard methods for imputing nutrient values for foods wh...

  1. A Simplified Framework for Using Multiple Imputation in Social Work Research

    ERIC Educational Resources Information Center

    Rose, Roderick A.; Fraser, Mark W.

    2008-01-01

    Missing data are nearly always a problem in research, and missing values represent a serious threat to the validity of inferences drawn from findings. Increasingly, social science researchers are turning to multiple imputation to handle missing data. Multiple imputation, in which missing values are replaced by values repeatedly drawn from…

  2. Alternative missing data techniques to grade point average: Imputing unavailable grades

    Microsoft Academic Search

    Niels Smits; Gideon J. Mellenbergh

    2002-01-01

    Abstract In this article, Grade Point Average (GPA) is considered a missing data technique for unavailable grades in school grade records. In Study 1, theoretical and empirical differences between GPA and 7 alternative missing grades techniques were considered. These 7 techniques are subject mean substitution, corrected subject mean, subject correlation substitution, regression imputation, EM algorithm imputation and two

  3. A WAVELET-BASED DATA IMPUTATION APPROACH TO SPECTROGRAM RECONSTRUCTION FOR ROBUST SPEECH RECOGNITION

    E-print Network

    Rose, Richard

    Gill University, Canada ABSTRACT Data imputation approaches for robust automatic speech recognition reconstruct known MMSE based approach on the Aurora 2 noisy speech recognition task. Index Terms-- Data Imputation speech to improve automatic speech recognition (ASR) performance. Most existing implementations are model

  4. Imputation Strategies for Missing Data in Environmental Time Series for an Unlucky Situation

    Microsoft Academic Search

    Daria Mendola

    After a detailed review of the main specific solutions for treatment of missing data in environmental time series, this paper deals with the unlucky situation in which, in an hourly series, missing data immediately follow an absolutely anomalous period, for which we do not have any similar period to use for imputation. A tentative multivariate and multiple imputation is put

  5. Detection of sharing by descent, long-range phasing and haplotype imputation

    Microsoft Academic Search

    Gisli Masson; Michael L Frigge; Arnaldur Gylfason; Pasha Zusmanovich; Gudmar Thorleifsson; Pall I Olason; Andres Ingason; Stacy Steinberg; Thorunn Rafnar; Patrick Sulem; Magali Mouy; Frosti Jonsson; Unnur Thorsteinsdottir; Daniel F Gudbjartsson; Hreinn Stefansson; Augustine Kong; Kari Stefansson

    2008-01-01

    Uncertainty about the phase of strings of SNPs creates complications in genetic analysis, although methods have been developed for phasing population-based samples. However, these methods can only phase a small number of SNPs effectively and become unreliable when applied to SNPs spanning many linkage disequilibrium (LD) blocks. Here we show how to phase more than 1,000 SNPs simultaneously for a

  6. Imputation of Unordered Markers and the Impact on Genomic Selection Accuracy

    PubMed Central

    Rutkoski, Jessica E.; Poland, Jesse; Jannink, Jean-Luc; Sorrells, Mark E.

    2013-01-01

    Genomic selection, a breeding method that promises to accelerate rates of genetic gain, requires dense, genome-wide marker data. Genotyping-by-sequencing can generate a large number of de novo markers. However, without a reference genome, these markers are unordered and typically have a large proportion of missing data. Because marker imputation algorithms were developed for species with a reference genome, algorithms suited for unordered markers have not been rigorously evaluated. Using four empirical datasets, we evaluate and characterize four such imputation methods, referred to as k-nearest neighbors, singular value decomposition, random forest regression, and expectation maximization imputation, in terms of their imputation accuracies and the factors affecting accuracy. The effect of imputation method on the genomic selection accuracy is assessed in comparison with mean imputation. The effect of excluding markers with a large proportion of missing data on the genomic selection accuracy is also examined. Our results show that imputation of unordered markers can be accurate, especially when linkage disequilibrium between markers is high and genotyped individuals are related. Of the methods evaluated, random forest regression imputation produced superior accuracy. In comparison with mean imputation, all four imputation methods we evaluated led to greater genomic selection accuracies when the level of missing data was high. Including rather than excluding markers with a large proportion of missing data nearly always led to greater GS accuracies. We conclude that high levels of missing data in dense marker sets is not a major obstacle for genomic selection, even when marker order is not known. PMID:23449944
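
    To make the comparison between mean imputation and a neighbour-based method concrete, the sketch below implements both for a small marker matrix with no marker order assumed. It is an illustration only: rows are taken to be individuals, columns markers, missing calls are `None`, and neighbours are chosen among individuals. The paper's own k-nearest neighbors procedure may define neighbours differently. The function names are hypothetical.

```python
def mean_impute(matrix):
    """Replace each missing call with the observed mean of its marker column.

    Assumes every marker has at least one observed call.
    """
    filled = [list(row) for row in matrix]
    for j in range(len(matrix[0])):
        observed = [row[j] for row in matrix if row[j] is not None]
        column_mean = sum(observed) / len(observed)
        for row in filled:
            if row[j] is None:
                row[j] = column_mean
    return filled


def _distance(a, b):
    """Mean squared difference over markers observed in both individuals."""
    shared = [(p, q) for p, q in zip(a, b) if p is not None and q is not None]
    if not shared:
        return None
    return sum((p - q) ** 2 for p, q in shared) / len(shared)


def knn_impute(matrix, k=2):
    """Fill each missing call with the average call of the k nearest individuals."""
    fallback = mean_impute(matrix)  # used when no comparable neighbour exists
    filled = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, call in enumerate(row):
            if call is not None:
                continue
            candidates = []
            for h, other in enumerate(matrix):
                if h == i or other[j] is None:
                    continue
                dist = _distance(row, other)
                if dist is not None:
                    candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = [value for _, value in candidates[:k]]
            filled[i][j] = sum(nearest) / len(nearest) if nearest else fallback[i][j]
    return filled
```

    Mean imputation ignores relatedness entirely, whereas the neighbour-based fill borrows from the most similar individuals, which is consistent with the abstract's observation that accuracy improves when genotyped individuals are related.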

  7. Imputation for semiparametric transformation models with biased-sampling data.

    PubMed

    Liu, Hao; Qin, Jing; Shen, Yu

    2012-10-01

    Widely recognized in many fields including economics, engineering, epidemiology, health sciences, technology and wildlife management, length-biased sampling generates biased and right-censored data but often provides the best information available for statistical inference. Different from traditional right-censored data, length-biased data have unique aspects resulting from their sampling procedures. We exploit these unique aspects and propose a general imputation-based estimation method for analyzing length-biased data under a class of flexible semiparametric transformation models. We present new computational algorithms that can jointly estimate the regression coefficients and the baseline function semiparametrically. The imputation-based method under the transformation model provides an unbiased estimator regardless of whether or not the censoring is independent of the covariates. We establish large-sample properties using the empirical processes method. Simulation studies show that under small to moderate sample sizes, the proposed procedure has smaller mean square errors than two existing estimation procedures. Finally, we demonstrate the estimation procedure by a real data example. PMID:22903245

  8. RECONSTRUCTING DNA COPY NUMBER BY PENALIZED ESTIMATION AND IMPUTATION

    PubMed Central

    Zhang, Zhongyang; Lange, Kenneth; Ophoff, Roel; Sabatti, Chiara

    2011-01-01

    Recent advances in genomics have underscored the surprising ubiquity of DNA copy number variation (CNV). Fortunately, modern genotyping platforms also detect CNVs with fairly high reliability. Hidden Markov models and algorithms have played a dominant role in the interpretation of CNV data. Here we explore CNV reconstruction via estimation with a fused-lasso penalty as suggested by Tibshirani and Wang [Biostatistics 9 (2008) 18–29]. We mount a fresh attack on this difficult optimization problem by the following: (a) changing the penalty terms slightly by substituting a smooth approximation to the absolute value function, (b) designing and implementing a new MM (majorization-minimization) algorithm, and (c) applying a fast version of Newton's method to jointly update all model parameters. Together these changes enable us to minimize the fused-lasso criterion in a highly effective way. We also reframe the reconstruction problem in terms of imputation via discrete optimization. This approach is easier and more accurate than parameter estimation because it relies on the fact that only a handful of possible copy number states exist at each SNP. The dynamic programming framework has the added bonus of exploiting information that the current fused-lasso approach ignores. The accuracy of our imputations is comparable to that of hidden Markov models at a substantially lower computational cost. PMID:21572975
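
    For orientation, the fused-lasso criterion referred to above can be written out as follows. The notation is assumed here for illustration: y_i is the observed intensity at SNP i, beta_i the fitted copy-number level, lambda_1 and lambda_2 the penalty weights, and epsilon a small smoothing constant. The square-root surrogate is one common smooth approximation to the absolute value and is not necessarily the exact form the authors use.

```latex
\[
f(\beta) \;=\; \frac{1}{2}\sum_{i=1}^{n}\left(y_i-\beta_i\right)^2
\;+\; \lambda_1\sum_{i=1}^{n}\lvert\beta_i\rvert
\;+\; \lambda_2\sum_{i=2}^{n}\lvert\beta_i-\beta_{i-1}\rvert,
\qquad
\lvert x\rvert \;\approx\; \sqrt{x^2+\epsilon},\quad \epsilon>0 .
\]
```

    The first penalty shrinks individual levels toward zero, while the second penalizes jumps between neighbouring SNPs and so favours piecewise-constant copy-number profiles. Replacing each absolute value with the smooth surrogate makes the objective differentiable, which is what allows Newton-type updates.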

  9. Missing value imputation improves clustering and interpretation of gene expression microarray data

    PubMed Central

    Tuikkala, Johannes; Elo, Laura L; Nevalainen, Olli S; Aittokallio, Tero

    2008-01-01

    Background Missing values frequently pose problems in gene expression microarray experiments as they can hinder downstream analysis of the datasets. While several missing value imputation approaches are available to the microarray users and new ones are constantly being developed, there is no general consensus on how to choose between the different methods since their performance seems to vary drastically depending on the dataset being used. Results We show that this discrepancy can mostly be attributed to the way in which imputation methods have traditionally been developed and evaluated. By comparing a number of advanced imputation methods on recent microarray datasets, we show that even when there are marked differences in the measurement-level imputation accuracies across the datasets, these differences become negligible when the methods are evaluated in terms of how well they can reproduce the original gene clusters or their biological interpretations. Regardless of the evaluation approach, however, imputation always gave better results than ignoring missing data points or replacing them with zeros or average values, emphasizing the continued importance of using more advanced imputation methods. Conclusion The results demonstrate that, while missing values are still severely complicating microarray data analysis, their impact on the discovery of biologically meaningful gene groups can – up to a certain degree – be reduced by using readily available and relatively fast imputation methods, such as the Bayesian Principal Components Algorithm (BPCA). PMID:18423022

  10. SNPeffect: a database mapping molecular phenotypic effects of human non-synonymous coding SNPs

    Microsoft Academic Search

    Joke Reumers; Joost Schymkowitz; Jesper Ferkinghoff-borg; Francois Stricher; Luis Serrano; Frederic Rousseau

    2005-01-01

    Single nucleotide polymorphisms (SNPs) are an increasingly important tool for genetic and biomedical research. However, the accumulated sequence information on allelic variation is not matched by an understanding of the effect of SNPs on the functional attributes or 'molecular phenotype' of a protein. Towards this aim we developed SNPeffect, an online resource of human non-synonymous coding SNPs (nsSNPs) mapping

  11. Association studies with imputed variants using expectation-maximization likelihood-ratio tests.

    PubMed

    Huang, Kuan-Chieh; Sun, Wei; Wu, Ying; Chen, Mengjie; Mohlke, Karen L; Lange, Leslie A; Li, Yun

    2014-01-01

    Genotype imputation has become standard practice in modern genetic studies. As sequencing-based reference panels continue to grow, increasingly more markers are being well or better imputed but at the same time, even more markers with relatively low minor allele frequency are being imputed with low imputation quality. Here, we propose new methods that incorporate imputation uncertainty for downstream association analysis, with improved power and/or computational efficiency. We consider two scenarios: I) when posterior probabilities of all potential genotypes are estimated; and II) when only the one-dimensional summary statistic, imputed dosage, is available. For scenario I, we have developed an expectation-maximization likelihood-ratio test for association based on posterior probabilities. When only imputed dosages are available (scenario II), we first sample the genotype probabilities from their posterior distribution given the dosages, and then apply the EM-LRT on the sampled probabilities. Our simulations show that the type I error of the proposed EM-LRT methods is protected under both scenarios. Compared with existing methods, EM-LRT-Prob (for scenario I) offers optimal statistical power across a wide spectrum of MAF and imputation quality. EM-LRT-Dose (for scenario II) achieves a similar level of statistical power as EM-LRT-Prob and outperforms the standard Dosage method, especially for markers with relatively low MAF or imputation quality. Applications to two real data sets, the Cebu Longitudinal Health and Nutrition Survey study and the Women's Health Initiative Study, provide further support to the validity and efficiency of our proposed methods. PMID:25383782

  12. Association Studies with Imputed Variants Using Expectation-Maximization Likelihood-Ratio Tests

    PubMed Central

    Huang, Kuan-Chieh; Sun, Wei; Wu, Ying; Chen, Mengjie; Mohlke, Karen L.; Lange, Leslie A.; Li, Yun

    2014-01-01

    Genotype imputation has become standard practice in modern genetic studies. As sequencing-based reference panels continue to grow, increasingly more markers are being well or better imputed but at the same time, even more markers with relatively low minor allele frequency are being imputed with low imputation quality. Here, we propose new methods that incorporate imputation uncertainty for downstream association analysis, with improved power and/or computational efficiency. We consider two scenarios: I) when posterior probabilities of all potential genotypes are estimated; and II) when only the one-dimensional summary statistic, imputed dosage, is available. For scenario I, we have developed an expectation-maximization likelihood-ratio test for association based on posterior probabilities. When only imputed dosages are available (scenario II), we first sample the genotype probabilities from their posterior distribution given the dosages, and then apply the EM-LRT on the sampled probabilities. Our simulations show that the type I error of the proposed EM-LRT methods is protected under both scenarios. Compared with existing methods, EM-LRT-Prob (for scenario I) offers optimal statistical power across a wide spectrum of MAF and imputation quality. EM-LRT-Dose (for scenario II) achieves a similar level of statistical power as EM-LRT-Prob and outperforms the standard Dosage method, especially for markers with relatively low MAF or imputation quality. Applications to two real data sets, the Cebu Longitudinal Health and Nutrition Survey study and the Women’s Health Initiative Study, provide further support to the validity and efficiency of our proposed methods. PMID:25383782
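
    The difference between the two scenarios in this abstract comes down to how much of the genotype posterior is kept. The sketch below is not the EM-LRT itself; it only shows the standard reduction from three posterior probabilities to a single dosage, and why that reduction loses information. The function name dosage is chosen here for illustration.

```python
def dosage(probs):
    """Expected allele count from posterior genotype probabilities.

    probs is (P(g=0), P(g=1), P(g=2)), where g counts copies of the
    coded allele. The dosage is the posterior mean of g.
    """
    p0, p1, p2 = probs
    if abs(p0 + p1 + p2 - 1.0) > 1e-8:
        raise ValueError("genotype probabilities must sum to 1")
    return p1 + 2.0 * p2


# Two very different posteriors collapse to the same dosage of 1.0:
confident_het = (0.0, 1.0, 0.0)   # almost certainly heterozygous
uncertain_hom = (0.5, 0.0, 0.5)   # equally likely to carry 0 or 2 copies
same = dosage(confident_het) == dosage(uncertain_hom)
```

    Because many probability triples map to one dosage, a method that receives only dosages (scenario II) has to recover or sample plausible probabilities before it can account for imputation uncertainty.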

  13. Comparison of methods for imputing limited-range variables: a simulation study

    PubMed Central

    2014-01-01

    Background Multiple imputation (MI) was developed as a method to enable valid inferences to be obtained in the presence of missing data rather than to re-create the missing values. Within the applied setting, it remains unclear how important it is that imputed values should be plausible for individual observations. One variable type for which MI may lead to implausible values is a limited-range variable, where imputed values may fall outside the observable range. The aim of this work was to compare methods for imputing limited-range variables, with a focus on those that restrict the range of the imputed values. Methods Using data from a study of adolescent health, we consider three variables based on responses to the General Health Questionnaire (GHQ), a tool for detecting minor psychiatric illness. These variables, based on different scoring methods for the GHQ, resulted in three continuous distributions with mild, moderate and severe positive skewness. In an otherwise complete dataset, we set 33% of the GHQ observations to missing completely at random or missing at random; repeating this process to create 1000 datasets with incomplete data for each scenario. For each dataset, we imputed values on the raw scale and following a zero-skewness log transformation using: univariate regression with no rounding; post-imputation rounding; truncated normal regression; and predictive mean matching. We estimated the marginal mean of the GHQ and the association between the GHQ and a fully observed binary outcome, comparing the results with complete data statistics. Results Imputation with no rounding performed well when applied to data on the raw scale. Post-imputation rounding and imputation using truncated normal regression produced higher marginal means than the complete data estimate when data had a moderate or severe skew, and this was associated with under-coverage of the complete data estimate. Predictive mean matching also produced under-coverage of the complete data estimate. For the estimate of association, all methods produced similar estimates to the complete data. Conclusions For data with a limited range, multiple imputation using techniques that restrict the range of imputed values can result in biased estimates for the marginal mean when data are highly skewed. PMID:24766825
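
    A minimal sketch of post-imputation rounding, taken here to mean moving out-of-range imputed values to the nearest bound, may help show why the abstract reports higher marginal means. The bounds 0 and 36 and the imputed values are hypothetical; the abstract does not state the GHQ scoring range used.

```python
def clamp_to_range(values, lower, upper):
    """Move every out-of-range imputed value to the nearest bound."""
    return [min(max(v, lower), upper) for v in values]


def mean(values):
    return sum(values) / len(values)


# Hypothetical imputed scores for a positively skewed variable:
# most mass sits near the lower bound, so some draws fall below it.
imputed = [-2.0, -1.0, 1.0, 6.0]
rounded = clamp_to_range(imputed, lower=0.0, upper=36.0)

mean_before = mean(imputed)   # 1.0
mean_after = mean(rounded)    # 1.75
```

    When the data pile up near the lower bound, rounding mostly raises values and seldom lowers them, so the mean of the imputed values shifts upward.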

  14. Next generation tools for the annotation of human SNPs

    PubMed Central

    2009-01-01

    Computational biology has the opportunity to play an important role in the identification of functional single nucleotide polymorphisms (SNPs) discovered in large-scale genotyping studies, ultimately yielding new drug targets and biomarkers. The medical genetics and molecular biology communities are increasingly turning to computational biology methods to prioritize interesting SNPs found in linkage and association studies. Many such methods are now available through web interfaces, but the interested user is confronted with an array of predictive results that are often in disagreement with each other. Many tools today produce results that are difficult to understand without bioinformatics expertise, are biased towards non-synonymous SNPs, and do not necessarily reflect up-to-date versions of their source bioinformatics resources, such as public SNP repositories. Here, I assess the utility of the current generation of webservers; and suggest improvements for the next generation of webservers to better deliver value to medical geneticists and molecular biologists. PMID:19181721

  15. Prediction of the deleterious nsSNPs in ABCB transporters.

    PubMed

    Li, Yanhong; Wang, Yonghua; Li, Yan; Yang, Ling

    2006-12-22

    The non-synonymous SNPs (nsSNPs) in coding regions, neutral or deleterious, could lead to the alteration of the function or structure of proteins. We have developed the computational models to analyze the deleterious nsSNPs in the transporters and predict ones in ABCB (ATP-binding cassette B) transporters of interest. The RPLS (ridge partial least square) and LDA (linear discriminant analysis) methods were applied to the problem, by training on a selection of datasets from a specified source, i.e., human transporters. The best combination of datasets and prediction attributes was ascertained. The prediction accuracy of the theoretical RPLS model for the training and testing sets is 84.8% and 80.4%, respectively (LDA: 84.3% and 80.4%), which indicates the models are reasonable and may be helpful for pharmacogenetics studies. PMID:17141228

  16. Analysis of mitochondrial transcription factor A SNPs in alcoholic cirrhosis

    PubMed Central

    TANG, CHUN; LIU, HONGMING; TANG, YONGLIANG; GUO, YONG; LIANG, XIANCHUN; GUO, LIPING; PI, RUXIAN; YANG, JUNTAO

    2014-01-01

    Genetic susceptibility to alcoholic cirrhosis (AC) exists. We previously demonstrated hepatic mitochondrial DNA (mtDNA) damage in patients with AC compared with chronic alcoholics without cirrhosis. Mitochondrial transcription factor A (mtTFA) is central to mtDNA expression regulation and repair; however, it is unclear whether there are specific mtTFA single nucleotide polymorphisms (SNPs) in patients with AC and whether they affect mtDNA repair. In the present study, we screened mtTFA SNPs in patients with AC and analyzed their impact on the copy number of mtDNA in AC. A total of 50 patients with AC, 50 alcoholics without AC and 50 normal subjects were enrolled in the study. SNPs of full-length mtTFA were analyzed using the polymerase chain reaction (PCR) combined with gene sequencing. The hepatic mtTFA mRNA and mtDNA copy numbers were measured using quantitative PCR (qPCR), and mtTFA protein was measured using western blot analysis. A total of 18 mtTFA SNPs specific to patients with AC with frequencies >10% were identified. Two were located in the coding region and 16 were identified in non-coding regions. Conversely, there were five SNPs that were only present in patients with AC and normal subjects and had a frequency >10%. In the AC group, the hepatic mtTFA mRNA and protein levels were significantly lower than those in the other two groups. Moreover, the hepatic mtDNA copy number was significantly lower in the AC group than in the controls and alcoholics without AC. Based on these data, we conclude that AC-specific mtTFA SNPs may be responsible for the observed reductions in mtTFA mRNA, protein levels and mtDNA copy number and they may also increase the susceptibility to AC. PMID:24348767

  17. Traffic Speed Data Imputation Method Based on Tensor Completion

    PubMed Central

    Ran, Bin; Feng, Jianshuai; Liu, Ying; Wang, Wuhong

    2015-01-01

    Traffic speed data plays a key role in Intelligent Transportation Systems (ITS); however, missing traffic data would affect the performance of ITS as well as Advanced Traveler Information Systems (ATIS). In this paper, we handle this issue by a novel tensor-based imputation approach. Specifically, tensor pattern is adopted for modeling traffic speed data and then High accurate Low Rank Tensor Completion (HaLRTC), an efficient tensor completion method, is employed to estimate the missing traffic speed data. This proposed method is able to recover missing entries from given entries, which may be noisy, considering severe fluctuation of traffic speed data compared with traffic volume. The proposed method is evaluated on Performance Measurement System (PeMS) database, and the experimental results show the superiority of the proposed approach over state-of-the-art baseline approaches. PMID:25866501

  18. A multiple imputation strategy for sequential multiple assignment randomized trials.

    PubMed

    Shortreed, Susan M; Laber, Eric; Scott Stroup, T; Pineau, Joelle; Murphy, Susan A

    2014-10-30

    Sequential multiple assignment randomized trials (SMARTs) are increasingly being used to inform clinical and intervention science. In a SMART, each patient is repeatedly randomized over time. Each randomization occurs at a critical decision point in the treatment course. These critical decision points often correspond to milestones in the disease process or other changes in a patient's health status. Thus, the timing and number of randomizations may vary across patients and depend on evolving patient-specific information. This presents unique challenges when analyzing data from a SMART in the presence of missing data. This paper presents the first comprehensive discussion of missing data issues typical of SMART studies: we describe five specific challenges and propose a flexible imputation strategy to facilitate valid statistical estimation and inference using incomplete data from a SMART. To illustrate these contributions, we consider data from the Clinical Antipsychotic Trial of Intervention and Effectiveness, one of the most well-known SMARTs to date. PMID:24919867

  19. Differential network analysis with multiply imputed lipidomic data.

    PubMed

    Kujala, Maiju; Nevalainen, Jaakko; März, Winfried; Laaksonen, Reijo; Datta, Susmita

    2015-01-01

    The importance of lipids for cell function and health has been widely recognized; e.g., a disorder in the lipid composition of cells has been related to atherosclerosis-caused cardiovascular disease (CVD). Lipidomics analyses are characterized by a large, yet not huge, number of mutually correlated measured variables, and their associations with outcomes are potentially of a complex nature. Differential network analysis provides a formal statistical method capable of inferential analysis to examine differences in network structures of the lipids under two biological conditions. It also guides us to identify potential relationships requiring further biological investigation. We provide a recipe to conduct a permutation test on association scores resulting from partial least squares regression with multiply imputed lipidomic data from the LUdwigshafen RIsk and Cardiovascular Health (LURIC) study, paying particular attention to the left-censored missing values typical of a wide range of data sets in life sciences. Left-censored missing values are low-level concentrations that are known to exist somewhere between zero and a lower limit of quantification. To make full use of the LURIC data with the missing values, we utilize state-of-the-art multiple imputation techniques and propose solutions to the challenges that incomplete data sets bring to differential network analysis. The customized network analysis helps us to understand the complexities of the underlying biological processes by identifying lipids and lipid classes that interact with each other, and by recognizing the most important differentially expressed lipids between two subgroups of coronary artery disease (CAD) patients, the patients that had a fatal CVD event and the ones who remained stable during two-year follow-up. PMID:25822937

  20. Sequence Imputation of HPV16 Genomes for Genetic Association Studies

    Microsoft Academic Search

    Benjamin Smith; Zigui Chen; Laura Reimers; Koenraad van Doorslaer; Mark Schiffman; Rob Desalle; Rolando Herrero; Kai Yu; Sholom Wacholder; Tao Wang; Robert D. Burk; Art F. Y. Poon

    2011-01-01

    Background Human Papillomavirus type 16 (HPV16) causes over half of all cervical cancer and some HPV16 variants are more oncogenic than others. The genetic basis for the extraordinary oncogenic properties of HPV16 compared to other HPVs is unknown. In addition, we neither know which nucleotides vary across and within HPV types and lineages, nor which of the single nucleotide polymorphisms (SNPs)

  1. An analysis of single nucleotide polymorphisms of 125 DNA repair genes in the Texas genome-wide association study of lung cancer with a replication for the XRCC4 SNPs

    PubMed Central

    Yu, Hongping; Zhao, Hui; Wang, Li-E; Han, Younghun; Chen, Wei V.; Amos, Christopher I.; Rafnar, Thorunn; Sulem, Patrick; Stefansson, Kari; Landi, Maria Teresa; Caporaso, Neil; Albanes, Demetrius; Thun, Michael; McKay, James D.; Brennan, Paul; Wang, Yufei; Houlston, Richard S; Spitz, Margaret R.; Wei, Qingyi

    2011-01-01

    DNA repair genes are important for maintaining genomic stability and limiting carcinogenesis. We analyzed all single nucleotide polymorphisms (SNPs) of 125 DNA repair genes covered by the Illumina HumanHap300 (v1.1) BeadChips in a previously conducted genome-wide association study (GWAS) of 1,154 lung cancer cases and 1,137 controls and replicated the top-hits of XRCC4 SNPs in an independent set of 597 cases and 611 controls in Texas populations. We found that six of 20 XRCC4 SNPs were associated with a decreased risk of lung cancer with a P value of 0.01 or lower in the discovery dataset, of which the most significant SNP was rs10040363 (P for allelic test = 4.89 × 10^-4). Moreover, the data in this region allowed us to impute a potentially functional SNP rs2075685 (imputed P for allelic test = 1.3 × 10^-3). A luciferase reporter assay demonstrated that the rs2075685G>T change in the XRCC4 promoter increased expression of the gene. In the replication study of rs10040363, rs1478486, rs9293329, and rs2075685, however, only rs10040363 achieved a borderline association with a decreased risk of lung cancer in a dominant model (adjusted OR = 0.80, 95% CI = 0.62–1.03, P = 0.079). In the final combined analysis of both the Texas GWAS discovery and replication datasets, the strength of the association was increased for rs10040363 (adjusted OR = 0.77, 95% CI = 0.66–0.89, Pdominant = 5 × 10^-4 and P for trend = 5 × 10^-4) and rs1478486 (adjusted OR = 0.82, 95% CI = 0.71–0.94, Pdominant = 6 × 10^-3 and P for trend = 3.5 × 10^-3). Finally, we conducted a meta-analysis of these XRCC4 SNPs with available data from published GWA studies of lung cancer with a total of 12,312 cases and 47,921 controls, in which none of these XRCC4 SNPs was associated with lung cancer risk. It appeared that rs2075685, although associated with increased expression of a reporter gene and lung cancer risk in the Texas populations, did not have an effect on lung cancer risk in other populations. This study underscores the importance of replication using published data in larger populations. PMID:21296624

  2. Tracing Cattle Breeds with Principal Components Analysis Ancestry Informative SNPs

    E-print Network

    Drineas, Petros

    Tracing Cattle Breeds with Principal Components Analysis Ancestry Informative SNPs Jamey Lewis1 that can be used to trace the breed of unknown cattle samples. Taking advantage of the power of Principal the origin of individual cattle. In doing so, we present a thorough examination of population genetic

  3. Hereditary genes and SNPs associated with breast cancer.

    PubMed

    Mahdi, Kooshyar Mohammad; Nassiri, Mohammad Reza; Nasiri, Khadijeh

    2013-01-01

    Breast cancer is the most common cancer among women, affecting up to one third of them during their lifespans. Increased expression of some genes due to polymorphisms increases the risk of breast cancer incidence. Since mutations that are recognized to increase breast cancer risk within families are quite rare, identification of these SNPs is very important. The most important loci which include mutations are BRCA1, BRCA2, PTEN, ATM, TP53, CHEK2, PPM1D, CDH1, MLH1, MRE11, MSH2, MSH6, MUTYH, NBN, PMS1, PMS2, BRIP1, RAD50, RAD51C, STK11 and BARD1. Presence of SNPs in these genes increases the risk of breast cancer and associated diagnostic markers are among the most reliable for assessing prognosis of breast cancer. In this article we reviewed the hereditary genes of breast cancer and SNPs associated with increasing the risk of breast cancer that were recently reported from candidate gene, meta-analysis and GWAS studies. SNPs of genes associated with breast cancer can be used as a potential tool for improving cancer diagnosis and treatment planning. PMID:23886119

  4. Quality assessment parameters for EST-derived SNPs from catfish

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Two factors were found to be most significant for validation of EST-derived SNPs: the contig size and the minor allele sequence frequency. The larger the contigs were, the greater the validation rate although the validation rate was reasonably high when the contig sizes were equal to or larger than...

  5. Addressing Missing Data Mechanism Uncertainty using Multiple-Model Multiple Imputation: Application to a Longitudinal Clinical Trial

    PubMed Central

    Siddique, Juned; Harel, Ofer; Crespi, Catherine M.

    2012-01-01

    We present a framework for generating multiple imputations for continuous data when the missing data mechanism is unknown. Imputations are generated from more than one imputation model in order to incorporate uncertainty regarding the missing data mechanism. Parameter estimates based on the different imputation models are combined using rules for nested multiple imputation. Through the use of simulation, we investigate the impact of missing data mechanism uncertainty on post-imputation inferences and show that incorporating this uncertainty can increase the coverage of parameter estimates. We apply our method to a longitudinal clinical trial of low-income women with depression where nonignorably missing data were a concern. We show that different assumptions regarding the missing data mechanism can have a substantial impact on inferences. Our method provides a simple approach for formalizing subjective notions regarding nonresponse so that they can be easily stated, communicated, and compared. PMID:23503984

  6. 41 CFR 105-68.630 - May the General Services Administration impute conduct of one person to another?

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... false May the General Services Administration impute conduct of one person to another...System (Continued) GENERAL SERVICES ADMINISTRATION Regional Offices-General Services Administration 68-GOVERNMENTWIDE DEBARMENT...

  7. 41 CFR 105-68.630 - May the General Services Administration impute conduct of one person to another?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... false May the General Services Administration impute conduct of one person to another...System (Continued) GENERAL SERVICES ADMINISTRATION Regional Offices-General Services Administration 68-GOVERNMENTWIDE DEBARMENT...

  8. 41 CFR 105-68.630 - May the General Services Administration impute conduct of one person to another?

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... false May the General Services Administration impute conduct of one person to another...System (Continued) GENERAL SERVICES ADMINISTRATION Regional Offices-General Services Administration 68-GOVERNMENTWIDE DEBARMENT...

  9. Low penetrance breast cancer predisposition SNPs are site specific.

    PubMed

    Mcinerney, Niall; Colleran, Gabrielle; Rowan, Andrew; Walther, Axel; Barclay, Ella; Spain, Sarah; Jones, Angela M; Tuohy, Stephen; Curran, Catherine; Miller, Nicola; Kerin, Michael; Tomlinson, Ian; Sawyer, Elinor

    2009-09-01

    Large scale association studies have identified low penetrance susceptibility alleles that predispose to breast cancer. A locus on chromosome 8q24.21 has been shown to harbour variants that predispose to breast, ovarian, colorectal and prostate cancer. The finding of risk variants clustering at 8q24 suggests that there may be common susceptibility alleles that predispose to more than one epithelial cancer. The aim of this study was firstly to determine whether previously identified breast cancer susceptibility alleles are associated with sporadic breast cancer in the West of Ireland and secondly to ascertain whether there are susceptibility alleles that predispose to all three common epithelial cancers (breast, prostate, colon). We genotyped a panel of 24 SNPs that have recently been shown to predispose to prostate, colorectal or breast cancer in 988 sporadic breast cancer cases and 1,016 controls from the West of Ireland. We then combined our data with publicly available datasets using standard techniques of meta-analysis. The known breast cancer SNPs rs13281615, rs2981582 and rs3803662 were confirmed as associated with breast cancer risk (P (allelic test) = 1.8 x 10(-2), OR = 1.17; P (allelic test) = 2.2 x 10(-3), OR = 1.22; P (allelic test) = 5.1 x 10(-2), OR = 1.15, respectively) in the West of Ireland cohort. For the remaining five breast cancer SNPs that were studied there was no evidence of an association with breast cancer in the West of Ireland population (P (allelic test) > 6.5 x 10(-2)). There was also no association between any of the prostate or colorectal susceptibility SNPs, whether at 8q24 or elsewhere, with breast cancer risk. Meta-analysis confirmed that all susceptibility SNPs were site specific, with the exception of rs6983269 which is known to predispose to both colorectal and prostate cancer. This study confirms that susceptibility loci at FGFR2, 8q24 and TNRC9 predispose to sporadic breast cancer in the West of Ireland. It also suggests that low penetrance susceptibility SNPs for breast, prostate and colorectal cancer are distinct. Although 8q24 harbours variants that predispose to all three cancers, the susceptibility loci within the region appear to be specific for the different cancer types with the exception of rs6983269 in colon and prostate cancer. PMID:19005751

  10. Integrative analysis of transcriptomic and proteomic data of Shewanella oneidensis: missing value imputation using temporal datasets

    SciTech Connect

    Torres-García, Wandaliz [Arizona State University; Brown, Steven D [ORNL; Johnson, Roger [Arizona State University; Zhang, Weiwen [Arizona State University; Runger, George [Arizona State University; Meldrum, Deirdre [Arizona State University

    2011-01-01

    Despite significant improvements in recent years, proteomic datasets currently available still suffer from a large number of missing values. Integrative analyses based upon incomplete proteomic and transcriptomic datasets could seriously bias the biological interpretation. In this study, we applied a non-linear data-driven stochastic gradient boosted trees (GBT) model to impute missing proteomic values for proteins experimentally undetected, using a temporal transcriptomic and proteomic dataset of Shewanella oneidensis. In this dataset, gene expression was measured after the cells were exposed to 1 mM potassium chromate for 5, 30, 60, and 90 min, while protein abundance was measured only for the 45- and 90-min samples. With the goal of elucidating the relationship between temporal gene expression and protein abundance data, and then using it to impute missing proteomic values for the 45-min sample (which does not have cognate transcriptomic data) and the 90-min sample, we initially used nonlinear Smoothing Splines Curve Fitting (SSCF) to identify temporal relationships among transcriptomic data at different time points and then imputed missing gene expression measurements for the sample at 45 min. After the imputation was validated by biological constraints (i.e. operons), we used a data-driven Gradient Boosted Trees (GBT) model to uncover possible non-linear relationships between temporal transcriptomic and proteomic data, and to impute protein abundance for the proteins experimentally undetected in the 45- and 90-min samples, based on relevant predictors such as temporal mRNA gene expression data, cellular roles, molecular weight, sequence length, protein length, guanine-cytosine (GC) content and triple codon counts. The imputed protein values were validated using biological constraints such as operon, regulon and pathway information. Finally, we demonstrated that such missing value imputation improved characterization of the temporal response of S. oneidensis to chromate.

  11. Imputation of missing data using machine learning techniques

    SciTech Connect

    Lakshminarayan, Kamakshi; Harp, S.A.; Goldman, R.; Samad, T. [Honeywell Technology Center, Minneapolis, MN (United States)

    1996-12-31

    A serious problem in mining industrial data bases is that they are often incomplete, and a significant amount of data is missing, or erroneously entered. This paper explores the use of machine-learning based alternatives to standard statistical data completion (data imputation) methods, for dealing with missing data. We have approached the data completion problem using two well-known machine learning techniques. The first is an unsupervised clustering strategy which uses a Bayesian approach to cluster the data into classes. The classes so obtained are then used to predict multiple choices for the attribute of interest. The second technique involves modeling missing variables by supervised induction of a decision tree-based classifier. This predicts the most likely value for the attribute of interest. Empirical tests using extracts from industrial databases maintained by Honeywell customers have been done in order to compare the two techniques. These tests show both approaches are useful and have advantages and disadvantages. We argue that the choice between unsupervised and supervised classification techniques should be influenced by the motivation for solving the missing data problem, and discuss potential applications for the procedures we are developing.

  12. Consortium analysis of 7 candidate SNPs for ovarian cancer.

    PubMed

    Ramus, Susan J; Vierkant, Robert A; Johnatty, Sharon E; Pike, Malcolm C; Van Den Berg, David J; Wu, Anna H; Pearce, Celeste Leigh; Menon, Usha; Gentry-Maharaj, Aleksandra; Gayther, Simon A; Dicioccio, Richard A; McGuire, Valerie; Whittemore, Alice S; Song, Honglin; Easton, Douglas F; Pharoah, Paul D P; Garcia-Closas, Montserrat; Chanock, Stephen; Lissowska, Jolanta; Brinton, Louise; Terry, Kathryn L; Cramer, Daniel W; Tworoger, Shelley S; Hankinson, Susan E; Berchuck, Andrew; Moorman, Patricia G; Schildkraut, Joellen M; Cunningham, Julie M; Liebow, Mark; Kjaer, Susanne Krüger; Hogdall, Estrid; Hogdall, Claus; Blaakaer, Jan; Ness, Roberta B; Moysich, Kirsten B; Edwards, Robert P; Carney, Michael E; Lurie, Galina; Goodman, Marc T; Wang-Gohrke, Shan; Kropp, Silke; Chang-Claude, Jenny; Webb, Penelope M; Chen, Xiaoqing; Beesley, Jonathan; Chenevix-Trench, Georgia; Goode, Ellen L

    2008-07-15

    The Ovarian Cancer Association Consortium selected 7 candidate single nucleotide polymorphisms (SNPs), for which there is evidence from previous studies of an association with variation in ovarian cancer or breast cancer risks. The SNPs selected for analysis were F31I (rs2273535) in AURKA, N372H (rs144848) in BRCA2, rs2854344 in intron 17 of RB1, rs2811712 5' flanking CDKN2A, rs523349 in the 3' UTR of SRD5A2, D302H (rs1045485) in CASP8 and L10P (rs1982073) in TGFB1. Fourteen studies genotyped 4,624 invasive epithelial ovarian cancer cases and 8,113 controls of white non-Hispanic origin. A marginally significant association was found for RB1 when all studies were included [ordinal odds ratio (OR) 0.88 (95% confidence interval (CI) 0.79-1.00) p = 0.041 and dominant OR 0.87 (95% CI 0.76-0.98) p = 0.025]; when the studies that originally suggested an association were excluded, the result was suggestive although no longer statistically significant (ordinal OR 0.92, 95% CI 0.79-1.06). This SNP has also been shown to have an association with decreased risk in breast cancer. There was a suggestion of an association for AURKA, when one study that caused significant study heterogeneity was excluded [ordinal OR 1.10 (95% CI 1.01-1.20) p = 0.027; dominant OR 1.12 (95% CI 1.01-1.24) p = 0.03]. The other 5 SNPs in BRCA2, CDKN2A, SRD5A2, CASP8 and TGFB1 showed no association with ovarian cancer risk; given the large sample size, these results can also be considered to be informative. These null results for SNPs identified from relatively large initial studies show the importance of replicating associations by a consortium approach. PMID:18431743

  13. Multiple imputation of missing covariates with non-linear effects and interactions: an evaluation of statistical methods

    PubMed Central

    2012-01-01

    Background Multiple imputation is often used for missing data. When a model contains as covariates more than one function of a variable, it is not obvious how best to impute missing values in these covariates. Consider a regression with outcome Y and covariates X and X². In 'passive imputation' a value X* is imputed for X and then X² is imputed as (X*)². A recent proposal is to treat X² as 'just another variable' (JAV) and impute X and X² under multivariate normality. Methods We use simulation to investigate the performance of three methods that can easily be implemented in standard software: 1) linear regression of X on Y to impute X then passive imputation of X²; 2) the same regression but with predictive mean matching (PMM); and 3) JAV. We also investigate the performance of analogous methods when the analysis involves an interaction, and study the theoretical properties of JAV. The application of the methods when complete or incomplete confounders are also present is illustrated using data from the EPIC Study. Results JAV gives consistent estimation when the analysis is linear regression with a quadratic or interaction term and X is missing completely at random. When X is missing at random, JAV may be biased, but this bias is generally less than for passive imputation and PMM. Coverage for JAV was usually good when bias was small. However, in some scenarios with a more pronounced quadratic effect, bias was large and coverage poor. When the analysis was logistic regression, JAV's performance was sometimes very poor. PMM generally improved on passive imputation, in terms of bias and coverage, but did not eliminate the bias. Conclusions Given the current state of available software, JAV is the best of a set of imperfect imputation methods for linear regression with a quadratic or interaction effect, but should not be used for logistic regression. PMID:22489953
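
    The difference between passive imputation and JAV described in this record can be illustrated with a minimal sketch. It is not the authors' implementation: the observed mean stands in for the imputation model (the paper uses regression on Y and a multivariate normal model), and the function names and the toy data are hypothetical. The only point shown is that passive imputation forces the imputed square to equal (X*) squared, whereas JAV fills the squared term as a variable in its own right.

```python
from statistics import fmean

def passive_impute(xs):
    """Impute X (here: by the observed mean, an assumed stand-in model),
    then derive the squared term passively as (X*)**2."""
    observed = [x for x in xs if x is not None]
    x_star = fmean(observed)
    filled = [x if x is not None else x_star for x in xs]
    return [(x, x * x) for x in filled]

def jav_impute(xs):
    """Treat the squared term as 'just another variable': X and X**2 are
    imputed separately, so the imputed pair need not satisfy x2 == x**2."""
    observed = [x for x in xs if x is not None]
    x_star = fmean(observed)
    x2_star = fmean([x * x for x in observed])
    return [(x, x * x) if x is not None else (x_star, x2_star) for x in xs]

# Hypothetical toy data: the last value of X is missing.
xs = [1.0, 2.0, 3.0, None]
passive = passive_impute(xs)
jav = jav_impute(xs)
```

    In this toy case both methods impute X as 2.0, but the passive square is 4.0 while the JAV square is the mean of the observed squares, 14/3. The gap between the two equals the variance of the observed X values under this stand-in model.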

  14. Computational analyses and prediction of guanylin deleterious SNPs.

    PubMed

    Porto, William F; Franco, Octávio L; Alencar, Sérgio A

    2015-07-01

    Human guanylin, coded by the GUCA2A gene, is a member of a peptide family that activates intestinal membrane guanylate cyclase, regulating electrolyte and water transport in intestinal and renal epithelia. Deregulation of guanylin peptide activity has been associated with colon adenocarcinoma, adenoma and intestinal polyps. Besides, it is known that mutations on guanylin receptors could be involved in meconium ileus. However, there are no previous works regarding the alterations driven by single nucleotide polymorphisms in guanylin peptides. A comprehensive in silico analysis of missense SNPs present in the GUCA2A gene was performed taking into account 16 prediction tools in order to select the deleterious variations for further evaluation by molecular dynamics simulations (50ns). Molecular dynamics data suggest that the three out of five variants (Cys104Arg, Cys112Ser and Cys115Tyr) have undergone structural modifications in terms of flexibility, volume and/or solvation. In addition, two nonsense SNPs were identified, both preventing the formation of disulfide bonds and resulting in the synthesis of truncated proteins. In summary the structural analysis of missense SNPs is important to decrease the number of potential mutations to be in vitro evaluated for associating them with some genetic diseases. In addition, data reported here could lead to a better understanding of structural and functional aspects of guanylin peptides. PMID:25899674

  15. 22 CFR 208.630 - May the U.S. Agency for International Development impute conduct of one person to another?

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ...U.S. Agency for International Development impute...Relations AGENCY FOR INTERNATIONAL DEVELOPMENT GOVERNMENTWIDE...U.S. Agency for International Development impute...with a partnership, joint venture, joint...

  16. TTF-1 and RET promoter SNPs: regulation of RET transcription in Hirschsprung's disease

    Microsoft Academic Search

    Raymond W. Ganster; Vincent C. H. Lui; Thomas Y. Y. Leon; Man-Ting So; Anson M. F. Lau; Ming Fu; Mai-Har Sham; Joanne Knight; Maria Stella Zannini; Pak C. Sham; Paul K. H. Tam

    2005-01-01

    Single nucleotide polymorphisms (SNPs) of the coding regions of receptor tyrosine kinase gene (RET) are associated with Hirschsprung's disease (HSCR, aganglionic megacolon). These SNPs, individually or combined, may act as a low penetrance susceptibility locus and/or be in linkage disequilibrium (LD) with another susceptibility locus located in RET regulatory regions. Because two RET promoter SNPs have been found

  17. Imputation method for lifetime exposure assessment in air pollution epidemiologic studies

    PubMed Central

    2013-01-01

    Background Environmental epidemiology, when focused on the life course of exposure to a specific pollutant, requires historical exposure estimates that are difficult to obtain for the full time period due to gaps in the historical record, especially in earlier years. We show that these gaps can be filled by applying multiple imputation methods to a formal risk equation that incorporates lifetime exposure. We also address challenges that arise, including choice of imputation method, potential bias in regression coefficients, and uncertainty in age-at-exposure sensitivities. Methods During time periods when parameters needed in the risk equation are missing for an individual, the parameters are filled by an imputation model using group level information or interpolation. A random component is added to match the variance found in the estimates for study subjects not needing imputation. The process is repeated to obtain multiple data sets, whose regressions against health data can be combined statistically to develop confidence limits using Rubin’s rules to account for the uncertainty introduced by the imputations. To test for possible recall bias between cases and controls, which can occur when historical residence location is obtained by interview, and which can lead to misclassification of imputed exposure by disease status, we introduce an “incompleteness index,” equal to the percentage of dose imputed (PDI) for a subject. “Effective doses” can be computed using different functional dependencies of relative risk on age of exposure, allowing intercomparison of different risk models. To illustrate our approach, we quantify lifetime exposure (dose) from traffic air pollution in an established case–control study on Long Island, New York, where considerable in-migration occurred over a period of many decades. Results The major result is the described approach to imputation. 
    The illustrative example revealed potential recall bias, suggesting that regressions against health data should be done as a function of PDI to check for consistency of results. The 1% of study subjects who lived for long durations near heavily trafficked intersections had very high cumulative exposures. Thus, imputation methods must be designed to reproduce non-standard distributions. Conclusions Our approach meets a number of methodological challenges to extending historical exposure reconstruction over a lifetime and shows promise for environmental epidemiology. Application to assessment of breast cancer risks will be reported in a subsequent manuscript. PMID:23919666
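
    Two quantities named in this record, the pooling of regressions by Rubin's rules and the percentage of dose imputed (PDI), can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names and the numbers are hypothetical, and the pooling uses the standard form of Rubin's rules (total variance = within-imputation variance + (1 + 1/m) × between-imputation variance).

```python
from statistics import fmean, variance

def pool_rubin(estimates, variances):
    """Pool m per-dataset estimates and their squared standard errors."""
    m = len(estimates)
    pooled = fmean(estimates)          # pooled point estimate
    within = fmean(variances)          # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total

def percent_dose_imputed(doses, imputed):
    """PDI for one subject: share of lifetime dose that came from imputed periods."""
    total = sum(doses)
    filled = sum(d for d, flag in zip(doses, imputed) if flag)
    return 100.0 * filled / total

# Hypothetical values: three imputed data sets, and one subject with three periods.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
pdi = percent_dose_imputed([10.0, 30.0, 60.0], [False, True, False])
```

    The square root of the pooled total variance gives the standard error used for confidence limits; the PDI value can then serve as the stratifying variable when regressions are checked for consistency.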

  18. META-ANALYSIS OF GENOME-WIDE STUDIES IDENTIFIES WNT16 AND ESR1 SNPS ASSOCIATED WITH BONE MINERAL DENSITY IN PREMENOPAUSAL WOMEN

    PubMed Central

    Koller, Daniel L.; Zheng, Hou-Feng; Karasik, David; Yerges-Armstrong, Laura; Liu, Ching-Ti; McGuigan, Fiona; Kemp, John P.; Giroux, Sylvie; Lai, Dongbing; Edenberg, Howard J.; Peacock, Munro; Czerwinski, Stefan A.; Choh, Audrey C.; McMahon, George; St Pourcain, Beate; Timpson, Nicholas J.; Lawlor, Debbie A; Evans, David M; Towne, Bradford; Blangero, John; Carless, Melanie A.; Kammerer, Candace; Goltzman, David; Kovacs, Christopher S.; Prior, Jerilynn C.; Spector, Tim D.; Rousseau, Francois; Tobias, Jon H.; Akesson, Kristina; Econs, Michael J.; Mitchell, Braxton D.; Richards, J. Brent; Kiel, Douglas P.; Foroud, Tatiana

    2013-01-01

    Previous genome-wide association studies (GWAS) have identified common variants in genes associated with variation in bone mineral density (BMD), although most have been carried out in combined samples of older women and men. Meta-analyses of these results have identified numerous SNPs of modest effect at genome-wide significance levels in genes involved in both bone formation and resorption, as well as other pathways. We performed a meta-analysis restricted to premenopausal white women from four cohorts (n= 4,061 women, ages 20 to 45) to identify genes influencing peak bone mass at the lumbar spine and femoral neck. Following imputation, age- and weight-adjusted BMD values were tested for association with each SNP. Association of a SNP in the WNT16 gene (rs3801387; p=1.7 × 10⁻⁹) and multiple SNPs in the ESR1/C6orf97 (rs4870044; p=1.3 × 10⁻⁸) achieved genome-wide significance levels for lumbar spine BMD. These SNPs, along with others demonstrating suggestive evidence of association, were then tested for association in seven Replication cohorts that included premenopausal women of European, Hispanic-American, and African-American descent (combined n=5,597 for femoral neck; 4,744 for lumbar spine). When the data from the Discovery and Replication cohorts were analyzed jointly, the evidence was more significant (WNT16 joint p=1.3 × 10⁻¹¹; ESR1/C6orf97 joint p= 1.4 × 10⁻¹⁰). Multiple independent association signals were observed with spine BMD at the ESR1 region after conditioning on the primary signal. Analyses of femoral neck BMD also supported association with SNPs in WNT16 and ESR1/C6orf97 (p< 1 × 10⁻⁵). Our results confirm that several of the genes contributing to BMD variation across a broad age range in both sexes have effects of similar magnitude on BMD of the spine in premenopausal women. These data support the hypothesis that variants in these genes of known skeletal function also affect BMD during the premenopausal period. PMID:23074152

  19. IEEE TRANSACTIONS ON SMART GRID, VOL. 4, NO. 4, DECEMBER 2013 2347 Load Curve Data Cleansing and Imputation Via

    E-print Network

    Giannakis, Georgios

    IEEE TRANSACTIONS ON SMART GRID, VOL. 4, NO. 4, DECEMBER 2013 2347 Load Curve Data Cleansing and communication errors. In this context, a novel load cleansing and imputation scheme is developed leveraging (D-) PCP algorithm is developed to carry out the imputation and cleansing tasks using networked

  20. Treatments of Missing Data: A Monte Carlo Comparison of RBHDI, Iterative Stochastic Regression Imputation, and Expectation-Maximization.

    ERIC Educational Resources Information Center

    Gold, Michael Steven; Bentler, Peter M.

    2000-01-01

    Describes a Monte Carlo investigation of four methods for treating incomplete data: (1) resemblance based hot-deck imputation (RBHDI); (2) iterated stochastic regression imputation; (3) structured model expectation maximization; and (4) saturated model expectation maximization. Results favored the expectation maximization methods. (SLD)
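
    One of the methods compared in this record, stochastic regression imputation, can be sketched in a few lines. This is a hedged, single-pass illustration rather than the iterated procedure the study evaluated: the function name, the single predictor, the normal residual draw, and the toy data are all assumptions made for the example.

```python
import random

def stochastic_regression_impute(xs, ys, seed=0):
    """Fill missing y values with a regression prediction plus random noise.

    The regression of y on x is fitted on complete cases only; the noise is
    drawn from a normal distribution with the residual standard deviation.
    """
    rng = random.Random(seed)
    complete = [(x, y) for x, y in zip(xs, ys) if y is not None]
    n = len(complete)
    mean_x = sum(x for x, _ in complete) / n
    mean_y = sum(y for _, y in complete) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in complete)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in complete)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in complete]
    sigma = (sum(r * r for r in residuals) / (n - 2)) ** 0.5
    return [
        y if y is not None else intercept + slope * x + rng.gauss(0.0, sigma)
        for x, y in zip(xs, ys)
    ]

# Hypothetical toy data with one missing outcome.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, None, 8.2, 9.8]
filled = stochastic_regression_impute(xs, ys)
```

    Adding the random residual is what distinguishes this from deterministic regression imputation, which would place every imputed value exactly on the fitted line and understate the variance.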

  1. 22 CFR 1006.630 - May the Inter-American Foundation impute conduct of one person to another?

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ...2009-04-01 true May the Inter-American Foundation impute conduct of one person to another...Foreign Relations INTER-AMERICAN FOUNDATION GOVERNMENTWIDE DEBARMENT AND SUSPENSION...1006.630 May the Inter-American Foundation impute conduct of one person to...

  2. 22 CFR 1006.630 - May the Inter-American Foundation impute conduct of one person to another?

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ...2014-04-01 false May the Inter-American Foundation impute conduct of one person to another...Foreign Relations INTER-AMERICAN FOUNDATION GOVERNMENTWIDE DEBARMENT AND SUSPENSION...1006.630 May the Inter-American Foundation impute conduct of one person to...

  3. 22 CFR 1006.630 - May the Inter-American Foundation impute conduct of one person to another?

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ...2009-04-01 true May the Inter-American Foundation impute conduct of one person to another...Foreign Relations INTER-AMERICAN FOUNDATION GOVERNMENTWIDE DEBARMENT AND SUSPENSION...1006.630 May the Inter-American Foundation impute conduct of one person to...

  4. Relaxing the independent censoring assumption in the Cox proportional hazards model using multiple imputation

    PubMed Central

    Jackson, Dan; White, Ian R; Seaman, Shaun; Evans, Hannah; Baisley, Kathy; Carpenter, James

    2014-01-01

    The Cox proportional hazards model is frequently used in medical statistics. The standard methods for fitting this model rely on the assumption of independent censoring. Although this is sometimes plausible, we often wish to explore how robust our inferences are as this untestable assumption is relaxed. We describe how this can be carried out in a way that makes the assumptions accessible to all those involved in a research project. Estimation proceeds via multiple imputation, where censored failure times are imputed under user-specified departures from independent censoring. A novel aspect of our method is the use of bootstrapping to generate proper imputations from the Cox model. We illustrate our approach using data from an HIV-prevention trial and discuss how it can be readily adapted and applied in other settings. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:25060703

  5. Multiple imputation methods for longitudinal blood pressure measurements from the Framingham Heart Study.

    PubMed

    Kang, Terri; Kraft, Peter; Gauderman, W James; Thomas, Duncan

    2003-01-01

    Missing data are a great concern in longitudinal studies, because few subjects will have complete data and missingness could be an indicator of an adverse outcome. Analyses that exclude potentially informative observations due to missing data can be inefficient or biased. To assess the extent of these problems in the context of genetic analyses, we compared case-wise deletion to two multiple imputation methods available in the popular SAS package, the propensity score and regression methods. For both the real and simulated data sets, the propensity score and regression methods produced results similar to case-wise deletion. However, for the simulated data, the estimates of heritability for case-wise deletion and the two multiple imputation methods were much lower than for the complete data. This suggests that if missingness patterns are correlated within families, then imputation methods that do not allow this correlation can yield biased results. PMID:14975111

  6. On multivariate imputation and forecasting of decadal wind speed missing data.

    PubMed

    Wesonga, Ronald

    2015-01-01

    This paper demonstrates the application of multiple imputations by chained equations and time series forecasting of wind speed data. The study was motivated by the high prevalence of missing wind speed historic data. Findings based on the fully conditional specification under multiple imputations by chained equations, provided reliable wind speed missing data imputations. Further, the forecasting model shows, the smoothing parameter, alpha (0.014) close to zero, confirming that recent past observations are more suitable for use to forecast wind speeds. The maximum decadal wind speed for Entebbe International Airport was estimated to be 17.6 metres per second at a 0.05 level of significance with a bound on the error of estimation of 10.8 metres per second. The large bound on the error of estimations confirms the dynamic tendencies of wind speed at the airport under study. PMID:25625036

  7. Cd14 SNPs Regulate the Innate Immune Response

    PubMed Central

    Liu, Hong-Hsing; Hu, Yajing; Zheng, Ming; Suhoski, Megan M.; Engleman, Edgar G.; Dill, David; Hudnall, Matt; Wang, Jianmei; Spolski, Rosanne; Leonard, Warren J.; Peltz, Gary

    2012-01-01

    CD14 is a monocytic differentiation antigen that regulates innate immune responses to pathogens. Here, we show that murine Cd14 SNPs regulate the length of Cd14 mRNA and CD14 protein translation efficiency, and consequently the basal level of soluble CD14 (sCD14) and type I IFN production by murine macrophages. This has substantial downstream consequences for the innate immune response; the level of expression of at least 40 IFN-responsive murine genes was altered by this mechanism. We also observed that there was substantial variation in the length of human CD14 mRNAs and in their translation efficiency. sCD14 increased cytokine production by human dendritic cells (DCs), and sCD14-primed DCs augmented human CD4 T cell proliferation. These findings may provide a mechanism for exploring the complex relationship between CD14 SNPs, serum sCD14 levels, and susceptibility to human infectious and allergic diseases. PMID:22445606

  8. Predicting inhaled corticosteroid response in asthma with two associated SNPs.

    PubMed

    McGeachie, M J; Wu, A C; Chang, H-H; Lima, J J; Peters, S P; Tantisira, K G

    2013-08-01

    Inhaled corticosteroids (ICS) are the most commonly used controller medications prescribed for asthma. Two single-nucleotide polymorphisms (SNPs), rs1876828 in corticotrophin releasing hormone receptor 1 and rs37973 in GLCCI1, have previously been associated with corticosteroid efficacy. We studied data from four existing clinical trials of asthmatics, who received ICS and had lung function measured by forced expiratory volume in 1 s (FEV1) before and after the period of such treatment. We combined the two SNPs rs37973 and rs1876828 into a predictive test of FEV1 change using a Bayesian model, which identified patients with good or poor steroid response (highest or lowest quartile, respectively) with predictive performance of 65.7% (P=0.039 vs random) area under the receiver-operator characteristic curve in the training population and 65.9% (P=0.025 vs random) in the test population. These findings show that two genetic variants can be combined into a predictive test that achieves similar accuracy and superior replicability compared with single SNP predictors. PMID:22641026

  9. When Data Goes Missing: Methods for Missing Score Imputation in Biometric Fusion

    E-print Network

    Ross, Arun Abraham

    , score level fusion is commonly used as it offers a good trade-off between fusion complexity and dataWhen Data Goes Missing: Methods for Missing Score Imputation in Biometric Fusion Yaohui Ding, Morgantown, WV, USA ABSTRACT While fusion can be accomplished at multiple levels in a multibiometric system

  10. IMPUTATING MISSING VALUES IN DIARY RECORDS OF SUN-EXPOSURE STUDY

    E-print Network

    Mosegaard, Klaus

    ). In addition, UV radiation were measured at a 10 minute sampling rate. While the ultimate objective is to relate sun-habits, UV dose, and risk of cancer, this work focuses on imputating missing ... The subjects wore a special designed watch called the "Sunsaver", which measured UVA and UVB radiation

  11. On the set of imputations induced by the k-additive core

    E-print Network

    Paris-Sud XI, Université de

    Aviation Management Institute of China Beijing Institute of Technology #3,East Road Huajiadi, Chaoyang District, Beijing, China 100102 Email: michel.grabisch@univ-paris1.fr, ttlitong@gmail.com Abstract of payoffs to individuals and possibly to coalitions of size at most k in S. Such general imputations

  12. The Effects of Methods of Imputation for Missing Values on the Validity and Reliability of Scales

    ERIC Educational Resources Information Center

    Cokluk, Omay; Kayri, Murat

    2011-01-01

    The main aim of this study is the comparative examination of the factor structures, corrected item-total correlations, and Cronbach-alpha internal consistency coefficients obtained by different methods used in imputation for missing values in conditions of not having missing values, and having missing values of different rates in terms of testing…

  13. Random-covariances and mixed-effects models for imputing multivariate multilevel continuous data

    PubMed Central

    Yucel, Recai M.

    2012-01-01

    Principled techniques for incomplete-data problems are increasingly part of mainstream statistical practice. Among many proposed techniques so far, inference by multiple imputation (MI) has emerged as one of the most popular. While many strategies leading to inference by MI are available in cross-sectional settings, the same richness does not exist in multilevel applications. The limited methods available for multilevel applications rely on the multivariate adaptations of mixed-effects models. This approach preserves the mean structure across clusters and incorporates distinct variance components into the imputation process. In this paper, I add to these methods by considering a random covariance structure and develop computational algorithms. The attraction of this new imputation modeling strategy is to correctly reflect the mean and variance structure of the joint distribution of the data, and allow the covariances differ across the clusters. Using Markov Chain Monte Carlo techniques, a predictive distribution of missing data given observed data is simulated leading to creation of multiple imputations. To circumvent the large sample size requirement to support independent covariance estimates for the level-1 error term, I consider distributional impositions mimicking random-effects distributions assigned a priori. These techniques are illustrated in an example exploring relationships between victimization and individual and contextual level factors that raise the risk of violent crime. PMID:22271079

  14. Time series outlier detection and imputation Hermine N. Akouemo and Richard J. Povinelli

    E-print Network

    Povinelli, Richard J.

    1 Time series outlier detection and imputation Hermine N. Akouemo and Richard J. Povinelli of outliers in time series data. An autoregressive integrated moving average with exogenous inputs (ARIMAX) model is used to extract the characteristics of the time series and to find the residuals. The outliers

  15. Missing value estimation for DNA microarray gene expression data: local least squares imputation

    Microsoft Academic Search

    Hyunsoo Kim; Gene H. Golub; Haesun Park

    2005-01-01

    Motivation: Gene expression data often contain missing expression values. Effective missing value estimation methods are needed since many algorithms for gene expression data analysis require a complete matrix of gene array values. In this paper, imputation methods based on the least squares formulation are proposed to estimate missing values in the gene expression data, which exploit local similarity structures

  16. Multivariate imputation in cross-sectional analysis of health effects associated with air pollution

    Microsoft Academic Search

    C. Duddek; N. D. Le; J. V. Zidek; R. T. Burnett

    1995-01-01

    We demonstrate a recently developed spatial interpolation methodology in a study of the chronic effects of air pollution on respiratory morbidity. Our study uses data from the Ontario Health Study, a large survey of households in Ontario conducted for the province by Statistics Canada. The interpolation procedure imputes unobserved vectors of air pollution concentrations for individual Public Health Units, from

  17. Reporting the Use of Multiple Imputation for Missing Data in Higher Education Research

    ERIC Educational Resources Information Center

    Manly, Catherine A.; Wells, Ryan S.

    2015-01-01

    Higher education researchers using survey data often face decisions about handling missing data. Multiple imputation (MI) is considered by many statisticians to be the most appropriate technique for addressing missing data in many circumstances. In particular, it has been shown to be preferable to listwise deletion, which has historically been a…

  18. Imputation of missing genotypes from sparse to high density using long-range phasing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Related individuals share potentially long chromosome segments that trace to a common ancestor. A phasing algorithm (ChromoPhase) that utilizes this characteristic of finite populations was developed to phase large sections of a chromosome. In addition to phasing, ChromoPhase imputes missing genotyp...

  19. Multiple Imputation by Ordered Monotone Blocks with Application to the Anthrax Vaccine Research Program

    E-print Network

    West, Mike

    Multiple Imputation by Ordered Monotone Blocks with Application to the Anthrax Vaccine Research with missing values. The CDC Anthrax Vaccine Research Program (AVRP) dataset created new challenges for MI due, the associate editor and two anonymous reviewers for constructive comments. ... iterating. We apply

  20. The search for stable prognostic models in multiple imputed data sets

    Microsoft Academic Search

    David Vergouw; Martijn W Heymans; George M Peat; Ton Kuijpers; Peter R Croft; Henrica CW de Vet; Henriëtte E van der Horst; Daniëlle AWM van der Windt

    2010-01-01

    BACKGROUND: In prognostic studies model instability and missing data can be troubling factors. Proposed methods for handling these situations are bootstrapping (B) and Multiple imputation (MI). The authors examined the influence of these methods on model composition. METHODS: Models were constructed using a cohort of 587 patients consulting between January 2001 and January 2003 with a shoulder problem in general

  1. Comparative analysis of missing value imputation methods to improve clustering and interpretation of microarray experiments

    Microsoft Academic Search

    Magalie Celton; Alain Malpertuy; Gaëlle Lelandais; Alexandre G. de Brevern

    2010-01-01

    BACKGROUND: Microarray technologies produced large amount of data. In a previous study, we have shown the interest of k-Nearest Neighbour approach for restoring the missing gene expression values, and its positive impact of the gene clustering by hierarchical algorithm. Since, numerous replacement methods have been proposed to impute missing values (MVs) for microarray data. In this study, we have evaluated

  2. Imputation of Missing Links and Attributes in Longitudinal Social Surveys Vladimir Ouzienko and Zoran Obradovic

    E-print Network

    Obradovic, Zoran

    Imputation of Missing Links and Attributes in Longitudinal Social Surveys Vladimir Ouzienko of the links and attributes in longitudinal social surveys which accounts for changing network topology. INTRODUCTION Social network surveys have proven to be invaluable tools for social scientists. In such surveys

  3. iVAR: a program for imputing missing data in multivariate time series using vector autoregressive models.

    PubMed

    Liu, Siwei; Molenaar, Peter C M

    2014-12-01

    This article introduces iVAR, an R program for imputing missing data in multivariate time series on the basis of vector autoregressive (VAR) models. We conducted a simulation study to compare iVAR with three methods for handling missing data: listwise deletion, imputation with sample means and variances, and multiple imputation ignoring time dependency. The results showed that iVAR produces better estimates for the cross-lagged coefficients than do the other three methods. We demonstrate the use of iVAR with an empirical example of time series electrodermal activity data and discuss the advantages and limitations of the program. PMID:24515888

  4. Analysis of accelerated failure time data with dependent censoring using auxiliary variables via nonparametric multiple imputation.

    PubMed

    Hsu, Chiu-Hsieh; Taylor, Jeremy M G; Hu, Chengcheng

    2015-08-30

    We consider the situation of estimating the marginal survival distribution from censored data subject to dependent censoring using auxiliary variables. We had previously developed a nonparametric multiple imputation approach. The method used two working proportional hazards (PH) models, one for the event times and the other for the censoring times, to define a nearest neighbor imputing risk set. This risk set was then used to impute failure times for censored observations. Here, we adapt the method to the situation where the event and censoring times follow accelerated failure time models and propose to use the Buckley-James estimator as the two working models. Besides studying the performances of the proposed method, we also compare the proposed method with two popular methods for handling dependent censoring through the use of auxiliary variables, inverse probability of censoring weighted and parametric multiple imputation methods, to shed light on the use of them. In a simulation study with time-independent auxiliary variables, we show that all approaches can reduce bias due to dependent censoring. The proposed method is robust to misspecification of either one of the two working models and their link function. This indicates that a working proportional hazards model is preferred because it is more cumbersome to fit an accelerated failure time model. In contrast, the inverse probability of censoring weighted method is not robust to misspecification of the link function of the censoring time model. The parametric imputation methods rely on the specification of the event time model. The approaches are applied to a prostate cancer dataset. Copyright © 2015 John Wiley & Sons, Ltd. PMID:25999295

  5. LS-SNP: large-scale annotation of coding non-synonymous SNPs based on multiple information sources

    Microsoft Academic Search

    Rachel Karchin; Mark Diekhans; Libusha Kelly; Daryl J. Thomas; Ursula Pieper; Narayanan Eswar; David Haussler; Andrej Sali

    2005-01-01

    Motivation: The NCBI dbSNP database lists over 9 million SNPs in the human genome, but currently contains limited annotation information. SNPs that result in amino-acid residue changes (nsSNPs) are of critical importance in variation between individuals, including disease and drug sensitivity. Results: We have developed LS-SNP, a genomic-scale software pipeline to annotate nsSNPs. LS-SNP comprehensively maps nsSNPs onto

  6. Association scan of 14,500 nonsynonymous SNPs in four diseases identifies autoimmunity variants

    Microsoft Academic Search

    Paul R Burton; David G Clayton; Nick Craddock; Panos Deloukas; Audrey Duncanson; Dominic P Kwiatkowski; Mark I McCarthy; Willem H Ouwehand; Nilesh J Samani; John A Todd; Jeffrey C Barrett; Dan Davison; Peter Donnelly; Doug Easton; Hin-Tak Leung; Jonathan L Marchini; Andrew P Morris; Chris CA Spencer; Martin D Tobin; Antony P Attwood; James P Boorman; Barbara Cant; Ursula Everson; Judith M Hussey; Jennifer D Jolley; Alexandra S Knight; Kerstin Koch; Elizabeth Meech; Sarah Nutland; Christopher V Prowse; Helen E Stevens; Niall C Taylor; Graham R Walters; Neil M Walker; Nicholas A Watkins; Thilo Winzer; Richard W Jones; Wendy L McArdle; Susan M Ring; David P Strachan; Marcus Pembrey; Gerome Breen; David St Clair; Sian Caesar; Katharine Gordon-Smith; Lisa Jones; Christine Fraser; Elaine K Green; Detelina Grozeva; Marian L Hamshere; Peter A Holmans; Ian R Jones; George Kirov; Valentina Moskivina; Ivan Nikolov; Michael C O'Donovan; Michael J Owen; David A Collier; Amanda Elkin; Anne Farmer; Richard Williamson; Peter McGuffin; Allan H Young; I Nicol Ferrier; Stephen G Ball; Anthony J Balmforth; Jennifer H Barrett; Timothy D Bishop; Mark M Iles; Azhar Maqbool; Nadira Yuldasheva; Alistair S Hall; Peter S Braund; Richard J Dixon; Massimo Mangino; Suzanne Stevens; John R Thompson; Francesca Bredin; Mark Tremelling; Miles Parkes; Hazel Drummond; Charles W Lees; Elaine R Nimmo; Jack Satsangi; Sheila A Fisher; Alastair Forbes; Cathryn M Lewis; Clive M Onnie; Natalie J Prescott; Jeremy Sanderson; Christopher G Matthew; Jamie Barbour; M Khalid Mohiuddin; Catherine E Todhunter; John C Mansfield; Tariq Ahmad; Fraser R Cummings; Derek P Jewell; John Webster; Morris J Brown; Mark G Lathrop; John Connell; Anna Dominiczak; Carolina A Braga Marcano; Beverley Burke; Richard Dobson; Johannie Gungadoo; Kate L Lee; Patricia B Munroe; Stephen J Newhouse; Abiodun Onipinla; Chris Wallace; Mingzhan Xue; Mark Caulfield; Martin Farrall; Anne Barton; Ian N Bruce; Hannah Donovan; Steve Eyre; Paul D Gilbert; Samantha L Hilder; Anne M Hinks; Sally L John; Catherine Potter; Alan J Silman; Deborah PM Symmons; Wendy Thomson; Jane Worthington; David B Dunger; Barry Widmer; Timothy M Frayling; Rachel M Freathy; Hana Lango; John R B Perry; Beverley M Shields; Michael N Weedon; Andrew T Hattersley; Graham A Hitman; Mark Walker; Kate S Elliott; Christopher J Groves; Cecilia M Lindgren; Nigel W Rayner; Nicolas J Timpson; Eleftheria Zeggini; Melanie Newport; Giorgio Sirugo; Emily Lyons; Fredrik Vannberg; Adrian V S Hill; Linda A Bradbury; Claire Farrar; Jennifer J Pointon; Paul Wordsworth; Matthew A Brown; Jayne A Franklyn; Joanne M Heward; Matthew J Simmonds; Stephen CL Gough; Sheila Seal; Michael R Stratton; Nazneen Rahman; Maria Ban; An Goris; Stephen J Sawcer; Alastair Compston; David Conway; Muminatou Jallow; Kirk A Rockett; Suzannah J Bumpstead; Amy Chaney; Kate Downes; Mohammed JR Ghori; Rhian Gwilliam; Sarah E Hunt; Michael Inouye; Andrew Keniry; Emma King; Ralph McGinnis; Simon Potter; Rathi Ravindrarajah; Pamela Whittaker; Claire Widden; David Withers; Niall J Cardin; Teresa Ferreira; Joanne Pereira-Gale; Ingeleif B Hallgrimsdóttir; Bryan N Howie; Zhan Su; Yik Ying Teo; Damjan Vukcevic; David Bentley; Sarah L Mitchell; Paul R Newby; Oliver J Brand; Jackie Carr-Smith; Simon H S Pearce; Stephen C L Gough; John D Reveille; Xiaodong Zhou; Anne-Marie Sims; Alison Dowling; Jacqueline Taylor; Tracy Doan; John C Davis; Laurie Savage; Michael M Ward; Thomas L Learch; Michael H Weisman; Lon R Cardon; David M Evans

    2007-01-01

    We have genotyped 14,436 nonsynonymous SNPs (nsSNPs) and 897 major histocompatibility complex (MHC) tag SNPs from 1,000 independent cases of ankylosing spondylitis (AS), autoimmune thyroid disease (AITD), multiple sclerosis (MS) and breast cancer (BC). Comparing these data against a common control dataset derived from 1,500 randomly selected healthy British individuals, we report initial association and independent replication in a North

  7. Discovery of c-SNPs in Anemone coronaria L. and Assessment of Genetic Variation

    Microsoft Academic Search

    A. Shamay; J. Fang; N. Pollak; A. Cohen; N. Yonash; U. Lavi

    2006-01-01

    Single nucleotide polymorphisms (SNPs) were discovered in 44 cDNA clones from leaves by comparison of the commercial cultivar ‘Mona Lisa’ with a wild population of Anemone coronaria L. One hundred and fifty-five SNPs were discovered with an average frequency of one SNP per 167 bp. Forty-nine percent of the SNPs are transitions, 43 are transversions, 26 are heterozygotes and

  8. Computational and structural investigation of deleterious functional SNPs in breast cancer BRCA2 gene.

    PubMed

    Rajasekaran, R; Doss, George Priya; Sudandiradoss, C; Ramanathan, K; Rituraj, Purohit; Sethumadhavan, Rao

    2008-05-01

    In this work, we analyzed the genetic variation that can alter the expression and function of the BRCA2 gene using computational methods. Of the 534 SNPs in total, 101 were found to be nonsynonymous (nsSNPs). Of the 7 SNPs in the untranslated regions (UTRs), 3 were found in the 5' UTR and 4 in the 3' UTR. Of the 101 nsSNPs investigated, 20.7% were found to be damaging by both the SIFT and PolyPhen servers. The UTR resource tool suggested that 2 SNPs in the 5' UTR and 4 SNPs in the 3' UTR might change protein expression levels. The mutation from asparagine to isoleucine at position 3124 of the native BRCA2 protein was the most deleterious according to both the SIFT and PolyPhen servers. A structural analysis comparing this mutated protein with the native protein gave an RMSD value of 0.301 nm. Based on this work, we propose that this most deleterious nsSNP, with SNP ID rs28897759, is an important candidate for the cause of breast cancer by the BRCA2 gene. PMID:18724707

  9. Accounting for dependence induced by weighted KNN imputation in paired samples, motivated by a colorectal cancer study.

    PubMed

    Suyundikov, Anvar; Stevens, John R; Corcoran, Christopher; Herrick, Jennifer; Wolff, Roger K; Slattery, Martha L

    2015-01-01

    Missing data can arise in bioinformatics applications for a variety of reasons, and imputation methods are frequently applied to such data. We are motivated by a colorectal cancer study where miRNA expression was measured in paired tumor-normal samples of hundreds of patients, but data for many normal samples were missing due to lack of tissue availability. We compare the precision and power performance of several imputation methods, and draw attention to the statistical dependence induced by K-Nearest Neighbors (KNN) imputation. This imputation-induced dependence has not previously been addressed in the literature. We demonstrate how to account for this dependence, and show through simulation how the choice to ignore or account for this dependence affects both power and type I error rate control. PMID:25849489

  10. 7 CFR 3017.630 - May the Department of Agriculture impute conduct of one person to another?

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    7 Agriculture 15 2010-01-01 2010-01-01 false May the Department of Agriculture impute conduct of one person to another? 3017.630 Section 3017.630 Agriculture Regulations of the Department of Agriculture...

  11. 41 CFR 105-68.630 - May the General Services Administration impute conduct of one person to another?

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    41 Public Contracts and Property...May the General Services Administration impute conduct of one person...Section 105-68.630 Public Contracts and Property...Continued) GENERAL SERVICES ADMINISTRATION Regional...

  12. Research note: imputing large group averages for missing data, using rural-urban continuum codes for density driven industry sectors

    Microsoft Academic Search

    Jeremy R. Porter; Ronald E. Cossman; Wesley L. James

    2009-01-01

    Understanding the effects and consequences of missing data imputation is vital to the ability to obtain meaningful and reliable statistics and coefficients in the examination of any quantitatively-based phenomena. Over time a series of sophisticated methods have been developed to handle the issue of missing data imputation; however, these sophisticated methods may not always be appropriate or attainable. In these

  13. Double Sampling with Multiple Imputation to Answer Large Sample Meta-Research Questions: Introduction and Illustration by Evaluating Adherence to Two Simple CONSORT Guidelines

    PubMed Central

    Capers, Patrice L.; Brown, Andrew W.; Dawson, John A.; Allison, David B.

    2015-01-01

    Background: Meta-research can involve manual retrieval and evaluation of research, which is resource intensive. Creation of high throughput methods (e.g., search heuristics, crowdsourcing) has improved feasibility of large meta-research questions, but possibly at the cost of accuracy. Objective: To evaluate the use of double sampling combined with multiple imputation (DS + MI) to address meta-research questions, using as an example adherence of PubMed entries to two simple consolidated standards of reporting trials guidelines for titles and abstracts. Methods: For the DS large sample, we retrieved all PubMed entries satisfying the filters: RCT, human, abstract available, and English language (n = 322,107). For the DS subsample, we randomly sampled 500 entries from the large sample. The large sample was evaluated with a lower rigor, higher throughput (RLOTHI) method using search heuristics, while the subsample was evaluated using a higher rigor, lower throughput (RHITLO) human rating method. Multiple imputation of the missing-completely-at-random RHITLO data for the large sample was informed by: RHITLO data from the subsample; RLOTHI data from the large sample; whether a study was an RCT; and country and year of publication. Results: The RHITLO and RLOTHI methods in the subsample largely agreed (phi coefficients: title = 1.00, abstract = 0.92). Compliance with abstract and title criteria has increased over time, with non-US countries improving more rapidly. DS + MI logistic regression estimates were more precise than subsample estimates (e.g., 95% CI for change in title and abstract compliance by year: subsample RHITLO 1.050–1.174 vs. DS + MI 1.082–1.151). As evidence of improved accuracy, DS + MI coefficient estimates were closer to RHITLO than the large sample RLOTHI. Conclusion: Our results support our hypothesis that DS + MI would result in improved precision and accuracy. This method is flexible and may provide a practical way to examine large corpora of literature. PMID:25988135

  14. Comparison of Results from Different Imputation Techniques for Missing Data from an Anti-Obesity Drug Trial

    PubMed Central

    Jørgensen, Anders W.; Lundstrøm, Lars H.; Wetterslev, Jørn; Astrup, Arne; Gøtzsche, Peter C.

    2014-01-01

    Background In randomised trials of medical interventions, the most reliable analysis follows the intention-to-treat (ITT) principle. However, the ITT analysis requires that missing outcome data have to be imputed. Different imputation techniques may give different results and some may lead to bias. In anti-obesity drug trials, many data are usually missing, and the most used imputation method is last observation carried forward (LOCF). LOCF is generally considered conservative, but there are more reliable methods such as multiple imputation (MI). Objectives To compare four different methods of handling missing data in a 60-week placebo controlled anti-obesity drug trial on topiramate. Methods We compared an analysis of complete cases with datasets where missing body weight measurements had been replaced using three different imputation methods: LOCF, baseline carried forward (BOCF) and MI. Results 561 participants were randomised. Compared to placebo, there was a significantly greater weight loss with topiramate in all analyses: 9.5 kg (SE 1.17) in the complete case analysis (N = 86), 6.8 kg (SE 0.66) using LOCF (N = 561), 6.4 kg (SE 0.90) using MI (N = 561) and 1.5 kg (SE 0.28) using BOCF (N = 561). Conclusions The different imputation methods gave very different results. Contrary to widely stated claims, LOCF did not produce a conservative (i.e., lower) efficacy estimate compared to MI. Also, LOCF had a lower SE than MI. PMID:25409438
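
    The two single-imputation rules named in this abstract, LOCF and BOCF, can be illustrated with a minimal sketch. The function names and the example weights below are hypothetical and are not taken from the trial; missing visits are assumed to be encoded as None, and the baseline (first) visit is assumed to be observed.

```python
from typing import List, Optional, Sequence


def locf(weights: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Last observation carried forward: fill each gap with the latest observed value."""
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for w in weights:
        if w is not None:
            last = w
        filled.append(last)
    return filled


def bocf(weights: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Baseline observation carried forward: fill each gap with the first (baseline) value."""
    baseline = weights[0]
    return [baseline if w is None else w for w in weights]


# Hypothetical participant: body weight in kg at five visits, dropping out after visit 3.
visits = [100.0, 97.0, 95.0, None, None]
print(locf(visits))  # [100.0, 97.0, 95.0, 95.0, 95.0]
print(bocf(visits))  # [100.0, 97.0, 95.0, 100.0, 100.0]
```

    In this toy case LOCF keeps the 5 kg loss seen before dropout while BOCF resets it to zero, which is consistent with the much smaller BOCF estimate reported above.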

  15. selectSNP – An R package for selecting SNPs optimal for genetic evaluation

    Technology Transfer Automated Retrieval System (TEKTRAN)

    There has been a huge increase in the number of SNPs in the public repositories. This has made it a challenge to design low and medium density SNP panels, which requires careful selection of available SNPs considering many criteria, such as map position, allelic frequency, possible biological functi...

  16. Collaborative development of SNPs for cotton research, introgression, MAS and breeding

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Extensive use of genome-wide analyses requires that molecular markers be highly abundant, informative and, once developed, extremely cost-effective to use, such as single-nucleotide polymorphisms (SNPs). The efforts toward development of cotton SNPs have been few and small-scale. The novel cotton ...

  17. The role of complementary bipartite visual analytical representations in the analysis of SNPs: a case study

    E-print Network

    Bhavnani, Suresh K.

    -nucleotide polymorphisms (SNPs) can help to classify subjects on the basis of their continental origins, with applications. This variation, resulting from millennia of natural selection and random drift, is coded in ~20–30 million specific diseases and SNPs that are highly associated with continental origins. For example, several

  18. PATHOTYPING OF SALMONELLA ENTERICA BY ANALYSIS OF SNPS IN CYAA AND FLANKING 23S RIBOSOMAL SEQUENCES

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The egg-contaminating phenotype of Salmonella enterica serotype Enteritidis was linked to single-nucleotide polymorphisms (SNPs) occurring in cyaA, which encodes adenylate cyclase that produces cAMP and pyrophosphate from ATP. Ribotyping indicated that SNPs in cyaA were linked to polymorphisms occur...

  19. Lazy collaborative filtering for data sets with missing values.

    PubMed

    Ren, Yongli; Li, Gang; Zhang, Jun; Zhou, Wanlei

    2013-12-01

    As one of the biggest challenges in research on recommender systems, the data sparsity issue is mainly caused by the fact that users tend to rate a small proportion of items from the huge number of available items. This issue becomes even more problematic for the neighborhood-based collaborative filtering (CF) methods, as there are even lower numbers of ratings available in the neighborhood of the query item. In this paper, we aim to address the data sparsity issue in the context of neighborhood-based CF. For a given query (user, item), a set of key ratings is first identified by taking the historical information of both the user and the item into account. Then, an auto-adaptive imputation (AutAI) method is proposed to impute the missing values in the set of key ratings. We present a theoretical analysis to show that the proposed imputation method effectively improves the performance of the conventional neighborhood-based CF methods. The experimental results show that our new method of CF with AutAI outperforms six existing recommendation methods in terms of accuracy. PMID:23757575

  20. Common SNPs explain a large proportion of heritability for human height

    PubMed Central

    Yang, Jian; Benyamin, Beben; McEvoy, Brian P; Gordon, Scott; Henders, Anjali K; Nyholt, Dale R; Madden, Pamela A; Heath, Andrew C; Martin, Nicholas G; Montgomery, Grant W; Goddard, Michael E; Visscher, Peter M

    2011-01-01

    Single nucleotide polymorphisms (SNPs) discovered by genome-wide association studies (GWASs) account for only a small fraction of the genetic variation of complex traits in human populations. Where is the remaining heritability? We estimated the proportion of variance for human height explained by 294,831 SNPs genotyped on 3,925 unrelated individuals using a linear model analysis, and validated the estimation method by simulations based upon the observed genotype data. We show that 45% of variance can be explained by considering all SNPs simultaneously. Thus, most of the heritability is not missing but has not previously been detected because the individual effects are too small to pass stringent significance tests. We provide evidence that the remaining heritability is due to incomplete linkage disequilibrium (LD) between causal variants and genotyped SNPs, exacerbated by causal variants having lower minor allele frequency (MAF) than the SNPs explored to date. PMID:20562875

  1. Multiple Imputation For Combined-Survey Estimation With Incomplete Regressors In One But Not Both Surveys.

    PubMed

    Rendall, Michael S; Ghosh-Dastidar, Bonnie; Weden, Margaret M; Baker, Elizabeth H; Nazarov, Zafar

    2013-11-01

    Within-survey multiple imputation (MI) methods are adapted to pooled-survey regression estimation where one survey has more regressors, but typically fewer observations, than the other. This adaptation is achieved through: (1) larger numbers of imputations to compensate for the higher fraction of missing values; (2) model-fit statistics to check the assumption that the two surveys sample from a common universe; and (3) specifying the analysis model completely from variables present in the survey with the larger set of regressors, thereby excluding variables never jointly observed. In contrast to the typical within-survey MI context, cross-survey missingness is monotonic and easily satisfies the Missing At Random (MAR) assumption needed for unbiased MI. Large efficiency gains and substantial reduction in omitted variable bias are demonstrated in an application to sociodemographic differences in the risk of child obesity estimated from two nationally-representative cohort surveys. PMID:24223447

  2. A Review of Hot Deck Imputation for Survey Non-response

    PubMed Central

    Andridge, Rebecca R.; Little, Roderick J. A.

    2011-01-01

    Summary Hot deck imputation is a method for handling missing data in which each missing value is replaced with an observed response from a “similar” unit. Despite being used extensively in practice, the theory is not as well developed as that of other imputation methods. We have found that no consensus exists as to the best way to apply the hot deck and obtain inferences from the completed data set. Here we review different forms of the hot deck and existing research on its statistical properties. We describe applications of the hot deck currently in use, including the U.S. Census Bureau’s hot deck for the Current Population Survey (CPS). We also provide an extended example of variations of the hot deck applied to the third National Health and Nutrition Examination Survey (NHANES III). Some potential areas for future research are highlighted. PMID:21743766
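
    As a rough illustration of the idea described in this abstract (replacing each missing value with an observed response from a "similar" unit), the following sketch implements one simple variant, a random hot deck within adjustment cells. The function name, the field names, and the choice of a single categorical variable to define similarity are assumptions for illustration only; the review covers many other forms.

```python
import random
from collections import defaultdict


def hot_deck(records, cell_key, target, seed=0):
    """Random hot deck within cells.

    Each record with a missing `target` receives the observed value of a donor
    drawn at random from records sharing the same `cell_key` value. Records in
    cells with no donors are left missing.
    """
    rng = random.Random(seed)
    donors = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            donors[rec[cell_key]].append(rec[target])

    completed = []
    for rec in records:
        rec = dict(rec)  # copy so the input is not modified
        pool = donors.get(rec[cell_key], [])
        if rec[target] is None and pool:
            rec[target] = rng.choice(pool)
        completed.append(rec)
    return completed


# Hypothetical survey records; None marks item non-response.
survey = [
    {"region": "north", "income": 40},
    {"region": "north", "income": 55},
    {"region": "north", "income": None},
    {"region": "south", "income": 30},
    {"region": "south", "income": None},
    {"region": "west", "income": None},
]
completed = hot_deck(survey, cell_key="region", target="income")
print(completed)
```

    The "west" record stays missing because its cell has no donor, which is one of the practical issues a real hot deck procedure has to handle (for example by collapsing cells).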

  3. Evaluation of an imputed pitch velocity model of the auditory tau effect

    Microsoft Academic Search

    Molly J. Henry; J. Devin McAuley; Marta Zaleha

    2009-01-01

    This article extends an imputed pitch velocity model of the auditory kappa effect proposed by Henry and McAuley (2009a) to the auditory tau effect. Two experiments were conducted using an AXB design in which listeners judged the relative pitch of a middle target tone (X) in ascending and descending three-tone sequences. In Experiment 1, sequences were isochronous, establishing constant fast,

  4. Imputing Observed Blood Pressure for Antihypertensive Treatment: Impact on Population and Genetic Analyses

    PubMed Central

    2014-01-01

    BACKGROUND Elevated blood pressure (BP), a heritable risk factor for many age-related disorders, is commonly investigated in population and genetic studies, but antihypertensive use can confound study results. Routine methods to adjust for antihypertensives may not sufficiently account for newer treatment protocols (i.e., combination or multiple drug therapy) found in contemporary cohorts. METHODS We refined an existing method to impute unmedicated BP in individuals on antihypertensives by incorporating new treatment trends. We assessed BP and antihypertensive use in male twins (n = 1,237) from the Vietnam Era Twin Study of Aging: 36% reported antihypertensive use; 52% of those treated were on multiple drugs. RESULTS Estimated heritability was 0.43 (95% confidence interval (CI) = 0.20–0.50) and 0.44 (95% CI = 0.22–0.61) for measured systolic BP (SBP) and diastolic BP (DBP), respectively. We imputed BP for antihypertensives by 3 approaches: (i) addition of a fixed value of 10/5 mm Hg to measured SBP/DBP; (ii) incremented addition of mm Hg to BP based on number of medications; and (iii) a refined approach adding mm Hg based on antihypertensive drug class and ethnicity. The imputations did not significantly affect estimated heritability of BP. However, use of our most refined imputation method and other methods resulted in significantly increased phenotypic correlations between BP and body mass index, a trait known to be correlated with BP. CONCLUSIONS This study highlights the potential usefulness of applying a representative adjustment for medication use, such as by considering drug class, ethnicity, and the combination of drugs when assessing the relationship between BP and risk factors. PMID:24532572
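
    Approach (i) in this abstract, adding a fixed 10/5 mm Hg to measured SBP/DBP for treated individuals, is simple enough to sketch directly. The function name and argument names are hypothetical. Approaches (ii) and (iii) are not shown because the abstract does not give their per-medication or per-class increments.

```python
def impute_unmedicated_bp(sbp, dbp, on_treatment, sbp_add=10.0, dbp_add=5.0):
    """Return (SBP, DBP) in mm Hg with a fixed increment added for treated subjects.

    The 10/5 mm Hg defaults follow approach (i) in the abstract; untreated
    subjects keep their measured values.
    """
    if on_treatment:
        return sbp + sbp_add, dbp + dbp_add
    return sbp, dbp


print(impute_unmedicated_bp(130.0, 80.0, on_treatment=True))   # (140.0, 85.0)
print(impute_unmedicated_bp(130.0, 80.0, on_treatment=False))  # (130.0, 80.0)
```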

  5. Imputing historical statistics, soils information, and other land-use data to crop area

    NASA Technical Reports Server (NTRS)

    Perry, C. R., Jr.; Willis, R. W.; Lautenschlager, L.

    1982-01-01

    In foreign crop condition monitoring, satellite acquired imagery is routinely used. To facilitate interpretation of this imagery, it is advantageous to have estimates of the crop types and their extent for small area units, i.e., grid cells on a map represent, at 60 deg latitude, an area nominally 25 by 25 nautical miles in size. The feasibility of imputing historical crop statistics, soils information, and other ancillary data to crop area for a province in Argentina is studied.

  6. IMPUTATION OF RAMP FLOW DATA USING THE ASYMMETRIC CELL TRANSMISSION TRAFFIC FLOW MODEL

    Microsoft Academic Search

    Ajith Muralidharan; Roberto Horowitz

    The Asymmetric Cell Transmission model can be used to simulate traffic flows in freeway sections. The model is specified by fundamental diagram parameters, determined from mainline data, and on-ramp and off-ramp flows. The mainline flow/density data are efficiently archived and readily available, but the ramp flow data are generally found missing. This paper presents an imputation technique based

  7. Disk filter

    DOEpatents

    Bergman, Werner (Pleasanton, CA)

    1986-01-01

    An electric disk filter provides a high efficiency at high temperature. A hollow outer filter of fibrous stainless steel forms the ground electrode. A refractory filter material is placed between the outer electrode and the inner electrically isolated high voltage electrode. Air flows through the outer filter surfaces through the electrified refractory filter media and between the high voltage electrodes and is removed from a space in the high voltage electrode.

  8. Filter arrangement

    SciTech Connect

    Hancock, T.M.

    1981-06-09

    A filter arrangement is described that includes a venturi element carried within a tubular filter by a flexible attachment ring at one end of the filter removably securing the filter in the aperture of an apertured plate member in a gas filtration system. The outermost diameter of the venturi element is slightly less than the diameter of the aperture to accommodate withdrawal of the venturi element from either side of the plate attendant to installation and removal of the filter arrangement.

  9. SNPs in putative regulatory regions identified by human mouse comparative sequencing and transcription factor binding site data

    SciTech Connect

    Banerjee, Poulabi; Bahlo, Melanie; Schwartz, Jody R.; Loots, Gabriela G.; Houston, Kathryn A.; Dubchak, Inna; Speed, Terence P.; Rubin, Edward M.

    2002-01-01

    Genome wide disease association analysis using SNPs is being explored as a method for dissecting complex genetic traits and a vast number of SNPs have been generated for this purpose. As there are cost and throughput limitations of genotyping large numbers of SNPs and statistical issues regarding the large number of dependent tests on the same data set, to make association analysis practical it has been proposed that SNPs should be prioritized based on likely functional importance. The most easily identifiable functional SNPs are coding SNPs (cSNPs) and accordingly cSNPs have been screened in a number of studies. SNPs in gene regulatory sequences embedded in noncoding DNA are another class of SNPs suggested for prioritization due to their predicted quantitative impact on gene expression. The main challenge in evaluating these SNPs, in contrast to cSNPs is a lack of robust algorithms and databases for recognizing regulatory sequences in noncoding DNA. Approaches that have been previously used to delineate noncoding sequences with gene regulatory activity include cross-species sequence comparisons and the search for sequences recognized by transcription factors. We combined these two methods to sift through mouse human genomic sequences to identify putative gene regulatory elements and subsequently localized SNPs within these sequences in a 1 Megabase (Mb) region of human chromosome 5q31, orthologous to mouse chromosome 11 containing the Interleukin cluster.

  10. Missing data imputation of solar radiation data under different atmospheric conditions.

    PubMed

    Turrado, Concepción Crespo; López, María Del Carmen Meizoso; Lasheras, Fernando Sánchez; Gómez, Benigno Antonio Rodríguez; Rollé, José Luis Calvo; Juez, Francisco Javier de Cos

    2014-01-01

    Global solar broadband irradiance on a planar surface is measured at weather stations by pyranometers. In the case of the present research, solar radiation values from nine meteorological stations of the MeteoGalicia real-time observational network, captured and stored every ten minutes, are considered. In this kind of record, the lack of data and/or the presence of wrong values adversely affects any time series study. Consequently, when this occurs, a data imputation process must be performed in order to replace missing data with estimated values. This paper aims to evaluate the multivariate imputation of ten-minute scale data by means of the chained equations method (MICE). This method allows the network itself to impute the missing or wrong data of a solar radiation sensor, by using either all or just a group of the measurements of the remaining sensors. Very good results have been obtained with the MICE method in comparison with other methods employed in this field such as Inverse Distance Weighting (IDW) and Multiple Linear Regression (MLR). The average RMSE of the predictions was 13.37% for the MICE algorithm, compared with 28.19% for MLR and 31.68% for IDW. PMID:25356644
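
    For orientation, the simplest of the comparison methods named in this abstract, Inverse Distance Weighting, can be sketched as follows. This is a generic textbook form of IDW, not the paper's implementation: the function name, the power parameter of 2, and the example distances and readings are all assumptions.

```python
def idw_estimate(neighbors, power=2.0):
    """Estimate a missing reading as a distance-weighted mean of other stations.

    `neighbors` is a list of (distance, value) pairs for the stations that do
    have a reading; each weight is 1 / distance**power.
    """
    numerator = 0.0
    denominator = 0.0
    for distance, value in neighbors:
        if distance == 0:
            return value  # a co-located station determines the estimate
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    if denominator == 0.0:
        raise ValueError("at least one neighbor is required")
    return numerator / denominator


# Hypothetical readings from two stations at 1 and 2 distance units.
print(idw_estimate([(1.0, 100.0), (2.0, 200.0)]))  # 120.0
```

    With weights 1 and 0.25, the nearer station dominates: (100 + 50) / 1.25 = 120.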

  11. Purposeful Variable Selection and Stratification to Impute Missing FAST Data in Trauma Research

    PubMed Central

    Fuchs, Paul A.; del Junco, Deborah J.; Fox, Erin E.; Holcomb, John B.; Rahbar, Mohammad H.; Wade, Charles A.; Alarcon, Louis H.; Brasel, Karen J.; Bulger, Eileen M.; Cohen, Mitchell J.; Myers, John G.; Muskat, Peter; Phelan, Herb A.; Schreiber, Martin A.; Cotton, Bryan A.

    2013-01-01

    Background The Focused Assessment with Sonography for Trauma (FAST) exam is an important variable in many retrospective trauma studies. The purpose of this study was to devise an imputation method to overcome missing data for the FAST exam. Due to variability in patients’ injuries and trauma care, these data are unlikely to be missing completely at random (MCAR), raising concern for validity when analyses exclude patients with missing values. Methods Imputation was conducted under a less restrictive, more plausible missing at random (MAR) assumption. Patients with missing FAST exams had available data on alternate, clinically relevant elements that were strongly associated with FAST results in complete cases, especially when considered jointly. Subjects with missing data (32.7%) were divided into eight mutually exclusive groups based on selected variables that both described the injury and were associated with missing FAST values. Additional variables were selected within each group to classify missing FAST values as positive or negative, and correct FAST exam classification based on these variables was determined for patients with non-missing FAST values. Results Severe head/neck injury (odds ratio, OR=2.04), severe extremity injury (OR=4.03), severe abdominal injury (OR=1.94), no injury (OR=1.94), other abdominal injury (OR=0.47), other head/neck injury (OR=0.57) and other extremity injury (OR=0.45) groups had significant ORs for missing data; the other group odds ratio was not significant (OR=0.84). All 407 missing FAST values were imputed, with 109 classified as positive. Correct classification of non-missing FAST results using the alternate variables was 87.2%. Conclusions Purposeful imputation for missing FAST exams based on interactions among selected variables assessed by simple stratification may be a useful adjunct to sensitivity analysis in the evaluation of imputation strategies under different missing data mechanisms. This approach has the potential for widespread application in clinical and translational research and validation is warranted. Level of Evidence Level II Prognostic or Epidemiological PMID:23778515

  12. Water Filters

    NASA Technical Reports Server (NTRS)

    1993-01-01

    The Aquaspace H2OME Guardian Water Filter, available through Western Water International, Inc., reduces lead in water supplies. The filter is mounted on the faucet and the filter cartridge is placed in the "dead space" between sink and wall. This filter is one of several new filtration devices using the Aquaspace compound filter media, which combines company developed and NASA technology. Aquaspace filters are used in industrial, commercial, residential, and recreational environments as well as by developing nations where water is highly contaminated.

  13. Evaluating GWAS-Identified SNPs for Age at Natural Menopause among Chinese Women

    PubMed Central

    Shen, Chong; Delahanty, Ryan J.; Gao, Yu-Tang; Lu, Wei; Xiang, Yong-Bing; Zheng, Ying; Cai, Qiuyin; Zheng, Wei; Shu, Xiao-Ou; Long, Jirong

    2013-01-01

    Background Age at natural menopause (ANM) is a complex trait with high heritability and is associated with several major hormonal-related diseases. Recently, several genome-wide association studies (GWAS), conducted exclusively among women of European ancestry, have discovered dozens of genetic loci influencing ANM. No study has been conducted to evaluate whether these findings can be generalized to Chinese women. Methodology/Principal Findings We evaluated the index single nucleotide polymorphisms (SNPs) in 19 GWAS-identified genetic susceptibility loci for ANM among 3,533 Chinese women who had natural menopause. We also investigated 3 additional SNPs which were in LD with the index SNP in European-ancestry but not in Asian-ancestry populations. Two genetic risk scores (GRS) were calculated to summarize SNPs across multiple loci: one for all SNPs tested (GRSall), and one for SNPs which showed association in our study (GRSsel). All 22 SNPs showed the same association direction as previously reported. Eight SNPs were nominally statistically significant with P ≤ 0.05: rs4246511 (RHBDL2), rs12461110 (NLRP11), rs2307449 (POLG), rs12611091 (BRSK1), rs1172822 (BRSK1), rs365132 (UIMC1), rs2720044 (ASH2L), and rs7246479 (TMEM150B). In particular, SNPs rs4246511, rs365132, rs1172822, and rs7246479 remained significant even after Bonferroni correction. Significant associations were observed for GRS. Women in the highest quartile began menopause 0.7 years (P = 3.24×10^-9) and 0.9 years (P = 4.61×10^-11) later than those in the lowest quartile for GRSsel and GRSall, respectively. Conclusions Among the 22 investigated SNPs, eight showed associations with ANM (P<0.05) in our Chinese population. Results from this study extend some recent GWAS findings to the Asian-ancestry population and may guide future efforts to identify genetic determination of menopause. PMID:23536822
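
    The abstract does not say how the two genetic risk scores were constructed, so the following is only a generic sketch of the usual idea: summing effect-allele counts across SNPs, optionally with per-SNP weights. The function name, the genotype coding (0, 1 or 2 effect alleles per SNP), and the example weights are assumptions and are not taken from the study.

```python
def genetic_risk_score(allele_counts, weights=None):
    """Sum effect-allele counts (0, 1 or 2 per SNP), optionally weighted per SNP.

    With no weights this is an unweighted count of effect alleles; with weights
    (for example per-SNP effect sizes) it is a weighted score.
    """
    if weights is None:
        weights = [1.0] * len(allele_counts)
    if len(weights) != len(allele_counts):
        raise ValueError("one weight per SNP is required")
    return sum(w * g for w, g in zip(weights, allele_counts))


# Hypothetical genotypes for one woman at four SNPs.
genotypes = [2, 1, 0, 1]
print(genetic_risk_score(genotypes))  # 4.0
print(genetic_risk_score(genotypes, weights=[0.5, 0.2, 0.1, 0.3]))
```

    Scores computed this way for each participant could then be divided into quartiles for the kind of highest-versus-lowest comparison reported above.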

  14. High-throughput SNPs for all: genotyping-in-thousands.

    PubMed

    Pavey, Scott A

    2015-07-01

    Understanding the genetic structure of species is essential for conservation. It is only with this information that managers, academics, user groups and land-use planners can understand the spatial scale of migration and local adaptation, source-sink dynamics and effective population size. Such information is essential for a multitude of applications including delineating management units, balancing management priorities, discovering cryptic species and implementing captive breeding programmes. Species can range from locally adapted by hundreds of metres (Pavey et al. ) to complete species panmixia (Côté et al. ). Even more remarkable is that this essential information can be obtained without fully sequenced or annotated genomes, but from mere (putatively) nonfunctional variants. First with allozymes, then microsatellites and now SNPs, this neutral genetic variation carries a wealth of information about migration and drift. For many of us, it may be somewhat difficult to remember our understanding of species conservation before the widespread usage of these useful tools. However most species on earth have yet to give us that 'peek under the curtain'. With the current diversity on earth estimated to be nearly 9 million species (Mora et al. ), we have a long way to go for a comprehensive meta-phylogeographic understanding. A method presented in this issue by Campbell and colleagues (Campbell et al. ) is a tool that will accelerate the pace in this area. Genotyping-in-thousands (GT-seq) leverages recent advancements in sequencing technology to save many hours and dollars over previous methods to generate this important neutral genetic information. PMID:26095005

  15. Identification of Novel Single Nucleotide Polymorphisms (SNPs) in Deer (Odocoileus spp.) Using the BovineSNP50

    E-print Network

    Latch, Emily K.

    ...) for identifying polymorphic SNPs in cervids Odocoileus hemionus (mule deer and black-tailed deer) and O...

  16. Biological Filters.

    ERIC Educational Resources Information Center

    Klemetson, S. L.

    1978-01-01

    Presents the 1978 literature review of wastewater treatment. The review is concerned with biological filters, and it covers: (1) trickling filters; (2) rotating biological contactors; and (3) miscellaneous reactors. A list of 14 references is also presented. (HM)

  17. Water Filters

    NASA Technical Reports Server (NTRS)

    1987-01-01

    A compact, lightweight electrolytic water filter generates silver ions in concentrations of 50 to 100 parts per billion in the water flow system. Silver ions serve as effective bactericide/deodorizers. Ray Ward requested and received from NASA a technical information package on the Shuttle filter, and used it as basis for his own initial development, a home use filter.

  18. SNP-Seek database of SNPs derived from 3000 rice genomes.

    PubMed

    Alexandrov, Nickolai; Tai, Shuaishuai; Wang, Wensheng; Mansueto, Locedie; Palis, Kevin; Fuentes, Roven Rommel; Ulat, Victor Jun; Chebotarov, Dmytro; Zhang, Gengyun; Li, Zhikang; Mauleon, Ramil; Hamilton, Ruaraidh Sackville; McNally, Kenneth L

    2015-01-01

    We have identified about 20 million rice SNPs by aligning reads from the 3000 rice genomes project with the Nipponbare genome. The SNPs and allele information are organized into the SNP-Seek system (http://www.oryzasnp.org/iric-portal/), which consists of an Oracle database holding close to 60 billion rows of SNP genotypes (20 M SNPs × 3 K rice lines) and a web interface for convenient querying. The database allows quick retrieval of SNP alleles for all varieties in a given genome region, finding different alleles from predefined varieties, and querying basic passport and morphological phenotypic information about sequenced rice lines. SNPs can be visualized together with the gene structures in the JBrowse genome browser. Evolutionary relationships between rice varieties can be explored using phylogenetic trees or multidimensional scaling plots. PMID:25429973

  19. F-SNP: computationally predicted functional SNPs for disease association studies.

    PubMed

    Lee, Phil Hyoun; Shatkay, Hagit

    2008-01-01

    The Functional Single Nucleotide Polymorphism (F-SNP) database integrates information obtained from 16 bioinformatics tools and databases about the functional effects of SNPs. These effects are predicted and indicated at the splicing, transcriptional, translational and post-translational level. As such, the database helps identify and focus on SNPs with potential deleterious effect to human health. In particular, users can retrieve SNPs that disrupt genomic regions known to be functional, including splice sites and transcriptional regulatory regions. Users can also identify non-synonymous SNPs that may have deleterious effects on protein structure or function, interfere with protein translation or impede post-translational modification. A web interface enables easy navigation for obtaining information through multiple starting points and exploration routes (e.g. starting from SNP identifier, genomic region, gene or target disease). The F-SNP database is available at http://compbio.cs.queensu.ca/F-SNP/. PMID:17986460

  20. SNP-Seek database of SNPs derived from 3000 rice genomes

    PubMed Central

    Alexandrov, Nickolai; Tai, Shuaishuai; Wang, Wensheng; Mansueto, Locedie; Palis, Kevin; Fuentes, Roven Rommel; Ulat, Victor Jun; Chebotarov, Dmytro; Zhang, Gengyun; Li, Zhikang; Mauleon, Ramil; Hamilton, Ruaraidh Sackville; McNally, Kenneth L.

    2015-01-01

    We have identified about 20 million rice SNPs by aligning reads from the 3000 rice genomes project with the Nipponbare genome. The SNPs and allele information are organized into the SNP-Seek system (http://www.oryzasnp.org/iric-portal/), which consists of an Oracle database holding close to 60 billion rows of SNP genotypes (20 M SNPs × 3 K rice lines) and a web interface for convenient querying. The database allows quick retrieval of SNP alleles for all varieties in a given genome region, finding different alleles from predefined varieties, and querying basic passport and morphological phenotypic information about sequenced rice lines. SNPs can be visualized together with the gene structures in the JBrowse genome browser. Evolutionary relationships between rice varieties can be explored using phylogenetic trees or multidimensional scaling plots. PMID:25429973

  1. A Multiethnic Replication Study of Plasma Lipoprotein Levels-Associated SNPs Identified in Recent GWAS

    PubMed Central

    Bryant, Emily K.; Dressen, Amy S.; Bunker, Clareann H.; Hokanson, John E.; Hamman, Richard F.; Kamboh, M. Ilyas; Demirci, F. Yesim

    2013-01-01

    Genome-wide association studies (GWAS) have identified a number of loci/SNPs associated with plasma total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglyceride (TG) levels. The purpose of this study was to replicate 40 recent GWAS-identified HDL-C-related new loci in 3 epidemiological samples comprising U.S. non-Hispanic Whites (NHWs), U.S. Hispanics, and African Blacks. In each sample, the association analyses were performed with all 4 major lipid traits regardless of previously reported specific associations with selected SNPs. A total of 22 SNPs showed nominally significant association (p<0.05) with at least one lipid trait in at least one ethnic group, although not always with the same lipid traits reported as genome-wide significant in the original GWAS. The total number of significant loci was 10 for TC, 12 for LDL-C, 10 for HDL-C, and 6 for TG levels. Ten SNPs were significantly associated with more than one lipid trait in at least one ethnic group. Six SNPs were significantly associated with at least one lipid trait in more than one ethnic group, although not always with the same trait across various ethnic groups. For 25 SNPs, the associations were replicated with the same genome-wide significant lipid traits in the same direction in at least one ethnic group; at nominal significance for 13 SNPs and with a trend for association for 12 SNPs. However, the associations were not consistently present in all ethnic groups. This observation was consistent with mixed results obtained in other studies that also examined various ethnic groups. PMID:23717430

  2. Genome-wide association analysis of canine atopic dermatitis and identification of disease related SNPs

    Microsoft Academic Search

    Shona Hiedi Wood; Xiayi Ke; Tim Nuttall; Neil McEwan; William E. Ollier; Stuart D. Carter

    2009-01-01

    In humans, genome-wide association studies (GWAS) have been shown to be an effective and thorough approach for identifying polymorphisms associated with disease phenotypes. Here, we describe the first study to perform a genome-wide association study in canine atopic dermatitis (cAD) using the Illumina Canine SNP20 array, containing 22,362 single-nucleotide polymorphisms (SNPs). The aim of the study was to identify SNPs

  3. Comprehensive Exploration of the Effects of miRNA SNPs on Monocyte Gene Expression

    PubMed Central

    Greliche, Nicolas; Zeller, Tanja; Wild, Philipp S.; Rotival, Maxime; Schillert, Arne; Ziegler, Andreas; Deloukas, Panos; Erdmann, Jeanette; Hengstenberg, Christian; Ouwehand, Willem H.; Samani, Nilesh J.; Schunkert, Heribert; Munzel, Thomas; Lackner, Karl J.; Cambien, François; Goodall, Alison H.; Tiret, Laurence; Blankenberg, Stefan; Trégouët, David-Alexandre; Attwood, Tony; Stephanie, Belz; Braund, Peter; Brocheton, Jessy; Cooper, Jason; Crisp-Hihn, Abi; Diemert, Patrick (formerly Linsel-Nitschke); Foad, Nicola; Godefroy, Tiphaine; Gracey, Jay; Gray, Emma; Gwilliams, Rhian; Heimerl, Susanne; Jolley, Jennifer; Krishnan, Unni; Lloyd-Jones, Heather; Liljedahl, Ulrika; Lugauer, Ingrid; Lundmark, Per; Maouche, Seraya; Moore, Jasbir S; Gilles, Montalescot; Muir, David; Murray, Elizabeth; Nelson, Chris P; Neudert, Jessica; Niblett, David; O’Leary, Karen; Pollard, Helen; Proust, Carole; Rankin, Angela; Rendon, Augusto; Rice, Catherine M; Sager, Hendrik; Sambrook, Jennifer; Gerd, Schmitz; Scholz, Michael; Schroeder, Laura; Stephens, Jonathan; Syvannen, Ann-Christine; Tennstedt, Stefanie (formerlyGulde); Wallace, Chris

    2012-01-01

    We aimed to assess whether pri-miRNA SNPs (miSNPs) could influence monocyte gene expression, either through marginal association or by interacting with polymorphisms located in 3'UTR regions (3utrSNPs). We then conducted a genome-wide search for marginal miSNPs effects and pairwise miSNPs × 3utrSNPs interactions in a sample of 1,467 individuals for which genome-wide monocyte expression and genotype data were available. Statistical associations that survived multiple testing correction were tested for replication in an independent sample of 758 individuals with both monocyte gene expression and genotype data. In both studies, the hsa-mir-1279 rs1463335 was found to modulate in cis the expression of LYZ and in trans the expression of CNTN6, CTRC, COPZ2, KRT9, LRRFIP1, NOD1, PCDHA6, ST5 and TRAF3IP2 genes, supporting the role of hsa-mir-1279 as a regulator of several genes in monocytes. In addition, we identified two robust miSNPs × 3utrSNPs interactions, one involving HLA-DPB1 rs1042448 and hsa-mir-219-1 rs107822, the second the H1F0 rs1894644 and hsa-mir-659 rs5750504, modulating the expression of the associated genes. As some of the aforementioned genes have previously been reported to reside at disease-associated loci, our findings provide novel arguments supporting the hypothesis that the genetic variability of miRNAs could also contribute to the susceptibility to human diseases. PMID:23029284

  4. TLR4 single nucleotide polymorphisms (SNPs) associated with Salmonella shedding in pigs.

    PubMed

    Kich, Jalusa Deon; Uthe, Jolita Janutenaite; Benavides, Magda Vieira; Cantão, Maurício Egídio; Zanella, Ricardo; Tuggle, Christopher Keith; Bearson, Shawn Michelle Dunkin

    2014-05-01

    Toll-like receptor 4 (TLR4) is a key factor in the innate immune recognition of lipopolysaccharide (LPS) from Gram-negative bacteria. Previous studies from our group identified differences in the expression profile of TLR4 and genes affected by the TLR4 signaling pathway among pigs that shed varying levels of Salmonella, a Gram-negative bacterium. Therefore, genetic variation in this gene may be involved with the host's immune response to bacterial infections. The current study screened for single nucleotide polymorphisms (SNPs) in the TLR4 gene and tested their association with Salmonella fecal shedding. Pigs (n = 117) were intranasally challenged at 7 weeks of age with 1 × 10^9 CFU of S. Typhimurium χ4232 and were classified as low or persistent Salmonella shedders based on the levels of Salmonella being excreted in fecal material. Salmonella fecal shedding was determined by quantitative bacteriology on days 2, 7, 14, and 20/21 post exposure, and the cumulative levels of Salmonella were calculated to identify the low (n = 20) and persistent (n = 20) Salmonella shedder pigs. From those 40 animals, the TLR4 region was sequenced, and 18 single nucleotide polymorphisms (SNPs) in TLR4 were identified. Twelve SNPs have been previously described and six are novel SNPs of which five are in the 5' untranslated region and one is in intron 2. Single marker association test identified 13 SNPs associated with the qualitative trait of Salmonella fecal shedding, and seven of those SNPs were also associated with a quantitative measurement of fecal shedding (P ... SNPs rs80787918 and rs80907449 (P ≤ 4.0 × 10^-3) spanning a region of 4.9 Kb was identified, thereby providing additional information of the influence of those SNPs on Salmonella fecal shedding in pigs. PMID:24566961

  5. FTO gene SNPs associated with extreme obesity in cases, controls and extremely discordant sister pairs

    Microsoft Academic Search

    R Arlen Price; Wei-Dong Li; Hongyu Zhao

    2008-01-01

    BACKGROUND: FTO is a gene located in chromosome region 16q12.2. Recently two studies have found associations of several single nucleotide polymorphisms (SNPs) in FTO with body mass index (BMI) and obesity, particularly rs1421085, rs17817449, and rs9939609. METHODS: We examined these three SNPs in 583 extremely obese women with current BMI greater than 35 kg/m2 and lifetime BMI greater than 40

  6. A preliminary study of active compared with passive imputation of missing body mass index values among non-Hispanic white youths

    PubMed Central

    Wagstaff, David A; Kranz, Sibylle; Harel, Ofer

    2009-01-01

    Background: Addressing missing data on body weight, height, or both is a challenge many researchers face. In calculating the body mass index (BMI) of study participants, researchers need to impute the missing data. Objective: A multiple imputation through a chained equations approach was used to determine whether one should first impute the missing anthropometric data and then calculate BMI or use an imputation model to obtain BMI. Design: The present study used computer simulation to address the question of how to calculate BMI when there is missing data on weight and height. The simulated data reflected data gathered on non-Hispanic white youths (n = 905) aged 2–18 y, who participated in the 1999–2000 National Health and Nutrition Examination Survey (NHANES). Results: The simulation indicated that it made little difference in the accuracy with which the youths' mean BMIs were estimated when the data were missing completely at random. However, the use of a model to impute BMI was favored slightly when the data were missing at random and the imputation model included the variable used to determine missingness. Conclusion: The present findings extend the use of passive imputation and the use of multiple imputation through a chained equations approach to an area of critical public health importance. PMID:19244364
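
    The two orderings compared in this abstract (impute weight and height and then derive BMI, versus impute BMI itself) can be contrasted with a deliberately simplified sketch. It assumes plain mean imputation in place of the chained-equations multiple imputation the study actually used, and every name and value below is hypothetical; it only shows that the two orderings can give different values for the same record.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def mean_fill(values):
    """Replace None with the mean of the observed values (stand-in for a real imputation model)."""
    observed = [v for v in values if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in values]


def bmi_from_imputed_components(weights, heights):
    """Impute weight and height first, then derive BMI from the completed values."""
    return [bmi(w, h) for w, h in zip(mean_fill(weights), mean_fill(heights))]


def bmi_imputed_directly(weights, heights):
    """Derive BMI where both components are observed, then impute BMI itself."""
    raw = [None if w is None or h is None else bmi(w, h)
           for w, h in zip(weights, heights)]
    return mean_fill(raw)


# Toy data: the third youth has a missing weight.
weights = [20.0, 45.0, None]
heights = [1.10, 1.50, 1.30]
from_components = bmi_from_imputed_components(weights, heights)
direct = bmi_imputed_directly(weights, heights)
```

    For the third record the first route gives roughly 19.2 kg/m^2 and the second roughly 18.3 kg/m^2, which is the kind of discrepancy the simulation in the abstract was designed to evaluate.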

  7. Pierre Courrieu & Arnaud Rey / Missing Data 1/47 Missing Data Imputation and Corrected Statistics for Large-Scale Behavioral Databases

    E-print Network

    Paris-Sud XI, Université de

    ... large-scale item performance behavioral databases. Useful statistics corrected for missing data are described ...

  8. Allelic expression mapping across cellular lineages to establish impact of non-coding SNPs

    PubMed Central

    Adoue, Veronique; Schiavi, Alicia; Light, Nicholas; Almlöf, Jonas Carlsson; Lundmark, Per; Ge, Bing; Kwan, Tony; Caron, Maxime; Rönnblom, Lars; Wang, Chuan; Chen, Shu-Huang; Goodall, Alison H; Cambien, Francois; Deloukas, Panos; Ouwehand, Willem H; Syvänen, Ann-Christine; Pastinen, Tomi

    2014-01-01

    Most complex disease-associated genetic variants are located in non-coding regions and are therefore thought to be regulatory in nature. Association mapping of differential allelic expression (AE) is a powerful method to identify SNPs with direct cis-regulatory impact (cis-rSNPs). We used AE mapping to identify cis-rSNPs regulating gene expression in 55 and 63 HapMap lymphoblastoid cell lines from a Caucasian and an African population, respectively, 70 fibroblast cell lines, and 188 purified monocyte samples and found 40–60% of these cis-rSNPs to be shared across cell types. We uncover a new class of cis-rSNPs, which disrupt footprint-derived de novo motifs that are predominantly bound by repressive factors and are implicated in disease susceptibility through overlaps with GWAS SNPs. Finally, we provide the proof-of-principle for a new approach for genome-wide functional validation of transcription factor–SNP interactions. By perturbing NFκB action in lymphoblasts, we identified 489 cis-regulated transcripts with altered AE after NFκB perturbation. Altogether, we perform a comprehensive analysis of cis-variation in four cell populations and provide new tools for the identification of functional variants associated with complex diseases. PMID:25326100

  9. All SNPs are not created equal: genome-wide association studies reveal a consistent pattern of enrichment among functionally annotated SNPs.

    PubMed

    Schork, Andrew J; Thompson, Wesley K; Pham, Phillip; Torkamani, Ali; Roddey, J Cooper; Sullivan, Patrick F; Kelsoe, John R; O'Donovan, Michael C; Furberg, Helena; Schork, Nicholas J; Andreassen, Ole A; Dale, Anders M

    2013-04-01

    Recent results indicate that genome-wide association studies (GWAS) have the potential to explain much of the heritability of common complex phenotypes, but methods are lacking to reliably identify the remaining associated single nucleotide polymorphisms (SNPs). We applied stratified False Discovery Rate (sFDR) methods to leverage genic enrichment in GWAS summary statistics data to uncover new loci likely to replicate in independent samples. Specifically, we use linkage disequilibrium-weighted annotations for each SNP in combination with nominal p-values to estimate the True Discovery Rate (TDR = 1-FDR) for strata determined by different genic categories. We show a consistent pattern of enrichment of polygenic effects in specific annotation categories across diverse phenotypes, with the greatest enrichment for SNPs tagging regulatory and coding genic elements, little enrichment in introns, and negative enrichment for intergenic SNPs. Stratified enrichment directly leads to increased TDR for a given p-value, mirrored by increased replication rates in independent samples. We show this in independent Crohn's disease GWAS, where we find a hundredfold variation in replication rate across genic categories. Applying a well-established sFDR methodology we demonstrate the utility of stratification for improving power of GWAS in complex phenotypes, with increased rejection rates from 20% in height to 300% in schizophrenia with traditional FDR and sFDR both fixed at 0.05. Our analyses demonstrate an inherent stratification among GWAS SNPs with important conceptual implications that can be leveraged by statistical methods to improve the discovery of loci. PMID:23637621
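
    The general idea of stratified FDR control can be sketched as running a standard FDR procedure separately within each annotation stratum. The sketch below assumes the Benjamini-Hochberg step-up procedure within strata; it does not reproduce the linkage disequilibrium-weighted annotations or the TDR estimation described in the abstract, and the stratum labels and p-values are made up.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return one boolean per p-value: True where the BH step-up procedure rejects."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            cutoff = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[i] = True
    return rejected


def stratified_fdr(pvalues, strata, alpha=0.05):
    """Apply BH separately within each stratum, each at the same level alpha."""
    rejected = [False] * len(pvalues)
    for label in set(strata):
        idx = [i for i, s in enumerate(strata) if s == label]
        sub = benjamini_hochberg([pvalues[i] for i in idx], alpha)
        for i, r in zip(idx, sub):
            rejected[i] = r
    return rejected


# Toy data: a small enriched stratum and a larger stratum of null-looking SNPs.
pvalues = [0.01, 0.02, 0.03, 0.6, 0.7, 0.8, 0.9, 0.5]
strata = ["coding"] * 3 + ["intergenic"] * 5
pooled = benjamini_hochberg(pvalues)
stratified = stratified_fdr(pvalues, strata)
```

    In this toy example the pooled analysis rejects nothing, while the stratified analysis rejects all three SNPs in the enriched stratum, which mirrors the power gain the abstract attributes to stratification.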

  10. All SNPs Are Not Created Equal: Genome-Wide Association Studies Reveal a Consistent Pattern of Enrichment among Functionally Annotated SNPs

    PubMed Central

    Schork, Andrew J.; Thompson, Wesley K.; Pham, Phillip; Torkamani, Ali; Roddey, J. Cooper; Sullivan, Patrick F.; Kelsoe, John R.; O'Donovan, Michael C.; Furberg, Helena; Schork, Nicholas J.; Andreassen, Ole A.; Dale, Anders M.

    2013-01-01

    Recent results indicate that genome-wide association studies (GWAS) have the potential to explain much of the heritability of common complex phenotypes, but methods are lacking to reliably identify the remaining associated single nucleotide polymorphisms (SNPs). We applied stratified False Discovery Rate (sFDR) methods to leverage genic enrichment in GWAS summary statistics data to uncover new loci likely to replicate in independent samples. Specifically, we use linkage disequilibrium-weighted annotations for each SNP in combination with nominal p-values to estimate the True Discovery Rate (TDR = 1-FDR) for strata determined by different genic categories. We show a consistent pattern of enrichment of polygenic effects in specific annotation categories across diverse phenotypes, with the greatest enrichment for SNPs tagging regulatory and coding genic elements, little enrichment in introns, and negative enrichment for intergenic SNPs. Stratified enrichment directly leads to increased TDR for a given p-value, mirrored by increased replication rates in independent samples. We show this in independent Crohn's disease GWAS, where we find a hundredfold variation in replication rate across genic categories. Applying a well-established sFDR methodology we demonstrate the utility of stratification for improving power of GWAS in complex phenotypes, with increased rejection rates from 20% in height to 300% in schizophrenia with traditional FDR and sFDR both fixed at 0.05. Our analyses demonstrate an inherent stratification among GWAS SNPs with important conceptual implications that can be leveraged by statistical methods to improve the discovery of loci. PMID:23637621

  11. Data Imputation in Epistatic MAPs by Network-Guided Matrix Completion.

    PubMed

    Žitnik, Marinka; Zupan, Blaž

    2015-06-01

    Epistatic miniarray profile (E-MAP) is a popular large-scale genetic interaction discovery platform. E-MAPs benefit from quantitative output, which makes it possible to detect subtle interactions with greater precision. However, due to the limits of biotechnology, E-MAP studies fail to measure genetic interactions for up to 40% of gene pairs in an assay. Missing measurements can be recovered by computational techniques for data imputation, in this way completing the interaction profiles and enabling downstream analysis algorithms that could otherwise be sensitive to missing data values. We introduce a new interaction data imputation method called network-guided matrix completion (NG-MC). The core part of NG-MC is low-rank probabilistic matrix completion that incorporates prior knowledge presented as a collection of gene networks. NG-MC assumes that interactions are transitive, such that latent gene interaction profiles inferred by NG-MC depend on the profiles of their direct neighbors in gene networks. As the NG-MC inference algorithm progresses, it propagates latent interaction profiles through each of the networks and updates gene network weights toward improved prediction. In a study with four different E-MAP data assays and considered protein-protein interaction and gene ontology similarity networks, NG-MC significantly surpassed existing alternative techniques. Inclusion of information from gene networks also allowed NG-MC to predict interactions for genes that were not included in original E-MAP assays, a task that could not be considered by current imputation approaches. PMID:25658751

  12. Using latent variable modeling and multiple imputation to calibrate rater bias in diagnosis assessment.

    PubMed

    Siddique, Juned; Crespi, Catherine M; Gibbons, Robert D; Green, Bonnie L

    2011-01-30

    We present an approach that uses latent variable modeling and multiple imputation to correct rater bias when one group of raters tends to be more lenient in assigning a diagnosis than another. Our method assumes that there exists an unobserved moderate category of patient who is assigned a positive diagnosis by one type of rater and a negative diagnosis by the other type. We present a Bayesian random effects censored ordinal probit model that allows us to calibrate the diagnoses across rater types by identifying and multiply imputing 'case' or 'non-case' status for patients in the moderate category. A Markov chain Monte Carlo algorithm is presented to estimate the posterior distribution of the model parameters and generate multiple imputations. Our method enables the calibrated diagnosis variable to be used in subsequent analyses while also preserving uncertainty in true diagnosis. We apply our model to diagnoses of posttraumatic stress disorder (PTSD) from a depression study where nurse practitioners were twice as likely as clinical psychologists to diagnose PTSD despite the fact that participants were randomly assigned to either a nurse or a psychologist. Our model appears to balance PTSD rates across raters, provides a good fit to the data, and preserves between-rater variability. After calibrating the diagnoses of PTSD across rater types, we perform an analysis looking at the effects of comorbid PTSD on changes in depression scores over time. Results are compared with an analysis that uses the original diagnoses and show that calibrating the PTSD diagnoses can yield different inferences. PMID:21204122

  13. Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme

    PubMed Central

    Wang, Xian; Li, Ao; Jiang, Zhaohui; Feng, Huanqing

    2006-01-01

    Background Gene expression profiling has become a useful biological resource in recent years, and it plays an important role in a broad range of areas in biology. The raw gene expression data, usually in the form of a large matrix, may contain missing values. The downstream analysis methods that postulate complete matrix input are thus not applicable. Several methods have been developed to solve this problem, such as K nearest neighbor impute method, Bayesian principal components analysis impute method, etc. In this paper, we introduce a novel imputing approach based on the Support Vector Regression (SVR) method. The proposed approach utilizes an orthogonal coding input scheme, which makes use of multi-missing values in one row of a certain gene expression profile and imputes the missing value into a much higher dimensional space, to obtain better performance. Results A comparative study of our method with the previously developed methods has been presented for the estimation of the missing values on six gene expression data sets. Among the three different input-vector coding schemes we tried, the orthogonal input coding scheme obtains the best estimation results with the minimum Normalized Root Mean Squared Error (NRMSE). The results also demonstrate that the SVR method has powerful estimation ability on different kinds of data sets with relatively small NRMSE. Conclusion The SVR impute method shows better performance than, or at least comparable with, the previously developed methods in present research. The outstanding estimation ability of this impute method is partly due to the use of the most missing value information by incorporating orthogonal input coding scheme. In addition, the solid theoretical foundation of SVR method also helps in estimation of performance together with orthogonal input coding scheme. The promising estimation ability demonstrated in the results section suggests that the proposed approach provides a proper solution to the missing value estimation problem. The source code of the SVR method is available from for non-commercial use. PMID:16426462
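
    The NRMSE criterion mentioned in this abstract can be computed along the following lines. The abstract does not spell out the normalization, so this sketch assumes one common definition from the missing-value literature: root mean squared error over the artificially removed entries, divided by the standard deviation of their true values. The function name and data are illustrative.

```python
from math import sqrt
from statistics import pstdev


def nrmse(imputed, truth):
    """RMSE between imputed and true values, normalized by the spread of the true values.

    Assumed definition: sqrt(mean((imputed - truth)^2)) / std(truth).
    """
    if len(imputed) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(imputed, truth)) / len(truth)
    return sqrt(mse) / pstdev(truth)


# Toy data: true expression values at positions that were masked for evaluation.
truth = [1.0, 2.0, 3.0, 4.0]
perfect = nrmse(truth, truth)               # exact recovery
mean_guess = nrmse([2.5, 2.5, 2.5, 2.5], truth)  # always guessing the mean
```

    Under this definition a perfect imputation scores 0 and always guessing the mean scores about 1, so values well below 1 indicate that a method recovers real structure. The function fails with a division by zero if all true values are identical.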

  14. The estimation and use of predictions for the assessment of model performance using large samples with multiply imputed data.

    PubMed

    Wood, Angela M; Royston, Patrick; White, Ian R

    2015-07-01

    Multiple imputation can be used as a tool in the process of constructing prediction models in medical and epidemiological studies with missing covariate values. Such models can be used to make predictions for model performance assessment, but the task is made more complicated by the multiple imputation structure. We summarize various predictions constructed from covariates, including multiply imputed covariates, and either the set of imputation-specific prediction model coefficients or the pooled prediction model coefficients. We further describe approaches for using the predictions to assess model performance. We distinguish between ideal model performance and pragmatic model performance, where the former refers to the model's performance in an ideal clinical setting where all individuals have fully observed predictors and the latter refers to the model's performance in a real-world clinical setting where some individuals have missing predictors. The approaches are compared through an extensive simulation study based on the UK700 trial. We determine that measures of ideal model performance can be estimated within imputed datasets and subsequently pooled to give an overall measure of model performance. Alternative methods to evaluate pragmatic model performance are required and we propose constructing predictions either from a second set of covariate imputations which make no use of observed outcomes, or from a set of partial prediction models constructed for each potential observed pattern of covariate. Pragmatic model performance is generally lower than ideal model performance. We focus on model performance within the derivation data, but describe how to extend all the methods to a validation dataset. PMID:25630926
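
    The step of estimating a quantity within each imputed dataset and then pooling can be sketched with Rubin's rules, the standard combining rules for multiple imputation. The abstract does not state which pooling rule the authors applied to each performance measure, so this is an assumption, and the estimates and variances below are invented for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules.

    Returns (pooled estimate, total variance), where
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Toy data: a performance measure estimated in three imputed datasets.
estimate, total_variance = pool_rubin([0.70, 0.72, 0.74], [0.0004, 0.0004, 0.0004])
```

    Measures that are bounded or skewed are often transformed before pooling and back-transformed afterwards; the sketch omits that step.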

  15. Missing data imputation on the 5-year survival prediction of breast cancer patients with unknown discrete values.

    PubMed

    García-Laencina, Pedro J; Abreu, Pedro Henriques; Abreu, Miguel Henriques; Afonoso, Noémia

    2015-04-01

    Breast cancer is the most frequently diagnosed cancer in women. Using historical patient information stored in clinical datasets, data mining and machine learning approaches can be applied to predict the survival of breast cancer patients. A common drawback is the absence of information, i.e., missing data, in certain clinical trials. Most standard prediction methods cannot handle incomplete samples, so missing data imputation is widely applied to address this problem. Taking into account the characteristics of each breast cancer dataset, a detailed analysis is therefore required to determine the most appropriate imputation and prediction methods in each clinical environment. This research work analyzes a real breast cancer dataset from the Portuguese Institute of Oncology of Porto with a high percentage of unknown categorical information (most clinical data of the patients are incomplete), which is a challenge in terms of complexity. Four scenarios are evaluated: (I) 5-year survival prediction without imputation and 5-year survival prediction from cleaned dataset with (II) Mode imputation, (III) Expectation-Maximization imputation and (IV) K-Nearest Neighbors imputation. Prediction models for breast cancer survivability are constructed using four different methods: K-Nearest Neighbors, Classification Trees, Logistic Regression and Support Vector Machines. Experiments are performed in a nested ten-fold cross-validation procedure and, according to the obtained results, the best results are provided by the K-Nearest Neighbors algorithm: an accuracy above 81% and an area under the Receiver Operating Characteristic curve above 0.78, which are very good results in this complex scenario. PMID:25725446
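
    Of the imputation scenarios listed in this abstract, mode imputation is the simplest to illustrate: each missing categorical value is replaced by the most frequent observed value in the same column. The sketch below is a minimal version under the assumption that missing entries are encoded as None; the column name and values are hypothetical and not taken from the study's dataset.

```python
from collections import Counter


def mode_impute(column):
    """Replace None entries with the most frequent observed value in the column."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    # most_common breaks ties in favor of the value encountered first.
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in column]


# Toy data: a hypothetical receptor-status column with two unknown entries.
receptor_status = ["pos", None, "neg", "pos", None]
completed = mode_impute(receptor_status)
```

    Mode imputation ignores relationships between variables, which is one reason the abstract also evaluates Expectation-Maximization and K-Nearest Neighbors imputation.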

  16. A comparison of selected parametric and imputation methods for estimating snag density and snag quality attributes

    USGS Publications Warehouse

    Eskelson, Bianca N.I.; Hagar, Joan; Temesgen, Hailemariam

    2012-01-01

    Snags (standing dead trees) are an essential structural component of forests. Because wildlife use of snags depends on size and decay stage, snag density estimation without any information about snag quality attributes is of little value for wildlife management decision makers. Little work has been done to develop models that allow multivariate estimation of snag density by snag quality class. Using climate, topography, Landsat TM data, stand age and forest type collected for 2356 forested Forest Inventory and Analysis plots in western Washington and western Oregon, we evaluated two multivariate techniques for their abilities to estimate density of snags by three decay classes. The density of live trees and snags in three decay classes (D1: recently dead, little decay; D2: decay, without top, some branches and bark missing; D3: extensive decay, missing bark and most branches) with diameter at breast height (DBH) ≥ 12.7 cm was estimated using a nonparametric random forest nearest neighbor imputation technique (RF) and a parametric two-stage model (QPORD), for which the number of trees per hectare was estimated with a Quasipoisson model in the first stage and the probability of belonging to a tree status class (live, D1, D2, D3) was estimated with an ordinal regression model in the second stage. The presence of large snags with DBH ≥ 50 cm was predicted using a logistic regression and RF imputation. Because of the more homogenous conditions on private forest lands, snag density by decay class was predicted with higher accuracies on private forest lands than on public lands, while presence of large snags was more accurately predicted on public lands, owing to the higher prevalence of large snags on public lands. RF outperformed the QPORD model in terms of percent accurate predictions, while QPORD provided smaller root mean square errors in predicting snag density by decay class. The logistic regression model achieved more accurate presence/absence classification of large snags than the RF imputation approach. Adjusting the decision threshold to account for unequal size for presence and absence classes is more straightforward for the logistic regression than for the RF imputation approach. Overall, model accuracies were poor in this study, which can be attributed to the poor predictive quality of the explanatory variables and the large range of forest types and geographic conditions observed in the data.

  17. Imputation by the mean score should be avoided when validating a Patient Reported Outcomes questionnaire by a Rasch model in presence of informative missing data

    PubMed Central

    2011-01-01

    Background Nowadays, more and more clinical scales consisting of responses given by the patients to some items (Patient Reported Outcomes - PRO) are validated with models based on Item Response Theory, and more specifically, with a Rasch model. In the validation sample, missing data are frequent. The aim of this paper is to compare sixteen methods for handling the missing data (mainly based on simple imputation) in the context of psychometric validation of PRO by a Rasch model. The main indexes used for validation by a Rasch model are compared. Methods A simulation study was performed that considered several cases, notably whether or not the missing values are informative, and the rate of missing data. Results Several imputation methods produce bias on psychometrical indexes (generally, the imputation methods artificially improve the psychometric qualities of the scale). In particular, this is the case with the method based on the Personal Mean Score (PMS), which is the most commonly used imputation method in practice. Conclusions Several imputation methods should be avoided, in particular PMS imputation. From a general point of view, it is important to use an imputation method that considers both the ability of the patient (measured for example by his/her score), and the difficulty of the item (measured for example by its rate of favourable responses). Another recommendation is to always consider the addition of a random process in the imputation method, because such a process reduces the bias. Last, the analysis carried out without imputation of the missing data (available case analyses) is an interesting alternative to the simple imputation in this context. PMID:21756330
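
    For orientation, Personal Mean Score imputation is commonly understood as filling each missing item with the mean of the same respondent's observed items. The sketch below assumes that definition (the abstract does not spell it out) and uses hypothetical dichotomous responses with None marking a missing item.

```python
def personal_mean_score_impute(responses):
    """Fill each missing item (None) with the mean of that respondent's observed items."""
    completed = []
    for row in responses:
        observed = [x for x in row if x is not None]
        if not observed:
            # Nothing to average for this respondent; leave the row unchanged.
            completed.append(list(row))
            continue
        person_mean = sum(observed) / len(observed)
        completed.append([person_mean if x is None else x for x in row])
    return completed

# Hypothetical responses to a four-item scale (1 = favourable, 0 = unfavourable).
scale = [[1, 0, None, 1], [0, None, None, 0]]
filled = personal_mean_score_impute(scale)
```

    The sketch makes the abstract's criticism concrete: the imputed value depends only on the respondent (a proxy for ability) and not on how difficult the missing item is, and it contains no random component.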

  18. Estimating the effect of multiple imputation on incomplete longitudinal data with application to a randomized clinical study.

    PubMed

    Fong, Daniel Y T; Rai, Shesh N; Lam, Karen S L

    2013-01-01

    For analyzing incomplete longitudinal data, there has been recent interest in comparing estimates with and without the use of multiple imputation along with mixed effects model and generalized estimating equations. Empirically, the additional use of multiple imputation generally led to overestimated variances and may yield more heavily biased estimates than the use of last observation carried forward. Under ignorable or nonignorable missing values, a mixed effects model or generalized estimating equations alone yielded more unbiased estimates. The different methods were also assessed in a randomized controlled clinical trial. PMID:23957512

  19. Filtering apparatus

    DOEpatents

    Haldipur, Gaurang B. (Monroeville, PA); Dilmore, William J. (Murrysville, PA)

    1992-01-01

    A vertical vessel having a lower inlet and an upper outlet enclosure separated by a main horizontal tube sheet. The inlet enclosure receives the flue gas from a boiler of a power system and the outlet enclosure supplies cleaned gas to the turbines. The inlet enclosure contains a plurality of particulate-removing clusters, each having a plurality of filter units. Each filter unit includes a filter clean-gas chamber defined by a plate and a perforated auxiliary tube sheet with filter tubes suspended from each tube sheet and a tube connected to each chamber for passing cleaned gas to the outlet enclosure. The clusters are suspended from the main tube sheet with their filter units extending vertically and the filter tubes passing through the tube sheet and opening in the outlet enclosure. The flue gas is circulated about the outside surfaces of the filter tubes and the particulate is absorbed in the pores of the filter tubes. Pulses to clean the filter tubes are passed through their inner holes through tubes free of bends which are aligned with the tubes that pass the clean gas.

  20. Filtering apparatus

    DOEpatents

    Haldipur, G.B.; Dilmore, W.J.

    1992-09-01

    A vertical vessel is described having a lower inlet and an upper outlet enclosure separated by a main horizontal tube sheet. The inlet enclosure receives the flue gas from a boiler of a power system and the outlet enclosure supplies cleaned gas to the turbines. The inlet enclosure contains a plurality of particulate-removing clusters, each having a plurality of filter units. Each filter unit includes a filter clean-gas chamber defined by a plate and a perforated auxiliary tube sheet with filter tubes suspended from each tube sheet and a tube connected to each chamber for passing cleaned gas to the outlet enclosure. The clusters are suspended from the main tube sheet with their filter units extending vertically and the filter tubes passing through the tube sheet and opening in the outlet enclosure. The flue gas is circulated about the outside surfaces of the filter tubes and the particulate is absorbed in the pores of the filter tubes. Pulses to clean the filter tubes are passed through their inner holes through tubes free of bends which are aligned with the tubes that pass the clean gas. 18 figs.

  1. Imputation method adjusted for covariates for nonrespondents in instruments with applications.

    PubMed

    Li, Juan; Chi, Eric M; Feng, Chunyao; Chow, Shein-Chung

    2011-03-01

    In clinical research, measurement instruments (or questionnaires) consisting of a number of items (questions) are often used to assess treatment effect, e.g., quality-of-life assessment, and clinical disease activity index. In many situations, instead of an individual component, it is of interest to provide an assessment of the treatment effect in some overall measures, e.g., subscale or total score. In practice, these types of data often suffer from incompleteness. A common method is to simply ignore all the item nonrespondents from the analysis. Although this method is statistically valid under the assumption of missing completely at random (MCAR), it suffers from decreasing power/efficiency. In this paper, we propose a regression imputation approach adjusted for covariates with item nonrespondents in the instrument. The proposed method provides consistent estimators, which are asymptotically normal. A bootstrap procedure is also proposed to estimate the asymptotic variance of the derived estimators. A simulation study was conducted to study the finite samples performance of the derived estimators. It is also shown that the estimators based on the imputed data set are more efficient than the estimators based on the completers only. The proposed methodology was illustrated through two applications in observational studies. PMID:21391006

  2. Genome-Wide Association Analysis of Imputed Rare Variants: Application to Seven Common Complex Diseases

    PubMed Central

    Mägi, Reedik; Asimit, Jennifer L; Day-Williams, Aaron G; Zeggini, Eleftheria; Morris, Andrew P

    2012-01-01

    Genome-wide association studies have been successful in identifying loci contributing effects to a range of complex human traits. The majority of reproducible associations within these loci are with common variants, each of modest effect, which together explain only a small proportion of heritability. It has been suggested that much of the unexplained genetic component of complex traits can thus be attributed to rare variation. However, genome-wide association study genotyping chips have been designed primarily to capture common variation, and thus are underpowered to detect the effects of rare variants. Nevertheless, we demonstrate here, by simulation, that imputation from an existing scaffold of genome-wide genotype data up to high-density reference panels has the potential to identify rare variant associations with complex traits, without the need for costly re-sequencing experiments. By application of this approach to genome-wide association studies of seven common complex diseases, imputed up to publicly available reference panels, we identify genome-wide significant evidence of rare variant association in PRDM10 with coronary artery disease and multiple genes in the major histocompatibility complex (MHC) with type 1 diabetes. The results of our analyses highlight that genome-wide association studies have the potential to offer an exciting opportunity for gene discovery through association with rare variants, conceivably leading to substantial advancements in our understanding of the genetic architecture underlying complex human traits. PMID:22951892

  3. Subspace Learning and Imputation for Streaming Big Data Matrices and Tensors

    NASA Astrophysics Data System (ADS)

    Mardani, Morteza; Mateos, Gonzalo; Giannakis, Georgios B.

    2015-05-01

    Extracting latent low-dimensional structure from high-dimensional data is of paramount importance in timely inference tasks encountered with 'Big Data' analytics. However, increasingly noisy, heterogeneous, and incomplete datasets as well as the need for real-time processing of streaming data pose major challenges to this end. In this context, the present paper permeates benefits from rank minimization to scalable imputation of missing data, via tracking low-dimensional subspaces and unraveling latent (possibly multi-way) structure from incomplete streaming data. For low-rank matrix data, a subspace estimator is proposed based on an exponentially-weighted least-squares criterion regularized with the nuclear norm. After recasting the non-separable nuclear norm into a form amenable to online optimization, real-time algorithms with complementary strengths are developed and their convergence is established under simplifying technical assumptions. In a stationary setting, the asymptotic estimates obtained offer the well-documented performance guarantees of the batch nuclear-norm regularized estimator. Under the same unifying framework, a novel online (adaptive) algorithm is developed to obtain multi-way decompositions of low-rank tensors with missing entries, and perform imputation as a byproduct. Simulated tests with both synthetic as well as real Internet and cardiac magnetic resonance imagery (MRI) data confirm the efficacy of the proposed algorithms, and their superior performance relative to state-of-the-art alternatives.

  4. SNPs located at CpG sites modulate genome-epigenome interaction.

    PubMed

    Zhi, Degui; Aslibekyan, Stella; Irvin, Marguerite R; Claas, Steven A; Borecki, Ingrid B; Ordovas, Jose M; Absher, Devin M; Arnett, Donna K

    2013-08-01

    DNA methylation is an important molecular-level phenotype that links genotypes and complex disease traits. Previous studies have found local correlation between genetic variants and DNA methylation levels (cis-meQTLs). However, general mechanisms underlying cis-meQTLs are unclear. We conducted a cis-meQTL analysis of the Genetics of Lipid Lowering Drugs and Diet Network data (n = 593). We found that over 80% of genetic variants at CpG sites (meSNPs) are meQTL loci (P-value < 10^-9), and meSNPs account for over two thirds of the strongest meQTL signals (P-value < 10^-200). Beyond direct effects on the methylation of the meSNP site, the CpG-disrupting allele of meSNPs were associated with lowered methylation of CpG sites located within 45 bp. The effect of meSNPs extends to as far as 10 kb and can contribute to the observed meQTL signals in the surrounding region, likely through correlated methylation patterns and linkage disequilibrium. Therefore, meSNPs are behind a large portion of observed meQTL signals and play a crucial role in the biological process linking genetic variation to epigenetic changes. PMID:23811543

  5. SNPs located at CpG sites modulate genome-epigenome interaction

    PubMed Central

    Zhi, Degui; Aslibekyan, Stella; Irvin, Marguerite R; Claas, Steven A; Borecki, Ingrid B; Ordovas, Jose M; Absher, Devin M; Arnett, Donna K

    2013-01-01

    DNA methylation is an important molecular-level phenotype that links genotypes and complex disease traits. Previous studies have found local correlation between genetic variants and DNA methylation levels (cis-meQTLs). However, general mechanisms underlying cis-meQTLs are unclear. We conducted a cis-meQTL analysis of the Genetics of Lipid Lowering Drugs and Diet Network data (n = 593). We found that over 80% of genetic variants at CpG sites (meSNPs) are meQTL loci (P-value < 10^-9), and meSNPs account for over two thirds of the strongest meQTL signals (P-value < 10^-200). Beyond direct effects on the methylation of the meSNP site, the CpG-disrupting allele of meSNPs were associated with lowered methylation of CpG sites located within 45 bp. The effect of meSNPs extends to as far as 10 kb and can contribute to the observed meQTL signals in the surrounding region, likely through correlated methylation patterns and linkage disequilibrium. Therefore, meSNPs are behind a large portion of observed meQTL signals and play a crucial role in the biological process linking genetic variation to epigenetic changes. PMID:23811543

  6. Partition dataset according to amino acid type improves the prediction of deleterious non-synonymous SNPs

    SciTech Connect

    Yang, Jing; Li, Yuan-Yuan [School of Biotechnology, East China University of Science and Technology, Shanghai 200237 (China); Shanghai Center for Bioinformation Technology, Shanghai 200235 (China)]; Li, Yi-Xue, E-mail: yxli@sibs.ac.cn [School of Biotechnology, East China University of Science and Technology, Shanghai 200237 (China); Shanghai Center for Bioinformation Technology, Shanghai 200235 (China)]; Ye, Zhi-Qiang, E-mail: yezq@pkusz.edu.cn [Laboratory of Chemical Genomics, School of Chemical Biology and Biotechnology, Peking University Shenzhen Graduate School, Shenzhen 518055 (China); Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031 (China)]

    2012-03-02

    Highlights: (1) Proper dataset partition can improve the prediction of deleterious nsSNPs. (2) Partition according to original residue type at nsSNP is a good criterion. (3) A similar strategy is expected to be promising in other machine learning problems. Abstract: Many non-synonymous SNPs (nsSNPs) are associated with diseases, and numerous machine learning methods have been applied to train classifiers for sorting disease-associated nsSNPs from neutral ones. The continuously accumulated nsSNP data allows us to further explore better prediction approaches. In this work, we partitioned the training data into 20 subsets according to either original or substituted amino acid type at the nsSNP site. Using support vector machine (SVM), training classification models on each subset resulted in an overall accuracy of 76.3% or 74.9% depending on the two different partition criteria, while training on the whole dataset obtained an accuracy of only 72.6%. Moreover, the dataset was also randomly divided into 20 subsets, but the corresponding accuracy was only 73.2%. Our results demonstrated that partitioning the whole training dataset into subsets properly, i.e., according to the residue type at the nsSNP site, will improve the performance of the trained classifiers significantly, which should be valuable in developing better tools for predicting the disease-association of nsSNPs.
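
    The partitioning step described above can be sketched as follows. Only the grouping is shown; one classifier (an SVM in the study) would then be trained per subset. The record layout, field names, and example variants are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

def partition_by_residue(samples, key="original"):
    """Group nsSNP records by the amino acid named in `key` ('original' or 'substituted')."""
    subsets = defaultdict(list)
    for sample in samples:
        subsets[sample[key]].append(sample)
    return dict(subsets)

# Hypothetical records: one-letter residue codes and a disease label (1 = disease-associated).
variants = [
    {"original": "R", "substituted": "H", "label": 1},
    {"original": "R", "substituted": "Q", "label": 0},
    {"original": "G", "substituted": "R", "label": 1},
]
by_original = partition_by_residue(variants, key="original")
by_substituted = partition_by_residue(variants, key="substituted")
```

    With the 20 standard amino acids, either criterion yields at most 20 subsets, matching the 20-way split reported in the abstract.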

  7. Profiling deleterious non-synonymous SNPs of smoker's gene CYP1A1.

    PubMed

    Ramesh, A Sai; Khan, Imran; Farhan, Md; Thiagarajan, Padma

    2013-01-01

    CYP1A1 gene belongs to the cytochrome P450 family and is known better as smokers' gene due to its hyperactivation as a consequence of long term smoking. The expression of CYP1A1 induces polycyclic aromatic hydrocarbon production in the lungs, which when over expressed, is known to cause smoking related diseases, such as cardiovascular pathologies, cancer, and diabetes. Single nucleotide polymorphisms (SNPs) are the simplest form of genetic variations that occur at a higher frequency, and are denoted as synonymous and non-synonymous SNPs on the basis of their effects on the amino acids. This study adopts a systematic in silico approach to predict the deleterious SNPs that are associated with disease conditions. It is inferred that four SNPs are highly deleterious, among which the SNP with rs17861094 is commonly predicted to be harmful by all tools. Hydrophobic (isoleucine) to hydrophilic (serine) amino acid variation was observed in the candidate gene. Hence, this investigation aims to characterize a candidate gene from 159 SNPs of CYP1A1. PMID:23733671

  8. Association of eight EST-derived SNPs with carcass and meat quality traits in pigs.

    PubMed

    Tong, Xiong; Zhang, Zhe; Jiao, Yiren; Xu, Jian; Dang, Hongquyen; Chen, Ye; Jiang, Zhiguo; Duan, Junli; Zhang, Hao; Li, Jiaqi; Wang, Chong

    2015-02-01

    The identification of genetic markers associated with important economic traits is fundamental to improving the productivity and quality of livestock. In this investigation, we searched for 177 expressed sequence tags (ESTs) putatively involved in meat quality from the available pig EST database, and detected eight single nucleotide polymorphisms (SNPs) in eight ESTs. We investigated the associations of these SNPs with 18 carcass and meat quality traits in a Landrace × Lantang F2 resource population (n = 257). Association analysis revealed that seven SNPs (except E42) were associated with some of the carcass- and meat quality-related traits. Particularly, significant associations of three SNPs (E53, E82, and E36) with backfat thickness traits were observed. Further, the genetic effects of E53 on four live backfat thickness traits were validated in an independent population (n = 221). More investigations about E53 sequence characteristics were performed, i.e., radiation hybrid (RH) mapping, 3'-RACE, and screening analysis of the positive BAC clones. Our research identified the genetic effects of eight EST-derived SNPs on carcass and meat quality traits, and suggested that E53 may be a useful marker for live backfat thickness traits in pig breeding programs. PMID:25081836

  9. Water Filter

    NSDL National Science Digital Library

    WGBH Boston

    2002-01-01

    In this engineering activity, challenge learners to invent a water filter that cleans dirty water. Learners construct a filter device out of a 2-liter bottle and then experiment with different materials like gravel, sand, and cotton balls to see which is the most effective.
    Safety note: An adult's help is needed for this activity.

  10. Filtering Light

    NSDL National Science Digital Library

    Students learn how CCD cameras use color filters to create astronomical images in this Moveable Museum unit. The four-page PDF guide includes suggested general background readings for educators, activity notes, and step-by-step directions. Students look at black-and-white photos to understand gray scale and construct simple red and green cellophane filters and observe magazine images through them.

  11. Defining, Evaluating, and Removing Bias Induced by Linear Imputation in Longitudinal Clinical Trials with MNAR Missing Data

    PubMed Central

    Helms, Ronald W.; Helms-Reece, Laura; Helms, Russell W.; Helms, Mary W.

    2011-01-01

    Missing not at random (MNAR) post-dropout missing data from a longitudinal clinical trial result in the collection of “biased data”, which leads to biased estimators and tests of corrupted hypotheses. In a full rank linear model analysis the model equation, $E[Y] = X\beta$, leads to the definition of the primary parameter $\beta = (X'X)^{-1}X'E[Y]$, and the definition of linear secondary parameters of the form $\theta = L\beta = L(X'X)^{-1}X'E[Y]$, including for example, a parameter representing a “treatment effect”. These parameters depend explicitly on $E[Y]$, which raises the questions: what is $E[Y]$ when some elements of the incomplete random vector $Y$ are not observed and MNAR, or when such a $Y$ is “completed” via imputation? We develop a rigorous, readily interpretable definition of $E[Y]$ in this context that leads directly to definitions of $\beta$, $\mathrm{Bias}(\hat{\beta}) = E[\hat{\beta}] - \beta$, $\mathrm{Bias}(\hat{\theta}) = E[\hat{\theta}] - L\beta$, and the extent of hypothesis corruption. These definitions provide a basis for evaluating, comparing, and removing biases induced by various linear imputation methods for MNAR incomplete data from longitudinal clinical trials. Linear imputation methods use earlier data from a subject to impute values for post-dropout missing values and include “Last Observation Carried Forward” (LOCF) and “Baseline Observation Carried Forward” (BOCF), among others. We illustrate the methods of evaluating, comparing, and removing biases and the effects of testing corresponding corrupted hypotheses via a hypothetical, but very realistic longitudinal analgesic clinical trial. PMID:21390998
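
    The two linear imputation methods named above, LOCF and BOCF, can be sketched for a single subject. The function names, the use of None for post-dropout visits, and the pain scores are assumptions for illustration only; the sketch also assumes the baseline visit is always observed.

```python
def locf(series):
    """Last Observation Carried Forward: repeat the most recent observed value."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled

def bocf(series):
    """Baseline Observation Carried Forward: replace missing visits with the first (baseline) value."""
    baseline = series[0]  # assumed to be observed
    return [baseline if value is None else value for value in series]

# Hypothetical pain scores over five visits; the subject drops out after visit 3.
pain = [7.0, 5.5, 4.0, None, None]
locf_completed = locf(pain)
bocf_completed = bocf(pain)
```

    Both completions are linear functions of the subject's earlier data, which is why the abstract groups them as linear imputation methods; they differ only in which earlier observation is reused.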

  12. Imputation of Test Scores in the National Education Longitudinal Study of 1988 (NELS:88). Working Paper Series.

    ERIC Educational Resources Information Center

    Bokossa, Maxime C.; Huang, Gary G.

    This report describes the imputation procedures used to deal with missing data in the National Education Longitudinal Study of 1988 (NELS:88), the only current National Center for Education Statistics (NCES) dataset that contains scores from cognitive tests given the same set of students at multiple time points. As is inevitable, cognitive test…

  13. Statistical Computing Software Reviews Multiple Imputation in Practice: Comparison of Software Packages for Regression Models With Missing Variables

    Microsoft Academic Search

    Nicholas J. HORTON; Stuart R. LIPSITZ

    Missing data frequently complicates data analysis for scientific investigations. The development of statistical methods to address missing data has been an active area of research in recent decades. Multiple imputation, originally proposed by Rubin in a public use dataset setting, is a general purpose method for analyzing datasets with missing data that is broadly applicable to a variety

  14. Estimating the proportion of variation in susceptibility to multiple sclerosis captured by common SNPs

    NASA Astrophysics Data System (ADS)

    Watson, Corey T.; Disanto, Giulio; Breden, Felix; Giovannoni, Gavin; Ramagopalan, Sreeram V.

    2012-10-01

    Multiple sclerosis (MS) is a complex disease with underlying genetic and environmental factors. Although the contribution of alleles within the major histocompatibility complex (MHC) are known to exert strong effects on MS risk, much remains to be learned about the contributions of loci with more modest effects identified by genome-wide association studies (GWASs), as well as loci that remain undiscovered. We use a recently developed method to estimate the proportion of variance in disease liability explained by 475,806 single nucleotide polymorphisms (SNPs) genotyped in 1,854 MS cases and 5,164 controls. We reveal that ~30% of MS genetic liability is explained by SNPs in this dataset, the majority of which is accounted for by common variants. These results suggest that the unaccounted for proportion could be explained by variants that are in imperfect linkage disequilibrium with common GWAS SNPs, highlighting the potential importance of rare variants in the susceptibility to MS.

  15. [Polish population data for 17 Y-STRs and 8 Y-SNPs markers].

    PubMed

    Abreu-Głowacka, Monica; Zaba, Czesław; Koralewska-Kordel, Małgorzata; Michalak, Eliza; Przybylski, Zygmunt

    2013-01-01

    The aim of our study was to establish the genetic differentiation of the population of the province of Wielkopolska (Greater Poland) for 17 Y-STRs and 8 Y-SNPs and to compare the Polish population with other selected populations. The investigations included 201 unrelated male inhabitants of the Greater Poland region. We found 184 unique haplotypes for 17 Y-STRs. The haplotype discrimination capacity was 0.96. The most frequent haplotype, Ht-50, was found in 3 samples, and 7 haplotypes were observed twice. Further, the same samples were analyzed with 8 Y-SNP markers. We obtained 40 haplotypes. The haplotype discrimination capacity was 0.20. The most frequent haplotype was present in 38 samples. A total of 4 different haplogroups were established: haplogroup K = 19%, IJ = 7%, R1a1 = 59% and R1b = 15%. The HD value of Y-SNPs/Y-STRs was 0.9883. PMID:24672896
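
    The reported discrimination capacities can be checked with a short calculation. The sketch assumes the common definition (number of distinct haplotypes divided by number of samples) and assumes that Ht-50 is the only Y-STR haplotype seen three times; neither assumption is stated in the abstract, but both are consistent with its figures.

```python
def discrimination_capacity(counts_by_multiplicity):
    """Map {times observed: number of distinct haplotypes} to (distinct, samples, capacity)."""
    distinct = sum(counts_by_multiplicity.values())
    samples = sum(times * n for times, n in counts_by_multiplicity.items())
    return distinct, samples, distinct / samples

# Y-STR figures from the abstract: 184 singletons, 7 haplotypes seen twice, 1 seen three times.
y_str_distinct, y_str_samples, y_str_dc = discrimination_capacity({1: 184, 2: 7, 3: 1})

# Y-SNP figures from the abstract: 40 distinct haplotypes among the same 201 samples.
y_snp_dc = 40 / 201
```

    Under these assumptions the Y-STR counts add up to the 201 samples (184 + 14 + 3), and the two ratios, 192/201 and 40/201, round to the reported 0.96 and 0.20.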

  16. Computational identification and structural analysis of deleterious functional SNPs in MLL gene causing acute leukemia.

    PubMed

    George Priya Doss, C; Rajasekaran, R; Sethumadhavan, Rao

    2010-09-01

    A promising application of the huge amounts of data from the Human Genome Project currently available offers new opportunities for identifying the genetic predisposition and developing a better understanding of complex diseases such as cancers. The main focus of cancer genetics is the study of mutations that are causally implicated in tumorigenesis. The identification of such causal mutations does not only provide insight into cancer biology but also presents anticancer therapeutic targets and diagnostic markers. In this study, we evaluated the Single Nucleotide Polymorphisms (SNPs) that can alter the expression and the function in MLL gene through computational methods. We applied an evolutionary perspective to screen the SNPs using the sequence homology-based SIFT tool, which suggested that 10 non-synonymous SNPs (nsSNPs) (50%) were deleterious. The structure-based PolyPhen server suggested that 5 nsSNPs (25%) may disrupt protein function and structure. PupaSuite tool predicted the phenotypic effect of SNPs on the structure and function of the affected protein. Structure analysis was carried out for the major mutations in the native protein encoded by the MLL gene, at amino acid positions Q1198P and K1203Q. The solvent accessibility results showed that 7 residues changed from exposed state in the native type protein to buried state in Q1198P mutant protein and remained unchanged in the case of K1203Q. From the overall results obtained, nsSNP with id (rs1784246) at the amino acid position Q1198P could be considered as deleterious mutation in the acute leukemia caused by MLL gene. PMID:20658337

  17. FitSNPs: highly differentially expressed genes are more likely to have variants associated with disease

    Microsoft Academic Search

    Rong Chen; Alex A Morgan; Joel Dudley; Tarangini Deshpande; Li Li; Keiichi Kodama; Annie P Chiang; Atul J Butte

    2009-01-01

    Background: Candidate single nucleotide polymorphisms (SNPs) from genome-wide association studies (GWASs) were often selected for validation based on their functional annotation, which was inadequate and biased. We propose to use the more than 200,000 microarray studies in the Gene Expression Omnibus to systematically prioritize candidate SNPs from GWASs. Results: We analyzed all human microarray studies from the Gene Expression Omnibus, and calculated

  18. Multiple imputation methods for inference on cumulative incidence with missing cause of failure.

    PubMed

    Lee, Minjung; Cronin, Kathleen A; Gail, Mitchell H; Dignam, James J; Feuer, Eric J

    2011-11-01

    Analysis of cumulative incidence (sometimes called absolute risk or crude risk) can be difficult if the cause of failure is missing for some subjects. Assuming missingness is random conditional on the observed data, we develop asymptotic theory for multiple imputation methods to estimate cumulative incidence. Covariates affect cause-specific hazards in our model, and we assume that separate proportional hazards models hold for each cause-specific hazard. Simulation studies show that procedures based on asymptotic theory have near nominal operating characteristics in cohorts of 200 and 400 subjects, both for cumulative incidence and for prediction error. The methods are illustrated with data on survival after breast cancer, obtained from the National Surgical Adjuvant Breast and Bowel Project (NSABP). PMID:22028204

  19. Prioritization of candidate SNPs in colon cancer using bioinformatics tools: an alternative approach for a cancer biologist.

    PubMed

    George Priya Doss, C; Rajasekaran, R; Arjun, P; Sethumadhavan, Rao

    2010-12-01

    The genetics of human phenotype variation and especially, the genetic basis of human complex diseases could be understood by knowing the functions of Single Nucleotide Polymorphisms (SNPs). The main goal of this work is to predict the deleterious non-synonymous SNPs (nsSNPs), so that the number of SNPs screened for association with disease can be reduced to those most likely to alter gene function. In this work by using computational tools, we have analyzed the SNPs that can alter the expression and function of cancerous genes involved in colon cancer. To explore possible relationships between genetic mutation and phenotypic variation, different computational algorithm tools like Sorting Intolerant from Tolerant (evolutionary-based approach), Polymorphism Phenotyping (structure-based approach), PupaSuite, UTRScan and FASTSNP were used for prioritization of high-risk SNPs in coding region (exonic nonsynonymous SNPs) and non-coding regions (intronic and exonic 5' and 3'-untranslated region (UTR) SNPs). We developed semi-quantitative relative ranking strategy (non availability of 3D structure) that can be adapted to a priori SNP selection or post hoc evaluation of variants identified in whole genome scans or within haplotype blocks associated with disease. Lastly, we analyzed haplotype tagging SNPs (htSNPs) in the coding and untranslated regions of all the genes by selecting the force tag SNPs selection using iHAP analysis. The computational architecture proposed in this review is based on integrating relevant biomedical information sources to provide a systematic analysis of complex diseases. We have shown a "real world" application of interesting existing bioinformatics tools for SNP analysis in colon cancer. PMID:21153778

  20. Analysis of partially observed clustered data using generalized estimating equations and multiple imputation

    PubMed Central

    Aloisio, Kathryn M.; Swanson, Sonja A.; Micali, Nadia; Field, Alison; Horton, Nicholas J.

    2015-01-01

    Clustered data arise in many settings, particularly within the social and biomedical sciences. As an example, multiple-source reports are commonly collected in child and adolescent psychiatric epidemiologic studies where researchers use various informants (e.g. parent and adolescent) to provide a holistic view of a subject’s symptomatology. Fitzmaurice et al. (1995) have described estimation of multiple-source models using a standard generalized estimating equation (GEE) framework. However, these studies often have missing data due to the additional stages of consent and assent required. The usual GEE is unbiased when missingness is Missing Completely at Random (MCAR) in the sense of Little and Rubin (2002). This is a strong assumption that may not be tenable. Other options such as weighted generalized estimating equations (WEEs) are computationally challenging when missingness is non-monotone. Multiple imputation is an attractive method to fit incomplete data models while only requiring the less restrictive Missing at Random (MAR) assumption. Previously, estimation with partially observed clustered data was computationally challenging; however, recent developments in Stata have facilitated the use of these methods in practice. We demonstrate how to utilize multiple imputation in conjunction with a GEE to investigate the prevalence of disordered eating symptoms in adolescents reported by parents and adolescents as well as factors associated with concordance and prevalence. The methods are motivated by the Avon Longitudinal Study of Parents and their Children (ALSPAC), a cohort study that enrolled more than 14,000 pregnant mothers in 1991–92 and has followed the health and development of their children at regular intervals. While point estimates were fairly similar to the GEE under MCAR, the MAR model had smaller standard errors, while requiring less stringent assumptions regarding missingness. PMID:25642154
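
    As a rough illustration of how results from several imputed data sets are usually combined, the sketch below pools one coefficient with Rubin's rules. The abstract does not say which pooling rule the authors used, so treating it as Rubin's rules is an assumption, and the function name `pool_rubin` and its inputs are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets (Rubin's rules).

    estimates: point estimate from each imputed data set
    variances: squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total
```

    The pooled standard error is the square root of the returned total variance; in the setting above, each estimate would come from fitting the GEE to one imputed data set.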

  1. Angiogenic, neurotrophic, and inflammatory system SNPs moderate the association between birth weight and ADHD symptom severity.

    PubMed

    Smith, Taylor F; Anastopoulos, Arthur D; Garrett, Melanie E; Arias-Vasquez, Alejandro; Franke, Barbara; Oades, Robert D; Sonuga-Barke, Edmund; Asherson, Philip; Gill, Michael; Buitelaar, Jan K; Sergeant, Joseph A; Kollins, Scott H; Faraone, Stephen V; Ashley-Koch, Allison

    2014-12-01

    Low birth weight is associated with increased risk for Attention-Deficit/Hyperactivity Disorder (ADHD); however, the etiological underpinnings of this relationship remain unclear. This study investigated if genetic variants in angiogenic, dopaminergic, neurotrophic, kynurenine, and cytokine-related biological pathways moderate the relationship between birth weight and ADHD symptom severity. A total of 398 youth from two multi-site, family-based studies of ADHD were included in the analysis. The sample consisted of 360 ADHD probands, 21 affected siblings, and 17 unaffected siblings. A set of 164 SNPs from 31 candidate genes, representing five biological pathways, were included in our analyses. Birth weight and gestational age data were collected from a state birth registry, medical records, and parent report. Generalized Estimating Equations tested for main effects and interactions between individual SNPs and birth weight centile in predicting ADHD symptom severity. SNPs within neurotrophic (NTRK3) and cytokine genes (CNTFR) were associated with ADHD inattentive symptom severity. There was no main effect of birth weight centile on ADHD symptom severity. SNPs within angiogenic (NRP1 & NRP2), neurotrophic (NTRK1 & NTRK3), cytokine (IL16 & S100B), and kynurenine (CCBL1 & CCBL2) genes moderate the association between birth weight centile and ADHD symptom severity. The SNP main effects and SNP × birth weight centile interactions remained significant after adjusting for multiple testing. Genetic variability in angiogenic, neurotrophic, and inflammatory systems may moderate the association between restricted prenatal growth, a proxy for an adverse prenatal environment, and risk to develop ADHD. PMID:25346392

  2. Connecting SNPs in Diabetes: A Spatial Analysis of Meta-GWAS Loci

    PubMed Central

    Schierding, William; O’Sullivan, Justin M.

    2015-01-01

    Meta-analyses of genome-wide association studies (GWAS) have improved our understanding of the genetic foundations of a number of diseases, including diabetes. However, single nucleotide polymorphisms (SNPs) that are identified by GWAS, especially those that fall outside of gene regions, do not always clearly link to the underlying biology. Despite this, these SNPs have often been validated through re-sequencing efforts as not just tag SNPs, but as causative SNPs, and so must play a role in disease development or progression. In this study, we show how the 3D genome (spatial connections) and trans-expression Quantitative Trait Loci connect diabetes loci from different GWAS meta-analyses, informing the backbone of regulatory networks. Our findings include a three-way functional–spatial connection between the TM6SF2, CTRB1–BCAR1, and CELSR2–PSRC1 loci (rs201189528, rs7202844, and rs7202844, respectively) connected through the KCNIP3 and BCAR1/BCAR3 loci, respectively. These spatial hubs serve as an example of how loci in genes with little biological connection to disease come together to contribute to the diabetes phenotype.

  3. Fishing for SNPs: A Targeted Locus Approach for Single Nucleotide Polymorphism Discovery in Rainbow Trout

    E-print Network

    May, Bernie

    Fishing for SNPs: A Targeted Locus Approach for Single Nucleotide Polymorphism Discovery in Rainbow to target variable regions of the rainbow trout Oncorhynchus mykiss genome, 48 of which were designed from. mykiss aguabonita, Little Kern golden trout O. mykiss whitei, coastal rainbow trout O. mykiss irideus

  4. Partition dataset according to amino acid type improves the prediction of deleterious non-synonymous SNPs.

    PubMed

    Yang, Jing; Li, Yuan-Yuan; Li, Yi-Xue; Ye, Zhi-Qiang

    2012-03-01

    Many non-synonymous SNPs (nsSNPs) are associated with diseases, and numerous machine learning methods have been applied to train classifiers for sorting disease-associated nsSNPs from neutral ones. The continuously accumulated nsSNP data allows us to further explore better prediction approaches. In this work, we partitioned the training data into 20 subsets according to either original or substituted amino acid type at the nsSNP site. Using support vector machine (SVM), training classification models on each subset resulted in an overall accuracy of 76.3% or 74.9% depending on the two different partition criteria, while training on the whole dataset obtained an accuracy of only 72.6%. Moreover, the dataset was also randomly divided into 20 subsets, but the corresponding accuracy was only 73.2%. Our results demonstrated that partitioning the whole training dataset into subsets properly, i.e., according to the residue type at the nsSNP site, will improve the performance of the trained classifiers significantly, which should be valuable in developing better tools for predicting the disease-association of nsSNPs. PMID:22326261

  5. BARCSOYSNP23: A SELECTED PANEL OF SNPS FOR SOYBEAN CULTIVAR IDENTIFICATION

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This report describes a set of 23 informative SNPs (BARCSoySNP23) distributed on 19 of the 20 soybean linkage groups that can be used for soybean cultivar identification. Selection of the set was made based upon the linkage map position of each SNP as well as the information provided by each SNP fo...

  6. The effects of single nucleotide polymorphisms (SNPs) of calpastatin (CAST) gene on meat tenderness of yak.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The association of single nucleotide polymorphisms (SNPs) of calpastatin (CAST) gene with shear force of 2.54 cm steaks from M. longissimus dorsi from Gannan yaks (Bos grunniens, n=181) was studied. Yaks were harvested at 2, 3, and 4 yr of age (n=51, 59, and 71, respectively), and samples of each ya...

  7. CLC2 single nucleotide polymorphisms (SNPs) as potential modifiers of cystic fibrosis disease severity

    Microsoft Academic Search

    Carol J Blaisdell; Timothy D Howard; Augustus Stern; Penelope Bamford; Eugene R Bleecker; O Colin Stine

    2004-01-01

    BACKGROUND: Cystic fibrosis (CF) lung disease, manifested by impaired chloride secretion, leads to eventual respiratory failure. Candidate genes that may modify CF lung disease severity include alternative chloride channels. The objectives of this study are to identify single nucleotide polymorphisms (SNPs) in the airway epithelial chloride channel, CLC-2, and correlate these polymorphisms with CF lung disease. METHODS: The CLC-2 promoter,

  8. 118 SNPs of folate-related genes and risks of spina bifida and conotruncal heart defects

    Microsoft Academic Search

    Gary M Shaw; Wei Lu; Huiping Zhu; Wei Yang; Farren BS Briggs; Suzan L Carmichael; Lisa F Barcellos; Edward J Lammer; Richard H Finnell

    2009-01-01

    BACKGROUND: Folic acid taken in early pregnancy reduces risks for delivering offspring with several congenital anomalies. The mechanism by which folic acid reduces risk is unknown. Investigations into genetic variation that influences transport and metabolism of folate will help fill this data gap. We focused on 118 SNPs involved in folate transport and metabolism. METHODS: Using data from a California

  9. Large-scale enrichment and discovery of gene-associated SNPs

    Technology Transfer Automated Retrieval System (TEKTRAN)

    With the recent advent of massively parallel pyrosequencing by 454 Life Sciences it has become feasible to cost-effectively identify numerous single nucleotide polymorphisms (SNPs) within the recombinogenic regions of the maize (Zea mays L.) genome. We developed a modified version of hypomethylated...

  10. Identification of new SNPs in native South American populations by resequencing the Y chromosome.

    PubMed

    Geppert, M; Ayub, Q; Xue, Y; Santos, S; Ribeiro-dos-Santos, Â; Baeta, M; Núñez, C; Martínez-Jarreta, B; Tyler-Smith, C; Roewer, L

    2015-03-01

    The Y-chromosomal genetic landscape of South America is relatively homogenous. The majority of native Amerindian people are assigned to haplogroup Q and only a small percentage belongs to haplogroup C. With the aim of further differentiating the major Q lineages and thus obtaining new insights into the population history of South America, two individuals, both belonging to the sub-haplogroup Q-M3, were analyzed with next-generation sequencing. Several new candidate SNPs were evaluated and four were confirmed to be new, haplogroup Q-specific, and variable. One of the new SNPs, named MG2, identifies a new sub-haplogroup downstream of Q-M3; the other three (MG11, MG13, MG15) are upstream of Q-M3 but downstream of M242, and describe branches at the same phylogenetic positions as previously known SNPs in the samples tested. These four SNPs were typed in 100 individuals belonging to haplogroup Q. PMID:25303787

  11. An assessment of whether SNPs will replace STRs in national DNA databases Joint considerations of the

    E-print Network

    Working Group on DNA Analysis Methods (SWGDAM) Sir: It is unlikely that SNPs will replace STRs as the preferred method of testing of forensic samples and database samples in the near to medium future throughput, this research is carried out primarily for the pharmaceutical industry for drug discovery

  12. TAXONOMY OF JUNIPERUS COMMUNIS IN NORTH AMERICA: INSIGHT FROM VARIATION IN nrDNA SNPs

    Microsoft Academic Search

    Robert P. Adams

    2008-01-01

    Plants of Juniperus communis L. var. communis, J. c. var. depressa Pursh, J. c. var. jackii Rehdr, J. c. var. saxatilis Pall. were sampled and SNPs from nrDNA were examined. Based on these data and previous data, a new variety of J. communis is recognized: Juniperus communis var. charlottensis R. P. Adams, var. nov. It occurs in muskeg bogs on

  13. Phytologia (April 2010) 92(1)68 DISCOVERY AND SNPS ANALYSES OF POPULATIONS OF

    E-print Network

    Adams, Robert P.

    98368 ABSTRACT Trees from two populations of Juniperus commonly identified as J. scopulorum growing that Juniperus trees identified as J. scopulorum Sarg. have been reported from the dry side (northeastPhytologia (April 2010) 92(1)68 DISCOVERY AND SNPS ANALYSES OF POPULATIONS OF JUNIPERUS MARITIMA

  14. Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level: A Monte Carlo Simulation to Assess the Tenability of the SuperMatrix Approach

    E-print Network

    Lang, Kyle Matthew

    2013-05-31

    Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level: A Monte Carlo Simulation to Assess the Tenability of the SuperMatrix Approach BY Kyle M. Lang Submitted to the graduate program in Psychology and the Graduate... that this is the approved version of the following thesis : Streamlining Missing Data Analysis by Aggregating Multiple Imputations at the Data Level: A Monte Carlo Simulation to Assess the Tenability of the SuperMatrix Approach Chairperson Todd D. Little Date approved...

  15. Structural investigation of deleterious non-synonymous SNPs of EGFR gene.

    PubMed

    Raghav, Dhwani; Sharma, Vinay; Agarwal, Subhash Mohan

    2013-03-01

    Epidermal Growth Factor Receptor (EGFR), a member of the receptor tyrosine kinase family, has been shown to be implicated in the development and progression of various cancers due to mutations in the tyrosine kinase domain (TKD). It is important to understand the functional significance of amino acid variation occurring within the TKD due to non-synonymous Single Nucleotide Polymorphisms (nsSNPs). Therefore, we have evaluated the influence of nsSNPs on the structure of EGFR-TKD using computational methods. Out of 2,493 SNPs in the EGFR gene, only 41 were found to be non-synonymous. In silico evaluation of these nsSNPs using the sequence-based SIFT tool and the structure-based PolyPhen algorithm revealed that 13 nsSNPs disrupted the conformation of EGFR-TKD. Protein stability analysis using CUPSAT, I-mutant2.0 and iPTree-STAB identified 6 mutants that are less stable than the wild-type structure. Thereafter, to evaluate the structural impact of 5 mutants (G719A, P733L, V742A, S768I and H773R), molecular dynamics (MD) simulation was performed for 2 ns. The MD trajectories showed that the native EGFR was stabilized after 0.9 ns, while the mutants reached stability only after longer simulation. The RMSF profile of the P-loop and A-loop shows increased flexibility for all the mutants. We also observed that 3 mutants (V742A, P733L and H773R) showed large root mean square deviation (2.075, 2.59 and 2.752 Å respectively) compared to the native EGFR. Further docking studies indicate that gefitinib can be administered for combating cancer occurring due to the presence of these mutations. PMID:23605641

  16. Search for and Analysis of Single Nucleotide Polymorphisms (SNPs) in Rice (Oryza sativa, Oryza rufipogon) and Establishment of SNP Markers

    Microsoft Academic Search

    Shinobu Nasu; Junko Suzuki; Rieko Ohta; Kana Hasegawa; Rika Yui; Noriyuki Kitazawa; Lisa Monna; Yuzo Minobe

    2002-01-01

    We searched for SNPs in 417 regions distributed throughout the genome of three Oryza sativa ssp. japonica cultivars, two indica cultivars, and a wild rice (O. rufipogon). We found 2800 SNPs in approximately 250,000 aligned bases for an average of one SNP every 89 bp, or one SNP every 232 bp between two randomly selected strains. Graphic representation of

  17. Use of Imputed Population-based Cancer Registry Data as a Method of Accounting for Missing Information: Application to Estrogen Receptor Status for Breast Cancer

    PubMed Central

    Howlader, Nadia; Noone, Anne-Michelle; Yu, Mandi; Cronin, Kathleen A.

    2012-01-01

    The National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) Program provides a rich source of data stratified according to tumor biomarkers that play an important role in cancer surveillance research. These data are useful for analyzing trends in cancer incidence and survival. These tumor markers, however, are often prone to missing observations. To address the problem of missing data, the authors employed sequential regression multivariate imputation for breast cancer variables, with a particular focus on estrogen receptor status, using data from 13 SEER registries covering the period 1992–2007. In this paper, they present an approach to accounting for missing information through the creation of imputed data sets that can be analyzed using existing software (e.g., SEER*Stat) developed for analyzing cancer registry data. Bias in age-adjusted trends in female breast cancer incidence is shown graphically before and after imputation of estrogen receptor status, stratified by age and race. The imputed data set will be made available in SEER*Stat (http://seer.cancer.gov/analysis/index.html) to facilitate accurate estimation of breast cancer incidence trends. To ensure that the imputed data set is used correctly, the authors provide detailed, step-by-step instructions for conducting analyses. This is the first time that a nationally representative, population-based cancer registry data set has been imputed and made available to researchers for conducting a variety of analyses of breast cancer incidence trends. PMID:22842721

  18. Multiple imputation for estimation of an occurrence rate in cohorts with attrition and discrete follow-up time points: a simulation study

    PubMed Central

    2010-01-01

    Background In longitudinal cohort studies, subjects may be lost to follow-up at any time during the study. This leads to attrition and thus to a risk of inaccurate and biased estimations. The purpose of this paper is to show how multiple imputation can take advantage of all the information collected during follow-up in order to estimate the cumulative probability P(E) of an event E, when the first occurrence of this event is observed at t successive time points of a longitudinal study with attrition. Methods We compared the performance of multiple imputation with that of Kaplan-Meier estimation in several simulated attrition scenarios. Results In missing-completely-at-random scenarios, the multiple imputation and Kaplan-Meier methods performed well in terms of bias (less than 1%) and coverage rate (range = [94.4%; 95.8%]). In missing-at-random scenarios, the Kaplan-Meier method was associated with a bias ranging from -5.1% to 7.0% and with a very poor coverage rate (as low as 0.2%). Multiple imputation performed much better in this situation (bias <2%, coverage rate >83.4%). Conclusions Multiple imputation shows promise for estimation of an occurrence rate in cohorts with attrition. This study is a first step towards defining appropriate use of multiple imputation in longitudinal studies. PMID:20815883
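
    For readers unfamiliar with the comparator method, here is a minimal sketch of the Kaplan-Meier product-limit estimator. It is not the authors' simulation code; the function name `kaplan_meier` and the input layout are assumptions made for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times:  follow-up time of each subject
    events: True if the event was observed at that time, False if the
            subject was censored (for example, lost to follow-up)
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all subjects who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            observed += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return curve
```

    The cumulative probability of the event at time t is then 1 - S(t).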

  19. Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics.

    PubMed

    Webb-Robertson, Bobbie-Jo M; Wiberg, Holli K; Matzke, Melissa M; Brown, Joseph N; Wang, Jing; McDermott, Jason E; Smith, Richard D; Rodland, Karin D; Metz, Thomas O; Pounds, Joel G; Waters, Katrina M

    2015-05-01

    In this review, we apply selected imputation strategies to label-free liquid chromatography-mass spectrometry (LC-MS) proteomics datasets to evaluate the accuracy with respect to metrics of variance and classification. We evaluate several commonly used imputation approaches for individual merits and discuss the caveats of each approach with respect to the example LC-MS proteomics data. In general, local similarity-based approaches, such as the regularized expectation maximization and least-squares adaptive algorithms, yield the best overall performances with respect to metrics of accuracy and robustness. However, no single algorithm consistently outperforms the remaining approaches, and in some cases, performing classification without imputation sometimes yielded the most accurate classification. Thus, because of the complex mechanisms of missing data in proteomics, which also vary from peptide to protein, no individual method is a single solution for imputation. On the basis of the observations in this review, the goal for imputation in the field of computational proteomics should be to develop new approaches that work generically for this data type and new strategies to guide users in the selection of the best imputation for their dataset and analysis objectives. PMID:25855118

  20. Water Filters

    NASA Technical Reports Server (NTRS)

    1988-01-01

    Seeking a more effective method of filtering highly contaminated potable water, Mike Pedersen, founder of Western Water International (WWI), learned that NASA had conducted extensive research into methods of purifying water on board manned spacecraft. The key is Aquaspace Compound, a proprietary WWI formula that scientifically blends various types of granular activated charcoal with other active and inert ingredients. Aquaspace systems remove substances in different ways: chlorine by atomic adsorption, other types of organic chemicals by mechanical filtration, and still others by catalytic reaction. Aquaspace filters are finding wide acceptance in industrial, commercial, residential and recreational applications in the U.S. and abroad.

  1. Filter apparatus

    DOEpatents

    Kuban, Daniel P. (Oak Ridge, TN); Singletary, B. Huston (Oak Ridge, TN); Evans, John H. (Rockwood, TN)

    1984-01-01

    A plurality of holding tubes are respectively mounted in apertures in a partition plate fixed in a housing receiving gas contaminated with particulate material. A filter cartridge is removably held in each holding tube, and the cartridges and holding tubes are arranged so that gas passes through apertures therein and across the partition plate while particulate material is collected in the cartridges. Replacement filter cartridges are respectively held in holding canisters mounted on a support plate which can be secured to the aforesaid housing, and screws mounted on said canisters are arranged to push replacement cartridges into the cartridge holding tubes and thereby eject used cartridges therefrom.

  2. Tailored selection of study individuals to be sequenced in order to improve the accuracy of genotype imputation.

    PubMed

    Peil, Barbara; Kabisch, Maria; Fischer, Christine; Hamann, Ute; Bermejo, Justo Lorenzo

    2015-02-01

    The addition of sequence data from own-study individuals to genotypes from external data repositories, for example, the HapMap, has been shown to improve the accuracy of imputed genotypes. Early approaches for reference panel selection favored individuals who best reflect recombination patterns in the study population. By contrast, a maximization of genetic diversity in the reference panel has been recently proposed. We investigate here a novel strategy to select individuals for sequencing that relies on the characterization of the ancestral kernel of the study population. The simulated study scenarios consisted of several combinations of subpopulations from HapMap. HapMap individuals who did not belong to the study population constituted an external reference panel which was complemented with the sequences of study individuals selected according to different strategies. In addition to a random choice, individuals with the largest statistical depth according to the first genetic principal components were selected. In all simulated scenarios the integration of sequences from own-study individuals increased imputation accuracy. The selection of individuals based on the statistical depth resulted in the highest imputation accuracy for European and Asian study scenarios, whereas random selection performed best for an African-study scenario. Present findings indicate that there is no universal 'best strategy' to select individuals for sequencing. We propose to use the methodology described in the manuscript to assess the advantage of focusing on the ancestral kernel under own study characteristics (study size, genetic diversity, availability and properties of external reference panels, frequency of imputed variants…). PMID:25537753

  3. A Bayesian Multiple Imputation Method for Handling Longitudinal Pesticide Data with Values below the Limit of Detection

    PubMed Central

    Chen, Haiying; Quandt, Sara A.; Grzywacz, Joseph G.; Arcury, Thomas A.

    2013-01-01

    Environmental and biomedical research often produces data below the limit of detection (LOD), or left-censored data. Imputing explicit values for values < LOD in a multivariate setting, such as with longitudinal data, is difficult using a likelihood-based approach. A Bayesian multiple imputation (MI) method is introduced to handle left-censored multivariate data. A Gibbs sampler, which uses an iterative process, is employed to simulate the target multivariate distribution within a Bayesian framework. Following convergence, multiple plausible data sets are generated for analysis by standard statistical methods outside of a Bayesian framework. With explicit imputed values available variables can be analyzed as outcomes or predictors. We illustrate a practical application using longitudinal data from the Community Participatory Approach to Measuring Farmworker Pesticide Exposure (PACE3) study to evaluate the association between urinary acephate concentrations (indicating pesticide exposure) and self-reported potential pesticide poisoning symptoms. Additionally, a simulation study is used to evaluate the sampling property of the estimators for distributional parameters as well as regression coefficients estimated with the generalized estimating equation (GEE) approach. Results demonstrated that the Bayesian MI estimates performed well in most settings, and we recommend the use of this valid and feasible approach to analyze multivariate data with values < LOD. PMID:23504271

  4. Phosphorus Filter

    USGS Multimedia Gallery

    Tom Kehler, fishery biologist at the U.S. Fish and Wildlife Service's Northeast Fishery Center in Lamar, Pennsylvania, checks the flow rate of water leaving a phosphorus filter column. The USGS has pioneered a new use for acid mine drainage residuals that are currently a disposal challenge, usi...

  5. Drug Filtering

    NSDL National Science Digital Library

    Lawrence F. Iles

    2010-01-01

    In this math meets health science activity, learners observe a model of exponential decay, and how kidneys filter blood. Learners will calculate the amount of a drug in the body over a period of time. Then, they will make and analyze the graphical representation of this exponential function. This lesson guide includes questions for learners, assessment options, extensions, and reflection questions.
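
    The exponential decay model in this activity can be sketched in a few lines. The dose, the hourly removal fraction, and the function names below are illustrative assumptions, not values taken from the lesson guide.

```python
import math


def drug_remaining(initial_mg, fraction_removed_per_hour, hours):
    """Amount of drug left after each whole hour, from hour 0 to `hours`."""
    keep = 1 - fraction_removed_per_hour  # fraction that survives each hour
    return [initial_mg * keep ** h for h in range(hours + 1)]


def half_life(fraction_removed_per_hour):
    """Hours until half of the drug is gone under the same model."""
    keep = 1 - fraction_removed_per_hour
    return math.log(0.5) / math.log(keep)
```

    For example, `drug_remaining(1000, 0.25, 2)` lists the amount at hours 0, 1 and 2 when the kidneys clear a quarter of the remaining drug each hour; plotting such a list gives the decay curve learners are asked to analyze.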

  6. Strategies for single nucleotide polymorphism (SNP) genotyping to enhance genotype imputation in Gyr (Bos indicus) dairy cattle: Comparison of commercially available SNP chips.

    PubMed

    Boison, S A; Santos, D J A; Utsunomiya, A H T; Carvalheiro, R; Neves, H H R; O'Brien, A M Perez; Garcia, J F; Sölkner, J; da Silva, M V G B

    2015-07-01

    Genotype imputation is widely used as a cost-effective strategy in genomic evaluation of cattle. Key determinants of imputation accuracies, such as linkage disequilibrium patterns, marker densities, and ascertainment bias, differ between Bos indicus and Bos taurus breeds. Consequently, there is a need to investigate effectiveness of genotype imputation in indicine breeds. Thus, the objective of the study was to investigate strategies and factors affecting the accuracy of genotype imputation in Gyr (Bos indicus) dairy cattle. Four imputation scenarios were studied using 471 sires and 1,644 dams genotyped on Illumina BovineHD (HD-777K; San Diego, CA) and BovineSNP50 (50K) chips, respectively. Scenarios were based on which reference high-density single nucleotide polymorphism (SNP) panel (HDP) should be adopted [HD-777K, 50K, and GeneSeek GGP-75Ki (Lincoln, NE)]. Depending on the scenario, validation animals had their genotypes masked for one of the lower-density panels: Illumina (3K, 7K, and 50K) and GeneSeek (SGGP-20Ki and GGP-75Ki). We randomly selected 171 sires as reference and 300 as validation for all the scenarios. Additionally, all sires were used as reference and the 1,644 dams were imputed for validation. Genotypes of 98 individuals with 4 and more offspring were completely masked and imputed. Imputation algorithms FImpute and Beagle v3.3 and v4 were used. Imputation accuracies were measured using the correlation and allelic correct rate. FImpute resulted in the highest accuracies, whereas Beagle 3.3 gave the least-accurate imputations. Accuracies evaluated as correlation (allelic correct rate) ranged from 0.910 (0.942) to 0.961 (0.974) using 50K as HDP and with 3K (7K) as low-density panels. With GGP-75Ki as HDP, accuracies were moderate for 3K, 7K, and 50K, but high for SGGP-20Ki. The use of HD-777K as HDP resulted in accuracies of 0.888 (3K), 0.941 (7K), 0.980 (SGGP-20Ki), 0.982 (50K), and 0.993 (GGP-75Ki). Ungenotyped individuals were imputed with an average accuracy of 0.970. The average of the top 5 kinship coefficients between reference and imputed individuals was a strong predictor of imputation accuracy. FImpute was faster and used less memory than Beagle v4. Beagle v4 outperformed Beagle v3.3 in accuracy and speed of computation. A genotyping strategy that uses the HD-777K SNP chip as a reference panel and SGGP-20Ki as the lower-density SNP panel should be adopted as accuracy was high and similar to that of the 50K. However, the effect of using imputed HD-777K genotypes from the SGGP-20Ki on genomic evaluation is yet to be studied. PMID:25958293
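
    The two accuracy measures named in this abstract can be sketched as follows. The sketch assumes genotypes are coded as 0, 1 or 2 copies of one allele and that an imputed genotype earns credit for each allele it gets right; the abstract does not give the authors' exact definitions, so both the coding and the function names are assumptions.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between true and imputed genotype codes."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def allelic_correct_rate(true_genotypes, imputed_genotypes):
    """Share of alleles imputed correctly, with genotypes coded 0/1/2.

    A genotype that is off by one allele copy counts as one correct
    allele out of two; a genotype off by two copies counts as zero.
    """
    correct = sum(2 - abs(t - i) for t, i in zip(true_genotypes, imputed_genotypes))
    return correct / (2 * len(true_genotypes))
```

    Both functions take the masked (true) genotypes and the imputed genotypes of the validation animals at the same markers, in the same order.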

  7. Collective effects of SNPs on transgenerational inheritance in Caenorhabditis elegans and budding yeast.

    PubMed

    Zhu, Zuobin; Man, Xian; Xia, Mengying; Huang, Yimin; Yuan, Dejian; Huang, Shi

    2015-07-01

    We studied the collective effects of single nucleotide polymorphisms (SNPs) on transgenerational inheritance in Caenorhabditis elegans recombinant inbred advanced intercross lines (RIAILs) and yeast segregants. We divided the RIAILs and segregants into two groups of high and low minor allele content (MAC). RIAILs with higher MAC needed fewer generations of benzaldehyde training to gain a stable olfactory imprint and showed a greater change from normal after benzaldehyde training. Yeast segregants with higher MAC showed a more dramatic shortening of the lag phase length after ethanol exposure. The short lag phase acquired by ethanol training was more dramatically lost after recovery in ethanol-free medium for the high MAC group. We also found a preferential association between MAC and traits linked with a higher number of additive QTLs. These results suggest a role for the collective effects of SNPs in transgenerational inheritance, and may help explain human variations in disease susceptibility. PMID:25882787

  8. Coding SNPs as intrinsic markers for sample tracking in large-scale transcriptome studies.

    PubMed

    Xu, Weihong; Gao, Hong; Seok, Junhee; Wilhelmy, Julie; Mindrinos, Michael N; Davis, Ronald W; Xiao, Wenzhong

    2012-06-01

    Large-scale transcriptome profiling in clinical studies often involves assaying multiple samples of a patient to monitor disease progression, treatment effect, and host response in multiple tissues. Such profiling is prone to human error, which often results in mislabeled samples. Here, we present a method to detect mislabeled sample outliers using coding single nucleotide polymorphisms (cSNPs) specifically designed on the microarray and demonstrate that the mislabeled samples can be efficiently identified by either simple clustering of allele-specific expression scores or Mahalanobis distance-based outlier detection method. Based on our results, we recommend the incorporation of cSNPs into future transcriptome array designs as intrinsic markers for sample tracking. PMID:22668418

  9. WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation

    PubMed Central

    2013-01-01

    Background SNPs&GO is a method for the prediction of deleterious Single Amino acid Polymorphisms (SAPs) using protein functional annotation. In this work, we present the web server implementation of SNPs&GO (WS-SNPs&GO). The server is based on Support Vector Machines (SVM) and for a given protein, its input comprises: the sequence and/or its three-dimensional structure (when available), a set of target variations and its functional Gene Ontology (GO) terms. The output of the server provides, for each protein variation, the probabilities to be associated to human diseases. Results The server consists of two main components, including updated versions of the sequence-based SNPs&GO (recently scored as one of the best algorithms for predicting deleterious SAPs) and of the structure-based SNPs&GO3d programs. Sequence and structure based algorithms are extensively tested on a large set of annotated variations extracted from the SwissVar database. Selecting a balanced dataset with more than 38,000 SAPs, the sequence-based approach achieves 81% overall accuracy, 0.61 correlation coefficient and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.88. For the subset of ~6,600 variations mapped on protein structures available at the Protein Data Bank (PDB), the structure-based method scores with 84% overall accuracy, 0.68 correlation coefficient, and 0.91 AUC. When tested on a new blind set of variations, the results of the server are 79% and 83% overall accuracy for the sequence-based and structure-based inputs, respectively. Conclusions WS-SNPs&GO is a valuable tool that includes in a unique framework information derived from protein sequence, structure, evolutionary profile, and protein function. WS-SNPs&GO is freely available at http://snps.biofold.org/snps-and-go. PMID:23819482
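
    The accuracy and correlation figures quoted above are confusion-matrix summaries. The sketch below shows how such values are computed, assuming the "correlation coefficient" is the Matthews correlation coefficient (the abstract does not say so explicitly). The counts are invented for illustration and are not taken from the paper.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical balanced test set of 200 variants (100 deleterious, 100 neutral).
counts = dict(tp=80, tn=82, fp=18, fn=20)
acc = accuracy(**counts)
mcc = matthews_cc(**counts)
```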

  10. Genetic Basis of Common Human Disease: Insight into the Role of Missense SNPs from Genome-Wide Association Studies.

    PubMed

    Pal, Lipika R; Moult, John

    2015-07-01

    Recent genome-wide association studies (GWAS) have led to the reliable identification of single nucleotide polymorphisms (SNPs) at a number of loci associated with increased risk of specific common human diseases. Each such locus implicates multiple possible candidate SNPs for involvement in disease mechanism. A variety of mechanisms may link the presence of an SNP to altered in vivo gene product function and hence contribute to disease risk. Here, we report an analysis of the role of one of these mechanisms, missense SNPs (msSNPs) in proteins in seven complex trait diseases. Linkage disequilibrium information was used to identify possible candidate msSNPs associated with increased disease risk at each of 356 loci for the seven diseases. Two computational methods were used to estimate which of these SNPs has a significant impact on in vivo protein function. 69% of the loci have at least one candidate msSNP and 33% have at least one predicted high-impact msSNP. In some cases, these SNPs are in well-established disease-related proteins, such as MST1 (macrophage stimulating 1) for Crohn's disease. In others, they are in proteins identified by GWAS as likely candidates for disease relevance, but previously without known mechanism, such as ADAMTS13 (ADAM metallopeptidase with thrombospondin type 1 motif, 13) for coronary artery disease. In still other cases, the missense SNPs are in proteins not previously suggested as disease candidates, such as TUBB1 (tubulin, beta 1, class VI) for hypertension. Together, these data support a substantial role for this class of SNPs in susceptibility to common human disease. PMID:25937569

  11. Improved Resolution Haplogroup G Phylogeny in the Y Chromosome, Revealed by a Set of Newly Characterized SNPs

    PubMed Central

    Sims, Lynn M.; Garvey, Dennis; Ballantyne, Jack

    2009-01-01

    Background Y-SNP haplogroup G (hgG), defined by Y-SNP marker M201, is relatively uncommon in the United States general population, with only 8 additional sub-markers characterized. Many of the previously described eight sub-markers are either very rare (2–4%) or do not distinguish between major populations within this hg. In fact, prior to the current study, only 2% of our reference Caucasian population belonged to hgG and all of these individuals were in sub-haplogroup G2a, defined by P15. Additional Y-SNPs are needed in order to differentiate between individuals within this haplogroup. Principal Findings In this work we have investigated whether we could differentiate between a population of 63 hgG individuals using previously uncharacterized Y-SNPs. We have designed assays to test these individuals using all known hgG SNPs (n = 9) and an additional 16 unreported/undefined Y-SNPs. Using a combination of DNA sequence and genetic genealogy databases, we have uncovered a total of 15 new hgG SNPs that had been previously reported but not phylogenetically characterized. Ten of the new Y-SNPs are phylogenetically equivalent to M201, one is equivalent to P15 and, interestingly, four create new, separate haplogroups. Three of the latter are more common than many of the previously defined Y-SNPs. Y-STR data from these individuals show that DYS385*12 is present in 70% of G2a3b1-U13 individuals while only 4% of non-G2a3b1-U13 individuals possess the DYS385*12 allele. Conclusions This study uncovered several previously undefined Y-SNPs by using data from several database sources. The new Y-SNPs revealed in this paper will be of importance to those with research interests in population biology and human evolution. PMID:19495413

  12. Evaluation of human leukocyte N-formylpeptide receptor (FPR1) SNPs in aggressive periodontitis patients

    Microsoft Academic Search

    Y Zhang; R Syed; C Uygar; D Pallos; M C Gorry; E Firatli; J R Cortelli; T E VanDyke; P S Hart; E Feingold; T C Hart

    2003-01-01

    Polymorphonuclear neutrophils (PMNs) are attracted to sites of infection by N-formylpeptide (fMLP) chemoattractants. The high-affinity fMLP receptor (FPR1) of phagocytic cells interacts with bacterial fMLP and mediates chemotaxis, degranulation, and superoxide production. These cellular functions are disrupted in PMN from aggressive periodontitis (AP) patients. Two FPR1 gene single nucleotide polymorphisms (SNPs), c.329T>C and c.378C>G, have been associated with a localized

  13. Investigation on the role of nsSNPs in HNPCC genes – a bioinformatics approach

    PubMed Central

    Doss, C George Priya; Sethumadhavan, Rao

    2009-01-01

    Background A central focus of cancer genetics is the study of mutations that are causally implicated in tumorigenesis. The identification of such causal mutations not only provides insight into cancer biology but also presents anticancer therapeutic targets and diagnostic markers. Missense mutations are nucleotide substitutions that change an amino acid in a protein; the deleterious effects of these mutations are commonly attributed to their impact on primary amino acid sequence and protein structure. Methods Identifying functional SNPs in a pool containing both functional and neutral SNPs is challenging with experimental protocols. To explore possible relationships between genetic mutation and phenotypic variation, we employed different bioinformatics algorithms like Sorting Intolerant from Tolerant (SIFT), Polymorphism Phenotyping (PolyPhen), and PupaSuite to predict the impact of these amino acid substitutions on protein activity of mismatch repair (MMR) genes causing hereditary nonpolyposis colorectal cancer (HNPCC). Results SIFT classified 22 of 125 variants (18%) as "Intolerant". PolyPhen classified 40 of 125 amino acid substitutions (32%) as "Probably or possibly damaging". The PupaSuite predicted the phenotypic effect of SNPs on the structure and function of the affected protein. Based on the PolyPhen scores and availability of three-dimensional structures, structure analysis was carried out with the major mutations that occurred in the native protein coded by MSH2 and MSH6 genes. The amino acid residues in the native and mutant model protein were further analyzed for solvent accessibility and secondary structure to check the stability of the proteins. Conclusion Based on this approach, we have shown that four nsSNPs, which were predicted to have functional consequences (MSH2-Y43C, MSH6-Y538S, MSH6-S580L, and MSH6-K854M), were already found to be associated with cancer risk. Our study demonstrates the presence of other deleterious mutations and also agrees with in vivo experimental studies. PMID:19389263

  14. The SNPs in the ACACA gene are effective on fatty acid composition in Holstein milk.

    PubMed

    Matsumoto, Hirokazu; Sasaki, Kenta; Bessho, Takuya; Kobayashi, Eiji; Abe, Tsuyoshi; Sasazaki, Shinji; Oyama, Kenji; Mannen, Hideyuki

    2012-09-01

    Fatty acid composition is an important economic trait for both dairy and beef cattle and controlled by genetic factors. Candidate genes controlling fatty acid composition may be found in fat synthesis and metabolism pathways. Acetyl-CoA carboxylase is the flux-determining enzyme in the regulation of fatty acid synthesis in animal tissues. One of two isozymes of this enzyme, acetyl-CoA carboxylase-α (ACACA), catalyses the first committed step of fatty acid synthesis in mammalian cytosol, leading to the biosynthesis of long-chain fatty acids. In the current study, the sequence comparison of the coding sequence (CDS) and two promoter regions (PIA and PIII) in bovine ACACA gene was performed between Japanese Black and Holstein cattle to detect nucleotide polymorphisms influencing fatty acid composition in milk and beef. Five single nucleotide polymorphisms (SNPs) were identified in the CDS region, 28 SNPs in the PIA region and three SNPs in the PIII region. Association study revealed that CCT/CCT type of PIII_#1, #2/PIA_#26 indicated a higher percentage of C14:0 in the milk of the Holstein cattle than CCT/GTC type (p = 0.050) and that a difference of the percentage of C16:0 was observed between CCT/CCT and GTC/GTC type (p = 0.023). CDS_#2 T/T type indicated a higher percentage of C18:0 than T/C type (p = 0.008). In addition, the Japanese Black cattle with CC/GT type of PIII_#1, #2 showed a higher percentage of C18:2 in the meat than those with GT/GT type (p = 0.025). Since PIII is the promoter specific to mammary gland during lactation, the altered expression of the ACACA gene owing to the SNPs in the PIII region may influence the fatty acid composition in the milk. PMID:22718502

  15. A Reduced Number of mtSNPs Saturates Mitochondrial DNA Haplotype Diversity of Worldwide Population Groups

    Microsoft Academic Search

    Antonio Salas; Jorge Amigo; Vincent Macaulay

    2010-01-01

    Background The high levels of variation characterising the mitochondrial DNA (mtDNA) molecule are due ultimately to its high average mutation rate; moreover, mtDNA variation is deeply structured in different populations and ethnic groups. There is growing interest in selecting a reduced number of mtDNA single nucleotide polymorphisms (mtSNPs) that account for the maximum level of discrimination power in a given population.

  16. A new MALDI-TOF based mini-sequencing assay for genotyping of SNPS

    Microsoft Academic Search

    Xiyuan Sun; H. Ding; K. Hung; Baochuan Guo

    2000-01-01

    A new MALDI-TOF based mini-sequencing assay termed VSET was developed for genotyping of SNPs. In this assay, specific fragments of genomic DNA containing the SNP site(s) are first amplified, followed by mini-sequencing in the presence of three ddNTPs and the fourth nucleotide in the deoxy form. In this way, the primer is extended by only one base from one allele,

  17. Genome-Wide Association Study Based on Multiple Imputation with Low-Depth Sequencing Data: Application to Biofuel Traits in Reed Canarygrass

    PubMed Central

    Ramstein, Guillaume P.; Lipka, Alexander E.; Lu, Fei; Costich, Denise E.; Cherney, Jerome H.; Buckler, Edward S.; Casler, Michael D.

    2015-01-01

    Genotyping by sequencing allows for large-scale genetic analyses in plant species with no reference genome, but sets the challenge of sound inference in presence of uncertain genotypes. We report an imputation-based genome-wide association study (GWAS) in reed canarygrass (Phalaris arundinacea L., Phalaris caesia Nees), a cool-season grass species with potential as a biofuel crop. Our study involved two linkage populations and an association panel of 590 reed canarygrass genotypes. Plants were assayed for up to 5228 single nucleotide polymorphism markers and 35 traits. The genotypic markers were derived from low-depth sequencing with 78% missing data on average. To soundly infer marker-trait associations, multiple imputation (MI) was used: several imputes of the marker data were generated to reflect imputation uncertainty and association tests were performed on marker effects across imputes. A total of nine significant markers were identified, three of which showed significant homology with the Brachypodium distachyon genome. Because no physical map of the reed canarygrass genome was available, imputation was conducted using classification trees. In general, MI showed good consistency with the complete-case analysis and adequate control over imputation uncertainty. A gain in significance of marker effects was achieved through MI, but only for rare cases when missing data were <45%. In addition to providing insight into the genetic basis of important traits in reed canarygrass, this study presents one of the first applications of MI to genome-wide analyses and provides useful guidelines for conducting GWAS based on genotyping-by-sequencing data. PMID:25770100
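
    One standard way to combine a marker effect estimated separately on each imputed dataset is Rubin's rules, sketched below. The abstract does not state which pooling rule the authors used, so this is an assumption for illustration only; the function name and the example numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-impute effect estimates and their squared standard errors.

    Returns (pooled_estimate, total_variance), where the total variance
    adds the between-impute spread to the average within-impute variance.
    """
    m = len(estimates)
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-impute variance
    between = variance(estimates)    # between-impute variance (divides by m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical effect of one marker estimated on three imputed datasets.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

    The between-impute term is what lets imputation uncertainty widen the pooled standard error; with identical estimates across imputes, the total variance reduces to the within-impute average.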

  18. Functional classification of 15 million SNPs detected from diverse chicken populations

    PubMed Central

    Gheyas, Almas A.; Boschiero, Clarissa; Eory, Lel; Ralph, Hannah; Kuo, Richard; Woolliams, John A.; Burt, David W.

    2015-01-01

    Next-generation sequencing has prompted a surge of discovery of millions of genetic variants from vertebrate genomes. Besides applications in genetic association and linkage studies, a fraction of these variants will have functional consequences. This study describes detection and characterization of 15 million SNPs from chicken genome with the goal to predict variants with potential functional implications (pfVars) from both coding and non-coding regions. The study reports: 183K amino acid-altering SNPs of which 48% predicted as evolutionary intolerant, 13K splicing variants, 51K likely to alter RNA secondary structures, 500K within most conserved elements and 3K from non-coding RNAs. Regions of local fixation within commercial broiler and layer lines were investigated as potential selective sweeps using genome-wide SNP data. Relationships with phenotypes, if any, of the pfVars were explored by overlaying the sweep regions with known QTLs. Based on this, the candidate genes and/or causal mutations for a number of important traits are discussed. Although the fixed variants within sweep regions were enriched with non-coding SNPs, some non-synonymous-intolerant mutations reached fixation, suggesting their possible adaptive advantage. The results presented in this study are expected to have important implications for future genomic research to identify candidate causal mutations and in poultry breeding. PMID:25926514

  19. Identification of Sex-Linked SNPs and Sex-Determining Regions in the Yellowtail Genome.

    PubMed

    Koyama, Takashi; Ozaki, Akiyuki; Yoshida, Kazunori; Suzuki, Junpei; Fuji, Kanako; Aoki, Jun-Ya; Kai, Wataru; Kawabata, Yumi; Tsuzaki, Tatsuo; Araki, Kazuo; Sakamoto, Takashi

    2015-08-01

    Unlike the conservation of sex-determining (SD) modes seen in most mammals and birds, teleost fishes exhibit a wide variety of SD systems and genes. Hence, the study of SD genes and sex chromosome turnover in fish is one of the most interesting topics in evolutionary biology. To increase resolution of the SD gene evolutionary trajectory in fish, identification of the SD gene in more fish species is necessary. In this study, we focused on the yellowtail, a species widely cultivated in Japan. It is a member of family Carangidae in which no heteromorphic sex chromosome has been observed, and no SD gene has been identified to date. By performing linkage analysis and BAC walking, we identified a genomic region and SNPs with complete linkage to yellowtail sex. Comparative genome analysis revealed the yellowtail SD region ancestral chromosome structure as medaka-fugu. Two inversions occurred in the yellowtail linage after it diverged from the yellowtail-medaka ancestor. An association study using wild yellowtails and the SNPs developed from BAC ends identified two SNPs that can reasonably distinguish the sexes. Therefore, these will be useful genetic markers for yellowtail breeding. Based on a comparative study, it was suggested that a PDZ domain containing the GIPC protein might be involved in yellowtail sex determination. The homomorphic sex chromosomes widely observed in the Carangidae suggest that this family could be a suitable marine fish model to investigate the early stages of sex chromosome evolution, for which our results provide a good starting point. PMID:25975833

  20. Functional classification of 15 million SNPs detected from diverse chicken populations.

    PubMed

    Gheyas, Almas A; Boschiero, Clarissa; Eory, Lel; Ralph, Hannah; Kuo, Richard; Woolliams, John A; Burt, David W

    2015-06-01

    Next-generation sequencing has prompted a surge of discovery of millions of genetic variants from vertebrate genomes. Besides applications in genetic association and linkage studies, a fraction of these variants will have functional consequences. This study describes detection and characterization of 15 million SNPs from chicken genome with the goal to predict variants with potential functional implications (pfVars) from both coding and non-coding regions. The study reports: 183K amino acid-altering SNPs of which 48% predicted as evolutionary intolerant, 13K splicing variants, 51K likely to alter RNA secondary structures, 500K within most conserved elements and 3K from non-coding RNAs. Regions of local fixation within commercial broiler and layer lines were investigated as potential selective sweeps using genome-wide SNP data. Relationships with phenotypes, if any, of the pfVars were explored by overlaying the sweep regions with known QTLs. Based on this, the candidate genes and/or causal mutations for a number of important traits are discussed. Although the fixed variants within sweep regions were enriched with non-coding SNPs, some non-synonymous-intolerant mutations reached fixation, suggesting their possible adaptive advantage. The results presented in this study are expected to have important implications for future genomic research to identify candidate causal mutations and in poultry breeding. PMID:25926514

  1. F-108 polymer and capillary electrophoresis easily resolves complex environmental DNA mixtures and SNPs.

    PubMed

    Damaso, Natalie; Martin, Lauren; Kushwaha, Priyanka; Mills, DeEtta

    2014-11-01

    Ecological studies of microbial communities often use profiling methods but the true community diversity can be underestimated in methods that separate amplicons based on sequence length using performance optimized polymer 4. Taxonomically, unrelated organisms can produce the same length amplicon even though the amplicons have different sequences. F-108 polymer has previously been shown to resolve same length amplicons by sequence polymorphisms. In this study, we showed F-108 polymer, using the ABI Prism 310 Genetic Analyzer and CE, resolved four bacteria that produced the same length amplicon for the 16S rRNA domain V3 but have variable nucleotide content. Second, a microbial mat community profile was resolved and supported by NextGen sequencing where the number of peaks in the F-108 profile was in concordance with the confirmed species numbers in the mat. Third, equine DNA was analyzed for SNPs. The F-108 polymer was able to distinguish heterozygous and homozygous individuals for the melanocortin 1 receptor coat color gene. The method proved to be rapid, inexpensive, reproducible, and uses common CE instruments. The potential for F-108 to resolve DNA mixtures or SNPs can be applied to various sample types, from SNPs to forensic mixtures to ecological communities. PMID:25168595

  2. Identification of Deleterious SNPs and Their Effects on Structural Level in CHRNA3 Gene.

    PubMed

    Chandramohan, Vivek; Nagaraju, Navya; Rathod, Shrikant; Kaphle, Anubhav; Muddapur, Uday

    2015-08-01

    The aim of our study is to identify probable deleterious genetic variations that can alter the expression and the function of the CHRNA3 gene using in silico methods. Of the 2305 SNPs identified in the CHRNA3 gene, 115 were found to be non-synonymous and 12 and 15 nsSNPs were found to be in the 5' and 3' UTRs, respectively. Further, out of the 115 nsSNPs investigated, eight were predicted to be deleterious by both SIFT and PredictSNP servers. The major mutations predicted to affect the structure of the protein are phenylalanine to valine (Y43V) and lysine to asparagine (K216N) as shown by the trajectory run in molecular dynamics studies. The random transition of the protein structures over the simulation period caused by these mutations hints at how the native state is distorted which could lead to the loss of structural stability and functionality of the nicotinic acetylcholine receptors subunit α-3 protein. Based on this work, we propose that the nsSNP with SNP id of rs75495285 and rs76821682 will have comparatively more deleterious effects than the other predicted mutations in destabilizing the protein structure. PMID:26002565

  3. Design of a High Density SNP Genotyping Assay in the Pig Using SNPs Identified and Characterized by Next Generation Sequencing Technology

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The dissection of complex traits of economic importance for the pig industry requires the availability of a significant number of genetic markers, such as SNPs. This study was conducted in order to discover thousands of porcine SNPs using next generation sequencing technologies and use those SNPs, a...

  4. Drug Filtering

    NSDL National Science Digital Library

    This lesson from Illuminations looks at exponential decay. The example of how kidneys filter blood is used. The material asks students to determine the amount of a drug that remains in the body over a period of time. Students will predict behavior by an exponential decay model and graph an exponential set of data. The lesson is appropriate for grades 9-12 and should require 1 class period to complete.
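
    The exponential decay model this lesson describes can be written in a few lines. The sketch below assumes a fixed fraction of the drug is filtered out in each equal time period; the function name and the dose, fraction, and period values are illustrative and not taken from the lesson.

```python
def drug_remaining(initial_mg, fraction_removed, periods):
    """Amount left after `periods` equal intervals of exponential decay."""
    if not 0 <= fraction_removed <= 1:
        raise ValueError("fraction_removed must be between 0 and 1")
    return initial_mg * (1 - fraction_removed) ** periods


# Hypothetical dose: 1000 mg, with a quarter filtered out every period.
schedule = [drug_remaining(1000, 0.25, t) for t in range(5)]
```

    Plotting `schedule` against the period index gives the decaying curve students are asked to graph.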

  5. A real-time PCR genotyping assay to detect FAD2A SNPs in peanuts (Arachis hypogaea L.)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The high oleic (C18:1) phenotype in peanuts has been previously demonstrated to result from a homozygous recessive genotype (ol1ol1ol2ol2) in two homeologous fatty acid desaturase genes (FAD2A and FAD2B) with two key SNPs. These mutant SNPs, specifically G448A in FAD2A and 442insA in FAD2B, signifi...

  6. SNP Mining in Crassostrea gigas EST Data: Transferability to Four Other Crassostrea Species, Phylogenetic Inferences and Outlier SNPs under Selection

    PubMed Central

    Zhong, Xiaoxiao; Li, Qi; Yu, Hong; Kong, Lingfeng

    2014-01-01

    Oysters, with high levels of phenotypic plasticity and wide geographic distribution, are a challenging group for taxonomists and phylogenetics. Our study is intended to generate new EST-SNP markers and to evaluate their potential for cross-species utilization in phylogenetic study of the genus Crassostrea. In the study, 57 novel SNPs were developed from an EST database of C. gigas by the HRM (high-resolution melting) method. Transferability of 377 SNPs developed for C. gigas was examined on four other Crassostrea species: C. sikamea, C. angulata, C. hongkongensis and C. ariakensis. Among the 377 primer pairs tested, 311 (82.5%) primers showed amplification in C. sikamea, 353 (93.6%) in C. angulata, 254 (67.4%) in C. hongkongensis and 253 (67.1%) in C. ariakensis. A total of 214 SNPs were found to be transferable to all four species. Phylogenetic analyses showed that C. hongkongensis was a sister species of C. ariakensis and that this clade was sister to the clade containing C. sikamea, C. angulata and C. gigas. Within this clade, C. gigas and C. angulata had the closest relationship, with C. sikamea being the sister group. In addition, we detected eight SNPs as potentially being under selection by two outlier tests (fdist and hierarchical methods). The SNPs studied here should be useful for genetic diversity, comparative mapping and phylogenetic studies across species in Crassostrea and the candidate outlier SNPs are worth exploring in more detail regarding association genetics and functional studies. PMID:25238392
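
    The transferability percentages reported above follow directly from the amplification counts over the 377 primer pairs tested. A short check, with variable and function names chosen here for illustration:

```python
TOTAL_PRIMER_PAIRS = 377

# Number of primer pairs that amplified in each species, as reported in the abstract.
amplified = {
    "C. sikamea": 311,
    "C. angulata": 353,
    "C. hongkongensis": 254,
    "C. ariakensis": 253,
}


def transfer_rate(count, total=TOTAL_PRIMER_PAIRS):
    """Percentage of primer pairs that amplified, rounded to one decimal place."""
    return round(100 * count / total, 1)


rates = {species: transfer_rate(n) for species, n in amplified.items()}
```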

  7. Computational identification of pathogenic associated nsSNPs and its structural impact in UROD gene: a molecular dynamics approach.

    PubMed

    Doss, C George Priya; Magesh, R

    2014-11-01

    Uroporphyrinogen decarboxylase is a cytosolic enzyme involved in the biosynthetic pathway of heme production. Decreased activity of this enzyme results in porphyria cutanea tarda and hepato erythropoietic porphyria. Nonsynonymous single nucleotide polymorphisms (nsSNPs) alter protein sequence and can cause disease. Identifying the deleterious nsSNPs that contribute to disease is an important task. We used five different in silico tools namely SIFT, PANTHER, PolyPhen2, SNPs&GO, and I-mutant3 to identify deleterious nsSNPs in UROD gene. Further, we used molecular dynamic (MD) approach to evaluate the impact of deleterious mutations on UROD protein structure. By comparing the results of all the five prediction results, we screened 35 (51.47 %) nsSNPs as highly deleterious. MD analysis results show that all the three L161Q, L282R, and I334T deleterious variants were affecting the UROD protein structural stability and flexibility. Our findings provide strong evidence on the effect of deleterious nsSNPs in UROD gene. A detailed MD study provides a new insight in the conformational changes occurred in the mutant structures of UROD protein. PMID:24777812

  8. Ceramic filters

    SciTech Connect

    Holmes, B.L.; Janney, M.A.

    1995-12-31

    Filters were formed from ceramic fibers, organic fibers, and a ceramic bond phase using a papermaking technique. The distribution of particulate ceramic bond phase was determined using a model silicon carbide system. As the ceramic fiber increased in length and diameter the distance between particles decreased. The calculated number of particles per area showed good agreement with the observed value. After firing, the papers were characterized using a biaxial load test. The strength of papers was proportional to the amount of bond phase included in the paper. All samples exhibited strain-tolerant behavior.

  9. Genetic susceptibility to chronic otitis media with effusion: candidate gene SNPs

    PubMed Central

    MacArthur, Carol J.; Wilmot, Beth; Wang, Linda; Schuller, Michael; Lighthall, Jessyka; Trune, Dennis

    2014-01-01

    Objective The genetic factors leading to a predisposition to otitis media are not well understood. The objective of the current study was to develop a tag-single nucleotide polymorphism (SNP) panel to determine if there is an association between candidate gene polymorphisms and the development of chronic otitis media with effusion. Study Design A 1:1 case/control design of 100 cases and 100 controls was used. The study was limited to the chronic otitis media with effusion phenotype to increase the population homogeneity. Methods A panel of 192 tag-SNPs was selected. Saliva for DNA extraction was collected from 100 chronic otitis media with effusion cases and 100 controls. After quality control, 100 case and 79 control samples were available for hybridization. Genomic DNA from each subject was hybridized to the single nucleotide polymorphism probes, and genotypes were generated. Quality control across all samples and SNPs reduced the final SNPs used for analysis to 170. Each single nucleotide polymorphism was then analyzed for statistical association with chronic otitis media with effusion. Results Eight single nucleotide polymorphisms from 4 genes had an unadjusted p-value of <0.05 for association with the chronic otitis media with effusion phenotype (TLR4, MUC5B, SMAD2, SMAD4); five of these polymorphisms were in the TLR4 gene. Conclusion While these results need to be replicated in a novel population, the presence of 5 single nucleotide polymorphisms in the TLR4 gene having association with chronic otitis media with effusion in our study population lends evidence for the possible role of this gene in the susceptibility to otitis media. PMID:23929584

  10. In silico analysis of Single Nucleotide Polymorphisms (SNPs) in human BRAF gene.

    PubMed

    Hussain, Muhammad Ramzan Manwar; Shaik, Noor Ahmad; Al-Aama, Jumana Yousuf; Asfour, Hani Z; Khan, Fatima Subhani; Masoodi, Tariq Ahmad; Khan, Muhammad Akhtar; Shaik, Nazia Sultana

    2012-10-25

    BRAF gene mutations are frequently seen in both inherited and somatic diseases. However, the harmful mutations for BRAF gene have not been predicted in silico. Owing to the importance of BRAF gene in cell division, differentiation and secretion processes, the functional analysis was carried out to explore the possible association between genetic mutations and phenotypic variations. Genomic analysis of BRAF was initiated with SIFT followed by PolyPhen and SNPs&GO servers to retrieve the 85 deleterious non-synonymous SNPs (nsSNPs) from dbSNP. A total of 5 mutations i.e. c.406T>G (S136A), c.1446G>T (R462I), c.1556 A>G (K499E), c.1860 T>A (V600E) and c.2352 C>T (P764L) that are found to exert benign effects on the BRAF protein structure and function were chosen for further analysis. Protein structural analysis with these amino acid variants was performed by using I-Mutant, FOLD-X, HOPE, NetSurfP, Swiss PDB viewer, Chimera and NOMAD-Ref servers to check their solvent accessibility, molecular dynamics and energy minimization calculations. Our in silico analysis suggested that S136A and P764L variants of BRAF could directly or indirectly destabilize the amino acid interactions and hydrogen bond networks thus explain the functional deviations of protein to some extent. Screening for BRAF, S136A and P764L variants may be useful for disease molecular diagnosis and also to design the molecular inhibitors of BRAF pathways. PMID:22824468

  11. SNPs detected in the yak MC4R gene and their association with growth traits.

    PubMed

    Cai, X; Mipam, T D; Zhao, F F; Sun, L

    2015-07-01

    MC4R (melanocortin 4 receptor) is expressed in the appetite-regulating areas of the brain and takes part in leptin signaling pathways. Sequencing of the coding region of the MC4R gene for 354 yaks identified the following five single nucleotide polymorphisms (SNPs): SNP1 (273C>T), SNP2 (321 G>T), SNP3 (864 C>A), SNP4 (1069G>C) and SNP5 (1206 G>C). SNP1, SNP2 and SNP3 were synonymous mutations, whereas SNP4 and SNP5 were missense mutations resulting in amino acid substitutions (V286L and R331S). Pairwise linkage disequilibrium (LD) analysis indicated that two pairs of SNPs, SNP2 and SNP5 (r² = 0.81027) and SNP4 and SNP5 (r² = 0.53816), exhibited higher degrees of LD. CC genotype of SNP4, CGACG and CTCCC haplotypes for all SNPs were associated with increased BW of animals that were 18 months old and with the average daily gain. The secondary structure and transmembrane region prediction of the yak MC4R protein suggested that SNP4 was correlated with influential changes in the seventh transmembrane domain of the MC4R protein and with the functional deterioration or even incapacitation of MC4R, which may contribute to the increased feed intake, BW and average daily gain of the yaks with CC genotypes. The data from this study suggested that 1069G>C SNP of the MC4R gene could be used in marker-assisted selection of growth traits in the Maiwa yak breed. PMID:25757688
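
    The pairwise LD statistic quoted above is the squared correlation between alleles at two loci. The sketch below computes it from phased haplotypes, which is a simplifying assumption: the study had unphased genotypes and the abstract does not describe how haplotype frequencies were estimated. The function name and example data are hypothetical.

```python
def ld_r2(haplotypes):
    """Squared LD correlation for two biallelic loci.

    `haplotypes` is a list of (a, b) pairs, each allele coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when either locus is monomorphic")
    return d * d / denom


# Perfectly coupled loci versus loci in linkage equilibrium.
complete_ld = ld_r2([(1, 1)] * 4 + [(0, 0)] * 4)
no_ld = ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)])
```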

  12. Residential proximity to electromagnetic field sources and birth weight: Minimizing residual confounding using multiple imputation and propensity score matching.

    PubMed

    de Vocht, Frank; Lee, Brian

    2014-08-01

    Studies have suggested that residential exposure to extremely low frequency (50 Hz) electromagnetic fields (ELF-EMF) from high voltage cables, overhead power lines, electricity substations or towers is associated with reduced birth weight and may be associated with adverse birth outcomes or even miscarriages. We previously conducted a study of 140,356 singleton live births between 2004 and 2008 in Northwest England, which suggested that close residential proximity (≤ 50 m) to ELF-EMF sources was associated with reduced average birth weight of 212 g (95%CI: -395 to -29 g) but not with statistically significant increased risks for other adverse perinatal outcomes. However, the cohort was limited by missing data for most potentially confounding variables including maternal smoking during pregnancy, which was only available for a small subgroup, while also residual confounding could not be excluded. This study, using the same cohort, was conducted to minimize the effects of these problems using multiple imputation to address missing data and propensity score matching to minimize residual confounding. Missing data were imputed using multiple imputation using chained equations to generate five datasets. For each dataset 115 exposed women (residing ≤ 50 m from a residential ELF-EMF source) were propensity score matched to 1150 unexposed women. After doubly robust confounder adjustment, close proximity to a residential ELF-EMF source remained associated with a reduction in birth weight of -116 g (95% confidence interval: -224:-7 g). No effect was found for proximity ≤ 100 m compared to women living further away. These results indicate that although the effect size was about half of the effect previously reported, close maternal residential proximity to sources of ELF-EMF remained associated with suboptimal fetal growth. PMID:24815339

  13. Improved Ancestry Estimation for both Genotyping and Sequencing Data using Projection Procrustes Analysis and Genotype Imputation.

    PubMed

    Wang, Chaolong; Zhan, Xiaowei; Liang, Liming; Abecasis, Gonçalo R; Lin, Xihong

    2015-06-01

    Accurate estimation of individual ancestry is important in genetic association studies, especially when a large number of samples are collected from multiple sources. However, existing approaches developed for genome-wide SNP data do not work well with modest amounts of genetic data, such as in targeted sequencing or exome chip genotyping experiments. We propose a statistical framework to estimate individual ancestry in a principal component ancestry map generated by a reference set of individuals. This framework extends and improves upon our previous method for estimating ancestry using low-coverage sequence reads (LASER 1.0) to analyze either genotyping or sequencing data. In particular, we introduce a projection Procrustes analysis approach that uses high-dimensional principal components to estimate ancestry in a low-dimensional reference space. Using extensive simulations and empirical data examples, we show that our new method (LASER 2.0), combined with genotype imputation on the reference individuals, can substantially outperform LASER 1.0 in estimating fine-scale genetic ancestry. Specifically, LASER 2.0 can accurately estimate fine-scale ancestry within Europe using either exome chip genotypes or targeted sequencing data with off-target coverage as low as 0.05×. Under the framework of LASER 2.0, we can estimate individual ancestry in a shared reference space for samples assayed at different loci or by different techniques. Therefore, our ancestry estimation method will accelerate discovery in disease association studies not only by helping model ancestry within individual studies but also by facilitating combined analysis of genetic data from multiple sources. PMID:26027497

  14. Filtering Water

    NSDL National Science Digital Library

    Brieske, Joel A.

    2003-01-01

    The first site related to water filtration is from the US Environmental Protection Agency entitled EPA Environmental Education: Water Filtration (1). The two-page document explains the need for water filtration and the steps water treatment plants take to purify water. To further understand the process, a demonstration project is provided that illustrates these purification steps, which include coagulation, sedimentation, filtration, and disinfection. The second site is an interesting Flash animation called Filtration: How Does it Work (2) provided by Canada's Prairie Farm Rehabilitation Administration. Visitors will learn various types of filtration procedures and systems and the materials that are used such as carbon and sand. Next, from the National Science Foundation is a learning activity called Get Out the Gunk (3). Using just a few simple items from around the house, kids will be able to answer questions like "Does a filter work better with a lot of water rushing through, or a small trickle?" and "Does it make the water cleaner if you pour it through a filter twice?" The fourth Web site, Rapid Sand Filtration (4), is provided by Dottie Schmitt and Christie Shinault of Virginia Tech. The authors describe the process, which involves the flow of water through a bed of granular media, normally following settling basins in conventional water treatment trains to remove any particulate matter left over after flocculation and settling. Along with its thorough description, readers can view illustrations and photographs that further explain the process. The Vegetative Buffer Strips for Improved Surface Water Quality (5) Web site is provided by the Iowa State University Extension office. The document explains what vegetative buffer strips are, how they filter contaminants and sediment from surface water, how effective they are, and more. The sixth offering is a file called Infiltration Basins and Trenches (6) that is offered by the University of Wisconsin Extension. These structures are intended to collect water, have it infiltrate into the ground, and have it purified along the way. This document explains how effective they are at removing pollutants, how to install them, design guidelines, maintenance, and more. Next, from a site called Wilderness Survival.net is the Water Filtration Devices (7) page. Visitors read how to make a filtering system out of cloth, sand, crushed rock, charcoal, or a hollow log, although as is stated, the water still has to be purified. The last site, from the US Geological Survey, is called A Visit to a Wastewater-Treatment Plant: Primary Treatment of Wastewater (8). Although geared towards children, the site does a good job of explaining what happens at each stage of the treatment process and how pollutants are removed to help keep water clean. Everything from screening, pumping, aerating, sludge and scum removal, killing bacteria, and what is done with wastewater residuals is covered.

  15. Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs.

    PubMed

    Lee, S Hong; Ripke, Stephan; Neale, Benjamin M; Faraone, Stephen V; Purcell, Shaun M; Perlis, Roy H; Mowry, Bryan J; Thapar, Anita; Goddard, Michael E; Witte, John S; Absher, Devin; Agartz, Ingrid; Akil, Huda; Amin, Farooq; Andreassen, Ole A; Anjorin, Adebayo; Anney, Richard; Anttila, Verneri; Arking, Dan E; Asherson, Philip; Azevedo, Maria H; Backlund, Lena; Badner, Judith A; Bailey, Anthony J; Banaschewski, Tobias; Barchas, Jack D; Barnes, Michael R; Barrett, Thomas B; Bass, Nicholas; Battaglia, Agatino; Bauer, Michael; Bayés, Mònica; Bellivier, Frank; Bergen, Sarah E; Berrettini, Wade; Betancur, Catalina; Bettecken, Thomas; Biederman, Joseph; Binder, Elisabeth B; Black, Donald W; Blackwood, Douglas H R; Bloss, Cinnamon S; Boehnke, Michael; Boomsma, Dorret I; Breen, Gerome; Breuer, René; Bruggeman, Richard; Cormican, Paul; Buccola, Nancy G; Buitelaar, Jan K; Bunney, William E; Buxbaum, Joseph D; Byerley, William F; Byrne, Enda M; Caesar, Sian; Cahn, Wiepke; Cantor, Rita M; Casas, Miguel; Chakravarti, Aravinda; Chambert, Kimberly; Choudhury, Khalid; Cichon, Sven; Cloninger, C Robert; Collier, David A; Cook, Edwin H; Coon, Hilary; Cormand, Bru; Corvin, Aiden; Coryell, William H; Craig, David W; Craig, Ian W; Crosbie, Jennifer; Cuccaro, Michael L; Curtis, David; Czamara, Darina; Datta, Susmita; Dawson, Geraldine; Day, Richard; De Geus, Eco J; Degenhardt, Franziska; Djurovic, Srdjan; Donohoe, Gary J; Doyle, Alysa E; Duan, Jubao; Dudbridge, Frank; Duketis, Eftichia; Ebstein, Richard P; Edenberg, Howard J; Elia, Josephine; Ennis, Sean; Etain, Bruno; Fanous, Ayman; Farmer, Anne E; Ferrier, I Nicol; Flickinger, Matthew; Fombonne, Eric; Foroud, Tatiana; Frank, Josef; Franke, Barbara; Fraser, Christine; Freedman, Robert; Freimer, Nelson B; Freitag, Christine M; Friedl, Marion; Frisén, Louise; Gallagher, Louise; Gejman, Pablo V; Georgieva, Lyudmila; Gershon, Elliot S; Geschwind, Daniel H; Giegling, Ina; Gill, Michael; Gordon, Scott D; Gordon-Smith, Katherine; Green, Elaine K; Greenwood, Tiffany A; Grice, Dorothy E; Gross, Magdalena; Grozeva, Detelina; Guan, Weihua; Gurling, Hugh; De Haan, Lieuwe; Haines, Jonathan L; Hakonarson, Hakon; Hallmayer, Joachim; Hamilton, Steven P; Hamshere, Marian L; Hansen, Thomas F; Hartmann, Annette M; Hautzinger, Martin; Heath, Andrew C; Henders, Anjali K; Herms, Stefan; Hickie, Ian B; Hipolito, Maria; Hoefels, Susanne; Holmans, Peter A; Holsboer, Florian; Hoogendijk, Witte J; Hottenga, Jouke-Jan; Hultman, Christina M; Hus, Vanessa; Ingason, Andrés; Ising, Marcus; Jamain, Stéphane; Jones, Edward G; Jones, Ian; Jones, Lisa; Tzeng, Jung-Ying; Kähler, Anna K; Kahn, René S; Kandaswamy, Radhika; Keller, Matthew C; Kennedy, James L; Kenny, Elaine; Kent, Lindsey; Kim, Yunjung; Kirov, George K; Klauck, Sabine M; Klei, Lambertus; Knowles, James A; Kohli, Martin A; Koller, Daniel L; Konte, Bettina; Korszun, Ania; Krabbendam, Lydia; Krasucki, Robert; Kuntsi, Jonna; Kwan, Phoenix; Landén, Mikael; Långström, Niklas; Lathrop, Mark; Lawrence, Jacob; Lawson, William B; Leboyer, Marion; Ledbetter, David H; Lee, Phil H; Lencz, Todd; Lesch, Klaus-Peter; Levinson, Douglas F; Lewis, Cathryn M; Li, Jun; Lichtenstein, Paul; Lieberman, Jeffrey A; Lin, Dan-Yu; Linszen, Don H; Liu, Chunyu; Lohoff, Falk W; Loo, Sandra K; Lord, Catherine; Lowe, Jennifer K; Lucae, Susanne; MacIntyre, Donald J; Madden, Pamela A F; Maestrini, Elena; Magnusson, Patrik K E; Mahon, Pamela B; Maier, Wolfgang; Malhotra, Anil K; Mane, Shrikant M; Martin, Christa L; Martin, Nicholas G; Mattheisen, Manuel; Matthews, Keith; Mattingsdal, Morten; McCarroll, Steven A; McGhee, Kevin A; McGough, James J; McGrath, Patrick J; McGuffin, Peter; McInnis, Melvin G; McIntosh, Andrew; McKinney, Rebecca; McLean, Alan W; McMahon, Francis J; McMahon, William M; McQuillin, Andrew; Medeiros, Helena; Medland, Sarah E; Meier, Sandra; Melle, Ingrid; Meng, Fan; Meyer, Jobst; Middeldorp, Christel M; Middleton, Lefkos; Milanova, Vihra; Miranda, Ana; Monaco, Anthony P; Montgomery, Grant W; Moran, Jennifer L; Moreno-De-Luca, Daniel; Morken, Gunnar; Morris, Derek W; Morrow, Eric M; Moskvina, Valentina; Muglia, Pierandrea; Mühleisen, Thomas W; Muir, Walter J; Müller-Myhsok, Bertram; Murtha, Michael; Myers, Richard M; Myin-Germeys, Inez; Neale, Michael C; Nelson, Stan F; Nievergelt, Caroline M; Nikolov, Ivan; Nimgaonkar, Vishwajit; Nolen, Willem A; Nöthen, Markus M; Nurnberger, John I; Nwulia, Evaristus A; Nyholt, Dale R; O'Dushlaine, Colm; Oades, Robert D; Olincy, Ann; Oliveira, Guiomar; Olsen, Line; Ophoff, Roel A; Osby, Urban; Owen, Michael J; Palotie, Aarno; Parr, Jeremy R

    2013-09-01

    Most psychiatric disorders are moderately to highly heritable. The degree to which genetic variation is unique to individual disorders or shared across disorders is unclear. To examine shared genetic etiology, we use genome-wide genotype data from the Psychiatric Genomics Consortium (PGC) for cases and controls in schizophrenia, bipolar disorder, major depressive disorder, autism spectrum disorders (ASD) and attention-deficit/hyperactivity disorder (ADHD). We apply univariate and bivariate methods for the estimation of genetic variation within and covariation between disorders. SNPs explained 17-29% of the variance in liability. The genetic correlation calculated using common SNPs was high between schizophrenia and bipolar disorder (0.68 ± 0.04 s.e.), moderate between schizophrenia and major depressive disorder (0.43 ± 0.06 s.e.), bipolar disorder and major depressive disorder (0.47 ± 0.06 s.e.), and ADHD and major depressive disorder (0.32 ± 0.07 s.e.), low between schizophrenia and ASD (0.16 ± 0.06 s.e.) and non-significant for other pairs of disorders as well as between psychiatric disorders and the negative control of Crohn's disease. This empirical evidence of shared genetic etiology for psychiatric disorders can inform nosology and encourages the investigation of common pathophysiologies for related disorders. PMID:23933821

  16. SNPs in human miRNA genes affect biogenesis and function

    PubMed Central

    Sun, Guihua; Yan, Jin; Noltner, Katie; Feng, Jinong; Li, Haitang; Sarkis, Daniel A.; Sommer, Steve S.; Rossi, John J.

    2009-01-01

    MicroRNAs (miRNAs) are 21–25-nucleotide-long, noncoding RNAs that are involved in translational regulation. Most miRNAs derive from a two-step sequential processing: the generation of pre-miRNA from pri-miRNA by the Drosha/DGCR8 complex in the nucleus, and the generation of mature miRNAs from pre-miRNAs by the Dicer/TRBP complex in the cytoplasm. Sequence variation around the processing sites, and sequence variations in the mature miRNA, especially the seed sequence, may have profound effects on miRNA biogenesis and function. In the context of analyzing the roles of miRNAs in Schizophrenia and Autism, we defined at least 24 human X-linked miRNA variants. Functional assays were developed and performed on these variants. In this study we investigate the effects of single nucleotide polymorphisms (SNPs) on the generation of mature miRNAs and their function, and report that naturally occurring SNPs can impair or enhance miRNA processing as well as alter the sites of processing. Since miRNAs are small functional units, single base changes in both the precursor elements as well as the mature miRNA sequence may drive the evolution of new microRNAs by altering their biological function. Finally, the miRNAs examined in this study are X-linked, suggesting that the mutant alleles could be determinants in the etiology of diseases. PMID:19617315

  17. A North American Yersinia pestis Draft Genome Sequence: SNPs and Phylogenetic Analysis

    PubMed Central

    Hao, Jicheng; Mastrian, Stephen D.; Shah, Maulik K.; Vogler, Amy J.; Allender, Christopher J.; Clark, Erin A.; Benitez, Debbie S.; Youngkin, David J.; Girard, Jessica M.; Auerbach, Raymond K.; Beckstrom-Sternberg, Stephen M.; Keim, Paul

    2007-01-01

    Background: Yersinia pestis, the causative agent of plague, is responsible for some of the greatest epidemic scourges of mankind. It is widespread in the western United States, although it has only been present there for just over 100 years. As a result, there has been very little time for diversity to accumulate in this region. Much of the diversity that has been detected among North American isolates is at loci that mutate too quickly to accurately reconstruct large-scale phylogenetic patterns. Slowly-evolving but stable markers such as SNPs could be useful for this purpose, but are difficult to identify due to the monomorphic nature of North American isolates. Methodology/Principal Findings: To identify SNPs that are polymorphic among North American populations of Y. pestis, a gapped genome sequence of Y. pestis strain FV-1 was generated. Sequence comparison of FV-1 with another North American strain, CO92, identified 19 new SNP loci that differ among North American isolates. Conclusions/Significance: The 19 SNP loci identified in this study should facilitate additional studies of the genetic population structure of Y. pestis across North America. PMID:17311096

  18. Genetic relationship between five psychiatric disorders estimated from genome-wide SNPs

    PubMed Central

    2013-01-01

    Most psychiatric disorders are moderately to highly heritable. The degree to which genetic variation is unique to individual disorders or shared across disorders is unclear. To examine shared genetic etiology, we use genome-wide genotype data from the Psychiatric Genomics Consortium (PGC) for cases and controls in schizophrenia, bipolar disorder, major depressive disorder, autism spectrum disorders (ASD) and attention-deficit/hyperactivity disorder (ADHD). We apply univariate and bivariate methods for the estimation of genetic variation within and covariation between disorders. SNPs explained 17–29% of the variance in liability. The genetic correlation calculated using common SNPs was high between schizophrenia and bipolar disorder (0.68 ± 0.04 s.e.), moderate between schizophrenia and major depressive disorder (0.43 ± 0.06 s.e.), bipolar disorder and major depressive disorder (0.47 ± 0.06 s.e.), and ADHD and major depressive disorder (0.32 ± 0.07 s.e.), low between schizophrenia and ASD (0.16 ± 0.06 s.e.) and non-significant for other pairs of disorders as well as between psychiatric disorders and the negative control of Crohn’s disease. This empirical evidence of shared genetic etiology for psychiatric disorders can inform nosology and encourages the investigation of common pathophysiologies for related disorders. PMID:23933821

  19. SNPs Previously Associated with Dupuytren’s Disease Replicated in a North American Cohort

    PubMed Central

    Anderson, Eric R.; Ye, Zhan; Caldwell, Michael D.; Burmester, James K.

    2014-01-01

    Objective: Dupuytren’s disease is a progressive fibrosis of the hand that often results in debilitating flexion contractures. Its etiology is not completely understood but likely involves both genetic and environmental factors. A recent study performed in Europe identified DNA variants that associate with Dupuytren’s disease. Given the likelihood for genetic variation among populations, we planned to validate the genetic variants identified by this study in a North American population. Methods: In the Marshfield Clinic’s Personalized Medicine Research Project, 296 cases with Dupuytren’s disease were identified and matched 3-to-1 to controls without Dupuytren’s disease. Clinical data were abstracted from the electronic medical record. The top 12 single nucleotide polymorphisms (SNPs) from the European study were selected and tested in a multiplex assay using the MassArray Analyzer 4 (Sequenom, Inc., San Diego, CA). Differences in allele frequency were determined, and variants with a P value of <0.004 were considered significant. Results: We replicated 5 of the 12 SNPs previously reported to be associated with Dupuytren’s disease. Conclusion: Our findings support a role for the Wnt signaling pathway in the development of Dupuytren’s disease, and suggest that further study of this pathway may result in early diagnosis and non-surgical treatments for Dupuytren’s disease. PMID:24573701
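
    The abstract does not say how the significance cutoff of 0.004 was chosen. One plausible reading, offered here only as an assumption, is a Bonferroni correction of a 0.05 family-wise level across the 12 SNPs tested, since 0.05 / 12 ≈ 0.0042. The sketch below shows that arithmetic; the function names and the p-values are hypothetical and are not results from the study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def flag_significant(p_values, alpha=0.05):
    """Return one boolean per p-value: True if it passes the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


# 12 tests at a family-wise alpha of 0.05 (assumed, not stated in the abstract).
print(round(bonferroni_threshold(0.05, 12), 4))

# Made-up p-values for illustration only.
toy_p = [0.0001, 0.003, 0.02, 0.2, 0.004, 0.5, 0.7, 0.0009, 0.04, 0.9, 0.06, 0.3]
print(flag_significant(toy_p))
```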

  20. Genomics and introgression: discovery and mapping of thousands of species-diagnostic SNPs using RAD sequencing

    USGS Publications Warehouse

    Hand, Brian K; Hether, Tyler D; Kovach, Ryan P.; Muhlfeld, Clint C.; Amish, Stephen J.; Boyer, Matthew C.; O’Rourke, Sean M.; Miller, Michael R.; Lowe, Winsor H.; Hohenlohe, Paul A.; Luikart, Gordon

    2015-01-01

    Invasive hybridization and introgression pose a serious threat to the persistence of many native species. Understanding the effects of hybridization on native populations (e.g., fitness consequences) requires numerous species-diagnostic loci distributed genome-wide. Here we used RAD sequencing to discover thousands of single-nucleotide polymorphisms (SNPs) that are diagnostic between rainbow trout (RBT, Oncorhynchus mykiss), the world’s most widely introduced fish, and native westslope cutthroat trout (WCT, O. clarkii lewisi) in the northern Rocky Mountains, USA. We advanced previous work that identified 4,914 species-diagnostic loci by using longer sequence reads (100 bp vs. 60 bp) and a larger set of individuals (n = 84). We sequenced RAD libraries for individuals from diverse sampling sources, including native populations of WCT and hatchery broodstocks of WCT and RBT. We also took advantage of a newly released reference genome assembly for RBT to align our RAD loci. In total, we discovered 16,788 putatively diagnostic SNPs, 10,267 of which we mapped to anchored chromosome locations on the RBT genome. A small portion of previously discovered putative diagnostic loci (325 of 4,914) were no longer diagnostic (i.e., fixed between species) based on our wider survey of non-hybridized RBT and WCT individuals. Our study suggests that RAD loci mapped to a draft genome assembly could provide the marker density required to identify genes and chromosomal regions influencing selection in admixed populations of conservation concern and evolutionary interest.

  1. A new GWAS and meta-analysis with 1000Genomes imputation identifies novel risk variants for colorectal cancer

    PubMed Central

    Al-Tassan, Nada A.; Whiffin, Nicola; Hosking, Fay J.; Palles, Claire; Farrington, Susan M.; Dobbins, Sara E.; Harris, Rebecca; Gorman, Maggie; Tenesa, Albert; Meyer, Brian F.; Wakil, Salma M.; Kinnersley, Ben; Campbell, Harry; Martin, Lynn; Smith, Christopher G.; Idziaszczyk, Shelley; Barclay, Ella; Maughan, Timothy S.; Kaplan, Richard; Kerr, Rachel; Kerr, David; Buchannan, Daniel D.; Ko Win, Aung; Hopper, John; Jenkins, Mark; Lindor, Noralane M.; Newcomb, Polly A.; Gallinger, Steve; Conti, David; Schumacher, Fred; Casey, Graham; Dunlop, Malcolm G.; Tomlinson, Ian P.; Cheadle, Jeremy P.; Houlston, Richard S.

    2015-01-01

    Genome-wide association studies (GWAS) of colorectal cancer (CRC) have identified 23 susceptibility loci thus far. Analyses of previously conducted GWAS indicate additional risk loci are yet to be discovered. To identify novel CRC susceptibility loci, we conducted a new GWAS and performed a meta-analysis with five published GWAS (totalling 7,577 cases and 9,979 controls of European ancestry), imputing genotypes utilising the 1000 Genomes Project. The combined analysis identified new, significant associations with CRC at 1p36.2 marked by rs72647484 (minor allele frequency [MAF] = 0.09) near CDC42 and WNT4 (P = 1.21 × 10⁻⁸, odds ratio [OR] = 1.21) and at 16q24.1 marked by rs16941835 (MAF = 0.21, P = 5.06 × 10⁻⁸; OR = 1.15) within the long non-coding RNA (lncRNA) RP11-58A18.1 and ~500 kb from the nearest coding gene FOXL1. Additionally we identified a promising association at 10p13 with rs10904849 intronic to CUBN (MAF = 0.32, P = 7.01 × 10⁻⁸; OR = 1.14). These findings provide further insights into the genetic and biological basis of inherited genetic susceptibility to CRC. Additionally, our analysis further demonstrates that imputation can be used to exploit GWAS data to identify novel disease-causing variants. PMID:25990418

  2. Regression imputation with ground air temperature for the satellite-based lake and reservoir temperature database in Japan

    NASA Astrophysics Data System (ADS)

    Tonooka, Hideyuki

    2012-10-01

    Water temperature monitoring for inland water bodies such as lakes and reservoirs is important for biodiversity conservation and global warming monitoring. However, most inland water bodies, apart from a few large ones, have been monitored for water temperature only partially or not at all, partly because in-situ temperature measurements are not easy for small water bodies that are widely scattered and variously managed by individuals, companies, governments, etc. Thus, the satellite-based lake and reservoir temperature database in Japan (SatLARTD-J) has been under development since 2009. At present, the database contains surface temperature data for 934 water bodies, retrieved from thermal infrared (TIR) images of the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) instrument onboard NASA's Terra satellite, but its temporal resolution is only four times per year on average. In order to improve this, the author demonstrates regression imputation for SatLARTD-J using ground air temperature data provided by the Automated Meteorological Data Acquisition System (AMeDAS) operated by the Japan Meteorological Agency. The validation study using in-situ data from two Japanese lakes indicates that the expected imputation error will be about 2 K.
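
    Regression imputation, as named in this record, fills gaps in one series by predicting them from a related, more densely sampled series. The sketch below is a minimal illustration that assumes a simple least-squares line of water temperature on air temperature; the abstract does not give the model form actually used for SatLARTD-J, and the function names and toy values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def impute_missing(air_temp, water_temp):
    """Fill None entries of water_temp with predictions from air_temp.

    The line is fitted only on dates where both series are observed;
    observed water temperatures are returned unchanged.
    """
    paired = [(a, w) for a, w in zip(air_temp, water_temp) if w is not None]
    slope, intercept = fit_line([a for a, _ in paired], [w for _, w in paired])
    return [w if w is not None else slope * a + intercept
            for a, w in zip(air_temp, water_temp)]


# Toy series in degrees Celsius (made up, not data from the database).
air = [0.0, 10.0, 20.0, 30.0]
water = [2.0, None, 12.0, 17.0]
print(impute_missing(air, water))
```

    In practice the fit would be made per water body and the imputation error checked against in-situ measurements, as the record describes for its two validation lakes.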

  3. A new GWAS and meta-analysis with 1000Genomes imputation identifies novel risk variants for colorectal cancer.

    PubMed

    Al-Tassan, Nada A; Whiffin, Nicola; Hosking, Fay J; Palles, Claire; Farrington, Susan M; Dobbins, Sara E; Harris, Rebecca; Gorman, Maggie; Tenesa, Albert; Meyer, Brian F; Wakil, Salma M; Kinnersley, Ben; Campbell, Harry; Martin, Lynn; Smith, Christopher G; Idziaszczyk, Shelley; Barclay, Ella; Maughan, Timothy S; Kaplan, Richard; Kerr, Rachel; Kerr, David; Buchannan, Daniel D; Ko Win, Aung; Hopper, John; Jenkins, Mark; Lindor, Noralane M; Newcomb, Polly A; Gallinger, Steve; Conti, David; Schumacher, Fred; Casey, Graham; Dunlop, Malcolm G; Tomlinson, Ian P; Cheadle, Jeremy P; Houlston, Richard S

    2015-01-01

    Genome-wide association studies (GWAS) of colorectal cancer (CRC) have identified 23 susceptibility loci thus far. Analyses of previously conducted GWAS indicate additional risk loci are yet to be discovered. To identify novel CRC susceptibility loci, we conducted a new GWAS and performed a meta-analysis with five published GWAS (totalling 7,577 cases and 9,979 controls of European ancestry), imputing genotypes utilising the 1000 Genomes Project. The combined analysis identified new, significant associations with CRC at 1p36.2 marked by rs72647484 (minor allele frequency [MAF] = 0.09) near CDC42 and WNT4 (P = 1.21 × 10(-8), odds ratio [OR] = 1.21) and at 16q24.1 marked by rs16941835 (MAF = 0.21, P = 5.06 × 10(-8); OR = 1.15) within the long non-coding RNA (lncRNA) RP11-58A18.1 and ~500 kb from the nearest coding gene FOXL1. Additionally we identified a promising association at 10p13 with rs10904849 intronic to CUBN (MAF = 0.32, P = 7.01 × 10(-8); OR = 1.14). These findings provide further insights into the genetic and biological basis of inherited genetic susceptibility to CRC. Additionally, our analysis further demonstrates that imputation can be used to exploit GWAS data to identify novel disease-causing variants. PMID:25990418

  4. Insights into Diversity and Imputed Metabolic Potential of Bacterial Communities in the Continental Shelf of Agatti Island

    PubMed Central

    Kumbhare, Shreyas V.; Dhotre, Dhiraj P.; Dhar, Sunil Kumar; Jani, Kunal; Apte, Deepak A.; Shouche, Yogesh S.; Sharma, Avinash

    2015-01-01

    Marine microbes play a key role and contribute largely to the global biogeochemical cycles. This study aims to explore microbial diversity from one such ecological hotspot, the continental shelf of Agatti Island. Sediment samples from various depths of the continental shelf were analyzed for bacterial diversity using deep sequencing technology along with the culturable approach. Additionally, imputed metagenomic approach was carried out to understand the functional aspects of microbial community especially for microbial genes important in nutrient uptake, survival and biogeochemical cycling in the marine environment. Using culturable approach, 28 bacterial strains representing 9 genera were isolated from various depths of continental shelf. The microbial community structure throughout the samples was dominated by phylum Proteobacteria and harbored various bacterioplanktons as well. Significant differences were observed in bacterial diversity within a short region of the continental shelf (1–40 meters) i.e. between upper continental shelf samples (UCS) with lesser depths (i.e. 1–20 meters) and lower continental shelf samples (LCS) with greater depths (i.e. 25–40 meters). By using imputed metagenomic approach, this study also discusses several adaptive mechanisms which enable microbes to survive in nutritionally deprived conditions, and also help to understand the influence of nutrition availability on bacterial diversity. PMID:26066038

  5. Insights into Diversity and Imputed Metabolic Potential of Bacterial Communities in the Continental Shelf of Agatti Island.

    PubMed

    Kumbhare, Shreyas V; Dhotre, Dhiraj P; Dhar, Sunil Kumar; Jani, Kunal; Apte, Deepak A; Shouche, Yogesh S; Sharma, Avinash

    2015-01-01

    Marine microbes play a key role and contribute largely to the global biogeochemical cycles. This study aims to explore microbial diversity from one such ecological hotspot, the continental shelf of Agatti Island. Sediment samples from various depths of the continental shelf were analyzed for bacterial diversity using deep sequencing technology along with the culturable approach. Additionally, imputed metagenomic approach was carried out to understand the functional aspects of microbial community especially for microbial genes important in nutrient uptake, survival and biogeochemical cycling in the marine environment. Using culturable approach, 28 bacterial strains representing 9 genera were isolated from various depths of continental shelf. The microbial community structure throughout the samples was dominated by phylum Proteobacteria and harbored various bacterioplanktons as well. Significant differences were observed in bacterial diversity within a short region of the continental shelf (1-40 meters) i.e. between upper continental shelf samples (UCS) with lesser depths (i.e. 1-20 meters) and lower continental shelf samples (LCS) with greater depths (i.e. 25-40 meters). By using imputed metagenomic approach, this study also discusses several adaptive mechanisms which enable microbes to survive in nutritionally deprived conditions, and also help to understand the influence of nutrition availability on bacterial diversity. PMID:26066038

  6. Identification of Pyrus Single Nucleotide Polymorphisms (SNPs) and Evaluation for Genetic Mapping in European Pear and Interspecific Pyrus Hybrids

    PubMed Central

    Troggio, Michela; Malnoy, Mickael; Velasco, Riccardo; Fontana, Paolo; Won, KyungHo; Durel, Charles-Eric; Perchepied, Laure; Schaffer, Robert; Wiedow, Claudia; Bus, Vincent; Brewer, Lester; Gardiner, Susan E.; Crowhurst, Ross N.; Chagné, David

    2013-01-01

    We have used new generation sequencing (NGS) technologies to identify single nucleotide polymorphism (SNP) markers from three European pear (Pyrus communis L.) cultivars and subsequently developed a subset of 1096 pear SNPs into high throughput markers by combining them with the set of 7692 apple SNPs on the IRSC apple Infinium® II 8K array. We then evaluated this apple and pear Infinium® II 9K SNP array for large-scale genotyping in pear across several species, using both pear and apple SNPs. The segregating populations employed for array validation included a segregating population of European pear (‘Old Home’ × ‘Louise Bon Jersey’) and four interspecific breeding families derived from Asian (P. pyrifolia Nakai and P. bretschneideri Rehd.) and European pear pedigrees. In total, we mapped 857 polymorphic pear markers to construct the first SNP-based genetic maps for pear, comprising 78% of the total pear SNPs included in the array. In addition, 1031 SNP markers derived from apple (13% of the total apple SNPs included in the array) were polymorphic and were mapped in one or more of the pear populations. These results are the first to demonstrate SNP transferability across the genera Malus and Pyrus. Our construction of high density SNP-based and gene-based genetic maps in pear represents an important step towards the identification of chromosomal regions associated with a range of horticultural characters, such as pest and disease resistance, orchard yield and fruit quality. PMID:24155917

  7. Identification of novel single nucleotide polymorphisms (SNPs) in deer (Odocoileus spp.) using the BovineSNP50 BeadChip.

    PubMed

    Haynes, Gwilym D; Latch, Emily K

    2012-01-01

    Single nucleotide polymorphisms (SNPs) are growing in popularity as a genetic marker for investigating evolutionary processes. A panel of SNPs is often developed by comparing large quantities of DNA sequence data across multiple individuals to identify polymorphic sites. For non-model species, this is particularly difficult, as performing the necessary large-scale genomic sequencing often exceeds the resources available for the project. In this study, we trial the Bovine SNP50 BeadChip developed in cattle (Bos taurus) for identifying polymorphic SNPs in cervids Odocoileus hemionus (mule deer and black-tailed deer) and O. virginianus (white-tailed deer) in the Pacific Northwest. We found that 38.7% of loci could be genotyped, of which 5% (n = 1068) were polymorphic. Of these 1068 polymorphic SNPs, a mixture of putatively neutral loci (n = 878) and loci under selection (n = 190) were identified with the F(ST)-outlier method. A range of population genetic analyses were implemented using these SNPs and a panel of 10 microsatellite loci. The three types of deer could readily be distinguished with both the SNP and microsatellite datasets. This study demonstrates that commercially developed SNP chips are a viable means of SNP discovery for non-model organisms, even when used between very distantly related species (the Bovidae and Cervidae families diverged some 25.1-30.1 million years before present). PMID:22590559

  8. Identification of Novel Single Nucleotide Polymorphisms (SNPs) in Deer (Odocoileus spp.) Using the BovineSNP50 BeadChip

    PubMed Central

    Haynes, Gwilym D.; Latch, Emily K.

    2012-01-01

    Single nucleotide polymorphisms (SNPs) are growing in popularity as a genetic marker for investigating evolutionary processes. A panel of SNPs is often developed by comparing large quantities of DNA sequence data across multiple individuals to identify polymorphic sites. For non-model species, this is particularly difficult, as performing the necessary large-scale genomic sequencing often exceeds the resources available for the project. In this study, we trial the Bovine SNP50 BeadChip developed in cattle (Bos taurus) for identifying polymorphic SNPs in cervids Odocoileus hemionus (mule deer and black-tailed deer) and O. virginianus (white-tailed deer) in the Pacific Northwest. We found that 38.7% of loci could be genotyped, of which 5% (n = 1068) were polymorphic. Of these 1068 polymorphic SNPs, a mixture of putatively neutral loci (n = 878) and loci under selection (n = 190) were identified with the FST-outlier method. A range of population genetic analyses were implemented using these SNPs and a panel of 10 microsatellite loci. The three types of deer could readily be distinguished with both the SNP and microsatellite datasets. This study demonstrates that commercially developed SNP chips are a viable means of SNP discovery for non-model organisms, even when used between very distantly related species (the Bovidae and Cervidae families diverged some 25.1-30.1 million years before present). PMID:22590559

  9. Analysis of copy loss and gain variations in Holstein cattle autosomes using BeadChip SNPs

    PubMed Central

    2010-01-01

    Background: Copy number variation (CNV) has been recently identified in human and other mammalian genomes, and there is a growing awareness of CNV's potential as a major source for heritable variation in complex traits. Genomic selection is a newly developed tool based on the estimation of breeding values for quantitative traits through the use of genome-wide genotyping of SNPs. Over 30,000 Holstein bulls have been genotyped with the Illumina BovineSNP50 BeadChip, which includes 54,001 SNPs (~SNP/50,000 bp), some of which fall within CNV regions. Results: We used the BeadChip data obtained for 912 Israeli bulls to investigate the effects of CNV on SNP calls. For each of the SNPs, we estimated the frequencies of occurrence of loss of heterozygosity (LOH) and of gain, based either on deviation from the expected Hardy-Weinberg equilibrium (HWE) or on signal intensity (SI) using the PennCNV "detect" option. Correlations between LOH/CNV frequencies predicted by the two methods were low (up to r = 0.08). Nevertheless, 418 locations displayed significantly high frequencies by both methods. Efficiency of designating large genomic clusters of olfactory receptors as CNVs was 29%. Frequency values for copy loss were distinguishable in non-autosomal regions, indicating misplacement of a region in the current BTA7 map. Analysis of BTA18 placed major quantitative trait loci affecting net merit in the US Holstein population in regions rich in segmental duplications and CNVs. Enrichment of transporters in CNV loci suggested their potential effect on milk-production traits. Conclusions: Expansion of HWE and PennCNV analyses allowed estimating LOH/CNV frequencies, and combining the two methods yielded more sensitive detection of inherited CNVs and better estimation of their possible effects on cattle genetics. Although this approach was more effective than methodologies previously applied in cattle, it has severe limitations. Thus the number of CNVs reported here for the Holstein breed may represent as little as one-tenth of inherited common structural variation. PMID:21114805
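
    The abstract above relies on deviation from Hardy-Weinberg equilibrium (HWE) at each SNP. As a rough illustration only, the sketch below runs a standard one-degree-of-freedom chi-square HWE test on genotype counts for a single biallelic SNP. It is not the authors' pipeline: the function name and the genotype counts are hypothetical, chosen only so that they sum to the 912 animals mentioned.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic SNPs).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts for one SNP in 912 animals (not data from the study).
chi2, p_value = hwe_chi_square(300, 312, 300)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3g}")
```

    A heterozygote deficit, as in these made-up counts, is the kind of signal that could be read as apparent loss of heterozygosity, although genotyping error and population structure can produce the same pattern.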

  10. Using multiple imputation and propensity scores to test the effect of car seats and seat belt usage on injury severity from trauma registry data

    Microsoft Academic Search

    John R. Hayes; Jonathan I. Groner

    2008-01-01

    Background: Missing data and the retrospective, nonrandomized nature of trauma registries can decrease the quality of registry-based research. Therefore, we used multiple imputation and propensity scores to test the effect of car seats and seat belt usage on injury severity in children involved in motor vehicle crashes.

  11. Type I Error Rates from Mixed Effects Model Repeated Measures versus Fixed Effects Anova with Missing Values Imputed via Last Observation Carried Forward

    Microsoft Academic Search

    Craig H. Mallinckrodt; W. Scott Clark; Stacy R. David

    2001-01-01

    Treatment effects are often evaluated by comparing change over time in outcome measures. However, valid analyses of longitudinal data can be problematic when subjects discontinue (dropout) prior to completing the trial. This study compared the Type I error rates from a likelihood-based repeated measures analysis (MMRM) to a fixed-effects analysis of variance where missing values were imputed using the last

  12. Genetic identities and local inbreeding in pure diploid clones with homoplasic markers: SNPs may be misleading.

    PubMed

    De Meeûs, Thierry

    2015-07-01

    Expected values for observed heterozygosity, genetic diversity, and inbreeding of individuals relative to inbreeding of the population (FIS) are derived in the case of one locus displaying homoplasy with K possible allelic states (KAM model) in a clonal diploid population. Heterozygosity (HO) and genetic diversity (HS) are substantially affected by homoplasy as long as the number of alleles K?10, while FIS remains weakly affected in any case. Simulations suggest that in big populations, or in case of maximum homoplasy (K=2), expected values can appear far from the observed ones because equilibrium takes too many generations to be reached at homoplasic markers in clonally propagating populations. This raises some concern on the use of SNPs, at least in clonal populations. PMID:25960105

  13. MICA SNPs and the NKG2D system in virus-induced HCC.

    PubMed

    Goto, Kaku; Kato, Naoya

    2015-03-01

    Hepatocellular carcinoma (HCC) is one of the most frequent causes of cancer-related death globally. Above well-known risk factors for HCC development ranging from various toxins to diseases such as diabetes mellitus, chronic infection with hepatitis B virus and hepatitis C virus (HCV) poses the most serious threat, constituting the cause in more than 80 % of cases. In addition to the viral genes intensively investigated, the pathophysiological importance of host genetic factors has also been greatly and increasingly appreciated. Genome-wide association studies (GWAS) comprehensively search the host genome at the single-nucleotide level, and have successfully identified the genomic region associated with a whole variety of diseases. With respect to HCC, there have been reports from several groups on single nucleotide polymorphisms (SNPs) associated with hepatocarcinogenesis, among which was our GWAS discovering MHC class I polypeptide-related sequence A (MICA) as a susceptibility gene for HCV-induced HCC. MICA is a natural killer (NK) group 2D (NKG2D) ligand, whose interaction with NKG2D triggers NK cell-mediated cytotoxicity toward the target cells, and is a key molecule in tumor immune surveillance as its expression is induced on stressed cells such as transformed tumor cells for the detection by NK cells. In this review, the latest understanding of the MICA-NKG2D system in viral HCC, particularly focused on its antitumor properties and the involvement of MICA SNPs, is summarized, followed by a discussion of targets for state-of-the-art cancer immunotherapy with personalized medicine in view. PMID:25270965

  14. Common SNPs explain some of the variation in the personality dimensions of neuroticism and extraversion

    PubMed Central

    Vinkhuyzen, A A E; Pedersen, N L; Yang, J; Lee, S H; Magnusson, P K E; Iacono, W G; McGue, M; Madden, P A F; Heath, A C; Luciano, M; Payton, A; Horan, M; Ollier, W; Pendleton, N; Deary, I J; Montgomery, G W; Martin, N G; Visscher, P M; Wray, N R

    2012-01-01

    The personality traits of neuroticism and extraversion are predictive of a number of social and behavioural outcomes and psychiatric disorders. Twin and family studies have reported moderate heritability estimates for both traits. Few associations have been reported between genetic variants and neuroticism/extraversion, but hardly any have been replicated. Moreover, the ones that have been replicated explain only a small proportion of the heritability (SNPs as 0.06 (s.e.=0.03) for neuroticism and 0.12 (s.e.=0.03) for extraversion. In an additional series of analyses in a family-based sample, we show that while for both traits ~45% of the phenotypic variance can be explained by pedigree data (that is, expected genetic similarity) one third of this can be explained by SNP data (that is, realized genetic similarity). A part of the so-called ‘missing heritability' has now been accounted for, but some of the reported heritability is still unexplained. Possible explanations for the remaining missing heritability are that: (i) rare variants that are not captured by common SNPs on current genotype platforms make a major contribution; and/ or (ii) the estimates of narrow sense heritability from twin and family studies are biased upwards, for example, by not properly accounting for nonadditive genetic factors and/or (common) environmental factors. PMID:22832902

  15. A genome-wide study of common SNPs and CNVs in cognitive performance in the CANTAB

    PubMed Central

    Need, Anna C.; Attix, Deborah K.; McEvoy, Jill M.; Cirulli, Elizabeth T.; Linney, Kristen L.; Hunt, Priscilla; Ge, Dongliang; Heinzen, Erin L.; Maia, Jessica M.; Shianna, Kevin V.; Weale, Michael E.; Cherkas, Lynn F.; Clement, Gail; Spector, Tim D.; Gibson, Greg; Goldstein, David B.

    2009-01-01

    Psychiatric disorders such as schizophrenia are commonly accompanied by cognitive impairments that are treatment resistant and crucial to functional outcome. There has been great interest in studying cognitive measures as endophenotypes for psychiatric disorders, with the hope that their genetic basis will be clearer. To investigate this, we performed a genome-wide association study involving 11 cognitive phenotypes from the Cambridge Neuropsychological Test Automated Battery. We showed these measures to be heritable by comparing the correlation in 100 monozygotic and 100 dizygotic twin pairs. The full battery was tested in ~750 subjects, and for spatial and verbal recognition memory, we investigated a further 500 individuals to search for smaller genetic effects. We were unable to find any genome-wide significant associations with either SNPs or common copy number variants. Nor could we formally replicate any polymorphism that has been previously associated with cognition, although we found a weak signal of lower than expected P-values for variants in a set of 10 candidate genes. We additionally investigated SNPs in genomic loci that have been shown to harbor rare variants that associate with neuropsychiatric disorders, to see if they showed any suggestion of association when considered as a separate set. Only NRXN1 showed evidence of significant association with cognition. These results suggest that common genetic variation does not strongly influence cognition in healthy subjects and that cognitive measures do not represent a more tractable genetic trait than clinical endpoints such as schizophrenia. We discuss a possible role for rare variation in cognitive genomics. PMID:19734545

  16. The evolutionary history of Afrocanarian blue tits inferred from genomewide SNPs.

    PubMed

    Gohli, Jostein; Leder, Erica H; Garcia-Del-Rey, Eduardo; Johannessen, Lars Erik; Johnsen, Arild; Laskemoen, Terje; Popp, Magnus; Lifjeld, Jan T

    2015-01-01

    A common challenge in phylogenetic reconstruction is to find enough suitable genomic markers to reliably trace splitting events with short internodes. Here, we present phylogenetic analyses based on genomewide single-nucleotide polymorphisms (SNPs) of an enigmatic avian radiation, the subspecies complex of Afrocanarian blue tits (Cyanistes teneriffae). The two sister species, the Eurasian blue tit (Cyanistes caeruleus) and the azure tit (Cyanistes cyanus), constituted the out-group. We generated a large data set of SNPs for analysis of population structure and phylogeny. We also adapted our protocol to utilize degraded DNA from old museum skins from Libya. We found strong population structuring that largely confirmed subspecies monophyly and constructed a coalescent-based phylogeny with full support at all major nodes. The results are consistent with a recent hypothesis that La Palma and Libya are relic populations of an ancient Afrocanarian blue tit, although a small data set for Libya could not resolve its position relative to La Palma. The birds on the eastern islands of Fuerteventura and Lanzarote are similar to those in Morocco. Together they constitute the sister group to the clade containing the other Canary Islands (except La Palma), in which El Hierro is sister to the three central islands. Hence, extant Canary Islands populations seem to originate from multiple independent colonization events. We also found population divergences in a key reproductive trait, viz. sperm length, which may constitute reproductive barriers between certain populations. We recommend a taxonomic revision of this polytypic species, where several subspecies should qualify for species rank. PMID:25407440

  17. Prediction of deleterious non-synonymous SNPs based on protein interaction network and hybrid properties.

    PubMed

    Huang, Tao; Wang, Ping; Ye, Zhi-Qiang; Xu, Heng; He, Zhisong; Feng, Kai-Yan; Hu, Lele; Cui, Weiren; Wang, Kai; Dong, Xiao; Xie, Lu; Kong, Xiangyin; Cai, Yu-Dong; Li, Yixue

    2010-01-01

    Non-synonymous SNPs (nsSNPs), also known as Single Amino acid Polymorphisms (SAPs) account for the majority of human inherited diseases. It is important to distinguish the deleterious SAPs from neutral ones. Most traditional computational methods to classify SAPs are based on sequential or structural features. However, these features cannot fully explain the association between a SAP and the observed pathophysiological phenotype. We believe the better rationale for deleterious SAP prediction should be: If a SAP lies in the protein with important functions and it can change the protein sequence and structure severely, it is more likely related to disease. So we established a method to predict deleterious SAPs based on both protein interaction network and traditional hybrid properties. Each SAP is represented by 472 features that include sequential features, structural features and network features. Maximum Relevance Minimum Redundancy (mRMR) method and Incremental Feature Selection (IFS) were applied to obtain the optimal feature set and the prediction model was Nearest Neighbor Algorithm (NNA). In jackknife cross-validation, 83.27% of SAPs were correctly predicted when the optimized 263 features were used. The optimized predictor with 263 features was also tested in an independent dataset and the accuracy was still 80.00%. In contrast, SIFT, a widely used predictor of deleterious SAPs based on sequential features, has a prediction accuracy of 71.05% on the same dataset. In our study, network features were found to be most important for accurate prediction and can significantly improve the prediction performance. Our results suggest that the protein interaction context could provide important clues to help better illustrate SAP's functional association. This research will facilitate the post genome-wide association studies. PMID:20689580

  18. Disrupted-in-Schizophrenia-1 SNPs and Susceptibility to Schizophrenia: Evidence from Malaysia

    PubMed Central

    Kartini, Abdullah; Norsidah, Kuzaifah; Ramli, Musa; Tariq, Abdul Razak; Wan Rohani, Wan Taib

    2015-01-01

    Objective: Even though the role of the DISC1 gene as a risk factor for schizophrenia is still unclear, there is substantial evidence from functional and cell biology studies that supports the connection of the gene with schizophrenia. The studies associating the DISC1 gene with schizophrenia in Asian populations are limited to East-Asian populations. Our study examined several DISC1 markers of schizophrenia that were identified in the Caucasian and East-Asian populations in Malaysia and assessed the role of rs2509382, which is located at 11q14.3, the mutual translocation region of the famous DISC1 translocation [t (1; 11) (p42.1; q14.3)]. Methods: We genotyped eleven single-nucleotide polymorphisms (SNPs) within or related to DISC1 (rs821597, rs821616, rs4658971, rs1538979, rs843979, rs2812385, rs1407599, rs4658890, and rs2509382) using the PCR-RFLP method. Results: In all, there were 575 participants (225 schizophrenic patients and 350 healthy controls) of either Malay or Chinese ethnicity. The case-control analyses found two SNPs that were associated with schizophrenia [rs4658971 (p=0.030; OR=1.43 (1.35-1.99)) and rs1538979 (p=0.036; OR=1.35 (1.02-1.80))], and rs2509382 was associated with susceptibility among male schizophrenics [p=0.0082; OR=2.16 (1.22-3.81)]. This is similar to the meta-analysis findings for the Caucasian populations. Conclusion: The study supports the notion that the DISC1 gene is a marker of schizophrenia susceptibility and that rs2509382 in the mutual DISC1 translocation region is a susceptibility marker for schizophrenia among males in Malaysia. However, the finding of the study is limited due to possible genetic stratification and the small sample size. PMID:25670952
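
    The odds ratios and intervals in this abstract follow the usual case-control form. As a hedged illustration, the sketch below computes an odds ratio with a Wald 95% confidence interval from a 2x2 table, and recovers the point estimate implied by a reported interval (the geometric mean of its bounds, because a Wald interval is symmetric on the log scale). The abstract does not state how its intervals were computed, so the Wald form is an assumption, and the 2x2 counts are hypothetical.

```python
import math


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald CI for a 2x2 table.

    a, b: carriers / non-carriers among cases
    c, d: carriers / non-carriers among controls
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def implied_odds_ratio(lower: float, upper: float) -> float:
    """Point estimate implied by a log-symmetric interval."""
    return math.sqrt(lower * upper)


# Hypothetical counts, not taken from the study.
or_, lo, hi = odds_ratio_wald_ci(20, 10, 10, 20)
print(f"OR = {or_:.2f} ({lo:.2f}-{hi:.2f})")
print(f"implied OR for 1.02-1.80: {implied_odds_ratio(1.02, 1.80):.2f}")
print(f"implied OR for 1.22-3.81: {implied_odds_ratio(1.22, 3.81):.2f}")
```

    Under that assumption, the intervals 1.02-1.80 and 1.22-3.81 imply estimates close to the quoted 1.35 and 2.16. The interval 1.35-1.99 implies roughly 1.64 rather than the quoted 1.43, which may point to a transcription error in one of its bounds.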

  19. HEPA filter dissolution process

    DOEpatents

    Brewer, K.N.; Murphy, J.A.

    1994-02-22

    A process is described for dissolution of spent high efficiency particulate air (HEPA) filters and then combining the complexed filter solution with other radioactive wastes prior to calcining the mixed and blended waste feed. The process is an alternate to a prior method of acid leaching the spent filters which is an inefficient method of treating spent HEPA filters for disposal. 4 figures.

  20. HEPA filter dissolution process

    SciTech Connect

    Brewer, K.N.; Murphy, J.A.

    1992-12-31

    This invention is comprised of a process for dissolution of spent high efficiency particulate air (HEPA) filters and then combining the complexed filter solution with other radioactive wastes prior to calcining the mixed and blended waste feed. The process is an alternate to a prior method of acid leaching the spent filters which is an inefficient method of treating spent HEPA filters for disposal.

  1. ELECTRET AIR FILTERS

    Microsoft Academic Search

    Rashmi Thakur; Dipayan Das; Apurba Das

    2012-01-01

    This review summarizes the research progress made so far on electret air filters used for separation of airborne particles from complex air stream. A set of different categories of these filters are delineated and the methods of manufacturing of these filters are described. The principles and mechanisms of filtration and modeling of pressure drop by these filters are analyzed. The

  2. Recirculating electric air filter

    DOEpatents

    Bergman, Werner (Pleasanton, CA)

    1986-01-01

    An electric air filter cartridge has a cylindrical inner high voltage electrode, a layer of filter material, and an outer ground electrode formed of a plurality of segments moveably connected together. The outer electrode can be easily opened to remove or insert filter material. Air flows through the two electrodes and the filter material and is exhausted from the center of the inner electrode.

  3. (Mobile K'' Filter program)

    SciTech Connect

    Not Available

    1991-05-17

    This report documents progress through May 16, 1990 in the marketing of the Mobile K' filter. This air filter traps fine particulates. A total number of 167 of the filter units have been sold. An effort to increase sales by lowering the cost of the units by delivering the filters unassembled is under way. (GHH)

  4. Hepa filter dissolution process

    DOEpatents

    Brewer, Ken N. (Arco, ID); Murphy, James A. (Idaho Falls, ID)

    1994-01-01

    A process for dissolution of spent high efficiency particulate air (HEPA) filters and then combining the complexed filter solution with other radioactive wastes prior to calcining the mixed and blended waste feed. The process is an alternate to a prior method of acid leaching the spent filters which is an inefficient method of treating spent HEPA filters for disposal.

  5. Collaborative Filtering Recommender Systems

    Microsoft Academic Search

    J. Ben Schafer; Dan Frankowski; Jonathan L. Herlocker; Shilad Sen

    2007-01-01

    One of the potent personalization technologies powering the adaptive web is collaborative filtering. Collaborative filtering (CF) is the process of filtering or evaluating items through the opinions of other people. CF technology brings together the opinions of large interconnected communities on the web, supporting filtering of substantial quantities of data. In this chapter we introduce the core

  6. Properties of multilayer filters

    NASA Technical Reports Server (NTRS)

    Baumeister, P. W.

    1973-01-01

    New methods were investigated of using optical interference coatings to produce bandpass filters for the spectral region 110 nm to 200 nm. The types of filter are: triple cavity metal dielectric filters; all dielectric reflection filters; and all dielectric Fabry Perot type filters. The latter two types use thorium fluoride and either cryolite films or magnesium fluoride films in the stacks. The optical properties of the thorium fluoride were also measured.

  7. Canonical Single Nucleotide Polymorphisms (SNPs) for High-Resolution Subtyping of Shiga-Toxin Producing Escherichia coli (STEC) O157:H7

    PubMed Central

    Griffing, Sean M.; MacCannell, Duncan R.; Schmidtke, Amber J.; Freeman, Molly M.; Hyytiä-Trees, Eija; Gerner-Smidt, Peter; Ribot, Efrain M.; Bono, James L.

    2015-01-01

    The objective of this study was to develop a canonical, parsimoniously-informative SNP panel for subtyping Shiga-toxin producing Escherichia coli (STEC) O157:H7 that would be consistent with epidemiological, PFGE, and MLVA clustering of human specimens. Our group had previously identified 906 putative discriminatory SNPs, which were pared down to 391 SNPs based on their prevalence in a test set. The 391 SNPs were screened using a high-throughput form of TaqMan PCR against a set of clinical isolates that represent the most diverse collection of O157:H7 isolates from outbreaks and sporadic cases examined to date. Another 30 SNPs identified by others were also screened using the same method. Two additional targets were tested using standard TaqMan PCR endpoint analysis. These 423 SNPs were reduced to a 32 SNP panel with almost the same discriminatory value. While the panel partitioned our diverse set of isolates in a manner that was consistent with epidemiological data and PFGE and MLVA phylogenies, it resulted in fewer subtypes than either existing method and insufficient epidemiological resolution in 10 of 47 clusters. Therefore, another round of SNP discovery was undertaken using comparative genomic resequencing of pooled DNA from the 10 clusters with insufficient resolution. This process identified 4,040 potential SNPs and suggested one of the ten clusters was incorrectly grouped. After its removal, there were 2,878 SNPs, of which only 63 were previously identified and 438 occurred across multiple clusters. Among highly clonal bacteria like STEC O157:H7, linkage disequilibrium greatly limits the number of parsimoniously informative SNPs. Therefore, it is perhaps unsurprising that our panel accounted for the potential discriminatory value of numerous other SNPs reported in the literature. We concluded published O157:H7 SNPs are insufficient for effective epidemiological subtyping. However, the 438 multi-cluster SNPs we identified may provide the additional information required. PMID:26132731

  8. A Genomewide Comparison of Population Structure at STRPs and Nearby SNPs Bret A. Payseur and Peicheng Jing

    E-print Network

    Payseur, Bret

    population structure. Humans provide a useful case study because worldwide patterns of population and the increasing popularity of genomewide association studies have stimulated the measurement of population A Genomewide Comparison of Population Structure at STRPs and Nearby SNPs in Humans Bret A. Payseur

  9. Analysis of artificially degraded DNA using STRs and SNPs—results of a collaborative European (EDNAP) exercise

    Microsoft Academic Search

    L. A. Dixon; A. E. Dobbins; H. K. Pulker; J. M. Butler; P. M. Vallone; M. D. Coble; W. Parson; B. Berger; P. Grubwieser; H. S. Mogensen; N. Morling; K. Nielsen; J. J. Sanchez; E. Petkovski; A. Carracedo; P. Sanchez-Diz; E. Ramos-Luis; M. Brión; J. A. Irwin; R. S. Just; O. Loreille; T. J. Parsons; D. Syndercombe-Court; H. Schmitter; B. Stradmann-Bellinghausen; K. Bender; P. Gill

    2006-01-01

    Recently, there has been much debate about what kinds of genetic markers should be implemented as new core loci that constitute national DNA databases. The choices lie between conventional STRs, ranging in size from 100 to 450bp; mini-STRs, with amplicon sizes less than 200bp; and single nucleotide polymorphisms (SNPs). There is general agreement by the European DNA Profiling Group (EDNAP)

  10. IL-18R1 and IL-18RAP SNPs may associate with Bronchopulmonary Dysplasia in African American infants

    PubMed Central

    Floros, Joanna; Londono, Douglas; Gordon, Derek; Silveyra, Patricia; Diangelo, Susan L; Viscardi, Rose M; Worthen, George S; Shenberger, Jeffrey; Wang, Guirong; Lin, Zhenwu; Thomas, Neal J

    2013-01-01

    The genetic contribution to the development of bronchopulmonary dysplasia (BPD) in prematurely born infants is substantial, but information related to the specific genes involved is lacking. We conducted a case-control single nucleotide polymorphism (SNP) association study of candidate genes (n=601) or 6,324 SNPs in 1,091 prematurely born infants with gestational age <35 weeks, with or without neonatal lung disease including BPD. BPD was defined as need for oxygen at 28 days. Genotype analysis revealed, after multiple comparisons correction, two significant SNPs, rs3771150 (IL-18RAP) and rs3771171 (IL-18R1), in African Americans (AA) with BPD (vs. AA without BPD; q<0.05). No associations with Caucasian (CA) BPD, AA or CA RDS, or prematurity in either AA or CA, were identified with these SNPs. Respective frequencies were 0.098 and 0.093 without BPD and 0.38 for each SNP in infants with BPD. In the replication set (82 cases; 102 controls), the p-values were 0.012 for rs3771150 and 0.07 for rs3771171. Combining p-values using Fisher's method, overall p-values were 8.31E-07 for rs3771150, and 6.33E-06 for rs3771171. We conclude that IL-18RAP and IL-18R1 SNPs identify AA infants at risk for BPD. These genes may contribute to AA BPD pathogenesis via inflammatory-mediated processes and require further study. PMID:22289858
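
    Fisher's method, which this abstract uses to combine discovery and replication p-values, sums the logs of k independent p-values into a statistic that follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The sketch below is a minimal standard-library version. The function name is illustrative, and the discovery p-value in the example is hypothetical because the abstract does not report it.

```python
import math
from typing import Sequence


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Combine independent p-values with Fisher's method."""
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p_i); we keep X/2 for convenience.
    half_x = -sum(math.log(p) for p in p_values)
    # Closed-form chi-square survival function for even degrees of freedom 2k:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# 0.012 is the replication p-value quoted above; 1e-4 is a hypothetical
# discovery p-value used only to make the example runnable.
combined = fisher_combined_p([1e-4, 0.012])
print(f"combined p = {combined:.3g}")
```

    For two p-values the result reduces to p1*p2*(1 - ln(p1*p2)), which makes small cases easy to check by hand.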

  11. Japan PGx Data Science Consortium Database: SNPs and HLA genotype data from 2994 Japanese healthy individuals for pharmacogenomics studies.

    PubMed

    Kamitsuji, Shigeo; Matsuda, Takashi; Nishimura, Koichi; Endo, Seiko; Wada, Chisa; Watanabe, Kenji; Hasegawa, Koichi; Hishigaki, Haretsugu; Masuda, Masatoshi; Kuwahara, Yusuke; Tsuritani, Katsuki; Sugiura, Kenkichi; Kubota, Tomoko; Miyoshi, Shinji; Okada, Kinya; Nakazono, Kazuyuki; Sugaya, Yuki; Yang, Woosung; Sawamoto, Taiji; Uchida, Wataru; Shinagawa, Akira; Fujiwara, Tsutomu; Yamada, Hisaharu; Suematsu, Koji; Tsutsui, Naohisa; Kamatani, Naoyuki; Liou, Shyh-Yuh

    2015-06-01

    Japan Pharmacogenomics Data Science Consortium (JPDSC) has assembled a database for conducting pharmacogenomics (PGx) studies in Japanese subjects. The database contains the genotypes of 2.5 million single-nucleotide polymorphisms (SNPs) and 5 human leukocyte antigen loci from 2994 Japanese healthy volunteers, as well as 121 kinds of clinical information, including self-reports, physiological data, hematological data and biochemical data. In this article, the reliability of our data was evaluated by principal component analysis (PCA) and association analysis for hematological and biochemical traits by using genome-wide SNP data. PCA of the SNPs showed that all the samples were collected from the Japanese population and that the samples were separated into two major clusters by birthplace, Okinawa and other than Okinawa, as had been previously reported. Among 87 SNPs that have been reported to be associated with 18 hematological and biochemical traits in genome-wide association studies (GWAS), the associations of 56 SNPs were replicated using our database. Statistical power simulations showed that the sample size of the JPDSC control database is large enough to detect genetic markers having a relatively strong association even when the case sample size is small. The JPDSC database will be useful as control data for conducting PGx studies to explore genetic markers to improve the safety and efficacy of drugs either during clinical development or in post-marketing. PMID:25855068

  12. Role of DISC1 interacting proteins in schizophrenia risk from genome-wide analysis of missense SNPs.

    PubMed

    Costas, Javier; Suárez-Rama, Jose Javier; Carrera, Noa; Paz, Eduardo; Páramo, Mario; Agra, Santiago; Brenlla, Julio; Ramos-Ríos, Ramón; Arrojo, Manuel

    2013-11-01

    A balanced translocation affecting DISC1 cosegregates with several psychiatric disorders, including schizophrenia, in a Scottish family. DISC1 is a hub protein of a network of protein-protein interactions involved in multiple developmental pathways within the brain. Gene set-based analysis has been proposed as an alternative to individual analysis of single nucleotide polymorphisms (SNPs) to get information from genome-wide association studies. In this work, we tested for an overrepresentation of the DISC1 interacting proteins within the top results of our ranked list of genes based on our previous genome-wide association study of missense SNPs in schizophrenia. Our data set consisted of 5100 common missense SNPs genotyped in 476 schizophrenic patients and 447 control subjects from Galicia, NW Spain. We used a modification of the Gene Set Enrichment Analysis adapted for SNPs, as implemented in the GenGen software. The analysis detected an overrepresentation of the DISC1 interacting proteins (permuted P-value=0.0158), indicative of the role of this gene set in schizophrenia risk. We identified seven leading-edge genes, MACF1, UTRN, DST, DISC1, KIF3A, SYNE1, and AKAP9, responsible for the overrepresentation. These genes are involved in neuronal cytoskeleton organization and intracellular transport through the microtubule cytoskeleton, suggesting that these processes may be impaired in schizophrenia. PMID:23909765

  13. Detection of microRNA SNPs with ultrahigh specificity by using reduced graphene oxide-assisted rolling circle amplification.

    PubMed

    Zhu, Xiaoli; Shen, Yalan; Cao, Jiepei; Yin, Li; Ban, Fangfang; Shu, Yongqian; Li, Genxi

    2015-06-01

    Here we report a reduced graphene oxide-assisted rolling circle amplification for the detection of miRNA SNPs. The difference of the signal of a miRNA SNP reaches 100 fold, a value over 10 times larger than some current methodologies, which allows the discrimination of a SNP even with the naked eye. PMID:26000768

  14. In silico screening, genotyping, molecular dynamics simulation and activity studies of SNPs in pyruvate kinase M2.

    PubMed

    Kalaiarasan, Ponnusamy; Kumar, Bhupender; Chopra, Rupali; Gupta, Vibhor; Subbarao, Naidu; Bamezai, Rameshwar N K

    2015-01-01

    The roles of 29 non-synonymous, 15 intronic, and 3 near-UTR single nucleotide polymorphisms (SNPs) and 2 mutations of human pyruvate kinase (PK) M2 were investigated by in silico and in vitro functional studies. Prediction of deleterious substitutions with sequence homology and structure based servers (SIFT, PANTHER, SNPs&GO, PhD-SNP, SNAP and PolyPhen) showed that 19% were common to all of these programs. SNPeffect and HOPE showed three substitutions (C31F, Q310P and S437Y) in silico as deleterious and functionally important. In vitro activity assays showed that the C31F and S437Y variants of PKM2 had reduced activity, while the Q310P variant was catalytically inactive. The allosteric activation due to binding of fructose 1-6 bisphosphate (FBP) was compromised in the S437Y nsSNP variant protein. This was corroborated through a molecular dynamics (MD) simulation study, which was also carried out for the other two variant proteins. The 5 intronic SNPs of PKM2 associated with sporadic breast cancer in a case-control study, when subjected to different computational analyses, indicated that 3 SNPs (rs2856929, rs8192381 and rs8192431) could generate an alternative transcript by influencing splicing factor binding to PKM2. We propose that these potentially functional and important variations, both within exons and introns, could have a bearing on cancer metabolism, since PKM2 has been implicated in cancer in the recent past. PMID:25768091

  15. Association of SNPs in GHSR rs292216 and rs509035 on dietary intake in Indonesian obese female adolescents

    PubMed Central

    Luglio, Harry Freitag; Inggriyani, Cut Gina; Huriyati, Emy; Julia, Madarina; Susilowati, Rina

    2014-01-01

    Background: Obesity has been linked to high dietary intake and low physical activity. Studies have shown that these factors are regulated not only by the environment but also by genetics. However, the relationship is less well understood in obese children and adolescents. Objective: The objective of this study was to examine the role of SNPs in GHSR rs292216 and rs509035 on dietary intake in obese female adolescents. Methods: This is an observational study with a cross-sectional design. Respondents were obese female adolescents enrolled from obesity screening done in six junior high schools in Yogyakarta. Dietary intake was measured using 24-hour dietary recalls on 6 non-consecutive days. Genotyping of 2 SNPs from GHSR was done using RFLP-PCR. Results: Seventy-eight obese female adolescents joined this study. We found no significant association between the GHSR SNPs and dietary intake (p < 0.05). In addition, a SNP-SNP interaction analysis showed no difference between combinations of GHSR rs292216 and rs509035 on dietary intake (p < 0.05). Conclusion: We concluded that SNPs on GHSR rs292216 and rs509035 were not related to dietary intake in Indonesian obese female adolescents. Further study is necessary to investigate the effect of those genes on dietary intake in the broader population. PMID:25755847

  16. Discovery and Fine-Mapping of Glycaemic and Obesity-Related Trait Loci Using High-Density Imputation

    PubMed Central

    van de Bunt, Martijn; Surakka, Ida; Sarin, Antti-Pekka; Mahajan, Anubha; Marullo, Letizia; Thorleifsson, Gudmar; Hägg, Sara; Hottenga, Jouke-Jan; Ladenvall, Claes; Ried, Janina S.; Winkler, Thomas W.; Willems, Sara M.; Pervjakova, Natalia; Esko, Tõnu; Beekman, Marian; Nelson, Christopher P.; Willenborg, Christina; Ferreira, Teresa; Fernandez, Juan; Gaulton, Kyle J.; Steinthorsdottir, Valgerdur; Hamsten, Anders; Magnusson, Patrik K. E.; Willemsen, Gonneke; Milaneschi, Yuri; Robertson, Neil R.; Groves, Christopher J.; Bennett, Amanda J.; Lehtimäki, Terho; Viikari, Jorma S.; Rung, Johan; Lyssenko, Valeriya; Perola, Markus; Heid, Iris M.; Herder, Christian; Grallert, Harald; Müller-Nurasyid, Martina; Roden, Michael; Hypponen, Elina; Isaacs, Aaron; van Leeuwen, Elisabeth M.; Karssen, Lennart C.; Mihailov, Evelin; Houwing-Duistermaat, Jeanine J.; de Craen, Anton J. M.; Deelen, Joris; Havulinna, Aki S.; Blades, Matthew; Hengstenberg, Christian; Erdmann, Jeanette; Schunkert, Heribert; Kaprio, Jaakko; Tobin, Martin D.; Samani, Nilesh J.; Lind, Lars; Salomaa, Veikko; Lindgren, Cecilia M.; Slagboom, P. Eline; Metspalu, Andres; van Duijn, Cornelia M.; Eriksson, Johan G.; Peters, Annette; Gieger, Christian; Jula, Antti; Groop, Leif; Raitakari, Olli T.; Power, Chris; Penninx, Brenda W. J. H.; de Geus, Eco; Smit, Johannes H.; Boomsma, Dorret I.; Pedersen, Nancy L.; Ingelsson, Erik; Thorsteinsdottir, Unnur; Stefansson, Kari; Ripatti, Samuli; Prokopenko, Inga; McCarthy, Mark I.; Morris, Andrew P.

    2015-01-01

    Reference panels from the 1000 Genomes (1000G) Project Consortium provide near complete coverage of common and low-frequency genetic variation with minor allele frequency ≥0.5% across European ancestry populations. Within the European Network for Genetic and Genomic Epidemiology (ENGAGE) Consortium, we have undertaken the first large-scale meta-analysis of genome-wide association studies (GWAS), supplemented by 1000G imputation, for four quantitative glycaemic and obesity-related traits, in up to 87,048 individuals of European ancestry. We identified two loci for body mass index (BMI) at genome-wide significance, and two for fasting glucose (FG), none of which has been previously reported in larger meta-analysis efforts to combine GWAS of European ancestry. Through conditional analysis, we also detected multiple distinct signals of association mapping to established loci for waist-hip ratio adjusted for BMI (RSPO3) and FG (GCK and G6PC2). The index variant for one association signal at the G6PC2 locus is a low-frequency coding allele, H177Y, which has recently been demonstrated to have a functional role in glucose regulation. Fine-mapping analyses revealed that the non-coding variants most likely to drive association signals at established and novel loci were enriched for overlap with enhancer elements, which for FG mapped to promoter and transcription factor binding sites in pancreatic islets, in particular. Our study demonstrates that 1000G imputation and genetic fine-mapping of common and low-frequency variant association signals at GWAS loci, integrated with genomic annotation in relevant tissues, can provide insight into the functional and regulatory mechanisms through which their effects on glycaemic and obesity-related traits are mediated. PMID:26132169

  17. Discovery and Fine-Mapping of Glycaemic and Obesity-Related Trait Loci Using High-Density Imputation.

    PubMed

    Horikoshi, Momoko; Mägi, Reedik; van de Bunt, Martijn; Surakka, Ida; Sarin, Antti-Pekka; Mahajan, Anubha; Marullo, Letizia; Thorleifsson, Gudmar; Hägg, Sara; Hottenga, Jouke-Jan; Ladenvall, Claes; Ried, Janina S; Winkler, Thomas W; Willems, Sara M; Pervjakova, Natalia; Esko, Tõnu; Beekman, Marian; Nelson, Christopher P; Willenborg, Christina; Wiltshire, Steven; Ferreira, Teresa; Fernandez, Juan; Gaulton, Kyle J; Steinthorsdottir, Valgerdur; Hamsten, Anders; Magnusson, Patrik K E; Willemsen, Gonneke; Milaneschi, Yuri; Robertson, Neil R; Groves, Christopher J; Bennett, Amanda J; Lehtimäki, Terho; Viikari, Jorma S; Rung, Johan; Lyssenko, Valeriya; Perola, Markus; Heid, Iris M; Herder, Christian; Grallert, Harald; Müller-Nurasyid, Martina; Roden, Michael; Hypponen, Elina; Isaacs, Aaron; van Leeuwen, Elisabeth M; Karssen, Lennart C; Mihailov, Evelin; Houwing-Duistermaat, Jeanine J; de Craen, Anton J M; Deelen, Joris; Havulinna, Aki S; Blades, Matthew; Hengstenberg, Christian; Erdmann, Jeanette; Schunkert, Heribert; Kaprio, Jaakko; Tobin, Martin D; Samani, Nilesh J; Lind, Lars; Salomaa, Veikko; Lindgren, Cecilia M; Slagboom, P Eline; Metspalu, Andres; van Duijn, Cornelia M; Eriksson, Johan G; Peters, Annette; Gieger, Christian; Jula, Antti; Groop, Leif; Raitakari, Olli T; Power, Chris; Penninx, Brenda W J H; de Geus, Eco; Smit, Johannes H; Boomsma, Dorret I; Pedersen, Nancy L; Ingelsson, Erik; Thorsteinsdottir, Unnur; Stefansson, Kari; Ripatti, Samuli; Prokopenko, Inga; McCarthy, Mark I; Morris, Andrew P

    2015-07-01

    Reference panels from the 1000 Genomes (1000G) Project Consortium provide near complete coverage of common and low-frequency genetic variation with minor allele frequency ≥0.5% across European ancestry populations. Within the European Network for Genetic and Genomic Epidemiology (ENGAGE) Consortium, we have undertaken the first large-scale meta-analysis of genome-wide association studies (GWAS), supplemented by 1000G imputation, for four quantitative glycaemic and obesity-related traits, in up to 87,048 individuals of European ancestry. We identified two loci for body mass index (BMI) at genome-wide significance, and two for fasting glucose (FG), none of which has been previously reported in larger meta-analysis efforts to combine GWAS of European ancestry. Through conditional analysis, we also detected multiple distinct signals of association mapping to established loci for waist-hip ratio adjusted for BMI (RSPO3) and FG (GCK and G6PC2). The index variant for one association signal at the G6PC2 locus is a low-frequency coding allele, H177Y, which has recently been demonstrated to have a functional role in glucose regulation. Fine-mapping analyses revealed that the non-coding variants most likely to drive association signals at established and novel loci were enriched for overlap with enhancer elements, which for FG mapped to promoter and transcription factor binding sites in pancreatic islets, in particular. Our study demonstrates that 1000G imputation and genetic fine-mapping of common and low-frequency variant association signals at GWAS loci, integrated with genomic annotation in relevant tissues, can provide insight into the functional and regulatory mechanisms through which their effects on glycaemic and obesity-related traits are mediated. PMID:26132169

  18. Role of plasma matrix-metalloproteases (MMPs) and their polymorphisms (SNPs) in sepsis development and outcome in ICU patients

    PubMed Central

    Martin, Guadalupe; Asensi, Víctor; Montes, A. Hugo; Collazos, Julio; Alvarez, Victoria; Carton, José A.; Taboada, Francisco; Valle-Garay, Eulalia

    2014-01-01

    Matrix-metalloproteases (MMPs) and their tissue-inhibitors (TIMPs), modulated by different single nucleotide polymorphisms (SNPs), are critical in sepsis development. Ninety severely septic ICU patients and 91 uninfected ICU patients were prospectively studied. MMP-1 (−1607 1G/2G), MMP-3 (−1612 5A/6A), MMP-8 (−799 C/T), MMP-9 (−1562 C/T), and MMP-13 (−77 A/G) SNPs were genotyped. Plasma MMPs (-1, -2, -3, -8, -9, -10, -13) and TIMPs (-1, -2, -4) were measured. The frequencies of AA homozygotes and A allele carriers of the MMP-13 (−77 A/G) SNP and of 1G2G carriers of the MMP-1 (−1607 1G/2G) SNP differed between septic and uninfected patients (p < 0.05), as did plasma MMP-3, -8, -9, -10 and TIMP-2 levels (p < 0.04). No differences in MMP levels among carriers of the MMP-13 or MMP-1 SNP genotypes were observed. The area under the ROC curve for MMP-8 in the diagnosis of sepsis was 0.87 (95% CI 0.82–0.92), and that of CRP was 0.98 (0.94–0.998), whereas the area for MMP-9 in the detection of the non-septic state was 0.73 (0.65–0.80), p < 0.0001 for all curves. Sepsis was associated with increased MMP-8 and decreased MMP-9 levels in multivariate analysis (p < 0.0002). We report for the first time an association between MMP-13 and MMP-1 SNPs and sepsis. An independent association of MMP-8 and MMP-9 levels with sepsis was also observed. PMID:24833564

  19. SCREENING LOW FREQUENCY SNPS FROM GENOME WIDE ASSOCIATION STUDY REVEALS A NEW RISK ALLELE FOR PROGRESSION TO AIDS

    PubMed Central

    Le Clerc, Sigrid; Coulonges, Cédric; Delaneau, Olivier; Van Manen, Danielle; Herbeck, Joshua T.; Limou, Sophie; An, Ping; Martinson, Jeremy J.; Spadoni, Jean-Louis; Therwath, Amu; Veldink, Jan H.; van den Berg, Leonard H.; Taing, Lieng; Labib, Taoufik; Mellak, Safa; Montes, Matthieu; Delfraissy, Jean-François; Schächter, François; Winkler, Cheryl; Froguel, Philippe; Mullins, James I.; Schuitemaker, Hanneke; Zagury, Jean-François

    2011-01-01

    Background Seven genome-wide association studies (GWAS) have been published in AIDS and only associations in the HLA region on chromosome 6 and CXCR6 have passed genome-wide significance. Methods We reanalyzed the data from three previously published GWAS, targeting specifically low frequency SNPs (minor allele frequency (MAF)<5%). Two groups composed of 365 slow progressors (SP) and 147 rapid progressors (RP) from Europe and the US were compared with a control group of 1394 seronegative individuals using Eigenstrat corrections. Results Of the 8584 SNPs with MAF<5% in cases and controls (Bonferroni threshold=5.8×10(-6)), four SNPs showed statistical evidence of association with the SP phenotype. The best result was for HCP5 rs2395029 (p=8.54×10(-15), OR=3.41) in the HLA locus, in partial linkage disequilibrium with two additional chromosome 6 associations in C6orf48 (p=3.03×10(-10), OR=2.9) and NOTCH4 (p=9.08×10(-7), OR=2.32). The fourth association corresponded to rs2072255 located in RICH2 (p=3.30×10(-6), OR=0.43) in chromosome 17. Using HCP5 rs2395029 as a covariate, the C6orf48 and NOTCH4 signals disappeared, but the RICH2 signal still remained significant. Conclusion Besides the already known chromosome 6 associations, the analysis of low frequency SNPs brought up a new association in the RICH2 gene. Interestingly, RICH2 interacts with BST-2 known to be a major restriction factor for HIV-1 infection. Our study has thus identified a new candidate gene for AIDS molecular etiology and confirms the interest of singling out low frequency SNPs in order to exploit GWAS data. PMID:21107268
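
    The Bonferroni threshold quoted in this record can be reproduced with simple arithmetic. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state but which matches the reported value when divided by the 8584 SNPs tested; the function and variable names are illustrative only.

```python
# Sketch: reproduce the Bonferroni threshold and compare the four reported
# p-values against it. ALPHA is an assumption; N_TESTS and the p-values are
# taken from the abstract.
ALPHA = 0.05      # assumed family-wise error rate (not stated in the abstract)
N_TESTS = 8584    # number of low-frequency SNPs tested


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


reported_p = {
    "HCP5 rs2395029": 8.54e-15,
    "C6orf48": 3.03e-10,
    "NOTCH4": 9.08e-07,
    "RICH2 rs2072255": 3.30e-06,
}

threshold = bonferroni_threshold(ALPHA, N_TESTS)
# Keep only the associations whose p-value falls below the corrected threshold.
passing = [name for name, p in reported_p.items() if p < threshold]

print(f"Bonferroni threshold: {threshold:.2e}")
print("Below threshold:", passing)
```

    Under this assumption all four reported associations fall below the corrected threshold, consistent with the abstract.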

  20. Rigid porous filter

    DOEpatents

    Chiang, Ta-Kuan (Morgantown, WV); Straub, Douglas L. (Morgantown, WV); Dennis, Richard A. (Morgantown, WV)

    2000-01-01

    The present invention involves a porous rigid filter including a plurality of concentric filtration elements having internal flow passages and forming external flow passages there between. The present invention also involves a pressure vessel containing the filter for the removal of particulates from high pressure particulate containing gases, and further involves a method for using the filter to remove such particulates. The present filter has the advantage of requiring fewer filter elements due to the high surface area-to-volume ratio provided by the filter, requires a reduced pressure vessel size, and exhibits enhanced mechanical design properties, improved cleaning properties, configuration options, modularity and ease of fabrication.

  1. CDH13 promoter SNPs with pleiotropic effect on cardiometabolic parameters represent methylation QTLs.

    PubMed

    Putku, Margus; Kals, Mart; Inno, Rain; Kasela, Silva; Org, Elin; Kožich, Viktor; Milani, Lili; Laan, Maris

    2015-03-01

    CDH13 encodes T-cadherin, a receptor for high molecular weight (HMW) adiponectin and low-density lipoprotein, promoting proliferation and migration of endothelial cells. Genome-wide association studies have mapped multiple variants in CDH13 associated with cardiometabolic traits (CMT) with variable effects across studies. We hypothesized that this heterogeneity might reflect interplay with DNA methylation within the region. Resequencing and EpiTYPER™ assay were applied for the HYPertension in ESTonia/Coronary Artery Disease in Czech (HYPEST/CADCZ; n = 358) samples to identify CDH13 promoter SNPs acting as methylation Quantitative Trait Loci (meQTLs) and to investigate their associations with CMT. In silico data were extracted from genome-wide DNA methylation and genotype datasets of the population-based sample Estonian Genome Center of the University of Tartu (EGCUT; n = 165). HYPEST-CADCZ meta-analysis identified a rare variant rs113460564 as highly significant meQTL for a 134-bp distant CpG site (P = 5.90 × 10(-6); β = 3.19%). Four common SNPs (rs12443878, rs12444338, rs62040565, rs8060301) exhibited effect on methylation level of up to 3 neighboring CpG sites in both datasets. The strongest association was detected in EGCUT between rs8060301 and cg09415485 (false discovery rate corrected P value = 1.89 × 10(-30)). Simultaneously, rs8060301 showed association with diastolic blood pressure, serum high-density lipoprotein and HMW adiponectin (P < 0.005). Novel strong associations were identified between rare CDH13 promoter meQTLs (minor allele frequency <5%) and HMW adiponectin: rs2239857 (P = 5.50 × 10(-5), β = -1,841.9 ng/mL) and rs77068073 (P = 2.67 × 10(-4), β = -2,484.4 ng/mL). Our study shows conclusively that CDH13 promoter harbors meQTLs associated with CMTs. It paves the way to deeper understanding of the interplay between DNA variation and methylation in susceptibility to common diseases. PMID:25543204

  2. Transcriptome characterization and high throughput SSRs and SNPs discovery in Cucurbita pepo (Cucurbitaceae)

    PubMed Central

    2011-01-01

    Background Cucurbita pepo belongs to the Cucurbitaceae family. The "Zucchini" types rank among the highest-valued vegetables worldwide, and other C. pepo and related Cucurbita spp. are food staples and rich sources of fat and vitamins. A broad range of genomic tools are today available for other cucurbits that have become models for the study of different metabolic processes. However, these tools are still lacking in the Cucurbita genus, thus limiting gene discovery and the process of breeding. Results We report the generation of a total of 512,751 C. pepo EST sequences, using 454 GS FLX Titanium technology. ESTs were obtained from normalized cDNA libraries (root, leaves, and flower tissue) prepared using two varieties with contrasting phenotypes for plant, flowering and fruit traits, representing the two C. pepo subspecies: subsp. pepo cv. Zucchini and subsp. ovifera cv. Scallop. De novo assembly was performed to generate a collection of 49,610 Cucurbita unigenes (average length of 626 bp) that represent the first transcriptome of the species. Over 60% of the unigenes were functionally annotated and assigned to one or more Gene Ontology terms. The distributions of Cucurbita unigenes followed tendencies similar to those reported for Arabidopsis or melon, suggesting that the dataset may represent the whole Cucurbita transcriptome. About 34% of the unigenes were found to have known orthologs in Arabidopsis or melon, including genes potentially involved in disease resistance, flowering and fruit quality. Furthermore, a set of 1,882 unigenes with SSR motifs and 9,043 high confidence SNPs between Zucchini and Scallop were identified, of which 3,538 SNPs met criteria for use with high throughput genotyping platforms, and 144 could be detected as CAPS. A set of markers was validated, 80% of which were polymorphic in a set of variable C. pepo and C. moschata accessions. Conclusion We present the first broad survey of gene sequences and allelic variation in C. pepo, where limited prior genomic information existed. The transcriptome provides an invaluable new tool for biological research. The developed molecular markers are the basis for future genetic linkage and quantitative trait loci analysis, and will be essential to speed up the process of breeding new and better adapted squash varieties. PMID:21310031

  3. Cordierite silicon nitride filters

    SciTech Connect

    Sawyer, J.; Buchan, B. (Acurex Environmental Corp., Mountain View, CA (United States)); Duiven, R.; Berger, M. (Aerotherm Corp., Mountain View, CA (United States)); Cleveland, J.; Ferri, J. (GTE Products Corp., Towanda, PA (United States))

    1992-02-01

    The objective of this project was to develop a silicon nitride based crossflow filter. This report summarizes the findings and results of the project. The project was phased, with Phase I consisting of filter material development and crossflow filter design. Phase II involved filter manufacturing, filter testing under simulated conditions and reporting the results. In Phase I, Cordierite Silicon Nitride (CSN) was developed and tested for permeability and strength. Target values for each of these parameters were established early in the program. The values were met by the material development effort in Phase I. The crossflow filter design effort proceeded by developing a macroscopic design based on required surface area and estimated stresses. Then the thermal and pressure stresses were estimated using finite element analysis. In Phase II of this program, the filter manufacturing technique was developed, and the manufactured filters were tested. The technique developed involved press-bonding extruded tiles to form a filter, producing a monolithic filter after sintering. Filters manufactured using this technique were tested at Acurex and at the Westinghouse Science and Technology Center. The filters did not delaminate during testing and operated at high collection efficiency with good cleanability. Further development in areas of sintering and filter design is recommended.

  4. Filter type gas sampler with filter consolidation

    DOEpatents

    Miley, Harry S. (219 Rockwood Dr., Richland, WA 99352); Thompson, Robert C. (5313 Phoebe La., West Richland, WA 99352); Hubbard, Charles W. (1900 Stevens, Apt. 526, Richland, WA 99352); Perkins, Richard W. (1413 Sunset, Richland, WA 99352)

    1997-01-01

    Disclosed is an apparatus for automatically consolidating a filter or, more specifically, an apparatus for drawing a volume of gas through a plurality of sections of a filter, whereafter the sections are subsequently combined for the purpose of simultaneously interrogating the sections to detect the presence of a contaminant.

  5. Associations Between Incident Ischemic Stroke Events and Stroke and Cardiovascular Disease-Related GWAS SNPs in the Population Architecture Using Genomics and Epidemiology (PAGE) Study

    PubMed Central

    Carty, Cara L.; Bůžková, Petra; Fornage, Myriam; Franceschini, Nora; Cole, Shelley; Heiss, Gerardo; Hindorff, Lucia A.; Howard, Barbara V.; Mann, Sue; Martin, Lisa W.; Zhang, Ying; Matise, Tara C.; Prentice, Ross; Reiner, Alexander P.; Kooperberg, Charles

    2012-01-01

    Background Genome-wide association studies (GWAS) have identified loci associated with ischemic stroke (IS) and cardiovascular disease (CVD) in European-descent individuals, but their replication in different populations has been largely unexplored. Methods and Results Nine single-nucleotide polymorphisms (SNPs) selected from GWAS and meta-analyses of stroke and 86 SNPs previously associated with myocardial infarction and CVD risk factors including blood lipids (HDL, LDL, triglycerides), type 2 diabetes and body mass index were investigated for associations with incident IS in European Americans (EA) N=26,276; African Americans (AA) N=8970; and American Indians (AI) N= 3570 from the Population Architecture using Genomics and Epidemiology Study. Ancestry-specific fixed effects meta-analysis with inverse variance weighting was used to combine study-specific log hazard ratios from Cox proportional hazards models. Two of 9 stroke SNPs (rs783396 and rs1804689) were associated with increased IS hazard in AA; none were significant in this large EA cohort. Of 73 CVD risk factor SNPs tested in EA, two (HDL and triglycerides SNPs) were associated with IS. In AA, SNPs associated with LDL, HDL and BMI were significantly associated with IS (3 of 86 SNPs tested). Out of 58 SNPs tested in AI, one LDL SNP was significantly associated with IS. Conclusions Our analyses showing lack of replication in spite of reasonable power for many stroke SNPs and differing results by ancestry highlight the need to follow-up on GWAS findings and conduct genetic association studies in diverse populations. We found modest IS associations with BMI and lipids SNPs, though these findings require confirmation. PMID:22403240
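
    The fixed-effects, inverse-variance-weighted meta-analysis named in this record can be sketched as follows. This is a minimal illustration of the standard formula, not the study's own code; the function name and the input values are hypothetical and are not taken from the PAGE data.

```python
import math


def fixed_effects_meta(log_hrs, std_errs):
    """Pool study-specific log hazard ratios with inverse-variance weights.

    Each study is weighted by 1 / SE^2. The pooled estimate is the weighted
    mean of the log hazard ratios, and its standard error is
    sqrt(1 / sum of weights).
    """
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical log hazard ratios and standard errors from two studies.
example_log_hrs = [0.10, 0.30]
example_std_errs = [0.10, 0.10]

pooled, pooled_se = fixed_effects_meta(example_log_hrs, example_std_errs)
hazard_ratio = math.exp(pooled)  # back-transform to the hazard-ratio scale
print(f"pooled log HR = {pooled:.3f}, SE = {pooled_se:.4f}, HR = {hazard_ratio:.3f}")
```

    With equal standard errors the pooled estimate is the plain average of the two inputs; studies with smaller standard errors would pull the estimate toward their own values.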

  6. Identification and structural comparison of deleterious mutations in nsSNPs of ABL1 gene in chronic myeloid leukemia: a bio-informatics study.

    PubMed

    George Priya Doss, C; Sudandiradoss, C; Rajasekaran, R; Purohit, Rituraj; Ramanathan, K; Sethumadhavan, Rao

    2008-08-01

    Single nucleotide polymorphisms (SNPs) serve as frequent genetic markers along the chromosome. They can, however, have important consequences for individual susceptibility to disease and reactions to medical treatment. Also, the genetics of human phenotype variation could be understood by knowing the functions of these SNPs. Currently, a vast literature exists reporting possible associations between SNPs and diseases. It is still a major challenge to identify the functional SNPs in a disease related gene. In this work, we have analyzed the genetic variation that can alter the expression and function of the ABL1 gene in chronic myeloid leukemia (CML) through computational methods. Out of the total 827 SNPs, 18 were found to be non-synonymous (nsSNPs). Among the 30 SNPs in the untranslated regions (UTR), 3 SNPs were found in the 5' UTR and 27 SNPs in the 3' UTR. It was found that 16.7% of nsSNPs were predicted to be damaging by both the SIFT and PolyPhen servers. The UTR resource tool suggested that 6 out of 27 SNPs in the 3' UTR were functionally significant. The two major mutations that occurred in the native protein (1OPL) coded by the ABL1 gene were at positions 159 (L-->P) and 178 (G-->S). Val (6), Ala (7) and Trp (344) were found to be stabilizing residues in the native protein (1OPL) coded by the ABL1 gene. Even though all three residues were found in the mutant protein 178 (G-->S), only two of them, Val (6) and Ala (7), acted as stabilizing residues in the other mutant, 159 (L-->P). We propose from the overall results obtained in this work that both mutations, 159 (L-->P) and 178 (G-->S), should be considered important in the chronic myeloid leukemia caused by the ABL1 gene. The results of this computational study should be useful to cancer biologists working on experimental protocols. PMID:18243808

  7. Earth Water Filter

    NSDL National Science Digital Library

    2005-12-17

    In this video segment adapted from ZOOM, cast members try to make the most effective water filter. They experiment with filtering dirty, salty water through different combinations of sand, gravel, and a cotton bandana.

  8. A new ALF from Litopenaeus vannamei and its SNPs related to WSSV resistance

    NASA Astrophysics Data System (ADS)

    Liu, Jingwen; Yu, Yang; Li, Fuhua; Zhang, Xiaojun; Xiang, Jianhai

    2014-11-01

    Anti-lipopolysaccharide factors (ALFs) are basic components of the crustacean immune system that defend against a range of pathogens. The cDNA sequence of a new ALF, designated nLvALF2, with an open reading frame encoding 132 amino acids was cloned. Its deduced amino acid sequence contained the conserved functional domain of ALFs, the LPS binding domain (LBD). Its genomic sequence consisted of three exons and four introns. nLvALF2 was mainly expressed in the Oka organ and gills of shrimps. The transcriptional level of nLvALF2 increased significantly after white spot syndrome virus (WSSV) infection, suggesting its important roles in protecting shrimps from WSSV. Single nucleotide polymorphisms (SNPs) were found in the genomic sequence of nLvALF2, of which 38 were analyzed for associations with the susceptibility/resistance of shrimps to WSSV. The loci g.2422 A>G, g.2466 T>C, and g.2529 G>A were significantly associated with the resistance to WSSV (P<0.05). These SNP loci could be developed as markers for selection of WSSV-resistant varieties of Litopenaeus vannamei.

  9. Sparse representation based biomarker selection for schizophrenia with integrated analysis of fMRI and SNPs.

    PubMed

    Cao, Hongbao; Duan, Junbo; Lin, Dongdong; Shugart, Yin Yao; Calhoun, Vince; Wang, Yu-Ping

    2014-11-15

    Integrative analysis of multiple data types can take advantage of their complementary information and therefore may provide higher power to identify potential biomarkers that would be missed using individual data analysis. Due to different natures of diverse data modality, data integration is challenging. Here we address the data integration problem by developing a generalized sparse model (GSM) using weighting factors to integrate multi-modality data for biomarker selection. As an example, we applied the GSM model to a joint analysis of two types of schizophrenia data sets: 759,075 SNPs and 153,594 functional magnetic resonance imaging (fMRI) voxels in 208 subjects (92 cases/116 controls). To solve this small-sample-large-variable problem, we developed a novel sparse representation based variable selection (SRVS) algorithm, with the primary aim to identify biomarkers associated with schizophrenia. To validate the effectiveness of the selected variables, we performed multivariate classification followed by a ten-fold cross validation. We compared our proposed SRVS algorithm with an earlier sparse model based variable selection algorithm for integrated analysis. In addition, we compared with the traditional statistics method for uni-variant data analysis (Chi-squared test for SNP data and ANOVA for fMRI data). Results showed that our proposed SRVS method can identify novel biomarkers that show stronger capability in distinguishing schizophrenia patients from healthy controls. Moreover, better classification ratios were achieved using biomarkers from both types of data, suggesting the importance of integrative analysis. PMID:24530838

  10. Assessment of genetic diversity in lentils (Lens culinaris Medik.) based on SNPs.

    PubMed

    Basheer-Salimia, R; Camilli, B; Scacchi, S; Noli, E; Awad, M

    2015-01-01

    This study is the first attempt to establish an SNP database for the purpose of estimating the genetic diversity and relatedness of Palestinian lentil genotypes. A total of 14 lentil accessions (11 local, two supplied by ICARDA, and one introduced from Italy) were investigated. By sequencing two genes, lectin and lipid transfer protein 5 (LTP5), four SNPs were detected (three in the former and one in the latter gene), with average frequencies of one SNP every 228 and 578 bp, respectively. In addition, two single-nucleotide indels were observed in the non-coding part of LTP5. Four haplotypes were identified in the lectin gene and three in LTP5. One lectin haplotype coincided with that present in GenBank belonging to two cultivated varieties, two were rather similar to it, and the last one turned out to be closer to the sequence of one wild lentil accession, indicating the existence of diversity in the Palestinian germplasm. These results, enhancing the available knowledge of lentil genetic resources in Palestine, may contribute to their conservation and utilization in breeding projects. PMID:26125786

  11. Associate PCR-RFLP assay design with SNPs based on genetic algorithm in appropriate parameters estimation.

    PubMed

    Chuang, Li-Yeh; Cheng, Yu-Huei; Yang, Cheng-Huei; Yang, Cheng-Hong

    2013-06-01

    Polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) is a commonly used laboratory technique that is useful in small-scale basic research studies of complex genetic diseases associated with single nucleotide polymorphisms (SNPs). Before a PCR-RFLP assay for SNP genotyping can be performed, two things are required: a feasible primer pair that satisfies numerous constraints, and an available restriction enzyme that discriminates the target SNP. Computing feasible PCR-RFLP primers while simultaneously finding available restriction enzymes for a target SNP is a challenging problem. Here, we propose a method that combines the updated core of SNP-RFLPing with a genetic algorithm to reliably mine available restriction enzymes and search for feasible PCR-RFLP primers. We simulated the method in silico on the SLC6A4 gene under different parameter settings and identified an appropriate parameter setting. Wet laboratory validation showed that the method is indeed usable for providing available restriction enzymes and designing feasible primers that fit the common primer constraints. We have provided an easy-to-use, friendly interface to assist researchers in designing their PCR-RFLP assays for SNP genotyping. The program is implemented in JAVA and is freely available at http://bio.kuas.edu.tw/ganpd/. PMID:23722280

  12. A Novel SNPs Detection Method Based on Gold Magnetic Nanoparticles Array and Single Base Extension

    PubMed Central

    Li, Song; Liu, Hongna; Jia, Yingying; Deng, Yan; Zhang, Liming; Lu, Zhuoxuan; He, Nongyue

    2012-01-01

    To fulfill the increasing need for large-scale genetic research, a high-throughput and automated SNP genotyping method based on a gold magnetic nanoparticle (GMNP) array and dual-color single base extension has been designed. After amplification of DNA templates, biotinylated extension primers were captured by streptavidin-coated gold magnetic nanoparticles (SA-GMNPs). Next, a solid-phase, dual-color single base extension (SBE) reaction with the specific biotinylated primer was performed directly on the surface of the GMNPs. Finally, a “bead array” was fabricated by spotting GMNPs with fluorophore on a clean glass slide, and the genotype of each sample was discriminated by scanning the “bead array”. The MTHFR gene C677T polymorphism was interrogated in 320 individual samples using this method; the signal/noise ratio for homozygous samples was over 12.33, while the signal/noise ratio for heterozygous samples was near 1. Compared with other dual-color hybridization based genotyping methods, the method described here gives a higher signal/noise ratio, and SNP loci can be identified with a high level of confidence. This assay has the advantage of eliminating the need for background subtraction and allowing direct analysis of the fluorescence values of the GMNPs to determine their genotypes without the otherwise necessary procedures for purification and complex reduction of PCR products. The application of this strategy to large-scale SNP studies simplifies the process and reduces the labor required to produce highly sensitive results while improving the potential for automation. PMID:23139724

  13. Scale invariant correlations between genes and SNPs on Human chromosome 1 reveal potential evolutionary mechanisms.

    PubMed

    Kendal, Wayne S

    2007-03-21

    The local density of gene structures and single nucleotide polymorphisms (SNPs) along human chromosomes appears inhomogeneous. In chromosome 1, the density patterns from both these elements are shown here to exhibit similar scale invariant clustering, as well as long-ranged and scale invariant auto- and cross-correlations. The local densities of these elements can be accurately represented by the scale invariant exponential dispersion models, a group of stochastic models that act as limiting distributions for a wide range of generalized linear models. The scale invariant Poisson-gamma (PG) distribution is the most applicable of these models, since it describes the above findings and it lends itself to a stochastic mechanism for the accumulation of segmental chromosomal changes. This PG model describes the summation of neutral chromosomal mutations, deletions, rearrangements and recombinations, within chromosomal segments that are distinguished by their evolutionary genealogies. Scale invariance is a necessary property if such a description is to remain valid at different measurement scales. The observed density patterns, and proposed model, presumably represent the convergent summation of multiple stochastic processes within the evolutionary history of the chromosome. PMID:17137602

  14. Association of ESR1 gene tagging SNPs with breast cancer risk.

    PubMed

    Dunning, Alison M; Healey, Catherine S; Baynes, Caroline; Maia, Ana-Teresa; Scollen, Serena; Vega, Ana; Rodríguez, Raquel; Barbosa-Morais, Nuno L; Ponder, Bruce A J; Low, Yen-Ling; Bingham, Sheila; Haiman, Christopher A; Le Marchand, Loic; Broeks, Annegien; Schmidt, Marjanka K; Hopper, John; Southey, Melissa; Beckmann, Matthias W; Fasching, Peter A; Peto, Julian; Johnson, Nichola; Bojesen, Stig E; Nordestgaard, Børge; Milne, Roger L; Benitez, Javier; Hamann, Ute; Ko, Yon; Schmutzler, Rita K; Burwinkel, Barbara; Schürmann, Peter; Dörk, Thilo; Heikkinen, Tuomas; Nevanlinna, Heli; Lindblom, Annika; Margolin, Sara; Mannermaa, Arto; Kosma, Veli-Matti; Chen, Xiaoqing; Spurdle, Amanda; Change-Claude, Jenny; Flesch-Janys, Dieter; Couch, Fergus J; Olson, Janet E; Severi, Gianluca; Baglietto, Laura; Børresen-Dale, Anne-Lise; Kristensen, Vessela; Hunter, David J; Hankinson, Susan E; Devilee, Peter; Vreeswijk, Maaike; Lissowska, Jolanta; Brinton, Louise; Liu, Jianjun; Hall, Per; Kang, Daehee; Yoo, Keun-Young; Shen, Chen-Yang; Yu, Jyh-Cherng; Anton-Culver, Hoda; Ziogoas, Argyrios; Sigurdson, Alice; Struewing, Jeff; Easton, Douglas F; Garcia-Closas, Montserrat; Humphreys, Manjeet K; Morrison, Jonathan; Pharoah, Paul D P; Pooley, Karen A; Chenevix-Trench, Georgia

    2009-03-15

    We have conducted a three-stage, comprehensive single nucleotide polymorphism (SNP)-tagging association study of ESR1 gene variants (SNPs) in more than 55,000 breast cancer cases and controls from studies within the Breast Cancer Association Consortium (BCAC). No large risks or highly significant associations were revealed. SNP rs3020314, tagging a region of ESR1 intron 4, is associated with an increase in breast cancer susceptibility with a dominant mode of action in European populations. Carriers of the c-allele have an odds ratio (OR) of 1.05 [95% Confidence Intervals (CI) 1.02-1.09] relative to t-allele homozygotes, P = 0.004. There is significant heterogeneity between studies, P = 0.002. The increased risk appears largely confined to oestrogen receptor-positive tumour risk. The region tagged by SNP rs3020314 contains sequence that is more highly conserved across mammalian species than the rest of intron 4, and it may subtly alter the ratio of two mRNA splice forms. PMID:19126777

  15. Filter service system

    DOEpatents

    Sellers, Cheryl L. (Peoria, IL); Nordyke, Daniel S. (Arlington Heights, IL); Crandell, Richard A. (Morton, IL); Tomlins, Gregory (Peoria, IL); Fei, Dong (Peoria, IL); Panov, Alexander (Dunlap, IL); Lane, William H. (Chillicothe, IL); Habeger, Craig F. (Chillicothe, IL)

    2008-12-09

    According to an exemplary embodiment of the present disclosure, a system for removing matter from a filtering device includes a gas pressurization assembly. An element of the assembly is removably attachable to a first orifice of the filtering device. The system also includes a vacuum source fluidly connected to a second orifice of the filtering device.

  16. Nonlinear Attitude Filtering Methods

    NASA Technical Reports Server (NTRS)

    Markley, F. Landis; Crassidis, John L.; Cheng, Yang

    2005-01-01

    This paper provides a survey of modern nonlinear filtering methods for attitude estimation. Early applications relied mostly on the extended Kalman filter for attitude estimation. Since these applications, several new approaches have been developed that have proven to be superior to the extended Kalman filter. Several of these approaches maintain the basic structure of the extended Kalman filter, but employ various modifications in order to provide better convergence or improve other performance characteristics. Examples of such approaches include: filter QUEST, extended QUEST, the super-iterated extended Kalman filter, the interlaced extended Kalman filter, and the second-order Kalman filter. Filters that propagate and update a discrete set of sigma points rather than using linearized equations for the mean and covariance are also reviewed. A two-step approach is discussed with a first-step state that linearizes the measurement model and an iterative second step to recover the desired attitude states. These approaches are all based on the Gaussian assumption that the probability density function is adequately specified by its mean and covariance. Other approaches that do not require this assumption are reviewed, including particle filters and a Bayesian filter based on a non-Gaussian, finite-parameter probability density function on SO(3). Finally, the predictive filter, nonlinear observers and adaptive approaches are shown. The strengths and weaknesses of the various approaches are discussed.

  17. The Ribosome Filter Redux

    PubMed Central

    Mauro, Vincent P.; Edelman, Gerald M.

    2010-01-01

    The ribosome filter hypothesis postulates that ribosomes are not simply translation machines but also function as regulatory elements that differentially affect or filter the translation of particular mRNAs. On the basis of new information, we take the opportunity here to review the ribosome filter hypothesis, suggest specific mechanisms of action, and discuss recent examples from the literature that support it. PMID:17890902

  18. HEPA filter encapsulation

    DOEpatents

    Gates-Anderson, Dianne D. (Union City, CA); Kidd, Scott D. (Brentwood, CA); Bowers, John S. (Manteca, CA); Attebery, Ronald W. (San Lorenzo, CA)

    2003-01-01

    A low viscosity resin is delivered into a spent HEPA filter or other waste. The resin is introduced into the filter or other waste using a vacuum to assist in the mass transfer of the resin through the filter media or other waste.

  19. Auditory filter bank inversion

    Microsoft Academic Search

    L. Lin; W. H. Holines; Eliathamby Ambikairajah

    2001-01-01

    Models of auditory filtering using the Gammatone filter bank are useful tools in speech processing. A perceptually accurate auditory inversion model has applications in speech and audio coding. This paper proposes a new auditory filter bank inversion method using a least squares optimization technique. The proposed method is computationally efficient and its low delay makes it suitable for frame-by-frame processing.

  20. Replication and Predictive Value of SNPs Associated with Melanoma and Pigmentation Traits in a Southern European Case-Control Study

    PubMed Central

    Stefanaki, Irene; Gogas, Helen; Kypreou, Katerina P.; Chatzinasiou, Foteini; Nikolaou, Vasiliki; Plaka, Michaela; Kalfa, Iro; Antoniou, Christina; Ioannidis, John P. A.; Evangelou, Evangelos; Stratigos, Alexander J.

    2013-01-01

    Background Genetic association studies have revealed numerous polymorphisms conferring susceptibility to melanoma. We aimed to replicate previously discovered melanoma-associated single-nucleotide polymorphisms (SNPs) in a Greek case-control population, and examine their predictive value. Methods Based on a field synopsis of genetic variants of melanoma (MelGene), we genotyped 284 patients and 284 controls at 34 melanoma-associated SNPs of which 19 derived from GWAS. We tested each one of the 33 SNPs passing quality control for association with melanoma both with and without accounting for the presence of well-established phenotypic risk factors. We compared the risk allele frequencies between the Greek population and the HapMap CEU sample. Finally, we evaluated the predictive ability of the replicated SNPs. Results Risk allele frequencies were significantly lower compared to the HapMap CEU for eight SNPs (rs16891982 – SLC45A2, rs12203592 – IRF4, rs258322 – CDK10, rs1805007 – MC1R, rs1805008 - MC1R, rs910873 - PIGU, rs17305573- PIGU, and rs1885120 - MTAP) and higher for one SNP (rs6001027 – PLA2G6) indicating a different profile of genetic susceptibility in the studied population. Previously identified effect estimates modestly correlated with those found in our population (r = 0.72, P<0.0001). The strongest associations were observed for rs401681-T in CLPTM1L (odds ratio [OR] 1.60, 95% CI 1.22–2.10; P = 0.001), rs16891982-C in SLC45A2 (OR 0.51, 95% CI 0.34–0.76; P = 0.001), and rs1805007-T in MC1R (OR 4.38, 95% CI 2.03–9.43; P = 2×10^-5). Nominally statistically significant associations were seen also for another 5 variants (rs258322-T in CDK10, rs1805005-T in MC1R, rs1885120-C in MYH7B, rs2218220-T in MTAP and rs4911442-G in the ASIP region). The addition of all SNPs with nominal significance to a clinical non-genetic model did not substantially improve melanoma risk prediction (AUC for clinical model 83.3% versus 83.9%, p = 0.66).
Conclusion Overall, our study has validated genetic variants that are likely to contribute to melanoma susceptibility in the Greek population. PMID:23393597
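
    The odds ratios, confidence intervals and P values quoted in the abstract above can be related to one another with a short calculation. The following is a minimal sketch, assuming the reported intervals are Wald-type 95% intervals that are symmetric on the log-odds scale; the abstract does not state how the intervals were computed, and the function name p_from_or_ci is hypothetical.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z statistic and two-sided P value from an OR and its 95% CI.

    Assumption (not taken from the abstract): the interval is
    log(OR) +/- z_crit * SE, i.e. a Wald interval on the log scale.
    """
    # Back out the standard error of log(OR) from the interval width.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal variate.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# rs401681-T in CLPTM1L as reported: OR 1.60, 95% CI 1.22-2.10
z, p = p_from_or_ci(1.60, 1.22, 2.10)
print(f"z = {z:.2f}, P = {p:.4f}")
```

    Under that assumption, the P values recovered for the first two associations come out on the order of 0.001, which is consistent with the rounded values in the abstract.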

  1. Race and ethnicity data quality and imputation using U.S. Census data in an integrated health system: the Kaiser Permanente Southern California experience.

    PubMed

    Derose, Stephen F; Contreras, Richard; Coleman, Karen J; Koebnick, Corinna; Jacobsen, Steven J

    2013-06-01

    Research on racial and ethnic disparities using health system databases can shed light on the usual health care and outcomes of large numbers of individuals so that health inequities can be better understood and addressed. Such research often suffers from limitations in race/ethnicity data quality. We examined the quality of race/ethnicity data in a large, diverse, integrated health system that repeatedly collects these data on utilization of services. We tested the accuracy of Bayesian Improved Surname Geocoding for imputation of race/ethnicity data. Administrative race/ethnicity data were accurate as judged by comparison with self-report in adults. The Bayesian Improved Surname Geocoding method produced imputation results far better than chance assignment for the four most common race/ethnicity groups in the health system: Whites, Hispanics, Blacks, and Asians. These results support renewed efforts to conduct studies of racial and ethnic disparities in large health systems. PMID:23169896

  2. The Advantage of Imputation of Missing Income Data to Evaluate the Association Between Income and Self-Reported Health Status (SRH) in a Mexican American Cohort Study

    Microsoft Academic Search

    Anthony B. Ryder; Anna V. Wilkinson; Michelle K. McHugh; Katherine Saunders; Sumesh Kachroo; Anthony D’Amelio; Melissa Bondy; Carol J. Etzel

    Missing data often occur in cross-sectional surveys and longitudinal and experimental studies. The purpose of this study was to compare the prediction of self-rated health (SRH), a robust predictor of morbidity and mortality among diverse populations, before and after imputation of the missing variable “yearly household income.” We reviewed data from 4,162 participants of Mexican origin recruited from July 1,

  3. The impact of using different imputation methods for missing quality of life scores on the estimation of the cost-effectiveness of lung-volume-reduction surgery

    Microsoft Academic Search

    David K. Blough; Scott Ramsey; Sean D. Sullivan; Roger Yusen

    2009-01-01

    A post hoc analysis of data from a prospective cost-effectiveness analysis (CEA) conducted alongside a randomized controlled trial (National Emphysema Treatment Trial - NETT) was used to assess the impact of using different imputation methods for missing quality of life data on the estimation of the incremental cost-effectiveness ratio (ICER). The NETT compared lung-volume-reduction surgery plus medical therapy with medical

  4. Ten recently identified associations between nsSNPs and colorectal cancer could not be replicated in German families.

    PubMed

    Frank, Bernd; Burwinkel, Barbara; Bermejo, Justo Lorenzo; Försti, Asta; Hemminki, Kari; Houlston, Richard; Mangold, Elisabeth; Rahner, Nils; Friedl, Waltraut; Friedrichs, Nicolaus; Buettner, Reinhard; Engel, Christoph; Loeffler, Markus; Holinski-Feder, Elke; Morak, Monika; Keller, Gisela; Schackert, Hans K; Krüger, Stefan; Goecke, Timm; Moeslein, Gabriela; Kloor, Matthias; Gebert, Johannes; Kunstmann, Erdmute; Schulmann, Karsten; Rüschoff, Josef; Propping, Peter

    2008-11-18

    Ten non-synonymous single nucleotide polymorphisms (nsSNPs), which were recently associated with colorectal cancer risk in a comprehensive, array based study (AKAP9 M463I, DKK3 G335R, AMPD1 Q12X, LIPC L356F, PSMB9 V32I, THBS1 N700S, CA6 S90G, ASCC3 C1995S, DHX36 S416C and CPA4 G303C) were re-evaluated in the present study based on 626 German familial non-HNPCC colorectal cancer patients and 736 healthy controls. No associations of any of the 10 nsSNPs with colorectal cancer could be replicated. The combined analyses indicated that further research based on additional independent samples is required. PMID:18619730

  5. Scanning the genome for gene SNPs related to climate adaptation and estimating selection at the molecular level in boreal black spruce.

    PubMed

    Prunier, Julien; Laroche, Jérôme; Beaulieu, Jean; Bousquet, Jean

    2011-04-01

    Outlier detection methods were used to scan the genome of the boreal conifer black spruce (Picea mariana [Mill.] B.S.P.) for gene single-nucleotide polymorphisms (SNPs) potentially involved in adaptations to temperature and precipitation variations. The scan involved 583 SNPs from 313 genes potentially playing adaptive roles. Differentiation estimates among population groups defined following variation in temperature and precipitation were moderately high for adaptive quantitative characters such as the timing of budset or tree height (Q(ST) = 0.189-0.314). Average differentiation estimates for gene SNPs were null, with F(ST) values of 0.005 and 0.006, respectively, among temperature and precipitation population groups. Using two detection approaches, a total of 26 SNPs from 25 genes distributed among 11 of the 12 linkage groups of black spruce were detected as outliers with F(ST) as high as 0.078. Nearly half of the outlier SNPs were located in exons and half of those were nonsynonymous. The functional annotations of genes carrying outlier SNPs and regression analyses between the frequencies of these SNPs and climatic variables supported their implication in adaptive processes. Several genes carrying outlier SNPs belonged to gene families previously found to harbour outlier SNPs in a reproductively isolated but largely sympatric congeneric species, suggesting differential subfunctionalization of gene duplicates. Selection coefficient estimates (S) were moderate but well above the magnitude of drift (>1/N(e)), indicating that the signature of natural selection could be detected at the nucleotide level despite the recent establishment of these populations during the Holocene. PMID:21375634
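
    For readers unfamiliar with the F(ST) values quoted above, the basic quantity can be sketched as follows. This is only the textbook heterozygosity-based definition for one biallelic SNP with equally weighted population groups; it is not the outlier detection method used in the study, which the abstract does not specify. The function name and the allele frequencies are hypothetical.

```python
def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic SNP from per-group allele frequencies.

    Assumptions (illustrative only): groups are equally weighted and no
    sample-size correction is applied.
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    # Expected heterozygosity in the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within groups.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / k
    if h_total == 0.0:
        return 0.0  # monomorphic SNP: no differentiation to measure
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies in two climate-defined population groups.
print(round(fst_biallelic([0.4, 0.6]), 3))
```

    An outlier scan then asks which SNPs have F(ST) values that are unusually high relative to the genome-wide background, which in this study averaged 0.005 to 0.006.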

  6. Common non-synonymous SNPs associated with breast cancer susceptibility: findings from the Breast Cancer Association Consortium

    E-print Network

    Milne, Roger L.; Burwinkel, Barbara; Michailidou, Kyriaki; Arias-Perez, Jose-Ignacio; Zamora, M. Pilar; Menéndez-Rodríguez, Primitiva; Hardisson, David; Mendiola, Marta; González-Neira, Anna; Pita, Guillermo; Alonso, M. Rosario; Dennis, Joe; Wang, Qin; Bolla, Manjeet K.; Swerdlow, Anthony; Ashworth, Alan; Orr, Nick; Schoemaker, Minouk; Ko, Yon-Dschun; Brauch, Hiltrud; Hamann, Ute; The GENICA Network; Andrulis, Irene L.; Knight, Julia A.; Glendon, Gord; Tchatchou, Sandrine; kConFab Investigators; Australian Ovarian Cancer Study Group; Matsuo, Keitaro; Ito, Hidemi; Iwata, Hiroji; Tajima, Kazuo; Li, Jingmei; Brand, Judith S.; Brenner, Hermann; Dieffenbach, Aida Karina; Arndt, Volker; Stegmaier, Christa; Lambrechts, Diether; Peuteman, Gilian; Christiaens, Marie-Rose; Smeets, Ann; Jakubowska, Anna; Lubinski, Jan; Jaworska-Bieniek, Katarzyna; Durda, Katazyna; Hartman, Mikael; Hui, Miao; Lim, Wei Yen; Chan, Ching Wan; Marme, Federick; Yang, Rongxi; Bugert, Peter; Lindblom, Annika; Margolin, Sara; García-Closas, Montserrat; Chanock, Stephen J.; Lissowska, Jolanta; Figueroa, Jonine D.; Bojesen, Stig E.; Nordestgaard, Børge G.; Flyger, Henrik; Hooning, Maartje J.; Kriege, Mieke; van den Ouweland, Ans M.W.; Koppert, Linetta B.; Fletcher, Olivia; Johnson, Nichola; dos-Santos-Silva, Isabel; Peto, Julian; Zheng, Wei; Deming-Halverson, Sandra; Shrubsole, Martha J.; Long, Jirong; Chang-Claude, Jenny; Rudolph, Anja; Seibold, Petra; Flesch-Janys, Dieter; Winqvist, Robert; Pylkäs, Katri; Jukkola-Vuorinen, Arja; Grip, Mervi; Cox, Angela; Cross, Simon S.; Reed, Malcolm W.R.; Schmidt, Marjanka K.; Broeks, Annegien; Cornelissen, Sten; Braaf, Linde; Kang, Daehee; Choi, Ji-Yeob; Park, Sue K.; Noh, Dong-Young; Simard, Jacques; Dumont, Martine; Goldberg, Mark S.; Labrèche, France; Fasching, Peter A.; Hein, Alexander; Ekici, Arif B.; Beckmann, Matthias W.; Radice, Paolo; Peterlongo, Paolo; Azzollini, Jacopo; Barile, Monica; Sawyer, Elinor; Tomlinson, Ian; Kerin, Michael; Miller, Nicola; 
Hopper, John L.; Schmidt, Daniel F.; Makalic, Enes; Southey, Melissa C.; Teo, Soo Hwang; Yip, Cheng Har; Sivanandan, Kavitta; Tay, Wan-Ting; Shen, Chen-Yang; Hsiung, Chia-Ni; Yu, Jyh-Cherng; Hou, Ming-Feng; Guénel, Pascal; Truong, Therese; Sanchez, Marie; Mulot, Claire; Blot, William; Cai, Qiuyin; Nevanlinna, Heli; Muranen, Taru A.; Aittomäki, Kristiina; Blomqvist, Carl; Wu, Anna H.; Tseng, Chiu-Chen; Van Den Berg, David; Stram, Daniel O.; Bogdanova, Natalia; Dörk, Thilo; Muir, Kenneth; Lophatananon, Artitaya; Stewart-Brown, Sarah; Siriwanarangsan, Pornthep; Mannermaa, Arto; Kataja, Vesa; Kosma, Veli-Matti; Hartikainen, Jaana M.; Shu, Xiao-Ou; Lu, Wei; Yu-Tang, Gao; Zhang, Ben; Couch, Fergus J.; Toland, Amanda E.; TNBCC; Yannoukakos, Drakoulis; Sangrajrang, Suleeporn; McKay, James; Wang, Xianshu; Olson, Janet E.; Vachon, Celine; Purrington, Kristen; Severi, Gianluca; Baglietto, Laura; Haiman, Christopher A.; Henderson, Brian E.; Schumacher, Fredrick; Marchand, Loic Le; Devilee, Peter; Tollenaar, Robert A.E.M.; Seynaeve, Caroline; Czene, Kamila; Eriksson, Mikael; Humphreys, Keith; Darabi, Hatef; Ahmed, Shahana; Shah, Mitul; Pharoah, Paul D.P.; Hall, Per; Giles, Graham G.; Benítez, Javier; Dunning, Alison M.; Chenevix-Trench, Georgia; Easton, Douglas F.

    2014-07-04

    unconditional logistic regression. Strong evidence of association was observed for three nsSNPs: ATXN7-K264R at 3p21 [rs1053338, per allele OR = 1.07, 95% confidence interval (CI) = 1.04–1.10, P = 2.9 × 10^-6], AKAP9-M463I at 7q21 (rs6964587, OR = 1.05, 95% CI = 1...

  7. High Allelic Burden of Four Obesity SNPs Is Associated With Poorer Weight Loss Outcomes Following Gastric Bypass Surgery

    Microsoft Academic Search

    Christopher D. Still; G. Craig Wood; Xin Chu; Robert Erdman; Christina H. Manney; Peter N. Benotti; Anthony T. Petrick; William E. Strodel; Uyenlinh L. Mirshahi; Tooraj Mirshahi; David J. Carey; Glenn S. Gerhard

    2011-01-01

    Genome-wide association and linkage studies have identified multiple susceptibility loci for obesity. We hypothesized that such loci may affect weight loss outcomes following dietary or surgical weight loss interventions. A total of 1,001 white individuals with extreme obesity (BMI >35 kg/m2) who underwent a preoperative diet/behavioral weight loss intervention and Roux-en-Y gastric bypass surgery were genotyped for single-nucleotide polymorphisms (SNPs)

  8. FNDC5 (irisin) gene and exceptional longevity: a functional replication study with rs16835198 and rs726344 SNPs.

    PubMed

    Sanchis-Gomar, Fabian; Garatachea, Nuria; He, Zi-hong; Pareja-Galeano, Helios; Fuku, Noriyuki; Tian, Ye; Arai, Yasumichi; Abe, Yukiko; Murakami, Haruka; Miyachi, Motohiko; Yvert, Thomas; Santiago, Catalina; Venturini, Letizia; Fiuza-Luces, Carmen; Santos-Lozano, Alejandro; Rodríguez-Romo, Gabriel; Ricevuti, Giovanni; Hirose, Nobuyoshi; Emanuele, Enzo; Lucia, Alejandro

    2014-01-01

    Irisin might play an important role in reducing the risk of obesity, insulin resistance, or several related diseases, and high irisin levels may contribute to successful aging. Thus, the irisin precursor (FNDC5) gene is a candidate to influence exceptional longevity (EL), i.e., being a centenarian. It has been recently shown that two single-nucleotide polymorphisms (SNPs) in the FNDC5 gene, rs16835198 and rs726344, are associated with in vivo insulin sensitivity in adults. We determined luciferase gene reporter activity in the two above-mentioned SNPs and studied genotype distributions among centenarians (n = 175, 144 women) and healthy controls (n = 347, 142 women) from Spain. We also studied an Italian [79 healthy centenarians (40 women) and 316 healthy controls (156 women)] and a Japanese cohort [742 centenarians (623 women) and 499 healthy controls (356 women)]. The rs726344 SNP had functional significance, as shown by differences in luciferase activity between the constructs of this SNP (all P ≤ 0.05), with the variant A-allele having higher luciferase activity compared with the G-allele (P = 0.04). For the rs16835198 SNP, the variant T-allele tended to show higher luciferase activity compared with the G-allele (P = 0.07). However, we found no differences between genotype/allele frequencies of the two SNPs in centenarians versus controls in any cohort, and no significant association (using logistic regression adjusted by sex) between the two SNPs and EL. Further research is needed with different cohorts as well as with additional variants in the FNDC5 gene or in other genes involved in irisin signaling. PMID:25427998

  9. Identification of 187 single nucleotide polymorphisms (SNPs) among 41 candidate genes for ischemic heart disease in the Japanese population

    Microsoft Academic Search

    Yozo Ohnishi; Toshihiro Tanaka; Ryo Yamada; Koji Suematsu; Maiko Minami; Kenshi Fujii; Noritake Hoki; Kazuhisa Kodama; Seiki Nagata; Tohru Hayashi; Naokazu Kinoshita; Hiroshi Sato; Hideyuki Sato; Tsunehiko Kuzuya; Hiroshi Takeda; Masatsugu Hori; Yusuke Nakamura

    2000-01-01

    To investigate whether common variants in the human genetic background are associated with pathogenesis of ischemic heart diseases, we systematically surveyed 41 possible candidate genes for single-nucleotide polymorphisms (SNPs) by directly sequencing 96 independent alleles at each locus, derived from 48 unrelated Japanese patients with myocardial infarction, including 25.8 kb 5' flanking regions, 56.8 kb exonic and 35.4 kb intronic

  10. Whole-Genome Resequencing Analysis of Hanwoo and Yanbian Cattle to Identify Genome-Wide SNPs and Signatures of Selection

    PubMed Central

    Choi, Jung-Woo; Choi, Bong-Hwan; Lee, Seung-Hwan; Lee, Seung-Soo; Kim, Hyeong-Cheol; Yu, Dayeong; Chung, Won-Hyong; Lee, Kyung-Tai; Chai, Han-Ha; Cho, Yong-Min; Lim, Dajeong

    2015-01-01

    Over the last 30 years, Hanwoo has been selectively bred to improve economically important traits. Hanwoo is currently the representative Korean native beef cattle breed, and it is believed that it shared an ancestor with a Chinese breed, Yanbian cattle, until the last century. However, these two breeds have experienced different selection pressures during recent decades. Here, we whole-genome sequenced 10 animals each of Hanwoo and Yanbian cattle (20 total) using the Illumina HiSeq 2000 sequencer. A total of approximately 3.12 and 3.07 billion sequence reads were mapped to the bovine reference sequence assembly (UMD 3.1) at an average of approximately 10.71- and 10.53-fold coverage for Hanwoo and Yanbian cattle, respectively. A total of 17,936,399 single nucleotide polymorphisms (SNPs) were yielded, of which 22.3% were found to be novel. By annotating the SNPs, we further retrieved numerous nonsynonymous SNPs that may be associated with traits of interest in cattle. Furthermore, we performed whole-genome screening to detect signatures of selection throughout the genome. We located several promising selective sweeps that are potentially responsible for economically important traits in cattle; the PPP1R12A gene is an example of a gene that potentially affects intramuscular fat content. These discoveries provide valuable genomic information regarding potential genomic markers that could predict traits of interest for breeding programs of these cattle breeds. PMID:26018558

  11. Next-generation RAD sequencing identifies thousands of SNPs for assessing hybridization between rainbow and westslope cutthroat trout.

    PubMed

    Hohenlohe, Paul A; Amish, Stephen J; Catchen, Julian M; Allendorf, Fred W; Luikart, Gordon

    2011-03-01

    The increased numbers of genetic markers produced by genomic techniques have the potential to both identify hybrid individuals and localize chromosomal regions responding to selection and contributing to introgression. We used restriction-site-associated DNA sequencing to identify a dense set of candidate SNP loci with fixed allelic differences between introduced rainbow trout (Oncorhynchus mykiss) and native westslope cutthroat trout (Oncorhynchus clarkii lewisi). We distinguished candidate SNPs from homeologs (paralogs resulting from whole-genome duplication) by detecting excessively high observed heterozygosity and deviations from Hardy-Weinberg proportions. We identified 2923 candidate species-specific SNPs from a single Illumina sequencing lane containing 24 barcode-labelled individuals. Published sequence data and ongoing genome sequencing of rainbow trout will allow physical mapping of SNP loci for genome-wide scans and will also provide flanking sequence for design of qPCR-based TaqMan(®) assays for high-throughput, low-cost hybrid identification using a subset of 50-100 loci. This study demonstrates that it is now feasible to identify thousands of informative SNPs in nonmodel species quickly and at reasonable cost, even if no prior genomic information is available. PMID:21429168
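
    The screen for homeologs described above rests on two per-locus quantities: observed versus expected heterozygosity, and the deviation of genotype counts from Hardy-Weinberg proportions. The following is a minimal sketch of how such a check might look for one biallelic locus. The function names, the genotype counts and the use of a chi-square cutoff of 3.841 (1 degree of freedom, 5% level) are assumptions made for illustration; the abstract does not give the authors' actual criteria.

```python
def hwe_summary(n_aa, n_ab, n_bb):
    """Observed/expected heterozygosity and HWE chi-square for one locus."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Pearson chi-square over the three genotype classes.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return n_ab / n, 2 * p * q, chi2


def looks_like_homeolog(n_aa, n_ab, n_bb, chi2_cutoff=3.841):
    """Flag loci with excess heterozygotes that also deviate from HWE."""
    h_obs, h_exp, chi2 = hwe_summary(n_aa, n_ab, n_bb)
    return h_obs > h_exp and chi2 > chi2_cutoff


# Hypothetical counts for 24 individuals: every sample scored heterozygous,
# the pattern expected when reads from two duplicated loci are merged.
print(hwe_summary(0, 24, 0), looks_like_homeolog(0, 24, 0))
```

    A locus in which all individuals appear heterozygous is the extreme case; in practice a threshold on observed heterozygosity would be tuned to the data.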

  12. dbSMR: a novel resource of genome-wide SNPs affecting microRNA mediated regulation

    PubMed Central

    Hariharan, Manoj; Scaria, Vinod; Brahmachari, Samir K

    2009-01-01

    Background MicroRNAs (miRNAs) regulate several biological processes through post-transcriptional gene silencing. The efficiency of binding of miRNAs to target transcripts depends on the sequence as well as intramolecular structure of the transcript. Single Nucleotide Polymorphisms (SNPs) can contribute to alterations in the structure of regions flanking them, thereby influencing the accessibility for miRNA binding. Description The entire human genome was analyzed for SNPs in and around predicted miRNA target sites. Polymorphisms within 200 nucleotides that could alter the intramolecular structure at the target site, thereby altering regulation, were annotated. Collated information was ported in a MySQL database with a user-friendly interface accessible through the URL: . Conclusion The database has a user-friendly interface where the information can be queried using either the gene name, microRNA name, polymorphism ID or transcript ID. Combination queries using 'AND' or 'OR' are also possible, along with specifying the degree of change of intramolecular bonding with and without the polymorphism. Such a resource would enable researchers to address questions like the role of regulatory SNPs in the 3' UTRs and population specific regulatory modulations in the context of microRNA targets. PMID:19371411

  13. Identification of single nucleotide polymorphisms (SNPs) in the 16S rRNA gene of foodborne Bacillus spp.

    PubMed

    Fernández-No, I C; Böhme, K; Caamaño-Antelo, S; Barros-Velázquez, J; Calo-Mata, P

    2015-04-01

    The main goal of this work was the identification of single nucleotide polymorphisms (SNPs) in the 16S rRNA gene of foodborne Bacillus spp. that may be useful for typing purposes. These species include, among others, Bacillus cereus, an important pathogenic species involved in food poisoning, and Bacillus licheniformis, Bacillus subtilis and Bacillus pumilus, which are causative agents of food spoilage described as responsible for foodborne disease outbreaks. With this purpose in mind, 52 Bacillus strains isolated from culture collections and fresh and processed food were considered. SNP type "Y" at sites 212 and 476 appeared in the majority of B. licheniformis studied strains. SNP type "R" at site 278 was detected in many strains of the B. subtilis/Bacillus amyloliquefaciens group, while polymorphism "Y" at site 173 was characteristic of the majority of strains of B. cereus/Bacillus thuringiensis group. The analysis of SNPs provided more intra-specific information than phylogenetic analysis in the cases of B. cereus and B. subtilis. Moreover, this study describes novel SNPs that should be considered when designing 16S rRNA-based primers and probes for multiplex-PCR, Real-Time PCR and microarray systems for foodborne Bacillus spp. PMID:25475292

  14. Investigating the function of three non-synonymous SNPs in EGFR gene: structural modelling and association with breast cancer.

    PubMed

    Choura, Mouna; Frikha, Fakher; Kharrat, Najla; Aifa, Sami; Rebaï, Ahmed

    2010-01-01

    Non-synonymous single nucleotide polymorphisms (nsSNPs) represent common genomic variations that alter protein sequence and function. Some nsSNPs affecting conserved amino acids have been reported to be associated with cancer susceptibility. Interestingly, Epidermal Growth Factor Receptor (EGFR) is commonly overexpressed and mutated in many cancers. In this study, we investigated the structural effect of three deleterious nsSNPs: rs17337451 (R962G), rs1140476 (R977C) and rs17290699 (H988P) within EGFR using computational tools. The modelled mutant dimers showed less stability than wild type EGFR dimer. Furthermore, we showed the important role of R962 and H988 residues in the EGFR dimer formation. We also report preliminary experimental data for SNP R977C suggesting that the variant C977 might confer greater risk for breast cancer. These results contribute to an improved understanding of the EGFR dimer stability and provide new elements for understanding the relationship between EGFR and cancer. PMID:20049516

  15. Whole-Genome Resequencing Analysis of Hanwoo and Yanbian Cattle to Identify Genome-Wide SNPs and Signatures of Selection.

    PubMed

    Choi, Jung-Woo; Choi, Bong-Hwan; Lee, Seung-Hwan; Lee, Seung-Soo; Kim, Hyeong-Cheol; Yu, Dayeong; Chung, Won-Hyong; Lee, Kyung-Tai; Chai, Han-Ha; Cho, Yong-Min; Lim, Dajeong

    2015-05-31

    Over the last 30 years, Hanwoo has been selectively bred to improve economically important traits. Hanwoo is currently the representative Korean native beef cattle breed, and it is believed that it shared an ancestor with a Chinese breed, Yanbian cattle, until the last century. However, these two breeds have experienced different selection pressures during recent decades. Here, we whole-genome sequenced 10 animals each of Hanwoo and Yanbian cattle (20 total) using the Illumina HiSeq 2000 sequencer. A total of approximately 3.12 and 3.07 billion sequence reads were mapped to the bovine reference sequence assembly (UMD 3.1) at an average of approximately 10.71- and 10.53-fold coverage for Hanwoo and Yanbian cattle, respectively. A total of 17,936,399 single nucleotide polymorphisms (SNPs) were yielded, of which 22.3% were found to be novel. By annotating the SNPs, we further retrieved numerous nonsynonymous SNPs that may be associated with traits of interest in cattle. Furthermore, we performed whole-genome screening to detect signatures of selection throughout the genome. We located several promising selective sweeps that are potentially responsible for economically important traits in cattle; the PPP1R12A gene is an example of a gene that potentially affects intramuscular fat content. These discoveries provide valuable genomic information regarding potential genomic markers that could predict traits of interest for breeding programs of these cattle breeds. PMID:26018558

  16. Generic Kalman Filter Software

    NASA Technical Reports Server (NTRS)

    Lisano, Michael E., II; Crues, Edwin Z.

    2005-01-01

    The Generic Kalman Filter (GKF) software provides a standard basis for the development of application-specific Kalman-filter programs. Historically, Kalman filters have been implemented by customized programs that must be written, coded, and debugged anew for each unique application, then tested and tuned with simulated or actual measurement data. Total development times for typical Kalman-filter application programs have ranged from months to weeks. The GKF software can simplify the development process and reduce the development time by eliminating the need to re-create the fundamental implementation of the Kalman filter for each new application. The GKF software is written in the ANSI C programming language. It contains a generic Kalman-filter-development directory that, in turn, contains a code for a generic Kalman filter function; more specifically, it contains a generically designed and generically coded implementation of linear, linearized, and extended Kalman filtering algorithms, including algorithms for state- and covariance-update and -propagation functions. The mathematical theory that underlies the algorithms is well known and has been reported extensively in the open technical literature. Also contained in the directory are a header file that defines generic Kalman-filter data structures and prototype functions and template versions of application-specific subfunction and calling navigation/estimation routine code and headers. Once the user has provided a calling routine and the required application-specific subfunctions, the application-specific Kalman-filter software can be compiled and executed immediately. During execution, the generic Kalman-filter function is called from a higher-level navigation or estimation routine that preprocesses measurement data and post-processes output data. 
The generic Kalman-filter function uses the aforementioned data structures and five implementation-specific subfunctions, which have been developed by the user on the basis of the aforementioned templates. The GKF software can be used to develop many different types of unfactorized Kalman filters. A developer can choose to implement either a linearized or an extended Kalman filter algorithm, without having to modify the GKF software. Control dynamics can be taken into account or neglected in the filter-dynamics model. Filter programs developed by use of the GKF software can be made to propagate equations of motion for linear or nonlinear dynamical systems that are deterministic or stochastic. In addition, filter programs can be made to operate in user-selectable "covariance analysis" and "propagation-only" modes that are useful in design and development stages.
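
The state- and covariance-propagation and update steps mentioned in this record can be sketched for the simplest case, a scalar linear Kalman filter. This is not the GKF code or its interface (GKF is written in ANSI C and its function names are not given here); every name below, including the `propagation_only` flag that mimics the mode described in the abstract, is a hypothetical chosen for illustration.

```python
def kalman_predict(x, p, a=1.0, q=0.0, b=0.0, u=0.0):
    """Propagate the state estimate x and its variance p one step.

    a: state transition, q: process-noise variance,
    b, u: optional control gain and control input.
    """
    x_pred = a * x + b * u
    p_pred = a * p * a + q
    return x_pred, p_pred


def kalman_update(x_pred, p_pred, z, h=1.0, r=1.0):
    """Correct the prediction with a measurement z (noise variance r)."""
    s = h * p_pred * h + r            # innovation variance
    k = p_pred * h / s                # Kalman gain
    x_new = x_pred + k * (z - h * x_pred)
    p_new = (1.0 - k * h) * p_pred    # unfactorized covariance update
    return x_new, p_new


def run_filter(measurements, x0, p0, q, r, propagation_only=False):
    """Run the filter; in propagation-only mode measurements are ignored."""
    x, p = x0, p0
    history = []
    for z in measurements:
        x, p = kalman_predict(x, p, q=q)
        if not propagation_only:
            x, p = kalman_update(x, p, z, r=r)
        history.append((x, p))
    return history
```

With `x0=0`, `p0=1`, `q=0`, `r=1` and two measurements of 2.0, the estimate moves to 1.0 and then to about 1.33 while the variance shrinks from 1 to 0.5 to about 0.33. In propagation-only mode the variance can only grow, which is what makes that mode useful for checking the dynamics model in isolation.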

  17. Using imputation and mixture model approaches to integrate multi-state capture-recapture models with assignment information.

    PubMed

    Wen, Zhi; Pollock, Kenneth H; Nichols, James D; Waser, Peter M; Cao, Weihua

    2014-06-01

    In this article, we first extend the superpopulation capture-recapture model to multiple states (locations or populations) for two age groups. Wen et al. (2011; 2013) developed a new approach combining capture-recapture data with population assignment information to estimate the relative contributions of in situ births and immigrants to the growth of a single study population. Here, we first generalize the Wen et al. (2011; 2013) approach to a system composed of multiple study populations (multi-state) with two age groups, where an imputation approach is employed to account for the uncertainty inherent in the population assignment information. Then we develop a different, individual-level mixture model approach to integrate the individual-level population assignment information with the capture-recapture data. Our simulation and real data analyses show that the fusion of population assignment information with capture-recapture data allows us to estimate the origination-specific recruitment of new animals to the system and the dispersal process between populations within the system. Compared to a standard capture-recapture model, our new models improve the estimation of demographic parameters, including survival probability, origination-specific entry probability, and especially the probability of movement between populations, yielding higher accuracy and precision. PMID:24571715

  18. Analysis of longitudinal trials with protocol deviation: a framework for relevant, accessible assumptions, and inference via multiple imputation.

    PubMed

    Carpenter, James R; Roger, James H; Kenward, Michael G

    2013-01-01

    Protocol deviations, for example, due to early withdrawal and noncompliance, are unavoidable in clinical trials. Such deviations often result in missing data. Additional assumptions are then needed for the analysis, and these cannot be definitively verified from the data at hand. Thus, as recognized by recent regulatory guidelines and reports, clarity about these assumptions and their implications is vital for both the primary analysis and framing relevant sensitivity analysis. This article focuses on clinical trials with longitudinal quantitative outcome data. For the target population, we define two estimands, the de jure estimand, "does the treatment work under the best case scenario," and the de facto estimand, "what would be the effect seen in practice." We then carefully define the concept of a deviation from the protocol relevant to the estimand, or for short a deviation. Each patient's postrandomization data can then be divided into predeviation data and postdeviation data. We set out an accessible framework for contextually appropriate assumptions relevant to de facto and de jure estimands, that is, assumptions about the joint distribution of pre- and postdeviation data relevant to the clinical question at hand. We then show how, under these assumptions, multiple imputation provides a practical approach to estimation and inference. We illustrate with data from a longitudinal clinical trial in patients with chronic asthma. PMID:24138436

  19. Filtering separators having filter cleaning apparatus

    SciTech Connect

    Margraf, A.

    1984-08-28

    This invention relates to filtering separators of the kind having a housing which is subdivided by a partition, provided with parallel rows of holes or slots, into a dust-laden gas space for receiving filter elements positioned in parallel rows and being impinged upon by dust-laden gas from the outside towards the inside, and a clean gas space. In addition, the housing is provided with a chamber for cleansing the filter element surfaces of a row by counterflow action while covering at the same time the partition holes or slots leading to the adjacent rows of filter elements. The chamber is arranged for the supply of compressed air to at least one injector arranged to feed compressed air and secondary air to the row of filter elements to be cleansed. The chamber is also reciprocatingly displaceable along the partition in periodic and intermittent manner. According to the invention, a surface of the chamber facing towards the partition covers at least two of the rows of holes or slots of the partition, and the chamber is closed upon itself with respect to the clean gas space, and is connected to a compressed air reservoir via a distributor pipe and a control valve. At least one of the rows of holes or slots of the partition and the respective row of filter elements in flow communication therewith are in flow communication with the discharge side of at least one injector acted upon with compressed air. At least one other row of the rows of holes or slots of the partition and the respective row of filter elements is in flow communication with the suction side of the injector.

  20. Contactor/filter improvements

    DOEpatents

    Stelman, D.

    1988-06-30

    A contactor/filter arrangement for removing particulate contaminants from a gaseous stream is described. The filter includes a housing having a substantially vertically oriented granular material retention member with upstream and downstream faces, a substantially vertically oriented microporous gas filter element, wherein the retention member and the filter element are spaced apart to provide a zone for the passage of granular material therethrough. A gaseous stream containing particulate contaminants passes through the gas inlet means as well as through the upstream face of the granular material retention member, passing through the retention member, the body of granular material, the microporous gas filter element, exiting out of the gas outlet means. A cover screen isolates the filter element from contact with the moving granular bed. In one embodiment, the granular material is comprised of porous alumina impregnated with CuO, with the cover screen cleaned by the action of the moving granular material as well as by backflow pressure pulses. 6 figs.

  1. Waveguide interleaving filters

    NASA Astrophysics Data System (ADS)

    Chiba, Takafumi; Arai, Hideaki; Nounen, Hideki; Ohira, Katsumi

    2003-08-01

    Optical interleaving filters are attractive components in WDM systems, especially those with channel spacing narrower than 100 GHz. Several types of interleaving filters have been proposed and realized. An interleaving filter is required to have box-like characteristics such as a periodic response, a flat passband, low insertion loss and low crosstalk. In addition, low chromatic dispersion (CD) is indispensable for DWDM systems. We focus on planar lightwave circuit (PLC) interleaving filters. In this paper, we present interleaving filters with a tandem configuration of Fourier transform-based MZIs. The circuit is fabricated using the PLC technique with high-index-contrast waveguides of Δ = 1.5%. We have also demonstrated monolithically integrated 1×4 (50-200 GHz) interleaving filters and arrayed waveguide gratings (AWGs) with 200 GHz channel spacing suitable for interleaving WDM systems.

  2. Optically tunable optical filter

    NASA Astrophysics Data System (ADS)

    James, Robert T. B.; Wah, Christopher; Iizuka, Keigo; Shimotahira, Hiroshi

    1995-12-01

    We experimentally demonstrate an optically tunable optical filter that uses photorefractive barium titanate. With our filter we implement a spectrum analyzer at 632.8 nm with a resolution of 1.2 nm. We simulate a wavelength-division multiplexing system by separating two semiconductor laser diodes, at 1560 nm and 1578 nm, with the same filter. The filter has a bandwidth of 6.9 nm. We also use the same filter to take 2.5-nm-wide slices out of a 20-nm-wide superluminescent diode centered at 840 nm. As a result, we experimentally demonstrate a phenomenal tuning range from 632.8 to 1578 nm with a single filtering device.

  3. Optimal separable correlation filters

    NASA Astrophysics Data System (ADS)

    McFadden, Frank E.

    2002-07-01

    Separable filters, because they are specified separately in each dimension, require less memory space and present opportunities for faster computation. Mahalanobis and Kumar1 presented a method for deriving separable correlation filters, but the filters were required to satisfy a restrictive assumption, and were thus not fully optimized. In this work, we present a general procedure for deriving separable versions of any correlation filter, using singular value decomposition (SVD), and prove that this is optimal for separable filters based on the Maximum Average Correlation Height (MACH) criterion. Further, we show that additional separable components may be used to improve the performance of the filter, with only a linear increase in computational and memory space requirements. MSTAR data is used to demonstrate the effects on sharpness of correlation peaks and locational precision, as the number of separable components is varied.
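
The idea of building a separable version of a 2-D filter from its singular value decomposition, and of adding further separable components to improve the approximation, can be sketched as follows. This is a plain low-rank approximation using NumPy's `numpy.linalg.svd`; it does not reproduce the MACH-based optimality argument of the paper, and the function names are hypothetical.

```python
import numpy as np


def separable_components(h, k):
    """Return the k leading separable (column, row) factor pairs of filter h.

    Each pair (c, r) contributes the rank-1 term outer(c, r); the singular
    value is folded into the column factor.
    """
    u, s, vt = np.linalg.svd(h, full_matrices=False)
    return [(s[i] * u[:, i], vt[i, :]) for i in range(k)]


def reconstruct(components, shape):
    """Sum the rank-1 terms to rebuild the approximate 2-D filter."""
    approx = np.zeros(shape)
    for col, row in components:
        approx += np.outer(col, row)
    return approx


def approximation_error(h, k):
    """Frobenius-norm error of the k-component separable approximation."""
    return float(np.linalg.norm(h - reconstruct(separable_components(h, k), h.shape)))
```

For an N×M filter, each separable component needs N + M stored coefficients instead of N·M, so using k components costs k(N + M), which is the linear growth in memory and computation that the abstract refers to.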

  4. Uneven-order decentered Shapiro filters for boundary filtering

    NASA Astrophysics Data System (ADS)

    Falissard, F.

    2015-07-01

    This paper addresses the use of Shapiro filters for boundary filtering. A new class of uneven-order decentered Shapiro filters is proposed and compared to classical Shapiro filters and even-order decentered Shapiro filters. The theoretical analysis shows that the proposed boundary filters are more accurate than the centered Shapiro filters and more robust than the even-order decentered boundary filters usable at the same distance to the boundary. The benefit of the new boundary filters is assessed for computations using the compressible Euler equations.

  5. Hybrid Filter Membrane

    NASA Technical Reports Server (NTRS)

    Laicer, Castro; Rasimick, Brian; Green, Zachary

    2012-01-01

    Cabin environmental control is an important issue for a successful Moon mission. Due to the unique environment of the Moon, lunar dust control is one of the main problems that significantly diminishes the air quality inside spacecraft cabins. Therefore, this innovation was motivated by NASA's need to minimize the negative health impact that air-suspended lunar dust particles have on astronauts in spacecraft cabins. It is based on fabrication of a hybrid filter comprising nanofiber nonwoven layers coated on porous polymer membranes with uniform cylindrical pores. This design results in a high-efficiency gas particulate filter with low pressure drop and the ability to be easily regenerated to restore filtration performance. A hybrid filter was developed consisting of a porous membrane with uniform, micron-sized, cylindrical pore channels coated with a thin nanofiber layer. Compared to conventional filter media such as a high-efficiency particulate air (HEPA) filter, this filter is designed to provide high particle efficiency, low pressure drop, and the ability to be regenerated. These membranes have well-defined micron-sized pores and can be used independently as air filters with discrete particle size cut-off, or coated with nanofiber layers for filtration of ultrafine nanoscale particles. The filter consists of a thin design intended to facilitate filter regeneration by localized air pulsing. The two main features of this invention are the concept of combining a micro-engineered straight-pore membrane with nanofibers. The micro-engineered straight pore membrane can be prepared with extremely high precision. Because the resulting membrane pores are straight and not tortuous like those found in conventional filters, the pressure drop across the filter is significantly reduced. The nanofiber layer is applied as a very thin coating to enhance filtration efficiency for fine nanoscale particles. 
Additionally, the thin nanofiber coating is designed to promote capture of dust particles on the filter surface and to facilitate dust removal with pulse or back airflow.

  6. Practical alarm filtering

    SciTech Connect

    Bray, M.; Corsberg, D. (Idaho National Engineering Lab., Idaho Falls, ID (United States))

    1994-02-01

    An expert system-based alarm filtering method is described which prioritizes and reduces the number of alarms facing an operator. This patented alarm filtering methodology was originally developed and implemented in a pressurized water reactor, and subsequently in a chemical processing facility. Both applications were in LISP and both were successful. In the chemical processing facility, for instance, alarm filtering reduced the quantity of alarm messages by 90%. 6 figs.

  7. Filter holder and gasket assembly for candle or tube filters

    DOEpatents

    Lippert, Thomas Edwin (Murrysville, PA); Alvin, Mary Anne (Pittsburgh, PA); Bruck, Gerald Joseph (Murrysville, PA); Smeltzer, Eugene E. (Export, PA)

    1999-03-02

    A filter holder and gasket assembly for holding a candle filter element within a hot gas cleanup system pressure vessel. The filter holder and gasket assembly includes a filter housing, an annular spacer ring securely attached within the filter housing, a gasket sock, a top gasket, a middle gasket and a cast nut.

  8. Filter holder and gasket assembly for candle or tube filters

    DOEpatents

    Lippert, T.E.; Alvin, M.A.; Bruck, G.J.; Smeltzer, E.E.

    1999-03-02

    A filter holder and gasket assembly are disclosed for holding a candle filter element within a hot gas cleanup system pressure vessel. The filter holder and gasket assembly includes a filter housing, an annular spacer ring securely attached within the filter housing, a gasket sock, a top gasket, a middle gasket and a cast nut. 9 figs.

  9. Transversal filters with continuously tapped delay lines (CTT filters)

    Microsoft Academic Search

    E. R. Hafner; P. E. Leuthold

    1969-01-01

    The properties of common transversal filters are well known. Transversal filters with continuously tapped delay lines have the advantage that they do not show a periodic continuation of the frequency characteristic. Therefore it is not necessary to use an additional filter for suppressing higher frequency components. After an introduction to CTT filter theory the realization of transversal filters with continuous

  10. Neural filtering of colored noise based on Kalman filter structure

    Microsoft Academic Search

    Shen-Shu Xiong; Zhao-Ying Zhou

    2003-01-01

    In this paper, adaptive filtering approaches of colored noise based on the Kalman filter structure using neural networks are proposed, which need not extend the dimensions of the filter. The colored measurement noise is first modeled from a Gaussian white noise through a shaping filter. The Kalman filtering model of colored noise is then built by adopting an equivalent observation

  11. Nanofiber Filters Eliminate Contaminants

    NASA Technical Reports Server (NTRS)

    2009-01-01

    With support from Phase I and II SBIR funding from Johnson Space Center, Argonide Corporation of Sanford, Florida tested and developed its proprietary nanofiber water filter media. Capable of removing more than 99.99 percent of dangerous particles like bacteria, viruses, and parasites, the media was incorporated into the company's commercial NanoCeram water filter, an inductee into the Space Foundation's Space Technology Hall of Fame. In addition to its drinking water filters, Argonide now produces large-scale nanofiber filters used as part of the reverse osmosis process for industrial water purification.

  12. Linear phase compressive filter

    DOEpatents

    McEwan, T.E.

    1995-06-06

    A phase linear filter for soliton suppression is in the form of a laddered series of stages of non-commensurate low pass filters with each low pass filter having a series coupled inductance (L) and a reverse biased, voltage dependent varactor diode, to ground which acts as a variable capacitance (C). L and C values are set to levels which correspond to a linear or conventional phase linear filter. Inductance is mapped directly from that of an equivalent nonlinear transmission line and capacitance is mapped from the linear case using a large signal equivalent of a nonlinear transmission line. 2 figs.

  13. Association Analysis of IL10, TNF-α, and IL23R-IL12RB2 SNPs with Behçet’s Disease Risk in Western Algeria

    PubMed Central

    Khaib Dit Naib, Ouahiba; Aribi, Mourad; Idder, Aicha; Chiali, Amel; Sairi, Hakim; Touitou, Isabelle; Lefranc, Gérard; Barat-Houari, Mouna

    2013-01-01

    Objective: We have conducted the first study of the association of interleukin (IL)-10, tumor necrosis factor alpha (TNF-α), and IL23R-IL12RB2 region single nucleotide polymorphisms (SNPs) with Behçet’s disease (BD) in Western Algeria. Methods: A total of 51 BD patients and 96 unrelated controls from the western region of Algeria were genotyped by direct sequencing for 11 SNPs, including 2 SNPs from the IL10 promoter [c.-819T>C (rs1800871), c.-592A>C (rs1800872)], 6 SNPs from the TNF-α promoter [c.-1211T>C (rs1799964), c.-1043C>A (rs1800630), c.-1037C>T (rs1799724), c.-556G>A (rs1800750), c.-488G>A (rs1800629), and c.-418G>A (rs361525)], and 3 SNPs from the IL23R-IL12RB2 region [g.67747415A>C (rs12119179), g.67740092G>A (rs11209032), and g.67760140T>C (rs924080)]. Results: The minor alleles c.-819T and c.-592A were significantly associated with BD [odds ratio (OR) = 2.18; 95% confidence interval (CI) 1.28–3.73, p = 0.003], whereas there was a weaker association between TNF-α promoter SNPs or the IL23R-IL12RB2 region and disease risk. Conclusion: Unlike the TNF-α and IL23R-IL12RB2 region SNPs, the two IL10 SNPs were strongly associated with BD. The -819T and -592A alleles and the -819TT, -819CT, -592AA and -592CA genotypes seem to be highly involved in the risk of developing BD in the population of Western Algeria. PMID:24151497

  14. SNPs in MicroRNA Binding Sites in 3′-UTRs of RAAS Genes Influence Arterial Blood Pressure and Risk of Myocardial Infarction

    Microsoft Academic Search

    A. Yael Nossent; Jakob L. Hansen; Carine Doggen; Paul H. A. Quax; Soren P. Sheikh; Frits R. Rosendaal

    2011-01-01

    Background: We hypothesized that single nucleotide polymorphisms (SNPs) located in microRNA (miR) binding sites in genes of the renin angiotensin aldosterone system (RAAS) can influence blood pressure and risk of myocardial infarction. Methods: Using online databases dbSNP and TargetScan, we identified 10 SNPs in potential miR binding sites in eight RAAS-related genes, common in Caucasians. We genotyped a large case-control study on myocardial

  15. Screening and structural evaluation of deleterious Non-Synonymous SNPs of ePHA2 gene involved in susceptibility to cataract formation.

    PubMed

    Masoodi, Tariq Ahmad; Shammari, Sulaiman A Al; Al-Muammar, May N; Almubrad, Turki M; Alhamdan, Adel A

    2012-01-01

    Age-related cataract is a clinically and genetically heterogeneous disorder affecting the ocular lens, and the leading cause of vision loss and blindness worldwide. Here we screened nonsynonymous single nucleotide polymorphisms (nsSNPs) of a novel gene, EPHA2, responsible for age-related cataracts. The SNPs were retrieved from dbSNP. Protein stability change was calculated using I-Mutant. The potentially functional nsSNPs and their effect on the protein were predicted by PolyPhen and SIFT, respectively. FASTSNP was used for functional analysis and estimation of risk score. The functional impact on the EPHA2 protein was evaluated using SWISSPDB viewer and the NOMAD-Ref server. Our analysis revealed 16 SNPs as nonsynonymous, of which 6 nsSNPs, namely rs11543934, rs2291806, rs1058371, rs1058370, rs79100278 and rs113882203, were found to be least stable by I-Mutant 2.0, with a DDG value of > -1.0. Six nsSNPs, namely rs35903225, rs2291806, rs1058372, rs1058370, rs79100278 and rs113882203, showed a highly deleterious tolerance index score of 0.00 by the SIFT server. Four nsSNPs, namely rs11543934, rs2291806, rs1058370 and rs113882203, were found to be probably damaging, with a PSIC score of ≥ 2.0 by the PolyPhen server. Three nsSNPs, namely rs11543934, rs2291806 and rs1058370, were found to be highly polymorphic, with a risk score of 3-4 and a possible effect of non-conservative change and splicing regulation by FASTSNP. The total energy and RMSD value were higher for the mutant-type structure than for the native-type structure. We concluded that the nsSNP rs2291806 is the potential functional polymorphism that is likely to have a functional impact on the EPHA2 gene. PMID:22829731

  16. Transcriptome-Wide Single Nucleotide Polymorphisms (SNPs) for Abalone (Haliotis midae): Validation and Application Using GoldenGate Medium-Throughput Genotyping Assays

    PubMed Central

    Bester-Van Der Merwe, Aletta; Blaauw, Sonja; Plessis, Jana Du; Roodt-Wilding, Rouvay

    2013-01-01

    Haliotis midae is one of the most valuable commercial abalone species in the world, but is highly vulnerable, due to exploitation, habitat destruction and predation. In order to preserve wild and cultured stocks, genetic management and improvement of the species has become crucial. Fundamental to this is the availability and employment of molecular markers, such as microsatellites and single nucleotide polymorphisms (SNPs). Transcriptome sequences generated through sequencing-by-synthesis technology were utilized for the in vitro and in silico identification of 505 putative SNPs from a total of 316 selected contigs. A subset of 234 SNPs was further validated and characterized in wild and cultured abalone using two Illumina GoldenGate genotyping assays. Combined with VeraCode technology, this genotyping platform yielded a 65%–69% conversion rate (percentage polymorphic markers) with a global genotyping success rate of 76%–85% and provided a viable means for validating SNP markers in a non-model species. The utility of 31 of the validated SNPs in population structure analysis was confirmed, while a large number of SNPs (174) were shown to be informative and are, thus, good candidates for linkage map construction. The non-synonymous SNPs (50) located in coding regions of genes that showed similarities with known proteins will also be useful for genetic applications, such as the marker-assisted selection of genes of relevance to abalone aquaculture. PMID:24065109

  17. Top associated SNPs in prostate cancer are significantly enriched in cis-expression quantitative trait loci and at transcription factor binding sites

    PubMed Central

    Shen, Bairong; Zhao, Zhongming

    2014-01-01

    While genome-wide association studies (GWAS) have revealed thousands of disease risk single nucleotide polymorphisms (SNPs), their functions remain largely unknown. Recent studies have suggested the regulatory roles of GWAS risk variants in several common diseases; however, the complex regulatory structure in prostate cancer is unclear. We investigated the potential regulatory roles of risk variants in two prostate cancer GWAS datasets by their interactions with expression quantitative trait loci (eQTL) and/or transcription factor binding sites (TFBSs) in three populations. Our results indicated that the moderately associated GWAS SNPs were significantly enriched with cis-eQTLs and TFBSs in Caucasians (CEU), but not in African Americans (AA) or Japanese (JPT); this was also observed in an independent set of pan-cancer-related SNPs from the GWAS Catalog. We found that the eQTL enrichment in the CEU population was tissue-specific to eQTLs from CEU lymphoblastoid cell lines. Importantly, we pinpointed two SNPs, rs2861405 and rs4766642, by overlapping results from cis-eQTL and TFBS as applied to the CEU data. These results suggested that prostate cancer-associated SNPs and pan-cancer-associated SNPs are likely to play regulatory roles in CEU. However, the negative enrichment results in AA or JPT and the potential mechanisms remain to be elucidated in additional samples. PMID:25026280

  18. Optimal Filters on the Sphere

    Microsoft Academic Search

    Jason D. Mcewen; Michael P. Hobson; Anthony N. Lasenby

    2008-01-01

    We derive optimal filters on the sphere in the context of detecting compact objects embedded in a stochastic background process. The matched filter and the scale adaptive filter are derived on the sphere in the most general setting, allowing for directional template profiles and filters. The performance and relative merits of the two optimal filters are discu

  19. Ll-filters-a new class of order statistic filters

    Microsoft Academic Search

    Francesco Palmieri; Charles G. Boncelet

    1989-01-01

    The Ll-filters are introduced to generalize the order statistic filters (L-filters) and the nonrecursive linear, or finite-duration impulse-response (FIR), filters. Such estimators are particularly effective for filtering signals that do not necessarily follow Gaussian distributions. They can be designed to restore one-dimensional or multidimensional signals corrupted by noise of impulsive type. Such filters are appealing since they are suitable for being

  20. Tunable LYOT-filters

    NASA Astrophysics Data System (ADS)

    Donkor, Eric; Kumavor, Patrick D.; Villa, Carlos

    2009-05-01

    An all-fiber tunable Lyot filter is experimentally demonstrated. The filter can be configured for narrow band or wide band tuning. It is implemented as a Sagnac loop configuration comprising polarization controllers, birefringent fibers, and a polarizer. Results are presented for band stop tuning and narrow band tuning for wavelengths between 1500 nm and 1600 nm.

  1. Intelligent medical information filtering

    Microsoft Academic Search

    Yuri Quintana

    1998-01-01

    This paper describes an intelligent information filtering system to assist users to be notified of updates to new and relevant medical information. Among the major problems users face is the large volume of medical information that is generated each day, and the need to filter and retrieve relevant information. The Internet has dramatically increased the amount of electronically accessible medical

  2. Waveguide filters for satellites

    Microsoft Academic Search

    Vicente E. Boria; Benito Gimeno

    2007-01-01

    This article describes the historical evolution of waveguide filters and their theoretical and technological features. All-metal waveguide filters have been widely used in satellite payloads since the advent of the first space communication systems four decades ago. In this period, the technology has evolved continuously in response to the increasing requirements placed on these components. Special attention has also been

  3. An optimal auditory filter

    Microsoft Academic Search

    IRINO Toshio; Morinosato Wakamiya

    1995-01-01

    The optimality of the peripheral auditory filter is investigated using operator methods applied to a scale representation. A `gammachirp' function, which consists of a frequency modulated carrier and an envelope of a gamma distribution function, is found to be the optimal auditory filter in terms of minimal uncertainty if the time-scale representation is calculated in the auditory system. The gammatone

  4. Durability of ceramic filters

    SciTech Connect

    Alvin, M.A.; Tressler, R.E.; Lippert, T.E.; Diaz, E.S.; Smeltzer, E.E.

    1994-10-01

    The objectives of this program are to identify the potential long-term thermal/chemical effects that advanced coal-based power generating systems have on the stability of porous ceramic filter materials, as well as to assess the influence of these effects on filter operating performance and life.

  5. [Polymorphism of SNPs in the lipoprotein lipase (LPL) in Siniperca chuatsi and their association with feed habit domestication].

    PubMed

    Yang, Yu-Hui; Liang, Xu-Fang; Fang, Rong; Peng, Min-Yan; Huang, Zhi-Dong

    2011-09-01

    Siniperca chuatsi usually refuses dead bait or artificial feed because of its specific feeding habit. After long-term cultivation, however, some S. chuatsi can gradually be domesticated to take lifeless bait. The high cost, serious pollution, severe disease, and other problems of S. chuatsi cultivation could be addressed effectively by using molecular markers to select individuals amenable to domestication and rearing them on artificial feed on a large scale. Lipoprotein lipase (LPL), one of the key enzymes in lipoprotein metabolism, provides energy by catalyzing the hydrolysis of triglycerides in the core of chylomicrons and very low density lipoprotein (VLDL) into glycerol and fatty acids, and stores energy. To examine the distribution of LPL alleles and genotypes in two populations with different feed-habit domestication phenotypes, SNPs in introns 6 and 7 and exons 6, 7, and 8 of LPL in S. chuatsi were analyzed by PCR product sequencing. Three SNPs (A25T, G26T, and C29G) were detected, two of which were non-synonymous mutations. A chi-square test comparing domesticated and undomesticated populations showed no significant association (P>0.05) between any of the three SNPs and feed habit. However, when the genotypes of the three SNPs were assembled into five diplotypes, diplotype 2 differed significantly between the two populations (P<0.05). Polymorphism analysis of a partial fragment of the LPL gene in S. chuatsi was thus accomplished. The LPL gene can therefore be considered a candidate gene and genetic marker for feed habit domestication in S. chuatsi, laying a foundation for marker-assisted selective breeding with broad application prospects. PMID:21951801

  6. Comparison of two methods for estimating absolute risk of prostate cancer based on SNPs and family history

    PubMed Central

    Hsu, Fang-Chi; Sun, Jielin; Zhu, Yi; Kim, Seong-Tae; Jin, Tao; Zhang, Zheng; Wiklund, Fredrik; Kader, A. Karim; Zheng, S. Lilly; Isaacs, William; Grönberg, Henrik; Xu, Jianfeng

    2010-01-01

    Disease risk-associated single nucleotide polymorphisms (SNPs) identified from genome-wide association studies have the potential to be used for disease risk prediction. An important feature of these risk-associated SNPs is their weak individual effect but stronger cumulative effect on disease risk. Several approaches are commonly used to model the combined effect in risk prediction but their performance is unclear. We compared two methods to model the combined effect of 14 prostate cancer (PCa) risk-associated SNPs and family history for the estimation of absolute risk for PCa in a population-based case-control study in Sweden (2,899 cases and 1,722 controls). Method 1 weighs each risk allele equally using a simple method of counting the number of risk alleles while Method 2 weighs each risk SNP differently based on their respective Odds Ratios. We found considerable differences between the two methods. Absolute risk estimates from Method 1 were generally higher than that of Method 2, especially among men at higher risk. The difference in the overall discriminative performance, measured by area under the curve (AUC) of the receiver operating characteristic was small between Method 1 (0.614) and Method 2 (0.618), P = 0.20. However, the performance of these two methods in identifying high-risk individuals (two-fold or three-fold higher than average risk), measured by positive predictive values (PPV), was higher for Method 2 than Method 1. In conclusion, these results suggest that Method 2 is superior to Method 1 in estimating absolute risk if the purpose of risk prediction is to identify high-risk individuals. PMID:20332264
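
    The two scoring schemes compared in this record can be illustrated with a minimal sketch. The abstract says only that Method 2 weights each SNP "based on their respective Odds Ratios"; the log-odds-ratio weighting below is a common convention and is an assumption here, not the authors' confirmed formula. The genotypes and odds ratios are hypothetical values chosen for illustration, and the sketch stops at the relative score rather than converting it to absolute risk.

```python
import math


def allele_count_score(genotypes):
    """Method 1 style: every risk allele counts equally (genotypes coded 0/1/2)."""
    return sum(genotypes)


def weighted_score(genotypes, odds_ratios):
    """Method 2 style (assumed form): each risk allele weighted by ln(OR)."""
    if len(genotypes) != len(odds_ratios):
        raise ValueError("one odds ratio is needed per SNP")
    return sum(g * math.log(o) for g, o in zip(genotypes, odds_ratios))


# Hypothetical individual: risk-allele counts at three SNPs.
genotypes = [2, 1, 0]
# Hypothetical per-allele odds ratios for the same three SNPs.
odds_ratios = [1.10, 1.25, 1.50]

count = allele_count_score(genotypes)
weighted = weighted_score(genotypes, odds_ratios)
print(count, weighted)
```

    Under the unweighted count, two individuals with the same number of risk alleles always receive the same score; under the weighted form they differ whenever their alleles sit at SNPs with different odds ratios, which is one plausible reason the two methods diverge most among men at higher risk.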

  7. SSRI response in depression may be influenced by SNPs in HTR1B and HTR1A

    PubMed Central

    Villafuerte, Sandra M.; Vallabhaneni, Kamala; Śliwerska, Elżbieta; McMahon, Francis J.; Young, Elizabeth A.; Burmeister, Margit

    2009-01-01

    Desensitization of serotonin 1A (HTR1A) and 1B (HTR1B) autoreceptors has been proposed to be involved in the delayed onset of response to SSRIs. Variations in gene expression in these genes may thus affect SSRI response. Here we test this hypothesis in two samples from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D), and show evidence for involvement of several genetic variants alone and in interaction. Initially, three functional SNPs in the HTR1B gene and in the HTR1A gene were analyzed in 153 depressed patients treated with citalopram. QIDS-C scores were evaluated over time with respect to genetic variation. Subjects homozygous for the -1019 G allele (rs6295) in HTR1A showed higher baseline QIDS scores (p = 0.033), and by 12 weeks had a significantly lower response rate (p = 0.005). HTR1B haplotypes were estimated according to previously reported in-vitro expression levels. Individuals who were homozygous for the high-expression haplotype showed significantly slower response to citalopram (p = 0.034). We then analyzed more SNPs in the extended overall STAR*D sample. Although we could not directly test the same functional SNPs, we found that homozygotes for the G allele at rs1364043 in HTR1A (p = 0.045) and the C allele of rs6298 in HTR1B showed better response to citalopram over time (p = 0.022). The test for interaction between rs6298 in HTR1B and rs1364043 in HTR1A was significant (overall p = 0.032). Our data suggest that an enhanced capacity of HTR1B or HTR1A transcriptional activity may impair desensitization of the autoreceptors during SSRI treatment. PMID:19829169

  8. Sub-micron filter

    DOEpatents

    Tepper, Frederick (Sanford, FL); Kaledin, Leonid (Port Orange, FL)

    2009-10-13

    Aluminum hydroxide fibers approximately 2 nanometers in diameter and with surface areas ranging from 200 to 650 m²/g have been found to be highly electropositive. When dispersed in water they are able to attach to and retain electronegative particles. When combined into a composite filter with other fibers or particles they can filter bacteria and nano-size particulates such as viruses and colloidal particles at high flux through the filter. Such filters can be used for purification and sterilization of water, biological, medical and pharmaceutical fluids, and as a collector/concentrator for detection and assay of microbes and viruses. The alumina fibers are also capable of filtering sub-micron inorganic and metallic particles to produce ultra pure water. The fibers are suitable as a substrate for growth of cells. Macromolecules such as proteins may be separated from each other based on their electronegative charges.

  9. Common non-synonymous SNPs associated with breast cancer susceptibility: findings from the Breast Cancer Association Consortium.

    PubMed

    Milne, Roger L; Burwinkel, Barbara; Michailidou, Kyriaki; Arias-Perez, Jose-Ignacio; Zamora, M Pilar; Menéndez-Rodríguez, Primitiva; Hardisson, David; Mendiola, Marta; González-Neira, Anna; Pita, Guillermo; Alonso, M Rosario; Dennis, Joe; Wang, Qin; Bolla, Manjeet K; Swerdlow, Anthony; Ashworth, Alan; Orr, Nick; Schoemaker, Minouk; Ko, Yon-Dschun; Brauch, Hiltrud; Hamann, Ute; Andrulis, Irene L; Knight, Julia A; Glendon, Gord; Tchatchou, Sandrine; Matsuo, Keitaro; Ito, Hidemi; Iwata, Hiroji; Tajima, Kazuo; Li, Jingmei; Brand, Judith S; Brenner, Hermann; Dieffenbach, Aida Karina; Arndt, Volker; Stegmaier, Christa; Lambrechts, Diether; Peuteman, Gilian; Christiaens, Marie-Rose; Smeets, Ann; Jakubowska, Anna; Lubinski, Jan; Jaworska-Bieniek, Katarzyna; Durda, Katazyna; Hartman, Mikael; Hui, Miao; Yen Lim, Wei; Wan Chan, Ching; Marme, Federick; Yang, Rongxi; Bugert, Peter; Lindblom, Annika; Margolin, Sara; García-Closas, Montserrat; Chanock, Stephen J; Lissowska, Jolanta; Figueroa, Jonine D; Bojesen, Stig E; Nordestgaard, Børge G; Flyger, Henrik; Hooning, Maartje J; Kriege, Mieke; van den Ouweland, Ans M W; Koppert, Linetta B; Fletcher, Olivia; Johnson, Nichola; dos-Santos-Silva, Isabel; Peto, Julian; Zheng, Wei; Deming-Halverson, Sandra; Shrubsole, Martha J; Long, Jirong; Chang-Claude, Jenny; Rudolph, Anja; Seibold, Petra; Flesch-Janys, Dieter; Winqvist, Robert; Pylkäs, Katri; Jukkola-Vuorinen, Arja; Grip, Mervi; Cox, Angela; Cross, Simon S; Reed, Malcolm W R; Schmidt, Marjanka K; Broeks, Annegien; Cornelissen, Sten; Braaf, Linde; Kang, Daehee; Choi, Ji-Yeob; Park, Sue K; Noh, Dong-Young; Simard, Jacques; Dumont, Martine; Goldberg, Mark S; Labrèche, France; Fasching, Peter A; Hein, Alexander; Ekici, Arif B; Beckmann, Matthias W; Radice, Paolo; Peterlongo, Paolo; Azzollini, Jacopo; Barile, Monica; Sawyer, Elinor; Tomlinson, Ian; Kerin, Michael; Miller, Nicola; Hopper, John L; Schmidt, Daniel F; Makalic, Enes; Southey, Melissa C; Hwang Teo, Soo; Har Yip, Cheng; 
Sivanandan, Kavitta; Tay, Wan-Ting; Shen, Chen-Yang; Hsiung, Chia-Ni; Yu, Jyh-Cherng; Hou, Ming-Feng; Guénel, Pascal; Truong, Therese; Sanchez, Marie; Mulot, Claire; Blot, William; Cai, Qiuyin; Nevanlinna, Heli; Muranen, Taru A; Aittomäki, Kristiina; Blomqvist, Carl; Wu, Anna H; Tseng, Chiu-Chen; Van Den Berg, David; Stram, Daniel O; Bogdanova, Natalia; Dörk, Thilo; Muir, Kenneth; Lophatananon, Artitaya; Stewart-Brown, Sarah; Siriwanarangsan, Pornthep; Mannermaa, Arto; Kataja, Vesa; Kosma, Veli-Matti; Hartikainen, Jaana M; Shu, Xiao-Ou; Lu, Wei; Gao, Yu-Tang; Zhang, Ben; Couch, Fergus J; Toland, Amanda E; Yannoukakos, Drakoulis; Sangrajrang, Suleeporn; McKay, James; Wang, Xianshu; Olson, Janet E; Vachon, Celine; Purrington, Kristen; Severi, Gianluca; Baglietto, Laura; Haiman, Christopher A; Henderson, Brian E; Schumacher, Fredrick; Le Marchand, Loic; Devilee, Peter; Tollenaar, Robert A E M; Seynaeve, Caroline; Czene, Kamila; Eriksson, Mikael; Humphreys, Keith; Darabi, Hatef; Ahmed, Shahana; Shah, Mitul; Pharoah, Paul D P; Hall, Per; Giles, Graham G; Benítez, Javier; Dunning, Alison M; Chenevix-Trench, Georgia; Easton, Douglas F

    2014-11-15

    Candidate variant association studies have been largely unsuccessful in identifying common breast cancer susceptibility variants, although most studies have been underpowered to detect associations of a realistic magnitude. We assessed 41 common non-synonymous single-nucleotide polymorphisms (nsSNPs) for which evidence of association with breast cancer risk had been previously reported. Case-control data were combined from 38 studies of white European women (46,450 cases and 42,600 controls) and analyzed using unconditional logistic regression. Strong evidence of association was observed for three nsSNPs: ATXN7-K264R at 3p21 [rs1053338, per allele OR = 1.07, 95% confidence interval (CI) = 1.04-1.10, P = 2.9 × 10^-6], AKAP9-M463I at 7q21 (rs6964587, OR = 1.05, 95% CI = 1.03-1.07, P = 1.7 × 10^-6) and NEK10-L513S at 3p24 (rs10510592, OR = 1.10, 95% CI = 1.07-1.12, P = 5.1 × 10^-17). The first two associations reached genome-wide statistical significance in a combined analysis of available data, including independent data from nine genome-wide association studies (GWASs): for ATXN7-K264R, OR = 1.07 (95% CI = 1.05-1.10, P = 1.0 × 10^-8); for AKAP9-M463I, OR = 1.05 (95% CI = 1.04-1.07, P = 2.0 × 10^-10). Further analysis of other common variants in these two regions suggested that intronic SNPs nearby are more strongly associated with disease risk. We have thus identified a novel susceptibility locus at 3p21, and confirmed previous suggestive evidence that rs6964587 at 7q21 is associated with risk. The third locus, rs10510592, is located in an established breast cancer susceptibility region; the association was substantially attenuated after adjustment for the known GWAS hit. Thus, each of the associated nsSNPs is likely to be a marker for another, non-coding, variant causally related to breast cancer risk. Further fine-mapping and functional studies are required to identify the underlying risk-modifying variants and the genes through which they act. PMID:24943594

  10. New Insights into the Lake Chad Basin Population Structure Revealed by High-Throughput Genotyping of Mitochondrial DNA Coding SNPs

    PubMed Central

    Černý, Viktor; Carracedo, Ángel

    2011-01-01

    Background Located in the Sudan belt, the Chad Basin forms a remarkable ecosystem, where several unique agricultural and pastoral techniques have been developed. Both from an archaeological and a genetic point of view, this region has been interpreted to be the center of a bidirectional corridor connecting West and East Africa, as well as a meeting point for populations coming from North Africa through the Saharan desert. Methodology/Principal Findings Samples from twelve ethnic groups from the Chad Basin (n = 542) have been high-throughput genotyped for 230 coding region mitochondrial DNA (mtDNA) Single Nucleotide Polymorphisms (mtSNPs) using Matrix-Assisted Laser Desorption/Ionization Time-Of-Flight (MALDI-TOF) mass spectrometry. This set of mtSNPs allowed for much better phylogenetic resolution than previous studies of this geographic region, enabling new insights into its population history. Notable haplogroup (hg) heterogeneity has been observed in the Chad Basin mirroring the different demographic histories of these ethnic groups. As estimated using a Bayesian framework, nomadic populations showed negative growth which was not always correlated to their estimated effective population sizes. Nomads also showed lower diversity values than sedentary groups. Conclusions/Significance Compared to sedentary populations, nomads showed signals of stronger genetic drift occurring in their ancestral populations. These populations, however, retained more haplotype diversity in their hypervariable segments I (HVS-I), but not their mtSNPs, suggesting a more ancestral ethnogenesis. Whereas the nomadic population showed a higher Mediterranean influence signaled mainly by sub-lineages of M1, R0, U6, and U5, the other populations showed a more consistent sub-Saharan pattern. Although lifestyle may have an influence on diversity patterns and hg composition, analysis of molecular variance has not identified these differences. The present study indicates that analysis of mtSNPs at high resolution could be a fast and extensive approach for screening variation in population studies where labor-intensive techniques such as entire genome sequencing remain unfeasible. PMID:21533064

  11. Imputation of exome sequence variants into population-based samples and blood-cell-trait-associated loci in African Americans: NHLBI GO Exome Sequencing Project.

    PubMed

    Auer, Paul L; Johnsen, Jill M; Johnson, Andrew D; Logsdon, Benjamin A; Lange, Leslie A; Nalls, Michael A; Zhang, Guosheng; Franceschini, Nora; Fox, Keolu; Lange, Ethan M; Rich, Stephen S; O'Donnell, Christopher J; Jackson, Rebecca D; Wallace, Robert B; Chen, Zhao; Graubert, Timothy A; Wilson, James G; Tang, Hua; Lettre, Guillaume; Reiner, Alex P; Ganesh, Santhi K; Li, Yun

    2012-11-01

    Researchers have successfully applied exome sequencing to discover causal variants in selected individuals with familial, highly penetrant disorders. We demonstrate the utility of exome sequencing followed by imputation for discovering low-frequency variants associated with complex quantitative traits. We performed exome sequencing in a reference panel of 761 African Americans and then imputed newly discovered variants into a larger sample of more than 13,000 African Americans for association testing with the blood cell traits hemoglobin, hematocrit, white blood count, and platelet count. First, we illustrate the feasibility of our approach by demonstrating genome-wide-significant associations for variants that are not covered by conventional genotyping arrays; for example, one such association is that between higher platelet count and an MPL c.117G>T variant encoding a p.Lys39Asn amino acid substitution of the thrombopoietin receptor gene (p = 1.5 × 10^-11). Second, we identified an association between missense variants of LCT and higher white blood count (p = 4 × 10^-13). Third, we identified low-frequency coding variants that might account for allelic heterogeneity at several known blood cell-associated loci: MPL c.754T>C (p.Tyr252His) was associated with higher platelet count; CD36 c.975T>G (p.Tyr325*) was associated with lower platelet count; and several missense variants at the β-globin gene locus were associated with lower hemoglobin. By identifying low-frequency missense variants associated with blood cell traits not previously reported by genome-wide association studies, we establish that exome sequencing followed by imputation is a powerful approach to dissecting complex, genetically heterogeneous traits in large population-based studies. PMID:23103231

  12. Combining information from two data sources with misreporting and incompleteness to assess hospice-use among cancer patients: a multiple imputation approach.

    PubMed

    He, Yulei; Landrum, Mary Beth; Zaslavsky, Alan M

    2014-09-20

    Combining information from multiple data sources can enhance estimates of health-related measures by using one source to supply information that is lacking in another, assuming the former has accurate and complete data. However, there is little research conducted on combining methods when each source might be imperfect, for example, subject to measurement errors and/or missing data. In a multisite study of hospice-use by late-stage cancer patients, this variable was available from patients' abstracted medical records, which may be considerably underreported because of incomplete acquisition of these records. Therefore, data for Medicare-eligible patients were supplemented with their Medicare claims that contained information on hospice-use, which may also be subject to underreporting yet to a lesser degree. In addition, both sources suffered from missing data because of unit nonresponse from medical record abstraction and sample undercoverage for Medicare claims. We treat the true hospice-use status from these patients as a latent variable and propose to multiply impute it using information from both data sources, borrowing the strength from each. We characterize the complete-data model as a product of an 'outcome' model for the probability of hospice-use and a 'reporting' model for the probability of underreporting from both sources, adjusting for other covariates. Assuming the reports of hospice-use from both sources are missing at random and the underreporting are conditionally independent, we develop a Bayesian multiple imputation algorithm and conduct multiple imputation analyses of patient hospice-use in demographic and clinical subgroups. The proposed approach yields more sensible results than alternative methods in our example. Our model is also related to dual system estimation in population censuses and dual exposure assessment in epidemiology. PMID:24804628

  13. Coparenting conflict, nonacceptance, and depression among divorced adults: results from a 12-year follow-up study of child custody mediation using multiple imputation.

    PubMed

    Sbarra, David A; Emery, Robert E

    2005-01-01

    Using statistically imputed data to increase available power, this article reevaluated the long-term effects of divorce mediation on adults' psychological adjustment and investigated the relations among coparenting custody conflict, nonacceptance of marital termination, and depression at 2 occasions over a decade apart following marital dissolution. Group comparisons revealed that fathers and parents who mediated their custody disputes reported significantly more nonacceptance at the 12-year follow-up assessment. Significant interactions were observed by gender in regression models predicting nonacceptance at the follow-up; mothers' nonacceptance was positively associated with concurrent depression, whereas fathers' nonacceptance was positively associated with early nonacceptance and negatively associated with concurrent conflict. PMID:15709851

  14. Association of prostate cancer risk with SNPs in regions containing androgen receptor binding sites captured by ChIP-on-chip analyses

    PubMed Central

    Lu, Yizhen; Sun, Jielin; Kader, Andrew K.; Kim, Seong-Tae; Kim, Jin-Woo; Liu, Wennuan; Sun, Jishan; Lu, Daru; Feng, Junjie; Zhu, Yi; Jin, Tao; Zhang, Zheng; Dimitrov, Latchezar; Lowey, James; Campbell, Kevin; Suh, Edward; Duggan, David; Carpten, John; Trent, Jeffrey M.; Gronberg, Henrik; Zheng, Siqun L.; Isaacs, William B.; Xu, Jianfeng

    2012-01-01

    Background Genome-wide association studies (GWAS) have identified approximately three dozen single nucleotide polymorphisms (SNPs) consistently associated with prostate cancer (PCa) risk. Despite the reproducibility of these associations, the molecular mechanism for most of these SNPs has not been well elaborated as most lie within non-coding regions of the genome. Androgens play a key role in prostate carcinogenesis. Recently, using ChIP-on-chip technology, 22,447 androgen receptor (AR) binding sites have been mapped throughout the genome, greatly expanding the genomic regions potentially involved in androgen-mediated activity. Methodology/Principal findings To test the hypothesis that sequence variants in AR binding sites are associated with PCa risk, we performed a systematic evaluation among two existing PCa GWAS cohorts; the Johns Hopkins Hospital and the Cancer Genetic Markers of Susceptibility (CGEMS) study population. We demonstrate that regions containing AR binding sites are significantly enriched for PCa risk-associated SNPs, i.e. more than expected by chance alone. In addition, compared with the entire genome, these newly observed risk-associated SNPs in these regions are significantly more likely to overlap with established PCa risk-associated SNPs from previous GWAS. These results are consistent with our previous finding from a bioinformatics analysis that one-third of the 33 known PCa risk-associated SNPs discovered by GWAS are located in regions of the genome containing AR binding sites. Conclusions/Significance The results to date provide novel statistical evidence suggesting an androgen-mediated mechanism by which some PCa associated SNPs act to influence PCa risk. However, these results are hypothesis generating and ultimately warrant testing through in-depth molecular analyses. PMID:21671247

  15. MiRNA-Related SNPs and Risk of Esophageal Adenocarcinoma and Barrett’s Esophagus: Post Genome-Wide Association Analysis in the BEACON Consortium

    PubMed Central

    Buas, Matthew F.; Onstad, Lynn; Levine, David M.; Risch, Harvey A.; Chow, Wong-Ho; Liu, Geoffrey; Fitzgerald, Rebecca C.; Bernstein, Leslie; Ye, Weimin; Bird, Nigel C.; Romero, Yvonne; Casson, Alan G.; Corley, Douglas A.; Shaheen, Nicholas J.; Wu, Anna H.; Gammon, Marilie D.; Reid, Brian J.; Hardie, Laura J.; Peters, Ulrike; Whiteman, David C.; Vaughan, Thomas L.

    2015-01-01

    Incidence of esophageal adenocarcinoma (EA) has increased substantially in recent decades. Multiple risk factors have been identified for EA and its precursor, Barrett’s esophagus (BE), such as reflux, European ancestry, male sex, obesity, and tobacco smoking, and several germline genetic variants were recently associated with disease risk. Using data from the Barrett’s and Esophageal Adenocarcinoma Consortium (BEACON) genome-wide association study (GWAS) of 2,515 EA cases, 3,295 BE cases, and 3,207 controls, we examined single nucleotide polymorphisms (SNPs) that potentially affect the biogenesis or biological activity of microRNAs (miRNAs), small non-coding RNAs implicated in post-transcriptional gene regulation, and deregulated in many cancers, including EA. Polymorphisms in three classes of genes were examined for association with risk of EA or BE: miRNA biogenesis genes (157 SNPs, 21 genes); miRNA gene loci (234 SNPs, 210 genes); and miRNA-targeted mRNAs (177 SNPs, 158 genes). Nominal associations (P<0.05) of 29 SNPs with EA risk, and 25 SNPs with BE risk, were observed. None remained significant after correction for multiple comparisons (FDR q>0.50), and we did not find evidence for interactions between variants analyzed and two risk factors for EA/BE (smoking and obesity). This analysis provides the most extensive assessment to date of miRNA-related SNPs in relation to risk of EA and BE. While common genetic variants within components of the miRNA biogenesis core pathway appear unlikely to modulate susceptibility to EA or BE, further studies may be warranted to examine potential associations between unassessed variants in miRNA genes and targets with disease risk. PMID:26039359

  16. Construction of High Density Sweet Cherry (Prunus avium L.) Linkage Maps Using Microsatellite Markers and SNPs Detected by Genotyping-by-Sequencing (GBS)

    PubMed Central

    Guajardo, Verónica; Solís, Simón; Sagredo, Boris; Gainza, Felipe; Muñoz, Carlos; Gasic, Ksenija; Hinrichsen, Patricio

    2015-01-01

    Linkage maps are valuable tools in genetic and genomic studies. For sweet cherry, linkage maps have been constructed using mainly microsatellite markers (SSRs) and, recently, using single nucleotide polymorphism markers (SNPs) from a cherry 6K SNP array. Genotyping-by-sequencing (GBS), a new methodology based on high-throughput sequencing, holds great promise for identification of high number of SNPs and construction of high density linkage maps. In this study, GBS was used to identify SNPs from an intra-specific sweet cherry cross. A total of 8,476 high quality SNPs were selected for mapping. The physical position for each SNP was determined using the peach genome, Peach v1.0, as reference, and a homogeneous distribution of markers along the eight peach scaffolds was obtained. On average, 65.6% of the SNPs were present in genic regions and 49.8% were located in exonic regions. In addition to the SNPs, a group of SSRs was also used for construction of linkage maps. Parental and consensus high density maps were constructed by genotyping 166 siblings from a ‘Rainier’ x ‘Rivedel’ (Ra x Ri) cross. Using Ra x Ri population, 462, 489 and 985 markers were mapped into eight linkage groups in ‘Rainier’, ‘Rivedel’ and the Ra x Ri map, respectively, with 80% of mapped SNPs located in genic regions. Obtained maps spanned 549.5, 582.6 and 731.3 cM for ‘Rainier’, ‘Rivedel’ and consensus maps, respectively, with an average distance of 1.2 cM between adjacent markers for both ‘Rainier’ and ‘Rivedel’ maps and of 0.7 cM for Ra x Ri map. High synteny and co-linearity was observed between obtained maps and with Peach v1.0. These new high density linkage maps provide valuable information on the sweet cherry genome, and serve as the basis for identification of QTLs and genes relevant for the breeding of the species. PMID:26011256
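
    The average marker spacings quoted in this record can be checked against the reported map lengths and marker counts. The abstract does not say how the averages were computed; the sketch below assumes the usual convention of dividing total map length by the number of intervals between adjacent markers (markers minus linkage groups, since each of the eight groups contributes one fewer interval than it has markers).

```python
# Map length in cM and number of mapped markers, as reported in the abstract.
MAPS = {
    "Rainier": (549.5, 462),
    "Rivedel": (582.6, 489),
    "Ra x Ri": (731.3, 985),
}
LINKAGE_GROUPS = 8  # eight linkage groups, per the abstract


def mean_spacing(length_cm, n_markers, n_groups=LINKAGE_GROUPS):
    """Average distance between adjacent markers (assumed convention)."""
    intervals = n_markers - n_groups
    if intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return length_cm / intervals


spacing = {
    name: round(mean_spacing(length, markers), 1)
    for name, (length, markers) in MAPS.items()
}
print(spacing)
```

    Rounded to one decimal place, this convention gives 1.2 cM for both parental maps and 0.7 cM for the consensus map, in line with the figures stated in the abstract.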

  17. Ceramic fiber reinforced filter

    DOEpatents

    Stinton, David P. (Knoxville, TN); McLaughlin, Jerry C. (Oak Ridge, TN); Lowden, Richard A. (Powell, TN)

    1991-01-01

    A filter for removing particulate matter from high temperature flowing fluids, and in particular gases, that is reinforced with ceramic fibers. The filter has a ceramic base fiber material in the form of a fabric, felt, paper or the like, with the refractory fibers thereof coated with a thin layer of a protective and bonding refractory applied by chemical vapor deposition techniques. This coating causes each fiber to be physically joined to adjoining fibers so as to prevent movement of the fibers during use and to increase the strength and toughness of the composite filter. Further, the coating can be selected to minimize any reactions between the constituents of the fluids and the fibers. A description is given of the formation of a composite filter using a felt preform of commercial silicon carbide fibers together with the coating of these fibers with pure silicon carbide. Filter efficiency approaching 100% has been demonstrated with these filters. The fiber base material is alternately made from aluminosilicate fibers, zirconia fibers and alumina fibers. Coating with Al₂O₃ is also described. Advanced configurations for the composite filter are suggested.

  18. Fourier plane filters

    NASA Technical Reports Server (NTRS)

    Oliver, D. S.; Aldrich, R. E.; Krol, F. T.

    1972-01-01

    An electrically addressed liquid crystal Fourier plane filter capable of real time optical image processing is described. The filter consists of two parts: a wedge filter having forty 9 deg segments and a ring filter having twenty concentric rings in a one inch diameter active area. Transmission of the filter in the off (transparent) state exceeds fifty percent. By using polarizing optics, contrast as high as 10,000:1 can be achieved at voltages compatible with FET switching technology. A phenomenological model for the dynamic scattering is presented for this special case. The filter is designed to be operated from a computer and is addressed by a seven bit binary word which includes an on or off command and selects any one of the twenty rings or twenty wedge pairs. The overall system uses addressable latches so that once an element is in a specified state, it will remain there until a change of state command is received. The drive for the liquid crystal filter is ±30 V peak at 30 Hz to 70 Hz. These parameters give a rise time for the scattering of 20 msec and a decay time of 80 to 100 msec.
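
    The seven-bit addressing described in this record is consistent with a simple count: twenty rings plus twenty wedge pairs make forty elements, which need six bits, leaving one bit for the on/off command. The abstract does not give the actual bit layout, so the sketch below uses a hypothetical one (top bit for the command, low six bits for the element, rings numbered 0 to 19 and wedge pairs 20 to 39) purely to illustrate the idea.

```python
ELEMENTS_PER_KIND = 20  # twenty rings and twenty wedge pairs, per the abstract


def encode_command(on, kind, index):
    """Pack a command into a 7-bit word (hypothetical layout, not the report's)."""
    if kind not in ("ring", "wedge"):
        raise ValueError("kind must be 'ring' or 'wedge'")
    if not 0 <= index < ELEMENTS_PER_KIND:
        raise ValueError("index must be in 0..19")
    element = index + (ELEMENTS_PER_KIND if kind == "wedge" else 0)
    return (int(bool(on)) << 6) | element


def decode_command(word):
    """Unpack a 7-bit word produced by encode_command."""
    if not 0 <= word < 128:
        raise ValueError("word must fit in seven bits")
    on = bool(word >> 6)
    element = word & 0x3F
    if element >= 2 * ELEMENTS_PER_KIND:
        raise ValueError("element code out of range")
    kind = "wedge" if element >= ELEMENTS_PER_KIND else "ring"
    return on, kind, element % ELEMENTS_PER_KIND


word = encode_command(True, "wedge", 3)
print(format(word, "07b"), decode_command(word))
```

    Because the system latches each element's state, a controller built this way would only need to send one word per change rather than refreshing the whole pattern.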

  19. Usefulness of SNPs as Supplementary Markers in a Paternity Case with 3 Genetic Incompatibilities at Autosomal and Y Chromosomal Loci

    PubMed Central

    Lindner, Iris; von Wurmb-Schwark, Nicole; Meier, Patrick; Fimmers, Rolf; Büttner, Andreas

    2014-01-01

    Summary. Background: In kinship testing, investigation of 15 short tandem repeats (STRs) usually provides decisive genetic information for resolving relationship cases. However, in complex deficiency cases, in cases with more than 2 mutations at different STR loci or when close (untested) relatives of the alleged father are suggested to be the biological father of the child, STR typing alone may not be sufficient. In these cases, the application of supplementary markers such as single nucleotide polymorphisms (SNPs) is recommended. Methods: We describe a paternity case with 3 genetic incompatibilities (Penta D, VWA, and DYS385) between the alleged father and the child after analyzing 23 autosomal and 16 Y chromosomal STR loci. The question arose as to whether the alleged father could be excluded and a related person could be the biological father of the child, or whether the observed genetic incompatibilities were mutations. Interestingly, the 2 excluded full brothers of the alleged father possessed the same genetic incompatibilities at loci VWA and DYS385 as the alleged father. Results and Conclusions: Additional performance of a 50-plex SNP assay demonstrated that the observed mismatches were indeed mutations and the alleged father was the biological father of the child. The results show the usefulness of SNPs as supplementary markers in relationship testing when STR analyses show ambiguous results. PMID:24847187

  20. DNA pooling analysis of 21 norepinephrine transporter gene SNPs with attention deficit hyperactivity disorder: no evidence for association.

    PubMed

    Xu, Xiaohui; Knight, Jo; Brookes, Keeley; Mill, Jonathan; Sham, Pak; Craig, Ian; Taylor, Eric; Asherson, Philip

    2005-04-01

    The norepinephrine system is known to play a role in attentional and cognitive-energetic mechanisms and is thought to be important in attention deficit hyperactivity disorder (ADHD). Stimulant medications are known to alter the activity of norepinephrine as well as dopamine in the synapse, and the highly selective norepinephrine reuptake inhibitor, atomoxetine, is an effective treatment for ADHD symptoms. This study set out to investigate whether common polymorphisms within the norepinephrine transporter gene (NET1) are associated with DSM-IV ADHD combined subtype, using a sample that has previously shown association with genes that affect the synaptic release and uptake of neurotransmitters: DAT1 and SNAP-25. We identified 21 single nucleotide polymorphisms (SNPs) from publicly available databases that had minor allele frequencies ≥5% and span the NET1 genomic region, including those analyzed in previous studies of ADHD. DNA pooling was used to screen for associations using two case pools (n = 180 cases) and four control pools (n = 334 controls). We identified three SNPs that showed suggestive evidence for association using either case-control or within-family tests of association; however, none of these were significant after adjustment for the number of markers analyzed. We conclude that none of the markers show significant evidence of association with ADHD, although we cannot rule out small genetic effects. PMID:15719398
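
    To make the "adjustment for the number of markers analyzed" concrete, here is a minimal sketch that assumes a Bonferroni correction over the 21 SNPs. The abstract does not name the correction the authors used, and the marker names and p-values below are invented purely for illustration.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the per-test threshold and the markers that pass it."""
    threshold = alpha / len(p_values)
    hits = [name for name, p in p_values.items() if p <= threshold]
    return threshold, hits

# Hypothetical p-values: eighteen null markers and three "suggestive" ones.
p_values = {f"snp{i:02d}": 0.5 for i in range(1, 22)}
p_values.update({"snp03": 0.010, "snp11": 0.020, "snp17": 0.040})

threshold, hits = bonferroni_significant(p_values)
print(round(threshold, 5), hits)
```

    With 21 tests the per-marker threshold drops to about 0.0024, so nominal p-values between 0.01 and 0.05 no longer count as significant under this assumed correction.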

  1. Structure-function studies on non-synonymous SNPs of chemokine receptor gene implicated in cardiovascular disease: a computational approach.

    PubMed

    Sai Ramesh, A; Sethumadhavan, Rao; Thiagarajan, Padma

    2013-12-01

    Among non-communicable diseases, cardiovascular disease (CVD) is claimed to be the leading cause of death worldwide. The chemokine (C-C motif) receptor 5 (CCR5) gene has a strong association with the development of CVD and may culminate in myocardial infarction. In this study, its potential variations have been determined using a molecular dynamics approach. Single nucleotide polymorphisms (SNPs) are the predominant mutations and their deleterious effects were initially screened using prediction tools. Further, for the 75% of deleterious non-synonymous SNPs predicted in common by the above tools, root mean square deviation (RMSD) and stability residues were determined using SWISS-PDB viewer and SRide server respectively. Accordingly, four point mutations L55Q, V131F, R223W, and G301R which had RMSD ≥2.0 Å were selected and trajectory analyses were performed. In common, all trajectory analyses reported no similarities between native and mutants. Combined mutational analysis comparing all the mutants together with the native also showed significant and similar changes. Thus we conclude that the above four mutations are the potential targets of CCR5 and may lead to CVD. PMID:24293156

  2. Association of BID SNPs (rs8190315 and rs2072392) and clinical features of benign prostate hyperplasia in Korean population

    PubMed Central

    Seok, Hosik; Kim, Su Kang; Yoo, Koo Han; Lee, Byung-Cheol; Kim, Young Ock; Chung, Joo-Ho

    2014-01-01

    Exercise has a beneficial effect on cancer apoptosis and benign prostatic hyperplasia (BPH). The BH3 interacting domain death agonist (BID) gene expression is associated with apoptosis or cell proliferation. In this study, we investigated the association between BID single nucleotide polymorphisms (SNPs) and the development, prostate volume, and international prostate symptom score (IPSS) of BPH. In 222 BPH males and 214 controls, two SNPs in BID [rs8190315 (Ser56Gly), and rs2072392 (Asp106Asp)] were genotyped and analyzed using multiple logistic regression models. In the results, the genotype and allele frequencies of rs8190315 and rs2072392 were not associated with BPH development or IPSS; however, the allele frequencies [odds ratio (OR) = 1.90, 95% confidence interval (CI) = 1.07–3.41, P = 0.03] and genotype frequencies (in dominant model, OR = 1.94, 95% CI = 1.01–3.74, P = 0.42) of rs8190315, and the genotype frequencies of rs2072392 (in dominant model, OR = 1.94, 95% CI = 1.01–3.74, P = 0.42) were associated with increased prostate volume. We propose that rs8190315 and rs2072392 of BID may contribute to the disease severity of BPH. PMID:25610824

  3. Two Novel SNPs in ATXN3 3’ UTR May Decrease Age at Onset of SCA3/MJD in Chinese Patients

    PubMed Central

    Long, Zhe; Chen, Zhao; Wang, Chunrong; Huang, Fengzhen; Peng, Huirong; Hou, Xuan; Ding, Dongxue; Ye, Wei; Wang, Junling; Pan, Qian; Li, Jiada; Xia, Kun; Tang, Beisha; Ashizawa, Tetsuo; Jiang, Hong

    2015-01-01

    Spinocerebellar ataxia type 3 (SCA3), or Machado-Joseph disease (MJD), is an autosomal dominantly-inherited disease that produces progressive problems with movement. It is caused by the expansion of an area of CAG repeats in a coding region of ATXN3. The number of repeats is inversely associated with age at disease onset (AO) and is significantly associated with disease severity; however, the degree of CAG expansion only explains 50 to 70% of variance in AO. We tested two SNPs, rs709930 and rs910369, in the 3’ UTR of ATXN3 gene for association with SCA3/MJD risk and with SCA3/MJD AO in an independent cohort of 170 patients with SCA3/MJD and 200 healthy controls from mainland China. rs709930 genotype frequencies were statistically significantly different between patients and controls (p = 0.001, α = 0.05). SCA3/MJD patients carrying the rs709930 A allele and rs910369 T allele experienced an earlier onset, with a decrease in AO of approximately 2 to 4 years. The two novel SNPs found in this study might be genetic modifiers for AO in SCA3/MJD. PMID:25689313

  4. Multilevel filtering elliptic preconditioners

    NASA Technical Reports Server (NTRS)

    Kuo, C. C. Jay; Chan, Tony F.; Tong, Charles

    1989-01-01

    A class of preconditioners is presented for elliptic problems built on ideas borrowed from the digital filtering theory and implemented on a multilevel grid structure. They are designed to be both rapidly convergent and highly parallelizable. The digital filtering viewpoint allows the use of filter design techniques for constructing elliptic preconditioners and also provides an alternative framework for understanding several other recently proposed multilevel preconditioners. Numerical results are presented to assess the convergence behavior of the new methods and to compare them with other preconditioners of multilevel type, including the usual multigrid method as preconditioner, the hierarchical basis method and a recent method proposed by Bramble-Pasciak-Xu.

  5. Remotely serviced filter and housing

    DOEpatents

    Ross, Maurice J. (Pocatello, ID); Zaladonis, Larry A. (Idaho Falls, ID)

    1988-09-27

    A filter system for a hot cell comprises a housing adapted for input of air or other gas to be filtered, flow of the air through a filter element, and exit of filtered air. The housing is tapered at the top to make it easy to insert a filter cartridge using an overhead crane. The filter cartridge holds the filter element while the air or other gas is passed through the filter element. Captive bolts in trunnion nuts are readily operated by electromechanical manipulators operating power wrenches to secure and release the filter cartridge. The filter cartridge is adapted to make it easy to change a filter element by using a master-slave manipulator at a shielded window station.

  6. 11.10 Filter Banks What Are Filter Banks?

    E-print Network

    Fowler, Mark

    11.10 Filter Banks. What are filter banks? Often need to slice up a "wideband" signal into various "subbands" (figure from Porat's book). Filter banks application: cell phone basestation, FDMA. [Figure: antenna, converter & ADC, filter bank, and demodulators for User 1 ... User M; frequency axis near 1 GHz] ...

  7. Concurrent filtering and smoothing

    E-print Network

    Kaess, Michael

    This paper presents a novel algorithm for integrating real-time filtering of navigation data with full map/trajectory smoothing. Unlike conventional mapping strategies, the result of loop closures within the smoother serve ...

  8. Parallel Subconvolution Filtering Architectures

    NASA Technical Reports Server (NTRS)

    Gray, Andrew A.

    2003-01-01

    These architectures are based on methods of vector processing and the discrete-Fourier-transform/inverse-discrete-Fourier-transform (DFT-IDFT) overlap-and-save method, combined with time-block separation of digital filters into frequency-domain subfilters implemented by use of sub-convolutions. The parallel-processing method implemented in these architectures enables the use of relatively small DFT-IDFT pairs, while filter tap lengths are theoretically unlimited. The size of a DFT-IDFT pair is determined by the desired reduction in processing rate, rather than by the order of the filter that one seeks to implement. The emphasis in this report is on those aspects of the underlying theory and design rules that promote computational efficiency, parallel processing at reduced data rates, and simplification of the designs of very-large-scale integrated (VLSI) circuits needed to implement high-order filters and correlators.
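
    For readers unfamiliar with the overlap-and-save building block mentioned above, the following is a minimal sketch of plain block-wise frequency-domain FIR filtering. It does not attempt the parallel sub-convolution decomposition of the report; the function names, the block size, and the naive DFT are choices made for this example only.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) DFT, adequate for a small demonstration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def overlap_save(signal, taps, block=8):
    """Filter `signal` with FIR `taps` using overlap-and-save blocks."""
    m = len(taps)
    if block < m:
        raise ValueError("block must be at least as long as the filter")
    step = block - m + 1                        # new output samples per block
    h_freq = dft(list(taps) + [0.0] * (block - m))
    padded = [0.0] * (m - 1) + list(signal)     # history for the first block
    out = []
    pos = 0
    while pos < len(signal):
        seg = padded[pos:pos + block]
        seg = seg + [0.0] * (block - len(seg))  # zero-fill the last block
        prod = [a * b for a, b in zip(dft(seg), h_freq)]
        y = dft(prod, inverse=True)
        out.extend(v.real for v in y[m - 1:])   # drop the wrapped-around samples
        pos += step
    return out[:len(signal)]

def direct_fir(signal, taps):
    """Reference time-domain convolution, truncated to the input length."""
    return [sum(taps[k] * signal[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(signal))]

signal = [float((3 * n) % 7) - 3.0 for n in range(20)]
taps = [0.25, 0.5, 0.25]
fast = overlap_save(signal, taps, block=8)
slow = direct_fir(signal, taps)
max_err = max(abs(a - b) for a, b in zip(fast, slow))
print(max_err)
```

    In this plain form the transform length must be at least the filter length, which is the constraint the sub-convolution approach in the report is described as relaxing.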

  9. Erythropoietin-Stimulating Agents and Survival in End-Stage Renal Disease: Comparison of Payment Policy Analysis, Instrumental Variables, and Multiple Imputation of Potential Outcomes

    PubMed Central

    Dore, David D.; Swaminathan, Shailender; Gutman, Roee; Trivedi, Amal N.; Mor, Vincent

    2013-01-01

    Objective: To compare the assumptions and estimands across three approaches to estimating the effect of erythropoietin-stimulating agents (ESAs) on mortality. Study Design and Setting: Using data from the Renal Management Information System, we conducted two analyses utilizing a change to bundled payment that we hypothesized mimicked random assignment to ESA (pre-post, difference-in-difference, and instrumental variable analyses). A third analysis was based on multiply imputing potential outcomes using propensity scores. Results: There were 311,087 recipients of ESAs and 13,095 non-recipients. In the pre-post comparison, we identified no clear relationship between bundled payment (measured by calendar time) and the incidence of death within six months (risk difference -1.5%; 95% CI -7.0% to 4.0%). In the instrumental variable analysis, the risk of mortality was similar among ESA recipients (risk difference -0.9%; 95% CI -2.1 to 0.3). In the multiple imputation analysis, we observed a 4.2% (95% CI 3.4% to 4.9%) absolute reduction in mortality risk with use of ESAs, but closer to the null for patients with baseline hematocrit >36%. Conclusion: Methods emanating from different disciplines often rely on different assumptions, but can be informative about a similar causal contrast. The implications of these distinct approaches are discussed. PMID:23849152

  10. Nonlinear Attitude Filtering Methods

    NASA Technical Reports Server (NTRS)

    Markley, F. Landis; Crassidis, John L.; Cheng, Yang

    2005-01-01

    The extended Kalman filter (EKF) is the workhorse of real-time spacecraft attitude estimation. Since the group SO(3) of rotation matrices has dimension three, most attitude determination EKFs use lower-dimensional attitude parameterizations than the nine-parameter attitude matrix itself. The fact that all three-parameter representations of SO(3) are singular or discontinuous for certain attitudes has led to extended discussions of constraints and attitude representations in EKFs. The most successful EKF uses a nonsingular parameterization for the global attitude, which necessarily has more than three parameters, while employing a three-component representation for the attitude errors. This filter has become known as the Multiplicative Extended Kalman Filter. These issues are now well understood, however, and the EKF has performed admirably in the vast majority of attitude determination applications. Nevertheless, poor performance or even divergence arising from the linearization implicit in the EKF has led to the development of nonlinear filters, most recently sigma point or unscented filters and particle filters.
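
    The split between a four-parameter global attitude and a three-component error can be sketched as follows. This shows only the multiplicative correction step, not the Kalman gain or covariance propagation. The scalar-first Hamilton quaternion convention, the use of a rotation vector as the error representation, and all function names are assumptions for this illustration.

```python
import math

def quat_mul(p, q):
    """Hamilton product of two scalar-first quaternions (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw * qw - px * qx - py * qy - pz * qz,
            pw * qx + px * qw + py * qz - pz * qy,
            pw * qy - px * qz + py * qw + pz * qx,
            pw * qz + px * qy - py * qx + pz * qw)

def error_quat(a):
    """Map a three-component rotation-vector error to a unit quaternion."""
    angle = math.sqrt(sum(c * c for c in a))
    if angle < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(angle / 2.0) / angle
    return (math.cos(angle / 2.0), a[0] * s, a[1] * s, a[2] * s)

def apply_correction(q_ref, a):
    """Fold the estimated error into the global attitude and renormalize."""
    q = quat_mul(error_quat(a), q_ref)
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)

# Example: a 90-degree error about z applied to the identity attitude.
q_new = apply_correction((1.0, 0.0, 0.0, 0.0), (0.0, 0.0, math.pi / 2.0))
print(q_new)
```

    After such a correction the three-component error would be reset to zero, so the singularities of the three-parameter representation are never approached as long as the errors stay small.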

  11. Application of microbead biological filters

    Microsoft Academic Search

    Michael B. Timmons; John L. Holder; James M. Ebeling

    2006-01-01

    The application of floating microbead filters to aquaculture is reviewed and discussed. The microbead filter is distinctly different from the more commonly used floating bead filters. Conventional bead filters work in pressurized vessels and use media that are only slightly buoyant. The required mass of beads for the volume required make the media a relatively

  12. Filters for cathodic arc plasmas

    DOEpatents

    Anders, Andre (Albany, CA); MacGill, Robert A. (Richmond, CA); Bilek, Marcela M. M. (Engadine, AU); Brown, Ian G. (Berkeley, CA)

    2002-01-01

    Cathodic arc plasmas are contaminated with macroparticles. A variety of magnetic plasma filters have been used with varying success in removing the macroparticles from the plasma. An open-architecture, bent solenoid filter, with additional field coils at the filter entrance and exit, improves macroparticle filtering. In particular, a double-bent filter that is twisted out of plane forms a very compact and efficient filter. The coil turns further have a flat cross-section to promote macroparticle reflection out of the filter volume. An output conditioning system formed of an expander coil, a straightener coil, and a homogenizer may be used with the magnetic filter for expanding the filtered plasma beam to cover a larger area of the target. A cathodic arc plasma deposition system using this filter can be used for the deposition of ultrathin amorphous hard carbon (a-C) films for the magnetic storage industry.

  13. Proteasome modulator 9 gene SNPs, responsible for anti-depressant response, are in linkage with generalized anxiety disorder.

    PubMed

    Gragnoli, Claudia

    2014-09-01

    Proteasome modulator 9 (PSMD9) gene single nucleotide polymorphism (SNP) rs1043307/rs2514259 (E197G) is associated with significant clinical response to the anti-depressant desipramine. PSMD9 SNPs rs74421874 [intervening sequence (IVS) 3 + nt460 G>A], rs3825172 (IVS3 + nt437 C>T) and rs1043307/rs2514259 (E197G A>G) are all linked to type 2 diabetes (T2D), maturity-onset diabetes of the young 3 (MODY3), obesity and waist circumference, hypertension, hypercholesterolemia, T2D-macrovascular and T2D-microvascular disease, T2D-neuropathy, T2D-carpal tunnel syndrome, T2D-nephropathy, T2D-retinopathy, non-diabetic retinopathy and depression. PSMD9 rs149556654 rare SNP (N166S A>G) and the variant S143G A>G also contribute to T2D. PSMD9 is located in the chromosome 12q24 locus, which per se is in linkage with depression, bipolar disorder and anxiety. In the present study, we wanted to determine whether PSMD9 is linked to generalized anxiety disorder in Italian T2D families. Two hundred Italian T2D families were phenotyped for generalized anxiety disorder, using the diagnostic criteria of DSM-IV. When the diagnosis was unavailable or unclear, the trait was reported as unknown. The 200 Italian families were tested for the PSMD9 T2D risk SNPs rs74421874 (IVS3 + nt460 G>A), rs3825172 (IVS3 + nt437 T>C) and for the T2D risk and anti-depressant response SNP rs1043307/rs2514259 (E197G A>G) for evidence of linkage with generalized anxiety disorder. Non-parametric linkage analysis was executed via Merlin software. One thousand simulation tests were performed to exclude results due to random chance. In our study, the PSMD9 gene SNPs rs74421874, rs3825172, and rs1043307/rs2514259 result in linkage to generalized anxiety disorder. This is the first report describing PSMD9 gene SNPs in linkage to generalized anxiety disorder in T2D families. PMID:24648162

  14. DOE HEPA filter test program

    SciTech Connect

    NONE

    1998-05-01

    This standard establishes essential elements of a Department of Energy (DOE) program for testing HEPA filters to be installed in DOE nuclear facilities or used in DOE-contracted activities. A key element is the testing of HEPA filters for performance at a DOE Filter Test Facility (FTF) prior to installation. Other key elements are (1) providing for a DOE HEPA filter procurement program, and (2) verifying that HEPA filters to be installed in nuclear facilities appear on a Qualified Products List (QPL).

  15. Next generation filtering: Offline filtering enhanced proxy architecture for web content filtering

    Microsoft Academic Search

    E. Akbas

    2008-01-01

    Most available web filters, especially parental controls, work inline, meaning that all outgoing and incoming packets are passed through a filter driver. This approach is widely used in parental control applications because they mostly use a blacklist/whitelist approach and need to keep applications from easily bypassing the filter. Online content filtering, along with its own benefits, has a big flaw;

  16. Identification and analysis of genome-wide SNPs provide insight into signatures of selection and domestication in channel catfish (Ictalurus punctatus).

    PubMed

    Sun, Luyang; Liu, Shikai; Wang, Ruijia; Jiang, Yanliang; Zhang, Yu; Zhang, Jiaren; Bao, Lisui; Kaltenboeck, Ludmilla; Dunham, Rex; Waldbieser, Geoff; Liu, Zhanjiang

    2014-01-01

    Domestication and selection for important performance traits can impact the genome, which is most often reflected by reduced heterozygosity in and surrounding genes related to traits affected by selection. In this study, analysis of the genomic impact caused by domestication and artificial selection was conducted by investigating the signatures of selection using single nucleotide polymorphisms (SNPs) in channel catfish (Ictalurus punctatus). A total of 8.4 million candidate SNPs were identified by using next generation sequencing. On average, the channel catfish genome harbors one SNP per 116 bp. Approximately 6.6 million, 5.3 million, 4.9 million, 7.1 million and 6.7 million SNPs were detected in the Marion, Thompson, USDA103, Hatchery strain, and wild population, respectively. The allele frequencies of 407,861 SNPs differed significantly between the domestic and wild populations. With these SNPs, 23 genomic regions with putative selective sweeps were identified that included 11 genes. Although the functions of the majority of the genes remain unknown in catfish, several genes with known function related to aquaculture performance traits were included in the regions with selective sweeps. These included hypoxia-inducible factor 1? (HIF1?) and the transporter gene ATP-binding cassette sub-family B member 5 (ABCB5). HIF1? is important for response to hypoxia, and tolerance to low oxygen levels is a critical aquaculture trait. The large numbers of SNPs identified from this study are valuable for the development of high-density SNP arrays for genetic and genomic studies of performance traits in catfish. PMID:25313648

  17. BRCA1/BRCA2 gene mutations/SNPs and BRCA1 haplotypes in early-onset breast cancer patients of Indian ethnicity.

    PubMed

    Juwle, Abida; Saranath, Dhananjaya

    2012-12-01

    We examined BRCA1/2 mutations and single nucleotide polymorphisms (SNPs) for identification of BRCA1 haplotypes, in early-onset breast cancer patients and their relatives, sporadic breast cancer patients, and unrelated normal healthy females, of Indian ethnicity. Peripheral blood DNA was amplified by polymerase chain reaction, at BRCA1/2 coding exons and subjected to nucleotide sequencing using ABI 3100 Genetic Analyzer. We observed BRCA1/BRCA2 mutations in 52% of early-onset breast cancer patients and in 57% of relatives. Deleterious mutations detected in early-onset patients and relatives were 187delAG, 632insT, 1052delT, Q759X, Q780X, R1203X, 5154delC, IVS14+1G>A, IVS17+1G>T, and 632insT in BRCA1 gene; and 4075delGT, 5076delAA, 6079delAGTT, and W3127X in BRCA2 gene. A high degree of penetrance of BRCA1/2 gene mutations was observed in the relatives. BRCA1/2 SNPs were identified in the Indian population, and association of BRCA1 haplotypes with breast cancer was investigated. A significantly increased frequency of the SNPs 203G/A, 3624A/G and 7470A/G in BRCA2 gene was observed in normal controls, indicative of a protective effect of the SNPs. BRCA1 haplotype 2 was most frequently observed in our population. Our study indicates a high incidence of BRCA1/BRCA2 gene mutations in the Indian patients. The BRCA1/2 mutations and SNPs are detailed on our website http://relibrca.rellife.com. PMID:22752604

  18. MicroRNA-mediated regulation of gene expression is affected by disease-associated SNPs within the 3′-UTR via altered RNA structure

    PubMed Central

    Haas, Ulrike; Sczakiel, Georg; Laufer, Sandra D.

    2012-01-01

    Single nucleotide polymorphisms (SNPs) in microRNAs (miRNAs) or their target sites (miR-SNPs) within the 3′-UTR of mRNAs are increasingly thought to play a major role in pathological dysregulation of gene expression. Here, we studied the functional role of miR-SNPs on miRNA-mediated post-transcriptional regulation of gene expression. First, analyses were performed on a SNP located in the miR-155 target site within the 3′-UTR of the Angiotensin II type 1 receptor (AGTR1; rs5186, A > C) mRNA. Second, a SNP in the 3′-UTR of the muscle RAS oncogene homolog (MRAS; rs9818870, C > T) mRNA was studied which is located outside of binding sites of miR-195 and miR-135. Using these SNPs we investigated their effects on local RNA structure, on local structural accessibility and on functional miRNA binding, respectively. Systematic computational RNA folding analyses of the allelic mRNAs in either case predicted significant changes of local RNA structure in the vicinity of the cognate miRNA binding sites. Consistently, experimental in vitro probing of RNA showing differential cleavage patterns and reporter gene-based assays indicated functional differences of miRNA-mediated regulation of the two AGTR1 and MRAS alleles. In conclusion, we describe a novel model explaining the functional influence of 3′-UTR-located SNPs on miRNA-mediated control of gene expression via SNP-related changes of local RNA structure in non-coding regions of mRNA. This concept substantially extends the meaning of disease-related SNPs identified in non protein-coding transcribed sequences within or close to miRNA binding sites. PMID:22664914

  19. Reliable Detection of Paternal SNPs within Deletion Breakpoints for Non-Invasive Prenatal Exclusion of Homozygous α0-Thalassemia in Maternal Plasma

    PubMed Central

    Yan, Ti-Zhen; Mo, Qiu-Hua; Cai, Ren; Chen, Xue; Zhang, Cui-Mei; Liu, Yan-Hui; Chen, Ya-Jun; Zhou, Wan-Jun; Xiong, Fu; Xu, Xiang-Min

    2011-01-01

    Reliable detection of large deletions from cell-free fetal DNA (cffDNA) in maternal plasma is challenging, especially when both parents have the same deletion owing to a lack of specific markers for fetal genotyping. In order to evaluate the efficacy of a non-invasive prenatal diagnosis (NIPD) test to exclude α-thalassemia major that uses SNPs linked to the normal paternal α-globin allele, we established a novel protocol to reliably detect paternal SNPs within the (--SEA) breakpoints and performed evaluation of the diagnostic potential of the protocol in a total of 67 pregnancies, in whom plasma samples were collected prior to invasive obstetrics procedures in southern China. A group of nine SNPs identified within the deletion breakpoints were scanned to select the informative SNPs in each of the 67 couples' DNA by multiplex PCR based mini-sequencing technique. The paternally inherited SNP allele from cffDNA was detected by allele specific real-time PCR. A protocol for reliable detection of paternal SNPs within the (--SEA) breakpoints was established and evaluation of the diagnostic potential of the protocol was performed in a total of 67 pregnancies. In 97% of the couples one or more different SNPs within the deletion breakpoint occurred between paternal and maternal alleles. Homozygosity for the (--SEA) deletion was accurately excluded in 33 out of 67 (49.3%, 95% CI, 25.4–78.6%) pregnancies through the implementation of the protocol. The protocol was completely concordant with the traditional reference methods, except for two cases that exhibited uncertain results due to sample hemolysis. This method could be used as a routine NIPD test to exclude gross fetal deletions in α-thalassemia major, and could further be employed to test for other diseases due to gene deletion. PMID:21980356

  20. Gradient Boosting as a SNP Filter: an Evaluation Using Simulated and Hair Morphology Data

    PubMed Central

    Lubke, GH; Laurin, C; Walters, R; Eriksson, N; Hysi, P; Spector, TD; Montgomery, GW; Martin, NG; Medland, SE; Boomsma, DI

    2013-01-01

    Typically, genome-wide association studies consist of regressing the phenotype on each SNP separately using an additive genetic model. Although statistical models for recessive, dominant, SNP-SNP, or SNP-environment interactions exist, the testing burden makes an evaluation of all possible effects impractical for genome-wide data. We advocate a two-step approach where the first step consists of a filter that is sensitive to different types of SNP main and interaction effects. The aim is to substantially reduce the number of SNPs such that more specific modeling becomes feasible in a second step. We provide an evaluation of a statistical learning method called “gradient boosting machine” (GBM) that can be used as a filter. GBM does not require an a priori specification of a genetic model, and permits inclusion of large numbers of covariates. GBM can therefore be used to explore multiple GxE interactions, which would not be feasible within the parametric framework used in GWAS. We show in a simulation that GBM performs well even under conditions favorable to the standard additive regression model commonly used in GWAS, and is sensitive to the detection of interaction effects even if one of the interacting variables has a zero main effect. The latter would not be detected in GWAS. Our evaluation is accompanied by an analysis of empirical data concerning hair morphology. We estimate the phenotypic variance explained by increasing numbers of highest ranked SNPs, and show that it is sufficient to select 10K-20K SNPs in the first step of a two-step approach. PMID:24404405
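
    The first step of the two-step approach can be sketched with a toy gradient boosting filter. This is not the authors' GBM software: it is a deliberately small implementation with squared-error loss and one-split trees (stumps), so it can pick up non-additive main effects such as a recessive effect but, unlike deeper trees, not interactions. The simulated data, the effect sizes, and all names are invented for this illustration.

```python
import random

def best_stump(genotypes, residuals):
    """Find the single-SNP split that most reduces squared error."""
    n = len(residuals)
    total = sum(residuals)
    best = None
    for j in range(len(genotypes[0])):
        for cut in (0.5, 1.5):  # 0 vs 1/2 copies, or 0/1 vs 2 copies
            left = [r for row, r in zip(genotypes, residuals) if row[j] <= cut]
            right = [r for row, r in zip(genotypes, residuals) if row[j] > cut]
            if not left or not right:
                continue
            # Squared-error reduction of a full step relative to a constant fit
            gain = (sum(left) ** 2 / len(left) + sum(right) ** 2 / len(right)
                    - total ** 2 / n)
            if best is None or gain > best[0]:
                best = (gain, j, cut, sum(left) / len(left), sum(right) / len(right))
    return best

def gbm_importance(genotypes, phenotype, rounds=50, rate=0.1):
    """Boost stumps on residuals and accumulate split gain per SNP."""
    n = len(phenotype)
    pred = [sum(phenotype) / n] * n
    importance = [0.0] * len(genotypes[0])
    for _ in range(rounds):
        residuals = [y - f for y, f in zip(phenotype, pred)]
        stump = best_stump(genotypes, residuals)
        if stump is None:
            break
        gain, j, cut, left_mean, right_mean = stump
        importance[j] += gain
        pred = [f + rate * (left_mean if row[j] <= cut else right_mean)
                for row, f in zip(genotypes, pred)]
    return importance

def top_snps(importance, k):
    """Step one of the two-step approach: keep the k highest-ranked SNPs."""
    return sorted(range(len(importance)), key=lambda j: importance[j],
                  reverse=True)[:k]

# Simulated data: SNP 3 acts recessively, SNP 7 additively, the rest are noise.
rng = random.Random(0)
genotypes = [[rng.choice((0, 1, 2)) for _ in range(10)] for _ in range(200)]
phenotype = [1.0 * (row[3] == 2) + 0.5 * row[7] + rng.gauss(0.0, 0.3)
             for row in genotypes]
importance = gbm_importance(genotypes, phenotype)
selected = top_snps(importance, 2)
print(sorted(selected))
```

    In a real analysis the retained SNPs would then go into the more specific second-step models; the cutoff `k` here plays the role of the 10K-20K figure quoted in the abstract.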

  1. LDGIdb: a database of gene interactions inferred from long-range strong linkage disequilibrium between pairs of SNPs

    PubMed Central

    2012-01-01

    Background: Complex human diseases may be associated with many gene interactions. Gene interactions take several different forms and it is difficult to identify all of the interactions that are potentially associated with human diseases. One approach that may fill this knowledge gap is to infer previously unknown gene interactions via identification of non-physical linkages between different mutations (or single nucleotide polymorphisms, SNPs) to avoid the hitchhiking effect or lack of recombination. Strong non-physical SNP linkages are considered to be an indication of biological (gene) interactions. These interactions can be physical protein interactions, regulatory interactions, functional compensation/antagonization or many other forms of interactions. Previous studies have shown that mutations in different genes can be linked to the same disorders. Therefore, non-physical SNP linkages, coupled with knowledge of SNP-disease associations, may shed more light on the role of gene interactions in human disorders. A user-friendly web resource that integrates information about non-physical SNP linkages, gene annotations, SNP information, and SNP-disease associations may thus be a good reference for biomedical research. Findings: Here we extracted the SNPs located within the promoter or exonic regions of protein-coding genes from the HapMap database to construct a database named the Linkage-Disequilibrium-based Gene Interaction database (LDGIdb). The database stores 646,203 potential human gene interactions, which are potential interactions inferred from SNP pairs that are subject to long-range strong linkage disequilibrium (LD), or non-physical linkages. To minimize the possibility of hitchhiking, SNP pairs inferred to be non-physically linked were required to be located in different chromosomes or in different LD blocks of the same chromosomes.
    According to the genomic locations of the involved SNPs (i.e., promoter, untranslated region (UTR) and coding region (CDS)), the SNP linkages inferred were categorized into promoter-promoter, promoter-UTR, promoter-CDS, CDS-CDS, CDS-UTR and UTR-UTR linkages. For the CDS-related linkages, the coding SNPs were further classified into nonsynonymous and synonymous variations, which represent potential gene interactions at the protein and RNA level, respectively. The LDGIdb also incorporates human disease-association databases such as Genome-Wide Association Studies (GWAS) and Online Mendelian Inheritance in Man (OMIM), so that the user can search for potential disease-associated SNP linkages. The inferred SNP linkages are also classified in the context of population stratification to provide a resource for investigating potential population-specific gene interactions. Conclusion: The LDGIdb is a user-friendly resource that integrates non-physical SNP linkages and SNP-disease associations for studies of gene interactions in human diseases. With the help of the LDGIdb, it is plausible to infer population-specific SNP linkages for more focused studies, an avenue that is potentially important for pharmacogenetics. Moreover, by referring to disease-association information such as the GWAS data, the LDGIdb may help identify previously uncharacterized disease-associated gene interactions and potentially lead to new discoveries in studies of human diseases. Keywords: Gene interaction, SNP, Linkage disequilibrium, Systems biology, Bioinformatics. PMID:22551073
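
    As a small illustration of how a "long-range strong LD" pair might be flagged, the sketch below computes the standard r² statistic from phased 0/1 haplotypes and applies the location rule described above (different chromosomes, or different LD blocks on the same chromosome). The 0.8 cutoff, the record fields `chrom` and `block`, and the function names are assumptions for this example; the abstract does not state the thresholds or data model LDGIdb uses.

```python
def ld_r2(hap_a, hap_b):
    """r^2 between two biallelic SNPs from phased haplotypes coded 0/1."""
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("monomorphic SNP")
    return d * d / denom

def is_long_range_candidate(snp1, snp2, r2, min_r2=0.8):
    """Strong LD between SNPs that are not physically linked (hypothetical rule)."""
    separated = (snp1["chrom"] != snp2["chrom"]
                 or snp1["block"] != snp2["block"])
    return separated and r2 >= min_r2

perfect = ld_r2([1, 1, 0, 0], [1, 1, 0, 0])
independent = ld_r2([1, 1, 0, 0], [1, 0, 1, 0])
flag = is_long_range_candidate({"chrom": "1", "block": 12},
                               {"chrom": "7", "block": 3}, perfect)
print(perfect, independent, flag)
```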

  2. Genetic Diversity of Sheep Breeds from Albania, Greece, and Italy Assessed by Mitochondrial DNA and Nuclear Polymorphisms (SNPs)

    PubMed Central

    Pariset, Lorraine; Mariotti, Marco; Gargani, Maria; Joost, Stephane; Negrini, Riccardo; Perez, Trinidad; Bruford, Michael; Ajmone Marsan, Paolo; Valentini, Alessio

    2011-01-01

    We employed mtDNA and nuclear SNPs to investigate the genetic diversity of sheep breeds of three countries of the Mediterranean basin: Albania, Greece, and Italy. In total, 154 unique mtDNA haplotypes were detected by means of D-loop sequence analysis. The major nucleotide diversity was observed in Albania. We identified haplogroups, A, B, and C in Albanian and Greek samples, while Italian individuals clustered in groups A and B. In general, the data show a pattern reflecting old migrations that occurred in postneolithic and historical times. PCA analysis on SNP data differentiated breeds with good correspondence to geographical locations. This could reflect geographical isolation, selection operated by local sheep farmers, and different flock management and breed admixture that occurred in the last centuries. PMID:22125424

  3. Introgression and phenotypic assimilation in Zimmerius flycatchers (Tyrannidae): population genetic and phylogenetic inferences from genome-wide SNPs.

    PubMed

    Rheindt, Frank E; Fujita, Matthew K; Wilton, Peter R; Edwards, Scott V

    2014-03-01

    Genetic introgression is pervasive in nature and may lead to large-scale phenotypic assimilation and/or admixture of populations, but there is limited knowledge on whether large phenotypic changes are typically accompanied by high levels of introgression throughout the genome. Using bioacoustic, biometric, and spectrophotometric data from a flycatcher (Tyrannidae) system in the Neotropical genus Zimmerius, we document a mosaic pattern of phenotypic admixture in which a population of Zimmerius viridiflavus in northern Peru (henceforth "mosaic") is vocally and biometrically similar to conspecifics to the south but shares plumage characteristics with a different species (Zimmerius chrysops) to the north. To clarify the origins of the mosaic population, we used the RAD-seq approach to generate a data set of 37,361 genome-wide single nucleotide polymorphisms (SNPs). A range of population-genetic diagnostics shows that the genome of the mosaic population is largely indistinguishable from southern Z. viridiflavus and distinct from northern Z. chrysops, and the application of parsimony and species tree methods to the genome-wide SNP data set confirms the close affinity of the mosaic population with southern Z. viridiflavus. Even so, using a subset of 2710 SNPs found across all sampled lineages in configurations appropriate for a recently proposed statistical ("ABBA/BABA") test that distinguishes gene flow from incomplete lineage sorting, we detected low levels of gene flow from northern Z. chrysops into the mosaic population. Mapping the candidate loci for introgression from Z. chrysops into the mosaic population to the zebra finch genome reveals close linkage with genes significantly enriched in functions involving cell projection and plasma membranes. Introgression of key alleles may have led to phenotypic assimilation in the plumage of mosaic birds, suggesting that selection may have been a key factor facilitating introgression. PMID:24304652
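
    The "ABBA/BABA" test mentioned in this record is commonly summarized by Patterson's D statistic, which contrasts the counts of the two discordant site patterns. The sketch below is only an illustration under that assumption: the function name and the example counts are hypothetical, the record does not state which exact statistic or significance procedure the authors used, and a real analysis would also need a significance test (for example a block jackknife), which is omitted here.

```python
def patterson_d(n_abba: int, n_baba: int) -> float:
    """Patterson's D = (ABBA - BABA) / (ABBA + BABA).

    Values near 0 are what incomplete lineage sorting alone would produce;
    a clear excess of one pattern is the signal usually read as gene flow.
    """
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no informative ABBA/BABA sites")
    return (n_abba - n_baba) / total


# Hypothetical site-pattern counts, not taken from the study.
d_example = patterson_d(60, 40)
```

    With these made-up counts D is 0.2, i.e. an excess of ABBA over BABA patterns.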

  4. SNPs altering ammonium transport activity of human Rhesus factors characterized by a yeast-based functional assay.

    PubMed

    Deschuyteneer, Aude; Boeckstaens, Mélanie; De Mees, Christelle; Van Vooren, Pascale; Wintjens, René; Marini, Anna Maria

    2013-01-01

    Proteins of the conserved Mep-Amt-Rh family, including mammalian Rhesus factors, mediate transmembrane ammonium transport. Ammonium is an important nitrogen source for the biosynthesis of amino acids but is also a metabolic waste product. Its disposal in urine plays a critical role in the regulation of the acid/base homeostasis, especially with an acid diet, a trait of Western countries. Ammonium accumulation above a certain concentration is however pathologic, the cytotoxicity causing fatal cerebral paralysis in acute cases. Alteration in ammonium transport via human Rh proteins could have clinical outcomes. We used a yeast-based expression assay to characterize human Rh variants resulting from non synonymous single nucleotide polymorphisms (nsSNPs) with known or unknown clinical phenotypes and assessed their ammonium transport efficiency, protein level, localization and potential trans-dominant impact. The HsRhAG variants (I61R, F65S) associated to overhydrated hereditary stomatocytosis (OHSt), a disease affecting erythrocytes, proved affected in intrinsic bidirectional ammonium transport. Moreover, this study reveals that the R202C variant of HsRhCG, the orthologue of mouse MmRhcg required for optimal urinary ammonium excretion and blood pH control, shows an impaired inherent ammonium transport activity. Urinary ammonium excretion was RHcg gene-dose dependent in mouse, highlighting MmRhcg as a limiting factor. HsRhCG(R202C) may confer susceptibility to disorders leading to metabolic acidosis for instance. Finally, the analogous R211C mutation in the yeast ScMep2 homologue also impaired intrinsic activity consistent with a conserved functional role of the preserved arginine residue. The yeast expression assay used here constitutes an inexpensive, fast and easy tool to screen nsSNPs reported by high throughput sequencing or individual cases for functional alterations in Rh factors revealing potential causal variants. PMID:23967154

  5. SNPs Altering Ammonium Transport Activity of Human Rhesus Factors Characterized by a Yeast-Based Functional Assay

    PubMed Central

    Deschuyteneer, Aude; Boeckstaens, Mélanie; De Mees, Christelle; Van Vooren, Pascale; Wintjens, René; Marini, Anna Maria

    2013-01-01

    Proteins of the conserved Mep-Amt-Rh family, including mammalian Rhesus factors, mediate transmembrane ammonium transport. Ammonium is an important nitrogen source for the biosynthesis of amino acids but is also a metabolic waste product. Its disposal in urine plays a critical role in the regulation of the acid/base homeostasis, especially with an acid diet, a trait of Western countries. Ammonium accumulation above a certain concentration is however pathologic, the cytotoxicity causing fatal cerebral paralysis in acute cases. Alteration in ammonium transport via human Rh proteins could have clinical outcomes. We used a yeast-based expression assay to characterize human Rh variants resulting from non synonymous single nucleotide polymorphisms (nsSNPs) with known or unknown clinical phenotypes and assessed their ammonium transport efficiency, protein level, localization and potential trans-dominant impact. The HsRhAG variants (I61R, F65S) associated to overhydrated hereditary stomatocytosis (OHSt), a disease affecting erythrocytes, proved affected in intrinsic bidirectional ammonium transport. Moreover, this study reveals that the R202C variant of HsRhCG, the orthologue of mouse MmRhcg required for optimal urinary ammonium excretion and blood pH control, shows an impaired inherent ammonium transport activity. Urinary ammonium excretion was RHcg gene-dose dependent in mouse, highlighting MmRhcg as a limiting factor. HsRhCG(R202C) may confer susceptibility to disorders leading to metabolic acidosis for instance. Finally, the analogous R211C mutation in the yeast ScMep2 homologue also impaired intrinsic activity consistent with a conserved functional role of the preserved arginine residue. The yeast expression assay used here constitutes an inexpensive, fast and easy tool to screen nsSNPs reported by high throughput sequencing or individual cases for functional alterations in Rh factors revealing potential causal variants. PMID:23967154

  6. Identification of bovine NPC1 gene cSNPs and their effects on body size traits of Qinchuan cattle.

    PubMed

    Dang, Yonglong; Li, Mingxun; Yang, Mingjuan; Cao, Xiukai; Lan, Xianyong; Lei, Chuzhao; Zhang, Chunlei; Lin, Qing; Chen, Hong

    2014-05-01

    NPC1 gene is an important gene closely related to the Niemann-Pick type C (NPC). Mutations in the NPC1 gene tend to cause Niemann-Pick type C, a lysosomal storage disorder. Previous studies have shown that NPC1 protein plays an important role in subcellular lipid transport, homeostasis, platelet function and formation, which are basic metabolic activities in the process of development. In this study, to explore the association between the NPC1 gene variation and body size traits in Qinchuan cattle, we detected four novel coding single nucleotide polymorphisms (cSNPs) in the bovine NPC1 gene, including one missense mutation (SNP1) and three synonymous mutations (SNP2, SNP3 and SNP4). Population genetic analyses of 518 individuals and association correlations between cSNPs and bovine body size traits were conducted in this research. A missense mutation at SNP1 locus was found to be significantly related to the heart girth, hip width and body weight (P<0.01 or P<0.05, 3.5-year-old). Two synonymous mutations at SNP2 and SNP3 loci also showed significant effects on hip width (P<0.05, 3.5-year-old). One synonymous mutation at SNP4 locus showed significant effect on body weight (P<0.05, 2.0-year-old). Combined haplotypes H2H6 and H6H6 showed significant effects on body size traits such as heart girth, hip width, and body weight (3.5-year-old, P<0.01 or P<0.05). This study provides evidence that the NPC1 gene might be involved in the regulation of bovine growth and body development, and may be considered as a candidate gene for marker assisted selection (MAS) in beef cattle breeding industry. PMID:24607034

  7. “Replicated” genome wide association for dependence on illegal substances: genomic regions identified by overlapping clusters of nominally positive SNPs

    PubMed Central

    Drgon, Tomas; Johnson, Catherine; Nino, Michelle; Drgonova, Jana; Walther, Donna; Uhl, George R

    2010-01-01

    Declaring “replication” from results of genome wide association (GWA) studies is straightforward when major gene effects provide genome-wide significance for association of the same allele of the same SNP in each of multiple independent samples. However, such unambiguous replication may be unlikely when phenotypes display polygenic genetic architecture, allelic heterogeneity, locus heterogeneity and when different samples display linkage disequilibria with different fine structures. We seek chromosomal regions that are tagged by clustered SNPs that display nominally-significant association in each of several independent samples. This approach provides one “nontemplate” approach to identifying overall replication of groups of GWA results in the face of difficult genetic architectures. We apply this strategy to 1M SNP Affymetrix and Illumina GWA results for dependence on illegal substances. This approach provides high confidence in rejecting the null hypothesis that chance alone accounts for the extent to which clustered, nominally-significant SNPs from samples of the same racial/ethnic background identify the same chromosomal regions. There is more modest confidence in: a) identification of individual chromosomal regions and genes and b) overlap between results from samples of different racial/ethnic backgrounds. The strong overlap identified among the samples with similar racial/ethnic backgrounds, together with prior work that identified overlapping results in samples of different racial/ethnic backgrounds, support contributions to individual differences in vulnerability to addictions that come from both relatively older allelic variants that are common in many current human populations and newer allelic variants that are common in fewer current human populations. PMID:21302341

  8. Meta-Analysis of Genome-Wide Studies Identifies MEF2C SNPs Associated with Bone Mineral Density at Forearm

    PubMed Central

    Zheng, Hou-Feng; Duncan, Emma; Yerges-Armstrong, Laura M.; Eriksson, Joel; Bergström, Ulrica; Leo, Paul J.; Leslie, William D.; Goltzman, David; Blangero, John; Hanley, David A.; Carless, Melanie A.; Streeten, Elizabeth A.; Lorentzon, Mattias; Brown, Matthew A.; Spector, Tim D.; Pettersson-Kymmer, Ulrika; Ohlsson, Claes; Mitchell, Braxton D.; Richards, J. Brent

    2013-01-01

    Background Forearm fractures affect 1.7 million individuals worldwide each year and most occur earlier in life than hip fractures. While the heritability of forearm bone mineral density (BMD) and fracture is high, their genetic determinants are largely unknown. Aim To identify genetic variants associated with forearm BMD and forearm fractures. Methods BMD at distal radius measured by dual-energy X-ray absorptiometry was tested for association with common genetic variants. We conducted a meta-analysis of genome-wide association studies for BMD in 5,866 subjects of European descent and then selected variants for replication in 715 Mexican American samples. Gene-based association was carried out to supplement the single-SNP test. We then tested the BMD-associated SNPs for association with forearm fracture in 2,023 cases and 3,740 controls. Results We found that five SNPs in the introns of MEF2C were associated with forearm BMD at a genome-wide significance level (P<5×10^-8) in meta-analysis (lead SNP, rs11951031[T] -0.20 standard deviations per allele, P=9.01×10^-9). The gene-based association test suggested an association between MEF2C and forearm BMD (P=0.003). The association between MEF2C variants and risk of fracture did not achieve statistical significance (SNP rs12521522[A]: odds ratio = 1.14 [95% CI: 0.92–1.35], P = 0.14). Meta analysis also revealed two genome-wide suggestive loci at CTNNA2 and 6q23.2. Conclusion These findings demonstrate that variants at MEF2C were associated with forearm BMD thereby implicating this gene in the determination of bone mineral density at forearm. PMID:23572186

  9. Case-Only Gene–Environment Interaction Between ALAD tagSNPs and Occupational Lead Exposure in Prostate Cancer

    PubMed Central

    Neslund-Dudas, Christine; Levin, Albert M.; Rundle, Andrew; Beebe-Dimmer, Jennifer; Bock, Cathryn H.; Nock, Nora L.; Jankowski, Michelle; Datta, Indrani; Krajenta, Richard; Dou, Q. Ping; Mitra, Bharati; Tang, Deliang; Rybicki, Benjamin A.

    2014-01-01

    BACKGROUND Black men have historically had higher blood lead levels than white men in the U.S. and have the highest incidence of prostate cancer in the world. Inorganic lead has been classified as a probable human carcinogen. Lead (Pb) inhibits delta-aminolevulinic acid dehydratase (ALAD), a gene recently implicated in other genitourinary cancers. The ALAD enzyme is involved in the second step of heme biosynthesis and is an endogenous inhibitor of the 26S proteasome, a master system for protein degradation and a current target of cancer therapy. METHODS Using a case-only study design, we assessed potential gene–environment (G × E) interactions between lifetime occupational Pb exposure and 11 tagSNPs within ALAD in black (N = 260) and white (N = 343) prostate cancer cases. RESULTS Two ALAD tagSNPs in high linkage disequilibrium showed significant interaction with high Pb exposure among black cases (rs818684 interaction odds ratio or IOR = 2.73, 95% CI 1.43–5.22, P = 0.002; rs818689 IOR = 2.20, 95% CI 1.15–4.21, P = 0.017) and an additional tagSNP, rs2761016, showed G × E interaction with low Pb exposure (IOR = 2.08, 95% CI 1.13– 3.84, P = 0.019). Further, the variant allele of rs818684 was associated with a higher Gleason grade in those with high Pb exposure among both blacks (OR 3.96, 95% CI 1.01–15.46, P = 0.048) and whites (OR 2.95, 95% CI 1.18–7.39, P = 0.020). CONCLUSIONS Genetic variation in ALAD may modify associations between Pb and prostate cancer. Additional studies of ALAD, Pb, and prostate cancer are warranted and should include black men. PMID:24500903

  10. Filter cake characterization studies

    SciTech Connect

    Newby, R.A.; Smeltzer, E.E.; Alvin, M.A.; Lippert, T.E.

    1995-11-01

    The Westinghouse Electric Corporation, Science & Technology Center is developing an Integrated Low Emissions Cleanup (ILEC) concept for high-temperature gas cleaning to meet environmental standards, as well as to provide gas turbine protection. The ILEC system is a ceramic barrier hot gas filter (HGF) that removes particulate while simultaneously contributing to the control of sulfur, alkali, and potentially other contaminants in high-temperature, high-pressure fuel gases, or combustion gases. The gas-phase contaminant removal is performed by sorbent particles injected into the HGF. The overall objective of this program is to demonstrate, at a bench scale, the technical feasibility of the ILEC concept for multi-contaminant control, and to provide test data applicable to the design of subsequent field tests. The program has conducted ceramic barrier filter testing under simulated PFBC conditions to resolve issues relating to filter cake permeability, pulse cleaning, and filter cake additive performance. ILEC testing has also been performed to assess the potential for in-filter sulfur and alkali removal.

  11. Filter component assessment

    SciTech Connect

    Alvin, M.A.; Lippert, T.E.; Diaz, E.S.; Smeltzer, E.W. [Westinghouse Electric Corp., Pittsburgh, PA (United States). Science and Technology Center

    1995-11-01

    The objectives of this program are to provide a more ruggedized filter system that utilizes porous ceramic filters which have improved resistance to damage resulting from crack propagation, thermal fatigue and/or thermal excursions during plant or process transient conditions, and/or mechanical ash bridging events within the candle filter array. As part of the current Phase 1, Task 1, effort of this program, Westinghouse is evaluating the filtration characteristics, mechanical integrity, and corrosion resistance of the following advanced or second generation candle filters for use in advanced coal-fired process applications: 3M CVI-SiC composite--chemical vapor infiltration of silicon carbide into an aluminosilicate Nextel{trademark} 312 fiber preform; DuPont PRD-66--filament wound candle filter structure containing corundum, cordierite, cristobalite, and mullite; DuPont SiC-SiC--chemical infiltration of silicon carbide into a silicon carbide Nicalon{trademark} fiber mat or felt preform; and IF and P Fibrosic{trademark}--vacuum infiltrated oxide-based chopped fibrous matrix. Results to date are presented.

  12. Stack filter classifiers

    SciTech Connect

    Porter, Reid B [Los Alamos National Laboratory; Hush, Don [Los Alamos National Laboratory

    2009-01-01

    Just as linear models generalize the sample mean and weighted average, weighted order statistic models generalize the sample median and weighted median. This analogy can be continued informally to generalized additive models in the case of the mean, and Stack Filters in the case of the median. Both of these model classes have been extensively studied for signal and image processing, but it is surprising to find that for pattern classification, their treatment has been significantly one-sided. Generalized additive models are now a major tool in pattern classification, and many different learning algorithms have been developed to fit model parameters to finite data. However, Stack Filters remain largely confined to signal and image processing, and learning algorithms for classification are yet to be seen. This paper is a step towards Stack Filter Classifiers and it shows that the approach is interesting from both a theoretical and a practical perspective.
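
    To make the mean/median analogy in this record concrete, here is a minimal sketch of a weighted median, the simplest weighted order statistic. It is not the authors' Stack Filter classifier; the function name and the tie-breaking convention (return the lower candidate when the cumulative weight lands exactly on half the total) are assumptions made for illustration.

```python
def weighted_median(values, weights):
    """Smallest value whose cumulative weight reaches half the total weight.

    With equal weights this reduces to the ordinary median
    (the lower of the two middle values when the count is even).
    """
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    half = sum(weights) / 2
    running = 0.0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= half:
            return value


plain = weighted_median([3, 1, 2], [1, 1, 1])     # ordinary median
skewed = weighted_median([1, 2, 10], [1, 1, 5])   # heavy weight pulls the result to 10
```

    Just as changing the weights of a weighted average shifts the mean, changing the weights here shifts which order statistic is selected.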

  13. Tunable ASE filters

    NASA Astrophysics Data System (ADS)

    Ma, Eugene Y.; Hazell, John F.; Murano, Robert

    2005-02-01

    The advantages of low cost amplifier solutions in single-channel link extender or loss compensator systems cannot be fully realized unless the ASE noise around the signal peak is removed. Doing so requires a cost-effective solution with high performance, including low insertion loss (<-2.5dB), low PDL (<-0.25dB), low power operation (<200mW), and fast tuning (<1sec). We have successfully fabricated and packaged a tunable ASE filter into a small form-factor 2-port package which meets these requirements. We obtain filter properties at both the chip and package-levels and examine filter performance operating under optically open and closed loop control.

  14. Firewall: Packet Filters

    NSDL National Science Digital Library

    Tabbutt, Douglas

    This brief interactive activity, by Electromechanical Digital Library and Wisconsin Technical College System faculty members Joseph Wetzel and Douglas Tabbutt, provides an interactive lesson on packet filters to demonstrate their use. After a brief explanation, there is a very useful activity that allows the user to control theoretical settings to try to block a certain port from transferring information; it's a good test to verify if the concept is understood. This should prove a useful resource for students or teachers looking to introduce the concept of packet filters.

  15. INTERIOR VIEW OF FILTER WHEEL MACHINE USED TO FILTER OUT ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    INTERIOR VIEW OF FILTER WHEEL MACHINE USED TO FILTER OUT AND SEPARATE BICARBONATE FROM AMMONIONATED BRINE. DISCHARGE FROM STRIPPER COLUMNS (SOLVAY COLUMNS). - Solvay Process Company, SA Wetside Building, Between Willis & Milton Avenue, Solvay, Onondaga County, NY

  16. Filter assembly for metallic and intermetallic tube filters

    DOEpatents

    Alvin, Mary Anne (113 Lehr Ave., Pittsburgh, PA 15223); Lippert, Thomas E. (3205 Cambridge Rd., Murrysville, PA 15668); Bruck, Gerald J. (4469 Sardis Rd., Murrysville, PA 15668); Smeltzer, Eugene E. (R.D. 7, Box 267-I, Italy Rd., Export, PA 15632-9621)

    2001-01-01

    A filter assembly (60) for holding a filter element (28) within a hot gas cleanup system pressure vessel is provided, containing: a filter housing (62), said filter housing having a certain axial length and having a peripheral sidewall, said sidewall defining an interior chamber (66); a one piece, all metal, fail-safe/regenerator device (68) within the interior chamber (66) of the filter housing (62) and/or extending beyond the axial length of the filter housing, said device containing an outward extending radial flange (71) within the filter housing for seating an essential seal (70), the device also having heat transfer media (72) disposed inside and screens (80) for particulate removal; one compliant gasket (70) positioned next to and above the outward extending radial flange of the fail-safe/regenerator device; and a porous metallic corrosion resistant superalloy type filter element body welded at the bottom of the metal fail-safe/regenerator device.

  17. Microphone Array PostFilter Based on Auditory Filtering

    Microsoft Academic Search

    Peng Li; Fengchai Liao; Ning Cheng; Bo Xu; Wenju Liu

    2008-01-01

    In this paper, an auditory filtering based microphone array post-filter is proposed to enhance the quality of the output signal. By using a gammatone filterbank to band pass each input of the array, the input signals are decomposed into a two-dimensional T-F representation. Then, for each auditory filter channel, the post-filter's coefficients are estimated in each frame using the decomposed

  18. Novel Approach Identifies SNPs in SLC2A10 and KCNK9 with Evidence for Parent-of-Origin Effect on Body Mass Index

    PubMed Central

    Hoggart, Clive J.; Venturini, Giulia; Mangino, Massimo; Gomez, Felicia; Ascari, Giulia; Zhao, Jing Hua; Teumer, Alexander; Winkler, Thomas W.; Tšernikova, Natalia; Luan, Jian'an; Mihailov, Evelin; Ehret, Georg B.; Zhang, Weihua; Lamparter, David; Esko, Tõnu; Macé, Aurelien; Rüeger, Sina; Bochud, Pierre-Yves; Barcella, Matteo; Dauvilliers, Yves; Benyamin, Beben; Evans, David M.; Hayward, Caroline; Lopez, Mary F.; Franke, Lude; Russo, Alessia; Heid, Iris M.; Salvi, Erika; Vendantam, Sailaja; Arking, Dan E.; Boerwinkle, Eric; Chambers, John C.; Fiorito, Giovanni; Grallert, Harald; Guarrera, Simonetta; Homuth, Georg; Huffman, Jennifer E.; Porteous, David; Moradpour, Darius; Iranzo, Alex; Hebebrand, Johannes; Kemp, John P.; Lammers, Gert J.; Aubert, Vincent; Heim, Markus H.; Martin, Nicholas G.; Montgomery, Grant W.; Peraita-Adrados, Rosa; Santamaria, Joan; Negro, Francesco; Schmidt, Carsten O.; Scott, Robert A.; Spector, Tim D.; Strauch, Konstantin; Völzke, Henry; Wareham, Nicholas J.; Yuan, Wei; Bell, Jordana T.; Chakravarti, Aravinda; Kooner, Jaspal S.; Peters, Annette; Matullo, Giuseppe; Wallaschofski, Henri; Whitfield, John B.; Paccaud, Fred; Vollenweider, Peter; Bergmann, Sven; Beckmann, Jacques S.; Tafti, Mehdi; Hastie, Nicholas D.; Cusi, Daniele; Bochud, Murielle; Frayling, Timothy M.; Metspalu, Andres; Jarvelin, Marjo-Riitta; Scherag, André; Smith, George Davey; Borecki, Ingrid B.; Rousson, Valentin; Hirschhorn, Joel N.; Rivolta, Carlo; Loos, Ruth J. F.; Kutalik, Zoltán

    2014-01-01

    The phenotypic effect of some single nucleotide polymorphisms (SNPs) depends on their parental origin. We present a novel approach to detect parent-of-origin effects (POEs) in genome-wide genotype data of unrelated individuals. The method exploits increased phenotypic variance in the heterozygous genotype group relative to the homozygous groups. We applied the method to >56,000 unrelated individuals to search for POEs influencing body mass index (BMI). Six lead SNPs were carried forward for replication in five family-based studies (of ~4,000 trios). Two SNPs replicated: the paternal rs2471083-C allele (located near the imprinted KCNK9 gene) and the paternal rs3091869-T allele (located near the SLC2A10 gene) increased BMI equally (beta = 0.11 (SD), P<0.0027) compared to the respective maternal alleles. Real-time PCR experiments of lymphoblastoid cell lines from the CEPH families showed that expression of both genes was dependent on parental origin of the SNPs alleles (P<0.01). Our scheme opens new opportunities to exploit GWAS data of unrelated individuals to identify POEs and demonstrates that they play an important role in adult obesity. PMID:25078964
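
    The idea of comparing phenotypic variance in heterozygotes against homozygotes can be sketched as a simple variance ratio. This is an illustration only: the record does not give the authors' actual test statistic, so the function names, the genotype coding (0, 1, 2 copies of one allele) and the choice to pool the two homozygous groups around their own means are all assumptions.

```python
from statistics import mean


def pooled_within_variance(groups):
    """Pooled sample variance of values around their own group means."""
    squares, dof = 0.0, 0
    for values in groups:
        if len(values) < 2:
            continue  # a single value carries no variance information
        centre = mean(values)
        squares += sum((v - centre) ** 2 for v in values)
        dof += len(values) - 1
    if dof == 0:
        raise ValueError("need at least one group with two or more values")
    return squares / dof


def het_to_hom_variance_ratio(genotypes, phenotypes):
    """Variance among heterozygotes divided by pooled variance among homozygotes."""
    by_genotype = {0: [], 1: [], 2: []}
    for g, y in zip(genotypes, phenotypes):
        by_genotype[g].append(y)
    hom_var = pooled_within_variance([by_genotype[0], by_genotype[2]])
    if hom_var == 0:
        raise ValueError("homozygous groups show no variance")
    return pooled_within_variance([by_genotype[1]]) / hom_var


# Toy data, not from the study: heterozygotes (genotype 1) are more spread out.
ratio = het_to_hom_variance_ratio([0, 0, 1, 1, 2, 2], [1, 3, 0, 4, 5, 7])
```

    A ratio well above 1 is the pattern the abstract describes; centring each homozygous group on its own mean keeps an ordinary additive effect of the SNP from inflating the denominator.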

  19. Development and Validation of Single Nucleotide Polymorphisms (SNPs) Markers from Two Transcriptome 454-Runs of Turbot (Scophthalmus maximus) Using High-Throughput Genotyping

    PubMed Central

    Vera, Manuel; Alvarez-Dios, Jose-Antonio; Fernandez, Carlos; Bouza, Carmen; Vilas, Roman; Martinez, Paulino

    2013-01-01

    The turbot (Scophthalmus maximus) is a commercially valuable flatfish and one of the most promising aquaculture species in Europe. Two transcriptome 454-pyrosequencing runs were used in order to detect Single Nucleotide Polymorphisms (SNPs) in genes related to immune response and gonad differentiation. A total of 866 true SNPs were detected in 140 different contigs representing 262,093 bp as a whole. Only one true SNP was analyzed in each contig. One hundred and thirteen SNPs out of the 140 analyzed were feasible (genotyped), while 111 were polymorphic in a wild population. Transition/transversion ratio (1.354) was similar to that observed in other fish studies. Unbiased gene diversity (He) estimates ranged from 0.060 to 0.510 (mean = 0.351), minimum allele frequency (MAF) from 0.030 to 0.500 (mean = 0.259) and all loci were in Hardy-Weinberg equilibrium after Bonferroni correction. A large number of SNPs (49) were located in the coding region, 33 representing synonymous and 16 non-synonymous changes. Most SNP-containing genes were related to immune response and gonad differentiation processes, and could be candidates for functional changes leading to phenotypic changes. These markers will be useful for population screening to look for adaptive variation in wild and domestic turbot. PMID:23481633
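
    The MAF and unbiased gene diversity (He) figures quoted in this record can be reproduced for a single biallelic SNP from genotype counts. The sketch below assumes Nei's unbiased estimator, He = 2n/(2n - 1) × (1 - Σ p_i²), where n is the number of diploid individuals; the record does not say which software or estimator the authors used, and the function name and counts are hypothetical.

```python
def maf_and_unbiased_he(n_AA: int, n_Aa: int, n_aa: int):
    """Return (MAF, unbiased He) for one biallelic SNP from genotype counts."""
    n = n_AA + n_Aa + n_aa               # diploid individuals sampled
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_AA + n_Aa) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele a
    maf = min(p, q)
    he = (2 * n / (2 * n - 1)) * (1.0 - p * p - q * q)
    return maf, he


# Hypothetical counts: 100 fish with perfectly balanced alleles.
maf, he = maf_and_unbiased_he(25, 50, 25)
```

    With balanced alleles the MAF is 0.5 and the small-sample correction pushes He slightly above 0.5 (100/199, about 0.503), which is why unbiased He values can exceed 0.5 as in the range reported above.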

  20. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1.

    PubMed

    Amos, Christopher I; Wu, Xifeng; Broderick, Peter; Gorlov, Ivan P; Gu, Jian; Eisen, Timothy; Dong, Qiong; Zhang, Qing; Gu, Xiangjun; Vijayakrishnan, Jayaram; Sullivan, Kate; Matakidou, Athena; Wang, Yufei; Mills, Gordon; Doheny, Kimberly; Tsai, Ya-Yu; Chen, Wei Vivien; Shete, Sanjay; Spitz, Margaret R; Houlston, Richard S

    2008-05-01

    To identify risk variants for lung cancer, we conducted a multistage genome-wide association study. In the discovery phase, we analyzed 315,450 tagging SNPs in 1,154 current and former (ever) smoking cases of European ancestry and 1,137 frequency-matched, ever-smoking controls from Houston, Texas. For replication, we evaluated the ten SNPs most significantly associated with lung cancer in an additional 711 cases and 632 controls from Texas and 2,013 cases and 3,062 controls from the UK. Two SNPs, rs1051730 and rs8034191, mapping to a region of strong linkage disequilibrium within 15q25.1 containing PSMA4 and the nicotinic acetylcholine receptor subunit genes CHRNA3 and CHRNA5, were significantly associated with risk in both replication sets. Combined analysis yielded odds ratios of 1.32 (P < 1 x 10(-17)) for both SNPs. Haplotype analysis was consistent with there being a single risk variant in this region. We conclude that variation in a region of 15q25.1 containing nicotinic acetylcholine receptors genes contributes to lung cancer risk. PMID:18385676

  1. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1

    PubMed Central

    Amos, Christopher I; Wu, Xifeng; Broderick, Peter; Gorlov, Ivan P; Gu, Jian; Eisen, Timothy; Dong, Qiong; Zhang, Qing; Gu, Xiangjun; Vijayakrishnan, Jayaram; Sullivan, Kate; Matakidou, Athena; Wang, Yufei; Mills, Gordon; Doheny, Kimberly; Tsai, Ya-Yu; Chen, Wei Vivien; Shete, Sanjay; Spitz, Margaret R; Houlston, Richard S

    2009-01-01

    To identify risk variants for lung cancer, we conducted a multistage genome-wide association study. In the discovery phase, we analyzed 315,450 tagging SNPs in 1,154 current and former (ever) smoking cases of European ancestry and 1,137 frequency-matched, ever-smoking controls from Houston, Texas. For replication, we evaluated the ten SNPs most significantly associated with lung cancer in an additional 711 cases and 632 controls from Texas and 2,013 cases and 3,062 controls from the UK. Two SNPs, rs1051730 and rs8034191, mapping to a region of strong linkage disequilibrium within 15q25.1 containing PSMA4 and the nicotinic acetylcholine receptor subunit genes CHRNA3 and CHRNA5, were significantly associated with risk in both replication sets. Combined analysis yielded odds ratios of 1.32 (P < 1 × 10^-17) for both SNPs. Haplotype analysis was consistent with there being a single risk variant in this region. We conclude that variation in a region of 15q25.1 containing nicotinic acetylcholine receptors genes contributes to lung cancer risk. PMID:18385676

  2. TAXONOMIC AFFINITY OF RUSHFORTH'S BHUTAN JUNIPER AND JUNIPERUS INDICA USING SNPs FROM nrDNA AND cp trnC-trnD, TERPENOIDS AND RAPD DATA

    Microsoft Academic Search

    Robert P. Adams; Julie A. Morris; Andrea E. Schwarzbach

    SNPs from nrDNA and cp trnC-trnD were analyzed from J. indica, J. recurva and Rushforth's juniper from Bhutan and compared with previous terpene and RAPD data. These data, taken together, show that Rushforth's juniper is allied, but distinct from J. indica and a new variety is named: J. indica var. rushforthiana R. P. Adams from Bhutan.

  3. Parallel genotyping of over 10,000 SNPs using a one-primer assay on a high-density oligonucleotide array.

    PubMed

    Matsuzaki, Hajime; Loi, Halina; Dong, Shoulian; Tsai, Ya-Yu; Fang, Joy; Law, Jane; Di, Xiaojun; Liu, Wei-Min; Yang, Geoffrey; Liu, Guoying; Huang, Jing; Kennedy, Giulia C; Ryder, Thomas B; Marcus, Gregory A; Walsh, P Sean; Shriver, Mark D; Puck, Jennifer M; Jones, Keith W; Mei, Rui

    2004-03-01

    The analysis of single nucleotide polymorphisms (SNPs) is increasingly utilized to investigate the genetic causes of complex human diseases. Here we present a high-throughput genotyping platform that uses a one-primer assay to genotype over 10,000 SNPs per individual on a single oligonucleotide array. This approach uses restriction digestion to fractionate the genome, followed by amplification of a specific fractionated subset of the genome. The resulting reduction in genome complexity enables allele-specific hybridization to the array. The selection of SNPs was primarily determined by computer-predicted lengths of restriction fragments containing the SNPs, and was further driven by strict empirical measurements of accuracy, reproducibility, and average call rate, which we estimate to be >99.5%, >99.9%, and >95%, respectively [corrected]. With average heterozygosity of 0.38 and genome scan resolution of 0.31 cM, the SNP array is a viable alternative to panels of microsatellites (STRs). As a demonstration of the utility of the genotyping platform in whole-genome scans, we have replicated and refined a linkage region on chromosome 2p for chronic mucocutaneous candidiasis and thyroid disease, previously identified using a panel of microsatellite (STR) markers. PMID:14993208

  4. OPTIMIZATION OF ADVANCED FILTER SYSTEMS

    SciTech Connect

    R.A. Newby; M.A. Alvin; G.J. Bruck; T.E. Lippert; E.E. Smeltzer; M.E. Stampahar

    2002-06-30

    Two advanced, hot gas, barrier filter system concepts have been proposed by the Siemens Westinghouse Power Corporation to improve the reliability and availability of barrier filter systems in applications such as PFBC and IGCC power generation. The two hot gas, barrier filter system concepts, the inverted candle filter system and the sheet filter system, were the focus of bench-scale testing, data evaluations, and commercial cost evaluations to assess their feasibility as viable barrier filter systems. The program results show that the inverted candle filter system has high potential to be a highly reliable, commercially successful, hot gas, barrier filter system. Some types of thin-walled, standard candle filter elements can be used directly as inverted candle filter elements, and the development of a new type of filter element is not a requirement of this technology. Six types of inverted candle filter elements were procured and assessed in the program in cold flow and high-temperature test campaigns. The thin-walled McDermott 610 CFCC inverted candle filter elements, and the thin-walled Pall iron aluminide inverted candle filter elements are the best candidates for demonstration of the technology. Although the capital cost of the inverted candle filter system is estimated to range from about 0 to 15% greater than the capital cost of the standard candle filter system, the operating cost and life-cycle cost of the inverted candle filter system is expected to be superior to that of the standard candle filter system. Improved hot gas, barrier filter system availability will result in improved overall power plant economics. The inverted candle filter system is recommended for continued development through larger-scale testing in a coal-fueled test facility, and inverted candle containment equipment has been fabricated and shipped to a gasifier development site for potential future testing. 
Two types of sheet filter elements were procured and assessed in the program through cold flow and high-temperature testing. The Blasch, mullite-bonded alumina sheet filter element is the only candidate currently approaching qualification for demonstration, although this oxide-based, monolithic sheet filter element may be restricted to operating temperatures of 538 C (1000 F) or less. Many other types of ceramic and intermetallic sheet filter elements could be fabricated. The estimated capital cost of the sheet filter system is comparable to the capital cost of the standard candle filter system, although this cost estimate is very uncertain because the commercial price of sheet filter element manufacturing has not been established. The development of the sheet filter system could result in a higher reliability and availability than the standard candle filter system, but not as high as that of the inverted candle filter system. The sheet filter system has not reached the same level of development as the inverted candle filter system, and it will require more design development, filter element fabrication development, small-scale testing and evaluation before larger-scale testing could be recommended.

  5. Web Content Filtering 1 User Guidelines Web content filter guidelines

    E-print Network

    Web Content Filtering 1 User Guidelines Web content filter guidelines Introduction The basic criterion for blocking a Web page Categories of material which will be blocked Requesting the unblocking of Aberdeen applies a Web Content Filtering service to all web pages accessed from the undergraduate network

  6. Domain wall filters

    E-print Network

    Oliver Baer; Rajamani Narayanan; Herbert Neuberger; Oliver Witzel

    2007-03-14

    We propose using the extra dimension separating the domain walls carrying lattice quarks of opposite handedness to gradually filter out the ultraviolet fluctuations of the gauge fields that are felt by the fermionic excitations living in the bulk. This generalization of the homogeneous domain wall construction has some theoretical features that seem nontrivial.

  7. Local Nonlinear Filtering

    Microsoft Academic Search

    G. Kember; A. C. Fowler; H. B. Evans

    1997-01-01

    Classical methods of filtering time series use Fourier power spectral analysis to separate signals from noise. Improved methods of signal separation can be developed by using projection techniques based on concepts of nonlinear dynamics. However, such methods are limited in their ability to distinguish between dynamically independent signals. Here we show how it is possible to combine Fourier projection with

  8. Compressive Bilateral Filtering.

    PubMed

    Sugimoto, Kenjiro; Kamata, Sei-Ichiro

    2015-11-01

    This paper presents an efficient constant-time bilateral filter that produces a near-optimal performance tradeoff between approximate accuracy and computational complexity without any complicated parameter adjustment, called a compressive bilateral filter (CBLF). The constant-time means that the computational complexity is independent of its filter window size. Although many existing constant-time bilateral filters have been proposed step-by-step to pursue a more efficient performance tradeoff, they have less focused on the optimal tradeoff for their own frameworks. It is important to discuss this question, because it can reveal whether or not a constant-time algorithm still has plenty room for improvements of performance tradeoff. This paper tackles the question from a viewpoint of compressibility and highlights the fact that state-of-the-art algorithms have not yet touched the optimal tradeoff. The CBLF achieves a near-optimal performance tradeoff by two key ideas: 1) an approximate Gaussian range kernel through Fourier analysis and 2) a period length optimization. Experiments demonstrate that the CBLF significantly outperforms state-of-the-art algorithms in terms of approximate accuracy, computational complexity, and usability. PMID:26068315

  9. Digital hum filtering

    USGS Publications Warehouse

    Knapp, R.W.; Anderson, N.L.

    1994-01-01

    Data may be overprinted by a steady-state cyclical noise (hum). Steady-state indicates that the noise is invariant with time; its attributes, frequency, amplitude, and phase, do not change with time. Hum recorded on seismic data usually is powerline noise and associated higher harmonics; leakage from full-waveform rectified cathodic protection devices that contain the odd higher harmonics of powerline frequencies; or vibrational noise from mechanical devices. The fundamental frequency of powerline hum may be removed during data acquisition with the use of notch filters. Unfortunately, notch filters do not discriminate signal and noise, attenuating both. They also distort adjacent frequencies by phase shifting. Finally, they attenuate only the fundamental mode of the powerline noise; higher harmonics and frequencies other than that of powerlines are not removed. Digital notch filters, applied during processing, have many of the same problems as analog filters applied in the field. The method described here removes hum of a particular frequency. Hum attributes are measured by discrete Fourier analysis, and the hum is canceled from the data by subtraction. Errors are slight and the result of the presence of (random) noise in the window or asynchrony of the hum and data sampling. Error is minimized by increasing window size or by resampling to a finer interval. Errors affect the degree of hum attenuation, not the signal. The residual is steady-state hum of the same frequency. © 1994.
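
    The measure-then-subtract idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the hum frequency is already known and that the analysis window holds a whole number of hum cycles, and the name `remove_hum` and its parameters are hypothetical.

```python
import math


def remove_hum(samples, sample_rate, hum_freq):
    """Estimate one steady-state sinusoid at hum_freq and subtract it.

    Hypothetical helper for illustration only. The estimate is a
    single-frequency discrete Fourier projection, which is exact only
    when the window contains a whole number of hum cycles.
    """
    n = len(samples)
    w = 2.0 * math.pi * hum_freq / sample_rate  # radians per sample

    # Project the data onto cosine and sine at the hum frequency.
    a = 2.0 / n * sum(x * math.cos(w * k) for k, x in enumerate(samples))
    b = 2.0 / n * sum(x * math.sin(w * k) for k, x in enumerate(samples))

    # Subtract the reconstructed hum; the rest of the data is left alone.
    return [
        x - (a * math.cos(w * k) + b * math.sin(w * k))
        for k, x in enumerate(samples)
    ]


# Synthetic example: a 5 Hz "signal" overprinted by 50 Hz hum.
RATE = 1000.0
N = 1000
signal = [math.sin(2.0 * math.pi * 5.0 * k / RATE) for k in range(N)]
hum = [0.5 * math.sin(2.0 * math.pi * 50.0 * k / RATE + 0.3) for k in range(N)]
cleaned = remove_hum([s + h for s, h in zip(signal, hum)], RATE, 50.0)
```

    Unlike a notch filter, the subtraction leaves neighbouring frequencies untouched; any signal energy that falls exactly at the hum frequency is still removed along with the hum.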

  10. Multiscale Vessel Enhancement Filtering

    Microsoft Academic Search

    Alejandro F. Frangi; Wiro J. Niessen; Koen L. Vincken; Max A. Viergever

    1998-01-01

    The multiscale second order local structure of an image (Hessian) is examined with the purpose of developing a vessel enhancement filter. A vesselness measure is obtained on the basis of all eigenvalues of the Hessian. This measure is tested on two dimensional DSA and three dimensional aortoiliac and cerebral MRA data. Its clinical utility is shown by

  11. Ozone decomposing filter

    SciTech Connect

    Simandl, R.F.; Brown, J.D.; Whinnery, L.L. Jr.

    1999-11-02

    In an improved ozone decomposing air filter carbon fibers are held together with a carbonized binder in a perforated structure. The structure is made by combining rayon fibers with gelatin, forming the mixture in a mold, freeze-drying, and vacuum baking.

  12. Ozone decomposing filter

    SciTech Connect

    Simandl, Ronald F. (Farragut, TN); Brown, John D. (Harriman, TN); Whinnery, Jr., LeRoy L. (Dublin, CA)

    1999-01-01

    In an improved ozone decomposing air filter carbon fibers are held together with a carbonized binder in a perforated structure. The structure is made by combining rayon fibers with gelatin, forming the mixture in a mold, freeze-drying, and vacuum baking.

  13. Foam For Filtering

    NASA Technical Reports Server (NTRS)

    1978-01-01

    Like nature's honeycomb, foam is a structure of many-sided cells, apparently solid but actually only three percent material and 97 percent air. Foam is made by a heat-producing chemical reaction which expands a plastic material in a manner somewhat akin to the heat-induced rising of a loaf of bread. The resulting structure of interconnected cells is flexible yet strong and extremely versatile in application. Foam can, for example, be a sound absorber in one form, while in another it allows sound to pass through it. It can be a very soft powder puff material and at the same time a highly abrasive scrubber. A sampling of foam uses includes stereo speaker grilles, applying postage meter ink, filtering lawnmower carburetor air, deadening noise in trucks and tractors, applying cosmetics, releasing fabric softener and antistatic agents in home clothes dryers, painting, filtering factory heating and ventilating systems, shining shoes, polishing cars, sponge-mopping floors, and acting as pre-operative surgical scrubbers; the list is virtually limitless. The process by which foam is made produces "windows," thin plastic membranes connecting the cell walls. Windowed foam is used in many applications, but for certain others, such as filtering, it is desirable to have a completely open network. Scott Paper Company's Foam Division, Chester, Pennsylvania, improved a patented method of "removing the windows," to create an open structure that affords special utility in filtering applications. NASA technology contributed to Scott's improvement.

  14. Counting digital filters

    NASA Technical Reports Server (NTRS)

    Zohar, S. (inventor)

    1973-01-01

    Several embodiments of a counting digital filter of the non-recursive type are disclosed. In each embodiment two registers, at least one of which is a shift register, are included. The shift register received j sub x-bit data input words bit by bit. The kth data word is represented by the integer.

  15. High temperature filter materials

    SciTech Connect

    Alvin, M.A.; Lippert, T.E.; Bachovchin, D.M. (Westinghouse Electric Corp., Pittsburgh, PA (United States). Science and Technology Center) [Westinghouse Electric Corp., Pittsburgh, PA (United States). Science and Technology Center; Tressler, R.E. (Pennsylvania State Univ., University Park, PA (United States)) [Pennsylvania State Univ., University Park, PA (United States)

    1992-01-01

    Objectives of this program are to identify the potential long-term thermal/chemical effects that advanced coal-based power generating system environments have on the stability of porous ceramic filter materials, as well as to assess the influence of these effects on filter operating performance and life. We have principally focused our efforts on developing an understanding of the stability of the alumina/mullite filter material at high temperature (i.e., 870, 980, and 1100 °C) under oxidizing conditions which contain gas phase alkali species. Testing has typically been performed in two continuous flow-through, high temperature test facilities at the Westinghouse Science and Technology Center, using 7 cm diameter × 6.4 mm thick discs. (Alvin, 1992) Each disc of ceramic filter material is exposed for periods of 100 to 3,000 hours in duration. Additional efforts have been performed at Westinghouse to broaden our understanding of the stability of cordierite, cordierite-silicon nitride, reaction and sintered silicon nitride, and clay bonded silicon carbide under similar simulated advanced coal fired process conditions. The results of these efforts are presented in this paper.

  16. High temperature filter materials

    SciTech Connect

    Alvin, M.A.; Lippert, T.E.; Bachovchin, D.M. [Westinghouse Electric Corp., Pittsburgh, PA (United States). Science and Technology Center] [Westinghouse Electric Corp., Pittsburgh, PA (United States). Science and Technology Center; Tressler, R.E. [Pennsylvania State Univ., University Park, PA (United States)] [Pennsylvania State Univ., University Park, PA (United States)

    1992-12-01

    Objectives of this program are to identify the potential long-term thermal/chemical effects that advanced coal-based power generating system environments have on the stability of porous ceramic filter materials, as well as to assess the influence of these effects on filter operating performance and life. We have principally focused our efforts on developing an understanding of the stability of the alumina/mullite filter material at high temperature (i.e., 870, 980, and 1100 °C) under oxidizing conditions which contain gas phase alkali species. Testing has typically been performed in two continuous flow-through, high temperature test facilities at the Westinghouse Science and Technology Center, using 7 cm diameter × 6.4 mm thick discs. (Alvin, 1992) Each disc of ceramic filter material is exposed for periods of 100 to 3,000 hours in duration. Additional efforts have been performed at Westinghouse to broaden our understanding of the stability of cordierite, cordierite-silicon nitride, reaction and sintered silicon nitride, and clay bonded silicon carbide under similar simulated advanced coal fired process conditions. The results of these efforts are presented in this paper.

  17. Method of statistical filtering

    NASA Technical Reports Server (NTRS)

    Battin, R. H.; Deckert, J. C.; Fraser, D. C.; Potter, J. E.

    1970-01-01

    Minimal formula for bounding the cross correlation between a random forcing function and the state error when this correlation is unknown is used in optimal linear filter theory applications. Use of the bound results in overestimation of the estimation-error covariance.

  18. Prestack mid-value filtering

    SciTech Connect

    Changlian, X. (Geophysical Research Inst., Bureau of Oil Geophysical Prospecting, Zhuozhou City, Hebei Province (CN))

    1992-01-01

    This paper describes mid-value filtering, a nonlinear smoothing filter widely used in graphics processing and related fields. Applied to seismic data before stack, mid-value filtering can remove wild values (implausibly large isolated values) and improve the signal-to-noise ratio. Because prestack data volumes are large, the computational efficiency of mid-value filtering is critical to its feasibility. The algorithm used here exploits the properties of mid-value filtering, so that computational efficiency is greatly improved. Experiments show that prestack mid-value filtering effectively eliminates wild values, abnormal traces, and surface waves, and raises the signal-to-noise ratio. Mid-value filtering works better after lateral low-frequency noise has been removed by high-pass filtering.
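
    As a rough illustration of how such a filter suppresses wild values, here is a naive sliding-window sketch. It assumes that "mid-value filtering" refers to the ordinary median filter; it is not the efficient algorithm the abstract mentions, and the name `mid_value_filter` and the shrinking-window edge handling are illustrative choices.

```python
from statistics import median


def mid_value_filter(trace, window=3):
    """Naive sliding-window median filter over one trace.

    Hypothetical helper for illustration only. Each output sample is the
    median of the samples inside a centred window; near the ends of the
    trace the window simply shrinks to the samples that exist.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(trace)
    out = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        out.append(median(trace[lo:hi]))
    return out


# A flat trace with one wild value: the spike is replaced by its
# neighbours' level instead of being smeared, as a mean filter would do.
trace = [1, 1, 1, 100, 1, 1, 1]
filtered = mid_value_filter(trace, window=3)
```

    This version re-sorts every window, so its cost grows with the window length; that per-sample cost is the kind of overhead an efficiency-oriented algorithm would aim to reduce on large prestack volumes.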

  19. Scanning SNPs from a large set of expressed genes to assess the impact of artificial selection on the undomesticated genetic diversity of white spruce.

    PubMed

    Namroud, Marie-Claire; Bousquet, Jean; Doerksen, Trevor; Beaulieu, Jean

    2012-09-01

    A scan involving 1134 single-nucleotide polymorphisms (SNPs) from 709 expressed genes was used to assess the potential impact of artificial selection for height growth on the genetic diversity of white spruce. Two case populations of different sizes simulating different family selection intensities (K = 13% and 5%, respectively) were delineated from the Quebec breeding program. Their genetic diversity and allele frequencies were compared with those of control populations of the same size and geographic origin to assess the effect of increasing the selection intensity. The two control populations were also compared to assess the effect of reducing the sampling size. On one hand, in all pairwise comparisons, genetic diversity parameters were comparable and no alleles were lost in the case populations compared with the control ones, except for few rare alleles in the large case population. Also, the distribution of allele frequencies did not change significantly (P ? 0.05) between the populations compared, but ten and nine SNPs (0.8%) exhibited significant differences in frequency (P ? 0.01) between case and control populations of large and small sizes, respectively. Results of association tests between breeding values for height at 15 years of age and these SNPs supported the hypothesis of a potential effect of selection on the genes harboring these SNPs. On the other hand, contrary to expectations, there was no evidence that selection induced an increase in linkage disequilibrium in genes potentially affected by selection. These results indicate that neither the reduction in the sampling size nor the increase in selection intensity was sufficient to induce a significant change in the genetic diversity of the selected populations. Apparently, no loci were under strong selection pressure, confirming that the genetic control of height growth in white spruce involves many genes with small effects. 
Hence, selection for height growth at the present intensities did not appear to compromise background genetic diversity but, as predicted by theory, effects were detected at a few gene SNPs harboring intermediate allele frequencies. PMID:23028404

  20. Analysis of 60 reported glioma risk SNPs replicates published GWAS findings but fails to replicate associations from published candidate-gene studies.

    PubMed

    Walsh, Kyle M; Anderson, Erik; Hansen, Helen M; Decker, Paul A; Kosel, Matt L; Kollmeyer, Thomas; Rice, Terri; Zheng, Shichun; Xiao, Yuanyuan; Chang, Jeffrey S; McCoy, Lucie S; Bracci, Paige M; Wiemels, Joe L; Pico, Alexander R; Smirnov, Ivan; Lachance, Daniel H; Sicotte, Hugues; Eckel-Passow, Jeanette E; Wiencke, John K; Jenkins, Robert B; Wrensch, Margaret R

    2013-02-01

    Genomewide association studies (GWAS) and candidate-gene studies have implicated single-nucleotide polymorphisms (SNPs) in at least 45 different genes as putative glioma risk factors. Attempts to validate these associations have yielded variable results and few genetic risk factors have been consistently replicated. We conducted a case-control study of Caucasian glioma cases and controls from the University of California San Francisco (810 cases, 512 controls) and the Mayo Clinic (852 cases, 789 controls) in an attempt to replicate previously reported genetic risk factors for glioma. Sixty SNPs selected from the literature (eight from GWAS and 52 from candidate-gene studies) were successfully genotyped on an Illumina custom genotyping panel. Eight SNPs in/near seven different genes (TERT, EGFR, CCDC26, CDKN2A, PHLDB1, RTEL1, TP53) were significantly associated with glioma risk in the combined dataset (P < 0.05), with all associations in the same direction as in previous reports. Several SNP associations showed considerable differences across histologic subtype. All eight successfully replicated associations were first identified by GWAS, although none of the putative risk SNPs from candidate-gene studies was associated in the full case-control sample (all P values > 0.05). Although several confirmed associations are located near genes long known to be involved in gliomagenesis (e.g., EGFR, CDKN2A, TP53), these associations were first discovered by the GWAS approach and are in noncoding regions. These results highlight that the deficiencies of the candidate-gene approach lay in selecting both appropriate genes and relevant SNPs within these genes. PMID:23280628

  1. Analysis of 60 Reported Glioma Risk SNPs Replicates Published GWAS Findings but Fails to Replicate Associations From Published Candidate-Gene Studies

    PubMed Central

    Walsh, Kyle M.; Anderson, Erik; Hansen, Helen M.; Decker, Paul A.; Kosel, Matt L.; Kollmeyer, Thomas; Rice, Terri; Zheng, Shichun; Xiao, Yuanyuan; Chang, Jeffrey S.; McCoy, Lucie S.; Bracci, Paige M.; Wiemels, Joe L.; Pico, Alexander R.; Smirnov, Ivan; Lachance, Daniel H.; Sicotte, Hugues; Eckel-Passow, Jeanette E.; Wiencke, John K.; Jenkins, Robert B.; Wrensch, Margaret R.

    2013-01-01

    Genomewide association studies (GWAS) and candidate-gene studies have implicated single-nucleotide polymorphisms (SNPs) in at least 45 different genes as putative glioma risk factors. Attempts to validate these associations have yielded variable results and few genetic risk factors have been consistently replicated. We conducted a case-control study of Caucasian glioma cases and controls from the University of California San Francisco (810 cases, 512 controls) and the Mayo Clinic (852 cases, 789 controls) in an attempt to replicate previously reported genetic risk factors for glioma. Sixty SNPs selected from the literature (eight from GWAS and 52 from candidate-gene studies) were successfully genotyped on an Illumina custom genotyping panel. Eight SNPs in/near seven different genes (TERT, EGFR, CCDC26, CDKN2A, PHLDB1, RTEL1, TP53) were significantly associated with glioma risk in the combined dataset (P < 0.05), with all associations in the same direction as in previous reports. Several SNP associations showed considerable differences across histologic subtype. All eight successfully replicated associations were first identified by GWAS, although none of the putative risk SNPs from candidate-gene studies was associated in the full case-control sample (all P values > 0.05). Although several confirmed associations are located near genes long known to be involved in gliomagenesis (e.g., EGFR, CDKN2A, TP53), these associations were first discovered by the GWAS approach and are in noncoding regions. These results highlight that the deficiencies of the candidate-gene approach lay in selecting both appropriate genes and relevant SNPs within these genes. PMID:23280628

  2. Tissue expression and predicted protein structures of the bovine ANGPTL3 and association of novel SNPs with growth and meat quality traits.

    PubMed

    Chen, N B; Ma, Y; Yang, T; Lin, F; Fu, W W; Xu, Y J; Li, F; Li, J Y; Gao, S X

    2015-08-01

    Angiopoietin-like protein 3 (ANGPTL3) is a secreted protein that regulates lipid, glucose and energy metabolism. This study was conducted to better understand the effect of ANGPTL3 on important economic traits in cattle. First, transcript profiles for ANGPTL3 were measured in nine different Jiaxian cattle tissues. Second, polymorphisms were identified in the complete coding region and promoter region of the bovine ANGPTL3 gene in 707 cattle samples. Finally, an association study was carried out utilizing these single nucleotide polymorphisms (SNPs) to determine the effect of these SNPs on the growth and meat quality traits. Quantitative real-time PCR analysis showed that ANGPTL3 was mainly expressed in the liver. The promoter of the bovine ANGPTL3 contained several putative transcription factor binding sites (SF1, HNF-1, LXR?, NF??, HNF-3 and C/EBP). In total, four SNPs of the bovine ANGPTL3 gene were identified by direct sequencing. SNP1 (rs469906272: g.-38T>C) was identified in the promoter, SNP2 (rs451104723:g.104A>T) and SNP3 (rs482516226: g.509A>G) were identified in exon 1, and SNP4 (rs477165942: g.8661T>C) was identified in exon 6. Changes in predicted protein structures due to non-synonymous SNPs were analyzed. Haplotype frequencies and linkage disequilibrium were also investigated. Analysis of four SNPs in cattle from different native Chinese breeds (Nanyang (NY) and Jiaxian (JX)) and commercial breeds (Angus (AG), Hereford (HF), Limousin (LM), Luxi (LX), Simmental (ST) and Jinnan (JN)) revealed a significant association with growth traits (including: BW and hipbone width) and meat quality traits (including: Warner-Bratzler shear force and ribeye area). Therefore, implementation of these four mutations in selection indices in the beef industry may be beneficial in selecting individuals with superior growth and meat quality traits. PMID:25951897

  3. Genetic association of KCNJ10 rs1130183 with seizure susceptibility and computational analysis of deleterious non-synonymous SNPs of KCNJ10 gene.

    PubMed

    Phani, Nagaraja M; Acharya, Shreeshakala; Xavy, Seethu; Bhaskaranand, Nalini; Bhat, Manoj K; Jain, Aditya; Rai, Padmalatha S; Satyamoorthy, Kapaettu; Kapaettu, Satyamoorthy

    2014-02-25

    Establishing the genetic basis of idiopathic generalized epilepsies (IGE) is challenging because of their complex inheritance pattern and genetic heterogeneity. The Kir4.1 inwardly rectifying channel (KCNJ10) is one of the independent genes reported to be associated with seizure susceptibility. In the current study we performed a comprehensive in silico analysis of genetic variants in the KCNJ10 gene at the functional and structural level, along with a case-control analysis for the association of the rs1130183 (R271C) polymorphism in Indian patients with IGE. Age- and sex-matched 108 epileptic patients and normal healthy controls were examined. Genotyping of the KCNJ10 rs1130183 variation was performed using the PCR-RFLP method. The risk association was determined by using the odds ratio and 95% confidence interval. Functional effects of non-synonymous SNPs (nsSNPs) in the KCNJ10 gene were analyzed using SIFT, PolyPhen-2, I-Mutant 2.0, PANTHER and FASTSNP. Subsequently, homology modeling of protein three-dimensional (3D) structures was performed using the Modeller tool (9.10v), and the native protein was compared with the mutant to assess structure and stability. SIFT, PolyPhen-2, I-Mutant 2.0 and PANTHER collectively showed that the rs1130183, rs1130182 and rs137853073 SNPs in the KCNJ10 gene affect protein structure and function. There was considerable variation in the Root Mean Square Deviation (RMSD) value between the native and mutant structure (1.17 Å). Association analysis indicates that KCNJ10 rs1130183 did not contribute to risk of seizure susceptibility in Indian patients with IGE (OR, 0.38; 95% CI, 0.07-2.05), and the T allele frequency (0.02%) was in concordance with dbSNP reports. This study identifies potential SNPs that may contribute to seizure susceptibility; further studies of the selected SNPs in a larger number of samples, together with functional analysis, are required to understand the role of KCNJ10 variants in seizure susceptibility. PMID:24378235
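
    For readers unfamiliar with the statistic reported above, an odds ratio and its 95% confidence interval can be computed from a 2x2 case-control table roughly as follows. This sketch assumes the standard log-based (Woolf) interval, which the abstract does not confirm was the method used; the counts are invented for illustration and are not the study's data, and `odds_ratio_ci` is a hypothetical name.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Odds ratio with a log-based (Woolf) confidence interval.

    Hypothetical helper for illustration only. All four cell counts
    must be positive; z=1.96 gives an approximate 95% interval.
    """
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Invented counts, NOT taken from the study above.
or_value, ci_low, ci_high = odds_ratio_ci(10, 20, 5, 40)
```

    An interval that spans 1, as the reported 0.07-2.05 does, is consistent with no association at the 5% level.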

  4. The Magnetic Centrifugal Mass Filter

    SciTech Connect

    Abraham J. Fetterman and Nathaniel J. Fisch

    2011-08-04

    Mass filters using rotating plasmas have been considered for separating nuclear waste and spent nuclear fuel. We propose a new mass filter that utilizes centrifugal and magnetic confinement of ions in a way similar to the asymmetric centrifugal trap. This magnetic centrifugal mass filter is shown to be more proliferation resistant than present technology. This filter is collisional and produces well confined output streams, among other advantages.

  5. Assessment of ceramic membrane filters

    NASA Astrophysics Data System (ADS)

    Ahluwalia, Rajesh K.; Geyer, Howard K.; Im, Kwan H.; Zhu, Chao; Shelleman, David; Tressler, Richard E.

    The objectives of this project are (1) to develop analytical models for evaluating the fluid mechanics of membrane coated, dead-end ceramic filters; and (2) to determine the effects of thermal and thermo-chemical aging on the material properties of emerging ceramic hot gas filters. A honeycomb cordierite monolith with a thin ceramic coating and a rigid candle filter were evaluated.

  6. Quick-change filter cartridge

    DOEpatents

    Rodgers, John C. (Santa Fe, NM); McFarland, Andrew R. (College Station, TX); Ortiz, Carlos A. (Bryan, TX)

    1995-01-01

    A quick-change filter cartridge. In sampling systems for measurement of airborne materials, a filter element is introduced into the sampled airstream such that the aerosol constituents are removed and deposited on the filter. Fragile sampling media often require support in order to prevent rupture during sampling, and careful mounting and sealing to prevent misalignment, tearing, or creasing which would allow the sampled air to bypass the filter. Additionally, handling of filter elements may introduce cross-contamination or exposure of operators to toxic materials. Moreover, it is desirable to enable the preloading of filter media into quick-change cartridges in clean laboratory environments, thereby simplifying and expediting the filter-changing process in the field. The quick-change filter cartridge of the present invention permits the application of a variety of filter media in many types of instruments and may also be used in automated systems. The cartridge includes a base through which a vacuum can be applied to draw air through the filter medium which is located on a porous filter support and held there by means of a cap which forms an airtight seal with the base. The base is also adapted for receiving absorbing media so that both particulates and gas-phase samples may be trapped for investigation, the latter downstream of the aerosol filter.

  7. Synthetic texturing using digital filters

    Microsoft Academic Search

    Eliot A. Feibush; Marc Levoy; Robert L. Cook

    1980-01-01

    Aliasing artifacts are eliminated from computer generated images of textured polygons by equivalently filtering both the texture and the edges of the polygons. Different filters can be easily compared because the weighting functions that define the shape of the filters are pre-computed and stored in lookup tables. A polygon subdivision algorithm removes the hidden surfaces so that the polygons are

  8. BMP FILTERS: UPFLOW VS. DOWNFLOW

    EPA Science Inventory

    Stormwater filters are typically operated in a downflow mode. This research had two objectives: 1) to determine the increased life of a filter operated in an upflow mode, and 2) to determine if the operation of a downflow, mixed-media filter could be modeled using the power equat...

  9. High frequency STW resonator filters

    Microsoft Academic Search

    R. Almar; B. Horine; J. Andersen

    1992-01-01

    The authors present results obtained in the 1-GHz-2-GHz region for surface transverse wave (STW) resonator filters implemented using inline coupled (RFI) and combined mode resonator filter (CMRF) techniques. The STW device performance is strongly dependent on the surface confinement of the acoustic wave. In an inline resonator filter the inner grating serves the dual purpose of trapping the energy near

  10. A Dissipative Coaxial RFI Filter

    Microsoft Academic Search

    Paul Schiffres

    1964-01-01

    A lossy coaxial filter has been developed to protect electroexplosive devices from accidental detonation by stray high-power RF fields. The characteristics of a coaxial transmission line, filled with dielectromagnetic material, are analyzed in terms of various proposed filter configurations. The design concept for the prototype filter is chosen so as to provide a maximum stop-band attenuation. Measured results are presented

  11. Advanced simulation of digital filters

    Microsoft Academic Search

    G. S. Doyle

    1980-01-01

    An Advanced Simulation of Digital Filters has been implemented on the IBM 360/67 computer utilizing Tektronix hardware and software. The program package is appropriate for use by persons beginning their study of digital signal processing or for filter analysis. The ASDF programs provide the user with an interactive method by which filter pole and zero locations can be manipulated. Graphical

  12. Application of DFT filter banks and cosine modulated filter banks in filtering

    Microsoft Academic Search

    Yuan-Pei Lin; P. P. Vaidyanathan

    1994-01-01

    In this paper, we introduce a new under-decimated system. A filter bank is said to be under-decimated if the number of channels is more than the decimation ratio in the subbands. Two types of low-complexity filter banks can be used for the new system, the DFT filter bank and cosine modulated filter bank. The setup of the under-decimated system has

  13. Identification and characterization of more than 4 million intervarietal SNPs across the group 7 chromosomes of bread wheat.

    PubMed

    Lai, Kaitao; Lorenc, Michał T; Lee, Hong Ching; Berkman, Paul J; Bayer, Philipp Emanuel; Visendi, Paul; Ruperao, Pradeep; Fitzgerald, Timothy L; Zander, Manuel; Chan, Chon-Kit Kenneth; Manoli, Sahana; Stiller, Jiri; Batley, Jacqueline; Edwards, David

    2015-01-01

    Despite being a major international crop, our understanding of the wheat genome is relatively poor due to its large size and complexity. To gain a greater understanding of wheat genome diversity, we have identified single nucleotide polymorphisms between 16 Australian bread wheat varieties. Whole-genome shotgun Illumina paired read sequence data were mapped to the draft assemblies of chromosomes 7A, 7B and 7D to identify more than 4 million intervarietal SNPs. SNP density varied between the three genomes, with much greater density observed on the A and B genomes than the D genome. This variation may be a result of substantial gene flow from the tetraploid Triticum turgidum, which possesses A and B genomes, during early co-cultivation of tetraploid and hexaploid wheat. In addition, we examined SNP density variation along the chromosome syntenic builds and identified genes in low-density regions which may have been selected during domestication and breeding. This study highlights the impact of evolution and breeding on the bread wheat genome and provides a substantial resource for trait association and crop improvement. All SNP data are publicly available on a generic genome browser GBrowse at www.wheatgenome.info. PMID:25147022

  14. SNPs of PSMA6 gene--investigation of possible association with myocardial infarction and type 2 diabetes mellitus.

    PubMed

    Sjakste, T; Poudziunas, I; Ninio, E; Perret, C; Pirags, V; Nicaud, V; Lazdins, M; Evanss, A; Morrison, C; Cambien, F; Sjakste, N

    2007-04-01

    In our preceding studies we have identified microsatellite polymorphisms inside the PSMA6 gene and in its 5' upstream region. Following the observed associations of microsatellite polymorphisms with non-insulin-dependent diabetes mellitus and Graves' disease, we extended the evaluation of PSMA6 genetic variations to cardiovascular disorders and non-insulin-dependent diabetes mellitus. New polymorphisms in the promoter region and exon 6 of the gene were identified by direct sequencing of the promoter region and all seven exons of the gene in 30 individuals of European descent. Two SNPs at positions -110 and -8 from the translation start, in the promoter region and 5'UTR respectively, were analyzed. Neither polymorphism was associated with the risk of myocardial infarction. No significant association of the polymorphisms with plasma lipid levels or BMI was observed. A borderline association of both polymorphisms with diastolic blood pressure was observed in the control group. Genotype -8CG was significantly more frequent in type 2 diabetes patients, and haplotype C-110/G-8, compared to C-110/C-8, was associated with a higher risk of NIDDM. PMID:17555133

  15. A design technique for variable digital filters

    Microsoft Academic Search

    R. Zarour; M. M. Fahmy

    1989-01-01

    In some applications the frequency characteristics of a filter may be required to change during the course of signal processing. This requirement can be satisfied by filters with coefficients that are directly computable from the specified spectral parameters. Such filters are referred to as variable filters. A design technique for variable filters is proposed. The filter coefficients are expressed as

  16. Filtered cathodic arc source

    DOEpatents

    Falabella, S.; Sanders, D.M.

    1994-01-18

    A continuous, cathodic arc ion source coupled to a macro-particle filter capable of separation or elimination of macro-particles from the ion flux produced by cathodic arc discharge is described. The ion source employs an axial magnetic field on a cathode (target) having tapered sides to confine the arc, thereby providing high target material utilization. A bent magnetic field is used to guide the metal ions from the target to the part to be coated. The macro-particle filter consists of two straight solenoids, end to end, but placed at 45° to one another, which prevents line-of-sight from the arc spot on the target to the parts to be coated, yet provides a path for ions and electrons to flow, and includes a series of baffles for trapping the macro-particles. 3 figures.

  17. Drilling fluid filter

    DOEpatents

    Hall, David R.; Fox, Joe; Garner, Kory

    2007-01-23

    A drilling fluid filter for placement within a bore wall of a tubular drill string component comprises a perforated receptacle with an open end and a closed end. A hanger for engagement with the bore wall is mounted at the open end of the perforated receptacle. A mandrel is adjacent and attached to the open end of the perforated receptacle. A linkage connects the mandrel to the hanger. The linkage may be selected from the group consisting of struts, articulated struts and cams. The mandrel operates on the hanger through the linkage to engage and disengage the drilling fluid filter from the tubular drill string component. The mandrel may have a stationary portion comprising a first attachment to the open end of the perforated receptacle and a telescoping adjustable portion comprising a second attachment to the linkage. The mandrel may also comprise a top-hole interface for top-hole equipment.

  18. Mangroves: Living Filters

    NSDL National Science Digital Library

    Pulse of the Planet

    2008-03-26

    In this two-minute radio program, a marine biology professor points out a number of the ecological functions that coastal mangrove forests perform. For example, he explains that mangrove forests serve as filters and nursery areas for fish. He contends that there are ecological and economic reasons to conserve mangroves. The archived program, part of the Pulse of the Planet radio show, is available here in text and audio formats. Copyright 2005 Eisenhower National Clearinghouse

  19. Wetland Filter Model

    NSDL National Science Digital Library

    Twin Cities Public Television, Inc.

    2007-01-01

    In this quick activity (located on page 2 of the PDF), learners will model how wetlands act as natural filters for the environment. Learners prepare a mixture of water, soil, gravel, and leaves and then pour it down a piece of artificial grass, observing how much gets trapped in the fake grass and comparing water at the bottom with the initial “polluted” sample. Relates to the linked video, DragonflyTV GPS: Wetlands.

  20. Advances in Collaborative Filtering

    Microsoft Academic Search

    Yehuda Koren; Robert M. Bell

    2011-01-01

    The collaborative filtering (CF) approach to recommenders has recently enjoyed much interest and progress. The fact that it played a central role within the recently completed Netflix competition has contributed to its popularity. This chapter surveys the recent progress in the field. Matrix factorization techniques, which became a first choice for implementing CF, are described together with recent innovations. We